咨询与建议

限定检索结果

文献类型

  • 528 篇 会议
  • 297 篇 期刊文献
  • 3 册 图书

馆藏范围

  • 828 篇 电子文献
  • 0 种 纸本馆藏

日期分布

学科分类号

  • 520 篇 工学
    • 387 篇 计算机科学与技术...
    • 336 篇 软件工程
    • 142 篇 信息与通信工程
    • 56 篇 生物工程
    • 45 篇 控制科学与工程
    • 40 篇 电子科学与技术(可...
    • 35 篇 仪器科学与技术
    • 33 篇 化学工程与技术
    • 30 篇 电气工程
    • 21 篇 生物医学工程(可授...
    • 16 篇 机械工程
    • 16 篇 光学工程
    • 7 篇 建筑学
    • 6 篇 材料科学与工程(可...
  • 291 篇 理学
    • 167 篇 物理学
    • 118 篇 数学
    • 62 篇 生物学
    • 55 篇 统计学(可授理学、...
    • 31 篇 化学
    • 18 篇 系统科学
  • 120 篇 管理学
    • 79 篇 图书情报与档案管...
    • 45 篇 管理科学与工程(可...
    • 15 篇 工商管理
  • 15 篇 法学
    • 13 篇 社会学
  • 15 篇 医学
    • 13 篇 临床医学
    • 10 篇 基础医学(可授医学...
    • 8 篇 药学(可授医学、理...
  • 12 篇 文学
    • 8 篇 中国语言文学
    • 8 篇 外国语言文学
  • 10 篇 农学
    • 7 篇 作物学
  • 4 篇 教育学
  • 3 篇 经济学
  • 3 篇 艺术学
  • 1 篇 军事学

主题

  • 77 篇 speech recogniti...
  • 73 篇 training
  • 50 篇 acoustics
  • 46 篇 speech processin...
  • 44 篇 speech
  • 33 篇 hidden markov mo...
  • 31 篇 signal processin...
  • 29 篇 feature extracti...
  • 26 篇 decoding
  • 23 篇 speech enhanceme...
  • 21 篇 computational mo...
  • 20 篇 speech synthesis
  • 20 篇 linguistics
  • 19 篇 predictive model...
  • 18 篇 data models
  • 17 篇 neural networks
  • 17 篇 natural language...
  • 16 篇 accuracy
  • 15 篇 conferences
  • 15 篇 training data

机构

  • 70 篇 national enginee...
  • 55 篇 school of comput...
  • 47 篇 audio speech and...
  • 42 篇 beijing engineer...
  • 27 篇 department of co...
  • 25 篇 center for langu...
  • 21 篇 department of co...
  • 18 篇 mainlp center fo...
  • 18 篇 department of co...
  • 15 篇 audio speech and...
  • 14 篇 iflytek research
  • 14 篇 national enginee...
  • 12 篇 munich
  • 11 篇 department of co...
  • 10 篇 center for infor...
  • 10 篇 ict cluster sing...
  • 10 篇 audio speech and...
  • 9 篇 center for infor...
  • 9 篇 department of co...
  • 9 篇 center for speec...

作者

  • 71 篇 lei xie
  • 54 篇 ling zhen-hua
  • 37 篇 huang heyan
  • 32 篇 ai yang
  • 23 篇 plank barbara
  • 21 篇 zhen-hua ling
  • 18 篇 zheng thomas fan...
  • 18 篇 yarowsky david
  • 18 篇 thomas fang zhen...
  • 18 篇 yang ai
  • 17 篇 wang dong
  • 17 篇 heyan huang
  • 17 篇 khudanpur sanjee...
  • 16 篇 lu ye-xin
  • 15 篇 pengcheng guo
  • 15 篇 gu jia-chen
  • 15 篇 van der goot rob
  • 14 篇 du jun
  • 14 篇 mao xian-ling
  • 14 篇 xie lei

语言

  • 739 篇 英文
  • 84 篇 其他
  • 8 篇 中文
检索条件"机构=Center for Language and Speech Processing and Computer Science"
828 条 记 录,以下是61-70 订阅
排序:
A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
arXiv
收藏 引用
arXiv 2024年
作者: Du, Hui-Peng Lu, Ye-Xin Ai, Yang Ling, Zhen-Hua National Engineering Research Center of Speech and Language Information Processing University of Science and Technology of China Hefei China
This paper proposes a novel neural denoising vocoder that can generate clean speech waveforms from noisy mel-spectrograms. The proposed neural denoising vocoder consists of two components, i.e., a spectrum predictor a... 详细信息
来源: 评论
Refining Self-Supervised Learnt speech Representation using Brain Activations
arXiv
收藏 引用
arXiv 2024年
作者: Li, Hengyu Mei, Kangdi Liu, Zhaoci Ai, Yang Chen, Liping Zhang, Jie Ling, Zhenhua National Engineering Research Center of Speech and Language Information Processing University of Science and Technology of China Hefei China
It was shown in literature that speech representations extracted by self-supervised pre-trained models exhibit similarities with brain activations of human for speech perception and fine-tuning speech representation m... 详细信息
来源: 评论
PITCH-AND-SPECTRUM-AWARE SINGING QUALITY ASSESSMENT WITH BIAS CORRECTION AND MODEL FUSION
arXiv
收藏 引用
arXiv 2024年
作者: Shi, Yu-Fei Ai, Yang Lu, Ye-Xin Du, Hui-Peng Ling, Zhen-Hua National Engineering Research Center of Speech and Language Information Processing University of Science and Technology of China Hefei China
We participated in track 2 of the VoiceMOS Challenge 2024, which aimed to predict the mean opinion score (MOS) of singing samples. Our submission secured the first place among all participating teams, excluding the of... 详细信息
来源: 评论
SQ-Whisper: Speaker-Querying Based Whisper Model for Target-Speaker ASR
IEEE Transactions on Audio, Speech and Language Processing
收藏 引用
IEEE Transactions on Audio, speech and language processing 2024年 33卷 175-185页
作者: Pengcheng Guo Xuankai Chang Hang Lv Shinji Watanabe Lei Xie Audio Speech and Language Processing Group (ASLP@NPU) School of Computer Science Northwestern Polytechnical University Xi'an China Carnegie Mellon University Pittsburgh PA USA
Benefiting from massive and diverse data sources, speech foundation models exhibit strong generalization and knowledge transfer capabilities to a wide range of downstream tasks. However, a limitation arises from their... 详细信息
来源: 评论
Effective Integration of Text Diffusion and Pre-Trained language Models with Linguistic Easy-First Schedule  30
Effective Integration of Text Diffusion and Pre-Trained Lang...
收藏 引用
Joint 30th International Conference on Computational Linguistics and 14th International Conference on language Resources and Evaluation, LREC-COLING 2024
作者: Ou, Yimin Jian, Ping School of Computer Science and Technology Beijing Institute of Technology Beijing China Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications Beijing Institute of Technology Beijing China
Diffusion models have become a powerful generative modeling paradigm, achieving great success in continuous data patterns. However, the discrete nature of text data results in compatibility issues between continuous d... 详细信息
来源: 评论
Improving Implicit Discourse Relation Recognition with Semantics Confrontation  30
Improving Implicit Discourse Relation Recognition with Seman...
收藏 引用
Joint 30th International Conference on Computational Linguistics and 14th International Conference on language Resources and Evaluation, LREC-COLING 2024
作者: Cai, Mingyang Yang, Zhen Jian, Ping School of Computer Science and Technology Beijing Institute of Technology Beijing China Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications Beijing Institute of Technology Beijing China
Implicit Discourse Relation Recognition (IDRR), which infers discourse logical relations without explicit connectives, is one of the most challenging tasks in natural language processing (NLP). Recently, pre-trained l... 详细信息
来源: 评论
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
Enhancing Lip Reading with Multi-Scale Video and Multi-Encod...
收藏 引用
IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
作者: He Wang Pengcheng Guo Xucheng Wan Huan Zhou Lei Xie Audio Speech and Language Processing Group (ASLP@NPU) School of Computer Science Northwestern Polytechnical University Xian China IT Innovation and Research Center Huawei Technologies
Automatic lip-reading (ALR) aims to automatically tran-scribe spoken content from a speaker's silent lip motion captured in video. Current mainstream lip-reading approaches only use a single visual encoder to mode... 详细信息
来源: 评论
PQLM - Multilingual Decentralized Portable Quantum language Model  48
PQLM - Multilingual Decentralized Portable Quantum Language ...
收藏 引用
48th IEEE International Conference on Acoustics, speech and Signal processing, ICASSP 2023
作者: Li, Shuyue Stella Zhang, Xiangyu Zhou, Shu Shu, Hongchao Liang, Ruixing Liu, Hexin Garcia, Leibny Paola Hong Kong University of Science and Technology Department of Physics Hong Kong Nanyang Technological University School of Electrical and Electronic Engineering Singapore Johns Hopkins University Center for Language and Speech Processing United States Johns Hopkins University Human Language Technology Center of Excellence United States
With careful manipulation, malicious agents can reverse engineer private information encoded in pre-trained language models. Security concerns motivate the development of quantum pre-training. In this work, we propose... 详细信息
来源: 评论
Adaptive Data Augmentation with Naturalspeech3 for Far-field Speaker Verification
Adaptive Data Augmentation with NaturalSpeech3 for Far-field...
收藏 引用
IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
作者: Li Zhang Jiyao Liu Lei Xie Audio Speech and Language Processing Group (ASLP@NPU) School of Computer Science Northwestern Polytechnical University (NPU) Xi’an China
The scarcity of speaker-annotated far-field speech presents a significant challenge in developing high-performance far-field speaker verification (SV) systems. While data augmentation using large-scale near-field spee... 详细信息
来源: 评论
Low-Latency Neural speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for speech Generation Tasks
arXiv
收藏 引用
arXiv 2024年
作者: Ai, Yang Ling, Zhen-Hua The National Engineering Research Center of Speech and Language Information Processing University of Science and Technology of China Hefei230027 China
This paper presents a novel neural speech phase prediction model which predicts wrapped phase spectra directly from amplitude spectra. The proposed model is a cascade of a residual convolutional network and a parallel... 详细信息
来源: 评论