Search query: Institution = "Center for Language and Speech Processing & Human Language Technology"
421 records; showing results 51-60

A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
arXiv, 2024
Authors: Du, Hui-Peng; Lu, Ye-Xin; Ai, Yang; Ling, Zhen-Hua
Affiliation: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
This paper proposes a novel neural denoising vocoder that can generate clean speech waveforms from noisy mel-spectrograms. The proposed neural denoising vocoder consists of two components, i.e., a spectrum predictor a...

PITCH-AND-SPECTRUM-AWARE SINGING QUALITY ASSESSMENT WITH BIAS CORRECTION AND MODEL FUSION
arXiv, 2024
Authors: Shi, Yu-Fei; Ai, Yang; Lu, Ye-Xin; Du, Hui-Peng; Ling, Zhen-Hua
Affiliation: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
We participated in track 2 of the VoiceMOS Challenge 2024, which aimed to predict the mean opinion score (MOS) of singing samples. Our submission secured the first place among all participating teams, excluding the of...

Refining Self-Supervised Learnt Speech Representation using Brain Activations
arXiv, 2024
Authors: Li, Hengyu; Mei, Kangdi; Liu, Zhaoci; Ai, Yang; Chen, Liping; Zhang, Jie; Ling, Zhenhua
Affiliation: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
It was shown in literature that speech representations extracted by self-supervised pre-trained models exhibit similarities with brain activations of human for speech perception and fine-tuning speech representation m...

Identifying Context-Dependent Translations for Evaluation Set Production
arXiv, 2023
Authors: Wicks, Rachel; Post, Matt
Affiliations: Human Language Technology Center of Excellence, Johns Hopkins University, United States; Center of Language and Speech Processing, Johns Hopkins University, United States; Microsoft, United States
A major impediment to the transition to context-aware machine translation is the absence of good evaluation metrics and test sets. Sentences that require context to be translated correctly are rare in test sets, reduc...

Noise-robust Speech Separation with Fast Generative Correction
arXiv, 2024
Authors: Wang, Helin; Villalba, Jesús; Moro-Velazquez, Laureano; Hai, Jiarui; Thebaud, Thomas; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States; Laboratory for Computational Auditory Perception, Johns Hopkins University, United States
Speech separation, the task of isolating multiple speech sources from a mixed audio signal, remains challenging in noisy environments. In this paper, we propose a generative correction method to enhance the output of ...

Self-FiLM: Conditioning GANs with self-supervised representations for bandwidth extension based speaker recognition
arXiv, 2023
Authors: Kataria, Saurabh; Villalba, Jesús; Moro-Velázquez, Laureano; Thebaud, Thomas; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States
Speech super-resolution/Bandwidth Extension (BWE) can improve downstream tasks like Automatic Speaker Verification (ASV). We introduce a simple novel technique called Self-FiLM to inject self-supervision into existing...

Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
arXiv, 2024
Authors: Ai, Yang; Ling, Zhen-Hua
Affiliation: The National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China
This paper presents a novel neural speech phase prediction model which predicts wrapped phase spectra directly from amplitude spectra. The proposed model is a cascade of a residual convolutional network and a parallel...

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation
arXiv, 2023
Authors: Xiao, Cihan; Xinyuan, Henry Li; Yang, Jinyi; Gao, Dongji; Wiesner, Matthew; Duh, Kevin; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States
We introduce HK-LegiCoST, a new three-way parallel corpus of Cantonese-English translations, containing 600+ hours of Cantonese audio, its standard traditional Chinese transcript, and English translation, segmented an...

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
arXiv, 2024
Authors: Ai, Yang; Jiang, Xiao-Hang; Lu, Ye-Xin; Du, Hui-Peng; Ling, Zhen-Hua
Affiliation: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China
This paper introduces a novel neural audio codec targeting high waveform sampling rates and low bitrates named APCodec, which seamlessly integrates the strengths of parametric codecs and waveform codecs. The APCodec r...

ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS
arXiv, 2022
Authors: Huang, Zili; Raj, Desh; García, Paola; Khudanpur, Sanjeev
Affiliation: Center for Language and Speech Processing, Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, United States
Self-supervised learning (SSL) methods which learn representations of data without explicit supervision have gained popularity in speech-processing tasks, particularly for single-talker applications. However, these mo...