Refine search results

Document type

  • 267 Conference papers
  • 154 Journal articles

Collection scope

  • 421 Electronic documents
  • 0 Print holdings

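Subject classification

  • 282 Engineering
    • 184 Computer Science and Technology...
    • 164 Software Engineering
    • 111 Information and Communication Engineering
    • 28 Bioengineering
    • 27 Electronic Science and Technology (may...
    • 24 Electrical Engineering
    • 23 Control Science and Engineering
    • 21 Instrument Science and Technology
    • 19 Chemical Engineering and Technology
    • 11 Mechanical Engineering
    • 8 Biomedical Engineering (may confer...
    • 6 Optical Engineering
    • 5 Architecture
    • 4 Civil Engineering
    • 3 Materials Science and Engineering (may...
  • 176 Science
    • 137 Physics
    • 56 Mathematics
    • 31 Biology
    • 19 Chemistry
    • 16 Statistics (may confer Science, ...
    • 8 Systems Science
  • 44 Management
    • 37 Library, Information and Archival Manag...
    • 7 Management Science and Engineering (may...
  • 11 Law
    • 11 Sociology
  • 8 Medicine
    • 7 Clinical Medicine
    • 6 Basic Medicine (may confer Medicine...
    • 5 Pharmacy (may confer Medicine, Sci...
  • 7 Literature
    • 6 Chinese Language and Literature
    • 5 Foreign Languages and Literatures
  • 4 Education
    • 4 Education
  • 3 Agriculture
  • 2 Arts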

Topics

  • 59 speech recogniti...
  • 51 training
  • 33 acoustics
  • 31 speech
  • 20 speech processin...
  • 19 feature extracti...
  • 18 hidden markov mo...
  • 18 signal processin...
  • 16 computational mo...
  • 15 conferences
  • 14 speech enhanceme...
  • 13 predictive model...
  • 13 decoding
  • 12 machine translat...
  • 11 speech synthesis
  • 10 training data
  • 10 neural networks
  • 10 data models
  • 9 transformers
  • 9 self-supervised ...

Institutions

  • 70 national enginee...
  • 51 human language t...
  • 45 center for langu...
  • 31 human language t...
  • 21 center for langu...
  • 21 center for langu...
  • 13 center for langu...
  • 11 iflytek research
  • 10 center for langu...
  • 9 ict cluster sing...
  • 9 human language t...
  • 8 national enginee...
  • 8 center for langu...
  • 8 human language t...
  • 7 center for langu...
  • 7 human language t...
  • 7 university of sc...
  • 7 xiaomi corp.
  • 6 university of sc...
  • 6 state key labora...

Authors

  • 49 ling zhen-hua
  • 47 khudanpur sanjee...
  • 35 dehak najim
  • 32 ai yang
  • 29 sanjeev khudanpu...
  • 23 dredze mark
  • 22 zhen-hua ling
  • 19 povey daniel
  • 18 villalba jesús
  • 18 van durme benjam...
  • 18 daniel povey
  • 18 yang ai
  • 17 post matt
  • 16 hermansky hynek
  • 16 lu ye-xin
  • 15 zelasko piotr
  • 14 du hui-peng
  • 13 raj desh
  • 13 gu jia-chen
  • 13 watanabe shinji

Language

  • 364 English
  • 57 Other

Search query: "Institution = Center for Language and Speech Processing & Human Language Technology"
421 records; showing 31-40

Finding Spoken Identifications: Using GPT-4 Annotation For An Efficient And Fast Dataset Creation Pipeline
Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Authors: Jahan, Maliha; Wang, Helin; Thebaud, Thomas; Sun, Yinglun; Le, Giang; Fagyal, Zsuzsanna; Scharenborg, Odette; Hasegawa-Johnson, Mark; Moro-Velazquez, Laureano; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; University of Illinois Urbana-Champaign, Champaign, IL, United States; Multimedia Computing Group, Delft University of Technology, Netherlands
The growing emphasis on fairness in speech-processing tasks requires datasets with speakers from diverse subgroups that allow training and evaluating fair speech technology systems. However, creating such datasets thr...

Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Zeng, Xiao-Min; Song, Yan; Zhuo, Zhu; Zhou, Yu; Li, Yu-Hong; Xue, Hui; Dai, Li-Rong; McLoughlin, Ian
Affiliations: Alibaba Group, China; University of Science and Technology of China, National Engineering Research Center of Speech and Language Information Processing, Hefei, China; Singapore Institute of Technology, ICT Cluster, Singapore
In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD). GeCo exploits a Predictive AutoEncoder (PAE) equipped with self-attention as a ge...

Voice Attribute Editing With Text Prompt
IEEE Transactions on Audio, Speech and Language Processing, 2025, vol. 33, pp. 1641-1652
Authors: Zheng-Yan Sheng; Li-Juan Liu; Yang Ai; Jia Pan; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; iFLYTEK Research, Hefei, China
Despite recent advancements in speech generation with text prompt providing control over speech style, voice attributes in synthesized speech remain elusive and challenging to control. This paper introduces a novel ta...

SELF-SUPERVISED LEARNING WITH SPEECH MODULATION DROPOUT
arXiv, 2023
Authors: Sadhu, Samik; Hermansky, Hynek
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
We show that training a multi-headed self-attention-based deep network to predict deleted, information-dense 2-8 Hz speech modulations over a 1.5-second section of a speech utterance is an effective way to make machin...

Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning
arXiv, 2023
Authors: Bhati, Saurabhchand; Villalba, Jesús; Moro-Velazquez, Laureano; Thebaud, Thomas; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
Visually grounded speech systems learn from paired images and their spoken captions. Recently, there have been attempts to utilize the visually grounded models trained from images and their corresponding text captions...

Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution
arXiv, 2023
Authors: Li, Tianjian; Murray, Kenton
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
Zero-shot cross-lingual transfer is when a multilingual model is trained to perform a task in one language and then is applied to another language. Although the zero-shot cross-lingual transfer approach has achieved s...

RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
arXiv, 2025
Authors: Yan, Shi-Qi; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
While Retrieval-Augmented Generation (RAG) has exhibited promise in utilizing external knowledge, its generation process heavily depends on the quality and accuracy of the retrieved context. Large language models (LLM...

ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
arXiv, 2024
Authors: Zheng, Rui-Chen; Du, Hui-Peng; Jiang, Xiao-Hang; Ai, Yang; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, China
Current neural audio codecs typically use residual vector quantization (RVQ) to discretize speech signals. However, they often experience codebook collapse, which reduces the effective codebook size and leads to subop...

SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
International Symposium on Chinese Spoken Language Processing
Authors: Yu-Fei Shi; Yang Ai; Ye-Xin Lu; Hui-Peng Du; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei
Assessing the naturalness of speech using mean opinion score (MOS) prediction models has positive implications for the automatic evaluation of speech synthesis systems. Early MOS prediction models took the raw wavefo...

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
International Symposium on Chinese Spoken Language Processing
Authors: Hui-Peng Du; Yang Ai; Rui-Chen Zheng; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei
This paper proposes a novel neural audio codec, named APCodec+, which is an improved version of APCodec. The APCodec+ takes the audio amplitude and phase spectra as the coding object, and employs an adversarial trai...