Search query: Institution = "Center for Language and Speech Processing and Human Language Technology Center of Excellence"
441 records; showing results 21-30

Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning
arXiv, 2023
Authors: Bhati, Saurabhchand; Villalba, Jesús; Moro-Velazquez, Laureano; Thebaud, Thomas; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
Visually grounded speech systems learn from paired images and their spoken captions. Recently, there have been attempts to utilize the visually grounded models trained from images and their corresponding text captions...

Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution
arXiv, 2023
Authors: Li, Tianjian; Murray, Kenton
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
Zero-shot cross-lingual transfer is when a multilingual model is trained to perform a task in one language and then is applied to another language. Although the zero-shot cross-lingual transfer approach has achieved s...

Building Keyword Search System from End-To-End ASR Systems
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Ruizhe Huang; Matthew Wiesner; Leibny Paola Garcia-Perera; Dan Povey; Jan Trmal; Sanjeev Khudanpur
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, USA; Human Language Technology Center of Excellence, Johns Hopkins University, USA; Xiaomi Corporation, Beijing, China
Keyword search (KWS) systems are commonly built on top of existing automatic speech recognition (ASR) systems. However, end-to-end (E2E) ASR models are not naturally equipped with word-level timing information or conf...

Privacy Versus Emotion Preservation Trade-Offs in Emotion-Preserving Speaker Anonymization
IEEE Spoken Language Technology Workshop
Authors: Zexin Cai; Henry Li Xinyuan; Ashi Garg; Leibny Paola García-Perera; Kevin Duh; Sanjeev Khudanpur; Nicholas Andrews; Matthew Wiesner
Affiliations: Human Language Technology Center of Excellence, Johns Hopkins University
Advances in speech technology now allow unprecedented access to personally identifiable information through speech. To protect such information, the differential privacy field has explored ways to anonymize speech whi...

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
arXiv, 2023
Authors: Raj, Desh; Povey, Daniel; Khudanpur, Sanjeev
Affiliations: The Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218, United States; Xiaomi Corp., Beijing, China; The Center for Language and Speech Processing, Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21218, United States
The Streaming Unmixing and Recognition Transducer (SURT) model was proposed recently as an end-to-end approach for continuous, streaming, multi-talker speech recognition (ASR). Despite impressive results on multi-turn...

Identifying Context-Dependent Translations for Evaluation Set Production
arXiv, 2023
Authors: Wicks, Rachel; Post, Matt
Affiliations: Human Language Technology Center of Excellence, Johns Hopkins University, United States; Center of Language and Speech Processing, Johns Hopkins University, United States; Microsoft, United States
A major impediment to the transition to context-aware machine translation is the absence of good evaluation metrics and test sets. Sentences that require context to be translated correctly are rare in test sets, reduc...

Finding Spoken Identifications: Using GPT-4 Annotation For An Efficient And Fast Dataset Creation Pipeline
Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Authors: Jahan, Maliha; Wang, Helin; Thebaud, Thomas; Sun, Yinglun; Le, Giang; Fagyal, Zsuzsanna; Scharenborg, Odette; Hasegawa-Johnson, Mark; Moro-Velazquez, Laureano; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; University of Illinois Urbana-Champaign, Champaign, IL, United States; Multimedia Computing Group, Delft University of Technology, Netherlands
The growing emphasis on fairness in speech-processing tasks requires datasets with speakers from diverse subgroups that allow training and evaluating fair speech technology systems. However, creating such datasets thr...

Noise-robust Speech Separation with Fast Generative Correction
arXiv, 2024
Authors: Wang, Helin; Villalba, Jesús; Moro-Velazquez, Laureano; Hai, Jiarui; Thebaud, Thomas; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States; Laboratory for Computational Auditory Perception, Johns Hopkins University, United States
Speech separation, the task of isolating multiple speech sources from a mixed audio signal, remains challenging in noisy environments. In this paper, we propose a generative correction method to enhance the output of ...

Self-FiLM: Conditioning GANs with self-supervised representations for bandwidth extension based speaker recognition
arXiv, 2023
Authors: Kataria, Saurabh; Villalba, Jesús; Moro-Velázquez, Laureano; Thebaud, Thomas; Dehak, Najim
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States
Speech super-resolution/Bandwidth Extension (BWE) can improve downstream tasks like Automatic Speaker Verification (ASV). We introduce a simple novel technique called Self-FiLM to inject self-supervision into existing...

GenVC: Self-Supervised Zero-Shot Voice Conversion
arXiv, 2025
Authors: Cai, Zexin; Xinyuan, Henry Li; Garg, Ashi; García-Perera, Leibny Paola; Duh, Kevin; Khudanpur, Sanjeev; Wiesner, Matthew; Andrews, Nicholas
Affiliations: Human Language Technology Center of Excellence, Johns Hopkins University, United States
Zero-shot voice conversion has recently made substantial progress, but many models still depend on external supervised systems to disentangle speaker identity and linguistic content. Furthermore, current methods often...