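Refine search results

Document type

  • 267 papers: Conference
  • 155 papers: Journal articles

Collection scope

  • 422 papers: Electronic documents
  • 0 titles: Print holdings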

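Subject classification

  • 282 papers: Engineering
    • 184 papers: Computer Science and Technology...
    • 164 papers: Software Engineering
    • 111 papers: Information and Communication Engineering
    • 28 papers: Bioengineering
    • 27 papers: Electronic Science and Technology (may...
    • 24 papers: Electrical Engineering
    • 23 papers: Control Science and Engineering
    • 21 papers: Instrument Science and Technology
    • 19 papers: Chemical Engineering and Technology
    • 11 papers: Mechanical Engineering
    • 8 papers: Biomedical Engineering (may be awarded...
    • 6 papers: Optical Engineering
    • 5 papers: Architecture
    • 4 papers: Civil Engineering
    • 3 papers: Materials Science and Engineering (may...
  • 176 papers: Science
    • 137 papers: Physics
    • 56 papers: Mathematics
    • 31 papers: Biology
    • 19 papers: Chemistry
    • 16 papers: Statistics (may be awarded in Science, ...
    • 8 papers: Systems Science
  • 44 papers: Management
    • 37 papers: Library, Information and Archives Manage...
    • 7 papers: Management Science and Engineering (may...
  • 11 papers: Law
    • 11 papers: Sociology
  • 8 papers: Medicine
    • 7 papers: Clinical Medicine
    • 6 papers: Basic Medicine (may be awarded in Medicine...
    • 5 papers: Pharmacy (may be awarded in Medicine, Science...
  • 7 papers: Literature
    • 6 papers: Chinese Language and Literature
    • 5 papers: Foreign Languages and Literatures
  • 4 papers: Education
    • 4 papers: Education
  • 3 papers: Agriculture
  • 2 papers: Art Studies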

Topics

  • 59 papers: speech recogniti...
  • 52 papers: training
  • 33 papers: acoustics
  • 31 papers: speech
  • 20 papers: speech processin...
  • 19 papers: feature extracti...
  • 18 papers: hidden markov mo...
  • 18 papers: signal processin...
  • 16 papers: computational mo...
  • 15 papers: conferences
  • 14 papers: speech enhanceme...
  • 13 papers: predictive model...
  • 13 papers: decoding
  • 12 papers: machine translat...
  • 11 papers: speech synthesis
  • 10 papers: training data
  • 10 papers: neural networks
  • 10 papers: data models
  • 9 papers: transformers
  • 9 papers: self-supervised ...

Institutions

  • 71 papers: national enginee...
  • 51 papers: human language t...
  • 46 papers: center for langu...
  • 31 papers: human language t...
  • 21 papers: center for langu...
  • 21 papers: center for langu...
  • 13 papers: center for langu...
  • 11 papers: iflytek research
  • 10 papers: center for langu...
  • 9 papers: ict cluster sing...
  • 9 papers: human language t...
  • 8 papers: national enginee...
  • 8 papers: center for langu...
  • 8 papers: human language t...
  • 7 papers: center for langu...
  • 7 papers: human language t...
  • 7 papers: university of sc...
  • 7 papers: xiaomi corp.
  • 6 papers: university of sc...
  • 6 papers: state key labora...

Authors

  • 49 papers: ling zhen-hua
  • 47 papers: khudanpur sanjee...
  • 35 papers: dehak najim
  • 32 papers: ai yang
  • 29 papers: sanjeev khudanpu...
  • 23 papers: zhen-hua ling
  • 23 papers: dredze mark
  • 19 papers: povey daniel
  • 19 papers: yang ai
  • 18 papers: villalba jesús
  • 18 papers: van durme benjam...
  • 18 papers: daniel povey
  • 17 papers: post matt
  • 16 papers: hermansky hynek
  • 16 papers: lu ye-xin
  • 15 papers: zelasko piotr
  • 14 papers: du hui-peng
  • 13 papers: raj desh
  • 13 papers: gu jia-chen
  • 13 papers: watanabe shinji

Language

  • 344 papers: English
  • 78 papers: Other
  • 2 papers: Chinese

Search query: "Institution=Center for Language and Speech Processing & Human Language Technology"
422 records; showing results 181-190

Learning speaker embedding from text-to-speech
arXiv, 2020
Authors: Cho, Jaejin; Zelasko, Piotr; Villalba, Jesús; Watanabe, Shinji; Dehak, Najim
Affiliations: Center for Language and Speech Processing; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States
Zero-shot multi-speaker Text-to-Speech (TTS) generates target speaker voices given an input text and the corresponding speaker embedding. In this work, we investigate the effectiveness of the TTS reconstruction object...

Wake Word Detection with Streaming Transformers
IEEE International Conference on Acoustics, Speech and Signal Processing
Authors: Yiming Wang; Hang Lv; Daniel Povey; Lei Xie; Sanjeev Khudanpur
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA; School of Computer Science, Northwestern Polytechnical University, Xi’an, China; Xiaomi Corporation, Beijing, China; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
Modern wake word detection systems usually rely on neural networks for acoustic modeling. Transformers have recently shown superior performance over LSTM and convolutional networks in various sequence modeling tasks wi...

Eye movement patterns are similar during accurate multiple-target tracking
International Conference on Cognitive Infocommunications (CogInfoCom)
Authors: Kamyar Bagha; Shiva Kamkar; Hamid Abrishami Moghaddam; Lauri Oksama; Jie Li; Jukka Hyönä
Affiliations: Computer Engineering Department, Khatam University, Tehran, Iran; Machine Vision and Medical Image Processing (MVMIP) Laboratory, Faculty of Electrical Engineering, K.N.Toosi University of Technology, Tehran, Iran; Center for International Scientific Studies and Collaboration (CISSC), Tehran, Iran; Department of Psychology and Speech-Language Pathology, University of Turku, Turku, Finland; Center for Cognition and Brain Disorders, Hangzhou Normal University, Hangzhou, China
Understanding how the brain works is a base of cognitive info-communication. To this aim we focus on multiple target tracking (MTT) as a key task that involves two important cognitive factors, attention and memory. Hu...

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge
arXiv, 2020
Authors: Arora, Ashish; Raj, Desh; Subramanian, Aswin Shanmugam; Li, Ke; Ben-Yair, Bar; Maciejewski, Matthew; Zelasko, Piotr; García, Paola; Watanabe, Shinji; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing & Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21218, United States
This paper summarizes the JHU team’s efforts in tracks 1 and 2 of the CHiME-6 challenge for distant multi-microphone conversational speech diarization and recognition in everyday home environments. We explore multi-a...

WIDER & CLOSER: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition
arXiv, 2022
Authors: Ma, Jun-Yu; Chen, Beiduo; Gu, Jia-Chen; Ling, Zhen-Hua; Guo, Wu; Liu, Quan; Chen, Zhigang; Liu, Cong
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; State Key Laboratory of Cognitive Intelligence, China; iFLYTEK Research, Hefei, China; Jilin Kexun Information Technology Co. Ltd, China
Zero-shot cross-lingual named entity recognition (NER) aims at transferring knowledge from annotated and rich-resource data in source languages to unlabeled and lean-resource data in target languages. Existing mainstr...

CopyPaste: An augmentation method for speech emotion recognition
arXiv, 2020
Authors: Pappagari, Raghavendra; Villalba, Jesús; Zelasko, Piotr; Moro-Velazquez, Laureano; Dehak, Najim
Affiliations: Center for Language and Speech Processing, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States
Data augmentation is a widely used strategy for training robust machine learning models. It partially alleviates the problem of limited data for tasks like speech emotion recognition (SER), where collecting data is ex...

OOV Recovery with Efficient 2nd Pass Decoding and Open-vocabulary Word-level RNNLM Rescoring for Hybrid ASR
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Xiaohui Zhang; Daniel Povey; Sanjeev Khudanpur
Affiliations: Facebook AI, US; Center for Language and Speech Processing & Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD, US
In this paper, we investigate out-of-vocabulary (OOV) word recovery in hybrid automatic speech recognition (ASR) systems, with emphasis on dynamic vocabulary expansion for both Weighted Finite State Transducer (WFST)-ba...

Multi-class spectral clustering with overlaps for speaker diarization
arXiv, 2020
Authors: Raj, Desh; Huang, Zili; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, United States; Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD 21218, United States
This paper describes a method for overlap-aware speaker diarization. Given an overlap detector and a speaker embedding extractor, our method performs spectral clustering of segments informed by the output of the overl...

Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis
IEEE Spoken Language Technology Workshop
Authors: Desh Raj; Pavel Denisov; Zhuo Chen; Hakan Erdogan; Zili Huang; Maokui He; Shinji Watanabe; Jun Du; Takuya Yoshioka; Yi Luo; Naoyuki Kanda; Jinyu Li; Scott Wisdom; John R. Hershey
Affiliations: Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD; Institute for Natural Language Processing, University of Stuttgart, Germany; Microsoft Corp, Redmond, WA; Google Research, Cambridge, MA; University of Science and Technology of China, Hefei, China; Columbia University, NY
Multi-speaker speech recognition of unsegmented recordings has diverse applications such as meeting transcription and automatic subtitle generation. With technical advances in systems dealing with speech separation, s...

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition
arXiv, 2022
Authors: Żelasko, Piotr; Feng, Siyuan; Velázquez, Laureano Moro; Abavisani, Ali; Bhati, Saurabhchand; Scharenborg, Odette; Hasegawa-Johnson, Mark; Dehak, Najim
Affiliations: Center of Language and Speech Processing, The Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218, United States; Human Language Technology Center of Excellence, The Johns Hopkins University, 810 Wyman Park Drive, Baltimore, MD 21218, United States; Multimedia Computing Group, Delft University of Technology, Van Mourik Broekmanweg 6, Delft 2628 XE, Netherlands; Department of Electrical and Computer Engineering, University of Illinois, 405 N Mathews, Urbana, IL 61801, United States
The high cost of data acquisition makes Automatic Speech Recognition (ASR) model training problematic for most existing languages, including languages that do not even have a written script, or for which the phone inv...