Search query: "Institution=Center for Language and Speech Processing & Human Language Technology"
422 records; showing 191-200
An asynchronous WFST-based decoder for automatic speech recognition
arXiv, 2021
Authors: Lv, Hang; Chen, Zhehuai; Xu, Hainan; Povey, Daniel; Xie, Lei; Khudanpur, Sanjeev
Affiliations: School of Computer Science, Northwestern Polytechnical University, Xi'an, China; Center of Language and Speech Processing, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States; Xiaomi Corporation, Beijing, China; SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
We introduce asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models in the one-pass decoding for large vocabulary continuous speech recognition. Unlike standard one-pass...
Frustratingly easy noise-aware training of acoustic models
arXiv, 2020
Authors: Raj, Desh; Villalba, Jesús; Povey, Daniel; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing & Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD 21218, United States; Xiaomi Corp., Beijing, China
Environmental noises and reverberation have a detrimental effect on the performance of automatic speech recognition (ASR) systems. Multi-condition training of neural network-based acoustic models is used to deal with ...
Mixture of speaker-type PLDAs for children's speech diarization
arXiv, 2020
Authors: Xie, Jiamin; Sia, Suzanna; García, Paola; Povey, Daniel; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21218, United States; Xiaomi Corp., Beijing, China
In diarization, the PLDA is typically used to model an inference structure which assumes the variation in speech segments be induced by various speakers. The speaker variation is then learned from the training data. H...
Neural language modeling with implicit cache pointers
arXiv, 2020
Authors: Li, Ke; Povey, Daniel; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21218, United States; Xiaomi Corp., Beijing, China
A cache-inspired approach is proposed for neural language models (LMs) to improve long-range dependency and better predict rare words from long contexts. This approach is a simpler alternative to attention-based point...
SPECTRE: Visual Speech-Informed Perceptual 3D Facial Expression Reconstruction from Videos
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Authors: Panagiotis P. Filntisis; George Retsinas; Foivos Paraperas-Papantoniou; Athanasios Katsamanis; Anastasios Roussos; Petros Maragos
Affiliations: Institute of Robotics, Athena Research Center, Maroussi, Greece; School of Electrical & Computer Engineering, National Technical University of Athens, Greece; Imperial College London, UK; Institute for Language and Speech Processing, Athena R.C., Greece; Institute of Computer Science (ICS), Foundation for Research & Technology - Hellas (FORTH), Greece; College of Engineering, Mathematics and Physical Sciences, University of Exeter, UK
The recent state of the art on monocular 3D face reconstruction from image data has made some impressive advancements, thanks to the advent of Deep Learning. However, it has mostly focused on input coming from a singl...
Target-speaker voice activity detection with improved i-vector estimation for unknown number of speaker
arXiv, 2021
Authors: He, Maokui; Raj, Desh; Huang, Zili; Du, Jun; Chen, Zhuo; Watanabe, Shinji
Affiliations: University of Science and Technology of China, Hefei, China; Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, United States; Microsoft Corp, Redmond, WA, United States
Target-speaker voice activity detection (TS-VAD) has recently shown promising results for speaker diarization on highly overlapped speech. However, the original model requires a fixed (and known) number of speakers, w...
Align or attend? Toward More Efficient and Accurate Spoken Word Discovery Using Speech-to-Image Retrieval
IEEE International Conference on Acoustics, Speech and Signal Processing
Authors: Liming Wang; Xinsheng Wang; Mark Hasegawa-Johnson; Odette Scharenborg; Najim Dehak
Affiliations: University of Illinois at Urbana-Champaign; School of Software Engineering, Xi’an Jiaotong University; Multimedia Computing Group, Delft University of Technology; Center for Language and Speech Processing, Johns Hopkins University
Multimodal word discovery (MWD) is often treated as a byproduct of the speech-to-image retrieval problem. However, our theoretical analysis shows that some kind of alignment/attention mechanism is crucial for a MWD sy...
Training noisy single-channel speech separation with noisy oracle sources: A large gap and a small step
arXiv, 2020
Authors: Maciejewski, Matthew; Shi, Jing; Watanabe, Shinji; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, The Johns Hopkins University, United States; Human Language Technology Center of Excellence, The Johns Hopkins University, United States; Institute of Automation, Chinese Academy of Sciences, China
As the performance of single-channel speech separation systems has improved, there has been a desire to move to more challenging conditions than the clean, near-field speech that initial systems were developed on. Whe...
DOVER-Lap: A method for combining overlap-aware diarization outputs
arXiv, 2020
Authors: Raj, Desh; Garcia-Perera, Leibny Paola; Huang, Zili; Watanabe, Shinji; Povey, Daniel; Stolcke, Andreas; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing & Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD 21218, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21218, United States; Xiaomi Corp., Beijing, China; Amazon Alexa Speech, Sunnyvale, CA, United States
Several advances have been made recently towards handling overlapping speech for speaker diarization. Since speech and natural language tasks often benefit from ensemble techniques, we propose an algorithm for combini...
SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model
arXiv, 2024
Authors: Qian, Xinyuan; Gao, Jiaran; Zhang, Yaodan; Zhang, Qiquan; Liu, Hexin; Garcia, Leibny Paola; Li, Haizhou
Affiliations: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney 2052, Australia; College of Computing and Data Science, Nanyang Technological University, Singapore; Center for Language and Speech Processing, Johns Hopkins University, United States; Guangdong Provincial Key Laboratory of Big Data Computing, The Chinese University of Hong Kong, Shenzhen 518172, China; Shenzhen Research Institute of Big Data, Shenzhen 51872, China
Speech enhancement plays an essential role in various applications, and the integration of visual information has been demonstrated to bring substantial advantages. However, the majority of current research concentrat...