Search query: "Institution=Center for Language and Speech Processing & Human Language Technology"
422 records; showing results 141-150
Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream end-to-end ASR
IEEE Spoken Language Technology Workshop
Authors: Ruizhi Li, Gregory Sell, Hynek Hermansky
Affiliations: Center for Language and Speech Processing, The Johns Hopkins University, USA; Human Language Technology Center of Excellence, The Johns Hopkins University, USA
Performance degradation of an Automatic Speech Recognition (ASR) system is commonly observed when the test acoustic condition is different from training. Hence, it is essential to make ASR systems robust against vario...
Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization
arXiv, 2024
Authors: He, Mao-Kui; Du, Jun; Niu, Shu-Tong; Liu, Qing-Feng; Lee, Chin-Hui
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Anhui, Hefei, China; IFlytek, Hefei, Anhui, China; School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
In this paper, we propose a quality-aware end-to-end audio-visual neural speaker diarization framework, which comprises three key techniques. First, our audio-visual model takes both audio and visual features as input...
Wav2f0: Exploring the Potential of Wav2vec 2.0 for Speech Fundamental Frequency Extraction
International Symposium on Chinese Spoken Language Processing
Authors: Rui Feng, Yin-Long Liu, Zhen-Hua Ling, Jia-Hong Yuan
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, P. R. China; Interdisciplinary Research Center for Linguistic Sciences, University of Science and Technology of China, Hefei, P. R. China
Speech fundamental frequency (F0) extraction is one of the most important tasks in speech signal processing. This paper aims to explore the feasibility of using deep learning for speech fundamental frequency extractio...
SHINE: Syntax-augmented Hierarchical Interactive Encoder for Zero-shot Cross-lingual Information Extraction
arXiv, 2023
Authors: Ma, Jun-Yu; Gu, Jia-Chen; Ling, Zhen-Hua; Liu, Quan; Liu, Cong; Hu, Guoping
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; State Key Laboratory of Cognitive Intelligence, China; iFLYTEK Research, Hefei, China
Zero-shot cross-lingual information extraction (IE) aims at constructing an IE model for some low-resource target languages, given annotations exclusively in some rich-resource languages. Recent studies based on langu...
GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding
arXiv, 2023
Authors: Gu, Jia-Chen; Ling, Zhen-Hua; Liu, Quan; Liu, Cong; Hu, Guoping
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; State Key Laboratory of Cognitive Intelligence, China; iFLYTEK Research, Hefei, China
Addressing the issues of who saying what to whom in multi-party conversations (MPCs) has recently attracted a lot of research attention. However, existing methods on MPC understanding typically embed interlocutors and...
TEGTOK: Augmenting Text Generation via Task-specific and Open-world Knowledge
arXiv, 2022
Authors: Tan, Chao-Hong; Gu, Jia-Chen; Tao, Chongyang; Ling, Zhen-Hua; Xu, Can; Hu, Huang; Geng, Xiubo; Jiang, Daxin
Affiliations: National Engineering Research Center for Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; Microsoft, Beijing, China
Generating natural and informative texts has been a long-standing problem in NLP. Much effort has been dedicated into incorporating pre-trained language models (PLMs) with various open-world knowledge, such as knowled...
Focus on the Present: A Regularization Method for the ASR Source-Target Attention Layer
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Nanxin Chen, Piotr Żelasko, Jesús Villalba, Najim Dehak
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD
This paper introduces a novel method to diagnose the source-target attention in state-of-the-art end-to-end speech recognition models with joint connectionist temporal classification (CTC) and attention training. Our ...
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models
arXiv, 2021
Authors: Wiesner, Matthew; Raj, Desh; Khudanpur, Sanjeev
Affiliations: Human Language Technology Center of Excellence, Johns Hopkins University, United States; Center for Language and Speech Processing, Johns Hopkins University, United States
Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources in fine-tuning these models. We demonstrate how universal phoneset aco...
Two-stage augmentation and adaptive CTC fusion for improved robustness of multi-stream end-to-end ASR
arXiv, 2021
Authors: Li, Ruizhi; Sell, Gregory; Hermansky, Hynek
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
Performance degradation of an Automatic Speech Recognition (ASR) system is commonly observed when the test acoustic condition is different from training. Hence, it is essential to make ASR systems robust against vario...
Radically old way of computing spectra: Applications in end-to-end ASR
arXiv, 2021
Authors: Sadhu, Samik; Hermansky, Hynek
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States
We propose a technique to compute spectrograms using Frequency Domain Linear Prediction (FDLP) that uses all-pole models to fit the squared Hilbert envelope of speech in different frequency sub-bands. The spectrogram ...