Search condition: "Institution=Center for Research in Speech and Language Processing"
360 records; showing 41-50

Refining Self-Supervised Learnt Speech Representation using Brain Activations
arXiv, 2024
Authors: Li, Hengyu; Mei, Kangdi; Liu, Zhaoci; Ai, Yang; Chen, Liping; Zhang, Jie; Ling, Zhenhua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
It was shown in literature that speech representations extracted by self-supervised pre-trained models exhibit similarities with brain activations of human for speech perception and fine-tuning speech representation m...

PITCH-AND-SPECTRUM-AWARE SINGING QUALITY ASSESSMENT WITH BIAS CORRECTION AND MODEL FUSION
arXiv, 2024
Authors: Shi, Yu-Fei; Ai, Yang; Lu, Ye-Xin; Du, Hui-Peng; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
We participated in track 2 of the VoiceMOS Challenge 2024, which aimed to predict the mean opinion score (MOS) of singing samples. Our submission secured the first place among all participating teams, excluding the of...

Designing and Evaluating Speech Emotion Recognition Systems: A Reality Check Case Study with IEMOCAP
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Antoniou, Nikolaos; Katsamanis, Athanasios; Giannakopoulos, Theodoros; Narayanan, Shrikanth
Affiliations: Behavioral Signal Technologies, Los Angeles, CA, United States; Athena Research Center, Institute for Language and Speech Processing, Athens, Greece; SAIL-University of Southern California, Los Angeles, CA, United States
There is an imminent need for guidelines and standard test sets to allow direct and fair comparisons of speech emotion recognition (SER). While resources, such as the Interactive Emotional Dyadic Motion Capture (IEMOC...

Long-Form Speech Translation through Segmentation with Finite-State Decoding Constraints on Large Language Models
arXiv, 2023
Authors: McCarthy, Arya D.; Zhang, Hao; Kumar, Shankar; Stahlberg, Felix; Wu, Ke
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Google Research
One challenge in speech translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we adapt large language models (LLMs)...

Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
arXiv, 2024
Authors: Ai, Yang; Ling, Zhen-Hua
Affiliations: The National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China
This paper presents a novel neural speech phase prediction model which predicts wrapped phase spectra directly from amplitude spectra. The proposed model is a cascade of a residual convolutional network and a parallel...

Towards robust one-shot voice conversion with cycle phonetic posteriorgrams and multi-scale speaker representations
24th International Congress on Acoustics, ICA 2022
Authors: Chen, Yannian; Liu, Lijuan; Hu, Yajun; Ling, Zhenhua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, China; IFLYTEK Research, IFLYTEK Co. Ltd., China
One-shot voice conversion (VC) aims to convert the voice across arbitrary speakers even unseen during training, with only one reference utterance from the target speaker. It is still a challenging task as both content...

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
arXiv, 2024
Authors: Ai, Yang; Jiang, Xiao-Hang; Lu, Ye-Xin; Du, Hui-Peng; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China
This paper introduces a novel neural audio codec targeting high waveform sampling rates and low bitrates named APCodec, which seamlessly integrates the strengths of parametric codecs and waveform codecs. The APCodec r...

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
arXiv, 2024
Authors: Lu, Ye-Xin; Ai, Yang; Du, Hui-Peng; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China
Speech bandwidth extension (BWE) refers to widening the frequency bandwidth range of speech signals, enhancing the speech quality towards brighter and fuller. This paper proposes a generative adversarial network (GAN)...

Aligning Noisy-Clean Speech Pairs at Feature and Embedding Levels for Learning Noise-Invariant Speaker Representations
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Zuoliang Li; Yang Ai; Jie Zhang; Shengyu Peng; Yu Guan; Bin Gu; Wu Guo
Affiliations: The National Engineering Research Center for Speech and Language Information Processing (NERC-SLIP), University of Science and Technology of China (USTC), Hefei, China
In this paper, we propose a noise-invariant speaker representation learning (SRL) approach by aligning noisy-clean speech pairs at both the feature and embedding levels for model training. Specifically, we first const...

Recursive Feature Learning from Pre-Trained Models for Spoofing Speech Detection
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Yu Guan; Yang Ai; Zuoliang Li; Shengyu Peng; Wu Guo
Affiliations: National Engineering Research Center for Speech and Language Information Processing (NERC-SLIP), University of Science and Technology of China (USTC), Hefei, China
It was recently revealed that using features extracted from pre-trained models can achieve much better performance than using conventional hand-crafted acoustic features for spoofing speech detection. In this paper, w...