Search criteria: "Institution=Dept. of Computer Science and Engineering & MoE Key Lab of AI"
651 records; showing results 11-20
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Authors: Yang, Guanrou; Ma, Ziyang; Zheng, Zhisheng; Song, Yakun; Niu, Zhikang; Chen, Xie
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, China
Recent years have witnessed significant advancements in self-supervised learning (SSL) methods for speech-processing tasks. Various speech-based SSL models have been developed and present promising performance on a ra...
Predictive SkiM: Contrastive Predictive Coding for Low-Latency Online Speech Separation
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Li, Chenda; Wu, Yifei; Qian, Yanmin
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, China
In online speech separation, there is a trade-off between inherent latency and speech separation performance. When processing the current input audio, looking ahead to more future context usually brings better speech ...
Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech Prediction
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Zhang, Leying; Zhang, Wangyou; Chen, Zhengyang; Qian, Yanmin
Affiliation: Auditory Cognition and Computational Acoustics Lab, MoE Key Lab of Artificial Intelligence, AI Institute, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
The acoustic background plays a crucial role in natural conversation. It provides context and helps listeners understand the environment, but a strong background makes it difficult for listeners to understand spoken w...
Advancing Non-intrusive Suppression on Enhancement Distortion for Noise Robust ASR
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Wang, Wei; Zhao, Siyi; Qian, Yanmin
Affiliation: Auditory Cognition and Computational Acoustics Lab, MoE Key Lab of Artificial Intelligence, AI Institute, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Recent advancements in speech enhancement (SE) techniques have greatly improved speech clarity and intelligibility in challenging acoustic environments. However, integrating SE into automatic speech recognition (ASR) ...
Exploring Time-Frequency Domain Target Speaker Extraction For Causal and Non-Causal Processing
2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Authors: Zhang, Wangyou; Yang, Lei; Qian, Yanmin
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, Department of Computer Science and Engineering, Shanghai, China
In recent years, target speaker extraction (TSE) has drawn increasing interest as an alternative to speech separation in realistic applications. While time-domain methods have been widely used in recent studies to ach...
Robust Audio-Visual ASR with Unified Cross-Modal Attention
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Li, Jiahong; Li, Chenda; Wu, Yifei; Qian, Yanmin
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai, China
Audio-visual speech recognition (AVSR) takes advantage of noise-invariant visual information to improve the robustness of automatic speech recognition (ASR) systems. While previous works mainly focused on the clean co...
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-label Guidance
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Guo, Yiwei; Du, Chenpeng; Chen, Xie; Yu, Kai
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai, China
Although current neural text-to-speech (TTS) models are able to generate high-quality speech, intensity controllable emotional TTS is still a challenging task. Most existing methods need external optimizations for int...
DiffVoice: Text-to-Speech with Latent Diffusion
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Liu, Zhijun; Guo, Yiwei; Yu, Kai
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai, China
In this work, we present DiffVoice, a novel text-to-speech model based on latent diffusion. We propose to first encode speech signals into a phoneme-rate latent representation with a variational autoencoder enhanced b...
Adaptive Large Margin Fine-Tuning For Robust Speaker Verification
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Zhang, Leying; Chen, Zhengyang; Qian, Yanmin
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai, China
Large margin fine-tuning (LMFT) is an effective strategy to improve the speaker verification system's performance and is widely used in speaker verification challenge systems. Because the large margin in the loss ...
Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Gong, Xun; Wang, Wei; Shao, Hang; Chen, Xie; Qian, Yanmin
Affiliation: Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai, China
End-to-end automatic speech recognition (ASR) systems have gained popularity given their simplified architecture and promising results. However, text-only domain adaptation remains a big challenge for E2E systems. Tex...