Refine search results

Document type

  • 288 journal articles
  • 221 conference papers

Collection scope

  • 509 electronic documents
  • 0 print holdings

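Subject classification

  • 318 papers: Engineering
    • 263 papers: Computer Science and Technology...
    • 224 papers: Software Engineering
    • 67 papers: Information and Communication Engineering
    • 47 papers: Bioengineering
    • 31 papers: Control Science and Engineering
    • 24 papers: Electronic Science and Technology (...
    • 21 papers: Electrical Engineering
    • 21 papers: Chemical Engineering and Technology
    • 17 papers: Optical Engineering
    • 16 papers: Biomedical Engineering (degree conferrable in...
    • 9 papers: Mechanical Engineering
    • 6 papers: Mechanics (degree conferrable in Engineering, ...
    • 6 papers: Civil Engineering
    • 5 papers: Instrument Science and Technology
    • 5 papers: Materials Science and Engineering (...
    • 5 papers: Power Engineering and Engineering Thermo...
  • 211 papers: Science
    • 115 papers: Physics
    • 67 papers: Mathematics
    • 57 papers: Biology
    • 20 papers: Chemistry
    • 18 papers: Statistics (degree conferrable in Science, ...
    • 6 papers: Systems Science
    • 4 papers: Geology
  • 65 papers: Management
    • 45 papers: Library, Information and Archives Manage...
    • 21 papers: Management Science and Engineering (...
    • 8 papers: Business Administration
  • 13 papers: Medicine
    • 13 papers: Basic Medicine (degree conferrable in Medicine...
    • 12 papers: Clinical Medicine
    • 10 papers: Pharmacy (degree conferrable in Medicine, ...
  • 12 papers: Law
    • 12 papers: Sociology
  • 2 papers: Economics
  • 1 paper: Education
  • 1 paper: Literature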

Topics

  • 28 papers: speech recogniti...
  • 26 papers: semantics
  • 23 papers: training
  • 18 papers: signal processin...
  • 14 papers: speech enhanceme...
  • 12 papers: acoustics
  • 12 papers: machine learning
  • 12 papers: embeddings
  • 11 papers: computational li...
  • 11 papers: adaptation model...
  • 10 papers: computational mo...
  • 10 papers: syntactics
  • 10 papers: neural machine t...
  • 9 papers: speech processin...
  • 9 papers: feature extracti...
  • 9 papers: degradation
  • 9 papers: robustness
  • 8 papers: self-supervised ...
  • 8 papers: decoding
  • 7 papers: object detection

Institutions

  • 153 papers: moe key lab of a...
  • 131 papers: department of co...
  • 60 papers: key laboratory o...
  • 53 papers: moe key lab of a...
  • 32 papers: department of co...
  • 28 papers: department of co...
  • 28 papers: x-lance lab depa...
  • 23 papers: suzhou laborator...
  • 22 papers: x-lance lab depa...
  • 16 papers: key lab. of shan...
  • 16 papers: research center ...
  • 15 papers: aispeech co. ltd...
  • 15 papers: ji hua laborator...
  • 15 papers: shanghai jiao to...
  • 10 papers: shanghai jiao to...
  • 10 papers: auditory cogniti...
  • 9 papers: kyoto
  • 8 papers: department of co...
  • 8 papers: aispeech ltd
  • 8 papers: microsoft resear...

Authors

  • 106 papers: yu kai
  • 93 papers: zhao hai
  • 61 papers: chen lu
  • 56 papers: qian yanmin
  • 40 papers: zhang zhuosheng
  • 39 papers: yan junchi
  • 38 papers: yanmin qian
  • 36 papers: chen xie
  • 32 papers: li zuchao
  • 28 papers: wu mengyue
  • 23 papers: zhu su
  • 22 papers: guo yiwei
  • 20 papers: kai yu
  • 19 papers: yang xiaokang
  • 18 papers: chen zhengyang
  • 17 papers: xu hongshen
  • 17 papers: du chenpeng
  • 17 papers: junchi yan
  • 16 papers: cao ruisheng
  • 16 papers: ma ziyang

Language

  • 464 papers: English
  • 45 papers: Other
  • 1 paper: Chinese

Search criteria: "Institution=Dep. of Computer Science and Engineering & MoE Key Lab of AI"
509 records; showing 121-130

DiffVoice: Text-to-Speech with Latent Diffusion
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Zhijun Liu, Yiwei Guo, Kai Yu. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, X-Lance Lab, Shanghai Jiao Tong University, Shanghai, China
In this work, we present DiffVoice, a novel text-to-speech model based on latent diffusion. We propose to first encode speech signals into a phoneme-rate latent representation with a variational autoencoder enhanced b...

ACOUSTIC BPE FOR SPEECH GENERATION WITH DISCRETE TOKENS
arXiv, 2023
Authors: Shen, Feiyu; Guo, Yiwei; Du, Chenpeng; Chen, Xie; Yu, Kai. Affiliation: MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Discrete audio tokens derived from self-supervised learning models have gained widespread usage in speech generation. However, current practice of directly utilizing audio tokens poses challenges for sequence modeling...

Complementary Classifier Induced Partial label Learning
arXiv, 2023
Authors: Jia, Yuheng; Si, Chongjie; Zhang, Min-Ling. Affiliations: Key Laboratory of Computer Network and Information Integration, School of Computer Science and Engineering, Nanjing 210096, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai 200240, China
In partial label learning (PLL), each training sample is associated with a set of candidate labels, among which only one is valid. The core of PLL is to disambiguate the candidate labels to get the ground-truth one. I...

Light-Weight Visualvoice: Neural Network Quantization On Audio Visual Speech Separation
Acoustics, Speech, and Signal Processing Workshops (ICASSPW), IEEE International Conference on
Authors: Yifei Wu, Chenda Li, Yanmin Qian. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University, Shanghai, China
As multi-modal systems show superior performance on more tasks, the huge amount of computational resources they need becomes one of the critical problems to be solved. In this work, we explore neural network quantizat...

Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Xun Gong, Wei Wang, Hang Shao, Xie Chen, Yanmin Qian. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University, Shanghai, China
End-to-end automatic speech recognition (ASR) systems have gained popularity given their simplified architecture and promising results. However, text-only domain adaptation remains a big challenge for E2E systems. Tex...

Exploring Binary Classification Loss for Speaker Verification
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Bing Han, Zhengyang Chen, Yanmin Qian. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University, Shanghai, China
The mismatch between close-set training and open-set testing usually leads to significant performance degradation for speaker verification task. For existing loss functions, metric learning-based objectives depend str...

DIFFVOICE: TEXT-TO-SPEECH WITH LATENT DIFFUSION
arXiv, 2023
Authors: Liu, Zhijun; Guo, Yiwei; Yu, Kai. Affiliation: MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
In this work, we present DiffVoice, a novel text-to-speech model based on latent diffusion. We propose to first encode speech signals into a phoneme-rate latent representation with a variational autoencoder enhanced b...

VOICEFLOW: EFFICIENT TEXT-TO-SPEECH WITH RECTIFIED FLOW MATCHING
arXiv, 2023
Authors: Guo, Yiwei; Du, Chenpeng; Ma, Ziyang; Chen, Xie; Yu, Kai. Affiliation: X-LANCE Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai, China
Although diffusion models in text-to-speech have become a popular choice due to their strong generative ability, the intrinsic complexity of sampling from diffusion models harms their efficiency. Alternatively, we pro...

Code-Switching Text Generation and Injection in Mandarin-English ASR
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng. Affiliations: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University; Microsoft Corporation
Code-switching speech refers to a means of expression by mixing two or more languages within a single utterance. Automatic Speech Recognition (ASR) with End-to-End (E2E) modeling for such speech can be a challenging t...

EXPLORING BINARY CLASSIFICATION LOSS FOR SPEAKER VERIFICATION
arXiv, 2023
Authors: Han, Bing; Chen, Zhengyang; Qian, Yanmin. Affiliation: MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
The mismatch between close-set training and open-set testing usually leads to significant performance degradation for speaker verification task. For existing loss functions, metric learning-based objectives depend str...