Search criteria: "Institution = Center for Language and Speech Processing and Computer Science"
828 records; showing results 41-50
Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Zeng, Xiao-Min; Song, Yan; Zhuo, Zhu; Zhou, Yu; Li, Yu-Hong; Xue, Hui; Dai, Li-Rong; McLoughlin, Ian
Affiliations: Alibaba Group, China; University of Science and Technology of China, National Engineering Research Center of Speech and Language Information Processing, Hefei, China; Singapore Institute of Technology, ICT Cluster, Singapore
In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD). GeCo exploits a Predictive AutoEncoder (PAE) equipped with self-attention as a ge...
DQ-Data2vec: Decoupling Quantization for Multilingual Speech Recognition
IEEE Transactions on Audio, Speech and Language Processing, 2025, vol. 33, pp. 1337-1348
Authors: Qijie Shao; Linhao Dong; Kun Wei; Sining Sun; Lei Xie
Affiliations: Audio Speech and Language Processing Group (ASLP), School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, China; Bytedance Speech, Beijing Bytedance Technology Company Ltd., Beijing, China
Data2vec is a self-supervised learning (SSL) approach that employs a teacher-student architecture for contextual representation learning via masked prediction, demonstrating remarkable performance in monolingual ASR. ...
Vec-Tok Speech: Speech Vectorization and Tokenization for Neural Speech Generation
IEEE Transactions on Audio, Speech and Language Processing, 2025, vol. 33, pp. 1243-1254
Authors: Xinfa Zhu; Yuanjun Lv; Yi Lei; Tao Li; Wendi He; Hongbin Zhou; Heng Lu; Lei Xie
Affiliations: Audio Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University, Xi'an, China; Ximalaya Inc., Shanghai, China
Language models (LMs) have recently flourished in natural language processing and computer vision, generating high-quality texts and images in various tasks. While current speech LMs have made significant progress, th...
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
arXiv, 2025
Authors: Yan, Shi-Qi; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
While Retrieval-Augmented Generation (RAG) has exhibited promise in utilizing external knowledge, its generation process heavily depends on the quality and accuracy of the retrieved context. Large language models (LLM...
Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
arXiv, 2023
Authors: Shen, Lingfeng; Tan, Weiting; Zheng, Boyuan; Khashabi, Daniel
Affiliations: Center for Language and Speech Processing and Computer Science Department, Johns Hopkins University, Baltimore, MD, United States
With the growing capabilities of large language models, prompting them has become the dominant way to access them. This has motivated the development of strategies for automatically selecting effective language prompt...
ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
arXiv, 2024
Authors: Zheng, Rui-Chen; Du, Hui-Peng; Jiang, Xiao-Hang; Ai, Yang; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, China
Current neural audio codecs typically use residual vector quantization (RVQ) to discretize speech signals. However, they often experience codebook collapse, which reduces the effective codebook size and leads to subop...
SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
International Symposium on Chinese Spoken Language Processing
Authors: Yu-Fei Shi; Yang Ai; Ye-Xin Lu; Hui-Peng Du; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei
Assessing the naturalness of speech using mean opinion score (MOS) prediction models has positive implications for the automatic evaluation of speech synthesis systems. Early MOS prediction models took the raw wavefo...
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
International Symposium on Chinese Spoken Language Processing
Authors: Hui-Peng Du; Yang Ai; Rui-Chen Zheng; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei
This paper proposes a novel neural audio codec, named APCodec+, which is an improved version of APCodec. APCodec+ takes the audio amplitude and phase spectra as the coding object, and employs an adversarial trai...
DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Authors: Qing Wang; Jixun Yao; Zhaokai Sun; Pengcheng Guo; Lei Xie; John H.L. Hansen
Affiliations: Audio Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University, Xian, China; Center for Robust Speech Systems (CRSS), The University of Texas, Dallas, USA
Being a form of biometric identification, the security of the speaker identification (SID) system is of utmost importance. To better understand the robustness of SID systems, we aim to perform more realistic attacks i...
CSDNet: cross-sketch with dual gated attention for fine-grained image captioning network
Multimedia Tools and Applications, 2024, pp. 1-28
Authors: Hossain, Md. Shamim; Aktar, Shamima; Hossen, Md. Bipul; Hossain, Mohammad Alamgir; Gu, Naijie; Huang, Zhangjin
Affiliations: School of Computer Science and Technology, University of Science and Technology of China, Anhui, Hefei 230027, China; Deqing Alpha Innovation Institute, Huzhou 313299, China; Department of Mathematics, Jashore University of Science and Technology, Jashore 7408, Bangladesh; Department of Statistics, Begum Rokeya University, Rangpur 5404, Bangladesh; National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Anhui, Hefei 230027, China
In the realm of extracting inter and intra-modal interactions, contemporary models often face challenges such as reduced computational efficiency, particularly when dealing with lengthy visual sequences. To address th...