Search query: "Institution = Center for Research in Speech and Language Processing"
360 records; showing 21-30

Zero-Shot Singing Voice Conversion Based on Timbre Space Modeling and Excitation Signal Control
18th National Conference on Man-Machine Speech Communication, NCMMSC 2023
Authors: Jiang, Yuan; Chen, Yan-Nian; Liu, Li-Juan; Hu, Ya-Jun; Fang, Xin; Ling, Zhen-Hua
Affiliations: National Engineering Research Center for Speech and Language Information Processing, University of Science and Technology of China, Hefei 230026, China; iFLYTEK Research, iFLYTEK Co. Ltd., Hefei 230088, China
In recent years, singing voice conversion technology has rapidly advanced and is capable of generating high-quality singing voices. However, challenges persist, such as pitch fluctuations and significant differences i...

The Greek podcast corpus: Competitive speech models for low-resourced languages with weakly supervised data
arXiv, 2024
Authors: Paraskevopoulos, Georgios; Tsoukala, Chara; Katsamanis, Athanasios; Katsouros, Vassilis
Affiliations: Institute for Speech and Language Processing, Athena Research Center, Athens, Greece
The development of speech technologies for languages with limited digital representation poses significant challenges, primarily due to the scarcity of available data. This issue is exacerbated in the era of large, da...

Meltemi: The first open Large Language Model for Greek
arXiv, 2024
Authors: Voukoutis, Leon; Roussis, Dimitris; Paraskevopoulos, Georgios; Sofianopoulos, Sokratis; Prokopidis, Prokopis; Papavasileiou, Vassilis; Katsamanis, Athanasios; Piperidis, Stelios; Katsouros, Vassilis
Affiliations: Institute for Speech and Language Processing, Athena Research Center, Artemidos 6 & Epidavrou, Athens, Greece
We describe the development and capabilities of Meltemi 7B, the first open Large Language Model for the Greek language. Meltemi 7B has 7 billion parameters and is trained on a 40 billion token Greek corpus. For the de...

Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection
48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Authors: Zeng, Xiao-Min; Song, Yan; Zhuo, Zhu; Zhou, Yu; Li, Yu-Hong; Xue, Hui; Dai, Li-Rong; McLoughlin, Ian
Affiliations: Alibaba Group, China; University of Science and Technology of China, National Engineering Research Center of Speech and Language Information Processing, Hefei, China; Singapore Institute of Technology, ICT Cluster, Singapore
In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD). GeCo exploits a Predictive AutoEncoder (PAE) equipped with self-attention as a ge...

Arduino Voice Control for Arabic Speech Recognition using Smartphone
6th International Hybrid Conference on Informatics and Applied Mathematics, IAM 2023
Authors: Bakri, Adil; Lounnas, Khaled; Lichouri, Mohamed
Affiliations: Scientific and Technical Research Centre on Arid Regions (CRSTRA), Biskra, Algeria; Scientific Research and Technical Center for the Development of Arabic Language (CRSTDLA), Algiers, Algeria; Speech Communication and Signal Processing Laboratory (LCPTS), Faculty of Electronics and Computer Science, USTHB, Algiers, Algeria
Engaging with our surroundings through voice control has emerged as an increasingly intriguing aspect. This technology is gaining prevalence in our daily lives, whether applied in smart homes, mobile phones, or the co...

RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
arXiv, 2025
Authors: Yan, Shi-Qi; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China
While Retrieval-Augmented Generation (RAG) has exhibited promise in utilizing external knowledge, its generation process heavily depends on the quality and accuracy of the retrieved context. Large language models (LLM...

SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
International Symposium on Chinese Spoken Language Processing
Authors: Yu-Fei Shi; Yang Ai; Ye-Xin Lu; Hui-Peng Du; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei
Assessing the naturalness of speech using mean opinion score (MOS) prediction models has positive implications for the automatic evaluation of speech synthesis systems. Early MOS prediction models took the raw wavefo...

ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
arXiv, 2024
Authors: Zheng, Rui-Chen; Du, Hui-Peng; Jiang, Xiao-Hang; Ai, Yang; Ling, Zhen-Hua
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, China
Current neural audio codecs typically use residual vector quantization (RVQ) to discretize speech signals. However, they often experience codebook collapse, which reduces the effective codebook size and leads to subop...

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
International Symposium on Chinese Spoken Language Processing
Authors: Hui-Peng Du; Yang Ai; Rui-Chen Zheng; Zhen-Hua Ling
Affiliations: National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei
This paper proposes a novel neural audio codec, named APCodec+, which is an improved version of APCodec. The APCodec+ takes the audio amplitude and phase spectra as the coding object, and employs an adversarial trai...

Weakly-Supervised Automated Audio Captioning via Text Only Training
arXiv, 2023
Authors: Kouzelis, Theodoros; Katsouros, Vassilis
Affiliations: Institute for Language and Speech Processing, Athena Research Center, Marousi 15125, Greece
In recent years, datasets of paired audio and captions have enabled remarkable success in automatically generating descriptions for audio clips, namely Automated Audio Captioning (AAC). However, it is labor-intensive ...