
Speaker Extraction with Verification of Present and Absent Target Speakers


Authors: Zhang, Ke; Borsdorf, Marvin; Liu, Tianchi; Wang, Shuai; Wei, Yangjie; Li, Haizhou

Affiliations: Key Laboratory of Intelligent Computing in Medical Image, Northeastern University, Shenyang 110819, China; Shenzhen, Guangdong 518000, China; Machine Listening Lab, University of Bremen, Bremen 28359, Germany; Department of Electrical and Computer Engineering, National University of Singapore, Singapore 119077, Singapore

Publication: Journal of Shanghai Jiaotong University (Science) (J. Shanghai Jiaotong Univ. Sci.)

Year/Volume/Issue: 2025

Pages: 1-6


Funding: the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy (University Allowance, EXC 2077, University of Bremen); the National Natural Science Foundation of China (Nos. 62401377 and 62271432); and the Internal Project of Shenzhen Research Institute of Big Data (No. T00120220002)

Subject: Speech recognition

Abstract: Target speaker extraction (TSE) models are expected to extract the target speech from a cocktail party mixture signal. When trained only on samples in which the target speaker is present (PT), these models output noise when the target speaker is absent (AT). One may enhance the TSE quality by providing information about whether a sample is PT or AT. However, the detection of the target speaker is not perfect. In this paper, we propose a new model, TSEV, which performs target speaker extraction and speaker verification simultaneously. The TSEV model outputs an extracted speech signal and generates two speaker embeddings per inference to detect the target speaker. By sharing the speaker encoder and low-level modules, the speaker verification task can be performed in low signal-to-noise ratio scenarios. We train the TSEV model on multi-talker PT and AT conditions with fully overlapped speech. Experiments verify the superiority of jointly performing the two tasks in the proposed model. Compared with the baseline, the TSEV model achieves better verification performance without degrading the extraction performance. © Shanghai Jiao Tong University 2025.
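
Note: the abstract states that two speaker embeddings are produced per inference to detect the target speaker, but it does not say how they are compared. The following is a minimal sketch of one common way such a check could look, assuming cosine similarity scoring against a fixed threshold. The function names, the threshold value, and the silence-gating step are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "no match"
    return dot / (norm_a * norm_b)


def detect_target(emb_a, emb_b, threshold=0.5):
    """Decide present (PT) vs. absent (AT) from two speaker embeddings.

    Assumption: cosine scoring with a hypothetical threshold of 0.5;
    the paper's actual scoring rule is not given in the abstract.
    """
    score = cosine_similarity(emb_a, emb_b)
    return score >= threshold, score


def gate_output(extracted, is_present):
    """Hypothetical post-processing: return silence when the target is absent."""
    return list(extracted) if is_present else [0.0] * len(extracted)
```

In this sketch, a mixture judged AT yields an all-zero output instead of the noise that the abstract reports for extraction models trained only on PT samples. In practice the threshold would be tuned on held-out data.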
