Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation

Authors: Wang, Zhong-Qiu; Wang, Peidong; Wang, DeLiang

Affiliations: Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210, USA; Mitsubishi Elect Res Labs, Cambridge, MA 02139, USA; Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210, USA

Publication: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (IEEE ACM Trans. Audio Speech Lang. Process.)

Year / Volume: 2021, Vol. 29

Pages: 2001-2014

Subject classification: 0808 [Engineering: Electrical Engineering]; 08 [Engineering]; 0702 [Science: Physics]

Funding: NIDCD [R01 DC012048]; NSF [ECCS-1808932]; Ohio Supercomputer Center

Keywords: Geometry; Array signal processing; Speech processing; Microphone arrays; Covariance matrices; Deep learning; Training; Complex spectral mapping; speaker separation; microphone array processing; deep learning

Abstract: We propose multi-microphone complex spectral mapping, a simple way of applying deep learning for time-varying non-linear beamforming, for speaker separation in reverberant conditions. We aim at both speaker separation and dereverberation. Our study first investigates offline utterance-wise speaker separation and then extends to block-online continuous speech separation (CSS). Assuming a fixed array geometry between training and testing, we train deep neural networks (DNN) to predict the real and imaginary (RI) components of target speech at a reference microphone from the RI components of multiple microphones. We then integrate multi-microphone complex spectral mapping with minimum variance distortionless response (MVDR) beamforming and post-filtering to further improve separation, and combine it with frame-level speaker counting for block-online CSS. Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry. State-of-the-art separation performance is obtained on the simulated two-talker SMS-WSJ corpus and the real-recorded LibriCSS dataset.
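
Note: the abstract mentions MVDR beamforming without giving a formula. As a rough illustration only, the sketch below computes MVDR weights for a single frequency bin using a common textbook formulation based on target and noise spatial covariance matrices. The abstract does not say which MVDR variant the authors use or how they estimate the covariances, so the function name `mvdr_weights`, the helper `make_demo`, and the random test data are all assumptions made here for illustration.

```python
import numpy as np

def mvdr_weights(phi_s, phi_n, ref_mic=0):
    """MVDR weights for one frequency bin (covariance-based textbook form).

    phi_s: (M, M) target spatial covariance matrix (assumed given).
    phi_n: (M, M) noise/interference spatial covariance matrix (assumed given).
    ref_mic: index of the reference microphone.
    """
    # Solve phi_n @ X = phi_s rather than forming an explicit inverse.
    numerator = np.linalg.solve(phi_n, phi_s)
    # Pick the reference-microphone column and normalise by the trace.
    return numerator[:, ref_mic] / np.trace(numerator)

def make_demo(num_mics=4, seed=0):
    """Hypothetical toy data: a rank-1 target and a random noise covariance."""
    rng = np.random.default_rng(seed)
    d = rng.standard_normal(num_mics) + 1j * rng.standard_normal(num_mics)
    phi_s = np.outer(d, d.conj())  # rank-1 target covariance d d^H
    a = rng.standard_normal((num_mics, num_mics)) \
        + 1j * rng.standard_normal((num_mics, num_mics))
    phi_n = a @ a.conj().T + np.eye(num_mics)  # Hermitian positive definite
    return d, phi_s, phi_n

d, phi_s, phi_n = make_demo()
w = mvdr_weights(phi_s, phi_n, ref_mic=0)
# Beamformer response to the target direction: w^H d (vdot conjugates w).
response = np.vdot(w, d)
```

For a rank-1 target covariance, `response` should match `d[0]`, meaning the target component at the reference microphone passes through undistorted while the noise power is minimised. In a system like the one described, the covariances would presumably be derived from network outputs rather than random data.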
