Search Results - Inner Mongolia University Library

Single-Channel Speech Separation Based on Non-negative Matrix Factorization and Factorial Conditional Random Field

Chinese Journal of Electronics, 2018, Vol. 27, No. 5, pp. 1063-1070

Authors: LI Xu, TU Ming, WANG Xiaofei, WU Chao, FU Qiang, YAN Yonghong. Affiliations: Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences; Signal Analysis Representation and Perception Laboratory, Arizona State University; Xinjiang Laboratory of Minority Speech and Language Information Processing

A new non-negative matrix factorization (NMF) based algorithm is proposed for single-channel speech separation with a priori known speakers, which aims to better model the spectral structure and temporal continuity of the speech signal. First, NMF and k-means clustering are employed to obtain multiple small dictionaries, as well as a state sequence that describes the temporal dynamics between these dictionaries, for each ***, a factorial conditional random field (FCRF) model is trained using the state sequences and dictionaries to jointly model the temporal continuity of the two speakers' mixed signal for separation. Experiments show that the proposed algorithm outperforms the baselines on all metrics: sparse NMF (+1.12 dB SDR, +2.37 dB SIR, +0.40 dB SAR, +0.2 MOS), the non-negative factorial hidden Markov model (+2.04 dB SDR, +4.26 dB SIR, +0.62 dB SAR, +1.0 MOS), and standard NMF (+2.8 dB SDR, +5.08 dB SIR, +1.06 dB SAR, +1.2 MOS).
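The abstract builds on standard NMF, which approximates a non-negative magnitude spectrogram V by a dictionary W times activations H. As a rough illustration of that building block only (not the paper's full method, which adds k-means state modelling and an FCRF), the sketch below uses the classic multiplicative updates for the Euclidean cost. It is written in plain Python so it runs without dependencies; the function names, the toy matrix `v`, and the choice of cost function are assumptions made for this example.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor V (freq x frames) into W (freq x rank) and H (rank x frames)."""
    rng = random.Random(seed)
    n_freq, n_frames = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    h = [[rng.random() + eps for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(n_frames)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(rank)]
             for f in range(n_freq)]
    return w, h

# Toy "spectrogram" (3 frequency bins x 4 frames); values are made up.
v = [[1.0, 0.0, 2.0, 1.0],
     [2.0, 1.0, 4.0, 3.0],
     [0.0, 3.0, 0.0, 3.0]]
w0, h0 = nmf(v, rank=2, n_iter=0)   # random initialisation only
w, h = nmf(v, rank=2, n_iter=200)   # after the multiplicative updates
err_init = frobenius_error(v, w0, h0)
err_final = frobenius_error(v, w, h)
print(err_init, err_final)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout. In practice a numerical library would replace the hand-written matrix helpers.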

Keywords: Single-channel speech separation; Non-negative matrix factorization; Factorial conditional random field; k-means clustering

Single-Channel Speech Separation Algorithm Based on NMF and FCRF

The 13th National Conference on Man-Machine Speech Communication (NCMMSC2015)

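Authors: Li Xu (李煦), Tu Ming (屠明), Wu Chao (吴超), Guo Yanmeng (国雁萌), Na Yueyue (纳跃跃), Fu Qiang (付强), Yan Yonghong (颜永红). Affiliations: Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences; Signal Analysis Representation and Perception Laboratory, Arizona State University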

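In recent years, non-negative matrix factorization (NMF) has been widely applied to single-channel speech separation. However, the standard NMF algorithm assumes that adjacent frames of speech are independent of one another, so it cannot represent the temporal continuity of the speech signal. To address this, this paper proposes a new speech separation algorithm. NMF is first combined with k-means clustering to model the spectral structure and temporal continuity of clean speech. The resulting model is then used to train a factorial conditional random field (FCRF), which in turn separates the mixed speech signal. The results show that the proposed algorithm clearly improves on objective metrics compared with NMF-based algorithms that do not take the temporal continuity of speech into account, such as the Active-Set Newton Algorithm (ASNA).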
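To make the idea of capturing temporal continuity with k-means more concrete, here is a minimal sketch under stated assumptions: spectral frames are clustered with plain k-means, the per-frame cluster labels are read as a state sequence, and state-to-state transitions are counted. The transition count table, the function names, and the toy frames are assumptions for illustration; the record does not say how the paper turns the clusters into dictionaries or FCRF features.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(frames, k, n_iter=50, seed=0):
    """Plain k-means; returns (centroids, labels), one label per frame."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    labels = [0] * len(frames)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every frame.
        labels = [min(range(k), key=lambda c: squared_distance(f, centroids[c]))
                  for f in frames]
        # Update step: mean of the frames assigned to each centroid.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def transition_counts(labels, k):
    """Count how often state i is followed by state j in the label sequence."""
    counts = [[0] * k for _ in range(k)]
    for prev, cur in zip(labels, labels[1:]):
        counts[prev][cur] += 1
    return counts

# Six toy 2-bin "frames" in time order: three of one kind, then three of another.
frames = [[1.0, 0.1], [0.9, 0.2], [1.1, 0.0],
          [0.1, 1.0], [0.0, 0.9], [0.2, 1.1]]
centroids, states = kmeans(frames, k=2)
counts = transition_counts(states, k=2)
print(states, counts)
```

The label sequence shows which cluster is active in each frame, and the counts summarise how states follow one another over time, which is the kind of dependency that frame-independent NMF ignores.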

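Keywords: Single-channel speech separation; Factorial conditional random field; Non-negative matrix factorization; k-means clustering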
