Author affiliations: Institute for Research and Development on Bioengineering and Bioinformatics, CONICET-UNER, Oro Verde, Argentina; Department of Electronic Engineering and the Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
Publication: IEEE Transactions on Audio, Speech and Language Processing
Year/Volume: 2024, Vol. 33
Pages: 152-162
Funding: Consejo Nacional de Investigaciones Científicas y Técnicas; Ministerio de Ciencia, Tecnología e Innovación; Universidad Nacional de Entre Ríos; National Institutes of Health; Agencia Nacional de Investigación y Desarrollo
Keywords: Filters; Filtering; Kernel; Cost function; Acoustics; Predictive models; Lips; Estimation; Prediction algorithms; Frequency estimation
Abstract: Voice inverse filtering methods aim to noninvasively estimate glottal source information from the voice signal. These inverse filtering strategies typically rely on parametric models and variants of linear prediction for tuning the vocal tract filter. Weighted linear prediction schemes have proved to be the best performing for inverse filtering applications. However, linear prediction and its variants are sensitive to the impulse-like acoustic excitations triggered by the abrupt glottal closure during voiced phonation. The present study examines maximum correntropy criterion-based linear prediction (MCLP) for voice inverse filtering. Correntropy is a nonlinear, localized similarity measure inherently insensitive to peak-like outliers. Here, a theoretical framework is established for studying the properties of correntropy relevant to voice inverse filtering and for developing an algorithm to estimate vocal tract filter coefficients. The proposed algorithm results in a robust weighted linear prediction, where a correntropy weighting function is adjusted iteratively by a data-driven optimization scheme. The effects of the correntropy kernel parameters on the performance of the MCLP method are analyzed. The MCLP method is characterized for voice inverse filtering using synthetic and natural sustained vowel signals. Simulations show that MCLP naturally overweights samples in the glottal closed phase, where the phonation model is more accurate. MCLP requires neither prior information about the glottal instants nor a predefined weighting function. Results show that MCLP performs similarly to, or better than, other well-established inverse filtering methods based on weighted linear prediction.
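The abstract describes the proposed estimator as a robust weighted linear prediction in which a correntropy weighting function is adjusted iteratively by a data-driven optimization scheme. The sketch below illustrates that general idea in NumPy as an iteratively reweighted least-squares update with a Gaussian kernel; the function name, default prediction order, kernel width sigma, ridge term, and stopping rule are illustrative assumptions, not the paper's exact algorithm or parameter choices.

```python
import numpy as np

def mclp_coefficients(x, order=12, sigma=0.1, n_iter=20, tol=1e-8):
    """Correntropy-weighted linear prediction via iterative reweighting (sketch).

    Starting from ordinary least-squares LP, each iteration recomputes a
    Gaussian-kernel weight for every sample from its current prediction
    residual, so impulse-like residuals (e.g. near glottal closure) are
    downweighted, and then solves the weighted normal equations.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Row i of X holds the `order` past samples used to predict y[i] = x[i + order].
    X = np.column_stack([x[order - k - 1:n - k - 1] for k in range(order)])
    y = x[order:]
    a = np.linalg.lstsq(X, y, rcond=None)[0]        # ordinary LP as initialization
    w = np.ones_like(y)
    for _ in range(n_iter):
        r = y - X @ a                               # prediction residuals
        w = np.exp(-r ** 2 / (2.0 * sigma ** 2))    # Gaussian correntropy weights
        Xw = X * w[:, None]
        # Weighted normal equations with a tiny ridge term for numerical safety.
        a_new = np.linalg.solve(X.T @ Xw + 1e-12 * np.eye(order), Xw.T @ y)
        if np.linalg.norm(a_new - a) < tol:
            a = a_new
            break
        a = a_new
    return a, w
```

The returned weights tend to be largest where the residual is small, which for voiced speech corresponds to the glottal closed phase mentioned in the abstract; the paper's data-driven selection of the kernel parameter is not reproduced here.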