Author affiliation: Department of Artificial Intelligence, Korea University, Seoul, South Korea
Publication: IEEE Transactions on Audio, Speech and Language Processing
Year, volume, issue: 2025, Vol. 33
Pages: 557-569
Funding: Artificial Intelligence Graduate School Program; Korea government (MSIT); Institute of Information & communications Technology Planning & Evaluation (IITP); Artificial Intelligence Innovation Hub Program, funded by the Korea government (MSIT)
Subjects: Optimization; Convex functions; Time-domain analysis; Spectrogram; Signal processing algorithms; Vectors; Sparse approximation; Approximation algorithms; Knowledge engineering; Time-frequency analysis
Abstract: This paper presents a deep phase retrieval algorithm for speech signals based on the dual Alternating Direction Method of Multipliers (ADMM), incorporating a deep prior network that exploits the sparsity of speech signals. The proposed network, named DADMM-net, unfolds the dual ADMM for the $\ell_1$-regularized non-convex optimization problem of phase retrieval with several two-dimensional convolutional neural networks (2D-CNNs). To optimize the deep unfolding network efficiently for high-dimensional parameter vectors, a novel updating scheme referred to as soft coordinate descent (soft-CD) is proposed: each dual parameter update is determined by coordinate-wise interpolation between the current value and the updated value, using weights computed by deep networks in each layer. Numerical simulations on a publicly available dataset confirm the state-of-the-art performance of the proposed method in terms of perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI), with significantly faster convergence than existing methods.
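Illustrative note: the soft-CD update described in the abstract can be pictured as a coordinate-wise blend of the current and the newly computed dual parameters. The sketch below is only a rough illustration under stated assumptions, not the authors' implementation. It assumes the interpolation is a convex combination with one weight in [0, 1] per coordinate. The function name `soft_cd_update` and the fixed example weights are hypothetical; in the paper the weights are computed by deep networks in each layer.

```python
from typing import List, Sequence


def soft_cd_update(
    current: Sequence[float],
    proposed: Sequence[float],
    weights: Sequence[float],
) -> List[float]:
    """Blend current and proposed values coordinate by coordinate.

    Assumption (not confirmed by the abstract): the interpolation is the
    convex combination (1 - w) * current + w * proposed, with w in [0, 1].
    w = 0 keeps the current coordinate, w = 1 accepts the full update.
    """
    if not (len(current) == len(proposed) == len(weights)):
        raise ValueError("all inputs must have the same length")

    blended = []
    for c, p, w in zip(current, proposed, weights):
        if not 0.0 <= w <= 1.0:
            raise ValueError("each weight must lie in [0, 1]")
        blended.append((1.0 - w) * c + w * p)
    return blended


# Hypothetical stand-in for the network-computed weights.
example_weights = [0.0, 0.5, 1.0]
example_result = soft_cd_update([1.0, 2.0, 3.0], [3.0, 2.0, -1.0], example_weights)
```

With all weights set to 1 this sketch reduces to a plain full update of every coordinate, and with weights restricted to 0 or 1 it resembles a hard coordinate selection; intermediate weights give the "soft" behavior the abstract refers to.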