For quantization of line spectral frequencies (LSFs), Gaussian mixture model (GMM) based switched split vector quantization (SSVQ) has been reported as the best-performing intra-frame coding method. However, GMM-SSVQ only partly recovers the correlations between the subvectors of split vector quantization (SVQ). In the proposed GMM-SSVQ with the Karhunen-Loève transform (KLT), KLT-domain quantization is applied to each mixture of GMM-SSVQ using a novel region-clustering algorithm. Compared with SVQ and GMM-SSVQ, the proposed method achieves performance equivalent to 4 bits and 1 bit higher, respectively, in terms of average spectral distortion and outliers, while its computational complexity and memory requirements remain similar to those of GMM-SSVQ.
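As a rough illustration of the KLT-domain step described above, the following sketch decorrelates LSF vectors mixture by mixture before split quantization. Everything here (the mixture count, the use of scikit-learn, variable names) is an illustrative assumption, not the paper's implementation, and the region-clustering algorithm is not reproduced.

```python
# Minimal sketch of per-mixture KLT decorrelation for LSF vectors, as a
# front end to switched split VQ. Mixture count and library choice are
# illustrative assumptions, not details from the paper.
import numpy as np
from sklearn.mixture import GaussianMixture

def klt_decorrelate(lsf_vectors, n_mixtures=8):
    """Assign each LSF vector to a GMM mixture and rotate it into the
    KLT (eigenvector) basis of that mixture's covariance."""
    lsf_vectors = np.asarray(lsf_vectors, dtype=float)
    gmm = GaussianMixture(n_components=n_mixtures,
                          covariance_type='full', random_state=0)
    labels = gmm.fit(lsf_vectors).predict(lsf_vectors)

    transformed = np.empty_like(lsf_vectors)
    klt_bases = []
    for m in range(n_mixtures):
        # Eigendecomposition of the mixture covariance gives the KLT basis.
        _, eigvecs = np.linalg.eigh(gmm.covariances_[m])
        klt_bases.append(eigvecs)
        idx = labels == m
        transformed[idx] = (lsf_vectors[idx] - gmm.means_[m]) @ eigvecs
    return transformed, labels, klt_bases

# Each decorrelated vector would then be split into subvectors and each
# subvector quantized with a per-mixture codebook, as in SSVQ.
```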
ISBN (print): 9781467358057
Split vector quantization (SVQ) performs well and efficiently for line spectral frequency (LSF) quantization, but misses some dependencies between vector components. Switched SVQ (SSVQ) can recover part of this loss by exploiting nonlinear dependencies through Gaussian mixture models (GMMs). The remaining linear dependencies, i.e. correlations between vector components, can be exploited by transform coding. The Karhunen-Loève transform (KLT) is normally used, but its eigendecomposition and full transform matrices make it computationally complex. However, a family of transforms has recently been characterized through the generalized triangular decomposition (GTD) of the source covariance matrix. The prediction-based lower triangular transform (PLT) is the least complex of these transforms and is a building block in the implementation of all of them. This paper proposes a minimum noise structure for PLT SVQ. Coding results for 16-dimensional LSF vectors from wideband speech show that GMM PLT SSVQ can achieve transparent quantization down to 41 bits/frame, with distortion performance close to GMM KLT SSVQ at about three-fourths of the operational complexity. Other members of the GTD family, such as the geometric mean decomposition (GMD) transform and the bidiagonal (BID) transform, fail to capitalize on their advantageous features because of the low bit rate per component in the range tested.
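To make the complexity argument concrete, here is a minimal sketch, assuming a NumPy setting, of how a PLT matrix can be derived from a triangular factorization of the source covariance instead of the eigendecomposition needed for the KLT. This is my own illustration of the transform family discussed above, not the paper's code, and the minimum noise quantization structure itself is not shown.

```python
# Minimal sketch: build a prediction-based lower triangular transform (PLT)
# from a Cholesky factorization of the source covariance.
import numpy as np

def plt_transform(cov):
    """Return the unit-lower-triangular PLT matrix P and the prediction
    error variances d such that P @ cov @ P.T == diag(d)."""
    G = np.linalg.cholesky(cov)   # cov = G @ G.T, G lower triangular
    s = np.diag(G)
    L = G / s                     # unit lower triangular: cov = L diag(s**2) L.T
    # The PLT is the inverse of L; its rows act as prediction error filters.
    P = np.linalg.solve(L, np.eye(cov.shape[0]))
    return P, s ** 2

# Toy check on a 16x16 SPD matrix standing in for an LSF covariance:
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 16))
C = A @ A.T / 16
P, d = plt_transform(C)
print(np.allclose(P @ C @ P.T, np.diag(d), atol=1e-8))  # True
```

Because P is unit lower triangular, applying it needs roughly half the multiplies of a full KLT rotation and can be realized as a causal predictor across the vector components, which is consistent with the complexity advantage the abstract reports for PLT relative to KLT.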