ISBN: 9781424425709 (print)
This paper introduces a novel scalable speech coding scheme based on embedded matrix quantization of LSF parameters in an LPC model. In the proposed quantizer, codewords are organized in a tree structure through a cell-merging process, which leads to a fine-grain scalable coder at rates below 900 bps. Near-natural-sounding speech is achieved at very low rates by employing an efficient adaptive dual-band scheme to approximate the LPC excitation signals. Evaluation results, obtained from both overall quality measurement and intelligibility assessment, show that the proposed coder could be a reasonable choice for improving the bottom-line speech quality at low bit rates.
This paper proposes an efficient codebook design for tree-structured vector quantization (TSVQ) that is embedded in nature. We modify two speech coding standards by replacing their original quantizers for line spectral frequencies (LSFs) and/or Fourier magnitudes with TSVQ-based quantizers. The modified coders are fine-granular bit-rate scalable, with a gradual change in quality for the synthetic speech. A fast search encoding algorithm using multistage tree-structured vector quantization (MTVQ) is proposed for quantization of LSFs. The proposed method is compared to the multipath sequential tree-assisted search (MSTS) and to the well-known multipath sequential search (MSS) or M-L search algorithms. (C) 2011 Elsevier B.V. All rights reserved.
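The embedded property of a tree-structured quantizer can be illustrated with a minimal sketch. It is not the codebook design from the paper: the four leaf codewords are made-up values, and the `merge` helper (which simply averages two child centroids) is a hypothetical stand-in for a trained tree. The point is only that any prefix of the encoded bit string still decodes to a valid, coarser codeword.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vector = Tuple[float, ...]


@dataclass
class Node:
    centroid: Vector
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def merge(left: Node, right: Node) -> Node:
    """Hypothetical parent construction: average the two child centroids."""
    centroid = tuple((x + y) / 2 for x, y in zip(left.centroid, right.centroid))
    return Node(centroid, left, right)


def encode(root: Node, x: Vector) -> List[int]:
    """Descend the tree greedily, emitting one bit per level."""
    bits: List[int] = []
    node = root
    while node.left is not None and node.right is not None:
        if sq_dist(x, node.left.centroid) <= sq_dist(x, node.right.centroid):
            bits.append(0)
            node = node.left
        else:
            bits.append(1)
            node = node.right
    return bits


def decode(root: Node, bits: List[int]) -> Vector:
    """Follow as many bits as were received; stop early on a truncated stream."""
    node = root
    for bit in bits:
        child = node.left if bit == 0 else node.right
        if child is None:
            break
        node = child
    return node.centroid


# Illustrative 2-bit tree over four made-up 2-D codewords.
leaves = [Node((0.0, 0.0)), Node((1.0, 1.0)), Node((4.0, 4.0)), Node((5.0, 5.0))]
root = merge(merge(leaves[0], leaves[1]), merge(leaves[2], leaves[3]))

x = (4.8, 5.1)
bits = encode(root, x)
coarse = decode(root, bits[:1])  # only the first bit received
fine = decode(root, bits)        # all bits received
```

Dropping trailing bits lowers the rate one bit at a time while the decoder keeps working, which is the behaviour the abstract calls fine-granular bit-rate scalability.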
SNR scalable speech coding is desirable for a number of network multimedia applications, but relatively few SNR-scalable speech coders exist for operation at rates below 16 kb/s. We investigate several SNR scalable source coding structures and define the new concepts of dependent and independent SNR scalability, where independent SNR scalable coders depend on the core layer coder only through the core layer output. Independent SNR scalable structures offer the possibility of providing bit rate scalable functionality to existing nonscalable coders and standards. We show that the MPEG-4 scalable coders are examples of dependent SNR scalable coders, and we introduce a new independent SNR scalable coder called CELPTree, which has the additional advantage of being low delay. We compare the performance of the MPEG-4 coders and CELPTree for both clean and noisy speech, and we examine the effects of frequency-weighted distortion measures in the enhancement layers of SNR scalable speech coders.
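The idea of independent SNR scalability can be sketched as follows. This is not CELPTree: `core_codec` is a hypothetical stand-in for any nonscalable coder (here a coarse uniform quantizer), the enhancement layer is a plain uniform quantizer on the residual, and the sample values and step sizes are assumptions chosen for illustration. What matters is that the enhancement layer touches the core coder only through its decoded output.

```python
import math
from typing import List


def quantize(samples: List[float], step: float) -> List[int]:
    """Uniform scalar quantizer: map each sample to an integer index."""
    return [round(s / step) for s in samples]


def dequantize(indices: List[int], step: float) -> List[float]:
    return [i * step for i in indices]


def core_codec(samples: List[float]) -> List[float]:
    """Hypothetical black-box core coder; only its output is used below."""
    return dequantize(quantize(samples, 0.5), 0.5)


def encode_enhancement(samples: List[float], core_output: List[float],
                       step: float = 0.1) -> List[int]:
    """Quantize the residual between the input and the core layer output."""
    residual = [s - c for s, c in zip(samples, core_output)]
    return quantize(residual, step)


def decode_enhanced(core_output: List[float], enh_indices: List[int],
                    step: float = 0.1) -> List[float]:
    """Add the decoded residual to the core layer output."""
    return [c + r for c, r in zip(core_output, dequantize(enh_indices, step))]


def snr_db(reference: List[float], test: List[float]) -> float:
    signal = sum(r * r for r in reference)
    noise = sum((r - t) ** 2 for r, t in zip(reference, test))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


# Made-up input samples, for illustration only.
speech = [0.12, -0.37, 0.81, 0.44, -0.93, 0.26]
core_out = core_codec(speech)
enh_bits = encode_enhancement(speech, core_out)
enhanced = decode_enhanced(core_out, enh_bits)
```

Because `encode_enhancement` never inspects the internals of `core_codec`, the same enhancement layer could in principle be placed on top of an existing standard coder without modifying it.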
This paper presents a way of using a linear regression model to produce a single-valued criterion that indicates the perceived importance of each block in a stream of speech blocks. This method is superior to the conventional approach, voice activity detection (VAD), in that it provides a dynamically changing priority value for speech segments with finer granularity. The approach can be used in conjunction with scalable speech coding techniques in the context of IP QoS services to achieve a flexible form of quality control for speech transmission. A simple linear regression model is used to estimate a mean opinion score (MOS) of the various cases of missing speech segments. The estimated MOS is a continuous value that can be mapped to priority levels with arbitrary granularity. Through subjective evaluation, we show the validity of the calculated priority values.
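A minimal sketch of the regression-to-priority pipeline might look like this. The abstract does not say which features the model uses, so the single feature here (a per-block energy value) and the training pairs are invented purely for illustration; `priority_level` is likewise a hypothetical mapping, chosen so that a lower estimated MOS after losing a block yields a higher priority.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_mos(model: Tuple[float, float], feature: float) -> float:
    """Estimated MOS if the block is lost, clamped to the 1..5 MOS scale."""
    intercept, slope = model
    return min(5.0, max(1.0, intercept + slope * feature))


def priority_level(mos_if_lost: float, levels: int) -> int:
    """Map a continuous MOS estimate to one of `levels` priorities.

    0 is least important; levels - 1 is most important.
    """
    damage = (5.0 - mos_if_lost) / 4.0  # 0.0 (harmless) .. 1.0 (severe)
    return min(levels - 1, int(damage * levels))


# Invented training data: (block feature, MOS measured when that block is lost).
features = [0.0, 1.0, 2.0, 3.0]
mos_when_lost = [4.0, 3.5, 3.0, 2.5]
model = fit_line(features, mos_when_lost)

coarse = [priority_level(predict_mos(model, f), levels=4) for f in features]
fine = [priority_level(predict_mos(model, f), levels=16) for f in features]
```

Since the estimate is continuous, the number of levels is a free parameter, in contrast to the two-state output of a voice activity detector.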
ISBN: 1424407281 (print)
Most modern bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this works for some frames, problems arise when the correlation between the low and the high band is insufficient. In these situations, additional high-band information must be sent to the decoder. In this paper, we propose a scalable speech coding method based on the principles of bandwidth extension. The rate selection is based on explicit psychoacoustic criteria, while the bandwidth extension is performed using a constrained MMSE estimation technique. Objective and subjective evaluations indicate that the proposed system performs at a lower average bit rate when compared to other similar algorithms while improving speech quality.
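The per-frame rate decision can be sketched in a heavily simplified form. The paper selects the rate with psychoacoustic criteria and predicts the high band with a constrained MMSE estimator; neither is reproduced here. Instead, this sketch assumes the high-band prediction is already available as an energy in dB and swaps in a fixed tolerance (`tolerance_db`, a hypothetical parameter) as the decision rule. All numbers are made up.

```python
from typing import List, Optional


def choose_side_info(actual_high_db: List[float],
                     predicted_high_db: List[float],
                     tolerance_db: float = 3.0) -> List[Optional[float]]:
    """Per frame, send a correction only when the prediction is too far off.

    None means "no extra bits": the decoder relies on its own prediction.
    """
    side_info: List[Optional[float]] = []
    for actual, predicted in zip(actual_high_db, predicted_high_db):
        error = actual - predicted
        side_info.append(None if abs(error) <= tolerance_db else error)
    return side_info


def decode_high_band(predicted_high_db: List[float],
                     side_info: List[Optional[float]]) -> List[float]:
    """Apply the transmitted correction where one was sent."""
    return [p if c is None else p + c
            for p, c in zip(predicted_high_db, side_info)]


# Made-up per-frame high-band energies (dB) and low-band-based predictions.
actual = [10.0, 20.0, 5.0]
predicted = [11.0, 12.0, 5.5]

side = choose_side_info(actual, predicted)
decoded = decode_high_band(predicted, side)
frames_with_extra_bits = sum(c is not None for c in side)
```

Only the second frame, where the low-band prediction misses badly, costs extra bits, which is how a scheme of this kind keeps the average bit rate low.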