The modified discrete cosine transform (MDCT) is widely employed in transform-coding schemes as the analysis/synthesis filter bank. In this paper, an efficient algorithm for MDCT and inverse MDCT (IMDCT) computation for the MPEG-1 Audio Layer III and MPEG-2 international audio-coding standards is proposed, using only the type-II DCT. Finally, the proposed algorithm is compared with similar algorithms. (C) 2005 Elsevier B.V. All rights reserved.
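For orientation, the analysis/synthesis role of the MDCT can be sketched directly from its textbook definition. This is a minimal O(N^2) illustration, not the DCT-II-based fast algorithm the abstract proposes; the sine window, the 2/N synthesis scaling, and all function names are assumptions chosen for the example.

```python
import math

def sine_window(N):
    # Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame):
    # Direct MDCT: 2N input samples -> N coefficients.
    N = len(frame) // 2
    n0 = 0.5 + N / 2
    return [
        sum(frame[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]

def imdct(coeffs):
    # Direct IMDCT: N coefficients -> 2N time-aliased samples (2/N scaling
    # is the convention that pairs with a power-complementary window).
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [
        (2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                        for k in range(N))
        for n in range(2 * N)
    ]

def analysis_synthesis(signal, N):
    # Windowed MDCT -> IMDCT -> windowed overlap-add with hop size N.
    # Aliasing cancels wherever two consecutive frames overlap.
    w = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        frame = [signal[start + n] * w[n] for n in range(2 * N)]
        rec = imdct(mdct(frame))
        for n in range(2 * N):
            out[start + n] += rec[n] * w[n]
    return out
```

Only samples covered by two overlapping frames are reconstructed exactly; the first and last N samples of the signal carry uncancelled aliasing in this sketch.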
In this paper, we present a novel audio coder using the discrete wavelet transform (DWT) and warped linear prediction (WLP). In contrast to conventional LP, WLP allows for the control of frequency resolution to closely match the response of the human auditory system. The structure of the system is similar to the transform coded excitation techniques used in wideband speech coding, where LP has been replaced with WLP, and the residual is analyzed by a wavelet filterbank designed to approximate the critical bands. The inherent shaping of the WLP synthesis filter and a controlled bit allocation to the wavelet coefficients help minimise the perceptually significant noise due to the quantization error in the residual. For monophonic signals sampled at 44.1 kHz, the coder achieves near-transparent to transparent quality for a variety of speech and music signals at an average bitrate of about 64 kb/s. Tests also show that the coder (in its initial implementation) delivers superior quality to MPEG Layer III and comparable quality to the MPEG-2 AAC codec when operating at the same bitrate.
In recent decades, digital video and audio coding technologies have helped revolutionize the ways we create, deliver and consume audiovisual content. This is exemplified by digital television (DTV), which is emerging as a captivating new program and data broadcasting service. This paper provides an overview of the video and audio coding subsystems of the Advanced Television Systems Committee (ATSC) DTV standard. We first review the motivation for data compression in digital broadcasting. The MPEG-2 video and AC-3 audio compression algorithms are described, with emphasis on basic concepts, system features, and coding performance. Next-generation video and audio codecs currently under consideration for advanced services are also presented.
In 1988 the International Organization for Standardization (ISO) established the Moving Picture Experts Group (MPEG) to develop a digital coding standard for video and audio signals in order to enable interactive video and audio on digital storage media. The MPEG Audio group started with members from 14 research institutions, guided by a chairman, to develop a digital audio coding standard. As a result, the MPEG-1 Layer I, Layer II and Layer III coding standards were developed and proposed in 1992 for coding stereo audio signals at 2 x 192 kbit/s, 2 x 128 kbit/s and 2 x 64 kbit/s, respectively. Later, the abbreviation "mp3" or "MP3" was introduced as a substitute for the long name of the successful MPEG-1 Layer III coding standard. This paper describes the development of the MP3 coding standard and its essential components as contributed by the members of the MPEG Audio group.
This paper considers the problem of selecting a set of parameter values from a given parameter space, in order to perform rate-distortion optimization in the context of audio compression. Due to interdependencies between parameters, separate optimization of parameter values is inherently suboptimal, yet a straightforward brute-force joint search involves prohibitive computational complexity. This work proposes a new method for joint rate-distortion optimization, while accounting for interparameter dependencies. The optimal solution is achieved, at significantly reduced complexity as compared to a brute-force search, by employing a Viterbi search over a trellis. Two objective distortion metrics are specifically considered: the average, and the maximum noise-to-mask ratio. Subjective (AB/MOS) and objective (average/maximum noise-to-mask ratio) tests demonstrate considerable gains at low bit rates of 16 kbps per channel for a 44.1-kHz sampled audio signal using the proposed approach.
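The idea of replacing a brute-force joint search with a Viterbi search over a trellis can be sketched generically. In this minimal illustration each trellis stage is one parameter, each node is a candidate value, and the cost of a path is the sum of per-node costs plus pairwise transition costs that model the dependency between neighbouring parameters. The function name `viterbi_select` and the cost callbacks are hypothetical; the abstract's actual rate and noise-to-mask-ratio costs are not reproduced here.

```python
def viterbi_select(candidates, node_cost, trans_cost):
    """Pick one value per stage so that the total cost is minimal.

    candidates: list of candidate-value lists, one list per stage
    node_cost(t, v): cost of choosing value v at stage t (assumed callback)
    trans_cost(u, v): cost of moving from value u to value v between
                      consecutive stages (assumed callback)
    Returns (total_cost, chosen_values).
    """
    # best[j] = (cost of the cheapest path ending at candidate j, that path)
    best = [(node_cost(0, v), [v]) for v in candidates[0]]
    for t in range(1, len(candidates)):
        new_best = []
        for v in candidates[t]:
            # Keep only the cheapest predecessor for each node (the survivor).
            cost, path = min(
                ((c + trans_cost(p[-1], v), p) for c, p in best),
                key=lambda cp: cp[0],
            )
            new_best.append((cost + node_cost(t, v), path + [v]))
        best = new_best
    return min(best, key=lambda cp: cp[0])
```

With K candidates per stage and T stages, this evaluates on the order of T*K^2 transitions instead of the K^T combinations of an exhaustive search, while returning the same minimum whenever the cost decomposes into node and pairwise terms as assumed above.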
In this letter, we present a decomposition for sinusoidal coding of audio, based on an amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a nonlinear least-squares minimization. Rate-distortion curves and listening tests show that, compared to a constant-amplitude sinusoidal coder, the proposed decomposition offers perceptually significant improvements in critical transient signals.
In this paper, several low-complexity, high-performance rate-distortion control algorithms for MPEG-4 Advanced Audio Coding (AAC) are proposed. One key element in producing good-quality compressed audio, particularly at medium and low rates, is a high-performance rate-distortion controller in the audio encoder. Although the trellis-based rate-distortion control algorithms previously proposed can achieve a praiseworthy performance, their computational complexity is extremely high. Therefore, for practical applications, it is very desirable to achieve a similar performance at a much lower complexity. Two types of techniques are proposed in this paper to reduce the computational burden of the trellis-based algorithms. One splits a very heavy calculation stage into two sequential steps with much less computation. The other reduces the number of candidates in the trellis for the parameter search. Together, when applicable, our approach achieves a similar coding performance (audio quality) but requires less than 1/1000 of the computational complexity.
We propose two quantization techniques for improving the bit-rate scalability of compression systems that optimize a weighted squared error (WSE) distortion metric. We show that quantization of the base-layer reconstruction error using entropy-coded scalar quantizers is suboptimal for the WSE metric. By considering the compandor representation of the quantizer, we demonstrate that asymptotic (high resolution) optimal scalability in the operational rate-distortion sense is achievable by quantizing the reconstruction error in the compandor's companded domain. We then fundamentally extend this work to the low-rate case by the use of enhancement-layer quantization which is conditional on the base-layer information. In the practically important case that the source is well modeled as a Laplacian process, we show that such conditional coding is implementable by only two distinct switchable quantizers. Conditional coding leads to substantial improvement over the companded scalable quantization scheme introduced in the first part, which itself significantly outperforms standard techniques. Simulation results are presented for synthetic memoryless Laplacian sources with μ-law companding, and for real-world audio signals in conjunction with MPEG AAC. Using the objective noise-to-mask ratio (NMR) metric, the proposed approaches were found to result in bit-rate savings of a factor of 2 to 3 when implemented within the scalable MPEG AAC. Moreover, the four-layer scalable coder consisting of 16-kb/s layers achieves performance close to that of the 64-kb/s nonscalable coder on the standard test database of 44.1-kHz audio.
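Quantizing the base-layer reconstruction error in the companded domain can be pictured with a two-layer toy example. The sketch below assumes the standard μ-law compressor with μ = 255 and plain uniform quantizers; the step sizes and function names are hypothetical, and neither the entropy coding nor the conditional enhancement-layer quantizers of the abstract are modelled.

```python
import math

MU = 255.0  # assumed companding parameter

def compress(x, mu=MU):
    # mu-law compressor for x in [-1, 1].
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    # Inverse of compress().
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(v, step):
    # Uniform scalar quantizer with reconstruction at multiples of step.
    return step * round(v / step)

def two_layer(x, base_step=0.25, enh_step=0.05):
    # Base layer: coarse quantization of the companded value.
    y = compress(x)
    y_base = quantize(y, base_step)
    # Enhancement layer: quantize the base-layer error while still in the
    # companded domain, then refine the base reconstruction.
    y_enh = y_base + quantize(y - y_base, enh_step)
    return expand(y_base), expand(y_enh)
```

The refinement is applied before expansion, so the enhancement layer inherits the same nonuniform resolution as the base layer rather than spending bits uniformly on the signal-domain error.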
ISBN: (print) 9781424445219
In this paper, a novel lossless coding method for the spectral coefficients of an audio codec is proposed. A conventional lossless coder directly codes the spectral coefficients based on their statistical characteristics, but does not provide high coding efficiency due to its simple structure. To overcome this limitation, a new lossless coding scheme consisting of bitplane coding and run-length coding is proposed. In the proposed scheme, the spectral coefficients are first transformed by bitplane coding into a bit stream, and the resulting bit stream is run-length coded and finally entropy coded. In addition, the coding performance is further increased by applying the proposed bitplane coding selectively to spectral bands. The performance of the proposed coding method is measured in terms of the ideal number of bits based on the entropy, which shows that the proposed method performs better than the conventional lossless coder in the AAC audio codec.
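The first two stages of such a scheme, bitplane decomposition followed by run-length coding, can be sketched as follows. This is a simplified illustration under stated assumptions: it handles only non-negative integer magnitudes (signs are ignored), orders planes from most to least significant bit, and omits the final entropy coder and the per-band selection; all function names are hypothetical.

```python
def to_bitplanes(mags, nbits):
    # One list of bits per plane, most significant plane first.
    return [[(m >> b) & 1 for m in mags] for b in range(nbits - 1, -1, -1)]

def from_bitplanes(planes):
    # Rebuild the magnitudes by shifting in one plane at a time.
    mags = [0] * len(planes[0])
    for plane in planes:
        mags = [(m << 1) | bit for m, bit in zip(mags, plane)]
    return mags

def run_lengths(bits):
    # Encode a non-empty bit list as (first_bit, lengths of alternating runs).
    runs, prev, count = [], bits[0], 0
    for b in bits:
        if b == prev:
            count += 1
        else:
            runs.append(count)
            prev, count = b, 1
    runs.append(count)
    return bits[0], runs

def from_run_lengths(first_bit, runs):
    # Inverse of run_lengths(): expand runs, flipping the bit after each run.
    bits, bit = [], first_bit
    for r in runs:
        bits.extend([bit] * r)
        bit ^= 1
    return bits

# Example: sparse magnitudes yield long zero runs in the upper planes.
mags = [5, 0, 1, 0, 0, 2, 0, 0]
planes = to_bitplanes(mags, nbits=3)
stream = [bit for plane in planes for bit in plane]
first_bit, runs = run_lengths(stream)
```

Because quantized spectral coefficients are mostly small, the high-order planes are dominated by zeros, which is what makes the run-length stage effective before entropy coding.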
A method of quantization noise control for audio coding in the wavelet domain is proposed. Using the inverse Discrete Fourier Transform (DFT), it converts the masking threshold produced by the MPEG psychoacoustic model in the frequency domain into a time-domain signal; the Discrete Wavelet Packet Transform (DWPT) is then applied to this signal, and the energy in each subband is regarded as the maximum allowed quantization noise energy. The experimental results show that the proposed method can attain nearly transparent audio quality below 64 kbps for most of the test audio signals.