ISBN: 0769514774 (print)
In this paper we study coding artifacts in MPEG-compressed scalable audio. Specifically, we consider the MPEG advanced audio coder (AAC) using bit slice scalable arithmetic coding (BSAC) as implemented in the MPEG-4 reference software. First, we perform human subjective testing using the comparison category rating (CCR) approach, quantitatively comparing the performance of scalable BSAC with the nonscalable TwinVQ and AAC algorithms. This testing indicates that scalable BSAC performs very poorly relative to TwinVQ at the lowest bit rate considered (16 kb/s), largely because of an annoying and seemingly random mid-range tonal signal that is superimposed onto the desired output. To better understand and perceptually quantify the various forms of distortion introduced into compressed audio at low bit rates, we apply two analysis techniques: Reng probing and time-frequency decomposition. The Reng probing technique is capable of separating the linear time-invariant component of a multirate system from its nonlinear and periodically time-varying components. Using this technique, we conclude that aliasing is probably not the cause of the annoying tonal signal; instead, time-frequency analysis indicates that its cause is most likely suboptimal bit allocation.
Audio coding is used to compress digital audio signals, thereby reducing the number of bits needed to transmit or store an audio signal. This is useful when network bandwidth or storage capacity is very limited. Audio compression algorithms are based on an encoding and decoding process. In the encoding step, the uncompressed audio signal is transformed into a coded representation, thereby compressing the audio signal. The coded audio signal must later be restored (e.g. for playback) by decoding: the decoder receives the bitstream and converts it back into an uncompressed signal. ISO-MPEG is a standard for high-quality, low-bit-rate video and audio coding. The audio part of the standard is composed of algorithms for high-quality, low-bit-rate audio coding, i.e. algorithms that reduce the original bit rate while guaranteeing high quality of the audio signal. The audio coding algorithms consist of MPEG-1 (with three different layers), MPEG-2, MPEG-2 AAC, and MPEG-4. This work presents a study of the MPEG-4 AAC audio coding algorithm. In addition, it presents implementations of the AAC algorithm on different platforms and comparisons among them. The implementations are in C, in Intel Pentium assembly, in C on a DSP processor, and in HDL. Since each implementation has its own application niche, each one is valid as a final solution. Another purpose of this work is to compare these implementations in terms of estimated cost, execution time, and the advantages and disadvantages of each one. ...
Audio coding was originally developed as a means of making digital radio possible and of distributing audio via phone lines. The most apparent application today is the electronic distribution of music via the Internet. While legal services and copy protection methods have been around for seven years, public attention and the bulk of actual use are still focused on unauthorized copying of music files. The talk will introduce the basic techniques for high-quality coding of music and discuss some possibilities for reducing the unauthorized copying of music.
A new approach to transient detection and estimation in the context of hybrid audio coding is presented. The basic idea is to use an orthogonal dyadic wavelet expansion, followed by hidden Markov tree modeling of wavelet coefficients. Coefficients may be cast as "transient type" or "residual type", and the estimated transient is reconstructed from the transient-type coefficients only. The estimation procedure involves the two classical steps of hidden Markov models: parameter estimation and state estimation. The implementation of those two steps in the case of wavelet coefficient trees is discussed in some detail, and numerical results are given. The application to audio signal encoding is also discussed.
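The idea of reconstructing a transient from selected wavelet coefficients can be illustrated with a minimal sketch. It makes several assumptions that are not from the abstract: the Haar wavelet stands in for the orthogonal dyadic expansion, a plain magnitude threshold replaces the hidden Markov tree classification, and the function names, the test signal, and the threshold value are all hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(signal):
    """Full dyadic Haar decomposition; len(signal) must be a power of two.
    Returns (approx, details), with details[0] the finest scale."""
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details

def haar_idwt(approx, details):
    """Inverse of haar_dwt: rebuild the signal from coarse to fine scales."""
    approx = list(approx)
    for level in reversed(details):
        finer = []
        for s, d in zip(approx, level):
            finer.append((s + d) / SQRT2)
            finer.append((s - d) / SQRT2)
        approx = finer
    return approx

def estimate_transient(signal, threshold):
    """Keep only detail coefficients above the threshold (a crude stand-in
    for 'transient type' states) and reconstruct from those alone."""
    approx, details = haar_dwt(signal)
    kept = [[d if abs(d) > threshold else 0.0 for d in level]
            for level in details]
    return haar_idwt([0.0] * len(approx), kept)

def demo_signal(length=16, click_at=5):
    """Hypothetical test signal: a weak sinusoid plus a unit click."""
    tone = [0.05 * math.sin(2 * math.pi * n / length) for n in range(length)]
    tone[click_at] += 1.0
    return tone

if __name__ == "__main__":
    estimate = estimate_transient(demo_signal(), threshold=0.2)
    print([round(v, 3) for v in estimate])
```

In this toy case the click produces large coefficients at every scale along one branch of the dyadic tree, which is the kind of persistence across scales that a tree-structured model can exploit.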
Although widely used in many coding applications, the Modified Discrete Cosine Transform (MDCT) has the drawback of being sensitive to time shifts. With the popular choice of a sine window, we show that it is possible to compute an explicit formulation of this time dependency. Starting from the exact MDCT of a pure sine and a simple interpretation in terms of combined modulations, we propose a regularization method that computes a "pseudo-spectrum" for the set of MDCT coefficients. This pseudo-spectrum is shown to provide, at a low computational cost, a good approximation of the local spectrum of the signal, with better behavior with respect to frequency and phase than the classical MDCT spectrum, i.e. the absolute value of the coefficients. Among other applications, this procedure can be used to reduce some of the artifacts that appear in MDCT-based audio coders at low bit rates.
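The time-shift sensitivity described above can be observed numerically with a small sketch. It uses a direct, unoptimized MDCT with a sine window and measures the magnitude of the coefficient at the bin where a stationary sinusoid is centered, for several delays of the same sinusoid. The frame length, the chosen bin, and all function names are assumptions made for illustration; the pseudo-spectrum itself is not reproduced here.

```python
import math

def mdct_sine_window(frame):
    """Direct O(N^2) MDCT of one frame of length 2N using a sine window.
    Returns N coefficients."""
    two_n = len(frame)
    half = two_n // 2
    coeffs = []
    for k in range(half):
        acc = 0.0
        for n, x in enumerate(frame):
            window = math.sin(math.pi * (n + 0.5) / two_n)
            phase = math.pi / half * (n + 0.5 + half / 2) * (k + 0.5)
            acc += window * x * math.cos(phase)
        coeffs.append(acc)
    return coeffs

def shifted_sinusoid(bin_index, shift, length):
    """Sinusoid centered on MDCT bin `bin_index`, delayed by `shift` samples."""
    omega = math.pi * (bin_index + 0.5) / (length // 2)
    return [math.cos(omega * (n - shift)) for n in range(length)]

def peak_magnitude(bin_index, shift, length=32):
    """Absolute value of the MDCT coefficient at the sinusoid's own bin."""
    coeffs = mdct_sine_window(shifted_sinusoid(bin_index, shift, length))
    return abs(coeffs[bin_index])

if __name__ == "__main__":
    # The signal is stationary, yet the peak magnitude changes with the delay.
    for shift in range(4):
        print(shift, round(peak_magnitude(4, shift), 3))
```

Because the sinusoid has constant amplitude, a shift-invariant spectral estimate would give the same value for every delay; the variation printed here is the phase dependence that the absolute value of the MDCT coefficients exhibits.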
This paper addresses the problem of streaming packetized media over a lossy packet network to a wireless client, in a rate-distortion optimized way. We introduce an incremental redundancy error-correction scheme that combats the effects of both packet loss and bit errors in an end-to-end fashion, without support from the underlying network or from an intermediate base station. The scheme is employed within an optimization framework that enables the sender to compute which packets it should send, out of all the packets it could send at a given transmission opportunity, in order to meet an average transmission-rate constraint while minimizing the average end-to-end distortion. Experimental results show that our system is robust and maintains quality of service over a wide range of channel conditions. Up to 8 dB performance gains are registered over systems that are not rate-distortion optimized, at bit-error rates as large as 10^-2.
ISBN: 0769514774 (print)
We study coding artifacts in MPEG-compressed scalable audio. Specifically, we consider the MPEG advanced audio coder (AAC) using bit slice scalable arithmetic coding (BSAC) as implemented in the MPEG-4 reference software. First, we perform human subjective testing using the comparison category rating (CCR) approach, quantitatively comparing the performance of scalable BSAC with the nonscalable TwinVQ and AAC algorithms. This testing indicates that scalable BSAC performs very poorly relative to TwinVQ at the lowest bit rate considered (16 kb/s), largely because of an annoying and seemingly random mid-range tonal signal that is superimposed onto the desired output. To better understand and perceptually quantify the various forms of distortion introduced into compressed audio at low bit rates, we apply two analysis techniques: Reng probing and time-frequency decomposition. The Reng probing technique is capable of separating the linear time-invariant component of a multirate system from its nonlinear and periodically time-varying components. Using this technique, we conclude that aliasing is probably not the cause of the annoying tonal signal; instead, time-frequency analysis indicates that its cause is most likely suboptimal bit allocation.
MPEG-4 structured audio (SA) has been proposed as a flexible standard for generalized audio coding. Originating from the Netsound software developed at MIT, SA is based on MIDI synthesis of sound, but it is enriched with DSP algorithms so as to allow emulation of other types of coders designed for speech and audio signals. We have investigated the use of structured audio for lossless coding of audio signals, and have found that certain limitations of structured audio make implementations of lossless coders less straightforward than might be desired. In particular, we have used SA to implement an MPEG-4 compliant version of the lossless audio coder AudioPaK. To implement and validate our new coder we used the software system Sfront, which translates MPEG-4 SA files into efficient C programs that render the audio signal.
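For readers unfamiliar with lossless audio coding, the following is a generic sketch of the predict-and-code-the-residual principle that such coders commonly rely on. It is not taken from the abstract and does not describe the authors' structured audio implementation: the second-order fixed predictor, the function names, and the sample values are assumptions chosen for illustration, and the entropy coding stage that would follow is omitted.

```python
def encode_residuals(samples):
    """Replace each integer sample by its error against a fixed
    second-order predictor: prediction = 2*x[n-1] - x[n-2].
    Samples before the start of the signal are taken as zero."""
    residuals = []
    for n, x in enumerate(samples):
        prev1 = samples[n - 1] if n >= 1 else 0
        prev2 = samples[n - 2] if n >= 2 else 0
        residuals.append(x - (2 * prev1 - prev2))
    return residuals

def decode_residuals(residuals):
    """Invert encode_residuals exactly by running the same predictor
    on the samples reconstructed so far."""
    samples = []
    for n, r in enumerate(residuals):
        prev1 = samples[n - 1] if n >= 1 else 0
        prev2 = samples[n - 2] if n >= 2 else 0
        samples.append(r + 2 * prev1 - prev2)
    return samples

if __name__ == "__main__":
    pcm = [0, 10, 19, 27, 34, 40]  # hypothetical integer PCM samples
    res = encode_residuals(pcm)
    print(res)
    print(decode_residuals(res) == pcm)
```

Since only integer additions and subtractions are involved, decoding reproduces the input bit for bit, and for smooth signals the residuals are small numbers that a subsequent entropy coder can represent with fewer bits than the original samples.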
Low bit rate audio coding often relies on Fourier representation despite its limitations for transient signal modeling. This study proposes alternative decompositions and expansion strategies that lead to more accurate modeling. Two classes of methods are considered: subspace decomposition methods and atomic decomposition methods. Their performances are compiled to propose an audio modeling scheme amenable to low bit rate coding.