Both MP3 and AAC use a filterbank to convert the audio signal from the time domain to the frequency domain, and this stage plays an important role in the overall encoding process. Analyzing how the filterbank is applied in MP3 and AAC helps explain their differences in coding efficiency and audio quality. This paper therefore analyzes the filterbank in MP3 and AAC from three angles (the components of the filterbank, the selection of window shape, and the type of transform block), explores the differences between them, and reports related experiments. The results show that, considering the filterbank module alone, AAC achieves better coding efficiency and better control of audio distortion than MP3.
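As a rough illustration of the transform at the heart of both filterbanks, the sketch below implements a direct (non-fast) MDCT with a sine window and checks that 50% overlap-add cancels the time-domain aliasing. It is a generic textbook formulation, not code from the paper: MP3's polyphase stage, AAC's Kaiser-Bessel-derived window, and block switching are left out, and all function names are illustrative assumptions.

```python
import math

def sine_window(length):
    """Sine window; it satisfies w[n]^2 + w[n+N]^2 = 1 (Princen-Bradley)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block, window):
    """Forward MDCT: 2N windowed time samples -> N coefficients (O(N^2))."""
    two_n = len(block)
    n_half = two_n // 2
    n0 = (n_half + 1) / 2.0  # phase offset 1/2 + N/2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(two_n))
        for k in range(n_half)
    ]

def imdct(coeffs, window):
    """Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples."""
    n_half = len(coeffs)
    n0 = (n_half + 1) / 2.0
    return [
        window[n] * (2.0 / n_half) * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]

def analysis_synthesis(signal, n_half):
    """Run MDCT then IMDCT on 50%-overlapped blocks and overlap-add."""
    window = sine_window(2 * n_half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = signal[start:start + 2 * n_half]
        rec = imdct(mdct(block, window), window)
        for i, value in enumerate(rec):
            out[start + i] += value
    return out
```

Only samples covered by two overlapping blocks are reconstructed exactly; the first and last N samples of the signal keep their aliasing because they have no overlap partner.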
ISBN: 9781424407286 (print)
We propose an extension of ADPCM that includes adaptive pre- and post-filtering to achieve spectral shaping of the coding noise. The advantage of this coding scheme is that it allows a realization without algorithmic delay by making the filters backwards-adaptive. The measurements we present indicate that the addition of adaptive pre- and post-filtering to ADPCM results in a significant improvement in perceived audio quality. We therefore believe that the proposed system is a viable way to near-transparent lossy audio coding without algorithmic delay.
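The zero-delay property rests on backward adaptation: every quantity the coder adapts is derived from codes that have already been transmitted, so the decoder can repeat the same update without look-ahead or side information. The sketch below shows this for a plain ADPCM loop with a fixed first-order predictor and an adaptive step size. The adaptive pre- and post-filters of the proposed scheme are not modelled, and the class name, step multipliers, and limits are illustrative assumptions.

```python
MIN_STEP, MAX_STEP = 1e-4, 1.0  # assumed limits on the quantizer step

class BackwardAdaptiveAdpcm:
    """Toy ADPCM state machine; encoder and decoder each hold one instance."""

    def __init__(self, bits=4, predictor_coeff=0.9):
        self.max_code = 2 ** (bits - 1) - 1
        self.coeff = predictor_coeff
        self.step = 0.01
        self.prev = 0.0  # last reconstructed sample

    def _reconstruct(self, code):
        """Rebuild a sample and adapt the step using only the code itself."""
        value = self.coeff * self.prev + code * self.step
        if abs(code) >= self.max_code - 1:
            self.step *= 1.5   # outer levels used: quantizer too fine
        elif abs(code) <= 1:
            self.step *= 0.9   # inner levels used: quantizer too coarse
        self.step = min(MAX_STEP, max(MIN_STEP, self.step))
        self.prev = value
        return value

    def encode(self, sample):
        """Quantize the prediction error of one sample; no look-ahead."""
        prediction = self.coeff * self.prev
        code = round((sample - prediction) / self.step)
        code = max(-self.max_code, min(self.max_code, code))
        self._reconstruct(code)  # keep the encoder state equal to the decoder's
        return code

    def decode(self, code):
        return self._reconstruct(code)
```

Because `encode` updates its state through the same `_reconstruct` call the decoder uses, the two sides stay in lockstep sample by sample.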
ISBN: 9781424407286 (print)
This paper presents a novel solution to multichannel spatial audio coding: Spatial Squeezing Surround audio coding (S(3)AC). The S(3)AC scheme analyses a multichannel audio signal and downmixes it into a stereo signal pair containing both the monophonic properties of audio sources and their localization information; this avoids the need for side information. The approach uses time-frequency analysis of a spatial audio scene and exploits virtual sources and amplitude panning techniques to 'squeeze' 360 degrees of a horizontal soundfield into a 60-degree stereo signal pair. In comparison with other spatial audio coding techniques, S(3)AC significantly advances in-band encoding of the localization information in the original sound scene and achieves accurate recoverability of dynamic localized sources.
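The 'squeezing' idea can be pictured as an azimuth remapping followed by stereo amplitude panning. The sketch below assumes a simple linear mapping from the full circle onto the span between loudspeakers at +30 and -30 degrees, and uses the standard tangent panning law with power-normalized gains. The abstract does not specify the mapping or panning law actually used in S(3)AC, so both choices and the function names are assumptions made for illustration.

```python
import math

def squeeze_azimuth(azimuth_deg, half_span_deg=30.0):
    """Map an azimuth in [-180, 180] degrees linearly onto [-30, 30]."""
    return azimuth_deg * (half_span_deg / 180.0)

def pan_gains(angle_deg, half_span_deg=30.0):
    """Tangent-law gains (left, right) for a source between two loudspeakers.

    Positive angles point towards the left loudspeaker. The gains satisfy
    (gl - gr) / (gl + gr) = tan(angle) / tan(half_span) and gl^2 + gr^2 = 1.
    """
    ratio = (math.tan(math.radians(angle_deg))
             / math.tan(math.radians(half_span_deg)))
    norm = math.sqrt(2.0 * (1.0 + ratio * ratio))
    return (1.0 + ratio) / norm, (1.0 - ratio) / norm

def downmix_source(sample, azimuth_deg):
    """Place one mono source sample in the stereo pair at its squeezed angle."""
    gain_left, gain_right = pan_gains(squeeze_azimuth(azimuth_deg))
    return gain_left * sample, gain_right * sample
```

A decoder that estimates the inter-channel level ratio per time-frequency tile could invert `pan_gains` and then `squeeze_azimuth` to recover the original azimuth, which is the sense in which the localization information travels in-band.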
MPEG-4 AAC is currently the most widely used audio coding format, but the standard has high complexity, long delay, and a heavy computational load, which makes it poorly suited to real-time applications. The psychoacoustic model is the core of the audio encoder, so it too involves heavy computation. By studying the spreading of masking in the psychoacoustic model, this work improves the computation of the spreading function, which reduces the psychoacoustic computation and shortens the coding time. Experimental results are given; they are of practical importance for research on real-time audio coding.
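To make the cost of the spreading step concrete, the sketch below evaluates a spreading function between every pair of bands and applies it to the band energies. It uses the well-known Schroeder approximation as a stand-in and precomputes the gain table once, dropping entries below an assumed floor. This is a generic speed-up, not the improvement proposed in the paper, whose details the abstract does not give; the band spacing, floor value, and function names are assumptions.

```python
import math

def schroeder_spread_db(dz):
    """Schroeder spreading function in dB; dz = masked minus masker, in Bark."""
    return (15.81 + 7.5 * (dz + 0.474)
            - 17.5 * math.sqrt(1.0 + (dz + 0.474) ** 2))

def build_spread_table(num_bands, bark_step=1.0, floor_db=-60.0):
    """Precompute linear spreading gains; table[masked][masker]."""
    table = []
    for masked in range(num_bands):
        row = []
        for masker in range(num_bands):
            level_db = schroeder_spread_db((masked - masker) * bark_step)
            # Entries below the floor are zeroed so they can be skipped later.
            row.append(10.0 ** (level_db / 10.0) if level_db > floor_db else 0.0)
        table.append(row)
    return table

def spread_energy(band_energy, table):
    """Spread each band's energy across its neighbours using the table."""
    return [
        sum(gain * energy for gain, energy in zip(row, band_energy) if gain)
        for row in table
    ]
```

The table depends only on the band layout, so it can be built once per sample rate and reused for every frame; the per-frame work is then the multiply-accumulate in `spread_energy`.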
ISBN: 9781424413119 (print)
In this paper, we present a window switching algorithm employed in AVS (Audio Video coding Standard of China) generic audio coding. The algorithm, Energy and Unpredictability Measure based Window Switching Decision (ENUPM-WSD), achieves a high performance-to-complexity ratio by combining the advantages of the low-complexity energy-based decision in the time domain and the high accuracy of the unpredictability-based decision in the frequency domain. It significantly improves encoded audio quality, especially for signals with large transient portions, while preserving low average computational complexity in the AVS generic audio encoder. Owing to these merits, it has been accepted as a recommended module of the AVS audio standard.
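The time-domain half of such a decision can be sketched as an energy-ratio test: split the frame into sub-blocks and switch to short windows when the energy jumps sharply from one sub-block to the next. The code below covers only this generic energy-based stage. The unpredictability measure and the way ENUPM-WSD combines the two decisions are not described in the abstract and are not modelled; the sub-block count, threshold, and function names are assumed values.

```python
import math

def subblock_energies(frame, num_sub):
    """Energy of each of num_sub equal-length sub-blocks of the frame."""
    size = len(frame) // num_sub
    return [
        sum(x * x for x in frame[i * size:(i + 1) * size])
        for i in range(num_sub)
    ]

def choose_window(frame, num_sub=8, ratio_threshold=10.0, eps=1e-12):
    """Return "short" if any sub-block energy jumps past the threshold."""
    energies = subblock_energies(frame, num_sub)
    for previous, current in zip(energies, energies[1:]):
        # A sudden rise in energy suggests an attack inside the frame.
        if current > ratio_threshold * (previous + eps):
            return "short"
    return "long"

def demo_frames(length=1024):
    """Build a stationary tone and a frame with an attack halfway through."""
    steady = [math.sin(0.2 * n) for n in range(length)]
    attack = [0.0] * (length // 2) + steady[:length // 2]
    return steady, attack
```

A test of this kind costs one pass over the frame, which is why it is attractive as a first-stage filter before any frequency-domain analysis.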
ISBN: 9781424407286 (print)
Sparse coding is a new field in signal processing with possible applications to source coding. In this paper we present a new method that combines the problems of sparse signal approximation with coefficient quantization. This method uses over-complete dictionaries and exploits signal redundancy. The proposed method will be derived as an extension of a recently presented method (iterative thresholding) to find sparse representations of signals. Because in digital communication and storage we need a quantized representation of the signal, instead of quantization of sparse representations a posteriori, we propose a refined method that combines sparse approximation and quantization. To compare the proposed method to a posteriori quantization, we present an audio example.
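For orientation, the baseline that the abstract compares against (sparse approximation first, quantization afterwards) can be sketched with iterative hard thresholding followed by a uniform quantizer. The refined method that couples the two steps is not specified in the abstract and is not reproduced here. The toy dictionary (identity plus a scaled Hadamard basis), the step size, and the function names are assumptions chosen so that the iteration is easy to follow.

```python
def matvec(phi, x):
    """Compute phi @ x for a row-major list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in phi]

def rmatvec(phi, r):
    """Compute phi.T @ r."""
    return [sum(phi[i][j] * r[i] for i in range(len(phi)))
            for j in range(len(phi[0]))]

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero the rest."""
    keep = set(sorted(range(len(x)), key=lambda j: abs(x[j]), reverse=True)[:k])
    return [v if j in keep else 0.0 for j, v in enumerate(x)]

def iht(phi, y, k, step, iterations=100):
    """Iterative hard thresholding: x <- H_k(x + step * phi.T (y - phi x))."""
    x = [0.0] * len(phi[0])
    for _ in range(iterations):
        residual = [yi - pi for yi, pi in zip(y, matvec(phi, x))]
        grad = rmatvec(phi, residual)
        x = hard_threshold([xj + step * gj for xj, gj in zip(x, grad)], k)
    return x

def quantize(x, step_size):
    """A posteriori uniform quantization of the sparse coefficients."""
    return [step_size * round(v / step_size) for v in x]

def example_dictionary():
    """4x8 over-complete dictionary: identity atoms plus Hadamard atoms / 2."""
    h = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
    return [
        [1.0 if i == j else 0.0 for j in range(4)] + [h[i][j] / 2.0 for j in range(4)]
        for i in range(4)
    ]
```

For this dictionary the squared spectral norm is 2, so any step size up to 0.5 keeps the approximation error from increasing between iterations.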
ISBN: 9781424407286 (print)
We propose a frame loss concealment technique for decoders compatible with MPEG advanced audio coding (AAC). The spectral information of the lost frame is estimated in the modified discrete cosine transform (MDCT) domain via efficient techniques that are tailored to individual source signal components: In noise-like spectral bins the MDCT coefficients are obtained by shaped-noise insertion, while coefficients in tone-dominant bins are estimated by frame interpolation followed by a refinement procedure so as to optimize the fit of the concealed frames with neighboring frames. Experimental results demonstrate that the proposed technique offers performance superior to techniques adopted in commercial AAC decoders.
This paper proposes a method for analyzing the direction of arrival of sound by estimating the sound intensity vector from the pressure and energy gradients of closely spaced omnidirectional microphones, with the choice of gradient depending on frequency. Microphones with relatively large housings, which cause shadowing, are used here to provide inter-microphone level differences from which the energy gradients are computed at high frequencies. The proposed method is evaluated in the direction analysis of a spatial sound processing technique, Directional Audio Coding (DirAC). It is shown that the method provides reliable direction estimation over the entire audio frequency range, whereas the traditional method employing the pressure gradients produces correct estimates in a limited frequency window only.
ISBN: 9781424421091 (print)
Perceptual audio coding, now a very common technology, is a classic example of a technology that arose in several places simultaneously when the "time was right". Here, we will mostly discuss the timeline from the point of view of the people who worked on and about perceptual coding (both audio and video) at AT&T Bell Labs and its successor AT&T Labs Research.
The introduction of LTE (Long Term Evolution) brings enhanced quality to 3GPP multimedia services. The high throughput and low latency of LTE enable higher-quality media coding than is possible in UMTS. LTE-specific codecs have not yet been defined, but work on them is ongoing in 3GPP. The LTE codecs are expected to improve the basic signal quality and also to offer new capabilities such as extended audio bandwidth, stereo and multichannel voice, and higher temporal and spatial resolutions for video. Owing to the wide range of functionalities in media coding, LTE gives more flexibility for service provision to cope with heterogeneous terminal capabilities and transmission over heterogeneous network conditions. By adjusting the bit rate, the computational complexity, and the spatial and temporal resolution of audio and video, transport and rendering can be optimised throughout the media path, thereby giving the best possible quality of service. (C) 2010 Elsevier B.V. All rights reserved.