A new linear prediction analysis method for multichannel signals was devised with the goal of enhancing the compression performance of an encoder compliant with MPEG-4 Audio Lossless Coding (ALS). The multichannel coding tool of this standard carries out an adaptively weighted subtraction of the residual signals of the coding channel from those of the reference channel, both of which are produced by independent linear prediction. Our linear prediction method tries to directly minimize the amplitude of the predicted residual signal after subtraction of the signals of the coding channel. A comprehensive evaluation shows that this method yields compressed files that are 0.1% smaller on average, with a maximum compression-ratio improvement of 14.6%, at the cost of a small increase in computational complexity at the encoder and no increase in decoding time. The method is practical because the compressed bit stream remains compliant with the MPEG-4 ALS standard.
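The weighted inter-channel subtraction of prediction residuals described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not from the abstract: a fixed first-order predictor stands in for adaptive linear prediction, a single unquantized least-squares weight stands in for the standard's adaptive weighting, and the reference residual is subtracted from the coding-channel residual. All function names are hypothetical, and this is not the proposed analysis method itself.

```python
def first_order_residual(x):
    """Residual of the fixed predictor x_hat[n] = x[n-1] (assumed stand-in for adaptive LPC)."""
    return [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]


def least_squares_weight(e_cod, e_ref):
    """Single weight g minimizing sum((e_cod[n] - g * e_ref[n]) ** 2)."""
    den = sum(r * r for r in e_ref)
    return sum(c * r for c, r in zip(e_cod, e_ref)) / den if den else 0.0


def weighted_subtract(e_cod, e_ref, g):
    """Inter-channel difference residual: e_cod[n] - g * e_ref[n]."""
    return [c - g * r for c, r in zip(e_cod, e_ref)]


def energy(e):
    """Sum of squared samples."""
    return sum(v * v for v in e)


# Toy, correlated channels (illustrative data only).
ref = [0, 3, 7, 12, 14, 13, 9, 4, 0, -5]
cod = [2, 3, 6, 7, 9, 8, 7, 3, 2, -2]

e_ref = first_order_residual(ref)
e_cod = first_order_residual(cod)
g = least_squares_weight(e_cod, e_ref)
diff = weighted_subtract(e_cod, e_ref, g)
print("weight:", g)
print("energy before / after:", energy(e_cod), energy(diff))
```

Because g = 0 is always a candidate, the least-squares weight can never increase the residual energy. The sketch optimizes the weight only after two independent predictions, whereas the abstract's method targets the post-subtraction residual during the prediction analysis itself.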
Compression of digital audio signals has become a very important audio computation process. When audio data are compressed, it is possible to store more data in a smaller memory and to increase the overall audio data throughput across an interface. Several compression schemes have been developed and are well established, and most of them adopt the MDCT/IMDCT. This paper presents a software tool, an improved MDCT IP core generator with architectural model simulation, that is capable of generating several MDCT architectures with adjustable parameters for FPGA-based design. The tool integrates computation-precision and area estimation functions, which facilitate and speed up the design process.
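For readers unfamiliar with the MDCT/IMDCT pair mentioned above, the following floating-point Python sketch shows the direct-form transforms and windowed overlap-add reconstruction. The sine window, the 2/N scaling of the inverse, and all function names are assumptions chosen for illustration. The sketch says nothing about the hardware architectures the tool generates, which would use fast algorithms and fixed-point arithmetic.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]**2 + w[n + N]**2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block):
    """Direct O(N^2) MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs):
    """Direct IMDCT: N coefficients -> 2N time-aliased samples (2/N scaling assumed)."""
    N = len(coeffs)
    return [(2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]


def analysis_synthesis(signal, N):
    """Window, transform, inverse-transform, window again and overlap-add with hop N.

    len(signal) is assumed to be a multiple of N.
    """
    w = sine_window(N)
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        y = imdct(mdct(block))
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]
    return out[N:N + len(signal)]


signal = [float((7 * n) % 5 - 2) for n in range(16)]
rebuilt = analysis_synthesis(signal, 4)
print(max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each block on its own contains time-domain aliasing; the aliasing terms of neighbouring blocks cancel in the overlap-add, so the signal is recovered up to floating-point error.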
ISBN:
(print) 9781424412730
MPEG-4 Scalable Lossless (SLS) coding is the most recently released ISO international standard for scalable audio coding. Besides functioning as an extension of the MPEG-4 Advanced Audio Coding (AAC) perceptual audio coder, SLS has a "non-core mode" that offers full scalability. The perceptual audio coder is absent in this mode, and scalability is achieved through pure bit-plane coding. In this paper, a perceptually enhanced bit-plane coding method, namely Quad-level Bit-Plane Coding (QBPC), is proposed to enhance the perceptual quality of fully scalable audio at intermediate bitrates. With the QBPC structure, the perceptual quality of fully scalable audio coded by SLS is significantly improved over a wide range of intermediate bitrates, with trivial added overhead and complexity.
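To show why pure bit-plane coding gives fine-grained scalability, here is a small Python sketch. It assumes sign-magnitude integers and plain most-significant-plane-first ordering, with no entropy coding and no perceptual ordering. The function names are hypothetical, and the sketch implements neither the SLS bitstream nor QBPC.

```python
def to_bit_planes(values):
    """Split integers into sign flags and magnitude bit planes, most significant plane first."""
    mags = [abs(v) for v in values]
    n_planes = max(mags, default=0).bit_length()
    signs = [v < 0 for v in values]
    planes = [[(m >> b) & 1 for m in mags] for b in range(n_planes - 1, -1, -1)]
    return signs, planes


def from_bit_planes(signs, planes, keep):
    """Rebuild the values from only the first `keep` planes (a truncated bitstream)."""
    n_planes = len(planes)
    mags = [0] * len(signs)
    for i, plane in enumerate(planes[:keep]):
        shift = n_planes - 1 - i
        for j, bit in enumerate(plane):
            mags[j] |= bit << shift
    return [-m if s else m for m, s in zip(mags, signs)]


coeffs = [5, -3, 0, 12, -9]
signs, planes = to_bit_planes(coeffs)
for keep in range(len(planes) + 1):
    print(keep, from_bit_planes(signs, planes, keep))
```

Decoding more planes refines every coefficient, and decoding all of them is lossless. A perceptually enhanced scheme changes which bits are sent first, which this plain ordering does not attempt.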
We propose a novel method for embedding robust forensic tracking watermarks to complement encryption in electronic music distribution applications. The watermark is embedded at the player during the AAC decoding process by modifying scale factors in predefined frequency bands, which modulates the short-time envelope of the decoded audio in the corresponding band. The resulting watermark is robust to various attacks such as re-compression and acoustic transmission. Because its computational overhead is negligible, the method can be used in resource-constrained devices such as portable players.
During the last decade, new applications have emerged in mobile and network multimedia, wireless multimedia communication, audio/video teleconferencing, remote assistance, digital storage systems, secure audio transmission and so on. To meet their requirements, tremendous research effort has been put into the development of efficient digital audio coding technologies. In China, the AVS-M audio standard is one such technology: it targets mobile multimedia applications and is developed and owned by the China Audio and Video Coding Standard Workgroup. This paper discusses the AVS-M audio standard by presenting the technical principles of its encoding and decoding, the state of standardization, and the suitability of the codec with respect to available technology, economic feasibility and market needs. It concludes with a brief discussion of future research directions.
Recently, lifting-based integer transforms have received much attention, especially in the area of lossless audio and image coding. The usual approach is to apply the lifting scheme to each Givens rotation. Especially for the long transform sizes used in audio coding applications, this leads to a considerable approximation error in the frequency domain. This paper presents a multidimensional lifting approach for reducing this approximation error. In this approach, large parts of the transform are calculated without rounding operations; only the output is rounded and added. The new approach is applied and evaluated for both the integer modified discrete cosine transform (IntMDCT) and the integer fast Fourier transform (IntFFT).
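The "usual approach" referred to above, lifting applied to a single Givens rotation, can be sketched in a few lines of Python. The rotation is factored into three shear (lifting) steps, each followed by rounding. The function names are hypothetical, and the sketch assumes sin(theta) is nonzero. It illustrates the baseline only, not the multidimensional lifting approach the paper proposes.

```python
import math


def int_rotate(x, y, theta):
    """Integer approximation of the rotation [[cos, -sin], [sin, cos]] via three lifting steps."""
    s = math.sin(theta)
    p = (math.cos(theta) - 1.0) / s   # shear coefficient; requires sin(theta) != 0
    x += round(p * y)
    y += round(s * x)
    x += round(p * y)
    return x, y


def int_rotate_inverse(x, y, theta):
    """Exact inverse: undo the same lifting steps in reverse order."""
    s = math.sin(theta)
    p = (math.cos(theta) - 1.0) / s
    x -= round(p * y)
    y -= round(s * x)
    x -= round(p * y)
    return x, y


theta = 0.3
x, y = 1000, 999
xr, yr = int_rotate(x, y, theta)
exact = (math.cos(theta) * x - math.sin(theta) * y,
         math.sin(theta) * x + math.cos(theta) * y)
print((xr, yr), exact, int_rotate_inverse(xr, yr, theta))
```

Each lifting step adds a rounded function of one value to the other, so it can be undone exactly by subtracting the same rounded quantity. The price is up to three rounding errors per rotation, and a long transform built from many such rotations accumulates them, which is the approximation error the abstract describes.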
In this paper, an improved parametric audio coder is presented. This coder addresses an important issue in audio coding, namely handling of transients. We propose a dedicated coder for transients based on amplitude modulated sinusoids. This coder is then combined with a constant-amplitude sinusoidal coder, and by rate-distortion optimization we choose which of the two is used for each segment. We show by rate-distortion curves and listening tests that the proposed coder offers significant improvements as compared to the constant-amplitude coder.
In this paper, we present Advanced Audio Zip (AAZ), a scalable lossless audio coding technology that was recently selected as the reference model for the MPEG audio scalable lossless coding (SLS) work. AAZ provides excellent compression performance while delivering fine-grain bit-rate scalability from lossy to lossless coding. Moreover, AAZ provides backward compatibility with the MPEG Advanced Audio Coding (AAC) system by embedding an AAC-compliant bit-stream into the lossless bit-stream. As a result, AAZ serves as a universal coding solution with functionalities that were previously offered by several distinct audio coding technologies, such as lossless audio coding, perceptual audio coding and scalable audio coding, and it maximizes the interchangeability of digital audio content migrating among these application domains.
In this paper we present a wideband (44.1 kHz sampling rate) audio and speech coder that combines two different strategies, namely, parametric and waveform coding. It is shown how this approach can be used to design a layered bit stream scalable coder offering a wide variety of decoding bit rates with little scalability loss. Moreover, the bit rates associated with the different layers are competitive, in terms of quality, to those of standardized coders (MP3, AAC) tuned at a particular bit rate.