A novel phase randomization (PR) technique for bit rate reduction in perceptual coding of high-quality audio is presented. Instead of the actual phase information, the decoder reconstructs the signal from carefully formulated 'random' phase information, guided by a code that the encoder sends in every frame of data. The coding scheme is tested on the digital audio broadcasting (DAB) standard, for which broadcasters often want more transmission channels in densely populated areas. The codec delivers good-quality audio at a reduced bit rate, as assessed with the ITU-R BS.1116 methodology for the subjective assessment of audio quality.
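The idea of replacing transmitted phase with reproducible 'random' phase can be illustrated with a minimal Python sketch. The abstract does not say how the per-frame code maps to phase values, so seeding a pseudo-random generator with the code is an assumption made here purely for illustration; the names `random_phase_spectrum` and `frame_code` are hypothetical.

```python
import cmath
import math
import random

def random_phase_spectrum(magnitudes, frame_code):
    """Attach a reproducible pseudo-random phase to each spectral magnitude.

    Assumption: the per-frame code is used as a PRNG seed. The paper's actual
    phase formulation is not given in the abstract.
    """
    rng = random.Random(frame_code)  # same code -> same phase sequence
    return [m * cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for m in magnitudes]

mags = [1.0, 0.5, 0.25, 0.0]
encoder_side = random_phase_spectrum(mags, frame_code=7)
decoder_side = random_phase_spectrum(mags, frame_code=7)
```

Because both sides derive the phase from the same short code, only the code and the magnitudes would need to be transmitted in this sketch.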
Sparse signal modeling has been considered widely in the literature. In this paper, we discuss an extension of the matching pursuit sparse modeling algorithm to the case of simultaneously approximating multiple data signals; we outline the algorithm for general and for sinusoidal dictionaries. We then apply multichannel sinusoidal pursuit (M-SP) to spatial audio coding (SAC). In most SAC schemes, multichannel audio is coded by forming a downmix signal, compressing the downmix with a legacy coder, and adding side information about spatial properties of the input audio. In the proposed M-SP system, a multichannel model of the input is used to derive the spatial information as well as a parametric model of an appropriate downmix signal. This joint spatial-parametric approach provides a multichannel audio coding paradigm different from that of previously described SAC methods.
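A simultaneous matching pursuit over several channels can be sketched as follows. This is a generic greedy variant, not the paper's M-SP algorithm: it assumes unit-norm atoms and selects, at each step, the atom with the largest total squared correlation across all channel residuals. The function names and the toy dictionary are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def simultaneous_mp(channels, dictionary, n_atoms):
    """Greedy multichannel matching pursuit over unit-norm atoms (illustrative)."""
    residuals = [list(ch) for ch in channels]
    picks = []
    for _ in range(n_atoms):
        # Score each atom by its total squared correlation over all channels.
        scores = [sum(dot(r, atom) ** 2 for r in residuals) for atom in dictionary]
        best = max(range(len(dictionary)), key=scores.__getitem__)
        atom = dictionary[best]
        gains = [dot(r, atom) for r in residuals]
        # Subtract the shared atom from each channel using that channel's gain.
        residuals = [[x - g * a for x, a in zip(r, atom)]
                     for r, g in zip(residuals, gains)]
        picks.append((best, gains))
    return picks, residuals

dictionary = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # toy orthonormal dictionary
channels = [[0, 2, 0.1], [0, -1, 0]]
picks, residuals = simultaneous_mp(channels, dictionary, n_atoms=1)
```

In this sketch each selected atom is shared by all channels, while the per-channel gains carry the inter-channel differences.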
The paper compares a previous and a new approach to shaping quantization noise in low bit rate predictive audio coding. The previous approach adapts the step size of a uniform quantizer; the new approach uses a quantizer with clipping. Both approaches are evaluated using a predictive audio coding scheme. The results of a listening test show the improved performance of the new approach.
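The two mechanisms being compared can be illustrated with a small Python sketch. It shows only a plain uniform quantizer and the same quantizer with its index range clipped; how either is driven to shape the noise in the paper's codec is not described in the abstract, and the function names and parameter values are hypothetical.

```python
def quantize_uniform(x, step):
    """Uniform mid-tread quantizer: returns the integer index."""
    return round(x / step)

def quantize_clipped(x, step, max_index):
    """Same quantizer, with indices clipped to [-max_index, max_index]."""
    return max(-max_index, min(max_index, round(x / step)))

def dequantize(index, step):
    return index * step

x = 3.7
coarse = dequantize(quantize_uniform(x, step=1.0), 1.0)               # larger step
clipped = dequantize(quantize_clipped(x, step=0.5, max_index=4), 0.5)  # clipped range
```

A larger step spreads error over all samples, whereas clipping keeps fine resolution for small samples and concentrates the error on large ones.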
We study the behavior of hybrid random waveform models for audio signals, involving sparse random series of waveforms with random coefficients. Similar approaches have been considered in recent years. However, these generally do not rely on explicit models and are more algorithmic in nature. The models we propose allow us to analyze mathematical properties of such signals and the corresponding estimators, and to derive estimation algorithms that do not rely on complex optimization techniques.
ISBN: (print) 1424402719; 1424402727
In this paper, we present a comparative study of different lossless audio coding schemes, which are implemented using different integer transforms. The audio signal under consideration, which is assumed to be integer-valued, as in the case of a fixed-point implementation, is first decorrelated using the appropriate integer transform. The resulting integer coefficients are then entropy-coded to produce the output stream. Several integer transforms have been considered, such as the integer wavelet transform (IntWT) with different decomposition levels and different filters, the integer discrete cosine transform (IntDCT), and the integer Walsh-Hadamard transform (IntWHT). Arithmetic and Huffman coding have been used for entropy coding. The simulations provide insight into the performance of the different integer transforms in the lossless audio coding context.
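To show why an integer transform permits lossless coding, here is a minimal sketch of one level of the integer Haar transform (often called the S transform) built from lifting steps. It is a standard textbook construction chosen for illustration, not necessarily one of the filters evaluated in the paper; the function names are hypothetical.

```python
def int_haar_forward(x):
    """One level of the integer Haar (S) transform; len(x) must be even."""
    approx, detail = [], []
    for a, b in zip(x[0::2], x[1::2]):
        d = b - a          # predict step: difference of the pair
        s = a + (d >> 1)   # update step: floor of the pair's average
        approx.append(s)
        detail.append(d)
    return approx, detail

def int_haar_inverse(approx, detail):
    """Undo the lifting steps in reverse order; exact for integers."""
    x = []
    for s, d in zip(approx, detail):
        a = s - (d >> 1)
        b = a + d
        x.extend([a, b])
    return x

samples = [5, 7, -3, 4, 100, 99]
approx, detail = int_haar_forward(samples)
restored = int_haar_inverse(approx, detail)
```

Every step maps integers to integers and is inverted exactly, so the coefficients can be entropy-coded without any reconstruction error.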
This paper describes the low-complexity 14 kHz audio coding algorithm recently standardized by ITU-T as Recommendation G.722.1 Annex C ("G.722.1C"). The algorithm extends ITU-T Recommendation G.722.1, doubling the G.722.1 algorithm to support 14 kHz audio bandwidth at a 32 kHz audio sample rate, at bit rates of 24, 32, and 48 kbit/s. The G.722.1C codec features very high audio quality and extremely low computational complexity compared to other state-of-the-art audio coding algorithms. The codec is suitable for video conferencing, teleconferencing, and Internet streaming applications. Subjective test results from the characterization phase of G.722.1C are also presented in the paper.
A perceptually scalable audio coder generates a bit stream that contains layers of audio fidelity and is encoded in such a way that adding one of these layers enhances the reconstructed audio by an amount that is just noticeable to the listener. Such algorithms have applications such as music on demand at variable levels of fidelity for 3G and 4G cellular radio systems operating at different bit rates. While the MPEG-4 natural audio coder can create scalable bit streams, its perceptual quality at low bit rates is poor. The non-scalable TWIN-VQ, on the other hand, performs well at low bit rates. In this paper we present a technique to modify the TWIN-VQ algorithm so that it generates a perceptually scalable bit stream with layers of audio fidelity. Using TWIN-VQ as our base ensures the best possible perceptual quality at low bit rates (8-16 kbps).
We consider the problem of reliable distribution of audio over packet-switched networks. We combine multiple-description coding with transform coding to obtain robustness against packet losses. Previous approaches to this problem were restricted to two descriptions. In this work we use n-channel multiple-description lattice vector quantizers (MD-LVQs), which allow more than two descriptions. For a given packet-loss probability, we find the number of descriptions and the bit allocation between transform coefficients that minimize a perceptual distortion measure subject to an entropy constraint. The optimal quantizers are presented in closed form, thus avoiding any iterative quantizer design procedures. The theoretical results are verified with numerical simulations using audio signals, and it is shown that in environments with excessive packet losses it is advantageous to use more than two descriptions. We verify in subjective listening tests that using more than two descriptions leads to signals of perceptually higher quality.
A low-complexity, low-delay method for amplitude-modulated sinusoidal audio coding is presented. It is based on a sub-band processing system in which the signal in each sub-band is modeled as an amplitude-modulated sum of sinusoids. The envelopes are estimated using frequency-domain linear prediction, and the prediction coefficients are quantized. As a proof of concept, we evaluate different configurations in a subjective listening test, which shows that the proposed method offers significant improvements in sinusoidal coding. Furthermore, the properties of the frequency-domain linear prediction-based envelope estimator are analyzed.
The paper presents a very simple enhancement of joint coding of stereo and surround channels within a perceptual audio codec. Moreover, the paper proposes two improvements to standard parametric stereo and spatial audio compression that avoid the smearing of transients during channel downmixing. The improvements consist of compensating for inter-channel delays prior to downmixing and additionally encoding the room response for realistic reconstruction of the stereo ambience.
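Delay compensation before downmixing can be illustrated with a minimal sketch. The abstract does not say how the inter-channel delay is estimated; a brute-force cross-correlation search over integer lags is assumed here, and the function names and the `max_lag` parameter are hypothetical.

```python
def estimate_delay(left, right, max_lag):
    """Integer lag d that maximizes the sum of left[n] * right[n + d]."""
    def xcorr(lag):
        return sum(left[n] * right[n + lag]
                   for n in range(len(left)) if 0 <= n + lag < len(right))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def aligned_downmix(left, right, max_lag):
    """Shift the right channel by the estimated delay, then average to mono."""
    d = estimate_delay(left, right, max_lag)
    shifted = [right[n + d] if 0 <= n + d < len(right) else 0.0
               for n in range(len(left))]
    return d, [0.5 * (l + r) for l, r in zip(left, shifted)]

left = [0, 1, 0, 0, 0, 0]
right = [0, 0, 0, 1, 0, 0]  # same transient, arriving two samples later
delay, mono = aligned_downmix(left, right, max_lag=3)
```

Without the alignment, averaging these two channels would split the transient into two half-amplitude pulses two samples apart; with it, the downmix keeps a single full-amplitude pulse, and the estimated delay would be sent as side information.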