The information content of binaural signals can be beneficial to many algorithms deployed in current digital hearing aids. However, the exchange of such signals over a wireless communication link requires transmission schemes that must fulfill demanding technical constraints. We present a distributed coding algorithm that builds on psychoacoustic principles in order to achieve this goal with low bitrates, while preserving affordable complexity. The key steps of the proposed algorithm are detailed and the accuracy of the signal exchange mechanism is evaluated in simple simulated acoustic scenarios.
Unified Speech and Audio Coding (USAC) is an emerging MPEG audio standard that aims to represent both speech and music signals efficiently, even at very low bitrates. The reference codec unifies two state-of-the-art speech and audio coding structures in a single platform. This paper proposes an enhanced long-term predictor (eLTP) that effectively exploits periodic redundancies both within and across time frames. Experimental results with various types of input signals confirm that the proposed algorithm outperforms the reference codec.
Parametric coding of multichannel audio has gained popularity for low bit rate audio coding applications such as digital audio broadcasting. Most existing algorithms use MDCT domain techniques for compressing the audio, while the spatialization parameters are estimated in a different time-frequency domain. An MDCT domain parametric stereo coding algorithm has been reported in the literature; it represents the stereo channels as a linear combination of the 'sum' channel derived from the stereo channels and a reverberated channel generated from the 'sum' channel. Spatialization parameters are estimated at the encoder by taking the scaled sub-band projections of the stereo channels on the 'sum' and reverberated channels. This model is inadequate to represent the stereo image, since only four parameters per sub-band are used as spatialization parameters. In this work we improve the perceptual quality of this MDCT domain parametric coder with an augmented parameter extraction scheme that uses an additional reverberated channel. Subjective evaluation using the MUSHRA test shows that the new algorithm significantly increases the perceptual quality of the encoded audio signal.
ISBN: 0780320468 (print)
Most studies in LPC-based audio coders decompose the signal into the product of excitation and system spectra, and then quantize the excitation by using either a stochastic codebook or multiple pulses. But their near white spectra cannot precisely describe the harmonic characteristics of excitation, especially when dealing with instrumental music. The paper explores the benefits of sinusoidal representation for excitation in the design of analysis-by-synthesis predictive coders. Furthermore, an efficient parameter extraction algorithm has also been developed to identify the associated parameters of the sinusoidal components. Simulation results indicate that the proposed multi-sinusoid excitation model allows the implementation of an LPC-based audio coder which delivers near toll quality at the rate of 92.61 kbps.
A stream coding framework is presented for solving the distortion-constrained time-frequency dependent quantization problem that naturally arises when overlapped time-frequency decompositions are used. The main contributions of this paper are: (1) an efficient rate-distortion allocation algorithm for dependent quantization when the neighborhood of dependency is large; and (2) demonstration that a perceptual excitation distortion measure produces better coded audio quality than the conventional noise-to-mask ratio measure.
We develop a new method for quantization in multistage audio coding. We consider the case of a two-stage sinusoidal/waveform coder. Given a distortion measure and a bit-rate constraint, we analytically derive the optimal rate distribution between subcoders (stages) and the corresponding optimal quantizers, which allows the coder to adapt easily to changes in bit-rate requirements. We verify that the performance, both in terms of signal-to-noise ratio (SNR) and perceptual quality, is higher if the input to the second stage is obtained by subtracting the quantized first-stage reconstruction from the original signal, as opposed to subtracting the unquantized reconstruction.
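The closing claim of this abstract can be illustrated with a toy example. The sketch below is not the paper's coder or its optimal quantizers: it assumes a single sinusoid with a coarsely quantized amplitude as the first stage and a uniform scalar quantizer as the second stage, and all names and step sizes are hypothetical. It only shows why feeding the second stage with the residual of the quantized first-stage reconstruction keeps the first-stage quantization error out of the decoded signal.

```python
import math
import random


def quantize(value, step):
    """Uniform scalar quantizer with the given step size."""
    return step * round(value / step)


def snr_db(reference, estimate):
    """Signal-to-noise ratio of an estimate, in dB."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / noise)


def decode_two_stage(x, stage1, stage1_q, step2, closed_loop):
    """Decoder output of a toy two-stage coder.

    closed_loop=True:  stage 2 codes x minus the QUANTIZED stage-1 output.
    closed_loop=False: stage 2 codes x minus the UNQUANTIZED stage-1 output.
    In both cases the decoder only has the quantized stage-1 output.
    """
    base = stage1_q if closed_loop else stage1
    residual_q = [quantize(xi - bi, step2) for xi, bi in zip(x, base)]
    return [s + r for s, r in zip(stage1_q, residual_q)]


# Hypothetical test signal: one sinusoid plus a little noise.
rng = random.Random(0)
n = 256
amp, amp_step, step2 = 0.83, 0.25, 0.02
tone = [math.sin(2.0 * math.pi * 5.0 * i / n) for i in range(n)]
x = [amp * t + 0.05 * rng.gauss(0.0, 1.0) for t in tone]

stage1 = [amp * t for t in tone]                        # unquantized model
stage1_q = [quantize(amp, amp_step) * t for t in tone]  # what the decoder sees

closed = decode_two_stage(x, stage1, stage1_q, step2, closed_loop=True)
opened = decode_two_stage(x, stage1, stage1_q, step2, closed_loop=False)
print("closed-loop SNR (dB):", snr_db(x, closed))
print("open-loop SNR (dB):  ", snr_db(x, opened))
```

In the closed-loop variant the only remaining error is the second-stage quantization error, so it is bounded by half the second-stage step. In the open-loop variant the amplitude quantization error of the first stage is added on top.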
Signal representations in overcomplete dictionaries are considered here as an alternative to the traditional transform representations for fine-grain scalable audio coding. Such representations produce sparser decompositions and thus allow better coding efficiency than transform coding at very low bitrates. Moreover, the decomposition algorithms are intrinsically progressive, and flexible enough to allow efficient transient modeling. We propose in this paper a fine-grain scalable audio coder which works over a large range of bitrates (2 kb/s to 128 kb/s). Objective measures as well as informal subjective evaluation show that this coder outperforms a comparable transform-based coder at very low bitrates.
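The abstract does not name the decomposition algorithm. As an illustration of what "intrinsically progressive" can mean, the sketch below assumes plain matching pursuit over a small hypothetical dictionary (unit impulses plus DCT-like cosines); it is not the coder described in the paper. Each iteration adds one atom, so truncating the list of picks at any point yields a coarser but valid approximation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition: pick the best-correlated atom, subtract, repeat.

    Returns the (atom index, coefficient) picks in order, plus the residual
    energy before the first pick and after every pick.
    """
    residual = list(signal)
    picks = []
    energies = [dot(residual, residual)]
    for _ in range(n_iter):
        idx = max(range(len(atoms)), key=lambda k: abs(dot(residual, atoms[k])))
        coeff = dot(residual, atoms[idx])
        residual = [r - coeff * a for r, a in zip(residual, atoms[idx])]
        picks.append((idx, coeff))
        energies.append(dot(residual, residual))
    return picks, energies


# Hypothetical overcomplete dictionary: n impulses followed by n cosines.
n = 16
spikes = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
cosines = [unit([math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
           for k in range(n)]
atoms = spikes + cosines

# A tone plus a click: sparse in the union, dense in either basis alone.
signal = [2.0 * c + 1.5 * s for c, s in zip(cosines[3], spikes[5])]

picks, energies = matching_pursuit(signal, atoms, n_iter=8)
for (idx, coeff), energy in zip(picks, energies[1:]):
    print(f"atom {idx:2d}  coeff {coeff:+.4f}  residual energy {energy:.6f}")
```

Because the atoms have unit norm, each step removes the squared coefficient from the residual energy, so the printed energies never increase.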
A method for exploiting inter-channel redundancies of stereophonic or multichannel audio signals is presented. In contrast to known stereo redundancy reduction techniques used in joint stereo audio coding, where only the statistical dependencies between two concurrent samples of the left and right channel signals are considered, the adaptive inter-channel prediction also takes into account possible phase or time delay between the channels and exploits more than one value of the cross-correlation function. The analysis of subjective listening test results has shown that this technique is especially effective for a class of test sequences which has proven to be most critical for the ISO MPEG Layer II and Layer III codecs at bit rates of 2×64 kbit/s. For these signals the gain due to the stereo redundancy reduction technique used in Layer III joint stereo coding is less than 5-10 dB, while in Layer II joint stereo coding no specific stereo redundancy reduction technique is used. In a first step, the adaptive inter-channel prediction has been applied to an ISO MPEG Layer II codec. The simulation results show that a prediction gain of up to 30-40 dB can be achieved for large parts of the above-mentioned signals.
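The role of the inter-channel delay can be illustrated with a deliberately simplified sketch. It assumes a single least-squares prediction tap and an exhaustive lag search on synthetic white noise, whereas the abstract describes an adaptive predictor that uses several cross-correlation values; the function names and signal parameters are hypothetical, and the resulting gains are not comparable to the figures reported above.

```python
import math
import random


def prediction_gain_db(target, reference, lag):
    """Gain (dB) of predicting target[n] from reference[n - lag] with one tap."""
    pairs = [(target[i], reference[i - lag]) for i in range(lag, len(target))]
    cross = sum(t * r for t, r in pairs)
    ref_energy = sum(r * r for _, r in pairs)
    coeff = cross / ref_energy  # least-squares predictor coefficient
    target_energy = sum(t * t for t, _ in pairs)
    residual_energy = sum((t - coeff * r) ** 2 for t, r in pairs)
    return 10.0 * math.log10(target_energy / residual_energy)


def best_lag(target, reference, max_lag):
    """Lag in 0..max_lag that maximizes the single-tap prediction gain."""
    return max(range(max_lag + 1),
               key=lambda lag: prediction_gain_db(target, reference, lag))


# Synthetic stereo pair: the right channel is a delayed, scaled copy of the
# left channel plus a small independent component.
rng = random.Random(1)
true_delay = 7
left = [rng.gauss(0.0, 1.0) for _ in range(2000)]
right = [0.0] * len(left)
for i in range(true_delay, len(left)):
    right[i] = 0.8 * left[i - true_delay] + 0.05 * rng.gauss(0.0, 1.0)

lag = best_lag(right, left, max_lag=20)
print("estimated lag:", lag)
print("gain at lag 0 (dB):", prediction_gain_db(right, left, 0))
print("gain at best lag (dB):", prediction_gain_db(right, left, lag))
```

On this signal a predictor restricted to concurrent samples (lag 0) finds almost nothing to remove, while the delay-compensated predictor removes most of the right-channel energy.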
This paper describes the low-complexity 14 kHz audio coding algorithm which has recently been standardized by ITU-T as Recommendation G.722.1 Annex C ("G.722.1C"). The algorithm is an extension to ITU-T Recommendation G.722.1 and doubles the G.722.1 algorithm to permit 14 kHz audio bandwidth using a 32 kHz audio sample rate, at 24, 32, and 48 kbit/s. The G.722.1C codec features very high audio quality and extremely low computational complexity compared to other state-of-the-art audio coding algorithms. This codec is suitable for use in video conferencing, teleconferencing, and Internet streaming applications. Subjective test results from the characterization phase of G.722.1C are also presented in the paper.