For very low bit rate audio coding applications in mobile communications or on the Internet, parametric audio coding has evolved as a technique complementing the more traditional approaches. These are transform codecs originally designed for achieving CD-like quality on one hand, and specialized speech codecs on the other hand. Both of these techniques usually represent the audio signal waveform in a way such that the decoder output signal gives an approximation of the encoder input signal, while taking into account perceptual *** to this approach, in parametric audio coding the models of the signal source and of human perception are ***. The source model is now based on the assumption that the audio signal is the sum of "components," each of which can be approximated by a relatively simple signal model with a small number of parameters. The perception model is based on the assumption that the sound of the decoder output signal should be as similar as possible to that of the encoder input signal. ***, the approximation of waveforms is no longer ***. This approach can lead to a very efficient representation. However, a suitable set of models for signal components, a good decomposition, and a good parameter estimation are all vital for achieving maximum audio quality. We will give an overview of the current status of parametric audio coding developments and demonstrate advantages and challenges of this ***, we will indicate possible directions of further improvements.
Sinusoidal modeling is one of the most popular techniques for low-bit-rate audio coding. Usually, the sinusoidal parameters (amplitude, pulsation, and phase of each sinusoidal component) are kept constant within a time segment. An alternative model, the so-called Exponentially-Damped Sinusoidal (EDS) model, includes an additional damping parameter for each sinusoidal component to better represent the signal characteristics. However, it had never been shown that the EDS model could be efficient for perceptual audio coding. To that end, we propose in this paper an efficient analysis/synthesis framework with dynamic time segmentation on transients and psychoacoustic modeling, and an asymptotically optimal entropy-constrained quantization method for the four sinusoid parameters (i.e., including damping). We then apply this coding technique to real audio excerpts for a given entropy target corresponding to a low bit rate (20 kbit/s), and compare this method with a classical sinusoidal coding scheme using a constant-amplitude sinusoidal model and the perceptually weighted Matching Pursuit algorithm. Subjective listening tests show that the EDS model is more efficient on audio samples with fast transient content, and similar to the classical model for more stationary audio samples.
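To make the difference between the two models concrete, here is a minimal synthesis sketch in Python. It is not the analysis/synthesis framework of the paper: the function names, the per-sample damping convention (positive values decay), and the example parameter values are all assumptions chosen for illustration. Setting the damping to zero recovers the classical constant-amplitude sinusoid.

```python
import math

def eds_component(amplitude, damping, omega, phase, num_samples):
    """One exponentially damped sinusoid over a segment.

    Assumed convention: x[n] = amplitude * exp(-damping * n) * cos(omega * n + phase),
    with omega in radians per sample. damping = 0 gives the classical model.
    """
    return [
        amplitude * math.exp(-damping * n) * math.cos(omega * n + phase)
        for n in range(num_samples)
    ]

def synthesize_segment(components, num_samples):
    """Sum several (amplitude, damping, omega, phase) components into one segment."""
    segment = [0.0] * num_samples
    for amplitude, damping, omega, phase in components:
        for n, value in enumerate(
            eds_component(amplitude, damping, omega, phase, num_samples)
        ):
            segment[n] += value
    return segment

# Hypothetical parameters: one stationary partial and one decaying partial.
components = [
    (1.0, 0.00, 2.0 * math.pi * 0.05, 0.0),
    (0.8, 0.02, 2.0 * math.pi * 0.12, 0.5),
]
segment = synthesize_segment(components, 256)
```

Each component costs four parameters per segment instead of three, which is why the abstract pairs the model with a quantization scheme covering all four.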
In this paper we intend to optimize a wavelet-based dictionary for transient modeling with application to parametric audio coding. Transient modeling is performed by matching pursuit with an overcomplete dictionary composed of orthonormal wavelet functions that implement a wavelet-packet filter bank. We try to find the prototype filter length, the decomposition depth, and the orthogonal wavelet family that lead to the best balance between mean squared error and computational cost. We are also interested in the structure of the wavelet decomposition tree. To that end, a comparison between the wavelet transform and the full wavelet-packet transform is performed. Finally, a comparative analysis between wavelets and exponentially damped sinusoids is shown in the experimental results. The proposed transient modeling method is suitable for integration into a parametric audio coder based on the three-part model of sines, transients, and noise (STN model). (C) 2009 Elsevier Inc. All rights reserved.
This paper deals with the application of adaptive signal models for parametric audio coding. A fully parametric audio coder, which decomposes the audio signal into sinusoids, transients, and noise, is proposed. Adaptive signal models for sinusoidal, transient, and noise modeling are included in the parametric scheme in order to achieve high-quality, low-bit-rate audio coding. A new sinusoidal modeling method based on a perceptual distortion measure is proposed. For transient modeling, a fast and effective method based on matching pursuit with a mixed dictionary is chosen. The residue of the previous models is analyzed as a noise-like signal. The proposed parametric audio coder allows high-quality audio coding of one-channel audio signals at 16 kbit/s (average bit rate). A bit-rate scalable version of the parametric audio coder is also proposed in this work. Bit-rate scalability is intended for audio streaming applications, which are currently in high demand. The performance of the proposed parametric audio coders (non-scalable and scalable) is assessed in comparison to widely used audio coders operating at similar bit rates.
A Bark-band residual noise model integrated with the human hearing mechanism is proposed to efficiently complement the sinusoidal model in parametric audio coding. The time-varying spectrum of the residual noise is represented by Bark-scale piecewise-constant magnitude estimates along with random phases. In the proposed noise model, the Bark-band information is obtained by a short-time FFT, and a windowed overlap-add technique is used to remove boundary discontinuities. SVQ is also incorporated into the parameter quantization process to meet the low-bit-rate coding demand. Simulation results and informal listening tests show that when the sinusoidal model is combined with the Bark-band noise model, better synthesized audio quality can be achieved than with the original sinusoidal-modeling audio codec.
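The idea of piecewise-constant Bark-band magnitudes with random phases can be sketched for a single frame as follows. This is only an illustration under stated assumptions: the Zwicker-Terhardt formula is one common Bark approximation and may not be the one used in the paper, the per-band value is taken as the RMS magnitude, and the function names are hypothetical. Windowing, overlap-add, and SVQ quantization are left out.

```python
import cmath
import math
import random

def hz_to_bark(freq_hz):
    """Zwicker-Terhardt approximation of the Bark scale (an assumed choice)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)

def bin_to_band(k, num_bins, sample_rate):
    """Map a one-sided FFT bin index to an integer Bark band index."""
    fft_size = 2 * (num_bins - 1)
    return int(hz_to_bark(k * sample_rate / fft_size))

def bark_band_magnitudes(magnitudes, sample_rate):
    """Reduce a one-sided magnitude spectrum to one RMS magnitude per Bark band."""
    num_bins = len(magnitudes)
    energy, count = {}, {}
    for k, mag in enumerate(magnitudes):
        band = bin_to_band(k, num_bins, sample_rate)
        energy[band] = energy.get(band, 0.0) + mag * mag
        count[band] = count.get(band, 0) + 1
    return {band: math.sqrt(energy[band] / count[band]) for band in energy}

def noise_spectrum(band_mags, num_bins, sample_rate, rng):
    """Rebuild a spectrum: piecewise-constant magnitude, uniformly random phase."""
    return [
        cmath.rect(band_mags[bin_to_band(k, num_bins, sample_rate)],
                   rng.uniform(-math.pi, math.pi))
        for k in range(num_bins)
    ]

# Hypothetical frame: flat residual spectrum, 257 bins at 16 kHz sampling.
residual_magnitudes = [2.0] * 257
band_mags = bark_band_magnitudes(residual_magnitudes, 16000)
spectrum = noise_spectrum(band_mags, 257, 16000, random.Random(0))
```

Only one number per Bark band has to be transmitted per frame, since the decoder regenerates the phases itself.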
In this letter, we propose a novel matching pursuit-based method for transient modeling with application to parametric audio coding. The overcomplete dictionary for the matching pursuit is composed of wavelet functions that implement a wavelet-packet filter bank. The proposed transient modeling method is suitable for integration into a parametric audio coder based on the three-part model of sines, transients, and noise (STN model). A comparative analysis between wavelet and exponentially damped sinusoidal functions is shown in the experimental results. The mean-squared-error performance of the proposed approach is better than that obtained with damped sinusoids.
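The greedy loop at the heart of matching pursuit can be sketched in a few lines of Python. The toy dictionary below (four Haar functions plus four unit spikes on a length-4 signal) is an assumption made purely for illustration and is far smaller than a wavelet-packet dictionary; the function names are hypothetical as well.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(atom):
    """Scale an atom to unit Euclidean norm, as matching pursuit assumes."""
    norm = math.sqrt(dot(atom, atom))
    return [a / norm for a in atom]

def matching_pursuit(signal, dictionary, num_iterations):
    """Greedy decomposition: repeatedly pick the atom most correlated with
    the residual, record (atom index, coefficient), and subtract its
    contribution. Atoms must have unit norm."""
    residual = list(signal)
    decomposition = []
    for _ in range(num_iterations):
        correlations = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(correlations[k]))
        coeff = correlations[best]
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
        decomposition.append((best, coeff))
    return decomposition, residual

# Toy overcomplete dictionary: Haar functions (indices 0-3) and spikes (4-7).
dictionary = [normalize(atom) for atom in [
    [1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 0, 0], [0, 0, 1, -1],
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
]]
signal = [0.0, 0.0, 3.0, -3.0]
decomposition, residual = matching_pursuit(signal, dictionary, 2)
```

Because the atoms have unit norm, the residual energy cannot increase from one iteration to the next; the mean squared error mentioned in the abstract is this residual energy divided by the signal length.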
ISBN: (print) 9781424442959; 9781424442966
In this paper, we propose a method of restoring the principal-to-ambient energy ratio (PAR) at the decoder in principal component analysis (PCA)-based parametric audio coding. The conventional approach applies post-scaling at the decoder using energy information extracted from the input signal at the encoder. However, this approach has the problem that the relative energy of the principal source in the reconstructed signal is smaller than in the original signal and is also affected by the panning angle of the principal source. To restore the PAR at the decoder, the proposed method estimates the post-scaling factors using parametric information extracted from the PCA-based formulation. Objective and subjective results verify the performance improvement of the proposed method.
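As background for the PCA-based formulation, the following Python sketch analyzes a stereo frame through its 2x2 covariance matrix. It does not reproduce the paper's post-scaling method or its exact PAR definition: treating the larger eigenvalue as "principal" energy and the smaller as "ambient" energy is an assumption, and the function name and test signal are hypothetical.

```python
import math

def pca_stereo(left, right):
    """Closed-form PCA of a zero-mean stereo frame.

    Returns (larger eigenvalue, smaller eigenvalue, rotation angle) of the
    unnormalized covariance matrix [[a, b], [b, c]].
    """
    a = sum(x * x for x in left)
    c = sum(y * y for y in right)
    b = sum(x * y for x, y in zip(left, right))
    mean = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    # Angle of the principal axis; for a single amplitude-panned source plus
    # equal-energy uncorrelated ambience it equals the panning angle.
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return mean + spread, mean - spread, angle

# Hypothetical frame: one source panned at 0.3 rad plus weak ambience that is
# orthogonal to the source and between channels.
num_samples = 1000
pan = 0.3
source = [math.sin(2.0 * math.pi * 10 * n / num_samples) for n in range(num_samples)]
amb_left = [0.1 * math.sin(2.0 * math.pi * 37 * n / num_samples) for n in range(num_samples)]
amb_right = [0.1 * math.sin(2.0 * math.pi * 53 * n / num_samples) for n in range(num_samples)]
left = [math.cos(pan) * s + al for s, al in zip(source, amb_left)]
right = [math.sin(pan) * s + ar for s, ar in zip(source, amb_right)]

principal_energy, ambient_energy, angle = pca_stereo(left, right)
eigenvalue_energy_ratio = principal_energy / ambient_energy
```

In this toy frame the larger eigenvalue also contains some ambience energy, so the eigenvalue ratio is only a proxy for the true principal-to-ambient ratio.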
This paper proposes improvements to a low-bit-rate parametric audio coder with the sinusoidal model as its kernel. First, we propose a new method to effectively order and select the perceptually most important sinusoids: at each step, the sinusoid that contributes most to the reduction of the overall NMR is chosen. Combined with our improved parametric psychoacoustic model and advanced peak riddling techniques, this greatly reduces the number of sinusoids required and greatly enhances the coding efficiency. A lightweight version is also given, which reduces the amount of computation with only a small sacrifice in performance. Second, we propose two enhancement techniques for sinusoid synthesis: bandwidth enhancement and line enhancement. With little overhead, the effective bandwidth can be extended by one more octave, and the timbre tends to sound much brighter and fuller.
Object-based audio coding can provide new music applications with interactivity. To efficiently compress a large number of target audio objects, a subband-based parametric coding scheme has been adopted for MPEG spatial audio object coding. In this letter, the time-frequency (T/F) subband analysis structure is investigated. A reconfigured T/F structure is also proposed to improve the rendering of sound scenes such as 'karaoke' and 'solo' play in interactive music scenarios. The experimental results confirm that the proposed scheme remarkably improves the SNR and sound quality.