ISBN: (print) 0780366859
This paper describes a new audio coding scheme based on sinusoidal coding of signals. Sinusoidal coding represents a given signal as a sum of sinusoids. The parameters of the sinusoids (the amplitudes, phases and frequencies) are transmitted to allow the signal to be reconstructed. In the proposed scheme, the sinusoidal parameters are sorted according to energy content and perceptual significance. The most significant parameters are transmitted first, so that only a small subset of the parameters is needed for signal reconstruction. The proposed scheme incurs a low delay and uses a 20 ms frame length. Results show that the coder, operating at a mean rate of 39 kb/s, performs favorably in comparison with the MPEG-4 coder at 42 kb/s.
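The idea of reconstructing from only the most significant sinusoids can be sketched as follows. This is a minimal illustration, not the paper's coder: it ranks partials by energy only (the perceptual weighting is omitted), and the sampling rate, the partial values and the function names are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 8000   # assumed sampling rate in Hz
FRAME = 160          # 20 ms frame at the assumed 8 kHz rate

def synthesize(partials, num_samples, sample_rate):
    """Sum sinusoids given (amplitude, frequency_hz, phase) triples."""
    return [
        sum(a * math.cos(2 * math.pi * f * n / sample_rate + p) for a, f, p in partials)
        for n in range(num_samples)
    ]

def most_significant(partials, keep):
    """Order partials by energy (amplitude squared) and keep the strongest ones."""
    return sorted(partials, key=lambda t: t[0] ** 2, reverse=True)[:keep]

# Hypothetical frame content: two strong partials and two weak ones.
partials = [(0.05, 1900.0, 0.3), (1.0, 440.0, 0.0), (0.5, 880.0, 1.0), (0.02, 3100.0, 2.0)]

full = synthesize(partials, FRAME, SAMPLE_RATE)
top2 = most_significant(partials, 2)
approx = synthesize(top2, FRAME, SAMPLE_RATE)

# Energy of the reconstruction error relative to the full signal.
error_energy = sum((x - y) ** 2 for x, y in zip(full, approx))
signal_energy = sum(x * x for x in full)
print(top2, error_energy / signal_energy)
```

Because the strongest partials are sent first, a decoder that receives only the head of the list still recovers most of the frame energy.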
This paper proposes a tonal component coding algorithm for a codec that employs a transform followed by Huffman coding, such as MPEG-2 audio AAC (advanced audio coding) [1]. After the input audio signal is mapped onto the frequency domain, the proposed algorithm removes local maxima components that degrade the coding efficiency. As a result of this withdrawal, the flatness of the spectrum increases and the efficiency of Huffman coding is improved. The removed components are encoded separately as side information. When the frequency resolution of the time/frequency mapping is high, this algorithm works more effectively since local maximum samples appear more frequently with such a mapping. Simulation results show that this algorithm achieves as much as an 11% bit reduction per frame and improves the coding efficiency in 41% of all audio frames. (C) 1999 Scripta Technica, Electron Comm Jpn Pt 3, 82(4): 71-78, 1999.
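The effect of withdrawing local maxima on spectral flatness can be illustrated with a small sketch. The detection rule used here (a bin is removed when it exceeds a fixed multiple of the mean of its two neighbours), the threshold value and the toy spectrum are assumptions for illustration; the abstract does not specify the paper's actual criterion or side-information format.

```python
import math

def spectral_flatness(magnitudes):
    """Geometric mean over arithmetic mean of the power spectrum (1.0 means flat)."""
    power = [m * m for m in magnitudes]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic

def remove_peaks(magnitudes, ratio=4.0):
    """Pull out local maxima exceeding `ratio` times the mean of their neighbours.

    Returns the flattened spectrum and a dict {bin index: removed amount},
    which stands in for the separately coded side information.
    """
    residual = list(magnitudes)
    removed = {}
    for k in range(1, len(magnitudes) - 1):
        left, mid, right = magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]
        neighbour_mean = (left + right) / 2
        if mid > left and mid > right and mid > ratio * neighbour_mean:
            removed[k] = mid - neighbour_mean
            residual[k] = neighbour_mean
    return residual, removed

# Hypothetical magnitude spectrum with two tonal peaks (bins 2 and 6).
spectrum = [1.0, 1.2, 9.0, 1.1, 0.9, 1.0, 12.0, 1.3, 1.0]
residual, removed = remove_peaks(spectrum)

# The decoder adds the side information back to restore the original bins.
restored = list(residual)
for k, amount in removed.items():
    restored[k] += amount

print(spectral_flatness(spectrum), spectral_flatness(residual))
```

The residual spectrum is much flatter than the original, which is the property the abstract links to more efficient Huffman coding.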
We propose a method that hierarchically quantizes wideband Modified Discrete Cosine Transform (MDCT) coefficients by developing a module that has a transform coding method primarily for audio as the basic structural unit and freely using this module multiple times at the desired frequencies. The major feature of this method is to implement a simple structure having a high degree of freedom in scalable coding to hierarchically quantize MDCT coefficients over a wide band of frequencies by sharing the proposed module and using it multiple times. This paper presents examples using combinations of the module operating at a sampling frequency of 48 kHz and a bit rate of at least 8 kbit/s. In this example, a bit rate of at least 8 kbit/s and a reconstructed frequency band of at least 4 kHz can be selected as the objective. Subjective evaluation tests are performed to verify the effectiveness of the proposed method. (C) 2001 Scripta Technica
A perceptual audio codec with improved efficiency is presented that analyzes each block of the input signal and selects a suitable time/frequency mapping transform for it. The selection is based on statistics of the input signal vis-a-vis the energy compaction and resolution properties of the transforms employed, which include the DFT (uniform subbands), DFT (critical subbands), DCT and CELP (for speech-only blocks). The performance of the codec is compared with the widely used MPEG-1 Layer-III algorithm. The efficiency gain is indicated by improved subjective listening test grades for the proposed codec compared to those for MPEG-1 Layer-III at similar bit rates. The paper concludes with a discussion of future research implications of the work.
This article provides a compact overview of the history, technology, and performance of MPEG Surround. The technology of MPEG Surround is based on the spatial audio coding (SAC) principle: In the encoder, a mono- or stereophonic down-mix is generated from the multichannel input signal, and additional parametric side information is extracted to guide the subsequent up-mix procedure in the *** Moving Pictures Expert Group (MPEG) Surround, a data-rate efficient coding scheme for high-quality multichannel sound using novel parametric coding techniques has been standardized.
This paper proposes a lossless scalable audio coding scheme and quality enhancement processing at the decoder to compensate for some missing scalable units of information. The bit rate scalability is achieved by combining high-compression coding, such as MPEG-4, and horizontal bit slicing of the PCM-coded error signal between the original waveform and the locally reconstructed MPEG-4 signal. The horizontally sliced stream may be transported through an IP network with priority. Even if some units are missing at the decoder, a waveform of reasonable quality can be reconstructed as long as the important packets are preserved. In addition, quality enhancement procedures including scale adjustment and post-processing have been proposed. The scale adjustment eliminates unnecessary zeros, and the post-processing recovers the spectral envelope characteristics of the original input signal. Objective quality evaluation confirms that the two techniques are useful for quality enhancement when lower priority packets are lost. This scheme enables graceful degradation by supporting lossless, near lossless, and high-compression coding within a single scalable framework, and is useful for narrowband to broadband audio streaming.
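Horizontal bit slicing of an error signal can be sketched as below. This is an illustrative simplification under stated assumptions: the error samples are handled in sign-magnitude form, each bit plane stands in for one prioritized packet, and the sample values, word length and function names are hypothetical rather than taken from the paper.

```python
def slice_bit_planes(magnitudes, num_bits):
    """Split non-negative integers into bit planes, most significant plane first."""
    return [[(m >> b) & 1 for m in magnitudes] for b in range(num_bits - 1, -1, -1)]

def rebuild(planes, num_bits, received):
    """Rebuild magnitudes from the first `received` planes; lost planes count as zero."""
    out = [0] * len(planes[0])
    for i, plane in enumerate(planes[:received]):
        shift = num_bits - 1 - i
        out = [o | (bit << shift) for o, bit in zip(out, plane)]
    return out

def apply_signs(magnitudes, negative):
    """Reattach the sign flags to the rebuilt magnitudes."""
    return [-m if neg else m for m, neg in zip(magnitudes, negative)]

# Hypothetical error signal between the original and the locally decoded waveform.
error = [5, -3, 12, 0, -9, 7]
NUM_BITS = 4                              # assumed magnitude word length
negative = [e < 0 for e in error]
planes = slice_bit_planes([abs(e) for e in error], NUM_BITS)

# All planes received: the error signal is recovered exactly (lossless).
lossless = apply_signs(rebuild(planes, NUM_BITS, 4), negative)
# Only the two highest-priority planes received: a coarser approximation.
coarse = apply_signs(rebuild(planes, NUM_BITS, 2), negative)
print(lossless, coarse)
```

Dropping the low-priority planes removes only the least significant bits, which is why quality degrades gradually instead of failing outright.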
This paper proposes audio coding using an efficient long-term prediction method to enhance the perceptual quality of audio codecs for speech input signals at low bit rates. MPEG-4 AAC-LTP exploited a similar concept, but its improvement was not significant because of the small prediction gain due to long prediction lags and aliased components caused by the transformation with a time-domain aliasing cancelation (TDAC) technique. The proposed algorithm increases the prediction gain by employing a de-harmonizing predictor and a long-term compensation filter. The look-back memory elements are first constructed by applying the de-harmonizing predictor to the input signal, then the prediction residual is encoded and decoded by transform audio coding. Finally, the long-term compensation filter is applied to the updated look-back memory of the decoded prediction residual to obtain synthesized signals. Experimental results show that the proposed algorithm has much lower spectral distortion and higher perceptual quality than conventional approaches, especially for harmonic signals such as voiced speech.
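The general predict-then-compensate structure can be illustrated with a plain one-tap long-term predictor. This is a generic textbook filter pair swapped in for illustration, not the paper's de-harmonizing predictor or compensation filter; the transform coding of the residual is skipped, and the lag, gain and test signal are assumptions.

```python
def ltp_analysis(signal, lag, gain):
    """Residual r[n] = x[n] - gain * x[n - lag]; samples before the start count as zero."""
    return [
        x - gain * (signal[n - lag] if n >= lag else 0)
        for n, x in enumerate(signal)
    ]

def ltp_synthesis(residual, lag, gain):
    """Inverse filter y[n] = r[n] + gain * y[n - lag], using its own past output."""
    out = []
    for n, r in enumerate(residual):
        out.append(r + gain * (out[n - lag] if n >= lag else 0))
    return out

# Hypothetical strictly periodic signal (period 8), standing in for voiced speech.
pitch_cycle = [0, 3, 5, 3, 0, -3, -5, -3]
signal = pitch_cycle * 5

residual = ltp_analysis(signal, lag=8, gain=1)
rebuilt = ltp_synthesis(residual, lag=8, gain=1)

signal_energy = sum(x * x for x in signal)
residual_energy = sum(r * r for r in residual)
print(signal_energy, residual_energy)
```

For a harmonic input the residual carries far less energy than the signal, which is the prediction gain the abstract refers to; in a real codec the residual would be quantized before the synthesis filter is applied.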
Automatic speech/music discrimination is an important tool used in many multimedia applications and has become a research topic of interest in recent years. This paper presents our latest work in the speech/music discrimination field, aiming to improve the coding efficiency of standard audio coders (e.g., MP3, AAC) when speech and music signals are involved. In order to discriminate between speech and music, a fuzzy rule-based expert system is incorporated into the decision-making stage of traditional speech/music discrimination systems. The knowledge base of the fuzzy expert system has been obtained by means of a typical genetic learning algorithm (the Pittsburgh algorithm). The proposed speech/music discrimination scheme manages the operation of an intelligent audio coder, which selects a GSM coder for speech frames and an AAC coder for music frames, resulting in a lower bit rate than when a standardized audio coder (AAC in this work) is used alone. Further, the intelligent audio coder has been designed to obtain a subjective audio quality similar to that of AAC. GSM operates at 13 kbit/s, while in the experiments the bit rate specified for AAC was 32 kbit/s for one-channel audio signals.
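The bit-rate saving from switching coders follows from simple averaging over frame labels. The sketch below uses the two rates quoted in the abstract (13 kbit/s for GSM, 32 kbit/s for AAC); the 60/40 split between speech and music frames, the assumption of equal-length frames and the function name are hypothetical.

```python
GSM_KBIT_S = 13.0   # rate quoted for the GSM coder
AAC_KBIT_S = 32.0   # rate quoted for the AAC coder (one-channel audio)

def mean_bit_rate(labels):
    """Average rate over equal-length frames labelled 'speech' or 'music'."""
    rates = [GSM_KBIT_S if label == "speech" else AAC_KBIT_S for label in labels]
    return sum(rates) / len(rates)

# Hypothetical classifier output: 6 speech frames and 4 music frames.
labels = ["speech"] * 6 + ["music"] * 4
switched = mean_bit_rate(labels)
saving = AAC_KBIT_S - switched
print(switched, saving)
```

With this assumed mix the switched coder averages (6 x 13 + 4 x 32) / 10 = 20.6 kbit/s, compared with 32 kbit/s for AAC alone; the actual saving depends on how much of the material is classified as speech.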
Audio Video coding Standard (AVS) is a second-generation source coding standard and the first standard for audio and video coding in China with independent intellectual property rights. Its performance is comparable to that of international standards. Its coding efficiency is 2 to 3 times greater than that of MPEG-2. The technical solution is simpler and can save considerable channel resources. After more than ten years of development, AVS has achieved great success. The latest version of the AVS audio coding standard is under development and mainly targets the increasing demand for low-bit-rate, high-quality audio services. The paper reviews the history and recent development of the AVS audio coding standard in terms of basic features, key techniques and performance. Finally, the future development of the AVS audio coding standard is discussed.