Perceptual audio coders use an estimated masked threshold to determine the maximum permissible just-inaudible noise level introduced by quantization. This estimate is derived from a psychoacoustic model mimicking the properties of masking. Most psychoacoustic models for coding applications use a uniform (equal bandwidth) spectral decomposition as a first step to approximate the frequency selectivity of the human auditory system. However, the equal filter properties of the uniform subbands do not match the nonuniform characteristics of cochlear filters and reduce the precision of psychoacoustic modeling. Even so, uniform filter banks are applied because they are computationally efficient. This paper presents a psychoacoustic model based on an efficient nonuniform cochlear filter bank and a simple masked threshold estimation. The novel filter-bank structure employs cascaded low-order IIR filters and appropriate down-sampling to increase efficiency. The filter responses are optimized for the modeling of auditory masking effects. Results of the new psychoacoustic model applied to audio coding show that it performs better, in terms of bit rate and/or quality, than other state-of-the-art models that use a uniform spectral decomposition. The low delay of the new model makes it particularly suitable for low-delay coders.
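The mismatch between uniform subbands and auditory frequency resolution can be illustrated with a minimal sketch. It assumes the widely used Zwicker and Terhardt approximation of the Bark (critical-band rate) scale, which is not taken from the abstract; the function names `hz_to_bark` and `band_width_in_bark` are hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz.

    Assumption: Zwicker-Terhardt approximation, used here only for
    illustration; the paper's own cochlear model is not reproduced.
    """
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


def band_width_in_bark(start_hz, width_hz):
    """Width, on the Bark scale, of a band of fixed width in Hz."""
    return hz_to_bark(start_hz + width_hz) - hz_to_bark(start_hz)


# A 100 Hz wide uniform subband covers far more of the auditory scale
# at low frequencies than the same subband does near 10 kHz.
low_band = band_width_in_bark(100.0, 100.0)
high_band = band_width_in_bark(10000.0, 100.0)
```

Under this approximation the low-frequency subband spans roughly one Bark while the high-frequency one spans a small fraction of a Bark, which is the kind of mismatch a nonuniform filter bank is meant to avoid.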
Hybrid In Band on Channel (IBOC) digital audio broadcasting simultaneously with analog amplitude modulation (AM) has been proposed as a hybrid solution to digital audio broadcasting in the AM band. Since the AM band is crowded and since the available bandwidth per program is limited, adding digital transmission is a challenging proposition. To achieve FM-like audio quality, an audio coder rate of 32-64 kb/s may be required. One of the currently proposed hybrid IBOC-AM systems is 30 kHz wide. Severe second adjacent interference may occur in certain geographical areas. This may lead to a loss of 40% of the effective transmission audio bit rate. To cope with such harsh transmission conditions, we present a solution based on embedded/multidescriptive audio coding with matched multistream transmission in separate frequency bands. With the loss of one frequency band, the embedded system blends to a lower audio coder rate with a much better quality than analog AM. The nonembedded system without multistream transmission fails catastrophically when a little more than one sideband is severely interfered with, causing a severe discontinuity in quality while blending directly to analog AM. A number of detailed robust embedded systems are outlined. We also show how multistream transmission schemes can be used with nonembedded audio coders. Both daytime and nighttime scenarios are included. This paper contains a catalog of possible systems for different audio quality levels and interference scenarios, including systems with 20 kHz bandwidth rather than 30 kHz.
This paper addresses the problem of computing, in a subband audio coder, the maximum quantisation noise power that can be injected in each band to ensure transparent coding when low-selectivity filter banks are used. A low-complexity strategy, taking into account the frequency responses of the synthesis filter bank, is proposed for keeping the overall distortion due to quantisation noise always below the masking threshold (provided by a psycho-acoustic model) for prototype filters of any length.
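The constraint described above can be sketched numerically. The following is a minimal illustration, not the strategy proposed in the paper: it assumes the noise injected in band k reaches frequency point i with power gain `leak[i, k]` (a hypothetical matrix standing in for the squared magnitude responses of the synthesis filters) and simply scales an initial per-band budget uniformly until the summed noise sits at or below the masking threshold everywhere.

```python
import numpy as np


def scale_noise_budget(leak, threshold, initial):
    """Uniformly scale per-band noise powers to respect a masking threshold.

    leak[i, k]  : assumed power gain of synthesis filter k at frequency point i
    threshold[i]: masking threshold at frequency point i
    initial[k]  : starting noise power for band k (must give nonzero total)
    """
    total = leak @ initial              # overall noise at each frequency point
    scale = np.min(threshold / total)   # tightest point decides the scaling
    return initial * scale


# Hypothetical two-band example with 20% leakage between bands.
leak = np.array([[1.0, 0.2],
                 [0.2, 1.0]])
threshold = np.array([1.0, 0.5])
budget = scale_noise_budget(leak, threshold, initial=threshold.copy())
```

With poorly selective filters the off-diagonal entries of `leak` grow, so a budget set equal to the threshold band by band would overshoot it; the scaling step shows why the synthesis responses have to enter the computation.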
A new algorithm for achieving flexible tiling of the time axis for audio coding purposes is presented. It is based on the calculation of the distances among a predetermined number of time-frequency pairs. From the computed distances, a clustering process determines the final subdivision of each audio frame. Experimental results demonstrate the good performance of the proposed algorithm, which provides high coding efficiency with reduced complexity.
ISBN: 0780388348 (print)
This paper presents specialized DSP instructions and their hardware architecture for high-quality audio algorithms, such as the MPEG-2/4 advanced audio coding (AAC), Dolby AC-3, MPEG-2 backward compatible (BC), etc. The proposed architecture is specially designed and optimized for the IMDCT (inverse modified discrete cosine transform) and Huffman decoding in the AAC decoding algorithm. Performance comparisons show a significant improvement compared with TMS320C62x and ASDSP21060 for the IMDCT computation. Furthermore, the dedicated Huffman accelerator performs the decoding process in only one cycle. The proposed DPU (data processing unit) consists of 107,860 gates and achieves 150 MIPS.
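For readers unfamiliar with the IMDCT that this architecture accelerates, the transform pair can be sketched directly from its textbook definition. This is a plain O(N²) reference in NumPy with no windowing, written only to show the time-domain aliasing cancellation under overlap-add; it is not the optimized algorithm or instruction sequence of the paper, and the function names are hypothetical.

```python
import numpy as np


def mdct(block):
    """Direct MDCT: 2N real samples -> N coefficients."""
    n_half = len(block) // 2
    n = np.arange(2 * n_half)
    k = np.arange(n_half)
    basis = np.cos(np.pi / n_half * np.outer(k + 0.5, n + 0.5 + n_half / 2))
    return basis @ block


def imdct(coeffs):
    """Direct IMDCT: N coefficients -> 2N time-aliased samples (1/N scaling)."""
    n_half = len(coeffs)
    n = np.arange(2 * n_half)
    k = np.arange(n_half)
    basis = np.cos(np.pi / n_half * np.outer(n + 0.5 + n_half / 2, k + 0.5))
    return (basis @ coeffs) / n_half


def overlap_add_roundtrip(signal, n_half):
    """Transform 50%-overlapping blocks and overlap-add the inverses."""
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = signal[start:start + 2 * n_half]
        out[start:start + 2 * n_half] += imdct(mdct(block))
    return out
```

Each inverse block on its own contains aliasing; only the samples covered by two overlapping blocks are reconstructed exactly, so the first and last `n_half` samples of the output remain aliased. Practical decoders use fast algorithms and windows rather than this direct matrix form.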
This paper proposes a specialized DSP architecture and its instructions, which efficiently support MPEG-2/4 AAC high-quality audio algorithms. The proposed architecture is specially designed and optimized for the IMDCT (inverse modified discrete cosine transform), Huffman decoding, etc. Performance comparisons show significant improvement compared with TMS320C62x and ASDSP21060 for the IMDCT computation. Furthermore, the dedicated Huffman accelerator performs the decoding process in only 2 cycles. The proposed DSP has been synthesized using the Samsung SEC 0.18 μm standard cell library. The proposed DSP core consists of 120,283 gates and runs at 200 MHz.
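As a point of comparison for the dedicated Huffman accelerator, a straightforward software decoder consumes the bitstream one bit at a time. The sketch below assumes a tiny hypothetical prefix codebook (`EXAMPLE_CODEBOOK`), not an actual AAC codebook, and represents bits as a string of "0" and "1" characters for readability.

```python
def decode_prefix_code(bits, codebook):
    """Decode a string of '0'/'1' characters with a prefix-free codebook.

    codebook maps codeword strings to symbols; it is assumed to be
    prefix-free, so the first match is always the right one.
    """
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in codebook:
            symbols.append(codebook[current])
            current = ""
    if current:
        raise ValueError("bitstream ended inside a codeword")
    return symbols


# Hypothetical three-symbol codebook, for illustration only.
EXAMPLE_CODEBOOK = {"0": "a", "10": "b", "11": "c"}
decoded = decode_prefix_code("010110", EXAMPLE_CODEBOOK)
```

The loop needs one iteration per input bit, so the cost per symbol grows with codeword length; a hardware unit that resolves a whole codeword in a fixed number of cycles removes that dependence.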
High audio data compression can be achieved by removing irrelevant signal information that is not detectable by even a well-trained or sensitive listener. Contemporary audio coding schemes like MP3, AAC, and Ogg Vorbis identify the irrelevant information during signal analysis by incorporating into the coder several psychoacoustic principles, including absolute hearing thresholds, critical band analysis, simultaneous masking, and temporal masking (Painter and Spanias, 2000). Masking is the process of removing faint but normally audible sound signals that are rendered inaudible because they are very close in frequency to, or have much smaller amplitudes than, surrounding sounds. Numerous studies have been conducted on genetic algorithms, which solve problems by modeling Darwinian evolution. The algorithms have recently been applied to audio coding with some success (Galos et al., 2003). To achieve audio compression, genetic algorithms analyze a large number of sound files to determine the chunks that are most likely to contain irrelevant signals. The combination of the irrelevant chunks forms a solution that is then used to compress any sound file. In this paper we present a study comparing the application of psychoacoustic principles and genetic algorithms to the compression of audio signals. We developed a coder to perform the experiment in which, as in most well-known audio coders, Huffman coding is used to handle lossless compression and the modified discrete cosine transform (MDCT) is used to transform the time-domain signals to the frequency domain. The results are compared using signal-to-noise ratios (SNRs) and subjective testing, in which eighteen subjects (students at CSUSB) are asked to listen to and rate the files decompressed by the two methods.
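The objective half of such a comparison rests on the signal-to-noise ratio between the original and decoded signals. A minimal sketch of the standard definition follows; the function name `snr_db` and the toy signals are assumptions, and the study's exact measurement procedure is not specified in the abstract.

```python
import numpy as np


def snr_db(reference, decoded):
    """Signal-to-noise ratio in dB between a reference and a decoded signal.

    Noise is taken to be the sample-wise difference; the signals are
    assumed to be time-aligned and of equal length.
    """
    reference = np.asarray(reference, dtype=float)
    decoded = np.asarray(decoded, dtype=float)
    noise_power = np.sum((reference - decoded) ** 2)
    signal_power = np.sum(reference ** 2)
    return 10.0 * np.log10(signal_power / noise_power)


# Toy example: a decoded signal that is uniformly 10% too quiet.
original = np.ones(4)
example_snr = snr_db(original, 0.9 * original)
```

SNR treats all error energy alike, whereas a perceptual coder deliberately places noise where it is masked, which is presumably why the study pairs the measurement with listening tests.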
A novel approach is proposed for effective high frequency regeneration in audio coding, which is based on a sinusoids plus noise model. It assumes a standard high efficiency advanced audio coding (HE-AAC) encoder, and modifies the decoder to exploit all available information in estimating the model parameters. From the lower band reconstruction of core AAC, frequency parameters of the high band sinusoids are estimated. Side information about spectral energy and the regenerated high band of standard HE-AAC are employed in estimating the magnitude parameters of the high band sinusoids as well as noise model parameters. The gains achieved by the proposed technique, over conventional HE-AAC, are demonstrated by subjective quality tests that were carried out on audio signals with significant harmonics in the high band.