ISBN: 0780371658 (print)
The ISO/IEC MPEG-4 audio standard includes the TwinVQ encoding tool. This tool is suitable for low-bit-rate general audio coding, but a drawback is the computational complexity of the encoder. To develop a faster TwinVQ encoder, two new fast vector quantization algorithms, area localized pre-selection and hit zone masking, are introduced. These algorithms exploit the two-stage pre-selection and main-selection procedure of the conjugate structure vector quantization used in TwinVQ. The improvement is evaluated by measuring the encoding speed and the quality of the reproduced sound. Finally, further optimized encoders are developed using the 3DNow!/SSE technologies. In total, encoding times are reduced to between one half and one third of those of the original TwinVQ encoder in the ISO/IEC reference encoder.
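The two-stage search that the abstract refers to can be illustrated with a minimal sketch. It assumes the common formulation of conjugate structure vector quantization, in which the reconstruction is the average of one codevector from each of two codebooks. It does not implement area localized pre-selection or hit zone masking, whose details the abstract does not give. The function name `conjugate_vq_search` and the parameter `n_candidates` are hypothetical.

```python
import numpy as np

def conjugate_vq_search(target, codebook_a, codebook_b, n_candidates=4):
    """Return (i, j) such that (codebook_a[i] + codebook_b[j]) / 2 approximates target.

    Assumed formulation for illustration, not the reference encoder's code.
    """
    # Pre-selection: rank each codebook independently against the target
    # and keep only a few of the closest codevectors from each.
    dist_a = np.sum((codebook_a - target) ** 2, axis=1)
    dist_b = np.sum((codebook_b - target) ** 2, axis=1)
    cand_a = np.argsort(dist_a)[:n_candidates]
    cand_b = np.argsort(dist_b)[:n_candidates]

    # Main selection: exhaustive search, but only over the candidate pairs.
    pairs = (codebook_a[cand_a][:, None, :] + codebook_b[cand_b][None, :, :]) / 2.0
    err = np.sum((pairs - target) ** 2, axis=2)
    ia, ib = np.unravel_index(np.argmin(err), err.shape)
    return int(cand_a[ia]), int(cand_b[ib])
```

With codebooks of size M, the main selection here evaluates `n_candidates ** 2` pairs instead of M squared, which is where the saving comes from. A smaller candidate set is faster but may miss the best pair.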
Presents MPEG-2 AAC LC Profile encoder software for an Intel Pentium III processor. Modified discrete cosine transform (MDCT) and quantization processing are accelerated by the use of SIMD instructions. Psycho-acoustic analysis in the MDCT domain makes the use of FFTs unnecessary. Better sound quality is provided by greater efficiency in quantization processing and Huffman coding. All of this results in a high-quality, processor-efficient implementation of an MPEG-2 AAC encoder. Sound quality achieved at 96 kbps stereo is significantly better than that of MP3 at the same bitrate. The encoder works 13 times faster than real time for stereo encoding on an 800 MHz Pentium III processor.
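For readers unfamiliar with the MDCT mentioned above, the following is a minimal reference sketch of the transform and its inverse with a sine window, written as a direct matrix product. It is not the SIMD-accelerated implementation the abstract describes, and the helper names (`mdct_basis`, `sine_window`, `mdct`, `imdct`) are hypothetical.

```python
import numpy as np

def mdct_basis(n_coeffs):
    """Cosine basis of shape (n_coeffs, 2 * n_coeffs) for the textbook MDCT."""
    n = np.arange(2 * n_coeffs)
    k = np.arange(n_coeffs)
    return np.cos(np.pi / n_coeffs * np.outer(k + 0.5, n + 0.5 + n_coeffs / 2))

def sine_window(n_coeffs):
    """Sine window, which satisfies w[n]**2 + w[n + n_coeffs]**2 == 1."""
    return np.sin(np.pi * (np.arange(2 * n_coeffs) + 0.5) / (2 * n_coeffs))

def mdct(block, window, basis):
    """Map 2N windowed input samples to N coefficients."""
    return basis @ (window * block)

def imdct(coeffs, window, basis):
    """Map N coefficients to 2N windowed samples, ready for overlap-add."""
    n_coeffs = basis.shape[0]
    return window * (basis.T @ coeffs) * (2.0 / n_coeffs)
```

Each block of 2N samples yields only N coefficients. The time-domain aliasing this introduces cancels when consecutive inverse-transformed blocks, taken at a hop of N samples, are overlapped and added.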
Stimulated by the ever-increasing amount of available multimedia data, content-related techniques for the management of audio material have received much interest recently. This paper discusses the problem of robust identification of audio signals by matching them to a known reference. In order to perform well under real-world conditions, the matching process needs to rely on features which are robust with respect to common signal distortions. A family of suitable features with favorable properties is proposed and evaluated for its recognition performance. Applications of signal matching, including fingerprinting, are discussed.
ISBN: 0780370104 (print)
MPEG-1 is a successful international standard for video and audio coding and has been widely used in many fields, such as entertainment, education, digital libraries, and video on demand. Because the MPEG-1 stream has its own semantic structure, a multicast application may destroy its syntax, which results in decoding failure. To deal with this problem, a novel random access method is proposed in this paper to repair the corrupted stream structure. The solution consists of two main steps: first, the necessary information is extracted from the multicast stream; then a system header is constructed from that information and inserted at the beginning of the decoder buffer. In this way, the decoder can play the received multicast stream without any problem.
Hybrid in-band on-channel (IBOC) digital audio broadcasting (DAB) simultaneously with analog amplitude modulation (AM) has been proposed as a hybrid solution to digital audio broadcasting in the AM band. Adding digital transmission in the crowded AM band is a challenging proposition. To achieve FM-like audio quality, an audio coder rate of 32-64 kbit/s may be required. One of the currently proposed hybrid IBOC-AM systems is 30 kHz wide. Severe second-adjacent interference may occur in certain geographical areas. To cope with such harsh transmission conditions, we present a solution based on embedded/multi-descriptive audio coding with matched multistream transmission in separate frequency bands. When one frequency band is lost, the embedded system blends to a lower audio coder rate that still offers better quality than analog AM. The non-embedded system without multistream transmission fails catastrophically, causing a severe discontinuity in quality while blending directly to analog AM.
An improved pre-sorting and reordering algorithm is proposed for enhancing the quality of transmitting real-time MPEG-4 AAC (advanced audio coding) audio over EGPRS (Enhanced General Packet Radio Service). Compared with the standard Huffman codeword reordering (HCR) algorithm, lower implementation complexity and much better audio signal reconstruction quality may be achieved with the proposed scheme.
ISBN: 9628576623 (print)
A fast bit allocation algorithm for the MPEG audio encoder is proposed. The proposed algorithm employs the bit allocation information of the previous audio frame as a reference for allocating the restricted bits to each of the 32 subbands in the current audio frame such that the number of iterations can be significantly reduced. A process of bit reallocation is also suggested to ensure the generation of an identical MPEG bitstream produced by the standard bit allocation algorithm described in the MPEG audio standard. The result shows that the speed-up of the proposed algorithm is remarkable at different encoded bitrates.
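The idea of reusing the previous frame's allocation can be sketched under a deliberately simplified toy model. The model assumes that every allocation step costs one unit of the budget and improves the mask-to-noise ratio (MNR) of a subband by a fixed 6.02 dB; the MPEG audio standard instead uses per-subband tables, and the paper's exact reallocation rule is not given in the abstract. The names `mnr`, `greedy_allocate`, `warm_start_allocate` and the constant `GAIN_DB` are hypothetical.

```python
GAIN_DB = 6.02  # assumed MNR gain per allocation step (toy model only)

def mnr(smr_db, band, steps):
    """Mask-to-noise ratio of one subband under the toy model."""
    return GAIN_DB * steps - smr_db[band]

def greedy_allocate(smr_db, budget, max_steps=15):
    """Baseline: start from zero and always serve the subband with the lowest MNR."""
    steps = [0] * len(smr_db)
    iterations = 0
    while sum(steps) < budget:
        open_bands = [b for b in range(len(steps)) if steps[b] < max_steps]
        if not open_bands:
            break
        worst = min(open_bands, key=lambda b: mnr(smr_db, b, steps[b]))
        steps[worst] += 1
        iterations += 1
    return steps, iterations

def warm_start_allocate(smr_db, budget, previous, max_steps=15):
    """Start from the previous frame's allocation, then repair it step by step."""
    steps = list(previous)
    n_bands = len(steps)
    iterations = 0
    while True:
        donors = [b for b in range(n_bands) if steps[b] > 0]
        takers = [b for b in range(n_bands) if steps[b] < max_steps]
        # Donor: subband that would still have the highest MNR after losing a step.
        donor = max(donors, key=lambda b: mnr(smr_db, b, steps[b] - 1)) if donors else None
        # Taker: subband with the lowest MNR right now.
        taker = min(takers, key=lambda b: mnr(smr_db, b, steps[b])) if takers else None
        total = sum(steps)
        if total > budget and donor is not None:
            steps[donor] -= 1            # previous frame used more than allowed
        elif total < budget and taker is not None:
            steps[taker] += 1            # spare budget left over
        elif (donor is not None and taker is not None
              and mnr(smr_db, donor, steps[donor] - 1) > mnr(smr_db, taker, steps[taker])):
            steps[donor] -= 1            # reallocation: move one step to the
            steps[taker] += 1            # subband that needs it more
        else:
            break
        iterations += 1
    return steps, iterations
```

In this toy model, and when no two subbands tie, the repaired allocation matches the one the baseline loop produces, while the number of iterations depends on how far the previous frame's allocation is from the current optimum rather than on the size of the budget.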
We introduce a new scheme for simultaneous placement of a number of sources in auditory space. The scheme is based on an assumption about the relevance of localization cues in different critical bands. Given the sum signal of a number of sources, i.e. a monophonic signal, and a set of parameters (side information), the scheme is capable of generating a binaural signal by spatially placing the sources contained in the monophonic signal. Potential applications for the scheme are multi-talker desktop conferencing and audio coding. Preliminary experimental results suggest that the listener's ability to identify messages in a multi-talker environment improves significantly when a monophonic signal is enhanced with the proposed scheme.
We propose a method for obtaining an improved representation of transients in audio signals. The representation is based on a damped sinusoidal model. To improve the representation, transient locations are modified in such a way that a transient can start only at the beginning of a sinusoidal segment. The introduced modifications facilitate a reduction of the number of damped sinusoids needed to model a transient well and eliminate pre-echo artifacts. We verify with a listening test that the modifications do not result in a perceptual difference between the original and modified audio signals.