Scalability, a well-known concept in video coding, has only recently been introduced to audio coding. In this paper, a novel approach to a two-stage, wavelet-packet-based scalable audio coding system is presented. Two different structures have been designed and implemented, one in the time domain and its dual in the wavelet domain; these are compared with an MPEG-based scalable codec. Results at different bit rates are shown, and trade-offs, limitations, and future developments toward further reduced bit rates are discussed.
The paper describes a low-cost MPEG-2 audio decoder with a modified fast decoding algorithm. In the modified decoding scheme, the computation required by the bottleneck module is reduced to one quarter of the original. In addition, the main memory requirement is only half that of the standard synthesis subband filterbank. The decoder is designed for simplicity and a low-cost architecture, using intelligent data arrangement and memory configuration.
Degrouping is the key component in MPEG Layer II audio decoding. It consists mainly of division and modulo operations, which require a great deal of hardware and computation time. In this paper, we propose a novel degrouping algorithm with a low-complexity design concept. By using mode selection and iterative decomposition, only addition and subtraction are needed. Eliminating the multiplier, divider, and ROM table therefore saves considerable chip area while retaining high efficiency without any loss of accuracy.
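For context, the following is a minimal sketch of the conventional degrouping step that relies on division and modulo, next to a variant that uses only addition, subtraction, and comparison. It assumes the usual Layer II grouping, in which three sample indices with 3, 5, or 9 quantization levels are packed into one codeword as s0 + nlevels*s1 + nlevels^2*s2. The repeated-subtraction variant only illustrates that the divider can be avoided; it is not the mode-selection and iterative-decomposition algorithm of the paper, which the abstract does not spell out. The function names are hypothetical.

```python
def degroup_divmod(code, nlevels):
    """Conventional degrouping: split one codeword into three sample indices."""
    samples = []
    for _ in range(3):
        samples.append(code % nlevels)  # lowest base-nlevels digit
        code //= nlevels                # shift to the next digit
    return samples


def degroup_subtract(code, nlevels):
    """Same result using only addition, subtraction, and comparison."""
    # Build the weights nlevels and nlevels**2 by repeated addition.
    w1 = nlevels
    w2 = 0
    for _ in range(nlevels):
        w2 += nlevels
    # Peel off the most significant digit first, then the middle digit.
    s2 = 0
    while code >= w2:
        code -= w2
        s2 += 1
    s1 = 0
    while code >= w1:
        code -= w1
        s1 += 1
    return [code, s1, s2]  # the remainder is the least significant digit
```

Because each digit is smaller than nlevels, each loop runs at most nlevels - 1 times for a valid codeword, so the worst case for 9 levels is 16 subtractions per codeword.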
The sinusoidal model has proven useful for representation and modification of speech and audio. One drawback, however, is that a sinusoidal signal model is typically derived using a fixed frame size, which corresponds to a rigid signal segmentation. For nonstationary signals, the resolution limitations that result from this rigidity lead to reconstruction artifacts. It is shown in this paper that such artifacts can be significantly reduced by using a signal-adaptive segmentation derived by a dynamic program. An atomic interpretation of the sinusoidal model is given; this perspective suggests that algorithms for adaptive segmentation can be viewed as methods for adapting the time scales of the constituent atoms so as to improve the model by employing appropriate time-frequency tradeoffs.
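A generic sketch of signal-adaptive segmentation by dynamic programming is given here, assuming that the total cost is the sum of independent per-segment costs. The cost function shown (squared deviation from the segment mean) is a hypothetical stand-in for the sinusoidal modeling error of a segment, and the set of allowed segment lengths is likewise an assumption; neither is taken from the paper.

```python
def segment_cost(x, start, end):
    """Placeholder cost: squared deviation from the segment mean.

    Assumption: stands in for the sinusoidal modeling error of x[start:end].
    """
    seg = x[start:end]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def best_segmentation(x, allowed_lengths, cost=segment_cost):
    """Return (total_cost, boundaries) minimizing the summed segment cost."""
    n = len(x)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[i]: minimal cost of segmenting x[:i]
    prev = [-1] * (n + 1)   # prev[i]: start of the last segment ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for length in allowed_lengths:
            start = end - length
            if start < 0 or best[start] == inf:
                continue
            total = best[start] + cost(x, start, end)
            if total < best[end]:
                best[end] = total
                prev[end] = start
    if best[n] == inf:
        raise ValueError("signal length cannot be tiled with allowed lengths")
    # Walk back through the stored segment starts to recover the boundaries.
    bounds = [n]
    while bounds[-1] > 0:
        bounds.append(prev[bounds[-1]])
    return best[n], bounds[::-1]
```

Calling `best_segmentation(signal, (2, 4))` returns the minimal total cost together with the list of segment boundaries, starting at 0 and ending at `len(signal)`. Short segments are chosen where the signal changes quickly and long ones where it is steady, which is the time-frequency tradeoff the abstract describes.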
Remote news reporting via fixed telephones suffers from a lack of mobility and from limited sound quality. GSM high speed circuit switched data (HSCSD) offers bit rates of up to 80 kb/s, which is suitable for MPEG-coded audio news reports. Mean opinion scores (MOS) for MPEG-2 Layer 3 audio coded at various bit rates within the HSCSD range explore the trade-off between increased audio coding rate and the listener's perception of quality. Computer simulations demonstrate that a mean of 3 GSM channels (an aggregate of 28.8 kb/s) should be available for each HSCSD mobile transmission. Two methods of returning enhanced-quality audio (10 kHz) are considered, and the effects of transmission errors in urban and rural environments are simulated. Techniques to reduce the effect of errors are also evaluated. The MOS results compare the schemes with a standard GSM traffic channel.
New algorithms for the computation of orthogonal and biorthogonal modulated lapped transforms (MLTs) are presented. The new structures are obtained by combining the MLT window operators with stages from a previously introduced structure for the type-IV discrete cosine transform (DCT-IV). The net result is fewer multiplications and additions than previously reported algorithms. For the orthogonal MLT, in particular, the new structure requires the computation of a slightly modified DCT-IV and some extra additions, but no further multiplications; so it demonstrates that the multiplicative complexity of the orthogonal MLT is the same as that of the DCT-IV.
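As a point of reference, the orthogonal MLT can be computed directly from its definition in O(M^2) operations per block. The sketch below does exactly that and checks nothing about speed: it is not the fast DCT-IV-based structure proposed in the paper. The sine window is an assumed, commonly used choice, and all function names are hypothetical.

```python
import math


def mlt_basis(M):
    """Rows k = 0..M-1 of the orthogonal MLT basis, each of length 2M."""
    scale = math.sqrt(2.0 / M)
    basis = []
    for k in range(M):
        row = []
        for n in range(2 * M):
            window = math.sin((n + 0.5) * math.pi / (2 * M))  # sine window
            phase = (n + (M + 1) / 2.0) * (k + 0.5) * math.pi / M
            row.append(window * scale * math.cos(phase))
        basis.append(row)
    return basis


def mlt_analysis(block, basis):
    """Map one block of 2M samples to M coefficients."""
    return [sum(b * x for b, x in zip(row, block)) for row in basis]


def mlt_synthesis(coeffs, basis):
    """Map M coefficients back to 2M samples (to be overlap-added)."""
    length = len(basis[0])
    return [sum(c * row[n] for c, row in zip(coeffs, basis))
            for n in range(length)]


def mlt_roundtrip(signal, M):
    """Analyze with hop size M, then overlap-add the synthesized blocks."""
    basis = mlt_basis(M)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * M + 1, M):
        coeffs = mlt_analysis(signal[start:start + 2 * M], basis)
        for n, v in enumerate(mlt_synthesis(coeffs, basis)):
            out[start + n] += v
    return out
```

Samples covered by two overlapping blocks are reconstructed up to rounding error, while the first and last M samples are covered by only one block and remain aliased. A fast algorithm has to produce the same coefficients as `mlt_analysis` with fewer multiplications and additions.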
We present a numerically robust method for modeling audio signals that is based on an exponential data model. This model is a generalization of the classical sinusoidal model in the sense that it allows the amplitude of the sinusoids to evolve exponentially. We show that, using this model, so-called attacks can be represented very efficiently, and we propose an algorithm for finding the exponentials in a robust way. Moreover, we show that a proper segmentation of the input data into variable-length segments drastically improves the signal-to-noise ratio compared with a fixed-length analysis.
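One common way to write such an exponential model is shown below. The notation is an assumption made for illustration and is not taken from the paper: K is the number of components, and a_k, d_k, \omega_k, and \phi_k are the amplitude, damping factor, frequency, and phase of component k.

```latex
s[n] = \sum_{k=1}^{K} a_k \, e^{d_k n} \cos(\omega_k n + \phi_k),
\qquad n = 0, 1, \ldots, N-1
```

Setting d_k = 0 for every k recovers the classical sinusoidal model with constant amplitudes. A component with d_k > 0 grows within the segment, which is how a sharp attack can be captured with few terms, and d_k < 0 gives a decaying component.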
This paper reviews the key issues in hypermedia systems as an overture to the proposal of a new semiotic paradigm for hypermedia data and coding models. The hypertext concept permits users to interact with and manage data as high-level conceptual objects rather than as symbol streams. Current hypermedia systems can best be defined as an amalgamation of hypertext and multimedia. While the hypertext data model enables this goal, that is not true for the data models of other media forms. A new semiotic paradigm that addresses these deficiencies and supports object-oriented interaction with compressed multimedia streams is proposed. This paper initially presents an overview of the hypertext data model, contrasting it with existing multimedia data and coding models. The framework for the new paradigm is then presented in a brief review of cognitive, psychological, and semiotic principles. This analysis culminates in the proposal of semiotically based data models and representations predisposed to the hypermedia paradigm.
The aim of this work is high-quality audio coding at low bit rates. It is shown how pyramid vector coding (PVC) can conveniently replace the classical Huffman coding technique in audio compression systems, while also providing an advantage in the bit allocation procedure. Compression performance can be further improved by imposing an upper limit on the vector components.
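To make the bit allocation point concrete, the sketch below counts the codewords of a pyramid codebook with the standard counting recurrence for integer vectors of dimension L whose absolute values sum to K. Every vector on a given pyramid is indexed with the same number of bits, so the rate is known before coding, unlike with variable-length Huffman codes. This is a generic illustration under that assumption: it does not model the upper limit on vector components mentioned in the abstract, and the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def pyramid_count(L, K):
    """Number of integer vectors of dimension L whose absolute values sum to K."""
    if K == 0:
        return 1  # only the all-zero vector
    if L == 0:
        return 0  # no components left to carry a positive sum
    return (pyramid_count(L - 1, K)
            + pyramid_count(L - 1, K - 1)
            + pyramid_count(L, K - 1))


def bits_per_vector(L, K):
    """Fixed index length in bits needed to address every pyramid point."""
    # (count - 1).bit_length() equals ceil(log2(count)) for count >= 1.
    return (pyramid_count(L, K) - 1).bit_length()
```

For a given dimension L, a bit allocation procedure can pick the largest K whose `bits_per_vector(L, K)` fits the bit budget of a band, with no need to run the entropy coder first.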
Considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components and subband signal decompositions. The discussion concentrates on the architectures and applications of those techniques that use psychoacoustic models to efficiently exploit the masking characteristics of the human receiver. Several algorithms that have become international and/or commercial standards are also presented, including the ISO/MPEG family and the Dolby AC-3 algorithms. The paper concludes with a brief discussion of future research directions.