ISBN: 9781612848570 (print)
Line Spectral Frequencies (LSF) have been widely used to represent Linear Prediction Coefficients (LPC) due to their excellent quantization characteristics. Among others, MPEG4-compliant audio decoders include a module for converting the LSF to LPC coefficients. This paper presents a new simplified computational technique for carrying out the LSF-to-LPC conversion process. The proposed technique cuts down on several arithmetic operations by establishing direct links between the LPC coefficients and the two polynomials representing the two extreme glottis conditions. The technique is a good candidate for hardware implementation on mobile devices designed for low bit-rate wireless communication. Results of two testing scenarios are reported to show the computational distortion of the hardware conversion module relative to that resulting from quantizing the LSF vectors.
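For orientation, the conventional LSF-to-LPC conversion that such a decoder module performs can be sketched as follows. This is the textbook reconstruction through the symmetric and antisymmetric polynomials P(z) and Q(z), not the simplified technique proposed in the paper. The function name `lsf_to_lpc`, the restriction to even orders, and the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions made for illustration.

```python
import numpy as np

def lsf_to_lpc(lsf):
    """Convert sorted LSFs (radians, even order) to LPC coefficients.

    Hypothetical helper: returns [1, a1, ..., ap] for
    A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    """
    lsf = np.asarray(lsf, dtype=float)
    if len(lsf) % 2 != 0:
        raise ValueError("this sketch handles even prediction orders only")

    # P(z) carries the fixed root at z = -1, Q(z) the fixed root at z = +1.
    p_poly = np.array([1.0, 1.0])
    q_poly = np.array([1.0, -1.0])

    # The 1st, 3rd, ... LSFs are roots of P(z); the 2nd, 4th, ... of Q(z).
    for w in lsf[0::2]:
        p_poly = np.convolve(p_poly, [1.0, -2.0 * np.cos(w), 1.0])
    for w in lsf[1::2]:
        q_poly = np.convolve(q_poly, [1.0, -2.0 * np.cos(w), 1.0])

    # A(z) = (P(z) + Q(z)) / 2; the z^-(p+1) term cancels, so drop it.
    a = 0.5 * (p_poly + q_poly)
    return a[:-1]
```

Each LSF costs one three-tap convolution here, which is the kind of arithmetic a simplified conversion would aim to reduce.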
ISBN: 9781467303231 (print)
In this paper, we consider the problem of finding sparse representations of audio signals for coding purposes. In doing so, it is of utmost importance that when only a subset of the components present in an audio signal is extracted, they are the perceptually most important ones. To this end, we propose a new iterative algorithm based on two principles: 1) a reweighted 1-norm based measure of sparsity; and 2) a reweighted 2-norm based measure of perceptual distortion. Using these measures, the considered problem is posed as a constrained convex optimization problem that can be solved optimally using standard software. A prominent feature of the new method is that it solves a problem that is closely related to the objective of coding, namely rate-distortion optimization. In computer simulations, we demonstrate the properties of the algorithm and its application to real audio signals.
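The idea of combining a reweighted 1-norm with a perceptually weighted 2-norm can be illustrated with a small sketch. It is not the constrained formulation or the solver used in the paper: it minimizes a penalized objective, 0.5 * ||p * (y - D x)||^2 + lam * sum(w_i |x_i|), by proximal gradient steps, and it keeps the perceptual weights `p` fixed instead of reweighting them. All names and parameter values (`reweighted_l1`, `lam`, `eps`, the iteration counts) are assumptions for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of a weighted 1-norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def reweighted_l1(y, D, p, lam=0.5, eps=0.1, n_outer=5, n_inner=200):
    """Hypothetical sketch: sparse x with D @ x close to y in a weighted sense.

    y : signal frame, D : dictionary (columns are atoms),
    p : per-sample perceptual weights (assumed given and held fixed).
    """
    y = np.asarray(y, dtype=float)
    D = np.asarray(D, dtype=float)
    p = np.asarray(p, dtype=float)

    PD = p[:, None] * D                         # perceptually weighted dictionary
    Py = p * y                                  # perceptually weighted signal
    step = 1.0 / np.linalg.norm(PD, 2) ** 2     # 1 / Lipschitz constant

    w = np.ones(D.shape[1])                     # sparsity weights, updated per outer pass
    x = np.zeros(D.shape[1])
    for _ in range(n_outer):
        for _ in range(n_inner):
            grad = PD.T @ (PD @ x - Py)
            x = soft_threshold(x - step * grad, step * lam * w)
        # Small coefficients get large weights and are pushed harder toward zero.
        w = 1.0 / (np.abs(x) + eps)
    return x
```

The reweighting step is what moves the 1-norm penalty closer to a count of nonzero components: coefficients that survive are penalized less on the next pass, while near-zero ones are penalized more.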
A sound-coding strategy for users of cochlear implants, named enhanced-envelope-encoded tone (eTone), was developed to improve coding of fundamental frequency (F0) in the temporal envelopes of the electrical stimulus signals. It is based on the advanced combinational encoder (ACE) strategy and includes additional processing that explicitly applies F0 modulation to channel envelope signals that contain harmonics of prominent complex tones. Channels that contain only inharmonic signals retain envelopes normally produced by ACE. The strategy incorporates an F0 estimator to determine the frequency of modulation and a harmonic probability estimator to control the amount of modulation enhancement applied to each channel. The F0 estimator was designed to provide an accurate estimate of F0 with minimal processing lag and robustness to the effects of competing noise. Error rates for the F0 estimator and accuracy of the harmonic probability estimator were compared with previous approaches and outcomes demonstrated that the strategy operates effectively across a range of signals and conditions that are relevant to cochlear implant users. (C) 2011 Acoustical Society of America. [DOI: 10.1121/1.3573988]
ISBN: 9781457705397 (print)
This paper presents high-quality monaural superwideband extensions to G.711.1 and G.722, recently standardized as Recommendations ITU-T G.711.1 Annex D and G.722 Annex B. The superwideband (50-14000 Hz) functionality is achieved using an embedded scalable structure that adds extension layers on top of the wideband core codecs. The bit rates are extended to 96/112/128 and 64/80/96 kbit/s for G.711.1 and G.722, respectively. The main technologies include lower and higher band (0-4 kHz and 4-8 kHz) enhancements, 8-14 kHz bandwidth extension, and transform coding based on algebraic vector quantization. The codecs' performance is illustrated with listening test results extracted from formal ITU-T Characterization tests.
ISBN: 9781457705397 (print)
This paper introduces a novel and highly efficient realization of a spherical vector quantizer (SVQ), the "Gosset Low Complexity Vector Quantizer" (GLCVQ). The GLCVQ codebook is composed of vectors that are located on spherical shells of the Gosset lattice E8. A high encoding efficiency is achieved by representing the spherical vector codebook as aggregated permutation codes. Compared to previous algorithms, the computational complexity and memory consumption are further reduced by exploiting the properties of so-called classleader root vectors and by a novel approach for the codevector-to-index mapping. The GLCVQ concept can be generalized to vector dimensions that are multiples of eight. In particular, GLCVQ for 16-dimensional vectors is used in Amd. 6 to ITU-T Rec. G.729.1.
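As background on the lattice itself, the sketch below finds the nearest point of the Gosset lattice E8 to an arbitrary 8-dimensional vector, using the well-known decomposition of E8 into D8 and its half-shifted coset. It does not implement GLCVQ, its spherical shells, permutation codes, or index mapping; the function names `closest_d8` and `closest_e8` are hypothetical.

```python
import numpy as np

def closest_d8(x):
    """Nearest point of D8 (integer vectors with an even coordinate sum)."""
    f = np.rint(x)
    if int(f.sum()) % 2 != 0:
        # Odd sum: re-round the coordinate with the largest rounding error
        # in the opposite direction, which is the cheapest way to fix parity.
        err = x - f
        k = int(np.argmax(np.abs(err)))
        f[k] += 1.0 if err[k] >= 0 else -1.0
    return f

def closest_e8(x):
    """Nearest point of E8 = D8 united with (D8 + 1/2)."""
    x = np.asarray(x, dtype=float)
    if x.shape != (8,):
        raise ValueError("expected an 8-dimensional vector")
    c0 = closest_d8(x)                  # candidate from the integer coset
    c1 = closest_d8(x - 0.5) + 0.5      # candidate from the half-integer coset
    d0 = np.sum((x - c0) ** 2)
    d1 = np.sum((x - c1) ** 2)
    return c0 if d0 <= d1 else c1
```

A spherical quantizer restricts the codebook to lattice points of selected norms, so a search like this is only one ingredient; the efficiency gains described above come from how the shell vectors are enumerated and indexed.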
ISBN: 9780819489326 (print)
This paper describes applying the Set Partitioning in Hierarchical Trees (SPIHT) algorithm, together with a devised model, to compress audio signals. It allows the amount of compression to be chosen based on bit rate requirements. A threshold-setting model, based on the energy and frequency patterns of the signal, helps the SPIHT encoder set efficient threshold values according to the nature of the audio. The scheme is thus adaptable to the nature of the audio as well as to the bit rate requirement. The implementation can be used for storage as well as progressive transmission.
ISBN: 9781457705380 (print)
Context-based entropy coding has the potential to provide higher gain than memoryless entropy coding. However, serious difficulties arise regarding its practical implementation in real-time applications due to its very high memory requirements. This paper presents an efficient method for designing context-adaptive entropy coding while fulfilling low memory requirements. From a study of coding gain scalability as a function of context size, new context design and validation procedures are derived. Further, supervised clustering and mapping optimization are introduced to model the context efficiently. The resulting context modelling, combined with an arithmetic coder, was successfully implemented in a transform-based audio coder for real-time processing. It shows significant improvement over the entropy coding used in MPEG-4 AAC.
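The potential gain of context-based over memoryless entropy coding can be made concrete by comparing the empirical zeroth-order entropy of a symbol sequence with its entropy conditioned on a context. The sketch below uses the previous symbol as the context, which is an assumption for illustration and far simpler than the clustered contexts designed in the paper; the function names are hypothetical.

```python
from collections import Counter
import math

def entropy(symbols):
    """Empirical memoryless (zeroth-order) entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def conditional_entropy(symbols):
    """Empirical entropy of each symbol given the previous one, in bits per symbol."""
    pairs = Counter(zip(symbols[:-1], symbols[1:]))   # (context, symbol) counts
    contexts = Counter(symbols[:-1])                  # context counts
    n = len(symbols) - 1
    return -sum(c / n * math.log2(c / contexts[ctx])
                for (ctx, _sym), c in pairs.items())
```

The cost of the context model is the table of per-context statistics, which grows with the number of contexts. With an alphabet of size K and a context of m previous symbols, a naive table needs K^m distributions, which is the memory problem the abstract refers to.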
This paper focuses on the use of B-format signals in sound source localization using two different methods. Whereas the first method is based on the principle of B-format signals, the second is based on energetic analysis of B-format signals. The average square difference function method is also simulated and tested for comparison with these two methods.
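A minimal sketch of an energetic (intensity-based) azimuth estimate from horizontal B-format signals is given below. It assumes the traditional encoding convention W = s/sqrt(2), X = s cos(azimuth), Y = s sin(azimuth) for a single plane-wave source, and it is not claimed to reproduce either method evaluated in the paper; `encode_bformat` and `bformat_azimuth` are hypothetical names.

```python
import numpy as np

def encode_bformat(s, azimuth):
    """Encode a mono signal as horizontal B-format (assumed W = s / sqrt(2))."""
    s = np.asarray(s, dtype=float)
    return s / np.sqrt(2.0), s * np.cos(azimuth), s * np.sin(azimuth)

def bformat_azimuth(w, x, y):
    """Estimate source azimuth (radians) from time-averaged W*X and W*Y products."""
    ix = np.mean(np.asarray(w) * np.asarray(x))   # proportional to cos(azimuth)
    iy = np.mean(np.asarray(w) * np.asarray(y))   # proportional to sin(azimuth)
    return np.arctan2(iy, ix)
```

Because both averaged products share the same positive scale factor (the signal power divided by sqrt(2)), the scale cancels inside `arctan2` and the estimate does not depend on the source level.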
This paper presents an experimental evaluation of oversampled, modulated filter banks for joint subband audio processing and coding applications. Joint subband processing and coding may be useful in some wireless audio devices such as advanced wireless digital hearing aids. We examine the use of oversampled GDFT and cosine modulated filter banks and propose using single sideband (SSB) real-valued filter banks as a compromise which is ideal for this application. The SSB filter bank provides real-valued signals for audio coding which are free from any aliasing cancellation constraints and hence are also suitable for audio processing such as subband gain adjustment. We support this conclusion with an experimental analysis of various filter bank designs for subband gain adjustment and subband audio coding.
In perceptual audio coding, it is necessary to deal with the transient signals in a frame, and reasonable transient detection is a prerequisite for this treatment. Based on the characteristics of transient signals in the time domain and the frequency domain, this paper proposes a time-frequency transient detection method that uses a flatness measure. Compared with existing transient detection methods, the flatness-based technique not only reduces missed and false detections, but also hardly produces redundant detections in low-energy transient segments. At the same time, the complexity of the algorithm changes adaptively with how pronounced the transient signal is. Simulation results show that this method achieves high detection accuracy and is simple to implement.
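A flatness measure is commonly defined as the ratio of the geometric mean to the arithmetic mean of a set of non-negative values: it is close to 1 when the values are even and close to 0 when one value dominates. The sketch below applies it to the energies of sub-blocks within a frame as a simple time-domain transient indicator. It covers only this time-domain half, it is not the detector proposed in the paper, and `n_sub`, `threshold`, and the function names are assumptions for illustration.

```python
import numpy as np

def flatness(values, eps=1e-12):
    """Geometric mean divided by arithmetic mean of non-negative values."""
    v = np.asarray(values, dtype=float) + eps   # eps avoids log(0)
    return float(np.exp(np.mean(np.log(v))) / np.mean(v))

def is_transient(frame, n_sub=8, threshold=0.5):
    """Flag a frame whose energy is concentrated in few sub-blocks.

    n_sub and threshold are hypothetical tuning values, not taken from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    usable = len(frame) // n_sub * n_sub        # drop samples that do not fill a block
    blocks = frame[:usable].reshape(n_sub, -1)
    energies = np.sum(blocks ** 2, axis=1)
    return bool(flatness(energies) < threshold)
```

The same `flatness` function can be applied to a power spectrum to obtain a spectral flatness value, which is one way a frequency-domain check could complement the time-domain one.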