Audio coders generally operate on a per-frame basis to tackle signal non-stationarity. The choice of frame length is crucial to system performance since it trades off coding efficiency against pre-echo avoidance. A new algorithm based on multiple-length adaptive windowing is proposed as an enhanced front end for audio coders. Overhead reductions in excess of 40% are reported.
We present an overview of the audioBIFS system, part of the Binary Format for Scene Description (BIFS) tool in the MPEG-4 International Standard. audioBIFS is the tool that integrates the synthetic and natural sound coding functions in MPEG-4. It allows the flexible construction of soundtracks and sound scenes using compressed sound, sound synthesis, streaming audio, interactive and terminal-dependent presentation, three-dimensional (3-D) spatialization, environmental auralization, and dynamic download of custom signal-processing effects algorithms. MPEG-4 sound scenes are based on a model that is a superset of the model in VRML 2.0, and we describe how MPEG-4 is built upon VRML and the new capabilities provided by MPEG-4. We discuss the use of the MPEG-4 Structured Audio Orchestra Language (SAOL) for writing downloadable effects, present an example sound scene built with audioBIFS, and describe the current state of implementations of the standard.
ISBN: 0819425915 (print)
In this paper, an in-depth investigation and comparison of the performance obtainable with short wavelet filters for low bit rate perceptual audio coding is presented. This a priori knowledge of short wavelet filter performance opens new horizons for their use, especially when combined with the Moving Picture Experts Group (MPEG-4) requirements for segmental signal-to-noise ratio (SSNR) scalable audio coding.
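For readers unfamiliar with the metric, segmental SNR is commonly computed as the mean of per-frame SNR values in dB. The following is a minimal sketch of that common definition, not the paper's evaluation code: the function name, the 256-sample frame length, and the clamping range of -10 to 35 dB are all assumptions made for illustration.

```python
import math

def segmental_snr(reference, decoded, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between a reference and a decoded signal.

    frame_len, floor_db and ceil_db are assumed defaults, not values from the paper.
    """
    n = min(len(reference), len(decoded))
    frame_snrs = []
    # Only full, non-overlapping frames are evaluated.
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(x * x for x in ref)
        noise = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal == 0.0:
            continue  # skip silent frames, which have no defined SNR
        if noise == 0.0:
            snr = ceil_db  # perfect reconstruction: clamp to the ceiling
        else:
            snr = 10.0 * math.log10(signal / noise)
        frame_snrs.append(max(floor_db, min(ceil_db, snr)))
    if not frame_snrs:
        raise ValueError("no non-silent full frames to evaluate")
    return sum(frame_snrs) / len(frame_snrs)
```

Clamping each frame keeps a few very quiet or very clean frames from dominating the average, which is the usual motivation for preferring SSNR over a single global SNR.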
ISBN: 0818679190 (print)
Most LPC-based audio coders employ simplistic noise-shaping operations to perform psychoacoustic control of quantization noise. In this paper, we report on new approaches to exploiting perceptual masking in the design of adaptive quantization of LPC excitation parameters. Due to its localized spectral sensitivity, sinusoidal excitation representation is preferred to spectrally flat signals for use in excitation modeling. Simulation results indicate that the proposed multisinusoid excited coder can deliver high quality audio reproduction at the rate of 72 kb/s.
Wavelet filtering is a promising tool for use in audio signal compression. What is still lacking, however, is a thorough understanding of wavelet filter performance relative to the more sophisticated examples of conventional filters. As we seek to apply wavelet filters to low bit rate audio coding, attention must be paid not only to the bit-rate/signal-quality trade-off but also to complexity and processing delay. The results presented in this paper attempt to clarify these issues. To assess the coding gain of wavelet and conventional filters, various codec models have been designed and implemented based on a wavelet packet algorithm, an auditory perception model, and entropy (noiseless) coding. The wavelet-packet-based coding approach is compared with the MPEG-audio international standard in terms of objective and subjective measurements and is shown to be superior to MPEG-audio layer I and competitive with layer II.
Network signal processing aspects dominate in speech and audio coding applications such as Internet telephony or packet radio networks. We demonstrate that our approach to speech coding in a perceptual domain provides an implicit forward error concealment mechanism to handle random erasures of the channel. To this end, the individual acoustic subchannels of our auditory model are grouped into different transport subchannels or packets. Due to the strongly overlapping, redundant filterbank structure of the model, reconstruction of speech without audible degradation becomes possible even if a significant percentage of channels is erased (e.g., up to 40% in a 50-channel auditory model for narrowband speech). We discuss this result both from a hearing-physiology and a frame-theoretic perspective.
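The frame-theoretic view can be illustrated with a toy example: when a signal is analyzed with more vectors than its dimension, some coefficients can be lost and the signal is still recoverable by least squares. The sketch below is an assumption-laden illustration, not the paper's auditory model; it uses a hypothetical three-vector frame in the plane rather than a 50-channel filterbank, and all function names are invented here.

```python
import math

def analyze(x, frame):
    """Inner products of the 2-D vector x with each frame vector."""
    return [x[0] * f[0] + x[1] * f[1] for f in frame]

def reconstruct(coeffs, frame, erased=()):
    """Least-squares reconstruction of a 2-D vector from the surviving coefficients."""
    kept = [(c, f) for i, (c, f) in enumerate(zip(coeffs, frame)) if i not in erased]
    # Normal equations (F^T F) x = F^T c for the surviving frame vectors.
    a = sum(f[0] * f[0] for _, f in kept)
    b = sum(f[0] * f[1] for _, f in kept)
    d = sum(f[1] * f[1] for _, f in kept)
    r0 = sum(c * f[0] for c, f in kept)
    r1 = sum(c * f[1] for c, f in kept)
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("surviving vectors no longer span the plane")
    # Solve the 2x2 system by Cramer's rule.
    return ((d * r0 - b * r1) / det, (a * r1 - b * r0) / det)

# Toy redundant frame (an assumption): three unit vectors spaced 120 degrees apart.
frame = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)) for k in range(3)]
```

With this frame, `reconstruct(analyze(x, frame), frame, erased={1})` recovers `x` up to rounding for any single erased coefficient, because any two of the three vectors still span the plane.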
We describe six algorithms for bit-allocation in audio coding. Each algorithm stems from the minimization of a different perceptually-motivated objective function. Three of these objective functions are extensions of existing ones, and three are new. Closed-form bit-allocation equations result in five cases, and an iterative approach is required in the sixth.
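As background on what a closed-form bit-allocation equation looks like, the sketch below implements the textbook rule that gives each band the average bit budget plus half the base-2 logarithm of its signal-to-mask ratio relative to the geometric mean over bands. This is not one of the six algorithms in the paper; the function name and the use of a per-band masking threshold as the perceptual weight are assumptions for illustration.

```python
import math

def allocate_bits(band_energy, mask_threshold, avg_bits):
    """Textbook closed-form allocation: b_k = avg + 0.5 * log2(smr_k / geometric_mean(smr)).

    band_energy and mask_threshold are per-band positive values (assumed inputs).
    The result may be fractional or negative; no rounding or clipping is applied.
    """
    smr = [e / m for e, m in zip(band_energy, mask_threshold)]
    # Mean of log2(smr) is the log2 of the geometric mean.
    log_gm = sum(math.log2(s) for s in smr) / len(smr)
    return [avg_bits + 0.5 * (math.log2(s) - log_gm) for s in smr]
```

The allocations always sum to `avg_bits` times the number of bands. Enforcing non-negative integer allocations breaks the closed form, which is one common reason practical allocators fall back on iteration.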
Historically, the choice of the optimum filterbank has been the subject of much research and discussion in the development of perceptual audio coders. Desirable properties of a good filterbank include both a good extraction of the signal's redundancy and effective utilization of that redundancy while maintaining control over perceptual demands. Often, there is a conflict between the use of perceptual constraints and the redundancy extraction, in that a filterbank with good resolution in both time and frequency is needed. Recently, a method for performing temporal noise shaping (TNS) of the error signal of a perceptual audio coder has been proposed, providing control over both the time and frequency structure of the coding noise. This paper focuses on the core part of the scheme, forming a continuously adaptive filterbank, and discusses its theoretical background, properties and limitations.
ISBN: 0780340736 (print)
A combined speech and audio coder is proposed. The coder structure resembles a low-delay CELP coder; however, the excitation gain is adapted non-linearly in a sample-by-sample fashion by a trained neural network, and the spectral parameters are derived from backward non-linear prediction based on a second-order Volterra filter. A perceptual weighting filter derived from psychoacoustic analysis in the spectral domain is used to shape the coding noise. The proposed non-linear adaptation schemes significantly improve the effectiveness of the analysis-by-synthesis model for coding audio signals. Simulation results show that transparent coding of wideband (7 kHz) speech and audio at 24 kbps is achieved.
Advanced audio coding (AAC), part of ISO/MPEG-2, was issued as an international standard in April 1997. It supports single- or multiple-channel audio programs and delivers excellent audio quality at or below 64 kbps/channel by exploiting the compression capabilities of a high-resolution filterbank, backward-adaptive prediction, joint channel coding, nonlinear quantizers, and noiseless (Huffman) coding. This paper describes the flexible Huffman coding algorithm used in AAC and discusses the compression provided by this component of the standard.
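As a reminder of the underlying technique, the sketch below builds a generic Huffman prefix code from symbol frequencies. It does not reproduce AAC's codebooks or its codebook selection; the function name and the input format (a dictionary of symbol counts) are assumptions for illustration.

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Return {symbol: bitstring} for a dict mapping symbols to positive frequencies."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single symbol still needs one bit so it can be transmitted.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # unique counter so the heap never compares the dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes with 0 and 1.
        f0, _, codes0 = heapq.heappop(heap)
        f1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]
```

Frequent symbols receive short codewords and rare symbols long ones, which is where the compression of quantized spectral values comes from.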