The aim of artificial bandwidth extension (BWE) of speech signals is to recover wideband speech from bandlimited speech. As the BWE algorithm is supposed to operate without additional side information on the original wideband speech, it has to exploit mutual dependencies between the available and missing frequency bands of the speech signal. In this paper, BWE is examined from an information-theoretic perspective. After defining a performance measure and introducing a few assumptions on a generalized BWE algorithm, a general relationship between mutual information and the maximum achievable estimation performance is formulated, which yields an upper bound on the performance of BWE algorithms. Finally, measurements for a representative BWE scenario are presented.
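As a hedged illustration of how mutual information can bound estimation performance, consider the textbook case of a scalar Gaussian quantity estimated under a mean-square-error criterion: with I nats of mutual information between observation and target, no estimator can achieve an error below the target variance times exp(-2I). This is not the paper's performance measure or derivation; the function names and the correlation value are assumptions chosen for the sketch.

```python
import math


def gaussian_mutual_information(rho):
    """Mutual information in nats between two jointly Gaussian scalars
    with correlation coefficient rho (requires |rho| < 1)."""
    return -0.5 * math.log(1.0 - rho * rho)


def mse_lower_bound(variance, mutual_info_nats):
    """Lower bound on the mean-square error of any estimator of a Gaussian
    variable with the given variance, when the observation carries
    mutual_info_nats of information about it."""
    return variance * math.exp(-2.0 * mutual_info_nats)


# Hypothetical example: correlation 0.8 between the available-band
# feature and the missing-band parameter, unit target variance.
rho = 0.8
info = gaussian_mutual_information(rho)
bound = mse_lower_bound(1.0, info)
print(f"I = {info:.4f} nats, MSE bound = {bound:.4f}")
```

In the jointly Gaussian case the bound is met by the linear MMSE estimator, so more mutual information between the bands translates directly into a lower achievable error.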
In this contribution, a new software environment for an embedded system is presented that is part of a laboratory for teaching students digital audio signal processing. This environment enables even students with only a limited background in embedded programming to achieve remarkable implementations of real-time projects, including runtime user interaction. Hardware-related and algorithm-related programming issues are separated for audio processing and interactive messaging. Because the hardware-related part is provided along with the software environment, the students can focus completely on algorithmic aspects. A well-defined interface bridges the two programming components. Examples are presented to demonstrate the flexibility and efficiency of the software platform. The underlying projects were defined, implemented, and finally demonstrated by students in an interactive real-time demonstration.
ISBN: 9781617388767 (print)
Methods for measuring the impulse response of a linear transmission system, and system identification algorithms in general, must be robust against noise in the measured system response. To handle the noise, it is of great advantage to know the instantaneous signal-to-noise ratio (SNR), especially in situations with changing noise conditions. In this paper we present a new approach for estimating the SNR during an impulse response measurement by means of sliding window correlation (SWiC), a technique introduced here. The performance of the proposed method is evaluated by means of simulation results.
In analysis-by-synthesis coders, the problem of approximating the original signal by the synthesized signal is solved over a limited time interval only. In this contribution, a systematic investigation of possible improvements by using extended or even unlimited intervals is presented.
Digital speech transmission systems use source coding to reduce the bit rate and channel coding to correct transmission errors. Furthermore, in periods of very poor channel quality, error concealment of residual bit errors becomes necessary because channel decoding fails. However, if the channel is clear, channel coding would not be required at all, and the speech quality could be improved by allowing a higher bit rate for source encoding. Usually a compromise is made between speech quality in the case of a clear channel and error robustness in the case of poor channel quality. This paper addresses the problem of jointly optimizing error concealment and source/channel coding. Under the premise of a minimum mean square error criterion for signal reconstruction, it turns out that error concealment instead of error correction may be the best choice if source coding, through a less aggressive bit rate reduction, leaves sufficient residual parameter correlations.
In this paper we propose an algorithm for noise reduction in audio signals. In contrast to several previous approaches, we do not try to remove the noise completely; instead, our goal is to preserve a pre-defined amount of the original noise in the processed signal. This is accomplished by exploiting the masking properties of the human auditory system. The speech and noise distortions are considered separately. The spectral weighting rule, which is adapted using only estimates of the masking threshold and the noise power spectral density, has been designed to guarantee complete masking of distortions of the residual noise. Simulation results confirm that no audible artifacts are left in the processed signal, while speech distortions are comparable to those caused by conventional noise reduction techniques.
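The abstract does not state the weighting rule itself. The following minimal sketch shows one rule consistent with the description, under these assumptions: the desired residual noise is the original noise scaled by a floor factor, the actual residual noise is the original noise scaled by the gain, and the difference between the two must stay at or below the masking threshold in power. The names `masked_noise_gain`, `noise_floor`, and all numeric values are hypothetical.

```python
import math


def masked_noise_gain(masking_threshold, noise_psd, noise_floor):
    """Per-bin spectral gain (illustrative assumption, not the paper's rule).

    The deviation of the residual noise from the desired residual noise
    has power (gain - noise_floor)**2 * noise_psd. Requiring this to be
    at most masking_threshold, and capping the gain at 1, gives the
    expression below.
    """
    if noise_psd <= 0.0:
        return 1.0  # no noise estimated in this bin: leave it untouched
    return min(1.0, noise_floor + math.sqrt(masking_threshold / noise_psd))


# Hypothetical per-bin estimates (linear power units).
thresholds = [0.01, 0.25, 4.0]
noise_psds = [1.0, 1.0, 1.0]
floor = 0.1  # keep about -20 dB of the original noise

gains = [masked_noise_gain(t, n, floor) for t, n in zip(thresholds, noise_psds)]
print(gains)
```

Bins where the masking threshold is high relative to the noise need little or no attenuation, which is how such a rule can limit speech distortion.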
A novel design for a two-channel IIR quadrature-mirror filter (QMF) bank with near-perfect reconstruction (NPR) is presented. The analysis filter bank is given by an efficient polyphase network (PPN) implementation based on allpass filters. The resulting phase distortions are almost completely compensated by stable allpass filters, designed via analytical closed-form expressions. In a first design, the remaining aliasing, amplitude, and phase distortions become arbitrarily small depending on the tolerable system delay and algorithmic complexity, respectively. In a second design, aliasing and amplitude distortions are completely canceled, and phase distortions are minimized at the expense of an additional signal delay. The proposed QMF banks have a lower algorithmic complexity than comparable designs.
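To make the allpass-based polyphase structure concrete, the sketch below evaluates the standard two-branch form H0(z) = [A0(z^2) + z^-1 A1(z^2)] / 2 and H1(z) = [A0(z^2) - z^-1 A1(z^2)] / 2 with first-order allpass sections. It only illustrates the general analysis structure and its power-complementary property; it is not the paper's design, and the coefficients `a0` and `a1` are arbitrary assumed values.

```python
import cmath
import math


def allpass1(a, z_inv):
    """First-order allpass A(z) = (a + z^-1) / (1 + a z^-1) at z^-1 = z_inv."""
    return (a + z_inv) / (1.0 + a * z_inv)


def analysis_responses(a0, a1, omega):
    """Lowpass and highpass frequency responses of the two-channel
    allpass polyphase analysis bank at normalized frequency omega."""
    z_inv = cmath.exp(-1j * omega)
    z_inv2 = z_inv * z_inv                   # allpass sections operate on z^2
    branch0 = allpass1(a0, z_inv2)
    branch1 = z_inv * allpass1(a1, z_inv2)   # second branch has a unit delay
    return 0.5 * (branch0 + branch1), 0.5 * (branch0 - branch1)


a0, a1 = 0.15, 0.6  # hypothetical real coefficients with magnitude below 1
for k in range(5):
    omega = k * math.pi / 4
    h0, h1 = analysis_responses(a0, a1, omega)
    print(f"omega={omega:.3f}  |H0|={abs(h0):.4f}  |H1|={abs(h1):.4f}")
```

Because both branches are allpass, the squared magnitudes of the two channels sum to one at every frequency, so amplitude behavior is fixed by the structure and the remaining design effort goes into phase, as the abstract indicates.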
For acoustical background noise reduction, a computationally efficient joint MAP estimator with a super-Gaussian speech model is presented. Compared to a recently introduced MAP estimator, the new joint MAP estimator allows an optimal adjustment of the underlying statistical model to the real PDF of the speech spectral amplitude. Owing to the more accurate statistical model, the computationally efficient estimator outperforms both the Ephraim-Malah estimator and the recently proposed MAP estimator in a single-microphone noise reduction framework.
In digital mobile radio systems, the speech quality can be degraded severely if the channel decoder produces residual bit errors, e.g., due to heavy burst errors on the channel. A novel combined speech extrapolation and error detection algorithm is presented which can improve the speech quality significantly in the case of residual bit errors. This algorithm, which is part of the speech decoding process, uses a posteriori probabilities of speech parameters. With the extracted a posteriori probabilities, optimum estimators adapted to human perception can be applied and soft decision information can be exploited fully. In terms of perceptual performance, the MS (mean-square) estimator is superior to the MAP (maximum a posteriori) estimator. The method was tested under realistic conditions using an 8-kbit/s CELP (code excited linear prediction) codec. A significant improvement of subjective speech quality can be achieved.
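The difference between the two estimators named above can be sketched for a single quantized speech parameter. Given a posteriori probabilities over the quantizer's reproduction values, the MAP estimator picks the most probable value, while the MS estimator returns the probability-weighted mean. The codebook and probabilities below are made-up values for illustration, and the function names are assumptions rather than the paper's notation.

```python
def map_estimate(codebook, posteriors):
    """Return the reproduction value with the highest a posteriori probability."""
    best = max(range(len(codebook)), key=lambda i: posteriors[i])
    return codebook[best]


def ms_estimate(codebook, posteriors):
    """Return the mean-square estimate: the posterior-weighted average
    of all reproduction values."""
    return sum(value * prob for value, prob in zip(codebook, posteriors))


# Hypothetical 2-bit quantizer and a posteriori probabilities that
# might result from an unreliable channel (they sum to 1).
codebook = [-1.5, -0.5, 0.5, 1.5]
posteriors = [0.1, 0.4, 0.3, 0.2]

print("MAP estimate:", map_estimate(codebook, posteriors))
print("MS estimate: ", ms_estimate(codebook, posteriors))
```

When the probabilities are spread out, as on a poor channel, the MS estimate moves toward the middle of the codebook instead of committing to one possibly wrong value, which is one intuition for its better perceptual behavior.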