In this paper, a new approach is presented for single-channel speech enhancement which is based on Nonnegative Matrix Factorization (NMF). The proposed scheme combines the noise Power Spectral Density (PSD) estimation...
Common types of hearing impairment are caused mainly by a loss of nearly instantaneous compressive amplification in the inner ear. Therefore, it seems plausible that the loss might be compensated by fast frequency-dependent compression in the hearing aid. We simulated impaired listeners' auditory analysis of hearing-aid processed speech in noise using a functional auditory model. Using hidden Markov signal models, we estimated the mutual information between the phonetic structure of clean speech and the neural output from the auditory model, with fast and slow versions of hearing-aid compression. The long-term speech spectrum of amplified sound was identical in both systems, as specified individually by the widely accepted NAL prescription for the gain frequency response. The calculation showed clearly better speech-to-auditory information transmission with slow quasi-linear amplification than with fast hearing-aid compression, for speech in speech-shaped noise at signal-to-noise ratios ranging from -10 to +20 dB. Copyright by EURASIP.
The line spectral frequencies (LSF) are known to be the most efficient representation of the linear predictive coding (LPC) parameters from both the distortion and perceptual points of view. By considering the bounded ...
In this paper, we model the underlying probability density function (PDF) of the speech line spectral frequencies (LSF) parameters with a Dirichlet mixture model (DMM). The LSF parameters have two special features: 1)...
Probability density function (PDF) optimized quantization has been shown to be more efficient than conventional quantization methods. In practical applications, the data with bounded support can be modelled bet...
To facilitate real-time voice communication through the Internet, forward error correction (FEC) and multiple description coding (MDC) can be used as low-delay packet-loss recovery techniques. We use both a Gilbert channel model and data obtained from real IP connections to compare the rate-distortion performance of different variants of FEC and MDC. Using identical overall rates with stringent delay constraints, we find that side-distortion optimized MDC generally performs better than Reed-Solomon based FEC. If the channel condition is known from feedback through the Real-Time Control Protocol (RTCP), then channel-optimized MDC can be used to exploit this information, resulting in significantly improved performance.
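As a rough illustration of the Gilbert channel model mentioned above, the following minimal sketch simulates packet loss with a two-state Markov chain. It assumes the simplest variant, in which every packet is delivered in the good state and lost in the bad state; the function names and parameter values are hypothetical and are not taken from the paper.

```python
import random


def simulate_gilbert(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Simulate a simple two-state Gilbert channel.

    Returns a list of booleans, one per packet, where True means lost.
    Assumption: packets are always lost in the bad state and always
    delivered in the good state.
    """
    rng = random.Random(seed)
    bad = False  # start in the good state
    lost = []
    for _ in range(n_packets):
        # Draw the state transition before sending each packet.
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        lost.append(bad)
    return lost


def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of lost packets for the chain above."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)


if __name__ == "__main__":
    losses = simulate_gilbert(100_000, p_good_to_bad=0.05, p_bad_to_good=0.5)
    print("empirical loss rate:", sum(losses) / len(losses))
    print("stationary loss rate:", stationary_loss_rate(0.05, 0.5))
```

In this variant the mean loss-burst length is 1 / p_bad_to_good, so the two transition probabilities set the average loss rate and the burstiness independently, which is what makes the model useful for comparing packet-loss recovery schemes.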
In this paper, the application of a well-known mathematical theorem, Banach's fixed point theorem [1], is investigated in iterative signal processing in communications. In most practical communication systems some...
ISBN: 9781467310680 (print)
In this paper, the earlier proposed short-time objective intelligibility predictor (STOI) is simplified such that it can be expressed as a weighted ℓ2 norm in the auditory domain. Due to the mathematical properties of a norm, STOI can now be used with the matching pursuit algorithm in the n-of-m channel selection technique found in several cochlear implant (CI) coding strategies. With this technique, only a subset of frequency channels (electrodes) is stimulated, such that important channels can be updated more frequently and less significant channels are omitted. Intelligibility predictions with acoustic CI simulations for normal-hearing listeners indicate that more intelligible speech is obtained with the proposed method than with a conventional channel selection method based on peak picking. Reasons for this difference in performance are: (1) STOI considers an analysis window of a few hundred milliseconds in order to account for low temporal modulations that are important for speech intelligibility, and (2) spectral leakage per channel is accounted for in the mathematical optimization process.
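For orientation, the conventional peak-picking baseline referred to above can be sketched as follows. This is a minimal illustration of n-of-m selection for a single time frame, not the STOI-based method proposed in the paper; the function name and the example envelope values are hypothetical.

```python
def select_n_of_m(envelopes, n):
    """Peak picking for one frame: keep the n largest of m channel envelopes.

    Channels that are not selected are set to 0.0, i.e. their electrodes
    are not stimulated in this frame. Ties are broken in favour of the
    lower channel index because Python's sort is stable.
    """
    m = len(envelopes)
    if not 0 <= n <= m:
        raise ValueError("n must satisfy 0 <= n <= m")
    # Channel indices ordered from largest to smallest envelope.
    order = sorted(range(m), key=lambda i: envelopes[i], reverse=True)
    keep = set(order[:n])
    return [e if i in keep else 0.0 for i, e in enumerate(envelopes)]


if __name__ == "__main__":
    frame = [0.1, 0.9, 0.3, 0.7]  # hypothetical envelopes, m = 4
    print(select_n_of_m(frame, 2))
```

Peak picking looks only at the instantaneous envelope of each channel. According to the abstract, the proposed method differs in that it optimizes an intelligibility-related norm over a longer analysis window and accounts for spectral leakage between channels.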
This paper demonstrates the potential of theoretically motivated learning methods in solving the problem of non-intrusive quality estimation, for which the state of the art is represented by the ITU-T P.563 standard. To co...
Mutual information (MI) is an important information-theoretic concept which has many applications in telecommunications, in blind source separation, and in machine learning. More recently, it has also been employed fo...