Model-based single-channel source separation (SCSS) is an ill-posed problem requiring source-specific prior knowledge. In this paper, we use representation learning and compare general stochastic networks (GSNs), Gaus...
Real-time speech communication over packet-switched networks requires low-delay packet loss concealment (PLC) methods. Several PLC methods are used in IP telephony to cope with packet losses. Two commonly used methods are silence substitution and packet (waveform) repetition. We compare these methods according to the rate-distortion criterion by introducing a penalty for packet (waveform) repetition. This analysis allows a fair comparison between the two methods. We also compare the results with those of ITU-T's E-model and find them to be in agreement.
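The two concealment methods named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not include their rate-distortion penalty; the function name `conceal`, the `None` marker for a lost packet, and the fixed `frame_len` are assumptions made for illustration only.

```python
from typing import List, Optional

Packet = List[float]


def conceal(packets: List[Optional[Packet]], frame_len: int,
            method: str = "repeat") -> List[float]:
    """Rebuild a waveform from a packet stream; None marks a lost packet.

    method="silence": fill each lost packet with zeros.
    method="repeat":  replay the last correctly received packet
                      (falls back to zeros if nothing was received yet).
    """
    if method not in ("silence", "repeat"):
        raise ValueError("method must be 'silence' or 'repeat'")

    out: List[float] = []
    last_good: Optional[Packet] = None
    for pkt in packets:
        if pkt is not None:
            # Packet arrived: play it and remember it for later repetition.
            last_good = pkt
            out.extend(pkt)
        elif method == "repeat" and last_good is not None:
            # Packet (waveform) repetition.
            out.extend(last_good)
        else:
            # Silence substitution.
            out.extend([0.0] * frame_len)
    return out
```

Calling `conceal(stream, frame_len, "silence")` and `conceal(stream, frame_len, "repeat")` on the same lossy stream gives the two reconstructions whose distortion could then be compared against the original waveform.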
To approximate the speech quality of a given speech enhancement system, most of the existing instrumental metrics rely on the calculation of a distortion metric defined between the clean reference signal and the enhan...
We present a Matlab-based game, teaching the interplay between the pole-zero chart of a linear filter and its magnitude response. The estimation error, which acts as a game score, is based on concepts undergraduate st...
ISBN: 9783200019409 (print)
The task of localizing single and multiple concurrent speakers in a reverberant environment with background noise poses several problems. One of the major problems is the severe corruption of the frame-wise localization estimates. To improve the overall localization accuracy, we propose a particle-filter-based tracking algorithm that uses the recently proposed Multiband Joint Position-Pitch (M-PoPi) localization algorithm as a frame-wise likelihood estimate. To evaluate the performance of our approach, we tested it on real-world recordings of seven different speakers and of up to three concurrent speakers. We compared our new approach with a particle filter that uses the well-known SRP-PHAT algorithm as its frame-wise likelihood estimate. Finally, we compared both particle-filter-based tracking algorithms with their underlying frame-wise localization algorithms. The M-PoPi-based particle filter tracking algorithm outperforms the SRP-PHAT-based particle filter tracking algorithm. The comparison with the frame-wise localization algorithms shows that this improved performance stems from the more robust M-PoPi frame-wise localization estimate.
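The general idea of using a frame-wise likelihood inside a particle filter can be sketched as follows. This is a generic bootstrap particle filter over a single azimuth angle, not the algorithm from the paper: the Gaussian `gaussian_likelihood` is a hypothetical stand-in for an M-PoPi or SRP-PHAT likelihood, the random-walk motion model and its `process_std` are assumptions, and angle wrap-around is ignored.

```python
import math
import random
from typing import Callable, List, Tuple


def gaussian_likelihood(center: float, std: float) -> Callable[[float], float]:
    """Hypothetical stand-in for a frame-wise localization likelihood."""
    def likelihood(angle: float) -> float:
        return math.exp(-0.5 * ((angle - center) / std) ** 2)
    return likelihood


def pf_step(particles: List[float],
            likelihood: Callable[[float], float],
            rng: random.Random,
            process_std: float = 2.0) -> Tuple[List[float], float]:
    """One predict / weight / resample cycle for a single frame."""
    # Predict: random-walk motion model (an assumption of this sketch).
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]

    # Weight: evaluate the frame-wise likelihood at every particle.
    weights = [likelihood(p) for p in predicted]
    total = sum(weights)
    if total <= 0.0:
        # Degenerate frame: fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))

    # Estimate: weighted mean of the particle positions.
    estimate = sum(w * p for w, p in zip(weights, predicted)) / total

    # Resample: multinomial resampling proportional to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate
```

A tracker would call `pf_step` once per frame with that frame's likelihood, starting from particles spread over the admissible angle range, for example `[rng.uniform(-90, 90) for _ in range(500)]`.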
In single-channel speech enhancement the spectral amplitude of the noisy signal is often modified while the noisy spectral phase is directly employed for signal reconstruction. Recently, additional improvement in spee...
Although nonnegative matrix factorization (NMF) favors a part-based and sparse representation of its input, there is no guarantee for this behavior. Several extensions to NMF have been proposed in order to introduce s...
We present a combination of the multiband joint position-pitch (M-PoPi) estimation algorithm with the particle filtering framework to enhance the localization accuracy when tracking multiple concurrent speakers. A new ...
We propose a robust and efficient lung sound classification system using a snapshot ensemble of convolutional neural networks (CNNs). A robust CNN architecture is used to extract high-level features from log mel spect...
Several iterative approaches have been proposed for speech enhancement. This work reviews these methods and further presents a novel iterative estimation scheme to jointly estimate the harmonic parameters, i.e., ampli...