Auditory masking is aggressively exploited by algorithms for the lossy compression of audio signals. In audio compression, the intent is to hide the noise introduced by coding below the masking threshold, making the noise inaudible. This renders the coding process transparent, enabling better compression without audible degradation of the signal. In this article, we show that using the masking properties of the hearing system allows for improved noise reduction. A novel method for noise reduction in speech signals is proposed. This method is shown to outperform non-auditory-based methods and compares well with other perceptually motivated noise reduction methods. The proposed method, Soulodre's PNRF combined with the ITU's PEAQ auditory model, is found to produce more musical noise but less signal distortion than the method proposed by Tsoukalas, which obtains marginally better results in informal testing.
We have recently introduced an incremental learning algorithm, called Learn++.NSE, designed for Non-Stationary Environments (concept drift), where the underlying data distribution changes over time. With each dataset drawn from a new environment, Learn++.NSE generates a new classifier to form an ensemble of classifiers. The ensemble members are combined through dynamically weighted majority voting, where voting weights are determined by each classifier's age-adjusted accuracy on current and past environments. Unlike other ensemble-based concept drift algorithms, Learn++.NSE does not discard prior classifiers, allowing potentially cyclical environments to be learned more effectively. While Learn++.NSE has been shown to work well on a variety of concept drift problems, a potential shortcoming of this approach is that the ensemble grows cumulatively. In this contribution, we expand our analysis of the algorithm to include various ensemble pruning methods that introduce controlled forgetting. Error-based and age-based pruning methods have been integrated into the algorithm to prevent potential out-voting by irrelevant classifiers, or simply to save memory over an extended period of time. Here, we analyze the tradeoff between these precautions and the desire to handle recurring contexts (cyclical data). Comparisons are made using several scenarios that introduce various types of drift.
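The weighted voting step described above can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the log(1/beta) weight used here, the function name `weighted_majority_vote`, and the idea of passing in a precomputed "age-adjusted error" per classifier are all assumptions made for illustration only.

```python
import math


def weighted_majority_vote(predictions, errors, eps=1e-6):
    """Combine class predictions using log(1/beta) voting weights.

    predictions: one class label per ensemble member.
    errors: each member's (hypothetical) age-adjusted error.
    """
    scores = {}
    for label, err in zip(predictions, errors):
        # Clamp so the weight stays finite and non-negative (assumption).
        err = min(max(err, eps), 0.5)
        beta = err / (1.0 - err)        # normalized error
        weight = math.log(1.0 / beta)   # low error gives a large weight
        scores[label] = scores.get(label, 0.0) + weight
    # The label with the largest total weight wins the vote.
    return max(scores, key=scores.get)
```

Under this weighting, a single accurate classifier can outvote several weak ones, while a classifier whose error reaches 0.5 contributes nothing. That is one way an old classifier can stay in the ensemble without dominating it.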
This article describes a new approach for higher-radix butterflies suitable for pipeline implementation. Based on the butterfly computation introduced by Cooley and Tukey [1], we introduce a novel factorization of the Discrete Fourier Transform (DFT) that redefines the butterfly computation and is more suitable for efficient VLSI implementation. This factorization motivated a new concept of a radix-r Fast Fourier Transform (FFT), in which the radix-r butterfly computation is formulated as composite engines that implement each of the butterfly computations. This concept enables the radix-r butterfly processing element (BPE) to be designed with only one complex multiplier in the butterfly critical path for any given r. This paper covers the algorithmic description and performance of the low-complexity FFT method; the parallel pipelined FFT is treated in a companion paper [15], Part II: Parallel Pipelined FFT Processing.
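For orientation, the classic Cooley-Tukey butterfly that the article builds on can be sketched as below. This is the textbook radix-2 decimation-in-time FFT, not the radix-r factorization proposed in the article; the function name `fft_radix2` is chosen here for illustration.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # One complex twiddle multiplication per butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Each radix-2 butterfly here needs a single complex multiplication followed by one addition and one subtraction. The article's claim concerns keeping that one-multiplier property in the critical path for higher radices.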
This paper proposes a robust adaptive algorithm for adjusting the coefficients of an adaptive filter used in an active noise canceller (ANC). The filtered LMS algorithm, which is widely used in digital signal processing, is deployed to reduce the effect of acoustic interference in a noisy environment. This paper presents the zero-noise output of the proposed one- and two-stage dual predictive line ANC (DPL-ANC) algorithms, which could be deployed in underground communication systems. A second DPL-ANC that uses voice activity detection (VAD) to control the update of the filter coefficients is also proposed. Evaluation with real-world underground and noisy speech data shows significant improvement in the convergence and the zero-noise output of both proposed two-stage DPL-ANC algorithms.
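The coefficient update at the heart of LMS-type cancellers can be sketched as follows. This is the plain LMS update, not the filtered LMS or the DPL-ANC structure proposed in the paper; the function name `lms_filter` and the default values for `num_taps` and `mu` are assumptions for illustration.

```python
def lms_filter(reference, desired, num_taps=4, mu=0.05):
    """Plain LMS: adapt weights so the filtered reference tracks `desired`.

    Returns the error signal e[n] = desired[n] - y[n] and the final weights.
    """
    w = [0.0] * num_taps      # adaptive filter coefficients
    buf = [0.0] * num_taps    # most recent reference samples, newest first
    errors = []
    for x, d in zip(reference, desired):
        buf = [x] + buf[:-1]                          # shift in the new sample
        y = sum(wi * bi for wi, bi in zip(w, buf))    # filter output
        e = d - y                                     # residual after cancellation
        # Gradient-descent step on the instantaneous squared error.
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        errors.append(e)
    return errors, w
```

In a noise-cancelling setting, `reference` would be a signal correlated with the interference and the returned error sequence would be the cleaned output. A VAD-controlled variant, as the paper proposes, would presumably gate the weight update line, but the abstract gives no details on how.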
The success of computational science in accurately describing and modeling the real world has helped to fuel the ever-increasing demand for cheap computing power. This paper presents a solution to the FFT's parallel multiprocessing problem that involves novel concepts making parallel pipelines and multistage parallel pipelines realizable. The problem resides in defining the mathematical model of the so-called combination phase, in which the representation of the discrete Fourier transform (DFT) in terms of its partial DFTs must be well structured to obtain the right mathematical model. In the resulting implementation, r parallel processors operate simultaneously within a single instruction, which reduces the number of communication phases and no-operation (NOP) states to their minimum values. The two papers, Part I (Butterfly Processing Element) and Part II (Parallel Pipelined Processing), together provide a new FFT concept for efficient VLSI implementation.
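The idea of expressing a DFT through its partial DFTs can be illustrated with the standard decimation-in-time identity, in which r partial DFTs of length N/r are combined using twiddle factors. This sketch shows only that textbook identity; it is not the paper's combination-phase model, and the function names `dft` and `combine_partial_dfts` are chosen here for illustration.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used for the partial transforms and as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def combine_partial_dfts(x, r):
    """Build the length-N DFT of x from r partial DFTs of length N/r."""
    n = len(x)
    if n % r != 0:
        raise ValueError("len(x) must be a multiple of r")
    m = n // r
    # Partial DFT p transforms the samples x[p], x[p + r], x[p + 2r], ...
    # The r partial transforms are independent, so they could run in parallel.
    partial = [dft(x[p::r]) for p in range(r)]
    # Combination phase: X[k] = sum_p W_N^(p*k) * X_p[k mod N/r].
    return [sum(cmath.exp(-2j * cmath.pi * p * k / n) * partial[p][k % m]
                for p in range(r))
            for k in range(n)]
```

For example, `combine_partial_dfts(list(range(12)), 3)` agrees with `dft(list(range(12)))` up to rounding error.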
Linear spectral mixture analysis (LSMA) has been widely used in the remote sensing community. Recently, kernel-based approaches have received considerable interest in hyperspectral image analysis, where nonlinear kernels are used to resolve the issue of nonlinear separability in classification. This paper extends LSMA to kernel-based LSMA: three least squares-based LSMA techniques, least squares orthogonal subspace projection (LSOSP), non-negativity constrained least squares (NCLS), and fully constrained least squares (FCLS), are extended to their kernel counterparts, KLSOSP, KNCLS, and KFCLS.
This paper presents a voice activity detection (VAD) algorithm based on the Wavelet Packet Transform and Teager Energy Operator (TEO) processing. The signal is decomposed into subband signals. We use the multi-resolution analysis property of the Wavelet Transform to extract and analyse the time-frequency components corresponding to speech. To obtain a parameter called the Voice Activity Shape (VAS), we use TEO processing to better distinguish the subband signals corresponding to speech. The subband variance values of each TEO signal are summed to obtain the VAS, which is higher in speech regions than in non-speech regions. Experimental results show that our VAD performs better than G.729B, particularly in difficult noise conditions and when the speech passes through a nonlinear communication channel. Experimental results are also shown for real speech communications from a spaceship to a terrestrial 3G cellular network, assuming nonlinear interference.
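The VAS computation described above can be sketched as follows, using the standard discrete Teager energy operator. The wavelet packet decomposition is omitted: the subband signals are assumed to be given, and the function names and the use of the population variance are assumptions for illustration.

```python
def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def variance(values):
    """Population variance of a non-empty sequence."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def voice_activity_shape(subbands):
    """Sum the variances of the TEO signals over the subbands of one frame.

    subbands: list of subband signals (each a list of at least 3 samples),
    assumed to come from a wavelet packet decomposition done elsewhere.
    """
    return sum(variance(teager_energy(band)) for band in subbands)
```

A frame would then be labelled as speech when its VAS exceeds some threshold. The abstract does not say how that threshold is chosen.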
The use of omnidirectional cameras for videoconferencing promises to simplify the hardware setup necessary for large groups of participants. We investigate the use of a multimodal speaker detection algorithm on audio-visual sequences captured with such a camera, in particular an algorithm that uses the audio energy together with the optical flow. We analyze several types of optical flow methods to determine which one is appropriate for the omnidirectional context.
The problem of automatic object categorization is investigated under the proposed bag-of-features object categorization framework. The framework consists of feature detection and representation, which uses the scale-invariant feature transform (SIFT) as the local feature and a bag-of-features model to represent the image. The learning process uses k-NN (k-nearest neighbour). In this paper, we propose reducing the dimensionality of SIFT using principal component analysis (PCA) on each object category, to reduce computational complexity and memory requirements during training. Experimental results show that the proposed technique can reduce the dimension of SIFT by up to around 80% while matching the average precision of the baseline technique without dimensionality reduction.
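The bag-of-features representation mentioned above can be sketched as a normalized histogram of nearest codewords. The codebook is assumed to be given (for example from clustering local descriptors), the PCA step proposed in the paper is not shown, and the function names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to `descriptor` in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[i])),
    )


def bag_of_features(descriptors, codebook):
    """Represent an image as a normalized histogram of codeword assignments."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    total = sum(hist)
    # An image with no descriptors maps to an all-zero histogram.
    return [h / total for h in hist] if total else [0.0] * len(codebook)
```

The resulting histograms are fixed-length vectors, so a k-NN classifier can compare images directly. Shorter descriptors make each distance computation in `nearest_codeword` cheaper, which is where a dimensionality reduction of the local features would pay off.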