ISBN: (print) 9789464593617; 9798331519773
Spotforming is a target-speaker extraction technique that uses multiple microphone arrays. The method applies beamforming (BF) to each microphone array and estimates the components common to the BF outputs as the target source. This study proposes a new common-component extraction method based on nonnegative tensor factorization (NTF) to improve model interpretability and make spotforming more robust to hyperparameter choices. Moreover, attractor-based regularization is introduced to facilitate the automatic selection of optimal target bases in the NTF. Experimental results show that the proposed method outperforms conventional methods in spotforming and has characteristics suitable for practical use.
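For readers unfamiliar with NTF, the following is a minimal sketch of a generic three-way nonnegative CP factorization fitted with multiplicative updates under a Frobenius-norm cost. It is not the method proposed in the abstract: the attractor-based regularization and the spotforming-specific tensor construction are not reproduced, and the function names (`ntf_cp`, `khatri_rao`, `unfold`, `reconstruct`) are hypothetical.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    r = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, r)

def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest (C order)."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def reconstruct(factors):
    """Rebuild the 3-way tensor from its three factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)

def ntf_cp(x, rank, n_iter=200, eps=1e-12, seed=0):
    """Generic nonnegative CP factorization of a 3-way tensor (assumed setup)."""
    rng = np.random.default_rng(seed)
    # Strictly positive initialization keeps the multiplicative updates well defined.
    factors = [rng.random((dim, rank)) + 0.1 for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])   # matches the unfolding order
            numer = unfold(x, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            factors[mode] *= numer / denom          # update preserves nonnegativity
    return factors

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    truth = [rng.random((4, 2)), rng.random((5, 2)), rng.random((6, 2))]
    x = reconstruct(truth)
    est = ntf_cp(x, rank=2)
    rel_err = np.linalg.norm(x - reconstruct(est)) / np.linalg.norm(x)
    print(f"relative reconstruction error: {rel_err:.4f}")
```

In a spotforming setting, one mode of the tensor would plausibly index the microphone arrays, so that bases shared across arrays correspond to the common component; that mapping is an assumption here and is not stated in the abstract.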
In this paper, we present a new joint factorization algorithm, called nonnegative tensor cofactorization (NTCoF). The key idea is to simultaneously factorize multiple visual features of the same data into nonnegative, dimensionality-reduced representations while maximizing the correlations among the low-dimensional representations. The data are generally encoded as tensors of arbitrary order, rather than vectors, to preserve the original data structures. NTCoF provides a simple and efficient way to fuse multiple complementary features, enhancing the discriminative power of the desired rank-reduced representations under the nonnegativity constraints. We formulate the related objectives as a block-wise quadratic nonnegative function. To optimize it, we develop a unified, provably convergent solution. This solution applies to any nonnegative optimization problem with a block-wise quadratic objective function, and thus offers a unified platform from which specific solutions can be derived directly, without a tedious proof of algorithmic convergence. We apply the proposed algorithm and solution to three image tasks: face recognition, multiclass image categorization, and multilabel image annotation. Comparisons on challenging public data sets show that the proposed algorithm can outperform both traditional nonnegative methods and popular feature combination methods.
ISBN: (print) 9781479911806
New applications of electroencephalographic (EEG) recording require light, easy-to-handle equipment backed by powerful artifact-removal algorithms. In our work, we use informed source separation methods to remove artifacts from EEG recordings with a low number of sensors, especially in the extreme case of single-channel recording, by exploiting prior knowledge from auxiliary lightweight sensors that capture artifactual signals. To achieve this, we propose a method using non-negative tensor factorization (NTF) in a Gaussian source separation framework that proves competitive with the classic Independent Component Analysis (ICA) technique. Additionally, both the NTF and ICA methods are used in an original scheme that jointly processes the EEG and auxiliary signals. The adopted NTF strategy is shown to improve the accuracy of the source estimates in comparison with the usual multi-channel ICA approach.
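As a rough illustration of how a nonnegative factorization can plug into a Gaussian source separation framework, the sketch below builds per-source variance models from nonnegative factors and applies the standard single-channel Wiener mask. This is a generic textbook construction under stated assumptions, not the scheme of the paper: the factor shapes, the random placeholder data, and the names `variances_from_factors` and `wiener_separate` are all hypothetical, and no factor estimation or use of auxiliary sensors is shown.

```python
import numpy as np

def variances_from_factors(w, h):
    """Per-source variance model V_j = W_j @ H_j.

    w: nonnegative spectral bases, shape (J, F, K)
    h: nonnegative activations,    shape (J, K, T)
    returns: nonnegative variances, shape (J, F, T)
    """
    return np.einsum("jfk,jkt->jft", w, h)

def wiener_separate(mix_stft, variances, eps=1e-12):
    """Single-channel Wiener filtering under a local Gaussian model.

    mix_stft:  complex mixture STFT, shape (F, T)
    variances: nonnegative source variances, shape (J, F, T)
    returns:   complex source estimates, shape (J, F, T)
    """
    total = variances.sum(axis=0) + eps      # mixture variance per time-frequency bin
    masks = variances / total                # masks lie in [0, 1] and sum to about 1
    return masks * mix_stft[None, :, :]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_src, n_freq, n_frames, n_comp = 2, 16, 20, 3
    # Placeholder factors; in practice they would be fitted to the observed data.
    w = rng.random((n_src, n_freq, n_comp)) + 0.1
    h = rng.random((n_src, n_comp, n_frames)) + 0.1
    mix = rng.standard_normal((n_freq, n_frames)) + 1j * rng.standard_normal((n_freq, n_frames))
    estimates = wiener_separate(mix, variances_from_factors(w, h))
    print(estimates.shape)
```

Because the masks sum to one in every time-frequency bin, the source estimates add back up to the mixture, which makes the decomposition easy to sanity-check.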
ISBN: (print) 9783642133176
This study compares row-wise unfolding nonnegative tensor factorization (NTF) with standard nonnegative matrix factorization (NMF) for extracting time-frequency representations of two event-related potentials, mismatch negativity (MMN) and P3a, from EEG under a two-dimensional decomposition. The criterion for judging the performance of NMF and NTF is based on psychological knowledge of MMN and P3a. MMN is elicited by an oddball paradigm and may be proportionally modulated by attention, so participants are usually instructed to ignore the stimuli. However, the deviant stimulus inevitably draws some of the participant's attention towards the stimuli, and thus P3a often follows MMN. As a result, a larger P3a could mean that the deviant stimulus attracted more attention, which in turn could enlarge MMN. The MMN and P3a extracted by the row-wise unfolding NTF revealed this coupling feature. However, this characteristic was not evident with the standard NMF or in the raw data.
ISBN: (print) 9781479971299
Room reverberation and environmental noise present challenges for integrating speech recognition technology into smart room applications. We present a multichannel enhancement framework for distributed microphone arrays that mitigates the effects of both additive noise and reverberation on distant-talking microphones. The proposed approach uses nonnegative matrix and tensor factorization techniques to achieve both noise suppression (through sparse representation of speech spectra) and dereverberation (through decomposition of magnitude spectra into convolutive components). ASR experiments on the DIRHA-GRID corpus confirm that the proposed approach can achieve relative improvements of up to 20% in recognition accuracy in highly reverberant and noisy conditions using clean-trained models.