In this study, we investigate the effectiveness of spatial features in acoustic scene classification using distributed microphone arrays. Under the assumption that multiple subarrays, each equipped with microphones, are synchronized, we investigate two types of spatial feature: intra- and inter-generalized cross-correlation phase transforms (GCC-PHATs). These are derived from channels within the same subarray and between different subarrays, respectively. Our approach treats the log-Mel spectrogram as a spectral feature and intra- and/or inter-GCC-PHAT as a spatial feature. We propose two integration methods for spectral and spatial features: (a) middle integration, which fuses embeddings obtained by spectral and spatial features, and (b) late integration, which fuses decisions estimated using spectral and spatial features. The evaluation experiments showed that, when using only spectral features, employing all channels did not markedly improve the F1-score compared with the single-channel case. In contrast, integrating both spectral and spatial features improved the F1-score compared with using only spectral features. Additionally, we confirmed that the F1-score for late integration was slightly higher than that for middle integration.
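The GCC-PHAT feature named above can be illustrated for a single pair of channels. The following is a minimal sketch, not the authors' implementation: the function name `gcc_phat`, the `max_lag` parameter, the zero-padding length, and the synthetic signals are all assumptions made for illustration. For an intra-subarray feature the two channels would come from the same subarray; for an inter-subarray feature they would come from different subarrays.

```python
import numpy as np

def gcc_phat(x, y, max_lag):
    """GCC-PHAT of x relative to y for lags -max_lag..max_lag (hypothetical helper)."""
    n = len(x) + len(y)                  # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index 0 corresponds to lag -max_lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

# Synthetic check: x is a copy of s delayed by 5 samples.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delay = 5
x = np.concatenate((np.zeros(delay), s))
y = s
max_lag = 16
cc = gcc_phat(x, y, max_lag)
est_delay = int(np.argmax(cc)) - max_lag
```

The peak position of `cc` gives the estimated inter-channel delay in samples; a classifier would typically take the whole lag vector per frame as its spatial input rather than the peak alone.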
Acoustic source localization using a distributed microphone array is a challenging task due to the influence of noise and reverberation. In this paper, acoustic source localization using a kernel-based extreme learning machine in a distributed microphone array is proposed. Specifically, the space of interest is divided into a set of labeled positions, and the candidate generalized cross-correlation function at each node is treated as the feature mapped into the hidden nodes of the extreme learning machine. During the training phase, the kernel function allows the output weights of the classifier to be calculated directly, so they do not need to be tuned. After the kernel-based extreme learning machine (K-ELM) is trained, the measured generalized cross-correlation data are fed into the K-ELM classifier, and the output is the estimated acoustic source position. The proposed method needs less human intervention for both training and testing, and it does not require the nodes to be calibrated in advance. Simulation and real-world experimental results reveal that the proposed method has extremely fast training and testing speeds, and can obtain better localization performance than steered response power, K-nearest neighbor, and support vector machine methods.
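The closed-form training step described above can be sketched with the standard kernel ELM solution, in which the output weights are obtained by solving one regularized linear system. This is an illustrative sketch only: the RBF kernel, the values of `C` and `gamma`, the class name `KernelELM`, and the two-dimensional toy features (standing in for generalized cross-correlation vectors) are assumptions, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Pairwise RBF kernel between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

class KernelELM:
    """Kernel ELM classifier; C and gamma are assumed hyperparameters."""

    def __init__(self, C=100.0, gamma=0.5):
        self.C, self.gamma = C, gamma

    def fit(self, X, labels, n_classes):
        T = np.eye(n_classes)[labels]                 # one-hot targets
        K = rbf_kernel(X, X, self.gamma)
        self.X_train = X
        # Output weights in closed form: (K + I/C)^-1 T, no iterative tuning.
        self.beta = np.linalg.solve(K + np.eye(len(X)) / self.C, T)
        return self

    def predict(self, X):
        scores = rbf_kernel(X, self.X_train, self.gamma) @ self.beta
        return np.argmax(scores, axis=1)

# Toy data: three labeled "positions", each with noisy feature vectors.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
labels = np.repeat(np.arange(3), 30)
X = centers[labels] + 0.3 * rng.standard_normal((90, 2))
model = KernelELM().fit(X, labels, n_classes=3)
pred = model.predict(centers)
train_acc = float((model.predict(X) == labels).mean())
```

The predicted class index plays the role of the labeled position; the single call to `np.linalg.solve` is what makes training fast compared with iteratively optimized classifiers.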
Microphone positions have to be calibrated in distributed microphone array applications. A constrained total least squares calibration method for distributed microphone arrays is proposed in this paper. All the source event positions are first estimated by the weighted multidimensional scaling algorithm. Then suitable source events are selected by the TDOA selection strategy at each node. Finally, the node microphones are calibrated by total least squares, and further refined by constrained total least squares when the estimated results suffer from large errors. The proposed method obtains higher calibration accuracy and works well in noisy and reverberant conditions. Simulation and real-world experimental results confirm the validity of the proposed method.
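For readers unfamiliar with total least squares, the following sketch shows the classical (unconstrained) solution of an overdetermined system A x ≈ b when both A and b are noisy, computed from the SVD of the augmented matrix. It is a generic illustration under assumed synthetic data, not the constrained calibration procedure of the paper; the function name `total_least_squares` and all numeric values are hypothetical.

```python
import numpy as np

def total_least_squares(A, b):
    """Classical TLS solution of A x ~= b via the SVD of [A | b]."""
    n = A.shape[1]
    _, _, Vt = np.linalg.svd(np.column_stack((A, b)))
    v = Vt[-1]                 # right singular vector of the smallest singular value
    return -v[:n] / v[n]

# Synthetic system with small noise on both sides of the equation.
rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0, 0.5])
A_clean = rng.standard_normal((50, 3))
A_noisy = A_clean + 0.01 * rng.standard_normal((50, 3))
b_noisy = A_clean @ x_true + 0.01 * rng.standard_normal(50)
x_tls = total_least_squares(A_noisy, b_noisy)
```

Unlike ordinary least squares, this formulation allows for errors in the coefficient matrix as well as in the observations, which is the usual motivation when the matrix is itself built from estimated quantities.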
ISBN (print): 9789082797015
We propose a novel framework for reducing distant noise, that is, noise propagated from far away, in real time by using a distributed microphone array. Previous studies have revealed that a distributed microphone array with an instantaneous mixing assumption can effectively reduce noise when the target and noise sources are significantly far apart. However, in distant noise reduction, the target and noise sources are not usually instantaneously mixed because the reverberation and propagation times from the noise sources to a microphone are longer than the short-time Fourier transform (STFT) length. To express the reverberation and propagation parameters, we introduce a multi-delay noise model that represents the reverberation time as a convolution of the transfer-function gains and the noise sources, and the propagation time as time-frame delays. These parameters are estimated by maximum a posteriori (MAP) estimation. Experimental results show that the proposed method outperformed conventional methods on several performance measures and could reduce distant noise propagated from more than 100 m away in a real environment.
In this paper, with the aim of using the spatial information obtained from a distributed microphone array employed for acoustic scene analysis, we propose a robust and efficient method, which is called the spatial cepstrum. In our approach, similarly to the cepstrum, which is widely used as a spectral feature, the logarithm of the amplitude in multichannel observation is converted to a feature vector by a linear orthogonal transformation. This linear orthogonal transformation is achieved by principal component analysis (PCA) in general. Moreover, we also show that for a circularly symmetric microphone arrangement with an isotropic sound field, PCA is identical to the inverse discrete Fourier transform and the spatial cepstrum exactly corresponds to the cepstrum. The proposed approach does not require the positions of the microphones and is robust against the synchronization mismatch of channels, thus ensuring its suitability for use with a distributed microphone array. Experimental results obtained using actual environmental sounds verify the validity of our approach even when a smaller feature dimension than the original one is used, which is achieved by dimensionality reduction through PCA. Additionally, experimental results also indicate that the robustness of the proposed method is satisfactory for observations that have the synchronization mismatch of channels.
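The transformation described above (log amplitude per channel, followed by a PCA-derived orthogonal transform) can be sketched as follows. This is a hedged illustration: the RMS amplitude per frame, the mean-centering step, the function name `spatial_cepstrum`, and the random stand-in signals are assumptions and may differ from the authors' exact definition.

```python
import numpy as np

def spatial_cepstrum(log_amp, n_components):
    """Project per-frame multichannel log amplitudes onto PCA axes.

    log_amp: array of shape (n_frames, n_channels).
    Returns (features, basis) with basis of shape (n_channels, n_components).
    """
    centered = log_amp - log_amp.mean(axis=0)          # centering is an assumption
    cov = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(cov)             # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1][:n_components]   # largest variance first
    basis = eigvecs[:, order]                          # orthonormal columns
    return centered @ basis, basis

# Stand-in data: 200 frames, 6 channels, 256 samples per frame.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 6, 256))
amp = np.sqrt((frames ** 2).mean(axis=2))              # RMS amplitude per frame and channel
log_amp = np.log(amp + 1e-12)
feats, basis = spatial_cepstrum(log_amp, n_components=3)
```

Only per-frame amplitudes enter the feature, so neither microphone positions nor sample-accurate synchronization between channels is needed, which matches the robustness claims in the abstract. Keeping fewer components than channels corresponds to the dimensionality reduction it mentions.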
ISBN (print): 9781509041183
When distances between microphone pairs are larger than the half-wavelength of signals, source localization methods using cross-correlation, such as time-difference-of-arrival (TDOA) and steered response power (SRP), are commonly used in practice. We present here a novel model that expresses microphone pairwise cross-correlations as a sum of autocorrelations of source signals shifted by the relative delays of the signals arriving at the microphone pairs, and weighted by the source power and the distances between the sources and the microphone pairs. The model is formulated as a linear inverse problem and is sparse with respect to the source power map. The source power map, which directly shows the locations of all the sound sources, can be reconstructed using l_1-norm minimization algorithms. We demonstrate the effectiveness of our model in a wildlife monitoring application, where the goal is to locate multiple frogs in a dense chorus.
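The sparse reconstruction step can be illustrated with a generic l1-regularized least squares solver. The sketch below uses ISTA (iterative shrinkage-thresholding) on a random stand-in matrix; the abstract does not say which l1 algorithm the authors use, and the matrix `A` here is random rather than the paper's cross-correlation model, so the function name `ista`, the value of `lam`, and the data are all assumptions.

```python
import numpy as np

def ista(A, r, lam, n_iter=2000):
    """Minimize 0.5 * ||A p - r||^2 + lam * ||p||_1 by iterative shrinkage-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the gradient
    p = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = p - step * (A.T @ (A @ p - r))      # gradient step on the quadratic term
        p = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return p

# Stand-in problem: 100 candidate grid cells, 60 measurements, 2 active sources.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100)) / np.sqrt(60)
p_true = np.zeros(100)
p_true[[7, 42]] = [1.0, 2.0]
r = A @ p_true
p_hat = ista(A, r, lam=0.01)
top2 = sorted(np.argsort(np.abs(p_hat))[-2:].tolist())
```

In the localization setting, each entry of `p_hat` would correspond to the power at one candidate location, so the indices of the largest entries indicate where sources are active.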
The geometrical structure and size of a distributed microphone array are usually irregular, and need to be estimated in many applications. A microphone position calibration method for distributed microphone arrays, based on a combination of an acoustic energy decay model and time difference of arrival, is proposed in this paper. The method uses the acoustic energy decay model to estimate the coarse distance between the microphone and the sound source, and then applies time difference of arrival to search for the accurate distance within a certain range near the coarse distance. Finally, the minimum mean square error estimation method is employed to determine the position of the microphone. The proposed method has high positioning accuracy, stable calibration performance, and low computational complexity. Simulation results confirm the validity of the proposed method at a theoretical level. (C) 2015 Elsevier Ltd. All rights reserved.
ISBN (print): 9780992862633
In this paper we propose a robust and efficient method to utilize the spatial information provided by a distributed microphone array for acoustic scene analysis. In our approach, similarly to the cepstrum, which is widely used as a spectral feature, the logarithm of the amplitude in multichannel observation is converted to a feature vector by a linear orthogonal transformation. Then, the spatial information of the acoustic scene is represented in the spatial feature space. This approach does not require the positions of the microphones and is not sensitive to the synchronization mismatch of channels, both of which make the method suitable for use with a distributed microphone array. Experimental results using real-life environmental sounds show the validity of our approach even when a smaller feature dimension than the original one is used.
An accurate estimation of source activity information is essential for many speech enhancement algorithms including blind source separation (BSS). In this paper, we propose a novel BSS method that accurately models ...