Author:
GRENIER, YENST
DEPT SIGNAL 46 RUE BARRAULT F-75634 PARIS 13 FRANCE
This paper describes a microphone array for speech recording in car environments. The array is designed for hands-free radiotelephony and also serves as a front end for an automatic speech recognition system (this study was carried out within the European ESPRIT project ARS, "Adverse environment Recognition of Speech"). We first summarise the adaptive beamforming techniques that we used. We then describe several aspects of the array implementation (configuration, design of fixed beamformers, adaptation, complexity reduction). In the last section, we evaluate the performance of the array with two measures: the signal-to-noise ratio and the score obtained with the speech recognition system.
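The abstract above evaluates the array by signal-to-noise ratio. As a rough illustration of how a beamformer raises SNR, the following minimal sketch uses a fixed delay-and-sum beamformer, which is simpler than the adaptive beamformers the paper describes and is not the authors' implementation. The sine-wave speech stand-in, the integer delays, the independent Gaussian noise, and all function names are assumptions made for the example.

```python
import math
import random

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from separate signal and noise sequences."""
    p_sig = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_sig / p_noise)

def delay_and_sum(channels, delays):
    """Advance each channel by its integer steering delay, then average."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]

rng = random.Random(0)
fs, n_samples = 8000, 2000
delays = [0, 2, 4, 6]  # hypothetical per-microphone arrival delays (samples)
clean = [math.sin(2 * math.pi * 300 * n / fs) for n in range(n_samples)]

# Each microphone sees a delayed copy of the speech stand-in plus independent noise.
speech_parts = [[0.0] * d + clean[: n_samples - d] for d in delays]
noise_parts = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in delays]

# Beamforming is linear, so the speech and noise parts can be processed separately.
snr_in = snr_db(speech_parts[0], noise_parts[0])
snr_out = snr_db(delay_and_sum(speech_parts, delays),
                 delay_and_sum(noise_parts, delays))
snr_gain = snr_out - snr_in
print(f"SNR gain of the 4-microphone sketch: {snr_gain:.1f} dB")
```

With spatially uncorrelated noise, averaging N aligned channels lowers the noise power by about a factor of N, so four microphones give a gain near 6 dB. Real car noise is partly correlated between microphones, which is one reason adaptive methods are of interest.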
In this paper, a new microphone array processing technique is proposed for blind dereverberation of speech signals affected by room acoustics. It is based on the separate processing of the minimum-phase and all-pass components of delay-steered multi-microphone signals. The minimum-phase components are processed in the cepstrum domain, where spatial averaging followed by low-time filtering is applied. The all-pass components, which contain the source location information, are processed in the frequency domain by performing spatial averaging and retaining only the all-pass component of the resulting output. The underlying motivation for the new processor is to use spatio-temporal processing over a single set of synchronous speech segments from several microphones to reconstruct the source speech, so that it is applicable to practical time-variant acoustic environments. Simulated room impulse responses are used to evaluate the new processor and to compare it with a conventional beamformer. Significant improvements in array gain and substantial reductions of reverberation in listening tests are observed.
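The cepstrum-domain step described above (spatial averaging followed by low-time filtering) can be sketched as follows. This is only an illustration under simplifying assumptions: it uses the real cepstrum of each channel rather than an explicit minimum-phase/all-pass decomposition, a naive DFT instead of an FFT, and hypothetical function names and cutoff. It is not the paper's processor.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(x, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum."""
    n = len(x)
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    return [
        sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def averaged_low_time_cepstrum(channels, cutoff):
    """Average the cepstra over microphones, then keep quefrencies below cutoff."""
    cepstra = [real_cepstrum(ch) for ch in channels]
    n = len(cepstra[0])
    mean = [sum(c[t] for c in cepstra) / len(cepstra) for t in range(n)]
    # The real cepstrum is symmetric, so keep both ends of the quefrency axis.
    return [v if (t < cutoff or t > n - cutoff) else 0.0 for t, v in enumerate(mean)]

# Toy channel: a direct path plus one echo at lag 4 (hypothetical values).
echoed = [1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
liftered = averaged_low_time_cepstrum([echoed, echoed], cutoff=2)
```

In this toy case the echo appears as a cepstral peak at quefrency 4, and the low-time filter removes it while keeping the low-quefrency part.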
ISBN: 9781424487363 (print)
When a fire or earthquake fills a building with smoke or brings down walls, a rescue robot cannot directly find a target that is in another room or otherwise out of sight using visual, ultrasonic, or infrared sensors. Because sound diffracts, it can travel around obstacles. Using a microphone array combined with speech recognition technology, an audio-based robot navigation system is developed that guides the robot to a target shouting for help. The experiments yielded satisfactory results.
ISBN: 9781424421787 (print)
The theoretical foundation of traditional microphone array post-filters is the assumption that the noise at different sensors is uncorrelated. However, this assumption is inaccurate in real environments, where correlated noise exists. In this paper, a generalized microphone array post-filter is proposed to deal with both correlated and uncorrelated noise, and a novel perceptual filter is proposed to reduce the residual musical noise introduced by the post-filter. Experiments show that the proposed technique produces impressive results in terms of quality measures of the enhanced speech.
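To see why the uncorrelated-noise assumption can fail, one common textbook model (an assumption here, not something stated in the abstract) is the ideal diffuse noise field, whose coherence between two omnidirectional microphones spaced d apart is sin(2πfd/c)/(2πfd/c). The sketch below evaluates that expression; the 5 cm spacing and the function name are hypothetical.

```python
import math

def diffuse_coherence(freq_hz, spacing_m, speed_of_sound=343.0):
    """Coherence of an ideal diffuse noise field between two omnidirectional sensors."""
    x = 2.0 * math.pi * freq_hz * spacing_m / speed_of_sound
    return 1.0 if x == 0.0 else math.sin(x) / x

spacing = 0.05  # hypothetical 5 cm microphone spacing
for f in (100, 500, 1000, 3430):
    print(f, round(diffuse_coherence(f, spacing), 3))
```

Under this model the noise at closely spaced microphones is almost fully coherent at low frequencies and only decorrelates once the spacing approaches half a wavelength.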
Conventional methods for sound source localization using microphone arrays usually approach the problem from a signal processing viewpoint, where the sound source location is treated as a continuous parameter to be estimated over some spatial region. In some practical scenarios, however, such as conference rooms and cars, sound source locations are confined to a few predefined areas. It is therefore more reasonable to treat the problem from a machine learning point of view. By incorporating the prior information available about the sound environment, machine learning-based methods have the potential to deal better with sound source localization in the presence of room reverberation. The key to machine learning-based sound source localization is how to extract effective source location features. Existing feature extraction schemes, such as the popular time-difference-of-arrival features, are not suitable for small-sized sensor arrays, because sound source localization in reverberant environments becomes much more challenging for small-sized arrays. To address this problem, we propose a reverberation-robust feature extraction method for sound source localization based on sound intensity (SI) estimation using a small-sized microphone array. In particular, three robust feature extraction procedures are employed in the proposed features: normalization, phase transform weighting, and full use of the redundancies in SI estimation. Simulation and real-world experimental results both show that the proposed sound source location features are more effective for small-sized arrays in reverberant environments than the existing features.
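One way to see why time-difference-of-arrival (TDOA) features carry little information on a small array is to count how many samples the largest possible delay spans. The sketch below assumes a far-field source, a two-microphone pair, and a speed of sound of 343 m/s; the 4 cm spacing, the 16 kHz sampling rate, and the function names are hypothetical and not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def max_tdoa_samples(spacing_m, fs_hz):
    """Largest possible inter-microphone delay, expressed in samples."""
    return spacing_m * fs_hz / SPEED_OF_SOUND

def tdoa_to_angle_deg(tdoa_samples, spacing_m, fs_hz):
    """Far-field arrival angle relative to broadside for a two-microphone pair."""
    ratio = tdoa_samples * SPEED_OF_SOUND / (fs_hz * spacing_m)
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding outside [-1, 1]
    return math.degrees(math.asin(ratio))

# Hypothetical small array: 4 cm spacing sampled at 16 kHz.
print(max_tdoa_samples(0.04, 16000))
print(tdoa_to_angle_deg(1, 0.04, 16000))
```

With these assumed values the whole angular range maps onto fewer than two samples of delay on either side of zero, so an integer-sample TDOA estimate can distinguish only a handful of directions.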
ISBN: 9781424442966 (print)
We propose a microphone array network that realizes ubiquitous sound acquisition. Nodes with 16 microphones each are connected to form a large sound acquisition system that carries out voice activity detection (VAD), sound source localization, and sound source separation. The three operations are distributed among the nodes. The VAD is used to manage power consumption, so the system consumes little power when speech is not active. The VAD module uses only 2.1 mW. The system can improve the SNR by 7.75 dB using 112 microphones.
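The idea of using VAD to gate the more expensive localization and separation stages can be illustrated with a very simple frame-energy detector. The abstract does not say which VAD algorithm the system uses, so the energy criterion, the frame length, the threshold, and the function names below are all assumptions made for illustration.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def energy_vad(samples, frame_len, threshold):
    """Return one True/False activity decision per non-overlapping frame."""
    return [
        frame_energy(samples[i:i + frame_len]) > threshold
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

fs = 8000
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(160)]  # speech stand-in
decisions = energy_vad(silence + tone + silence, frame_len=80, threshold=0.01)

# Only frames flagged as active would be passed on to localization and separation.
active_frames = [i for i, active in enumerate(decisions) if active]
```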
ISBN: 9781538656273 (print)
This paper presents a microphone array sound source localization system based on deep learning algorithms. Currently, the most popular acoustic source localization algorithms are based on traditional array signal processing methods. These methods localize well in an ideal acoustic environment, but their performance degrades significantly at low signal-to-noise ratio (SNR) and in strongly reverberant environments. To deal with this problem, this study developed a deep neural network (DNN) based system. Unlike traditional algorithms, which adapt poorly to environmental conditions, the proposed system can automatically learn the spatial information of sound sources under various conditions by training on a large amount of data. Furthermore, it can use all the information in the original data without additional feature extraction. A set of experiments is carried out to evaluate the performance of the proposed system in comparison with the generalized cross-correlation phase transform (GCC-PHAT) method. The results verify that the DNN-based system achieves higher accuracy under low-SNR conditions.
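For readers unfamiliar with the GCC-PHAT baseline named above, the following minimal sketch estimates the delay between two microphone signals by whitening the cross-spectrum so that only its phase remains, then picking the correlation peak. It uses a naive DFT to stay dependency-free (a practical implementation would use an FFT), and the function names and test signal are assumptions for illustration rather than the paper's code.

```python
import cmath
import random

def dft(x, sign=-1):
    """Naive DFT; sign=+1 gives the unnormalised inverse transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def gcc_phat_delay(x, y, max_lag):
    """Estimate how many samples y lags x using PHAT-weighted cross-correlation."""
    n = len(x) + len(y)  # zero-pad so circular correlation acts like linear correlation
    X = dft(list(x) + [0.0] * (n - len(x)))
    Y = dft(list(y) + [0.0] * (n - len(y)))
    cross = [yk * xk.conjugate() for xk, yk in zip(X, Y)]
    # PHAT weighting: keep only the phase of the cross-spectrum.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0.0 for c in cross]
    corr = [v.real / n for v in dft(whitened, sign=+1)]
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: corr[lag % n])

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(29)] + [0.0] * 3  # reference microphone
y = [0.0] * 3 + x[:29]                                     # same signal, 3 samples later
estimated = gcc_phat_delay(x, y, max_lag=8)
```

In a noise-free, echo-free case like this one the whitened correlation is close to a single spike at the true delay. Noise and reverberation add competing peaks, which is the regime where the abstract reports that the DNN-based system does better.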
Wall pressure fluctuations in turbulent boundary layer flow over a backward-facing step, with and without entrainment, were investigated. Digital array pressure sensors and multi-arrayed microphones were employed to acquire the time-averaged static pressure and the fluctuating pressure, respectively. The differences between the two flows were examined in terms of static pressure characteristics, pressure fluctuations, and the cross-correlation and coherence of the wall pressure. Introducing entrainment increased the scale of the large-scale vortical structures and reduced their convection velocity. However, the shedding frequency of the large-scale vortical structures was found to be the same for both flows.
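A convection velocity of the kind mentioned above is commonly estimated from the time lag that maximises the cross-correlation between two streamwise-separated pressure signals, divided into the sensor spacing. The sketch below shows that calculation on synthetic data; the spacing, sampling rate, signals, and function names are hypothetical, and the abstract does not state that this exact procedure was used.

```python
import random

def best_lag(upstream, downstream, max_lag):
    """Non-negative lag (in samples) at which downstream best matches upstream."""
    def corr(lag):
        return sum(upstream[n] * downstream[n + lag]
                   for n in range(len(upstream) - max_lag))
    return max(range(0, max_lag + 1), key=corr)

def convection_velocity(upstream, downstream, spacing_m, fs_hz, max_lag):
    """Sensor spacing divided by the time lag of peak cross-correlation."""
    lag = best_lag(upstream, downstream, max_lag)
    if lag == 0:
        raise ValueError("no measurable delay between the two sensors")
    return spacing_m * fs_hz / lag

rng = random.Random(2)
upstream = [rng.gauss(0.0, 1.0) for _ in range(500)]   # synthetic pressure signal
downstream = [0.0] * 5 + upstream[:-5]                 # same structure, 5 samples later
u_c = convection_velocity(upstream, downstream, spacing_m=0.01, fs_hz=10000, max_lag=20)
```

Here a 5-sample lag at 10 kHz over a 1 cm spacing corresponds to 20 m/s.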
This paper describes a new speech enhancement system that employs a microphone array with post-processing based on a minimum mean-square error short-time spectral amplitude (MMSE-STSA) estimator. To obtain a more accurate MMSE-STSA estimator in a microphone array, modification and refinement procedures are carried out on each microphone output. The performance of the proposed system is compared with that of other microphone array methods. Noise removal experiments with white and pink noise demonstrate that the proposed speech enhancement system outperforms the other microphone array methods in average output SNR and cepstral distance measures.
The ability to localize acoustic sources can greatly improve the perception of smart devices (e.g., a smart speaker like Amazon Alexa). In this work, we study the problem of concurrently localizing multiple acoustic sources with a single smart device. Our proposal, called Symphony, is the first complete solution to this problem, covering method, theory, and practice. The method stems from the insight that the geometric layout of the microphones on the array determines a unique relationship among signals from the same source along the same arriving path. We also establish a theoretical model of Symphony, which reveals the relation between localization performance (resolution and coverage) and the factors that affect it (sampling rate, array aperture, and array-wall distance). Moreover, the ability to separate and localize multiple sources is studied theoretically and numerically. We implement Symphony with different types of commercial off-the-shelf microphone arrays and evaluate its performance under different settings. The results show that Symphony has a median localization error of 0.662 m.