The abundance fully constrained least squares (FCLS) method has been widely used for spectral unmixing. A modified FCLS (MFCLS) was previously proposed for the same purpose to derive two iterative equations for solving fu...
ISBN: (print) 9781467301732
This paper describes a unique cross-phoneme speaker identification experiment that uses deliberately mismatched phoneme sets for training and testing. The underlying goal is to identify features that represent broad, individually unique characteristics rather than phonetic differences, the latter being more typical of modern speaker identification and verification systems. A wide range of features is proposed and evaluated within this context using a Gaussian Mixture Model framework. The results show that the log-area ratio has better phonetic independence than MFCCs and that residual phase carries substantial speaker information; they also identify several other features that are useful for speaker identification independent of phonetic content.
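As background for the log-area ratio (LAR) feature named above, the following minimal sketch maps linear-prediction reflection coefficients to LARs. The abstract does not give the authors' feature-extraction details, so the function name `log_area_ratios` and the sign convention `ln((1 + k) / (1 - k))` are assumptions; some texts use the reciprocal, which only flips the sign.

```python
import math


def log_area_ratios(reflection_coeffs):
    """Map reflection coefficients k_i (each strictly inside (-1, 1)) to log-area ratios.

    Assumed convention: LAR_i = ln((1 + k_i) / (1 - k_i)), which equals 2 * atanh(k_i).
    """
    lars = []
    for k in reflection_coeffs:
        if not -1.0 < k < 1.0:
            raise ValueError("reflection coefficients must lie strictly in (-1, 1)")
        lars.append(math.log((1.0 + k) / (1.0 - k)))
    return lars


# Hypothetical reflection coefficients, chosen only for illustration.
example = log_area_ratios([0.0, 0.5, -0.5])
print(example)
```

The transform stretches values near plus or minus 1, which is one reason LARs are often preferred over raw reflection coefficients as features.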
This paper introduces a new method to track articulator movements, specifically jaw position and angle, using 5-degree-of-freedom (5 DOF) orientation data. The approach uses a quaternion rotation method to track the jaw during speech with a single sensor on the mandibular incisor. Data were collected using the NDI Wave Speech Research System for one pilot subject performing various speech tasks. The degree of jaw rotation obtained with the proposed approach is compared with that from a traditional geometric calculation. Results show that the quaternion-based method is able to describe the jaw angle trajectory and gives a more accurate and smoother estimate of jaw kinematics.
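The general idea of measuring a rotation angle with quaternions can be sketched as follows. This is not the authors' algorithm, which the abstract does not specify; it assumes the sensor orientation is already available as unit quaternions in `(w, x, y, z)` order, and all function names are hypothetical.

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


def q_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def q_from_axis_angle(axis, angle_deg):
    """Unit quaternion for a rotation of angle_deg about the given axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half) / norm
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


def relative_angle_deg(q_ref, q_cur):
    """Angle of the rotation that takes orientation q_ref to orientation q_cur."""
    q_rel = q_mul(q_cur, q_conj(q_ref))
    # abs() picks the shorter of the two equivalent rotations; clamp guards acos.
    w = max(-1.0, min(1.0, abs(q_rel[0])))
    return math.degrees(2.0 * math.acos(w))


# Illustrative values: a rest orientation and a jaw-open orientation about one axis.
q_rest = q_from_axis_angle((1.0, 0.0, 0.0), 5.0)
q_open = q_from_axis_angle((1.0, 0.0, 0.0), 17.0)
angle = relative_angle_deg(q_rest, q_open)
print(round(angle, 6))
```

Working with the relative quaternion avoids differencing individual angles directly, which is one plausible reason a quaternion formulation yields smoother trajectories than a purely geometric calculation.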
A factor analysis (FA) model based on multitask learning (MTL) is developed to characterize the FFT-magnitude feature of complex high-resolution range profiles (HRRP), motivated by the problem of radar automatic target recognition (RATR). The MTL mechanism makes it possible to share information appropriately among samples from different target aspects and to learn the aspect-dependent parameters collectively, thus offering the potential to improve overall recognition performance when the training set is small. In addition, since the noise level of a test sample usually differs from that of the training samples in real applications, a further contribution is that the proposed framework can update the noise level parameter in the FA model to adaptively match that of the received test sample. Efficient inference is performed via variational Bayes (VB) for the proposed hierarchical Bayesian model, and encouraging results are reported on a measured HRRP dataset with a small training set and under low signal-to-noise ratio (SNR) test conditions.
Compressive sensing (CS) holds new promise for the digitization of wideband, frequency-domain sparse signals at sub-Nyquist sampling rates without compromising the reconstruction quality. In this paper, the impact of ...
One of the challenges in electrocardiogram (ECG) signal extraction is additive power line interference (PLI), which contains a 50 Hz or 60 Hz base frequency and several harmonics. A variety of fixed-frequency and adaptive notch filter structures have been proposed for power line interference suppression. Because the frequency of power line interference varies, fixed-frequency filters suffer from low noise attenuation when the interference frequencies drift from their nominal values, and they also distort signal amplitude and phase around the interference frequencies because of their relatively wide notch bandwidth. Adaptive filters, on the other hand, incur high computational cost. In this paper we propose a real-time, low-computational-cost adaptive notch filter using a phase-locked loop (PLL). Experimental results show that in extremely noisy environments, where the input SNR is around -20 dB, the average SNR enhancement of the proposed method is approximately 40 dB. The enhancement is still about 35 dB for an input SNR of -10 dB.
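The fixed-filter trade-off described above can be illustrated with a textbook second-order IIR notch. This sketch is not the proposed PLL-based filter; the transfer function form, the 500 Hz sampling rate, and the pole radii are assumptions chosen for illustration.

```python
import cmath
import math


def notch_gain(freq_hz, notch_hz=50.0, fs_hz=500.0, r=0.99):
    """Magnitude response at freq_hz of the fixed notch

        H(z) = (1 - 2cos(w0) z^-1 + z^-2) / (1 - 2r cos(w0) z^-1 + r^2 z^-2)

    where w0 is the notch frequency in radians per sample and the pole
    radius r (0 < r < 1) sets the notch bandwidth (closer to 1 is narrower).
    """
    w = 2.0 * math.pi * freq_hz / fs_hz
    w0 = 2.0 * math.pi * notch_hz / fs_hz
    z1 = cmath.exp(-1j * w)  # z^-1 evaluated on the unit circle
    z2 = z1 * z1             # z^-2
    num = 1.0 - 2.0 * math.cos(w0) * z1 + z2
    den = 1.0 - 2.0 * r * math.cos(w0) * z1 + r * r * z2
    return abs(num / den)


# Compare a narrow notch (r = 0.99) with a wide one (r = 0.9) at the nominal
# interference frequency, at a 0.5 Hz drift, and at a nearby signal frequency.
for r in (0.99, 0.9):
    for f in (50.0, 50.5, 45.0):
        print(f"r={r} f={f} Hz gain={notch_gain(f, r=r):.3f}")
```

A narrow notch leaves nearby signal content almost untouched but loses attenuation once the interference drifts, while a wide notch tolerates drift at the cost of distorting neighbouring frequencies. Tracking the interference frequency adaptively is the usual way around this trade-off.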
Transform coding is a widely used image compression technique, where entropy reduction can be achieved by decomposing the image over a dictionary which provides compaction. Existing algorithms, such as JPEG and JPEG2000, utilize fixed dictionaries which are shared by the encoder and decoder. Recently, works utilizing content-specific dictionaries show promising results by focusing on specific classes of images and using highly specialized dictionaries. However, such approaches lose the ability to compress arbitrary images. In this paper we propose an input-adaptive compression approach, which encodes each input image over a dictionary specifically trained for it. The scheme is based on the sparse dictionary structure, whose compact representation allows relatively low-cost transmission of the dictionary along with the compressed data. In this way, the process achieves both adaptivity and generality. Our results show that although this method involves transmitting the dictionary, it remains competitive with the JPEG and JPEG2000 algorithms.
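Decomposing a signal over a dictionary, as described above, can be sketched with plain matching pursuit. This is a generic greedy sparse-coding routine under the assumption of unit-norm atoms, not the authors' sparse-dictionary training or coding scheme; the names `matching_pursuit`, `atoms`, and `n_nonzero` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse approximation of signal over unit-norm atoms.

    Returns (coeffs, residual), where coeffs maps atom index to coefficient.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_nonzero):
        # Correlate the current residual with every atom and take the best match.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if scores[best] == 0.0:
            break  # nothing left that the dictionary can explain
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Toy dictionary: the standard basis of R^3 (illustration only).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -2.0], atoms, n_nonzero=2)
print(coeffs, residual)
```

In a transform coder only the few selected indices and coefficients need to be transmitted, which is where the compaction comes from. An input-adaptive scheme additionally has to send the dictionary itself, hence the emphasis on a compactly representable dictionary.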