Speaker adaptive test normalization (ATnorm) is the most effective of the widely used score normalization approaches in text-independent speaker verification. It selects speaker-adaptive impostor cohorts using an extra development corpus in order to enhance recognition performance. In this paper, an improved implementation of ATnorm is presented that offers significant overall advantages over the original ATnorm. The method adopts a novel cross-similarity measurement for speaker-adaptive cohort model selection and needs no extra development corpus. It achieves performance comparable to the original ATnorm and moderately reduces computational complexity. By making full use of the development corpus that is saved, overall system performance can be improved significantly. Results are presented on the NIST 2006 Speaker Recognition Evaluation corpora, where the method provides significant improvements in system performance, with overall relative gains of 14.4% in equal error rate (EER) and 14.6% in decision cost function (DCF).
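As a rough illustration of the general idea in the abstract above, the sketch below standardizes a trial score against the scores of a speaker-specific impostor cohort. It is not the paper's implementation: the abstract does not define the cross-similarity measurement, so the similarity values are simply supplied as inputs, and the function names `select_cohort` and `tnorm`, the top-k selection rule, and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def select_cohort(similarities, k):
    """Return indices of the k cohort models most similar to the target speaker.

    The similarity values are taken as given; how they are computed
    (the paper's cross-similarity measurement) is not specified here.
    """
    ranked = sorted(range(len(similarities)),
                    key=lambda i: similarities[i], reverse=True)
    return ranked[:k]


def tnorm(raw_score, cohort_scores):
    """Standardize a trial score with the mean and std of the cohort scores."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


# Toy data (hypothetical): similarity of each cohort model to the target
# speaker, and the score of the test utterance against each cohort model.
similarities = [0.1, 0.9, 0.4, 0.8]
cohort_scores_all = [1.0, 2.0, 0.5, 4.0]

idx = select_cohort(similarities, k=2)
cohort = [cohort_scores_all[i] for i in idx]
normalized = tnorm(5.0, cohort)
print(idx, normalized)
```

In this sketch the cohort is re-selected per target speaker, which is the "adaptive" part; the normalization itself is the usual subtraction of the cohort mean followed by division by the cohort standard deviation.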
Traditional multi-class classification methods based on the Fisher kernel combine generative models, such as Gaussian mixture models (GMMs), of all the classes ***, the combination generates high-dimensional feature vectors and leads to large *** this paper, a new classification method is *** method adopts an intelligent feature space selection strategy by clustering similar Gaussian mixtures in order to reduce the feature *** classification experiments show that the proposed method is more accurate and effective, with less computation, than traditional methods.
To solve the frame delay problem and match the previous frame, Plapous et al. [IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(6): 2098–2108] introduced a novel approach called the two-step noise reduction (TSNR) technique to improve the performance of the speech enhancement ***, the TSNR approach results in spectral peaks of short duration and the broken spectral outlier, which degrade the spectral characteristics of the *** solve this problem, a cepstral smoothing step is added in order to remove the spectral peaks introduced by TSNR *** analysis shows that the proposed approach can effectively smooth the spectral peaks and keep the spectral outlier so as to protect the speech *** results also show that the proposed approach can bring significant improvement compared to the decision-directed (DD) and TSNR approaches, especially in non-stationary noisy environments.
Open relation extraction is the task of extracting relational facts without pre-defined relation types from open-domain corpora. However, since there are some hard or semi-hard instances sharing similar context and entit...
Zero-shot relation extraction aims to identify novel relations which cannot be observed at the training stage. However, it still faces some challenges since the unseen relations of instances are similar or the input s...
Active Learning (AL) is designed to aid the labor-intensive process of training acoustic models for speech recognition. In AL, only the most informative training samples are selected for manual annotation. Thus, how to...
Supervised open relation extraction aims to discover novel relations by leveraging supervised data of pre-defined relations. However, most existing methods do not achieve effective knowledge transfer from pre-defined ...
Current algorithms for mining alarm association rules are limited by the minimal support, so they can only obtain association rules among frequently occurring alarm events. Furthermore, the rules...
ISBN: 9781424472369 (print)
Most previous approaches to automatic audio event (AE) annotation are based on supervised learning, which relies on the availability of a labeled corpus to train classification models. However, instance annotation is often difficult, expensive, and time-consuming. In this paper, we apply semi-supervised learning with the transductive Support Vector Machine (TSVM) algorithm to automatic AE annotation. In addition, because outliers degrade generalization and classification performance, we propose a confidence-based method for sample selection. In our experiments on the melodrama Friends corpus, the proposed method effectively uses unlabeled data to improve classification performance with only a small amount of labeled data.
In this paper, a novel sparse feature representation method for object tracking is proposed. The method is based on the observation that a tracked object can be dynamically and compactly represented by a few features (spars...