In speech recognition, acoustic modeling requires a very large number of transcribed samples, and transcription is time-consuming and costly. To reduce this labor-intensive effort, Active Learning (AL) is adopted for speech recognition, so that only the most informative training samples are selected for manual annotation. In this paper, we propose a novel active learning method for Chinese acoustic modeling. Our active learning system selects the initial training set based on Kullback-Leibler Divergence (KLD) and evaluates samples using multi-level confusion networks. Our experiments show that the proposed method achieves satisfactory performance.
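The abstract does not say how KLD drives the initial selection, so the following is only a minimal sketch under an assumption: utterances are greedily picked so that the pooled distribution of their acoustic units stays close, in KL divergence, to the distribution of the whole corpus. The names `kl_divergence`, `normalize`, and `greedy_initial_set` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as {unit: probability} dicts."""
    return sum(pv * math.log(pv / max(q.get(k, 0.0), eps))
               for k, pv in p.items() if pv > 0)

def normalize(counts):
    """Turn raw unit counts into a probability distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def greedy_initial_set(utterances, budget):
    """Assumed strategy (not the paper's): greedily add the utterance that keeps
    the pooled unit distribution closest to the corpus-wide distribution.
    Each utterance is a non-empty list of unit labels."""
    target = normalize(Counter(u for utt in utterances for u in utt))
    chosen, pooled = [], Counter()
    remaining = list(range(len(utterances)))
    for _ in range(min(budget, len(remaining))):
        best = min(
            remaining,
            key=lambda i: kl_divergence(target, normalize(pooled + Counter(utterances[i]))),
        )
        chosen.append(best)
        pooled += Counter(utterances[best])
        remaining.remove(best)
    return chosen

corpus = [["a", "a"], ["a", "b"], ["b", "b"], ["a", "a"]]
print(greedy_initial_set(corpus, budget=2))
```

The greedy loop is quadratic in the number of utterances, which is acceptable for a small seed set but would need a cheaper approximation on a large corpus.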
In this paper, a non-linear adaptive control method based on SO(3) for quadrotor attitude tracking is proposed. Distinct from other control methods on Euclidean space, the proposed controller is developed on SO(3)...
Besides their decorative purposes, vehicle manufacturer logos can provide rich information for vehicle verification and classification in many applications such as security and information ***, unlike the license plate, which is designed for identification purposes, vehicle manufacturer logos are mainly designed for decorative purposes such that they might lack discriminative features ***, in practical applications, the vehicle manufacturer logos captured by a fixed camera vary in *** these reasons, detection and recognition of vehicle manufacturer logos are very challenging but crucial problems to *** this paper, based on preparatory work on logo localization and image segmentation, we propose a size-self-adaptive method to recognize vehicle manufacturer logos based on feature extraction and support vector machine (SVM) *** experimental results demonstrate that the proposed method is more effective and robust in dealing with the recognition problem of vehicle logos in different ***, it performs well in both precision and speed.
In this paper, we present HCL2000, a large-scale off-line handwritten Chinese character database, which will be made publicly available to the research community. The database contains 3,755 frequently used simplified Ch...
ISBN: 9781509012473 (print)
This paper presents an audio event classification algorithm which automatically classifies an audio event as footstep, glass breaking, gunshot or scream, mainly for surveillance ***, the Gabor feature of the audio spectrogram is extracted; there are two kinds of Gabor features, namely the global Gabor feature and the local Gabor *** we use Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA) to compress the feature dimension, and finally the K nearest neighbor (KNN) classifier is used to recognize audio *** carried out extensive experiments on the clean and noisy audio *** results demonstrate that the algorithm achieves a recall of 96.1% on clean sets and is more effective than traditional methods.
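The final classification stage named above can be illustrated with a small sketch. It covers only the K nearest neighbor vote and assumes the feature vectors have already been reduced (the Gabor extraction, PCA, and LDA steps are omitted). The function name `knn_classify`, the Euclidean distance, and the toy 2-D features are assumptions for illustration, not details from the paper.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; Euclidean distance
    is an assumed choice, since the abstract does not name a metric."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy, made-up 2-D features standing in for compressed Gabor features.
train = [
    ((0.0, 0.0), "footstep"),
    ((0.0, 1.0), "footstep"),
    ((5.0, 5.0), "scream"),
    ((5.0, 6.0), "scream"),
    ((6.0, 5.0), "scream"),
]
print(knn_classify(train, (0.2, 0.1), k=3))
```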
Image classification is a fundamental problem in computer vision and pattern recognition. Feature extraction is often regarded as the key to classifying images. Traditional approaches rely heavily on handcrafted features, ...
There are two important research topics in the field of Music Information Retrieval (MIR). One is how to improve the robustness of features and the other is how to speed up the retrieval process. This paper improved t...
ISBN: 9781509012473 (print)
This paper presents a novel humming feature extraction algorithm based on locality statistical analysis to tackle the problem of the instability of humming features in the query by humming (QBH) *** carrying out statistics to humming notes sequences in both longitudinal vocal range distribution and horizontal temporal variation distribution, we can obtain the locality statistical humming *** we concatenate several features using the idea of N-gram to improve feature *** the framework of QBH based on Locality Sensitive Hashing (LSH), the proposed method achieves an 86% top-1 rate and a 92% top-5 rate in the experiment, indicating the effectiveness of the method.
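The N-gram concatenation step can be sketched as a sliding window over consecutive per-segment feature vectors. This is one plausible reading, not the paper's stated procedure; the function name `ngram_concat` and the window size are hypothetical.

```python
def ngram_concat(features, n=3):
    """Concatenate every run of n consecutive feature vectors into one vector.

    Assumed reading of "the idea of N-gram": each output vector covers a
    sliding window of n neighboring segments."""
    windows = []
    for start in range(len(features) - n + 1):
        joined = []
        for vector in features[start:start + n]:
            joined.extend(vector)
        windows.append(joined)
    return windows

print(ngram_concat([[1, 2], [3, 4], [5, 6], [7, 8]], n=2))
```

Longer windows carry more context per vector, at the cost of fewer and higher-dimensional vectors to hash.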
Speaker adaptive test normalization (ATnorm) is the most effective of the widely used score normalization approaches in text-independent speaker verification. It selects speaker adaptive impostor cohorts using an extra development corpus in order to enhance recognition performance. In this paper, an improved implementation of ATnorm that offers significant overall advantages over the original ATnorm is presented. This method adopts a novel cross similarity measurement for speaker adaptive cohort model selection and does not need an extra development corpus. It achieves performance comparable to the original ATnorm and moderately reduces the computational complexity. By making full use of the saved extra development corpus, the overall system performance can be improved significantly. Results are presented on the NIST 2006 Speaker Recognition Evaluation data corpora, where this method provides significant improvements in system performance, with overall relative gains of 14.4% on equal error rate (EER) and 14.6% on decision cost function (DCF).
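For context, the normalization that ATnorm builds on can be sketched as standard test normalization: a raw verification score is shifted and scaled by the mean and standard deviation of the scores that the same test utterance obtains against a cohort of impostor models. The sketch below assumes the cohort has already been chosen; the adaptive cohort selection and the cross similarity measurement from the abstract are not reproduced, and the name `tnorm` is hypothetical.

```python
from statistics import mean, pstdev

def tnorm(raw_score, cohort_scores):
    """Test-normalize a score against impostor cohort scores.

    `cohort_scores` are the scores of the same test utterance against the
    selected impostor models (cohort selection is assumed to be done elsewhere)."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma

print(tnorm(1.0, [1.0, 3.0]))
```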