Speech recognition systems are usually trained on very large numbers of transcribed utterances, and preparing this training data is time-consuming and costly. To reduce the number of training examples that must be labeled, active learning is applied to acoustic modeling for speech recognition. This learning scheme iteratively inspects the unlabeled samples, selects the most informative ones according to a given criterion, annotates them, and adds the newly transcribed samples to the training set to update the acoustic model. Because the selection criterion is central to this process, we propose a confidence measure computed from a confusion network and use it as the sample-selection criterion to improve the efficiency of active learning in acoustic modeling. Our experiments show that active learning with the proposed confidence measure reduces the amount of labeled data by up to 31% compared with random selection.
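The selection step of such an active learning loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-utterance confidence scores are assumed to be already available (the abstract derives them from a confusion network, which is not reproduced here), and the names `select_least_confident`, `batch_size`, and the toy `pool` are hypothetical.

```python
def select_least_confident(confidences, batch_size):
    """Return the ids of the batch_size utterances with the lowest confidence.

    confidences: dict mapping utterance id -> confidence score
    (assumed given; hypothetical input format).
    """
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    # Sort by confidence, breaking ties by id so the result is deterministic.
    ranked = sorted(confidences.items(), key=lambda item: (item[1], item[0]))
    return [utt_id for utt_id, _ in ranked[:batch_size]]


# Toy pool of unlabeled utterances with made-up confidence scores.
pool = {"utt1": 0.92, "utt2": 0.41, "utt3": 0.67, "utt4": 0.15}
to_annotate = select_least_confident(pool, 2)
```

In a full loop, the selected utterances would be transcribed, added to the training set, and the acoustic model retrained before the next round of selection.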
ISBN: 9781612843483 (print)
Automatic song identification has long been a research focus. In this paper, we propose a novel hierarchical filtering method based on structural fingerprints. It consists of two parts: the generation of fingerprints that carry long-term structural information and have a low collision rate, and an efficient search algorithm based on a set of selective two-level filters. Experiments on a database of 10,000 songs show that our approach is sufficiently fast and achieves 99.7% accuracy on 5-second clips at an SNR of 0 dB, comparable to the state of the art.
Differences between speakers blur the semantic information in speech and cause a mismatch between training and decoding, which degrades speech recognition performance. This paper presents a novel method that combines subglottal, glottal, and supraglottal parameters using Linear Discriminant Analysis (LDA). Estimating the warping factor from multiple parameters with LDA extracts more stable information about individual differences. Experimental results show that the proposed algorithm outperforms conventional methods.
An accuracy assessment method that integrates segmentation and classification accuracy is proposed to meet the requirements of object-based image analysis. Segmentation errors are measured by establishing the relation...
In the design of brain-computer interface systems, classification of Electroencephalogram (EEG) signals is the essential part and a challenging task. Recently, as the marginalized discrete wavelet transform (mDWT) rep...
In many multi-camera surveillance systems, there is a need to identify whether a captured person has appeared before elsewhere in the camera network. This is the person re-identification problem. In this paper, we propose...
To reduce the labeling workload required before estimating certain color distributions, integrative labeling is introduced: the annotator only decides whether a picture contains positive-class regions, and all pixels of the picture are then treated as positive-class or negative-class training samples accordingly. Integrative labeling, however, heavily contaminates the training samples. Traditional generative density estimation methods therefore cannot be used directly, because they perform poorly on heavily polluted training samples. In this paper, by exploiting the prior knowledge that the positive-class and negative-class color distributions are highly separable, we propose a discriminative-learning-based GMM (DiscGMM) for integrative labeling. The optimal parameters found by DiscGMM generate the polluted positive-class samples with comparatively high probability while generating negative-class samples with comparatively low probability. The parameter learning problem is solved with a modified Expectation Maximization (EM) algorithm. In an integrative labeling experiment on skin detection, DiscGMM performs much better than generative density estimation methods and produces satisfactory results.
Local Binary Pattern (LBP) is a powerful texture descriptor owing to its tolerance to illumination changes and its computational simplicity. The basic LBP encodes 256 feature patterns in a 3×3 neighborhood, but not all of these patterns are effective for classification. In this paper, we propose a simplified LBP (S-LBP) that produces optimal patterns by using the best coding principle for classification. We also combine S-LBP with the Mahalanobis distance to solve the practical problem of character recognition on Chinese license plates. Experimental results demonstrate the effectiveness of our method for license plate recognition compared with other popular methods.
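For reference, the basic LBP mentioned above (not the proposed S-LBP, whose coding rule the abstract does not specify) can be sketched as below. Each of the 8 neighbors in the 3×3 window contributes one bit, which gives the 256 possible patterns. The neighbor ordering, the `>=` threshold convention, and the names `NEIGHBOR_OFFSETS` and `lbp_code` are assumptions made for this illustration.

```python
# Neighbor offsets (row, col) in clockwise order starting at the top-left.
# The ordering is a convention chosen for this sketch.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Basic 8-neighbor LBP code (0..255) of the pixel at (row, col).

    image is a list of equal-length rows of gray levels; (row, col) must
    not lie on the image border.
    """
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        # A neighbor at least as bright as the center contributes a 1 bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


# Toy 3x3 gray-level patch; the code is computed for its center pixel.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch, 1, 1)
```

Because each bit depends only on the sign of the difference to the center, adding a constant to every gray level leaves the code unchanged, which illustrates the tolerance to illumination changes.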
Weblogs are widely used, and the number of users is increasing rapidly. Weblogs reflect every aspect of society, such as politics, economy, and culture, so research on topic relevance retrieval for weblogs has become necessary. Because the corpus contains a lot of noise and an appropriate query is usually difficult to obtain, common methods sometimes fail to reach acceptable precision. We design a Modified Topic Relevance Retrieval System (MTRRS) comprising query formulation and a combination model. Queries are designed using manual adjustment and machine learning. During machine learning, we define a center word list that helps generate a novel distance feature. Query formulation improves MAP by 22.97%. Combining the results of the document retrieval model and the passage retrieval model yields a 33.55% increase in MAP. With the combination model, the retrieval result of the semi-machine-learning query also closely approaches that of the manually adjusted query.
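The MAP (mean average precision) metric cited above can be computed as in the following sketch. It uses the standard definition of average precision. The function names and the toy relevance lists are hypothetical, and `relative_gain_percent` assumes the reported percentages are relative gains over a baseline, which the abstract does not state.

```python
def average_precision(ranked_relevance, num_relevant):
    """Average precision for one query.

    ranked_relevance: list of booleans, True where the document at that
    rank is relevant. num_relevant: total relevant documents for the query.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant documents.
            precision_sum += hits / rank
    return precision_sum / num_relevant


def mean_average_precision(queries):
    """Mean of average precision over (ranked_relevance, num_relevant) pairs."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)


def relative_gain_percent(baseline, improved):
    """Relative improvement of improved over baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Two toy queries with made-up relevance judgments.
toy_queries = [([True, False, True], 2), ([False, True], 1)]
toy_map = mean_average_precision(toy_queries)
```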
With the popularity of MMS, the number of multimedia messages containing sensitive information is increasing rapidly. This paper presents a novel framework for filtering Chinese sensitive text in MMS images. An effective method is applied to detect and filter sensitive text in the images of multimedia messages, which at present can easily be transmitted through the mobile communication network without being monitored. Sensitive text is detected and recognized using SIFT features, which suit the characteristics of text in multimedia message images and yield accurate results. The method has good practical application value.