ISBN (print): 9781424472369
Most previous approaches to automatic audio event (AE) annotation are based on supervised learning, which relies on the availability of a labeled corpus to train classification models. However, instance annotation is often difficult, expensive, and time-consuming. In this paper, we apply semi-supervised learning with the transductive Support Vector Machine (TSVM) algorithm to automatic AE annotation. In addition, because outliers degrade generalization and classification performance, we propose a confidence-based method for sample selection. In our experiments on a corpus drawn from the TV series Friends, the proposed method effectively uses unlabeled data to improve classification performance with only a small amount of labeled data.
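The abstract does not say how confidence is measured, so the following is only a minimal sketch of one plausible form of confidence-based sample selection. It assumes each unlabeled sample already has a signed decision score from some binary classifier, and it keeps only samples whose absolute score reaches a threshold. The function name `select_confident` and the `threshold` parameter are hypothetical, and this is not the TSVM training procedure itself.

```python
from typing import List, Tuple


def select_confident(scores: List[float], threshold: float = 1.0) -> List[Tuple[int, int]]:
    """Return (index, pseudo_label) pairs for confidently scored samples.

    Assumption: `scores` are signed decision values from a binary classifier,
    and |score| is used as the confidence. Low-confidence samples (possible
    outliers or ambiguous cases) are skipped.
    """
    selected = []
    for index, score in enumerate(scores):
        if abs(score) >= threshold:
            pseudo_label = 1 if score > 0 else -1
            selected.append((index, pseudo_label))
    return selected


if __name__ == "__main__":
    # Hypothetical decision scores for five unlabeled samples.
    print(select_confident([2.3, -0.2, -1.7, 0.4, 1.0]))
```

Raising the threshold admits fewer but more reliable pseudo-labeled samples, and lowering it admits more samples at the risk of including outliers.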
Density estimation via Gaussian mixture modeling has been successfully applied to image segmentation, speech processing, and other fields related to clustering analysis and probability density function (PDF) modeling. A finite Gaussian mixture model is usually used in practice, and selecting the number of mixture components is a significant problem in its application. For example, in image segmentation, it corresponds to the number of segmentation regions. Determining the optimal model order has therefore attracted wide attention. This paper proposes a degenerating model algorithm that simultaneously selects the optimal number of mixture components and estimates the parameters of the Gaussian mixture model. Unlike traditional model order selection methods, it does not need to select the optimal number of components from a set of candidate models. Based on an investigation of the properties of elliptically contoured distributions in generalized multivariate analysis, it selects the correct model order in a different way that requires fewer operations and is less sensitive to the initial values of EM. The experimental results show the effectiveness of the algorithm.
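For reference, the standard finite Gaussian mixture density that the abstract refers to can be written as follows. The symbols $K$, $\pi_k$, $\mu_k$, and $\Sigma_k$ are conventional notation assumed here, not taken from the paper. The "model order" is the number of components $K$.

```latex
p(x \mid \Theta) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,
\qquad \Theta = \{\pi_k, \mu_k, \Sigma_k\}_{k=1}^{K}
```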
In this paper, we present a fused feature composed of Affine-SIFT, MSER, and color moment invariants. The fused feature is more robust and distinctive than a single local feature. Instead of adding three local ...
In this paper, we present a novel scheme for near-duplicate image detection. Given two input images, the algorithm uses a refined similarity measure to judge whether they are duplicates. The two images are represented with a local feature (i.e., Affine-SIFT) in a bag-of-features model. Affine-SIFT can tolerate larger affine distortions than Hessian-Affine and MSER (Maximally Stable Extremal Regions). The refined similarity measure exploits the spatial information between the two images. The algorithm is demonstrated on image pairs with scale change, viewpoint change, blur, noise, and spatial deformation. The experimental results show that the proposed algorithm is more effective than other state-of-the-art duplicate image detection algorithms.
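To make the bag-of-features representation concrete, here is a minimal sketch of a baseline comparison between two images. It assumes local descriptors have already been quantized into visual-word IDs, and it uses plain cosine similarity between word histograms. This is a generic baseline only: it does not include the spatial information that the paper's refined similarity measure exploits, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def bof_histogram(word_ids: Sequence[int]) -> Counter:
    """Count how often each visual word occurs in one image."""
    return Counter(word_ids)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two visual-word histograms (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical visual-word IDs for two images.
    image_1 = bof_histogram([3, 3, 7, 12, 12, 12])
    image_2 = bof_histogram([3, 7, 7, 12, 12, 40])
    print(cosine_similarity(image_1, image_2))
```

A pair would be flagged as near-duplicate when the similarity exceeds a threshold chosen on validation data.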
Active Learning (AL) is designed to aid the labor-intensive process of training acoustic models for speech recognition. In AL, only the most informative training samples are selected for manual annotation. Thus, how to evaluate the unlabeled samples is worth researching. In this paper, we propose a unified framework to generate confusion networks at multiple levels, including character, syllable, and phone, and present a novel active learning sample evaluation method for Chinese acoustic modeling in which posterior probabilities obtained from the multi-level confusion networks are each used to evaluate the unlabeled samples. Our experiments show that, compared with the widely used sample evaluation method based on word posterior probabilities obtained from a word confusion network, the proposed method achieves satisfactory performance.
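As an illustration of posterior-based sample evaluation, the sketch below scores an utterance by the mean of the top posterior in each confusion-network slot and selects the least confident utterances for annotation. The scoring rule, the data layout (a list of slots, each mapping a unit such as a character, syllable, or phone to its posterior), and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List

# Assumed layout: one dict per slot, mapping candidate unit -> posterior.
ConfusionNetwork = List[Dict[str, float]]


def utterance_confidence(network: ConfusionNetwork) -> float:
    """Mean of the best posterior in each slot (slots are assumed non-empty)."""
    if not network:
        return 0.0
    return sum(max(slot.values()) for slot in network) / len(network)


def select_for_annotation(networks: Dict[str, ConfusionNetwork], n: int) -> List[str]:
    """Return the IDs of the n least confident utterances."""
    ranked = sorted(networks, key=lambda uid: utterance_confidence(networks[uid]))
    return ranked[:n]


if __name__ == "__main__":
    # Hypothetical single-slot networks for three utterances.
    pool = {
        "utt_a": [{"x": 0.9, "y": 0.1}],
        "utt_b": [{"x": 0.5, "y": 0.5}],
        "utt_c": [{"x": 0.7, "y": 0.3}],
    }
    print(select_for_annotation(pool, 2))
```

The same scoring could be applied separately to networks built at each level, which matches the abstract's statement that each level's posteriors are used to evaluate the samples.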
Millions of users around the world contribute rich information to blogs as the blogosphere develops. However, little work has been done on blog extraction so far. Unlike the traditional template-dependent wrapper, the template-independent wrapper in this paper extracts not only blog articles but also the blogroll. In our method, blog extraction is formalized as a machine learning problem, and a template-independent wrapper is learned from labeled blog pages from a single site. Test pages are obtained from 10 popular Chinese blog sites. Experimental results on 300 real blog pages indicate that the proposed method correctly extracts data from blogs with an accuracy of 90% or higher.
This paper presents an automatic face replacement approach for video based on a 2D morphable model. Our approach includes three main modules: face alignment, face morphing, and face fusion. Given a source image and target v...
A novel authentication watermarking scheme for images is proposed in this paper, which achieves accurate localization and high security at the same time. In the scheme, different keys are selected for different host data, an...
ISBN (print): 9781424453979
Cross-document coreference resolution plays an important part in the field of natural language processing (NLP). It captures the ability to gather documents containing information about a certain entity. Most previous algorithms identify the underlying entity of a given document from the original text, which is unreliable if the original text contains multiple parts on different themes. In this paper, we propose a cross-document coreference resolution algorithm based on an automatic text summary instead of the original text. In our approach, we extract a query-specific, informative-indicative summary from the original text using the Hobbs algorithm and measure the similarity between two summaries. This automatic text summary-based cross-document coreference resolution (ATSCDCR) system is effective in disambiguating different entities that share a mention name and in identifying the same entity under different mention names. Our experimental results show that the macro average of the ATSCDCR system reaches 73.16% and the micro average is 67.34%.
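The abstract reports both a macro and a micro average without naming the underlying per-group score. The sketch below shows the general difference between the two, assuming each group (for example, each mention name) contributes a count of correct decisions and a total. The function names and the example counts are hypothetical and are not the paper's data.

```python
from typing import List, Tuple

# Assumed layout: one (correct, total) pair per group, e.g. per mention name.
Counts = List[Tuple[int, int]]


def macro_average(counts: Counts) -> float:
    """Average the per-group ratios, so every group weighs the same."""
    ratios = [correct / total for correct, total in counts if total > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0


def micro_average(counts: Counts) -> float:
    """Pool all decisions first, so larger groups weigh more."""
    total = sum(t for _, t in counts)
    return sum(c for c, _ in counts) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts: a large group and a small group.
    example = [(9, 10), (1, 2)]
    print(macro_average(example), micro_average(example))
```

The two averages diverge when group sizes are uneven, which is one common reason for a system to report a macro figure and a micro figure that differ by several points.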