The paper presents a method for the automatic enrichment of a very large dictionary of word combinations. The method is based on the results of automatic syntactic analysis (parsing) of sentences. The dependency formalism is u...
Much is known about the design of automated systems to search broadcast news, but it has only recently become possible to apply similar techniques to large collections of spontaneous speech. This paper presents initial results from experiments with speech recognition, topic segmentation, topic categorization, and named entity detection using a large collection of recorded oral histories. The work leverages a massive manual annotation effort on 10 000 h of spontaneous speech to evaluate the degree to which automatic speech recognition (ASR)-based segmentation and categorization techniques can be adapted to approximate decisions made by human annotators. ASR word error rates near 40% were achieved for both English and Czech for heavily accented, emotional and elderly spontaneous speech based on 65-84 h of transcribed speech. Topical segmentation based on shifts in the recognized English vocabulary resulted in 80% agreement with manually annotated boundary positions at a 0.35 false alarm rate. Categorization was considerably more challenging, with a nearest-neighbor technique yielding F = 0.3. This is less than half the value obtained by the same technique on a standard newswire categorization benchmark, but replication on human-transcribed interviews showed that ASR errors explain little of that difference. The paper concludes with a description of how these capabilities could be used together to search large collections of recorded oral histories.
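The categorization result is the most concretely specified of the experiments above. As an illustration of the kind of nearest-neighbor categorization evaluated there, the sketch below classifies TF-IDF-vectorized text by cosine-nearest neighbor and scores it with a macro-averaged F-measure; the toy corpus, labels, and parameters are placeholders, not the paper's data or setup.

```python
# Hypothetical sketch of nearest-neighbor text categorization; the
# corpus, labels, and parameters here are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score

# Toy stand-ins for ASR transcripts of interview segments and their
# manually assigned categories.
train_texts = ["ghetto deportation 1942", "liberation by soviet troops",
               "hiding in the countryside", "camp arrival selection"]
train_labels = ["deportation", "liberation", "hiding", "camps"]
test_texts = ["deported from the ghetto", "liberated by the army"]
test_labels = ["deportation", "liberation"]

# Represent each segment as a TF-IDF vector and classify by its
# nearest neighbor under cosine similarity.
vec = TfidfVectorizer()
X_train = vec.fit_transform(train_texts)
X_test = vec.transform(test_texts)

knn = KNeighborsClassifier(n_neighbors=1, metric="cosine")
knn.fit(X_train, train_labels)
pred = knn.predict(X_test)

# Macro-averaged F-measure, the kind of score the abstract reports (F = 0.3).
print(f1_score(test_labels, pred, average="macro"))
```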
ISBN (print): 0262025507
Forward decoding kernel machines (FDKM) combine large-margin classifiers with hidden Markov models (HMM) for maximum a posteriori (MAP) adaptive sequence estimation. State transitions in the sequence are conditioned on observed data using a kernel-based probability model trained with a recursive scheme that deals effectively with noisy and partially labeled data. Training over very large datasets is accomplished using a sparse probabilistic support vector machine (SVM) model based on quadratic entropy, and an on-line stochastic steepest descent algorithm. For speaker-independent continuous phone recognition, FDKM trained over 177,080 samples of the TIMIT database achieves 80.6% recognition accuracy over the full test set, without use of a prior phonetic language model.
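As a rough illustration of forward decoding with data-conditioned transitions, the numpy sketch below runs the forward recursion alpha_t(i) = sum_j alpha_{t-1}(j) P(i | j, x_t) and takes the MAP state at each step. The paper's trained kernel/SVM probability model is replaced by a hypothetical softmax over linear scores, so W, X, and all dimensions are illustrative only.

```python
# Minimal numpy sketch of the forward decoding step used by FDKM-style
# models; the transition model here is a stand-in, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
S, T, D = 3, 5, 4              # states, time steps, feature dimension
X = rng.normal(size=(T, D))    # observation sequence (e.g. acoustic features)
W = rng.normal(size=(S, S, D)) # one score vector per (prev state, next state)

def transition_probs(x, j):
    """P(s_t = i | s_{t-1} = j, x_t): softmax over scores, standing in
    for the trained kernel-based probability model."""
    scores = W[j] @ x
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Forward recursion: alpha[t, i] = sum_j alpha[t-1, j] * P(i | j, x_t).
alpha = np.full(S, 1.0 / S)    # uniform prior over initial states
path = []
for t in range(T):
    alpha = sum(alpha[j] * transition_probs(X[t], j) for j in range(S))
    alpha /= alpha.sum()       # renormalize for numerical stability
    path.append(int(alpha.argmax()))  # MAP state estimate at time t

print(path)
```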
In this paper we present an automated processing pipeline that, from a sequence of images, reconstructs a 3D model. The approach is particularly flexible as it can deal with a hand-held camera without the need for an a priori calibration or explicit knowledge about the recorded scene. In a first stage, features are extracted and tracked throughout the sequence. Using robust statistics and multiple view relations, the 3D structure of the observed features and the camera motion and calibration are computed. In a second stage, stereo matching is used to obtain a detailed estimate of the geometry of the observed scene. The presented approach integrates state-of-the-art algorithms developed in computer vision, computer graphics and photogrammetry. Due to its flexibility during image acquisition, this approach is particularly well suited for application in the field of archaeology and architectural conservation.
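A hedged sketch of the pipeline's first stage (feature tracking and robust motion estimation) using OpenCV follows. The frame file names and the camera intrinsics K are assumptions; unlike the paper's system, this sketch does not self-calibrate.

```python
# Illustrative first stage: track features across two consecutive frames
# and robustly estimate relative camera motion. Files and K are placeholders.
import cv2
import numpy as np

img1 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Detect corners in the first frame and track them into the second.
pts1 = cv2.goodFeaturesToTrack(img1, maxCorners=500, qualityLevel=0.01,
                               minDistance=7)
pts2, status, _ = cv2.calcOpticalFlowPyrLK(img1, img2, pts1, None)
good1 = pts1[status.ravel() == 1]
good2 = pts2[status.ravel() == 1]

# Robustly estimate the essential matrix (RANSAC discards outlier tracks),
# then recover the relative rotation R and translation t.
K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])  # assumed intrinsics
E, inliers = cv2.findEssentialMat(good1, good2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, good1, good2, K, mask=inliers)
print(R, t)
```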
We previously proposed (Kamm and Meyer, 2001, 2002) a two-pronged approach to improving system performance through the selective use of training data. We demonstrated a sentence-selective algorithm that, first, made effective use of the available manually transcribed training data and, second, focused future human transcription effort on data that was more likely to improve system performance. We now extend that algorithm to focus on word selection, and demonstrate that we can reduce the error rate from 10.3% to 9.3% on a simple, 36-word corpus by selecting 30% (15 hours) of the 50 hours of training data available in this corpus, without knowledge of the true transcription. We also discuss the application of our word selection algorithm to the Wall Street Journal 5K-word task. Preliminary results show that we can select up to 60% (48 hours) of the training data, with minimal knowledge of the true transcription, and match or beat the error rate of a system built using the same amount of randomly selected training data.
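As a sketch of the selection idea under stated assumptions: rank utterances by the current recognizer's confidence and send only the least confident fraction for human transcription. The budget and scores below are illustrative, not the paper's actual selection criterion.

```python
# Illustrative confidence-based selection of training data for transcription;
# the scoring and the 30% budget are placeholder assumptions.
def select_for_transcription(utterances, confidences, budget=0.30):
    """Return the fraction `budget` of utterances with lowest confidence."""
    ranked = sorted(zip(confidences, utterances))   # low confidence first
    n_pick = int(len(utterances) * budget)
    return [utt for _, utt in ranked[:n_pick]]

# Toy usage: four utterances with recognizer confidence scores.
utts = ["utt_001", "utt_002", "utt_003", "utt_004"]
conf = [0.92, 0.41, 0.77, 0.55]
print(select_for_transcription(utts, conf, budget=0.5))  # ['utt_002', 'utt_004']
```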
This paper presents a comprehensive empirical exploration and evaluation of a diverse range of data characteristics which influence word sense disambiguation performance. It focuses on a set of six core supervised alg...
Classifier combination is an effective and broadly useful method of improving system performance. This article investigates in depth a large number of both well-established and novel classifier combination approaches ...
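For readers unfamiliar with the basic idea, here is a minimal sketch of one classic combination strategy, simple per-instance majority voting; the article itself investigates many richer approaches.

```python
# Minimal sketch of classifier combination by majority vote; illustrative only.
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by per-position majority vote."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

# Three classifiers' predicted labels for four test instances.
preds = [["a", "b", "a", "c"],
         ["a", "b", "b", "c"],
         ["b", "b", "a", "a"]]
print(majority_vote(preds))  # ['a', 'b', 'a', 'c']
```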
This paper presents a method for inducing translation lexicons between two distant languages without the need for either parallel bilingual corpora or a direct bilingual seed dictionary. The algorithm successfully com...