Current speech understanding systems are typically designed as multistage systems, although this theoretically gives rise to errors due to early decisions. We present a framework that offers the chance of reducing these errors with an integrated system that directly computes a semantic tree representation from the input speech signal through a token-passing-based one-stage decoder, called ODINS. To limit the complexity of ODINS, we represent all a-priori knowledge consistently by a generalized uniform knowledge model based on a hierarchy of probabilistic transition networks, which can also be n-grams. Our framework includes a method to evaluate the system output using an edit-distance-based tree matching algorithm. First experiments quantify and confirm the theoretical advantage of the one-stage strategy over a corresponding two-stage approach.
ISBN: 0780379020 (print)
In this study, we propose an automatic word spacing system for the Korean language that uses syllable n-gram and word statistics extracted from a large amount of processed corpora. The optimal spacing points of a sentence are decided mainly by the Viterbi algorithm. Because the performance of statistical approaches is sensitive to the training corpus and suffers from data sparseness, we enlarged the training corpora, used parameters found by examining test data, and proposed an adjustment of the 'longest match strategy' based on the viable prefix. These measures increase the system's accuracy. Our corpora, covering various language registers, comprise 33,643,884 words. The pilot test was conducted with test data derived from different sources. On average, 94.24% precision in word-unit correction was obtained on the spacing test data.
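The Viterbi-style search for spacing points can be illustrated with a minimal sketch. It is not the authors' system: it assumes a plain word-unigram model instead of the syllable n-gram and word statistics described above, and the names `viterbi_spacing`, `TOY_LOGPROB`, `max_len`, and `unk_logprob` are hypothetical, as are the toy probabilities.

```python
import math

def viterbi_spacing(text, word_logprob, max_len=6, unk_logprob=-20.0):
    """Return the most probable split of `text` into words.

    Assumption: a unigram word model; unknown pieces are penalised
    per character with `unk_logprob` (a hypothetical smoothing choice).
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-prob of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            score = word_logprob.get(piece, unk_logprob * len(piece))
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start
    # Follow the back-pointers from the end of the string.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return list(reversed(words))

# Toy statistics, invented for illustration only.
TOY_LOGPROB = {
    "the": math.log(0.05),
    "cat": math.log(0.01),
    "sat": math.log(0.01),
    "at": math.log(0.02),
}

print(viterbi_spacing("thecatsat", TOY_LOGPROB))
```

Each position keeps only the best-scoring path that ends there, so the search is linear in the sentence length for a fixed `max_len`.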
ISBN: 0780379020 (print)
A novel audio watermarking method with time domain processing is proposed in this paper for copyright protection of audio signals. A scheme of adaptive watermark embedding is used, in which the watermark is a significant binary image that is adaptively embedded according to the amplitude of the audio signal. A novel scheme for detecting the watermark is presented based on linear predictive coding (LPC); it does not use the original signal when extracting the watermark, so that the reliability of detecting watermark is *** results show that the proposed algorithm outperforms the ML's method, that the embedded watermark is imperceptible, and that the algorithm is robust to many attacks, such as low-pass filtering, resampling and so on.
ISBN: 1892512416 (print)
The hallmark of an undergraduate program is a successful co-curricular program. This program may have several forms ranging from a programming team to an active summer research program. The important ingredient in any program is student-faculty interaction in the development of solutions to current computer science problems. This paper explores the use of research in a software engineering course at Gettysburg College. The research involves the use of Java applets and servlets on web pages within a highly personalized knowledge portal as one alternative to improve the cacheability of pages containing dynamic content, and decrease the overall load time for these pages.
The backoff hierarchical class n-gram language models (LMs) are a generalization of the common backoff word n-gram *** to the traditional backoff word n-gram LMs that use the (n-1)-gram to estimate the likelihood of an unseen n-gram event, backoff hierarchical class n-gram LMs use a class hierarchy to define an appropriate context. In this paper, we study the impact of the hierarchy depth on the performance of the approach. Performance is evaluated on several databases such as Switchboard, Call-Home and Wall Street Journal (WSJ). Results show that better improvement is achieved when a shallow word (few levels) tree is *** show up to 26% improvement on the unseen events perplexity and up to 12% improvement in the word error rate (WER).
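The idea of generalising an unseen context through a class hierarchy can be sketched as follows. This is a simplified illustration for bigrams, not the paper's model: it omits the discounting and backoff weights that a properly normalised backoff LM needs, and the `parent` map, the function names, and the toy sentences are all hypothetical.

```python
from collections import Counter

def train(sentences, parent):
    """Count bigrams for each word context and for all its ancestor classes."""
    ctx_next = Counter()   # (context node, next word) -> count
    ctx_total = Counter()  # context node -> count
    for sent in sentences:
        for prev, nxt in zip(sent, sent[1:]):
            node = prev
            while node is not None:
                ctx_next[(node, nxt)] += 1
                ctx_total[node] += 1
                node = parent.get(node)  # move one level up the hierarchy
    return ctx_next, ctx_total

def estimate(prev, nxt, ctx_next, ctx_total, parent):
    """Walk up the hierarchy until some context has seen `nxt`.

    Returns (relative frequency, number of levels climbed).
    Assumption: no discounting, so the values are not a normalised
    probability distribution.
    """
    node, level = prev, 0
    while node is not None:
        if ctx_next[(node, nxt)] > 0:
            return ctx_next[(node, nxt)] / ctx_total[node], level
        node = parent.get(node)
        level += 1
    return 0.0, level

# Hypothetical two-level hierarchy: word -> DAY -> ROOT.
parent = {"monday": "DAY", "tuesday": "DAY", "DAY": "ROOT"}
sentences = [["monday", "morning"], ["tuesday", "evening"]]
ctx_next, ctx_total = train(sentences, parent)

# "monday evening" is unseen at word level but seen at class level.
print(estimate("monday", "evening", ctx_next, ctx_total, parent))
```

The depth of `parent` chains corresponds to the hierarchy depth whose effect the abstract studies; a shallower map means fewer generalisation steps before the root.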
ISBN: 0780379020 (print)
In this paper, a new algorithm is put forward for speaker *** difficulties for speaker recognition were first *** most methods for speaker identification are based on parameter estimation, this paper tries to put forward a non-parametric method for speaker *** method is based on Fisher differentiation *** influences of different factors on the identification accuracy were analyzed. The experiment shows that it is an effective method for text-dependent speaker identification.
This paper presents a nearly unsupervised learning methodology for automatically acquiring a thematic corpus from the *** on a bootstrapping mechanism, our system starts with one single linguistic expression of a given target semantic *** then samples the Web so as to progressively accumulate a corpus of potential examples of the same *** steps alternate with filtering steps, making it possible to keep the corpus thematically *** corpus is finally analysed to search for potential paraphrases of the initial expression of the semantic relationship. These paraphrases will eventually be used to improve our question-answering *** paper focuses on the learning aspect of the system and reports experimental results regarding the effectiveness of our filtering strategy.
ISBN: 0780379020 (print)
Automatic classification of modulation signals plays an important role in communication applications such as speech recognition, intelligent demodulators and electronic warfare *** how to improve the performance of modulation classification algorithms under low-SNR conditions is an important problem in their practical application. This paper presents a method that applies linear smoothing to preprocess the intercepted signal, reducing the influence of noise on the signal characteristics, and then extracts the key features, so that the features are robust to jamming and can identify the various signals in the low-SNR range. Simulation indicates that the linear smoothing process is simple to compute and that it effectively improves the algorithm that uses it.
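A minimal sketch of a linear smoothing preprocessing step follows. The abstract does not say which linear smoother is used; a centred moving average is assumed here, and the function name `moving_average` and the `window` parameter are hypothetical.

```python
def moving_average(samples, window=5):
    """Smooth `samples` with a centred moving average.

    Assumption: the window shrinks at the edges so that the output
    has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed

# A rapidly alternating sequence is flattened towards its local mean.
print(moving_average([0, 3, 0, 3, 0], window=3))
```

The cost is one short sum per sample, which matches the abstract's remark that the smoothing is simple to compute; features would then be extracted from the smoothed sequence rather than the raw one.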
This paper describes an efficient implementation for mining textual associations from text *** order to tackle real world applications, efficient algorithms and data structures are needed to manage, in reasonable time and space, the overgrowing volume of text *** that purpose, we introduce a global architecture based on masks, suffix arrays and multidimensional arrays to implement the SENTA extractor [Dias, 2002]. In particular, SENTA has shown great flexibility and accuracy to mine textual associations such as collocations, cognates, morphemes and *** solution shows O(h(F) N log N) time complexity and O(N) space complexity, where N is the size of the corpus and h(F) is a function of the context window size.
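The role of suffix arrays in counting word sequences can be sketched as below. This is not the SENTA implementation: the masks and multidimensional arrays are not modelled, the construction is a naive sort over token suffixes (slower than a dedicated suffix array algorithm), and the names `build_suffix_array` and `count_ngram` are hypothetical.

```python
def build_suffix_array(tokens):
    """Return start positions of all token suffixes in sorted order.

    Assumption: naive construction by sorting slices, chosen for
    brevity rather than speed.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def count_ngram(tokens, suffix_array, ngram):
    """Count occurrences of `ngram` with two binary searches."""
    ngram = list(ngram)
    n = len(ngram)

    def prefix(k):
        start = suffix_array[k]
        return tokens[start:start + n]

    # First suffix whose n-token prefix is >= ngram.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < ngram:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # First suffix whose n-token prefix is > ngram.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= ngram:
            lo = mid + 1
        else:
            hi = mid
    return lo - first

tokens = "a b a b c".split()
sa = build_suffix_array(tokens)
print(count_ngram(tokens, sa, ["a", "b"]))
```

Because all suffixes sharing a prefix are adjacent in the sorted order, the frequency of any n-gram is the width of one contiguous block, and the array itself stores only one integer per corpus position.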
ISBN: 3540404554 (print)
This paper presents a system, named CAPTOP, for authoring and checking operating procedures for plant operations. It consists of a knowledge base of plant unit operations that can be linked to a graphical front end for inputting operating instructions. The system then builds a formal model of the instruction set as an interlingua and uses it to output multilingual operating procedures. It avoids the problems of natural language understanding that make machine translation so difficult. Furthermore, the system can also generate output in a formal syntax that can be used as input to another knowledge-based component, CHECKOP, for checking the procedure for operability and safety problems.