Selecting the wavelet type, decomposition level, and fusing rule is a key problem when the wavelet transform is applied to image fusion. 2916 different fusing methods (54×5×9, including 54 wavelet types, 5...
In our study of Text-to-Scene conversion (TTS), which automatically translates natural language into animations, we found that event entailment knowledge is useful for generating scenes, since the main purpose of a scene is to show an event. In this paper, we report results from our attempt to extract event entailment knowledge. We use entailment chains instead of traditional entailment rules, because a sequence of events forms a process, which makes chains useful in TTS. The results show that the work is worth pursuing further.
As one of the challenging issues in natural language processing (NLP), metaphor has attracted substantial attention from researchers in recent years. Many models and methods have been proposed for the proper understanding of metaphors. However, the automatic identification of metaphors has received less attention. This paper presents a tentative rule-based study of metaphor identification and reports results on a small-scale corpus.
We present a forced decoding approach for the tuning process in statistical machine translation. Unlike the traditional discriminative approaches, the forced decoding system can take advantage of the reference of deve...
As a new branch of data mining and knowledge discovery, biomedical text mining is progressing rapidly. Biomedical named entity recognition is a basic technique in biomedical knowledge discovery, and its performance directly affects further discovery and processing in biomedical texts. In this paper, we present classifier ensemble approaches for biomedical named entity recognition. Four individual classifiers (Generalized Winnow, Conditional Random Fields, Support Vector Machine, and Maximum Entropy) are combined through three different strategies. We demonstrate the effectiveness of the strategies and compare their performance with that of a standalone classifier system. The experiments are carried out on the JNLPBA2004 corpus, yielding an F-score of 77.57%. Experimental results show that the proposed method, a stacking ensemble strategy, yields promising performance.
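The stacking idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' system: the meta-learner here is a simple lookup table over the base classifiers' label tuples, with a majority-vote fallback for unseen tuples. The names `train_meta` and `predict_meta` and the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_meta(base_preds, gold):
    """Train a table-based meta-learner (an illustrative assumption).

    base_preds: one tuple per token, holding one label per base classifier.
    gold: the gold label for each token, taken from held-out data.
    Returns a dict mapping each seen tuple to its most frequent gold label.
    """
    table = defaultdict(Counter)
    for preds, label in zip(base_preds, gold):
        table[tuple(preds)][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}


def predict_meta(meta, preds):
    """Predict a label from the base classifiers' outputs for one token."""
    key = tuple(preds)
    if key in meta:
        return meta[key]
    # Unseen combination: fall back to a majority vote over the base labels.
    return Counter(preds).most_common(1)[0][0]
```

In this sketch, each tuple would hold the outputs of the four base classifiers for one token. A learned meta-classifier trained on the same tuples could replace the lookup table without changing the overall structure.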
In the past few years, much attention has been paid to extending phrase-based statistical machine translation with syntactic structures. In this paper we introduce a novel syntax encapsulated phrase (SEP) model, in which treebank tag sequences are used to decorate the bilingual phrase pairs. We use tag sequences, instead of phrase pairs, to train the lexicalized reordering model. Since the number of treebank tags is much smaller than the number of words, the tag-sequence-based reordering model is smaller and more accurate than the phrase-based reordering model. Experiments were carried out on four types of models: the phrase model, the hierarchical phrase model, the POS tag encapsulated phrase (PTEP) model, and the syntactic tag encapsulated phrase (STEP) model. The STEP model obtained a higher BLEU-4 score than the other models on the NIST 2005 MT task.
The location of a passage is a kind of semantic information that may prove useful for a variety of applications dealing with inference over passages described in natural language texts. In this paper, we propose a method for the automatic discovery of pairs consisting of an event and a place term related by location, such as wash clothes ⇒ laundry room. In contrast to previous approaches, which extract associations between particular actions and the locations where those actions occur based on Dunning's likelihood ratio, the underlying assumption of our method is that the correlation between an event and a place term is reflected in the regular co-occurrence of a verb and a place noun within locally coherent text. By analogy with the problem of inferring semantic information from a text corpus, the statistical method of Dunning's likelihood ratio is used to score the extracted pairs for association, in order to discover the correlation in each pair. In our experimental evaluation, we examine the effect of various statistical methods on the accuracy of this model of inferring locations. We then carry out a direct evaluation of rare pairs under the different statistical methods.
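As a rough illustration of the scoring step described above, the following sketch computes Dunning's log-likelihood ratio (the G² statistic) for a verb and a place noun from co-occurrence counts. The function names and the idea of counting co-occurrence "windows" are assumptions made for this example; the abstract does not specify how the counts are collected.

```python
import math


def dunning_llr(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of observed counts."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def pair_score(c_pair, c_verb, c_place, n_windows):
    """Score a (verb, place noun) pair from hypothetical window counts.

    c_pair: windows containing both the verb and the place noun
    c_verb: windows containing the verb
    c_place: windows containing the place noun
    n_windows: total number of windows examined
    """
    return dunning_llr(
        c_pair,
        c_verb - c_pair,
        c_place - c_pair,
        n_windows - c_verb - c_place + c_pair,
    )
```

A pair whose members co-occur about as often as chance predicts scores close to zero, while a pair that co-occurs far more often than chance scores high, so ranking pairs by this value surfaces the candidate event-location associations.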
Most example-based machine translation (EBMT) systems handle their translation examples using heuristic measures based on human intuition. However, these heuristic rules are usually hard to organize effectively and hard to scale so that they incorporate diverse features covering more language phenomena and larger domains. In this paper, we use a machine learning approach to EBMT model design instead of human intuition. A maximum entropy (ME) model is introduced in order to incorporate the different kinds of features inherent in the translation examples. At the same time, a multi-dimensional feature space is formally constructed to include various features of different aspects. In the experiments, the proposed model shows a significant performance improvement.
Blogs are becoming more and more popular with the rapid development of the Internet. An automatic way of distinguishing blog pages from ordinary Web pages is needed for content extraction from blog pages and for blog community discovery. This paper describes some basic concepts and ideas in the area of blogs and proposes a method for identifying blog pages based on blog page structure and blog content. Experiments show that the method achieves high precision.