With the rapid advance of internet technology, internet users have to deal with novel data every day. For most of them, one of the most useful kinds of knowledge exploited from the web is about the transfer of the information ...
As one of the challenging issues in the field of natural language processing (NLP), metaphor has attracted substantial attention from researchers in recent years. Many models and methods have been proposed for proper u...
The interpersonal trust relationship is an important dimension of interpersonal relationships. With newly introduced plots of literature, we can evaluate the environment of characters and predict plot development to some ex...
This paper introduces an integrated algorithm to detect humans in thermal imagery. In recent years, the histogram of oriented gradients (HOG) has become a popular algorithm for person detection in visible imagery. We...
In the past few years, much attention has been paid to extending phrase-based statistical machine translation with syntactic structures. In this paper, we introduce a novel syntax encapsulated phrase (SEP) model, in whi...
In the past few years, much attention has been paid to extending phrase-based statistical machine translation with syntactic structures. In this paper, we introduce a novel phrase model, in which treebank tags are emp...
A dynamic incremental model is presented for sub-topic detection and tracking, which borrows ideas from single-pass clustering, multi-category, and dynamic incremental models. The model is built on the time series of a topic event and comprises dynamic threshold selection, similarity smoothing, and a dynamic incremental strategy. Overall evaluation criteria combined with the χ² test are used for performance analysis. The algorithm is effective for sub-topic detection and helps users follow a topic event clearly. Results show that the proposed algorithm achieves satisfactory performance.
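As a rough illustration of the single-pass clustering idea this abstract builds on, the sketch below assigns each incoming document to its most similar existing cluster or opens a new one. It is not the authors' model: the fixed `threshold`, the cosine similarity over sparse term-weight dictionaries, and the running-mean centroid update are all assumptions, and the dynamic threshold selection and similarity smoothing described in the abstract are not reproduced.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(docs, threshold=0.5):
    """Cluster documents in arrival order (hypothetical fixed threshold)."""
    clusters = []  # each cluster: {"centroid": dict, "members": [doc indices]}
    for i, doc in enumerate(docs):
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(doc, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Join the closest cluster and update its centroid as a running mean.
            best["members"].append(i)
            n = len(best["members"])
            keys = set(best["centroid"]) | set(doc)
            best["centroid"] = {
                k: (best["centroid"].get(k, 0.0) * (n - 1) + doc.get(k, 0.0)) / n
                for k in keys
            }
        else:
            # No cluster is similar enough: start a new sub-topic.
            clusters.append({"centroid": dict(doc), "members": [i]})
    return clusters
```

Because documents are processed once in time order, a sketch like this can run incrementally as new reports on a topic event arrive.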
The training procedure is critical in statistical machine translation (SMT) and strongly influences the final performance of a translation system. The most widely used method in SMT is minimum error rate training (MERT), which is effective for estimating feature function weights. However, MERT does not use regularization and has been observed to overfit. In this paper, we describe a method named softmax-margin, which is a modification of max-margin training. The approach is simple, efficient, and easy to implement. We conduct our experiments on data sets from the WMT shared tasks. Results on a small-scale French-English translation task are competitive with MERT.
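To make the softmax-margin idea concrete, here is a minimal sketch of a cost-augmented log-loss computed over a single n-best list. It is an illustration under assumptions, not the paper's implementation: the function name, the representation of candidates as `(features, cost)` pairs, and the use of a per-candidate cost (for example, one minus a sentence-level quality score) are all hypothetical.

```python
import math


def softmax_margin_loss(weights, candidates, ref_index):
    """Cost-augmented log-loss for one n-best list.

    weights:    list of feature weights
    candidates: list of (features, cost) pairs; cost is the task loss of the
                candidate against the reference (assumed, e.g. 1 - quality)
    ref_index:  index of the candidate treated as the reference translation
    """
    scores = [
        sum(w * f for w, f in zip(weights, feats)) for feats, _ in candidates
    ]
    # Add each candidate's cost inside the exponent, so high-cost
    # candidates must be pushed further below the reference.
    augmented = [s + cost for s, (_, cost) in zip(scores, candidates)]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(augmented)
    log_z = m + math.log(sum(math.exp(a - m) for a in augmented))
    return log_z - scores[ref_index]
```

With all costs set to zero this reduces to the ordinary conditional log-loss; a regularization term on `weights` could be added to the returned value, which is the kind of control the abstract notes MERT lacks.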
ISBN: (Print) 9781627483476
This paper describes our syllable-based phrase transliteration system for the NEWS 2012 shared task on the English-Chinese track and its back-transliteration. Grapheme-based transliteration maps the character(s) on the source side directly to the target character(s). However, character-based segmentation on the English side causes ambiguity in the alignment step. In this paper, we use a phrase-based model to perform machine transliteration with a mapping between Chinese characters and English syllables rather than English characters. Two heuristic rule-based syllable segmentation algorithms are applied. The transliteration model also incorporates three phonetic features to enhance its discriminative ability for phrases. The primary system achieved a top-1 accuracy of 0.330 on Chinese-English and 0.177 on English-Chinese.
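For readers unfamiliar with the reported metric, the sketch below shows one common way to compute top-1 accuracy: a source name counts as correct when the highest-ranked candidate matches any accepted reference. The function name and input layout are assumptions for illustration, and the shared task's official scoring script may differ in details.

```python
def top1_accuracy(predictions, references):
    """Fraction of names whose top-ranked candidate is an accepted reference.

    predictions: list of ranked candidate lists, best candidate first
    references:  list of sets of accepted transliterations, aligned by index
    """
    if not references:
        return 0.0
    correct = sum(
        1
        for ranked, refs in zip(predictions, references)
        if ranked and ranked[0] in refs
    )
    return correct / len(references)
```

Under this definition, a score of 0.330 means the system's first choice was an accepted transliteration for roughly one name in three.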