作者:
Lo, Chi-KiuWu, DekaiHKUST
Human Language Technology Center Dept. of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong
We argue that failing to capture the degree of contribution of each semantic frame in a sentence explains puzzling results in recent work on the MEANT family of semantic MT evaluation metrics, which have disturbingly ...
详细信息
We argue that learning word alignments through a compositionally-structured, joint process yields higher phrase-based translation accuracy than the conventional heuristic of intersecting conditional models. Flawed wor...
详细信息
Recently, numerous statistical machine translation models which can utilize various kinds of translation rules are proposed. In these models, not only the conventional syntactic rules but also the non-syntactic rules ...
ISBN:
(纸本)9781932432398
Recently, numerous statistical machine translation models which can utilize various kinds of translation rules are proposed. In these models, not only the conventional syntactic rules but also the non-syntactic rules can be applied. Even the pure phrase rules are includes in some of these models. Although the better performances are reported over the conventional phrase model and syntax model, the mixture of diversified rules still leaves much room for study. In this paper, we present a refined rule classification system. Based on this classification system, the rules are classified according to different standards, such as lexicalization level and generalization. Especially, we refresh the concepts of the structure reordering rules and the discontiguous phrase rules. This novel classification system may supports the SMT research community with some helpful references.
作者:
Padó, SebastianBoleda, GemmaSALSA
Dept. of Computational Linguistics Saarland University Saarbrücken Germany GLiCom
Dept. of Translation and Interpreting Pompeu Fabra University Barcelona Spain
We present a data and error analysis for semantic role labelling. In a first experiment, we build a generic statistical model for semantic role assignment in the FrameNet paradigm and show that there is a high varianc...
详细信息
We explored novel automatic evaluation measures for machine translation output oriented to the syntactic structure of the sentence: the Bleu score on the detailed Part-of-Speech (pos) tags as well as the precision, re...
We explored novel automatic evaluation measures for machine translation output oriented to the syntactic structure of the sentence: the Bleu score on the detailed Part-of-Speech (pos) tags as well as the precision, recall and F-measure obtained on pos n-grams. We also introduced F-measure based on both word and pos n-grams. Correlations between the new metrics and human judgments were calculated on the data of the first, second and third shared task of the statistical Machine translationworkshop. Machine translation outputs in four different European languages were taken into account: English, Spanish, French and German. The results show that the new measures correlate very well with the human judgements and that they are competitive with the widely used BLEU, METEOR and TER metrics.
A key concern in building syntax-based machine translation systems is how to improve coverage by incorporating more traditional phrase-based SMT phrase pairs that do not correspond to syntactic constituents. At the sa...
ISBN:
(纸本)9781932432398
A key concern in building syntax-based machine translation systems is how to improve coverage by incorporating more traditional phrase-based SMT phrase pairs that do not correspond to syntactic constituents. At the same time, it is desirable to include as much syntactic information in the system as possible in order to carry out linguistically motivated reordering, for example. We apply an extended and modified version of the approach of Tinsley et al. (2007), extracting syntax-based phrase pairs from a large parallel parsed corpus, combining them with PBSMT phrases, and performing joint decoding in a syntax-based MT framework without loss of translation quality. This effectively addresses the low coverage of purely syntactic MT without discarding syntactic information. Further, we show the potential for improved translation results with the inclusion of a syntactic grammar. We also introduce a new syntax-prioritized technique for combining syntactic and non-syntactic phrases that reduces overall phrase table size and decoding time by 61%, with only a minimal drop in automatic translation metric scores.
In this paper, we start with the existing idea of taking reordering rules automatically derived from syntactic representations, and applying them in a preprocessing step before translation to make the source sentence ...
ISBN:
(纸本)9781932432398
In this paper, we start with the existing idea of taking reordering rules automatically derived from syntactic representations, and applying them in a preprocessing step before translation to make the source sentence structurally more like the target; and we propose a new approach to hierarchically extracting these rules. We evaluate this, combined with a lattice-based decoding, and show improvements over state-of-the-art distortion models.
The empirical adequacy of synchronous context-free grammars of rank two (2-SCFGs) (Satta and Peserico, 2005), used in syntax-based machine translation systems such as Wu (1997), Zhang et al. (2006) and Chiang (2007), ...
ISBN:
(纸本)9781932432398
The empirical adequacy of synchronous context-free grammars of rank two (2-SCFGs) (Satta and Peserico, 2005), used in syntax-based machine translation systems such as Wu (1997), Zhang et al. (2006) and Chiang (2007), in terms of what alignments they induce, has been discussed in Wu (1997) and Wellington et al. (2006), but with a one-sided focus on so-called "inside-out alignments". Other alignment configurations that cannot be induced by 2-SCFGs are identified in this paper, and their frequencies across a wide collection of hand-aligned parallel corpora are examined. Empirical lower bounds on two measures of alignment error rate, i.e. the one introduced in Och and Ney (2000) and one where only complete translation units are considered, are derived for 2-SCFGs and related formalisms.
Word alignments that violate syntactic correspondences interfere with the extraction of string-to-tree transducer rules for syntax-based machine translation. We present an algorithm for identifying and deleting incorr...
ISBN:
(纸本)9781932432091
Word alignments that violate syntactic correspondences interfere with the extraction of string-to-tree transducer rules for syntax-based machine translation. We present an algorithm for identifying and deleting incorrect word alignment links, using features of the extracted rules. We obtain gains in both alignment quality and translation quality in Chinese-English and Arabic-English translation experiments relative to a GIZA++ union baseline.
This paper seeks to complement the current trend of adding more structure to statistical Machine translation systems, by exploring the opposite direction: adding statistical components to a Transfer-Based MT system. I...
This paper seeks to complement the current trend of adding more structure to statistical Machine translation systems, by exploring the opposite direction: adding statistical components to a Transfer-Based MT system. Initial results on the BTEC data show significant improvement according to three automatic evaluation metrics (BLEU, NIST and METEOR).
暂无评论