Features and Training | Before elaborating on the details of how the actual graph is constructed, we first introduce how graph-based translation consensus can be used in an MT system.
Features and Training | When graph-based consensus is applied to an MT system, the graph will have nodes for training data, development (dev) data, and test data (details in Section 5).
Graph-based Translation Consensus | Our MT system with graph-based translation consensus adopts the conventional log-linear model. |
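The log-linear model scores each translation candidate as a weighted sum of feature functions; a minimal sketch, with hypothetical feature names and values that are illustrative rather than the paper's actual feature set:

```python
def log_linear_score(feature_values, weights):
    # Log-linear model: score(e | f) = sum_i lambda_i * h_i(e, f),
    # where h_i are (log-domain) feature values and lambda_i their weights.
    return sum(weights[name] * h for name, h in feature_values.items())

def best_translation(candidates, weights):
    # Decoding picks the candidate with the highest log-linear score;
    # a consensus feature would enter the sum like any other feature.
    return max(candidates, key=lambda c: log_linear_score(c, weights))
```

Under this formulation, adding graph-based consensus amounts to introducing one more feature (with its own weight) into the sum.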
Introduction | The principle of consensus can be sketched as “a translation candidate is deemed more plausible if it is supported by other translation candidates.” The actual formulation of the principle depends on whether the translation candidate is a complete sentence or just a span of it, whether the candidate is the same as or similar to the supporting candidates, and whether the supporting candidates come from the same MT system or from different MT systems.
Introduction | Others extend consensus among translations from the same MT system to those from different MT systems.
Introduction | For the source (Chinese) span “…”, the MT system produced the correct translation for the second sentence, but it failed to do so for the first one.
Abstract | This paper presents PORT, a new MT evaluation metric which combines precision, recall, and an ordering metric, and which is primarily designed for tuning MT systems.
Abstract | We compare PORT-tuned MT systems to BLEU-tuned baselines in five experimental conditions involving four language pairs. |
Conclusions | Most important, our results show that PORT-tuned MT systems yield better translations than BLEU-tuned systems on several language pairs, according both to automatic metrics and human evaluations. |
Introduction | First, there is no evidence that any other tuning metric yields better MT systems.
Introduction | In this work, our goal is to devise a metric that, like BLEU, is computationally cheap and language-independent, but that yields better MT systems than BLEU when used for tuning. |
Experiments | A Chinese-English and an English-Chinese MT system are trained on (C0, E0). |
Forward-Translation vs. Back-Translation | The only requirement is that the MT system needs to be bidirectional. |
Forward-Translation vs. Back-Translation | The procedure includes translating a text into a certain foreign language with the MT system (Forward-Translation), and then translating it back into the original language with the same system (Back-Translation).
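The round-trip procedure above can be sketched as follows; the two translate functions are hypothetical stand-ins for the forward and reverse directions of the bidirectional MT system:

```python
def round_trip(text, forward_translate, back_translate):
    # Forward-Translation: source language -> foreign language.
    foreign = forward_translate(text)
    # Back-Translation: foreign language -> original language,
    # using the reverse direction of the same bidirectional system.
    return back_translate(foreign)
```

Comparing the round-trip output against the original text is what makes the comparison between Forward-Translation and Back-Translation possible without reference translations.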
Forward-Translation vs. Back-Translation | Two possible reasons may explain this phenomenon: (1) in the first round of translation, T0 → S1, some target word orders are preserved due to reordering failures, and these preserved orders lead to a better result in the second round of translation; (2) the text generated by an MT system is more likely to be matched by the reversed but homologous MT system.
Related Work | (2010) captures the structures implicitly by training an MT system on (S0, S1) and “translates” the SMT input to an MT-favored expression.
Introduction | Discriminative optimization methods such as MERT (Och, 2003), MIRA (Crammer et al., 2006), PRO (Hopkins and May, 2011), and Downhill-Simplex (Nelder and Mead, 1965) have been influential in improving MT systems in recent years.
Introduction | We want to build an MT system that does well with respect to many aspects of translation quality.
Opportunities and Limitations | We introduce a new approach (PMO) for training MT systems on multiple metrics. |
Theory of Pareto Optimality 2.1 Definitions and Concepts | Here, the MT system’s Decode function, parameterized by weight vector w, takes in a foreign sentence f and returns a translated hypothesis h. The argmax operates in vector space, and our goal is to find w leading to hypotheses on the Pareto Frontier.
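Pareto optimality over multiple metric scores can be sketched as follows; this is a generic illustration of the concept, not PMO's actual implementation. Each hypothesis is represented by its vector of metric scores, with higher being better:

```python
def dominates(a, b):
    # a dominates b if a is at least as good on every metric
    # and strictly better on at least one.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(score_vectors):
    # The Pareto Frontier is the set of hypotheses that
    # no other hypothesis dominates.
    return [p for p in score_vectors
            if not any(dominates(q, p) for q in score_vectors if q != p)]
```

A hypothesis that is best on one metric but worse on another remains on the frontier, which is what lets PMO-style training trade off multiple metrics instead of collapsing them into a single score.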
A Class-based Model of Agreement | However, we conduct CRF inference in tandem with the translation decoding procedure (§3), creating an environment in which subsequent words of the observation are not available; the MT system has yet to generate the rest of the translation when the tagging features for a position are scored. |
Introduction | The MT system selects the correct verb stem, but with masculine inflection. |
Introduction | Agreement relations that cross statistical phrase boundaries are not explicitly modeled in most phrase-based MT systems (Avramidis and Koehn, 2008). |
Related Work | To our knowledge, Uszkoreit and Brants (2008) are the only recent authors to show an improvement in a state-of-the-art MT system using class-based LMs. |
Conclusion & Future Work | In this approach a number of MT systems are combined at decoding time in order to form an ensemble model. |
Conclusion & Future Work | We will also add the capability of supporting syntax-based ensemble decoding and experiment with how a phrase-based system can benefit from the syntactic information present in a syntax-aware MT system.
Experiments & Results 4.1 Experimental Setup | The first group are the baseline results on the phrase-based system discussed in Section 2 and the second group are those of our hierarchical MT system . |
Experiments & Results 4.1 Experimental Setup | Since the Hiero baseline results were substantially better than those of the phrase-based model, we also implemented the best-performing baseline, linear mixture, in our Hiero-style MT system; in fact, it achieves the highest BLEU score among all the baselines, as shown in Table 2.
Experiments | The test set was translated by seven MT systems, and each translation has been manually judged for adequacy and fluency.
Experiments | In addition, the translation outputs of the MT systems are also manually ranked according to their translation quality. |
Experiments | The NIST 2008 English-Chinese MT task consists of 127 documents with 1,830 segments, each with four reference translations and eleven automatic MT system translations. |