Adaptive MT Quality Estimation | The source side of the QE training data Sq is combined with the input document Sd for MT system training data subsampling. |
Adaptive MT Quality Estimation | Once the document-specific MT system is trained, we use it to translate both the input document and the source QE training data, obtaining the translation Td and |
Adaptive MT Quality Estimation | As the QE model is adaptively retrained for each document-specific MT system, its prediction is more accurate and consistent. |
Document-specific MT System | Building a general MT system using all the parallel data not only produces a huge translation model (unless it is pruned very aggressively), but also gives suboptimal performance on the given input document because of the unwanted dominance of out-of-domain data. |
Document-specific MT System | The document-specific system is built based on sub-sampling: from the parallel corpora we select sentence pairs that are the most similar to the sentences from the input document, then build the MT system with the sub-sampled sentence pairs. |
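The sub-sampling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the excerpt does not name the similarity measure, so n-gram overlap on the source side is assumed here, and the names `ngrams`, `subsample`, and `top_k` are hypothetical.

```python
def ngrams(tokens, n_max=3):
    """Return the set of all 1..n_max-grams in a token list."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }


def subsample(parallel_corpus, input_document, top_k=2):
    """For each input sentence, keep the top_k most similar sentence pairs.

    Similarity is ASSUMED to be the number of shared source-side n-grams;
    the source text only says "most similar".
    """
    selected = []
    seen = set()
    for sentence in input_document:
        query = ngrams(sentence.split())
        scored = []
        for idx, (src, _tgt) in enumerate(parallel_corpus):
            overlap = len(query & ngrams(src.split()))
            if overlap > 0:
                scored.append((overlap, idx))
        # Highest overlap first; ties broken by corpus order.
        scored.sort(key=lambda item: (-item[0], item[1]))
        for _, idx in scored[:top_k]:
            if idx not in seen:  # avoid duplicate pairs across sentences
                seen.add(idx)
                selected.append(parallel_corpus[idx])
    return selected


corpus = [
    ("the cat sat on the mat", "le chat est assis sur le tapis"),
    ("stock prices fell sharply", "les cours ont fortement chuté"),
    ("the cat ate", "le chat a mangé"),
]
document = ["the cat sat"]
subset = subsample(corpus, document, top_k=2)
```

The selected pairs would then be used as training data for the document-specific system; a real setup would likely use an inverted index rather than scanning the whole corpus for every sentence.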
Introduction | Depending on the difficulty of the input sentences (sentence length, OOV words, complex sentence structures and the coverage of the MT system’s training data), some translation outputs can be perfect, while others are ungrammatical, missing important words or even totally garbled. |
Introduction | This shortcoming is one of the main obstacles to the adoption of MT systems, especially in machine-assisted human translation: MT post-editing, where human translators have an option to edit MT proposals or translate from scratch. |
Introduction | In section 3 we will introduce the document-specific MT system built for post-editing. |
Static MT Quality Estimation | However, for the post-editing task, we argue that it could also be cast as a classification problem: MT system |
Static MT Quality Estimation | We build a document-specific MT system to translate this document, then ask a human translator to correct the translation output. |
Abstract | Our best result improves over the best single MT system baseline by 1.0% BLEU and over a strong system selection baseline by 0.6% BLEU on a blind test set. |
Introduction | In this paper we study the use of sentence-level dialect identification together with various linguistic features in optimizing the selection of outputs of four different MT systems on input text that includes a mix of dialects. |
Introduction | Our best system selection approach improves over our best baseline single MT system by 1.0% absolute BLEU on a blind test set. |
Related Work | Sawaf (2010) and Salloum and Habash (2013) used hybrid solutions that combine rule-based algorithms and resources such as lexicons and morphological analyzers with statistical models to map DA to MSA before using MSA-to-English MT systems. |
Related Work | In this paper we use four MT systems that translate from DA to English in different ways. |
Related Work | Our fourth MT system uses ELISSA, the DA-to-MSA MT tool by Salloum and Habash (2013), to produce an MSA pivot. |
Abstract | Recent work has shown success in using neural network language models (NNLMs) as features in MT systems. |
Model Variations | 5 MT System |
Model Variations | In this section, we describe the MT system used in our experiments. |
Model Variations | Because of this, the baseline BLEU scores are much higher than those of a typical MT system, especially a real-time production engine that must support many language pairs. |
Experiments | MT System: We develop a machine translation baseline as follows. |
Experiments | As expected, the MT system slightly outperforms our models on most language pairs. |
Discussion and conclusion | An application of our idea outside the area of translation assistance is post-correction of the output of some MT systems that, as a last-resort heuristic, copy source words or phrases into their output, producing precisely the kind of input our system is trained on. |
Discussion and conclusion | Our classification-based approach may be able to resolve some of these cases, operating as an add-on to a regular MT system or as an independent post-correction system. |
Evaluation | These scores should generally be much better than typical MT system performance, as only local changes are made to otherwise “perfect” L2 sentences. |
Evaluation | We see, first of all, that the MT system benefits from our approach in most cases. |
Evaluation | Apart from showing the effects of the skeleton-based model, we also studied the behavior of the MT system under different search-space settings. |
Related Work | More importantly, we develop a complete approach to this issue and show its effectiveness in a state-of-the-art MT system. |