Abstract | Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora. |
Abstract | This paper presents a novel method for inducing phrase-based translation units directly from parallel data, which we frame as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior. |
Analysis | We have presented a novel method for learning a phrase-based model of translation directly from parallel data which we have framed as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior.
Experiments | As a baseline, we train a phrase-based model using the Moses toolkit based on the word alignments obtained using GIZA++ in both directions and symmetrized using the grow-diag-final-and heuristic (Koehn et al., 2003).
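The symmetrization step mentioned above can be sketched in code. This is a simplified, illustrative rendering of the grow-diag-final-and heuristic, not the Moses implementation; it assumes the two directional alignments are given as sets of (source index, target index) pairs.

```python
def grow_diag_final_and(src_len, tgt_len, e2f, f2e):
    """Simplified grow-diag-final-and: start from the intersection of the
    two directional alignments, grow along the diagonal neighborhood using
    points from the union, then add remaining directional points whose
    source and target words are both still unaligned."""
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    alignment = set(e2f & f2e)   # high-precision starting point
    union = e2f | f2e
    changed = True
    while changed:               # grow-diag: expand until convergence
        changed = False
        for s in range(src_len):
            for t in range(tgt_len):
                if (s, t) not in alignment:
                    continue
                for ds, dt in neighbors:
                    p = (s + ds, t + dt)
                    if p in union and p not in alignment:
                        src_free = all(a[0] != p[0] for a in alignment)
                        tgt_free = all(a[1] != p[1] for a in alignment)
                        if src_free or tgt_free:
                            alignment.add(p)
                            changed = True
    # final-and: accept a directional point only if both words are unaligned
    for direction in (e2f, f2e):
        for s, t in sorted(direction):
            src_free = all(a[0] != s for a in alignment)
            tgt_free = all(a[1] != t for a in alignment)
            if src_free and tgt_free:
                alignment.add((s, t))
    return alignment
```

In practice the heuristic trades precision (the intersection) against recall (the union), which is why it became the default symmetrization in phrase-based pipelines.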
Introduction | The phrase-based approach (Koehn et al., 2003) to machine translation (MT) has transformed MT from a narrow research topic into a truly useful technology to end users. |
Introduction | Word-based translation models (Brown et al., 1993) remain central to phrase-based model training, where they are used to infer word-level alignments from sentence-aligned parallel data, from
Introduction | Firstly, many phrase-based phenomena which do not decompose into word translations (e.g., idioms) will be missed, as the underlying word-based alignment model is unlikely to propose the correct alignments. |
Related Work | A number of other approaches have been developed for learning phrase-based models from bilingual data, starting with Marcu and Wong (2002) who developed an extension to IBM model 1 to handle multi-word units. |
Abstract | However, phrase-based approaches are much less able to model sentence-level effects between different phrase pairs.
Experiments | However, in this paper we limit our focus to inducing word alignments: we use the model to infer alignments which are then used in a standard phrase-based translation pipeline.
Experiments | We leave full decoding for later work, which we anticipate would further improve performance by exploiting gapping phrases and other phenomena that implicitly form part of our model but are not represented in the phrase-based decoder. |
Introduction | Recent years have witnessed burgeoning development of statistical machine translation research, notably phrase-based (Koehn et al., 2003) and syntax-based approaches (Chiang, 2005; Galley et al., 2006; Liu et al., 2006). |
Introduction | These approaches model sentence translation as a sequence of simple translation decisions, such as the application of a phrase translation in phrase-based methods or a grammar rule in syntax-based approaches. |
Introduction | This conflicts with the intuition behind phrase-based MT, namely that translation decisions should be dependent on context.
Model | We consider a process in which the target string is generated using a left-to-right order, similar to the decoding strategy used by phrase-based machine translation systems (Koehn et al., 2003). |
Model | In contrast to phrase-based models, we use words as our basic translation unit, rather than multi-word phrases. |
Related Work | This idea has been developed explicitly in a number of previous approaches, in grammar based (Chiang, 2005) and phrase-based systems (Galley and Manning, 2010). |
Abstract | This paper proposes new distortion models for phrase-based SMT. |
Distortion Model for Phrase-Based SMT | A Moses-style phrase-based SMT system generates target hypotheses sequentially from left to right.
Introduction | To address this problem, much research has been done on word reordering: lexicalized reordering models (Tillmann, 2004), which are one type of distortion model; reordering constraints (Zens et al., 2004); pre-ordering (Xia and McCord, 2004); hierarchical phrase-based SMT (Chiang, 2007); and syntax-based SMT (Yamada and Knight, 2001).
Introduction | Phrase-based SMT (Koehn et al., 2007) is a widely used SMT method that does not use a parser. |
Introduction | Phrase-based SMT mainly estimates word reordering using distortion models.
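A common concrete instance of such a distortion model is the linear distortion penalty used in Moses-style decoders. The following is a minimal sketch, assuming each phrase is represented by its inclusive source-side (start, end) span and that phrases are listed in the order their translations are emitted.

```python
def linear_distortion(phrase_spans):
    """Sum of |start(p_i) - end(p_{i-1}) - 1| over consecutive phrases:
    zero for monotone translation, larger for longer reordering jumps."""
    cost, prev_end = 0, -1
    for start, end in phrase_spans:
        cost += abs(start - prev_end - 1)  # jump distance from previous phrase
        prev_end = end
    return cost
```

A monotone segmentation such as [(0, 1), (2, 3)] incurs zero cost, while swapping the two phrases incurs a nonzero penalty, which is exactly the bias toward local reordering that motivates the richer distortion models discussed in this line of work.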
Abstract | We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. |
Abstract | As our approach combines the merits of phrase-based and string-to-dependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets. |
Introduction | Modern statistical machine translation approaches can be roughly divided into two broad categories: phrase-based and syntax-based. |
Introduction | Phrase-based approaches treat the phrase, usually a sequence of consecutive words, as the basic unit of translation (Koehn et al., 2003; Och and Ney, 2004).
Introduction | As phrases are capable of memorizing local context, phrase-based approaches excel at handling local word selection and reordering. |
Abstract | Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. |
Abstract | Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder.
Abstract | The work consists of two parts: 1) for each Hiero translation derivation, find its corresponding discontinuous phrase-based path. |
Introduction | Phrase-based and tree-based translation models are the two main streams in state-of-the-art machine translation.
Introduction | Yet, tree-based translation often underperforms phrase-based translation in language pairs with short range reordering such as Arabic-English translation (Zollmann et al., 2008; Birch et al., 2009). |
Introduction | (2003) for our phrase-based system and Chiang (2005) for our Hiero system. |
Abstract | In this paper, we take a step forward and propose a simple but effective method to induce a phrase-based model from the monolingual corpora given an automatically induced translation lexicon or a manually edited translation dictionary.
Experiments | We construct two kinds of phrase-based models using Moses (Koehn et al., 2007): one uses out-of-domain data and the other uses in-domain data. |
Introduction | Novel translation models, such as phrase-based models (Koehn et al., 2007), hierarchical phrase-based models (Chiang, 2007) and linguistically syntax-based models (Liu et al., 2006; Huang et al., 2006; Galley et al., 2006; Zhang et al., 2008; Chiang, 2010; Zhang et al., 2011; Zhai et al., 2011, 2012) have been proposed and have achieved steadily improving translation performance.
Introduction | Finally, they used the learned translation model directly to translate unseen data (Ravi and Knight, 2011; Nuhn et al., 2012) or incorporated the learned bilingual lexicon as a new in-domain translation resource into the phrase-based model which is trained with out-of-domain data to improve the domain adaptation performance in machine translation (Dou and Knight, 2012). |
Introduction | level translation rules and learn a phrase-based model from the monolingual corpora. |
Phrase Pair Refinement and Parameterization | It is well known that in phrase-based SMT there are four translation probabilities and a reordering probability for each phrase pair.
Phrase Pair Refinement and Parameterization | The translation probabilities in the traditional phrase-based SMT include bidirectional phrase translation probabilities and bidirectional lexical weights. |
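The bidirectional phrase translation probabilities mentioned above are conventionally estimated by relative frequency over the extracted phrase pairs. The following is a minimal illustrative sketch (not the Moses implementation), assuming phrase pairs are given as (source, target) string tuples.

```python
from collections import Counter

def phrase_translation_probs(phrase_pairs):
    """Relative-frequency estimates in both directions:
    forward  p(t|s) = count(s, t) / count(s)
    backward p(s|t) = count(s, t) / count(t)"""
    pair_counts = Counter(phrase_pairs)
    src_counts = Counter(src for src, _ in phrase_pairs)
    tgt_counts = Counter(tgt for _, tgt in phrase_pairs)
    fwd = {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}
    bwd = {(s, t): c / tgt_counts[t] for (s, t), c in pair_counts.items()}
    return fwd, bwd
```

The lexical weights mentioned in the text play a complementary role, smoothing these phrase-level estimates with word-level alignment probabilities for rare pairs.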
Probabilistic Bilingual Lexicon Acquisition | In order to assign probabilities to each entry, we apply the Corpus Translation Probability used in (Wu et al., 2008): given in-domain source-language monolingual data, we translate this data with the phrase-based model trained on the out-of-domain News data, the in-domain lexicon, and the in-domain target-language monolingual data (used for language model estimation).
Related Work | For the target-side monolingual data, they use it only to train the language model; for the source-side monolingual data, they employ a baseline (word-based SMT or phrase-based SMT trained on a small-scale bitext) to first translate the source sentences, combine each source sentence with its target translation as a bilingual sentence pair, and then train a new phrase-based SMT system on these pseudo sentence pairs.
Abstract | Since statistical machine translation (SMT) and translation memory (TM) complement each other in matched and unmatched regions, integrated models are proposed in this paper to incorporate TM information into phrase-based SMT. |
Conclusion and Future Work | Unlike the previous pipeline approaches, which directly merge TM phrases into the final translation result, we integrate TM information of each source phrase into the phrase-based SMT at decoding. |
Experiments | For the phrase-based SMT system, we adopted the Moses toolkit (Koehn et al., 2007). |
Experiments | conducted using the Moses phrase-based decoder (Koehn et al., 2007). |
Introduction | Statistical machine translation (SMT), especially the phrase-based model (Koehn et al., 2003), has developed rapidly over the last decade.
Introduction | On a Chinese-English TM database of computer technical documents, our experiments have shown that the proposed Model-III improves the translation quality significantly over either the pure phrase-based SMT or the TM systems when the fuzzy match score is above 0.4.
Problem Formulation | Compared with the standard phrase-based machine translation model, the translation problem is reformulated as follows (formulated for the single best TM sentence; the multiple-TM case is similar):
Problem Formulation | Formula (3) is just the typical phrase-based SMT model, and the second factor P(Mk|Lk, x) (to be specified in Section 3) is the information derived from the TM sentence pair.
Problem Formulation | Therefore, we can still keep the original phrase-based SMT model and only pay attention to how to extract |
Experimental setup | Implementation In all experiments, we use the IBM Model 4 implementation from the GIZA++ toolkit (Och and Ney, 2000) for alignment, and the phrase-based and hierarchical models implemented in the Moses toolkit (Koehn et al., 2007) for rule extraction.
Introduction | Our contributions are as follows: We develop a semantic parser using off-the-shelf MT components, exploring phrase-based as well as hierarchical models. |
MT—based semantic parsing | We consider a phrase-based translation model (Koehn et al., 2003) and a hierarchical translation model (Chiang, 2005). |
MT—based semantic parsing | Rules for the phrase-based model consist of pairs of aligned source and target sequences, while hierarchical rules are SCFG productions containing at most two instances of a single nonterminal symbol. |
Related Work | The present work is also the first we are aware of which uses phrase-based rather than tree-based machine translation techniques to learn a semantic parser. |
Related Work | multilevel rules composed from smaller rules, a process similar to the one used for creating phrase tables in a phrase-based MT system. |
Results | We first compare the results for the two translation rule extraction models, phrase-based and hierarchical (“MT-phrase” and “MT-hier” respectively in Table 1).
Introduction | We define translation units as phrases in phrase-based SMT, and as translation rules in syntax-based SMT.
Introduction | Specifically, the popular distortion or lexicalized reordering models in phrase-based SMT focus only on making good local prediction (i.e. |
Introduction | Even though the experimental results reported in this paper employ SCFG-based SMT systems, we would like to point out that our models are applicable to other systems, including phrase-based SMT systems.
Maximal Orientation Span | We use a hierarchical phrase-based translation system as a case in point, but the merit is generalizable to other systems.
Maximal Orientation Span | Additionally, this illustration also shows a case where MOS acts as a cross-boundary context which effectively relaxes the context-free assumption of hierarchical phrase-based formalism. |
Related Work | Our TNO model is closely related to the Unigram Orientation Model (UOM) (Tillmann, 2004), which is the de facto reordering model of phrase-based SMT (Koehn et al., 2007).
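The orientation classification underlying such unigram orientation models can be sketched as a toy function. Note the hedge: Tillmann's original model distinguishes only left and right (monotone/swap) orientations; the three-way monotone/swap/discontinuous labeling below follows the Moses-style lexicalized reordering convention instead.

```python
def orientation(prev_span, cur_span):
    """Classify the orientation of the current phrase's source span
    relative to the previous phrase's source span (inclusive spans)."""
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "monotone"       # current phrase directly follows the previous
    if cur_end == prev_start - 1:
        return "swap"           # current phrase directly precedes the previous
    return "discontinuous"      # any longer-range jump
```

The MOS idea discussed in this section can be read as asking the same orientation question with respect to larger, multi-block units rather than single adjacent phrases.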
Related Work | Our MOS concept is also closely related to the hierarchical reordering model (Galley and Manning, 2008) in phrase-based decoding, which computes the orientation o of b with respect to a multi-block unit that may go beyond b′.
Corpus Data and Baseline SMT | Our phrase-based decoder is similar to Moses (Koehn et al., 2007) and uses the phrase pairs and target LM to perform beam search stack decoding based on a standard log-linear model, the parameters of which were tuned with MERT (Och, 2003) on a held-out development set (3,534 sentence pairs, 45K words) using BLEU as the tuning metric. |
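The log-linear model referred to above scores each hypothesis as a weighted sum of feature values, score(e, f) = Σᵢ λᵢ·hᵢ(e, f), with the weights λᵢ tuned by MERT against BLEU. The following is a minimal sketch; the feature names and values are illustrative assumptions, not Moses' actual configuration keys.

```python
def loglinear_score(features, weights):
    """Weighted sum of (log-domain) feature values for one hypothesis."""
    return sum(weights[name] * value for name, value in features.items())

# Illustrative hypothesis features (log probabilities / penalties) and weights.
hyp = {"lm": -12.4, "phrase_tm": -8.1, "word_penalty": -5.0}
lam = {"lm": 0.5, "phrase_tm": 0.3, "word_penalty": 0.1}
```

During beam search, the decoder keeps the highest-scoring partial hypotheses per stack under this score, so the tuned weights directly shape the search as well as the final ranking.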
Experimental Setup and Results | The baseline English-to-Iraqi phrase-based SMT system was built as described in Section 3. |
Introduction | In this paper, we describe a novel topic-based adaptation technique for phrase-based statistical machine translation (SMT) of spoken conversations. |
Introduction | Translation phrase pairs that originate in training conversations whose topic distribution is similar to that of the current conversation are given preference through a single similarity feature, which augments the standard phrase-based SMT log-linear model. |
Introduction | With this approach, we demonstrate significant improvements over a baseline phrase-based SMT system as measured by BLEU, TER and NIST scores on an English-to-Iraqi CSLT task. |
Experiments | Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model (LM), a phrase translation model and a word-based lexicon model. |
Introduction | Within the phrase-based SMT framework, there are mainly three stages where improved reordering could be integrated. In preprocessing, the source sentence is reordered by heuristics so that the word order of the source and target sentences is similar.
Introduction | In this way, syntax information can be incorporated into phrase-based SMT systems. |
Translation System Overview | In this paper, the phrase-based machine translation system |
Abstract | Using the resulting alignments for phrase-based translation systems offers no clear insights w.r.t. |
Conclusion | Table 2: Evaluation of phrase-based translation from German to English with the obtained alignments (for 100,000 sentence pairs).
See e. g. the author’s course notes (in German), currently | In addition we evaluate the effect on phrase-based translation on one of the tasks. |
See e. g. the author’s course notes (in German), currently | We also check the effect of the various alignments (all produced by RegAligner) on translation performance for phrase-based translation, arbitrarily choosing translation from German to English.
Introduction | Long-distance dependencies are common, and this creates problems for both RBMT and SMT systems (especially for phrase-based ones). |