Index of papers in Proc. ACL that mention
  • phrase-based
Goto, Isao and Utiyama, Masao and Sumita, Eiichiro and Tamura, Akihiro and Kurohashi, Sadao
Abstract
This paper proposes new distortion models for phrase-based SMT.
Distortion Model for Phrase-Based SMT
A Moses-style phrase-based SMT generates target hypotheses sequentially from left to right.
Introduction
To address this problem, there has been a lot of research done into word reordering: lexical reordering model (Tillmann, 2004), which is one of the distortion models, reordering constraint (Zens et al., 2004), pre-ordering (Xia and McCord, 2004), hierarchical phrase-based SMT (Chiang, 2007), and syntax-based SMT (Yamada and Knight, 2001).
Introduction
Phrase-based SMT (Koehn et al., 2007) is a widely used SMT method that does not use a parser.
Introduction
Phrase-based SMT mainly estimates word reordering using distortion models.
phrase-based is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Feng, Yansong and Lapata, Mirella
Abstractive Caption Generation
Phrase-based Model The model outlined in equation (8) will generate captions with function words.
Abstractive Caption Generation
Search To generate a caption it is necessary to find the sequence of words that maximizes P(w1,w2, ...,wn) for the word-based model (equation (8)) and P(p1, p2, ..., pm) for the phrase-based model (equation (15)).
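The search described in this excerpt, finding the word or phrase sequence that maximizes the model probability, is typically approximated with beam search. The sketch below is illustrative only; the vocabulary, scoring function, and beam size are toy assumptions, not taken from the paper:

```python
import heapq

def beam_search(vocab, score_fn, length, beam_size=3):
    """Beam search for the highest-scoring fixed-length sequence:
    expand every partial hypothesis with every candidate word, then
    keep only the top `beam_size` by cumulative log-probability."""
    beams = [(0.0, [])]  # (cumulative log-prob, partial sequence)
    for _ in range(length):
        expanded = [(lp + score_fn(seq, w), seq + [w])
                    for lp, seq in beams for w in vocab]
        beams = heapq.nlargest(beam_size, expanded, key=lambda b: b[0])
    return max(beams, key=lambda b: b[0])

# Toy unigram log-probabilities standing in for P(w1, w2, ..., wn):
logp = {"a": -1.0, "b": -2.0}
best_score, best_seq = beam_search(["a", "b"],
                                   lambda seq, w: logp[w], length=2)
```

With a richer scoring function conditioned on the partial sequence, the same loop covers both the word-based and the phrase-based search.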
Experimental Setup
Documents and captions were parsed with the Stanford parser (Klein and Manning, 2003) in order to obtain dependencies for the phrase-based abstractive model.
Experimental Setup
We tuned the caption length parameter on the development set using a range of [5, 14] tokens for the word-based model and [2, 5] phrases for the phrase-based model.
Experimental Setup
For the phrase-based model, we also experimented with reducing the search scope, either by considering only the n most similar sentences to the keywords (range [2,10]), or simply the single most similar sentence and its neighbors (range [2, 5]).
Results
different from phrase-based abstractive system.
Results
Table 4: Captions written by humans (G) and generated by extractive (KL), word-based abstractive (AW), and phrase-based abstractive (AP) systems.
Results
It is significantly worse than the phrase-based abstractive system (p < 0.01), the extractive system (p < 0.01), and the gold standard (p < 0.01).
phrase-based is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Yang, Nan and Li, Mu and Zhang, Dongdong and Yu, Nenghai
Abstract
We evaluated our approach on large-scale Japanese-English and English-Japanese machine translation tasks, and show that it can significantly outperform the baseline phrase-based SMT system.
Experiments
We use a BTG phrase-based system with a MaxEnt-based lexicalized reordering model (Wu, 1997; Xiong et al., 2006) as our baseline system for
Integration into SMT system
There are two ways to integrate the ranking reordering model into a phrase-based SMT system: the pre-reorder method, and the decoding time constraint method.
Integration into SMT system
Reordered sentences can go through the normal pipeline of a phrase-based decoder.
Integration into SMT system
The above three penalties are added as additional features into the log-linear model of the phrase-based system.
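Adding penalties as features into a log-linear model, as this excerpt describes, amounts to taking a weighted sum of feature values. The following is a minimal sketch; the feature names and weight values are hypothetical:

```python
def loglinear_score(features, weights):
    """Log-linear model score: a weighted sum of feature values
    (log-domain translation/language-model scores, penalties, etc.)."""
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical feature values and weights for one hypothesis:
features = {"tm": -2.3, "lm": -4.1, "distortion": -1.0, "word_penalty": -5.0}
weights = {"tm": 1.0, "lm": 0.6, "distortion": 0.3, "word_penalty": -0.1}
score = loglinear_score(features, weights)  # approximately -4.56
```

New penalties simply become additional entries in `features`, with their weights tuned alongside the existing ones.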
Introduction
In phrase-based models (Och, 2002; Koehn et al., 2003), phrase is introduced to serve as the fundamental translation element and deal with local reordering, while a distance based distortion model is used to coarsely depict the exponentially decayed word movement probabilities in language translation.
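The exponentially decayed, distance-based distortion model mentioned above can be sketched in a few lines; the decay parameter `alpha` and the function name are illustrative assumptions, not from the cited papers:

```python
def distortion_cost(prev_end, curr_start, alpha=0.9):
    """Distance-based distortion: the cost decays exponentially with
    the size of the jump between the source position just after the
    previously translated phrase and the start of the current one."""
    jump = abs(curr_start - (prev_end + 1))
    return alpha ** jump

monotone = distortion_cost(2, 3)             # no jump: cost 1.0
skipped = distortion_cost(2, 6, alpha=0.5)   # jump of 3: 0.5 ** 3
```

This coarse penalty is exactly why long-distance reordering is hard for plain phrase-based systems: every large jump is punished regardless of whether it is linguistically motivated.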
Introduction
Long-distance word reordering between language pairs with substantial word order difference, such as Japanese with Subject-Object-Verb (SOV) structure and English with Subject-Verb-Object (SVO) structure, is generally viewed beyond the scope of the phrase-based systems discussed above, because of either distortion limits or lack of discriminative features for modeling.
Introduction
This is usually done in a preprocessing step, and then followed by a standard phrase-based SMT system that takes the reordered source sentence as input to finish the translation.
phrase-based is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Liu, Zhanyi and Wang, Haifeng and Wu, Hua and Li, Sheng
Abstract
We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT.
Abstract
As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
Experiments on Phrase-Based SMT
Moses (Koehn et al., 2007) is used as the baseline phrase-based SMT system.
Experiments on Phrase-Based SMT
6.2 Effect of improved word alignment on phrase-based SMT
Improving Phrase Table
Phrase-based SMT system automatically extracts bilingual phrase pairs from the word aligned bilingual corpus.
Improving Phrase Table
These collocation probabilities are incorporated into the phrase-based SMT system as features.
Introduction
In phrase-based SMT (Koehn et al., 2003), the phrase boundary is usually determined based on the bidirectional word alignments.
Introduction
Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve phrase table for phrase-based SMT.
Introduction
Then the phrase collocation probabilities are used as additional features in phrase-based SMT systems.
phrase-based is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Cohn, Trevor and Haffari, Gholamreza
Abstract
Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora.
Abstract
This paper presents a novel method for inducing phrase-based translation units directly from parallel data, which we frame as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior.
Analysis
We have presented a novel method for learning a phrase-based model of translation directly from parallel data which we have framed as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior.
Experiments
As a baseline, we train a phrase-based model using the moses toolkit based on the word alignments obtained using GIZA++ in both directions and symmetrized using the grow-diag-final-and heuristic (Koehn et al., 2003).
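The grow-diag-final-and heuristic mentioned here symmetrizes two directed word alignments. The sketch below is a simplified version, assuming each alignment is a set of (source, target) index pairs; the real Moses implementation differs in iteration details:

```python
def grow_diag_final_and(f2e, e2f):
    """Simplified grow-diag-final-and symmetrization of two directed
    word alignments, each given as a set of (src, tgt) index pairs."""
    alignment = set(f2e) & set(e2f)   # start from the intersection
    union = set(f2e) | set(e2f)
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                 (0, 1), (1, -1), (1, 0), (1, 1)]

    def src_unaligned(s): return all(a[0] != s for a in alignment)
    def tgt_unaligned(t): return all(a[1] != t for a in alignment)

    # grow-diag: add union points neighboring an aligned point,
    # provided the new point links a so-far-unaligned word
    added = True
    while added:
        added = False
        for s, t in sorted(alignment):
            for ds, dt in neighbors:
                p = (s + ds, t + dt)
                if (p in union and p not in alignment
                        and (src_unaligned(p[0]) or tgt_unaligned(p[1]))):
                    alignment.add(p)
                    added = True
    # final-and: add remaining union points whose words are both unaligned
    for s, t in sorted(union):
        if src_unaligned(s) and tgt_unaligned(t):
            alignment.add((s, t))
    return alignment

f2e = {(0, 0), (1, 1)}   # source-to-target alignment links
e2f = {(0, 0), (1, 2)}   # target-to-source alignment links
symmetrized = grow_diag_final_and(f2e, e2f)
```

Starting from the reliable intersection and growing toward the union trades precision for recall in a controlled way, which is why this heuristic became the standard preprocessing step for phrase extraction.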
Introduction
The phrase-based approach (Koehn et al., 2003) to machine translation (MT) has transformed MT from a narrow research topic into a truly useful technology to end users.
Introduction
Word-based translation models (Brown et al., 1993) remain central to phrase-based model training, where they are used to infer word-level alignments from sentence aligned parallel data, from
Introduction
Firstly, many phrase-based phenomena which do not decompose into word translations (e.g., idioms) will be missed, as the underlying word-based alignment model is unlikely to propose the correct alignments.
Related Work
A number of other approaches have been developed for learning phrase-based models from bilingual data, starting with Marcu and Wong (2002) who developed an extension to IBM model 1 to handle multi-word units.
phrase-based is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Sun, Xu and Gao, Jianfeng and Micol, Daniel and Quirk, Chris
A Phrase-Based Error Model
The goal of the phrase-based error model is to transform a correctly spelled query C into a misspelled query Q.
A Phrase-Based Error Model
If we assume a uniform probability over segmentations, then the phrase-based probability can be defined as:
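The equation itself appears to have been lost during extraction. A plausible reconstruction, following the standard phrase-based formulation and assuming S ranges over consistent phrase segmentations of the query pair, is:

```latex
P(Q \mid C) \;=\; \sum_{S} P(S \mid C)\, P(Q \mid C, S)
          \;\propto\; \sum_{S} \prod_{i=1}^{|S|} P(q_i \mid c_i)
```

where the proportionality follows from the uniform segmentation probability stated in the excerpt, and each factor is the transformation probability of one correctly spelled phrase into its misspelled counterpart.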
Abstract
Then, a phrase-based error model that accounts for the transformation probability between multi-term phrases is trained and integrated into a query speller system.
Abstract
Results show that the system using the phrase-based error model outperforms significantly its baseline systems.
Introduction
Among these models, the most effective one is a phrase-based error model that captures the probability of transforming one multi-term phrase into another multi-term phrase.
Introduction
Compared to traditional error models that account for transformation probabilities between single characters (Kernighan et al., 1990) or sub-word strings (Brill and Moore, 2000), the phrase-based model is more powerful in that it captures some contextual information by retaining inter-term dependencies.
Introduction
In particular, the speller system incorporating a phrase-based error model significantly outperforms its baseline systems.
Related Work
To this end, inspired by the phrase-based statistical machine translation (SMT) systems (Koehn et al., 2003; Och and Ney, 2004), we propose a phrase-based error model where we assume that query spelling correction is performed at the phrase level.
Related Work
In what follows, before presenting the phrase-based error model, we will first describe the clickthrough data and the query speller system we used in this study.
The Baseline Speller System
Figure 2: Example demonstrating the generative procedure behind the phrase-based error model.
phrase-based is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Aw, Aiti and Li, Haizhou
Abstract
Previous efforts add syntactic constraints to phrase-based translation by directly rewarding/punishing a hypothesis whenever it matches/violates source-side constituents.
Introduction
The phrase-based approach is widely adopted in statistical machine translation (SMT).
Introduction
In such a process, original phrase-based decoding (Koehn et al., 2003) does not take advantage of any linguistic analysis, which, however, is broadly used in rule-based approaches.
Introduction
Since it is not linguistically motivated, original phrase-based decoding might produce ungrammatical or even wrong translations.
The Syntax-Driven Bracketing Model 3.1 The Model
3.3 The Integration of the SDB Model into Phrase-Based SMT
The Syntax-Driven Bracketing Model 3.1 The Model
We integrate the SDB model into phrase-based SMT to help the decoder perform syntax-driven phrase translation.
The Syntax-Driven Bracketing Model 3.1 The Model
In this paper, we implement the SDB model in a state-of-the-art phrase-based system which adapts a binary bracketing transduction grammar (BTG) (Wu, 1997) to phrase translation and reordering, described in (Xiong et al., 2006).
phrase-based is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Razmara, Majid and Foster, George and Sankaran, Baskaran and Sarkar, Anoop
Conclusion & Future Work
We will also add the capability of supporting syntax-based ensemble decoding and experiment with how a phrase-based system can benefit from syntax information present in a syntax-aware MT system.
Ensemble Decoding
The current implementation is able to combine hierarchical phrase-based systems (Chiang, 2005) as well as phrase-based translation systems (Koehn et al., 2003).
Ensemble Decoding
phrase-based, hierarchical phrase-based, and/or syntax-based systems.
Experiments & Results 4.1 Experimental Setup
For the mixture baselines, we used a standard one-pass phrase-based system (Koehn et al., 2003), Portage (Sadat et al., 2005), with the following 7 features: relative-frequency and lexical translation model (TM) probabilities in both directions; word-displacement distortion model; language model (LM) and word count.
Experiments & Results 4.1 Experimental Setup
For ensemble decoding, we modified an in-house implementation of hierarchical phrase-based system, Kriya (Sankaran et al., 2012) which uses the same features mentioned in (Chiang, 2005): forward and backward relative-frequency and lexical TM probabilities; LM; word, phrase and glue-rules penalty.
Experiments & Results 4.1 Experimental Setup
The first group are the baseline results on the phrase-based system discussed in Section 2 and the second group are those of our hierarchical MT system.
Introduction
We have modified Kriya (Sankaran et al., 2012), an in-house implementation of hierarchical phrase-based translation system (Chiang, 2005), to implement ensemble decoding using multiple translation models.
Related Work 5.1 Domain Adaptation
Among these approaches are sentence-based, phrase-based and word-based output combination methods.
Related Work 5.1 Domain Adaptation
Firstly, unlike the multi-table support of Moses which only supports phrase-based translation table combination, our approach supports ensembles of both hierarchical and phrase-based systems.
phrase-based is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Song, Young-In and Lee, Jung-Tae and Rim, Hae-Chang
Abstract
We also present useful features that reflect the compositionality and discriminative power of a phrase and its constituent words for optimizing the weights of phrase use in phrase-based retrieval models.
Introduction
Our approach to phrase-based retrieval is motivated from the following linguistic intuitions: a) phrases have relatively different degrees of significance, and b) the influence of a phrase should be differentiated based on the phrase’s constituents in retrieval models.
Previous Work
One of the earliest works on phrase-based retrieval was done by (Fagan, 1987).
Previous Work
In many cases, early research on phrase-based retrieval focused only on extracting phrases, without considering how to devise a retrieval model that effectively considers both words and phrases in ranking.
Previous Work
While a phrase-based approach selectively incorporates potentially useful relations between words, the probabilistic approaches are forced to estimate parameters for all possible combinations of words in text.
Proposed Method
In this section, we present a phrase-based retrieval framework that utilizes both words and phrases effectively in ranking.
Proposed Method
3.1 Basic Phrase-based Retrieval Model
Proposed Method
We start out by presenting a simple phrase-based language modeling retrieval model that assumes uniform contribution of words and phrases.
phrase-based is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Setiawan, Hendra and Kan, Min Yen and Li, Haizhou and Resnik, Philip
Abstract
Hierarchical phrase-based models are attractive because they provide a consistent framework within which to characterize both local and long-distance reorderings, but they also make it difficult to distinguish many implausible reorderings from those that are linguistically plausible.
Abstract
Rather than appealing to annotation-driven syntactic modeling, we address this problem by observing the influential role of function words in determining syntactic structure, and introducing soft constraints on function word relationships as part of a standard log-linear hierarchical phrase-based model.
Hierarchical Phrase-based System
Formally, a hierarchical phrase-based SMT system is based on a weighted synchronous context free grammar (SCFG) with one type of nonterminal symbol.
Hierarchical Phrase-based System
Synchronous rules in hierarchical phrase-based models take the following form:
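The rule form itself is missing from the excerpt; in Chiang's (2007) notation, such synchronous rules are written as:

```latex
X \rightarrow \langle \gamma, \alpha, \sim \rangle
```

where γ and α are strings of terminals and nonterminals over the source and target vocabularies respectively, and ∼ is a one-to-one correspondence between the nonterminal occurrences in γ and α.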
Hierarchical Phrase-based System
Translation of a source sentence e using hierarchical phrase-based models is formulated as a search for the most probable derivation D* whose source side is equal to e:
Introduction
Hierarchical phrase-based models (Chiang, 2005; Chiang, 2007) offer a number of attractive benefits in statistical machine translation (SMT), while maintaining the strengths of phrase-based systems (Koehn et al., 2003).
Introduction
To model such a reordering, a hierarchical phrase-based system demands no additional parameters, since long and short distance reorderings are modeled identically using synchronous context free grammar (SCFG) rules.
Introduction
Interestingly, hierarchical phrase-based models provide this benefit without making any linguistic commitments beyond the structure of the model.
Overgeneration and Topological Ordering of Function Words
The problem may be less severe in hierarchical phrase-based MT than in BTG, since lexical items on the rules’ right hand sides often limit the span of nonterminals.
phrase-based is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Feng, Yang and Cohn, Trevor
Abstract
However, phrase-based approaches are much less able to model sentence-level effects between different phrase-pairs.
Experiments
However in this paper we limit our focus to inducing word alignments, i.e., by using the model to infer alignments which are then used in a standard phrase-based translation pipeline.
Experiments
We leave full decoding for later work, which we anticipate would further improve performance by exploiting gapping phrases and other phenomena that implicitly form part of our model but are not represented in the phrase-based decoder.
Introduction
Recent years have witnessed burgeoning development of statistical machine translation research, notably phrase-based (Koehn et al., 2003) and syntax-based approaches (Chiang, 2005; Galley et al., 2006; Liu et al., 2006).
Introduction
These approaches model sentence translation as a sequence of simple translation decisions, such as the application of a phrase translation in phrase-based methods or a grammar rule in syntax-based approaches.
Introduction
This conflicts with the intuition behind phrase-based MT, namely that translation decisions should be dependent on context.
Model
We consider a process in which the target string is generated using a left-to-right order, similar to the decoding strategy used by phrase-based machine translation systems (Koehn et al., 2003).
Model
In contrast to phrase-based models, we use words as our basic translation unit, rather than multi-word phrases.
Related Work
This idea has been developed explicitly in a number of previous approaches, in grammar based (Chiang, 2005) and phrase-based systems (Galley and Manning, 2010).
phrase-based is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Liu, Yang and Mi, Haitao and Feng, Yang and Liu, Qun
Background
In phrase-based models, a decision can be translating a source phrase into a target phrase or reordering the target phrases.
Experiments
first model was the hierarchical phrase-based model (Chiang, 2005; Chiang, 2007).
Introduction
We evaluated our joint decoder that integrated a hierarchical phrase-based model (Chiang, 2005; Chiang, 2007) and a tree-to-string model (Liu et al., 2006) on the NIST 2005 Chinese-English test-set.
Introduction
Some researchers prefer to say “phrase-based approaches” or “phrase-based systems”.
Introduction
On the other hand, other authors (e.g., (Och and Ney, 2004; Koehn et al., 2003; Chiang, 2007)) do use the expression “phrase-based models”.
Joint Decoding
Figure 2(a) demonstrates a translation hypergraph for one model, for example, a hierarchical phrase-based model.
Joint Decoding
Although phrase-based decoders usually produce translations from left to right, they can adopt bottom-up decoding in principle.
Joint Decoding
(2006) propose left-to-right target generation for hierarchical phrase-based translation.
phrase-based is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Liu, Yang
Abstract
We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation.
Abstract
As our approach combines the merits of phrase-based and string-to-dependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.
Introduction
Modern statistical machine translation approaches can be roughly divided into two broad categories: phrase-based and syntax-based.
Introduction
Phrase-based approaches treat phrase, which is usually a sequence of consecutive words, as the basic unit of translation (Koehn et al., 2003; Och and Ney, 2004).
Introduction
As phrases are capable of memorizing local context, phrase-based approaches excel at handling local word selection and reordering.
phrase-based is mentioned in 32 sentences in this paper.
Topics mentioned in this paper:
Nguyen, ThuyLinh and Vogel, Stephan
Abstract
Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model.
Abstract
Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder.
Abstract
The work consists of two parts: 1) for each Hiero translation derivation, find its corresponding discontinuous phrase-based path.
Introduction
Phrase-based and tree-based translation models are the two main streams in state-of-the-art machine translation.
Introduction
Yet, tree-based translation often underperforms phrase-based translation in language pairs with short range reordering such as Arabic-English translation (Zollmann et al., 2008; Birch et al., 2009).
Introduction
(2003) for our phrase-based system and Chiang (2005) for our Hiero system.
phrase-based is mentioned in 69 sentences in this paper.
Topics mentioned in this paper:
Kumar, Shankar and Macherey, Wolfgang and Dyer, Chris and Och, Franz
Discussion
We believe that our efficient algorithms will make them more widely applicable in both SCFG-based and phrase-based MT systems.
Experiments
Our phrase-based statistical MT system is similar to the alignment template system described in (Och and Ney, 2004; Tromble et al., 2008).
Experiments
We also train two SCFG-based MT systems: a hierarchical phrase-based SMT (Chiang, 2007) system and a syntax augmented machine translation (SAMT) system using the approach described in Zollmann and Venugopal (2006).
Experiments
Both systems are built on top of our phrase-based systems.
Introduction
These two techniques were originally developed for N-best lists of translation hypotheses and recently extended to translation lattices (Macherey et al., 2008; Tromble et al., 2008) generated by a phrase-based SMT system (Och and Ney, 2004).
Introduction
SMT systems based on synchronous context free grammars (SCFG) (Chiang, 2007; Zollmann and Venugopal, 2006; Galley et al., 2006) have recently been shown to give competitive performance relative to phrase-based SMT.
Translation Hypergraphs
A translation lattice compactly encodes a large number of hypotheses produced by a phrase-based SMT system.
phrase-based is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Xiao, Tong and Zhu, Jingbo and Zhu, Muhua and Wang, Huizhen
Abstract
We evaluate our method on Chinese-to-English Machine Translation (MT) tasks in three baseline systems, including a phrase-based system, a hierarchical phrase-based system and a syntax-based system.
Background
The first SMT system is a phrase-based system with two reordering models including the maximum entropy-based lexicalized reordering model proposed by Xiong et al.
Background
The second SMT system is an in-house reimplementation of the Hiero system which is based on the hierarchical phrase-based model proposed by Chiang (2005).
Background
After 5, 7 and 8 iterations, relatively stable improvements are achieved by the phrase-based system, the Hiero system and the syntax-based system, respectively.
Introduction
Many SMT frameworks have been developed, including phrase-based SMT (Koehn et al., 2003), hierarchical phrase-based SMT (Chiang, 2005), syntax-based SMT (Eisner, 2003; Ding and Palmer, 2005; Liu et al., 2006; Galley et al., 2006; Cowan et al., 2006), etc.
Introduction
Our experiments are conducted on Chinese-to-English translation in three state-of-the-art SMT systems, including a phrase-based system, a hierarchical phrase-based system and a syntax-based
phrase-based is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Galley, Michel and Manning, Christopher D.
Abstract
This paper applies MST parsing to MT, and describes how it can be integrated into a phrase-based decoder to compute dependency language model scores.
Abstract
Our results show that augmenting a state-of-the-art phrase-based system with this dependency language model leads to significant improvements in TER (0.92%) and BLEU (0.45%) scores on five NIST Chinese-English evaluation test sets.
Dependency parsing for machine translation
In this section, we review dependency parsing formulated as a maximum spanning tree problem (McDonald et al., 2005b), which can be solved in quadratic time, and then present its adaptation and novel application to phrase-based decoding.
Dependency parsing for machine translation
We now formalize weighted non-projective dependency parsing similarly to (McDonald et al., 2005b) and then describe a modified and more efficient version that can be integrated into a phrase-based decoder.
Introduction
Hierarchical approaches to machine translation have proven increasingly successful in recent years (Chiang, 2005; Marcu et al., 2006; Shen et al., 2008), and often outperform phrase-based systems (Och and Ney, 2004; Koehn et al., 2003) on target-language fluency and adequacy; however, their benefits generally come with high computational costs, particularly when chart parsing, such as CKY, is integrated with language models of high orders (Wu, 1996).
Introduction
In comparison, phrase-based decoding can run in linear time if a distortion limit is imposed.
Introduction
Since exact MT decoding is NP complete (Knight, 1999), there is no exact search algorithm for either phrase-based or syntactic MT that runs in polynomial time (unless P = NP).
phrase-based is mentioned in 22 sentences in this paper.
Topics mentioned in this paper:
Zhang, Min and Jiang, Hongfei and Aw, Aiti and Li, Haizhou and Tan, Chew Lim and Li, Sheng
Abstract
The model leverages the strengths of both phrase-based and linguistically syntax-based methods.
Introduction
Phrase-based modeling method (Koehn et al., 2003; Och and Ney, 2004a) is a simple, but powerful mechanism to machine translation since it can model local reorderings and translations of multi-word expressions well.
Introduction
It is designed to combine the strengths of phrase-based and syntax-based methods.
Introduction
Experiment results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
Related Work
However, most of them fail to utilize well the non-syntactic phrases that are proven useful in the phrase-based methods (Koehn et al., 2003)
Related Work
Chiang (2005)’s hierarchical phrase-based model achieves significant performance improvement.
Related Work
In the last two years, many research efforts were devoted to integrating the strengths of phrase-based and syntax-based methods.
Tree Sequence Alignment Model
We use seven basic features that are analogous to the commonly used features in phrase-based systems (Koehn, 2004): 1) bidirectional rule mapping probabilities; 2) bidirectional lexical rule translation probabilities; 3) the target language model; 4) the number of rules used and 5) the number of target words.
phrase-based is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Yeniterzi, Reyyan and Oflazer, Kemal
Abstract
We present a novel scheme to apply factored phrase-based SMT to a language pair with very disparate morphological structures.
Conclusions
We have presented a novel way to incorporate source syntactic structure in English-to-Turkish phrase-based machine translation by parsing the source sentences and then encoding many local and nonlocal source syntactic structures as additional complex tag factors.
Experimental Setup and Results
We evaluated the impact of the transformations in factored phrase-based SMT with an English-Turkish data set which consists of 52712 parallel sentences.
Experimental Setup and Results
As a baseline system, we built a standard phrase-based system, using the surface forms of the words without any transformations, and with a 3-gram LM in the decoder.
Experimental Setup and Results
Factored phrase-based SMT allows the use of multiple language models for the target side, for different factors during decoding.
Introduction
Once these were identified as separate tokens, they were then used as “words” in a standard phrase-based framework (Koehn et al., 2003).
Introduction
This facilitates the use of factored phrase-based translation that was not previously applicable due to the morphological complexity on the target side and mismatch between source and target morphologies.
Introduction
We assume that the reader is familiar with the basics of phrase-based statistical machine translation (Koehn et al., 2003) and factored statistical machine translation (Koehn and Hoang, 2007).
Related Work
Koehn (2005) applied standard phrase-based SMT to Finnish using the Europarl corpus and reported that translation to Finnish had the worst BLEU scores.
Related Work
Yang and Kirchhoff (2006) have used phrase-based backoff models to translate unknown words by morphologically decomposing the unknown source words.
Related Work
They used both CCG supertags and LTAG supertags in Arabic-to-English phrase-based translation and have reported about 6% relative improvement in BLEU scores.
phrase-based is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Zhang, Jiajun and Zong, Chengqing
Abstract
In this paper, we take a step forward and propose a simple but effective method to induce a phrase-based model from the monolingual corpora given an automatically-induced translation lexicon or a manually-edited translation dictionary.
Experiments
We construct two kinds of phrase-based models using Moses (Koehn et al., 2007): one uses out-of-domain data and the other uses in-domain data.
Introduction
Novel translation models, such as phrase-based models (Koehn et al., 2007), hierarchical phrase-based models (Chiang, 2007) and linguistically syntax-based models (Liu et al., 2006; Huang et al., 2006; Galley, 2006; Zhang et al., 2008; Chiang, 2010; Zhang et al., 2011; Zhai et al., 2011, 2012) have been proposed and achieved higher and higher translation performance.
Introduction
Finally, they used the learned translation model directly to translate unseen data (Ravi and Knight, 2011; Nuhn et al., 2012) or incorporated the learned bilingual lexicon as a new in-domain translation resource into the phrase-based model which is trained with out-of-domain data to improve the domain adaptation performance in machine translation (Dou and Knight, 2012).
Introduction
level translation rules and learn a phrase-based model from the monolingual corpora.
Phrase Pair Refinement and Parameterization
It is well known that in phrase-based SMT there are four translation probabilities and a reordering probability for each phrase pair.
Phrase Pair Refinement and Parameterization
The translation probabilities in the traditional phrase-based SMT include bidirectional phrase translation probabilities and bidirectional lexical weights.
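The bidirectional lexical weights mentioned here are commonly computed in the style of Koehn et al. (2003): each target word's probability is averaged over the source words it is aligned to. The sketch below is illustrative, with a hypothetical toy phrase pair and probability table:

```python
def lexical_weight(tgt, src, align, w):
    """Lexical weight of a phrase pair: for each target word, average
    the lexical translation probabilities w of the source words it is
    aligned to; unaligned target words fall back to a NULL entry."""
    prod = 1.0
    for i, t in enumerate(tgt):
        links = [j for (ti, j) in align if ti == i]
        if links:
            prod *= sum(w[(t, src[j])] for j in links) / len(links)
        else:
            prod *= w.get((t, None), 1e-9)
    return prod

# Hypothetical one-to-one aligned phrase pair:
w = {("the", "das"): 0.5, ("house", "Haus"): 0.8}
pw = lexical_weight(["the", "house"], ["das", "Haus"], {(0, 0), (1, 1)}, w)
```

Running the same computation with source and target swapped (and the inverse probability table) gives the other direction of the bidirectional lexical weights.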
Probabilistic Bilingual Lexicon Acquisition
In order to assign probabilities to each entry, we apply the Corpus Translation Probability which is used in (Wu et al., 2008): given in-domain source language monolingual data, we translate this data with the phrase-based model trained on the out-of-domain News data, the in-domain lexicon and the in-domain target language monolingual data (for language model estimation).
Related Work
For the target-side monolingual data, they just use it to train a language model, and for the source-side monolingual data, they employ a baseline (word-based SMT or phrase-based SMT trained with small-scale bitext) to first translate the source sentences, combining the source sentence and its target translation as a bilingual sentence pair, and then train a new phrase-based SMT with these pseudo sentence pairs.
phrase-based is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Xiao, Tong and Zhu, Jingbo and Zhang, Chunliang
A Skeleton-based Approach to MT 2.1 Skeleton Identification
As is standard in SMT, we further assume that 1) the translation process can be decomposed into a derivation of phrase-pairs (for phrase-based models) or translation rules (for syntax-based models); 2) and a linear function is used to assign a model score to each derivation.
A Skeleton-based Approach to MT 2.1 Skeleton Identification
See Figure 1 for an example of applying the above model to phrase-based MT.
A Skeleton-based Approach to MT 2.1 Skeleton Identification
While we will restrict ourselves to phrase-based translation in the following description and experiments, we can choose different models/features for gskel(d) and gfull(d).
Abstract
We apply our approach to a state-of-the-art phrase-based system and demonstrate very promising BLEU improvements and TER reductions on the NIST Chinese-English MT evaluation data.
Introduction
The simplest of these is the phrase-based approach (Och et al., 1999; Koehn et al., 2003) which employs a global model to process any sub-strings of the input sentence.
Introduction
However, these approaches suffer from the same problem as the phrase-based counterpart and use a single global model to handle different translation units, regardless of whether they come from the skeleton of the input tree/sentence or from other, less important substructures.
Introduction
0 We apply the proposed model to Chinese-English phrase-based MT and demonstrate promising BLEU improvements and TER reductions on the NIST evaluation data.
phrase-based is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Schwartz, Lane and Callison-Burch, Chris and Schuler, William and Wu, Stephen
Abstract
This paper describes a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing.
Abstract
This requirement makes it difficult to incorporate them into phrase-based translation, which generates partial hypothesized translations from left-to-right.
Abstract
Incremental syntactic language models score sentences in a similar left-to-right fashion, and are therefore a good mechanism for incorporating syntax into phrase-based translation.
Introduction
Modern phrase-based translation using large scale n-gram language models generally performs well in terms of lexical choice, but still often produces ungrammatical output.
Introduction
Bottom-up and top-down parsers typically require a completed string as input; this requirement makes it difficult to incorporate these parsers into phrase-based translation, which generates hypothesized translations incrementally, from left-to-right.1 As a workaround, parsers can rerank the translated output of translation systems (Och et al., 2004).
Introduction
We observe that incremental parsers, used as structured language models, provide an appropriate algorithmic match to incremental phrase-based decoding.
Related Work
Neither phrase-based (Koehn et al., 2003) nor hierarchical phrase-based translation (Chiang, 2005) take explicit advantage of the syntactic structure of either source or target language.
Related Work
Early work in statistical phrase-based translation considered whether restricting translation models to use only syntactically well-formed constituents might improve translation quality (Koehn et al., 2003) but found such restrictions failed to improve translation quality.
phrase-based is mentioned in 32 sentences in this paper.
Topics mentioned in this paper:
Cherry, Colin
Abstract
Phrase-based decoding produces state-of-the-art translations with no regard for syntax.
Abstract
The resulting cohesive, phrase-based decoder is shown to produce translations that are preferred over noncohesive output in both automatic and human evaluations.
Cohesive Decoding
This section describes a modification to standard phrase-based decoding, so that the system is constrained to produce only cohesive output.
Conclusion
We have presented a definition of syntactic cohesion that is applicable to phrase-based SMT.
Conclusion
This suggests that cohesion could be a strong feature in estimating the confidence of phrase-based translations.
Experiments
We have adapted the notion of syntactic cohesion so that it is applicable to phrase-based decoding.
Experiments
The lexical reordering model provides a good comparison point as a non-syntactic, and potentially orthogonal, improvement to phrase-based movement modeling.
Experiments
This indicates that the cohesive subset is easier to translate with a phrase-based system.
Introduction
We attempt to use this strong, but imperfect, characterization of movement to assist a non-syntactic translation method: phrase-based SMT.
Introduction
Phrase-based decoding (Koehn et al., 2003) is a dominant formalism in statistical machine translation.
phrase-based is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Green, Spence and DeNero, John
A Class-based Model of Agreement
We chose a bigram model due to the aggressive recombination strategy in our phrase-based decoder.
Abstract
The model does not require bitext or phrase table annotations and can be easily implemented as a feature in many phrase-based decoders.
Discussion of Translation Results
Phrase Table Coverage In a standard phrase-based system, effective translation into a highly inflected target language requires that the phrase table contain the inflected word forms necessary to construct an output with correct agreement.
Discussion of Translation Results
This large gap between the unigram recall of the actual translation output (top) and the lexical coverage of the phrase-based model (bottom) indicates that translation performance can be improved dramatically by altering the translation model through features such as ours, without expanding the search space of the decoder.
Experiments
Experimental Setup Our decoder is based on the phrase-based approach to translation (Och and Ney, 2004) and contains various feature functions including phrase relative frequency, word-level alignment statistics, and lexicalized reordering models (Tillmann, 2004; Och et al., 2004).
Inference during Translation Decoding
3.1 Phrase-based Translation Decoding
Inference during Translation Decoding
We consider the standard phrase-based approach to MT (Och and Ney, 2004).
Introduction
Agreement relations that cross statistical phrase boundaries are not explicitly modeled in most phrase-based MT systems (Avramidis and Koehn, 2008).
Introduction
The model can be implemented using the feature APIs of popular phrase-based decoders such as Moses (Koehn et al., 2007) and Phrasal (Cer et al., 2010).
Related Work
Subotin (2011) recently extended factored translation models to hierarchical phrase-based translation and developed a discriminative model for predicting target-side morphology in English-Czech.
phrase-based is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Zaslavskiy, Mikhail and Dymetman, Marc and Cancedda, Nicola
Abstract
In this paper, we focus on the reverse mapping, showing that any phrase-based SMT decoding problem can be directly reformulated as a TSP.
Conclusion
The main contribution of this paper has been to propose a transformation for an arbitrary phrase-based SMT decoding instance into a TSP instance.
Introduction
Phrase-based systems (Koehn et al., 2003) are probably the most widespread class of Statistical Machine Translation systems, and arguably one of the most successful.
Introduction
We will see in the next section that some characteristics of beam-search make it a suboptimal choice for phrase-based decoding, and we will propose an alternative.
Introduction
This alternative is based on the observation that phrase-based decoding can be very naturally cast as a Traveling Salesman Problem (TSP), one of the best studied problems in combinatorial optimization.
Phrase-based Decoding as TSP
Successful phrase-based systems typically employ language models of order higher than two.
Related work
In (Tillmann and Ney, 2003) and (Tillmann, 2006), the authors modify a certain Dynamic Programming technique used for TSP for use with an IBM-4 word-based model and a phrase-based model respectively.
Related work
with the mainstream phrase-based SMT models, and therefore making it possible to directly apply existing TSP solvers to SMT.
The Traveling Salesman Problem and its variants
As will be shown in the next section, phrase-based SMT decoding can be directly reformulated as an AGTSP.
phrase-based is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Wuebker, Joern and Mauser, Arne and Ney, Hermann
Abstract
Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data.
Alignment
We apply our normal phrase-based decoder on the source side of the training data and constrain the translations to the corresponding target sentences from the training data.
Conclusion
We have shown that training phrase models can improve translation performance on a state-of-the-art phrase-based translation model.
Experimental Evaluation
The baseline system is a standard phrase-based SMT system with eight features: phrase translation and word lexicon probabilities in both translation directions, phrase penalty, word penalty, language model score and a simple distance-based reordering model.
Introduction
A phrase-based SMT system takes a source sentence and produces a translation by segmenting the sentence into phrases and translating those phrases separately (Koehn et al., 2003).
Introduction
We use a modified version of a phrase-based decoder to perform the forced alignment.
Related Work
For the hierarchical phrase-based approach, (Blunsom et al., 2008) present a discriminative rule model and show the difference between using only the viterbi alignment in training and using the full sum over all possible derivations.
Related Work
We also include these word lexica, as they are standard components of the phrase-based system.
Related Work
They report improvements over a phrase-based model that uses an inverse phrase model and a language model.
phrase-based is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Wang, Kun and Zong, Chengqing and Su, Keh-Yih
Abstract
Since statistical machine translation (SMT) and translation memory (TM) complement each other in matched and unmatched regions, integrated models are proposed in this paper to incorporate TM information into phrase-based SMT.
Conclusion and Future Work
Unlike the previous pipeline approaches, which directly merge TM phrases into the final translation result, we integrate TM information of each source phrase into the phrase-based SMT at decoding.
Experiments
For the phrase-based SMT system, we adopted the Moses toolkit (Koehn et al., 2007).
Experiments
conducted using the Moses phrase-based decoder (Koehn et al., 2007).
Introduction
Statistical machine translation (SMT), especially the phrase-based model (Koehn et al., 2003), has developed very fast in the last decade.
Introduction
On a Chinese—English computer technical documents TM database, our experiments have shown that the proposed Model-III improves the translation quality significantly over either the pure phrase-based SMT or the TM systems when the fuzzy match score is above 0.4.
Problem Formulation
Compared with the standard phrase-based machine translation model, the translation problem is reformulated as follows (only based on the best TM; however, it is similar for multiple TM sentences):
Problem Formulation
Formula (3) is just the typical phrase-based SMT model, and the second factor P(Mk|Lk, x) (to be specified in Section 3) is the information derived from the TM sentence pair.
Problem Formulation
Therefore, we can still keep the original phrase-based SMT model and only pay attention to how to extract
phrase-based is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Durrani, Nadir and Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut
Abstract
We obtain final BLEU scores of 19.35 (conditional probability model) and 19.00 (joint probability model) as compared to 14.30 for a baseline phrase-based system and 16.25 for a system which transliterates OOV words in the baseline system.
Evaluation
Table 4: Comparing Model-1 and Model-2 with Phrase-based Systems
Evaluation
We also used two methods to incorporate transliterations in the phrase-based system:
Evaluation
Post-process P191: All the OOV words in the phrase-based output are replaced with their top-candidate transliteration as given by our transliteration system.
Introduction
Section 4 discusses the training data, parameter optimization and the initial set of experiments that compare our two models with a baseline Hindi-Urdu phrase-based system and with two transliteration-aided phrase-based systems in terms of BLEU scores
phrase-based is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Sun, Jun and Zhang, Min and Tan, Chew Lim
Conclusions and Future Work
The experimental results show that our model outperforms the baseline models and verify the effectiveness of noncontiguous translational equivalences to noncontiguous phrase modeling in both syntax-based and phrase-based systems.
Experiments
We compare the SncTSSG based model against two baseline models: the phrase-based and the STSSG-based models.
Experiments
For the phrase-based model, we use Moses (Koehn et al., 2007) with its default settings; for the STSSG and SncTSSG based models we use our decoder Pisces by setting the following parameters: d = 4, h = 6, C = 6, l = 6, α = 50, β = 50.
Experiments
Table 3 explores the contribution of the noncontiguous translational equivalence to phrase-based models (all the rules in Table 3 has no grammar tags, but a gap <***> is allowed in the last three rows).
Introduction
Current research in statistical machine translation (SMT) mostly settles itself in the domain of either phrase-based or syntax-based.
Introduction
Between them, the phrase-based approach (Marcu and Wong, 2002; Koehn et al., 2003; Och and Ney, 2004) allows local reordering and contiguous phrase translation.
Introduction
However, it is hard for phrase-based models to learn global reorderings and to deal with noncontiguous phrases.
The Pisces decoder
Consequently, this distortional operation, like phrase-based models, is much more flexible in the order of the target constituents than the traditional syntax-based models which are limited by the syntactic structure.
phrase-based is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
He, Xiaodong and Deng, Li
Abstract
Our work is based on a phrase-based SMT system.
Abstract
Phrase-based Translation System
Abstract
The translation process of phrase-based SMT can be briefly described in three steps: segment the source sentence into a sequence of phrases, translate each
phrase-based is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hao and Fang, Licheng and Xu, Peng and Wu, Xiaoyun
Abstract
Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system).
Experiments
We compare three systems: a phrase-based system (Och and Ney, 2004), a hierarchical phrase-based system (Chiang, 2005), and our forest-to-string system with different binarization schemes.
Experiments
In the phrase-based decoder, jump width is set to 8.
Experiments
Besides standard features (Och and Ney, 2004), the phrase-based decoder also uses a Maximum Entropy phrasal reordering model (Zens and Ney, 2006).
phrase-based is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Andreas, Jacob and Vlachos, Andreas and Clark, Stephen
Experimental setup
Implementation In all experiments, we use the IBM Model 4 implementation from the GIZA++ toolkit (Och and Ney, 2000) for alignment, and the phrase-based and hierarchical models implemented in the Moses toolkit (Koehn et al., 2007) for rule extraction.
Introduction
Our contributions are as follows: We develop a semantic parser using off-the-shelf MT components, exploring phrase-based as well as hierarchical models.
MT—based semantic parsing
We consider a phrase-based translation model (Koehn et al., 2003) and a hierarchical translation model (Chiang, 2005).
MT—based semantic parsing
Rules for the phrase-based model consist of pairs of aligned source and target sequences, while hierarchical rules are SCFG productions containing at most two instances of a single nonterminal symbol.
Related Work
The present work is also the first we are aware of which uses phrase-based rather than tree-based machine translation techniques to learn a semantic parser.
Related Work
multilevel rules composed from smaller rules, a process similar to the one used for creating phrase tables in a phrase-based MT system.
Results
We first compare the results for the two translation rule extraction models, phrase-based and hierarchical (“MT-phrase” and “MT-hier” respectively in Table 1).
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Li, Haizhou
Abstract
The two models are integrated into a state-of-the-art phrase-based machine translation system and evaluated on Chinese-to-English translation tasks with large-scale training data.
Conclusions and Future Work
The two models have been integrated into a phrase-based SMT system and evaluated on Chinese-to-English translation tasks using large-scale training data.
Integrating the Two Models into SMT
In this section, we elaborate how to integrate the two models into phrase-based SMT.
Integrating the Two Models into SMT
In particular, we integrate the models into a phrase-based system which uses bracketing transduction grammars (BTG) (Wu, 1997) for phrasal translation (Xiong et al., 2006).
Integrating the Two Models into SMT
It is straightforward to integrate the predicate translation model into phrase-based SMT (Koehn et al.,
Introduction
We integrate these two discriminative models into a state-of-the-art phrase-based system.
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Setiawan, Hendra and Zhou, Bowen and Xiang, Bing and Shen, Libin
Introduction
1We define translation units as phrases in phrase-based SMT, and as translation rules in syntax-based SMT.
Introduction
Specifically, the popular distortion or lexicalized reordering models in phrase-based SMT focus only on making good local prediction (i.e.
Introduction
Even though the experimental results carried out in this paper employ SCFG-based SMT systems, we would like to point out that our model is applicable to other systems, including phrase-based SMT systems.
Maximal Orientation Span
3We use hierarchical phrase-based translation system as a case in point, but the merit is generalizable to other systems.
Maximal Orientation Span
Additionally, this illustration also shows a case where MOS acts as a cross-boundary context which effectively relaxes the context-free assumption of hierarchical phrase-based formalism.
Related Work
Our TNO model is closely related to the Unigram Orientation Model (UOM) (Tillman, 2004), which is the de facto reordering model of phrase-based SMT (Koehn et al., 2007).
Related Work
Our MOS concept is also closely related to the hierarchical reordering model (Galley and Manning, 2008) in phrase-based decoding, which computes o of b with respect to a multi-block unit that may go beyond b′.
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Gao, Jianfeng
Abstract
Our best results improve a phrase-based statistical machine translation system trained on WMT 2012 French-English data by up to 2.0 BLEU, and the expected BLEU objective improves over a cross-entropy trained model by up to 0.6 BLEU in a single reference setup.
Conclusion and Future Work
Our best result improves the output of a phrase-based decoder by up to 2.0 BLEU on French-English translation, outperforming n-best rescoring by up to 1.1 BLEU and lattice rescoring by up to 0.4 BLEU.
Decoder Integration
Typically, phrase-based decoders maintain a set of states representing partial and complete translation hypotheses that are scored by a set of features.
Expected BLEU Training
Formally, our phrase-based model is parameterized by M parameters Λ where each λm ∈ Λ, m = 1 . . .
Experiments
We use a phrase-based system similar to Moses (Koehn et al., 2007) based on a set of common features including maximum likelihood estimates pML(e|f) and pML(f|e), lexically weighted estimates pLW(e|f) and pLW(f|e), word and phrase-penalties, a hierarchical reordering model (Galley and Manning, 2008), a linear distortion feature, and a modified Kneser-Ney language model trained on the target-side of the parallel data.
Experiments
either lattices or the unique 100-best output of the phrase-based decoder, and reestimate the log-linear weights by running a further iteration of MERT on the n-best list of the development set, augmented by scores corresponding to the neural network models.
Experiments
We use the same data both for training the phrase-based system as well as the language model but find that the resulting bias did not hurt end-to-end accuracy (Yu et al., 2013).
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Zhang, Jiajun and Liu, Shujie and Li, Mu and Zhou, Ming and Zong, Chengqing
Discussions
As the semantic phrase embedding can fully represent the phrase, we can go a step further in the phrase-based SMT and feed the semantic phrase embeddings to DNN in order to model the whole translation process (e.g.
Experiments
With the semantic phrase embeddings and the vector space transformation function, we apply the BRAE to measure the semantic similarity between a source phrase and its translation candidates in the phrase-based SMT.
Experiments
We have implemented a phrase-based translation system with a maximum entropy based reordering model using the bracketing transduction grammar (Wu, 1997; Xiong et al., 2006).
Experiments
Typically, four translation probabilities are adopted in the phrase-based SMT, including phrase translation probability and lexical weights in both directions.
Introduction
However, in the conventional (phrase-based) SMT, phrases are the basic translation units.
Introduction
Instead, we focus on learning phrase embeddings from the view of semantic meaning, so that our phrase embedding can fully represent the phrase and best fit the phrase-based SMT.
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Woodsend, Kristian and Lapata, Mirella
Abstract
The model operates over a phrase-based representation of the source document which we obtain by merging information from PCFG parse trees and dependency graphs.
Experimental Setup
Training We obtained phrase-based salience scores using a supervised machine learning algorithm.
Experimental Setup
The SVM was trained with the same features used to obtain phrase-based salience scores, but with sentence-level labels (labels (1) and (2) positive, (3) negative).
Experimental Setup
Figure 3: ROUGE-1 and ROUGE-L results for the phrase-based ILP model and two baselines, with error bars showing 95% confidence levels.
Results
F-score is higher for the phrase-based system but not significantly.
Results
The highlights created by the sentence ILP were considered significantly more verbose (0c < 0.05) than those created by the phrase-based system and the CNN abstractors.
Results
Table 5 shows the output of the phrase-based system for the documents in Table 1.
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Subotin, Michael
Hierarchical phrase-based translation
We take as our starting point David Chiang’s Hiero system, which generalizes phrase-based translation to substrings with gaps (Chiang, 2007).
Hierarchical phrase-based translation
As shown by Chiang (2007), a weighted grammar of this form can be collected and scored by simple extensions of standard methods for phrase-based translation and efficiently combined with a language model in a CKY decoder to achieve large improvements over a state-of-the-art phrase-based system.
Hierarchical phrase-based translation
Although a variety of scores interpolated into the decision rule for phrase-based systems have been investigated over the years, only a handful have been discovered to be consistently useful.
Introduction
Translation into languages with rich morphology presents special challenges for phrase-based methods.
Introduction
Thus, Birch et al (2008) find that translation quality achieved by a popular phrase-based system correlates significantly with a measure of target-side, but not source-side morphological complexity.
Introduction
Recently, several studies (Bojar, 2007; Avramidis and Koehn, 2009; Ramanathan et al., 2009; Yeniterzi and Oflazer, 2010) proposed modeling target-side morphology in a phrase-based factored models framework (Koehn and Hoang, 2007).
Modeling unobserved target inflections
As a consequence of translating into a morphologically rich language, some inflected forms of target words are unobserved in training data and cannot be generated by the decoder under standard phrase-based approaches.
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Clifton, Ann and Sarkar, Anoop
Abstract
This paper extends the training and tuning regime for phrase-based statistical machine translation to obtain fluent translations into morphologically complex languages (we build an English to Finnish translation system).
Conclusion and Future Work
We also demonstrate that for Finnish (and possibly other agglutinative languages), phrase-based MT benefits from allowing the translation model access to morphological segmentation yielding productive morphological phrases.
Conclusion and Future Work
In order to help with replication of the results in this paper, we have run the various morphological analysis steps and created the necessary training, tuning and test data files needed in order to train, tune and test any phrase-based machine translation system with our data.
Experimental Results
In all the experiments conducted in this paper, we used the Moses phrase-based translation system (Koehn et al., 2007), 2008 version.
Models 2.1 Baseline Models
We then trained the Moses phrase-based system (Koehn et al., 2007) on the segmented and marked text.
Translation and Morphology
In this work, we propose to address the problem of morphological complexity in an English-to-Finnish MT task within a phrase-based translation framework.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Abstract
In this paper, instead of designing new features based on intuition, linguistic knowledge and domain, we learn some new and effective features using the deep auto-encoder (DAE) paradigm for phrase-based translation model.
Experiments and Results
We choose the Moses (Koehn et al., 2007) framework to implement our phrase-based machine system.
Input Features for DNN Feature Learning
The phrase-based translation model (Koehn et al., 2003; Och and Ney, 2004) has demonstrated superior performance and been widely used in current SMT systems, and we employ our implementation on this translation model.
Introduction
Instead of designing new features based on intuition, linguistic knowledge and domain, for the first time, Maskey and Zhou (2012) explored the possibility of inducing new features in an unsupervised fashion using deep belief net (DBN) (Hinton et al., 2006) for hierarchical phrase-based translation
Introduction
al., 2010), and speech spectrograms (Deng et al., 2010), we propose new feature learning using semi-supervised DAE for phrase-based translation model.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
Each translation rule in the phrase-based translation model has a set number of features that are combined in the log-linear model (Och and Ney, 2002), and our semi-supervised DAE features can also be combined in this model.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Xiao, Xinyan and Xiong, Deyi and Zhang, Min and Liu, Qun and Lin, Shouxun
Abstract
We therefore propose a topic similarity model to exploit topic information at the synchronous rule level for hierarchical phrase-based translation.
Conclusion and Future Work
We have presented a topic similarity model which incorporates the rule-topic distributions on both the source and target side into traditional hierarchical phrase-based system.
Experiments
We divide the rules into three types: phrase rules, which only contain terminals and are the same as the phrase pairs in a phrase-based system; monotone rules, which contain non-terminals and produce monotone translations; reordering rules, which also contain non-terminals but change the order of translations.
Introduction
Consequently, we propose a topic similarity model for hierarchical phrase-based translation (Chiang, 2007), where each synchronous rule is associated with a topic distribution.
Introduction
We augment the hierarchical phrase-based system by integrating the proposed topic similarity model as a new feature (Section 3.1).
Introduction
Experiments on Chinese-English translation tasks (Section 6) show that our method outperforms the baseline hierarchical phrase-based system by +0.9 BLEU points.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Li, Junhui and Marton, Yuval and Resnik, Philip and Daumé III, Hal
Abstract
This paper explores a simple and effective unified framework for incorporating soft linguistic reordering constraints into a hierarchical phrase-based translation system: 1) a syntactic reordering model that explores reorderings for context free grammar rules; and 2) a semantic reordering model that focuses on the reordering of predicate-argument structures.
Abstract
Experiments on Chinese-English translation show that the reordering approach can significantly improve a state-of-the-art hierarchical phrase-based translation system.
Conclusion and Future Work
Experiments on Chinese-English translation show that the reordering approach can significantly improve a state-of-the-art hierarchical phrase-based translation system.
Introduction
The popular distortion or lexicalized reordering models in phrase-based SMT make good local predictions by focusing on reordering on word level, while the synchronous context free grammars in hierarchical phrase-based (HPB) translation models are capable of handling nonlocal reordering on the translation phrase level.
Introduction
The general ideas, however, are applicable to other translation models, e.g., phrase-based model, as well.
Related Work
Last, we also note that recent work on non-syntax-based reorderings in (flat) phrase-based models (Cherry, 2013; Feng et al., 2013) can also potentially be adopted for HPB models.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Duan, Xiangyu and Zhang, Min and Li, Haizhou
Abstract
The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from automatically word aligned parallel corpus.
Conclusion
We have presented pseudo-word as a novel machine translational unit for phrase-based machine translation.
Conclusion
Experimental results of Chinese-to-English translation task show that, in phrase-based machine translation model, pseudo-word performs significantly better than word in both spoken language translation domain and news domain.
Experiments and Results
The pipeline uses GIZA++ model 4 (Brown et al., 1993; Och and Ney, 2003) for pseudo-word alignment, uses Moses (Koehn et al., 2007) as the phrase-based decoder, and uses the SRI Language Modeling Toolkit to train the language model with modified Kneser-Ney smoothing (Kneser and Ney, 1995; Chen and Goodman, 1998).
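The Kneser-Ney smoothing used for the language model can be illustrated with a toy bigram estimator. This is a hedged sketch of the single-discount interpolated variant (the modified variant in the SRILM toolkit uses three count-dependent discounts); the function name and discount value are mine:

```python
from collections import Counter

def kneser_ney_bigram(tokens, discount=0.75):
    """Interpolated Kneser-Ney bigram model from a list of tokens.

    Returns prob(word, context) = P_KN(word | context).  The discounted
    bigram estimate is interpolated with a "continuation" unigram, which
    counts in how many distinct contexts a word has been seen.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])                  # c(v): bigrams starting with v
    continuations = Counter(w for _, w in bigrams)   # N1+(., w): distinct histories
    total_types = len(bigrams)                       # N1+(., .): distinct bigram types

    def prob(word, context):
        c_big = bigrams[(context, word)]
        c_ctx = contexts[context]
        # Discounted maximum-likelihood part.
        higher = max(c_big - discount, 0.0) / c_ctx
        # Probability mass freed by discounting, redistributed by continuation counts.
        distinct = sum(1 for (v, _) in bigrams if v == context)
        backoff_weight = discount * distinct / c_ctx
        p_continuation = continuations[word] / total_types
        return higher + backoff_weight * p_continuation

    return prob
```

By construction the distribution over the vocabulary sums to one for any observed context, which is the property that makes the discount-and-redistribute scheme a proper smoothing method.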
Experiments and Results
We use GIZA++ model 4 for word alignment, use Moses for phrase-based decoding.
Introduction
The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus generated by word-based models (Brown et al., 1993), proceeds with the induction of a phrase table (Koehn et al., 2003) or a synchronous grammar (Chiang, 2007), and then with a model weight tuning step.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Chen, Boxing and Foster, George and Kuhn, Roland
Abstract
Significant improvements are obtained over a state-of-the-art hierarchical phrase-based machine translation system.
Bag-of-Words Vector Space Model
In the hierarchical phrase-based translation method, the translation rules are extracted by abstracting some words from an initial phrase pair (Chiang, 2005).
Experiments
For the baseline, we train the translation model by following (Chiang, 2005; Chiang, 2007) and our decoder is Joshua, an open-source hierarchical phrase-based machine translation system written in Java.
Hierarchical phrase-based MT system
The hierarchical phrase-based translation method (Chiang, 2005; Chiang, 2007) is a formal syntax-based translation modeling method; its translation model is a weighted synchronous context free grammar (SCFG).
Hierarchical phrase-based MT system
Empirically, this method has yielded better performance on language pairs such as Chinese-English than the phrase-based method because it permits phrases with gaps; it generalizes the normal phrase-based models in a way that allows long-distance reordering (Chiang, 2005; Chiang, 2007).
Introduction
We chose a hierarchical phrase-based SMT system as our baseline; thus, the units involved in computation of sense similarities are hierarchical rules.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Hewavitharana, Sanjika and Mehay, Dennis and Ananthakrishnan, Sankaranarayanan and Natarajan, Prem
Corpus Data and Baseline SMT
Our phrase-based decoder is similar to Moses (Koehn et al., 2007) and uses the phrase pairs and target LM to perform beam search stack decoding based on a standard log-linear model, the parameters of which were tuned with MERT (Och, 2003) on a held-out development set (3,534 sentence pairs, 45K words) using BLEU as the tuning metric.
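The standard log-linear model referred to here scores a hypothesis as a weighted sum of feature functions, score(e, f) = Σᵢ λᵢ hᵢ(e, f), with MERT tuning the λᵢ against BLEU. A minimal sketch (feature names and values below are hypothetical, not from the paper):

```python
def loglinear_score(features, weights):
    """Score one translation hypothesis under an SMT log-linear model.

    features: dict of feature name -> feature value h_i(e, f)
    weights:  dict of feature name -> tuned weight lambda_i
    """
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical feature values for one candidate translation:
features = {"tm_log_prob": -4.2, "lm_log_prob": -7.1, "word_penalty": -6.0}
weights = {"tm_log_prob": 1.0, "lm_log_prob": 0.5, "word_penalty": -0.1}
```

The decoder keeps whichever hypothesis in a stack has the highest such score; MERT only changes the weights, never the feature values themselves.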
Experimental Setup and Results
The baseline English-to-Iraqi phrase-based SMT system was built as described in Section 3.
Introduction
In this paper, we describe a novel topic-based adaptation technique for phrase-based statistical machine translation (SMT) of spoken conversations.
Introduction
Translation phrase pairs that originate in training conversations whose topic distribution is similar to that of the current conversation are given preference through a single similarity feature, which augments the standard phrase-based SMT log-linear model.
Introduction
With this approach, we demonstrate significant improvements over a baseline phrase-based SMT system as measured by BLEU, TER and NIST scores on an English-to-Iraqi CSLT task.
phrase-based is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Avramidis, Eleftherios and Koehn, Philipp
Experiments
The Czech noun cases which appear only in prepositional phrases were ignored, since they are covered by the phrase-based model.
Introduction
Our method is based on factored phrase-based statistical machine translation models.
Introduction
1.1 Morphology in Phrase-based SMT
Introduction
Meanwhile, in phrase-based SMT models, words are mapped in chunks.
phrase-based is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Saluja, Avneesh and Hassan, Hany and Toutanova, Kristina and Quirk, Chris
Abstract
Statistical phrase-based translation learns translation rules from bilingual corpora, and has traditionally only used monolingual evidence to construct features that rescore existing translation candidates.
Abstract
Our proposed approach significantly improves the performance of competitive phrase-based systems, leading to consistent improvements between 1 and 4 BLEU points on standard evaluation sets.
Evaluation
The baseline is a state-of-the-art phrase-based system; we perform word alignment using a lexicalized hidden Markov model, and then the phrase table is extracted using the grow-diag-final heuristic (Koehn et al., 2003).
Generation & Propagation
2.5 Phrase-based SMT Expansion
Introduction
With large amounts of data, phrase-based translation systems (Koehn et al., 2003; Chiang, 2007) achieve state-of-the-art results in many typologically diverse language pairs (Bojar et al., 2013).
phrase-based is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef
Abstract
This paper presents an extension of Chiang’s hierarchical phrase-based (HPB) model, called Head-Driven HPB (HD-HPB), which incorporates head information in translation rules to better capture syntax-driven information, as well as improved reordering between any two neighboring non-terminals at any stage of a derivation to explore a larger reordering search space.
Conclusion
We present a head-driven hierarchical phrase-based (HD-HPB) translation model, which adopts head information (derived through unlabeled dependency analysis) in the definition of non-terminals to better differentiate among translation rules.
Head-Driven HPB Translation Model
For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007).
Introduction
Chiang’s hierarchical phrase-based (HPB) translation model utilizes synchronous context free grammar (SCFG) for translation derivation (Chiang, 2005; Chiang, 2007) and has been widely adopted in statistical machine translation (SMT).
phrase-based is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Liu, Yang and Lü, Yajuan and Liu, Qun
Abstract
Comparable to the state-of-the-art phrase-based system Moses, using packed forests in tree-to-tree translation results in a significant absolute improvement of 3.6 BLEU points over using 1-best trees.
Conclusion
Our system also achieves comparable performance with the state-of-the-art phrase-based system Moses.
Experiments
The absence of such non-syntactic mappings prevents tree-based tree-to-tree models from achieving comparable results to phrase-based models.
Related Work
They replace 1-best trees with packed forests both in training and decoding and show superior translation quality over the state-of-the-art hierarchical phrase-based system.
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Neubig, Graham and Watanabe, Taro and Sumita, Eiichiro and Mori, Shinsuke and Kawahara, Tatsuya
A Probabilistic Model for Phrase Table Extraction
If θ takes the form of a scored phrase table, we can use traditional methods for phrase-based SMT to find P(e|f, θ) and concentrate on creating a model for P(θ|⟨E, F⟩). We decompose this posterior probability using Bayes’ law into the corpus likelihood and parameter prior probabilities.
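The Bayes-law decomposition this sentence describes would read, in the paper's notation (a sketch; θ denotes the phrase table parameters and ⟨E, F⟩ the sentence-aligned corpus):

```latex
P(\theta \mid \langle E, F \rangle) \propto
  \underbrace{P(\langle E, F \rangle \mid \theta)}_{\text{corpus likelihood}}
  \,\underbrace{P(\theta)}_{\text{parameter prior}}
```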
Abstract
This allows for a completely probabilistic model that is able to create a phrase table that achieves competitive accuracy on phrase-based machine translation tasks directly from unaligned sentence pairs.
Hierarchical ITG Model
As we confirm in the experiments in Section 7, using only minimal phrases leads to inferior translation results for phrase-based SMT.
Introduction
The training of translation models for phrase-based statistical machine translation (SMT) systems (Koehn et al., 2003) takes unaligned bilingual training data as input, and outputs a scored table of phrase pairs.
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Mi, Haitao and Liu, Qun
Introduction
By incorporating the syntactic annotations of parse trees from both or either side(s) of the bitext, they are believed to be better than their phrase-based counterparts at reordering.
Introduction
In contrast to conventional tree-to-tree approaches (Ding and Palmer, 2005; Quirk et al., 2005; Xiong et al., 2007; Zhang et al., 2007; Liu et al., 2009), which only make use of a single type of trees, our model is able to combine two types of trees, outperforming both phrase-based and tree-to-string systems.
Introduction
Current tree-to-tree models (Xiong et al., 2007; Zhang et al., 2007; Liu et al., 2009) still have not outperformed the phrase-based system Moses (Koehn et al., 2007) significantly, even with the help of forests.
Related Work
This model shows a significant improvement over the state-of-the-art hierarchical phrase-based system (Chiang, 2005).
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Experiments
In SMT training, an in-house hierarchical phrase-based SMT decoder is implemented for our experiments.
Introduction
For example, translation sense disambiguation approaches (Carpuat and Wu, 2005; Carpuat and Wu, 2007) are proposed for phrase-based SMT systems.
Introduction
Meanwhile, for hierarchical phrase-based or syntax-based SMT systems, there is also much work involving rich contexts to guide rule selection (He et al., 2008; Liu et al., 2008; Marton and Resnik, 2008; Xiong et al., 2009).
Related Work
Following this work, (Xiao et al., 2012) extended topic-specific lexicon translation models to hierarchical phrase-based translation models, where the topic information of synchronous rules was directly inferred with the help of document-level information.
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel
Bootstrapping Phrasal ITG from Word-based ITG
This section introduces a technique that bootstraps candidate phrase pairs for phrase-based ITG from word-based ITG Viterbi alignments.
Experiments
Finally we ran 10 iterations of phrase-based ITG over the residual charts, using EM or VB, and extracted the Viterbi alignments.
Summary of the Pipeline
From this alignment, phrase pairs are extracted in the usual manner, and a phrase-based translation system is trained.
Variational Bayes for ITG
The drawback of maximum likelihood is obvious for phrase-based models.
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Haffari, Gholamreza and Sarkar, Anoop
Abstract
We also provide new highly effective sentence selection methods that improve AL for phrase-based SMT in the multilingual and single language pair setting.
Introduction
We introduce new highly effective sentence selection methods that improve phrase-based SMT in the multilingual and single language pair setting.
Sentence Selection: Multiple Language Pairs
For the single language pair setting, (Haffari et al., 2009) presents and compares several sentence selection methods for statistical phrase-based machine translation.
Sentence Selection: Single Language Pair
Phrases are basic units of translation in phrase-based SMT models.
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Schoenemann, Thomas
Abstract
Using the resulting alignments for phrase-based translation systems offers no clear insights w.r.t.
Conclusion
Table 2: Evaluation of phrase-based translation from German to English with the obtained alignments (for 100,000 sentence pairs).
See e.g. the author’s course notes (in German), currently
In addition we evaluate the effect on phrase-based translation on one of the tasks.
See e.g. the author’s course notes (in German), currently
We also check the effect of the various alignments (all produced by RegAligner) on translation performance for phrase-based translation, randomly choosing translation from German to English.
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann
Experiments
Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model (LM), a phrase translation model and a word-based lexicon model.
Introduction
Within the phrase-based SMT framework there are mainly three stages where improved reordering could be integrated. In preprocessing, the source sentence is reordered by heuristics so that the word order of the source and target sentences is similar.
Introduction
In this way, syntax information can be incorporated into phrase-based SMT systems.
Translation System Overview
In this paper, the phrase-based machine translation system
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Li, Mu and Duan, Nan and Zhang, Dongdong and Li, Chi-Ho and Zhou, Ming
Collaborative Decoding
Similar to a typical phrase-based decoder (Koehn, 2004), we associate each hypothesis with a coverage vector C to track translated source words in it.
Collaborative Decoding
But to be a general framework, this step is necessary for some state-of-the-art phrase-based decoders (Koehn, 2007; Och and Ney, 2004) because in these decoders, hypotheses with different coverage vectors can coexist in the same bin, or hypotheses associated with the same coverage vector might appear in different bins.
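The coverage vector C that tracks translated source words is commonly implemented as a bitmask over source positions. A minimal sketch (class and method names are mine, and real decoders also track LM context and scores):

```python
class Hypothesis:
    """Minimal stack-decoding hypothesis with a coverage vector.

    The coverage vector is a bitmask over source positions: bit i is set
    iff source word i has already been translated.  A phrase may only
    extend a hypothesis if its source span is entirely uncovered.
    """
    def __init__(self, coverage=0, output=()):
        self.coverage = coverage
        self.output = output

    def can_extend(self, start, end):
        """True iff source span [start, end) contains no translated word."""
        span_mask = ((1 << (end - start)) - 1) << start
        return self.coverage & span_mask == 0

    def extend(self, start, end, target_words):
        """Return a new hypothesis covering [start, end) and appending output."""
        span_mask = ((1 << (end - start)) - 1) << start
        return Hypothesis(self.coverage | span_mask,
                          self.output + tuple(target_words))
```

Two hypotheses are candidates for recombination only when their coverage bitmasks (and other decoder state) match exactly, which is why decoders that bin hypotheses differently need the normalization step described above.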
Experiments
The first one (SYS 1) is re-implementation of Hiero, a hierarchical phrase-based decoder.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Li, Zhifei and Eisner, Jason and Khudanpur, Sanjeev
Background 2.1 Terminology
In MT, spurious ambiguity occurs both in regular phrase-based systems (e.g., Koehn et al.
Background 2.1 Terminology
Figure 1: Segmentation ambiguity in phrase-based MT: two different segmentations lead to the same translation string.
Background 2.1 Terminology
It can be used to encode exponentially many hypotheses generated by a phrase-based MT system (e.g., Koehn et al.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Toutanova, Kristina and Suzuki, Hisami and Ruopp, Achim
Integration of inflection models with MT systems
For the phrase-based system, we generated the needed annotations by first parsing the source sentence e, aligning the source and candidate translations with the word-alignment model used in training, and projecting the dependency tree to the target using the algorithm of (Quirk et al., 2005).
Machine translation systems and data
We integrated the inflection prediction model with two types of machine translation systems: systems that make use of syntax and surface phrase-based systems.
Related work
In recent work, Koehn and Hoang (2007) proposed a general framework for including morphological features in a phrase-based SMT system by factoring the representation of words into a vector of morphological features and allowing a phrase-based MT system to work on any of the factored representations, which is implemented in the Moses system.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Li, Zhifei and Yarowsky, David
Conclusions
We integrate our method into a state-of-the-art phrase-based baseline translation system, i.e., Moses (Koehn et al., 2007), and show that the integrated system consistently improves the performance of the baseline system on various NIST machine translation test sets.
Experimental Results
Using the toolkit Moses (Koehn et al., 2007), we built a phrase-based baseline system by following
Experimental Results
This is analogous to the concept of “phrase” in phrase-based MT.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Braslavski, Pavel and Beloborodov, Alexander and Khalilov, Maxim and Sharoff, Serge
Introduction
Long-distance dependencies are common, and this creates problems for both RBMT and SMT systems (especially for phrase-based ones).
Results
081 Phrase-based SMT
Results
082 Phrase-based SMT
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lin, Dekang and Wu, Xiaoyun
Introduction
With phrase-based clustering, “Land of Odds” is grouped with many names that are labeled as company names, which is a strong indication that it is a company name as well.
Introduction
The disambiguation power of phrases is also evidenced by the improvements of phrase-based machine translation systems (Koehn et.
Introduction
We demonstrate the advantages of phrase-based clusters over word-based ones with experimental results from two distinct application domains: named entity recognition and query classification.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wuebker, Joern and Ney, Hermann and Zens, Richard
Abstract
In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of language model computations and hypothesis expansions.
Conclusions
This work introduces two extensions to the well-known beam search algorithm for phrase-based machine translation.
Introduction
Research efforts to increase search efficiency for phrase-based MT (Koehn et al., 2003) have explored several directions, ranging from generalizing the stack decoding algorithm (Ortiz et al., 2006) to additional early pruning techniques (Delaney et al., 2006), (Moore and Quirk, 2007) and more efficient language model (LM) querying (Heafield, 2011).
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Narayan, Shashi and Gardent, Claire
Experiments
These inputs are simplified using our simplification system, namely the DRS-SM model and the phrase-based machine translation system (Section 3.2).
Experiments
These four sentences are directly sent to the phrase-based machine translation system to produce simplified sentences.
Simplification Framework
The DRS associated with the final M-node D_fin is then mapped to a simplified sentence s'_fin, which is further simplified using the phrase-based machine translation system to produce the final simplified sentence s_simple.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Salameh, Mohammad and Cherry, Colin and Kondrak, Grzegorz
Methods
In this section, we discuss how a lattice from a multi-stack phrase-based decoder such as Moses (Koehn et al., 2007) can be desegmented to enable word-level features.
Methods
A phrase-based decoder produces its output from left to right, with each operation appending the translation of a source phrase to a growing target hypothesis.
Methods
The search graph of a phrase-based decoder can be interpreted as a lattice, which can be interpreted as a finite state acceptor over target strings.
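Interpreting the search graph as a finite state acceptor over target strings can be sketched with a toy lattice: states are decoder states, edges carry target phrases, and every start-to-final path spells one candidate translation. This is a hedged illustration (the function name and edge representation are mine, and it assumes the lattice is acyclic, as decoder lattices are):

```python
def lattice_strings(edges, start, final):
    """Enumerate the target strings accepted by an acyclic decoder lattice.

    edges: dict mapping a state to a list of (target_phrase, next_state)
    pairs.  Each path from start to final concatenates its phrases into
    one candidate translation, so word-level features can be applied
    over complete target strings.
    """
    if start == final:
        return [""]
    results = []
    for phrase, next_state in edges.get(start, []):
        for suffix in lattice_strings(edges, next_state, final):
            results.append((phrase + " " + suffix).strip())
    return results

# Toy lattice with two paths from state 0 to state 2:
edges = {0: [("he", 1), ("it", 1)], 1: [("runs", 2)]}
```

Composing such an acceptor with a word-level transducer (e.g., a desegmenter) is what lets word-level features rescore the whole search space rather than an n-best list.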
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ganchev, Kuzman and Graça, João V. and Taskar, Ben
Abstract
We propose and extensively evaluate a simple method for using alignment models to produce alignments better-suited for phrase-based MT systems, and show significant gains (as measured by BLEU score) in end-to-end translation systems for six language pairs used in recent MT competitions.
Phrase-based machine translation
The baseline system uses GIZA model 4 alignments and the open source Moses phrase-based machine translation toolkit, and performed close to the best at the competition last year.
Word alignment results
Unfortunately, as was shown by Fraser and Marcu (2007), AER can have weak correlation with translation performance as measured by BLEU score (Papineni et al., 2002) when the alignments are used to train a phrase-based translation system.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hui and Zhang, Min and Li, Haizhou and Aw, Aiti and Tan, Chew Lim
Experiment
We use the first three syntax-based systems (TT2S, TTS2S, FT2S) and Moses (Koehn et al., 2007), the state-of-the-art phrase-based system, as our baseline systems.
Forest-based tree sequence to string model
We use seven basic features that are analogous to the commonly used features in phrase-based systems (Koehn, 2003): 1) bidirectional rule mapping probabilities, 2) bidirectional lexical rule translation probabilities, 3) target language model, 4) number of rules used, and 5) number of target words.
Related work
Motivated by the fact that non-syntactic phrases make nontrivial contribution to phrase-based SMT, the tree sequence-based translation model is proposed (Liu et al., 2007; Zhang et al., 2008a) that uses tree sequence as the basic translation unit, rather than using single subtree as in the STSG.
phrase-based is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: