Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
Nguyen, ThuyLinh and Vogel, Stephan

Article Structure

Abstract

Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model.

Introduction

Phrase-based and tree-based translation models are the two main streams in state-of-the-art machine translation.

Phrasal-Hiero Model

Phrasal-Hiero maps a Hiero derivation into a discontinuous phrase-based translation path by the following two steps:

Decoding

Chiang (2007) applied bottom-up chart parsing to parse the source sentence and project onto the target side for the best translation.

Experiment Results

In all experiments we use phrase-orientation lexicalized reordering (Galley and Manning, 2008), which models monotone, swap, and discontinuous orientations with respect to both the previous and the next phrase pair.
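The monotone/swap/discontinuous orientation scheme described above can be sketched as a small classifier over source-side spans. This is an illustrative reconstruction (the span format and function name are our own, not the paper's implementation): when phrase pairs are visited in target order, an orientation is assigned by comparing each pair's source span to the previous one.

```python
def orientation(prev_src_span, cur_src_span):
    """Classify the orientation of the current phrase pair relative to the
    previous one, with phrase pairs visited in target order.
    Spans are (start, end) source-word indices, end exclusive."""
    prev_start, prev_end = prev_src_span
    cur_start, cur_end = cur_src_span
    if cur_start == prev_end:   # source side continues to the right
        return "monotone"
    if cur_end == prev_start:   # the two phrases exchange order
        return "swap"
    return "discontinuous"      # a gap or a longer-distance jump
```

The same comparison, applied against the next phrase pair instead of the previous one, gives the bidirectional variant the paper uses.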

Topics

phrase-based

Appears in 69 sentences as: Phrase-Based (1) Phrase-based (5) phrase-based (69)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model.
    Page 1, “Abstract”
  2. Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder.
    Page 1, “Abstract”
  3. The work consists of two parts: 1) for each Hiero translation derivation, find its corresponding discontinuous phrase-based path.
    Page 1, “Abstract”
  4. 2) Extend the chart decoder to incorporate features from the phrase-based path.
    Page 1, “Abstract”
  5. We achieve significant improvement over both Hiero and phrase-based baselines for Arabic-English, Chinese-English and German-English translation.
    Page 1, “Abstract”
  6. Phrase-based and tree-based translation models are the two main streams in state-of-the-art machine translation.
    Page 1, “Introduction”
  7. Yet, tree-based translation often underperforms phrase-based translation in language pairs with short range reordering such as Arabic-English translation (Zollmann et al., 2008; Birch et al., 2009).
    Page 1, “Introduction”
  8. (2003) for our phrase-based system and Chiang (2005) for our Hiero system.
    Page 1, “Introduction”
  9. • a is a translation path of f. In the phrase-based system, a_ph represents a segmentation of e and f and a correspondence of phrases.
    Page 1, “Introduction”
  10. • H(f) is the hypothesis space of the sentence f. We denote H_ph(f) as the phrase-based hypothesis space of f and H_tr(f) as its tree-based hypothesis space.
    Page 1, “Introduction”
  11. Galley and Manning (2010) point out that due to the hard constraints of rule combination, the tree-based system does not have the same excessive hypothesis space as the phrase-based system.
    Page 1, “Introduction”

See all papers in Proc. ACL 2013 that mention phrase-based.

See all papers in Proc. ACL that mention phrase-based.


phrase pair

Appears in 51 sentences as: phrase pair (31) Phrase Pairs (1) Phrase pairs (2) phrase pairs (31)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. However, these phrase pairs with gaps do not capture structure reordering as do Hiero rules with nonterminal mappings.
    Page 2, “Introduction”
  2. From this Hiero derivation, we have a segmentation of the sentence pairs into phrase pairs according to the word alignments, as shown on the left side of Figure 1.
    Page 2, “Introduction”
  3. Ordering these phrase pairs according to the word sequence on the target side, shown on the right side of Figure 1, we have a phrase-based translation path consisting of four phrase pairs: (je, … , (ne …
    Page 2, “Introduction”
  4. Note that even though the Hiero decoder uses a composition of three rules, the corresponding phrase-based path consists of four phrase pairs.
    Page 2, “Introduction”
  5. Training: Represent each rule as a sequence of phrase pairs and nonterminals.
    Page 3, “Phrasal-Hiero Model”
  6. Decoding: Use the rules’ sequences of phrase pairs and nonterminals to find the corresponding phrase-based path of a Hiero derivation and calculate its feature scores.
    Page 3, “Phrasal-Hiero Model”
  7. 2.1 Map Rule to A Sequence of Phrase Pairs and Nonterminals
    Page 3, “Phrasal-Hiero Model”
  8. We segment the rules’ lexical items into phrase pairs.
    Page 3, “Phrasal-Hiero Model”
  9. These phrase pairs will be part of the phrase-based translation path in the decoding step.
    Page 3, “Phrasal-Hiero Model”
  10. The rules’ nonterminals are also preserved in the sequence; during decoding they will be substituted by other rules’ phrase pairs.
    Page 3, “Phrasal-Hiero Model”
  11. We now explain how to map a rule to a sequence of phrase pairs and nonterminals.
    Page 3, “Phrasal-Hiero Model”
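As a rough illustration of the rule representation described in the items above, one side of a Hiero rule can be split into lexical phrases and nonterminal slots. This is a minimal sketch under assumed conventions (nonterminals written as X1, X2, …; source side only); the paper's actual procedure also segments the target side using the word alignments.

```python
import re

def segment_rule_side(side):
    """Split one side of a Hiero rule into a sequence of lexical phrases
    and nonterminal placeholders (sketch: nonterminals are assumed to be
    written as X1, X2, ...)."""
    seq, phrase = [], []
    for tok in side.split():
        if re.fullmatch(r"X\d+", tok):          # nonterminal slot
            if phrase:
                seq.append(("PHRASE", " ".join(phrase)))
                phrase = []
            seq.append(("NT", tok))
        else:                                   # lexical item
            phrase.append(tok)
    if phrase:
        seq.append(("PHRASE", " ".join(phrase)))
    return seq
```

During decoding, substituting each NT slot with the phrase-pair sequence of the rule filling it yields the flat phrase-based translation path of the derivation.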


lexicalized

Appears in 36 sentences as: Lexicalized (1) lexicalized (34) lexicals (3)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model.
    Page 1, “Abstract”
  2. Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder.
    Page 1, “Abstract”
  3. Most phrase-based systems are equipped with a distance reordering cost feature to tune the system towards the right amount of reordering, but then also a lexicalized reordering
    Page 1, “Introduction”
  4. It does not have the expressive lexicalized reordering model and distance cost features of the phrase-based system.
    Page 2, “Introduction”
  5. If we look at the leaves of a Hiero derivation tree, the lexicals also form a segmentation of the source and target sentence, thus also form a discontinuous phrase-based translation path.
    Page 2, “Introduction”
  6. 2.2 Training: Lexicalized Reordering Table
    Page 4, “Phrasal-Hiero Model”
  7. Phrasal-Hiero needs a phrase-based lexicalized reordering table to calculate the features.
    Page 4, “Phrasal-Hiero Model”
  8. The lexicalized reordering table could be from a discontinuous phrase-based system.
    Page 4, “Phrasal-Hiero Model”
  9. To guarantee the lexicalized reordering table to cover all phrase pairs of the rule table, we extract phrase-pairs and their reordering directions during rule extraction.
    Page 4, “Phrasal-Hiero Model”
  10. Phrase pairs are generated together with phrase-based reordering orientations to build the lexicalized reordering table.
    Page 5, “Phrasal-Hiero Model”
  11. In all experiments we use phrase-orientation lexicalized reordering (Galley and Manning, 2008), which models monotone, swap, and discontinuous orientations with respect to both the previous and the next phrase pair.
    Page 5, “Experiment Results”
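The phrase-based distance cost feature mentioned in the items above can be sketched as follows. We assume it is the standard phrase-based distortion cost (the sum of source-side jump distances between consecutive phrase pairs in target order); the function name and span format are hypothetical.

```python
def distance_cost(src_spans):
    """Standard phrase-based distortion cost: visit the phrase pairs in
    target order and accumulate the source-side jump distance between
    the end of one phrase and the start of the next.
    Spans are (start, end) source-word indices, end exclusive."""
    cost, prev_end = 0, 0
    for start, end in src_spans:
        cost += abs(start - prev_end)
        prev_end = end
    return cost
```

A monotone path costs zero, so the feature penalizes a derivation in proportion to how far its phrase-based path jumps around on the source side.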


BLEU

Appears in 24 sentences as: BLEU (25)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Compare BLEU scores of translation using all extracted rules (the first row) and translation using only rules without nonaligned subphrases (the second row).
    Page 4, “Phrasal-Hiero Model”
  2. We tuned the parameters on the MT06 NIST test set (1664 sentences) and report the BLEU scores on three unseen test sets: MT04 (1353 sentences), MT05 (1056 sentences) and MT09 (1313 sentences).
    Page 6, “Experiment Results”
  3. On average the improvement is 1.07 BLEU score (45.66
    Page 6, “Experiment Results”
  4. Table 4: Arabic-English true case translation scores in BLEU metric.
    Page 6, “Experiment Results”
  5. (2008), i.e., that the Hiero baseline system underperforms compared to the phrase-based system with lexicalized phrase-based reordering for Arabic-English in all test sets, on average by about 0.60 BLEU points (46.13 versus 46.73).
    Page 6, “Experiment Results”
  6. As mentioned in section 2.1, Phrasal-Hiero only uses 48.54% of the rules but achieves as good or even better performance (on average 0.24 BLEU points better) compared to the original Hiero system using the full set of rules.
    Page 6, “Experiment Results”
  7. Table 4 shows that the P.H.+lex system gains on average 0.67 BLEU points (47.04 versus 46.37).
    Page 7, “Experiment Results”
  8. On average P.H.+dist+lex improves 0.90 BLEU points over P.H.
    Page 7, “Experiment Results”
  9. without new phrase-based features and 1.14 BLEU score over the baseline Hiero system.
    Page 7, “Experiment Results”
  10. Note that Hiero rules already have lexical context in the reordering, but adding phrase-based lexicalized reordering features to the system still gives us about as much improvement as the phrase-based system gets from lexicalized reordering features, here 1.07 BLEU points.
    Page 7, “Experiment Results”
  11. And our best Phrasal-Hiero significantly improves over the best phrase-based baseline by 0.54 BLEU points.
    Page 7, “Experiment Results”


Chinese-English

Appears in 15 sentences as: Chinese-English (15)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. We achieve significant improvement over both Hiero and phrase-based baselines for Arabic-English, Chinese-English and German-English translation.
    Page 1, “Abstract”
  2. In our Chinese-English experiment, the Hiero system still outperforms the discontinuous phrase-based system.
    Page 2, “Introduction”
  3. (2008) added structure distortion features into their decoder and showed improvements in their Chinese-English experiment.
    Page 2, “Introduction”
  4. In the experiment section, we will discuss the impact of removing rules with nonaligned sub-phrases in our German-English and Chinese-English experiments.
    Page 4, “Phrasal-Hiero Model”
  5. We will report the impact of integrating phrase-based features into Hiero systems for three language pairs: Arabic-English, Chinese-English and German-English.
    Page 5, “Experiment Results”
  6. 4.3 Chinese-English Results
    Page 7, “Experiment Results”
  7. The Chinese-English system was trained on the FBIS corpus of 384K sentence pairs; the English corpus is lower case.
    Page 7, “Experiment Results”
  8. (2008) on the baselines for Chinese-English translation.
    Page 7, “Experiment Results”
  9. Table 5: Chinese-English lower case translation scores in BLEU metric.
    Page 7, “Experiment Results”
  10. (2008) in their Chinese-English experiment, we benefit by adding the distance cost feature.
    Page 7, “Experiment Results”
  11. This shows that a strong Chinese-English Hiero system still benefits from phrase-based features.
    Page 7, “Experiment Results”


BLEU points

Appears in 12 sentences as: BLEU points (13)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. (2008), i.e., that the Hiero baseline system underperforms compared to the phrase-based system with lexicalized phrase-based reordering for Arabic-English in all test sets, on average by about 0.60 BLEU points (46.13 versus 46.73).
    Page 6, “Experiment Results”
  2. As mentioned in section 2.1, Phrasal-Hiero only uses 48.54% of the rules but achieves as good or even better performance (on average 0.24 BLEU points better) compared to the original Hiero system using the full set of rules.
    Page 6, “Experiment Results”
  3. Table 4 shows that the P.H.+lex system gains on average 0.67 BLEU points (47.04 versus 46.37).
    Page 7, “Experiment Results”
  4. On average P.H.+dist+lex improves 0.90 BLEU points over P.H.
    Page 7, “Experiment Results”
  5. Note that Hiero rules already have lexical context in the reordering, but adding phrase-based lexicalized reordering features to the system still gives us about as much improvement as the phrase-based system gets from lexicalized reordering features, here 1.07 BLEU points.
    Page 7, “Experiment Results”
  6. And our best Phrasal-Hiero significantly improves over the best phrase-based baseline by 0.54 BLEU points.
    Page 7, “Experiment Results”
  7. Even though the phrase-based system benefits from lexicalized reordering (PB+lex on average outperforms PB+nolex by 1.16 BLEU points, 25.87 versus 27.03), it is the Hiero system that has the best baseline scores across all test sets, with an average of 27.70 BLEU points.
    Page 7, “Experiment Results”
  8. It uses 84.19% of the total training rules, but unlike the Arabic-English system, using a subset of the rules costs Phrasal-Hiero on all test sets and on average it loses 0.49 BLEU points (27.21 versus 27.70).
    Page 7, “Experiment Results”
  9. We have better improvements when adding the six features of the lexicalized reordering model: P.H.+lex on average has 28.05 BLEU points, i.e.
    Page 7, “Experiment Results”
  10. The P.H+dist+lex has the best score across all the test sets and on average gains 1.14 BLEU points over P.H.
    Page 7, “Experiment Results”
  11. The Hiero baseline performs on average 0.26 BLEU points better than the phrase-based system with lexicalized reordering features (PB+lex).
    Page 8, “Experiment Results”


sentence pair

Appears in 10 sentences as: sentence pair (6) sentence pairs (4)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. From this Hiero derivation, we have a segmentation of the sentence pairs into phrase pairs according to the word alignments, as shown on the left side of Figure 1.
    Page 2, “Introduction”
  2. In the rule X → je X1 le français ; I X1 french, extracted from the sentence pair in Figure 1, the phrase le français connects to the phrase french because the French word français aligns with the English word french even though le is unaligned.
    Page 4, “Phrasal-Hiero Model”
  3. Figure 2: Alignment of a sentence pair.
    Page 4, “Phrasal-Hiero Model”
  4. For example, in the rule r4 = X → je X1 le X2 ; I X1 X2, extracted from the sentence pair in Figure 2, the phrase le is not aligned.
    Page 4, “Phrasal-Hiero Model”
  5. Let (s, t) be a sentence pair in the training data and r = X → s0X1s1...Xksk ; t0X1t1...Xktk be a rule extracted from the sentence.
    Page 4, “Phrasal-Hiero Model”
  6. For example, the training sentence pair in Figure 2 generates the rule r2 = X → ne X1 pas ; don’t X1 spanning (1 …
    Page 4, “Phrasal-Hiero Model”
  7. Look at the example of the training sentence pair in Figure 2: the rule X → je ; I spanning (0...1, 0...1) and the rule X → je X1 ; I X1 spanning (0...3, 0...2) both share the same lexical phrase pair (je, … spanning (0 …
    Page 4, “Phrasal-Hiero Model”
  8. The Arabic-English system was trained from 264K sentence pairs with true case English.
    Page 6, “Experiment Results”
  9. The Chinese-English system was trained on the FBIS corpus of 384K sentence pairs; the English corpus is lower case.
    Page 7, “Experiment Results”
  10. The systems were trained on 1.8 million sentence pairs using the Europarl corpora.
    Page 7, “Experiment Results”


BLEU score

Appears in 8 sentences as: BLEU score (5) BLEU scores (3)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Compare BLEU scores of translation using all extracted rules (the first row) and translation using only rules without nonaligned subphrases (the second row).
    Page 4, “Phrasal-Hiero Model”
  2. We tuned the parameters on the MT06 NIST test set (1664 sentences) and report the BLEU scores on three unseen test sets: MT04 (1353 sentences), MT05 (1056 sentences) and MT09 (1313 sentences).
    Page 6, “Experiment Results”
  3. On average the improvement is 1.07 BLEU score (45.66
    Page 6, “Experiment Results”
  4. without new phrase-based features and 1.14 BLEU score over the baseline Hiero system.
    Page 7, “Experiment Results”
  5. The German-English translations on average gain 0.38 BLEU score by adding both distance cost and discriminative reordering features.
    Page 8, “Experiment Results”
  6. The first block shows an improvement of 0.32 BLEU score when adding discriminative reordering features on Hiero (using the whole set of rules and no rule segmentation).
    Page 8, “Experiment Results”
  7. Here the improvement of P.H+lex over P.H is 0.67 BLEU score.
    Page 8, “Experiment Results”
  8. The numbers are average BLEU scores of all test sets.
    Page 9, “Experiment Results”


translation model

Appears in 5 sentences as: translation model (3) translation models (2)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model.
    Page 1, “Abstract”
  2. Phrase-based and tree-based translation models are the two main streams in state-of-the-art machine translation.
    Page 1, “Introduction”
  3. The tree-based translation model, by using a synchronous context-free grammar formalism, can capture longer reordering between source and target language.
    Page 1, “Introduction”
  4. Many features are shared between phrase-based and tree-based systems including language model, word count, and translation model features.
    Page 1, “Introduction”
  5. When comparing phrase-based and Hiero translation models, most previous work on tree-based translation addresses its limited hypothesis space problem.
    Page 2, “Introduction”


language model

Appears in 4 sentences as: language model (5) language models (1)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Many features are shared between phrase-based and tree-based systems including language model, word count, and translation model features.
    Page 1, “Introduction”
  2. The language model is the interpolation of 5-gram language models built from news corpora of the NIST 2012 evaluation.
    Page 6, “Experiment Results”
  3. The language model is the trigram SRI language model built from the Xinhua corpus of 180 million words.
    Page 7, “Experiment Results”
  4. The language model is a three-gram SRILM model trained on the target side of the training corpora.
    Page 7, “Experiment Results”


significant improvement

Appears in 4 sentences as: significant improvement (2) significant improvements (1) significantly improves (1)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. We achieve significant improvement over both Hiero and phrase-based baselines for Arabic-English, Chinese-English and German-English translation.
    Page 1, “Abstract”
  2. Phrase-based with lexicalized reordering features (PB+lex) shows significant improvement on all test sets over the simple phrase-based system without lexicalized reordering (PB+nolex).
    Page 6, “Experiment Results”
  3. distance-based reordering feature (P.H+dist) to the Arabic-English experiment but get significant improvements when adding the six features of the lexicalized reordering (P.H+lex).
    Page 7, “Experiment Results”
  4. And our best Phrasal-Hiero significantly improves over the best phrase-based baseline by 0.54 BLEU points.
    Page 7, “Experiment Results”


language pairs

Appears in 3 sentences as: language pairs (3)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. Yet, tree-based translation often underperforms phrase-based translation in language pairs with short range reordering such as Arabic-English translation (Zollmann et al., 2008; Birch et al., 2009).
    Page 1, “Introduction”
  2. This is important for language pairs with strict reordering.
    Page 2, “Introduction”
  3. We will report the impact of integrating phrase-based features into Hiero systems for three language pairs : Arabic-English, Chinese-English and German-English.
    Page 5, “Experiment Results”


NIST

Appears in 3 sentences as: NIST (3)
In Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
  1. The language model is the interpolation of 5-gram language models built from news corpora of the NIST 2012 evaluation.
    Page 6, “Experiment Results”
  2. We tuned the parameters on the MT06 NIST test set (1664 sentences) and report the BLEU scores on three unseen test sets: MT04 (1353 sentences), MT05 (1056 sentences) and MT09 (1313 sentences).
    Page 6, “Experiment Results”
  3. We tuned the parameters on MT06 NIST test set of 1664 sentences and report the results of MT04, MT05 and MT08 unseen test sets.
    Page 7, “Experiment Results”
