Head-Driven Hierarchical Phrase-based Translation
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef

Article Structure

Abstract

This paper presents an extension of Chiang’s hierarchical phrase-based (HPB) model, called Head-Driven HPB (HD-HPB), which incorporates head information in translation rules to better capture syntax-driven information, as well as improved reordering between any two neighboring non-terminals at any stage of a derivation to explore a larger reordering search space.

Introduction

Chiang’s hierarchical phrase-based (HPB) translation model utilizes synchronous context free grammar (SCFG) for translation derivation (Chiang, 2005; Chiang, 2007) and has been widely adopted in statistical machine translation (SMT).

Head-Driven HPB Translation Model

Like Chiang (2005) and Chiang (2007), our HD-HPB translation model adopts a synchronous context free grammar, a rewriting system which generates source and target side string pairs simultaneously using a context-free grammar.

Experiments

We evaluate the performance of our HD-HPB model and compare it with our implementation of Chiang’s HPB model (Chiang, 2007), a source-side SAMT-style refined version of HPB (SAMT-HPB), and the Moses implementation of HPB.

Conclusion

We present a head-driven hierarchical phrase-based (HD-HPB) translation model, which adopts head information (derived through unlabeled dependency analysis) in the definition of non-terminals to better differentiate among translation rules.

Topics

translation model

Appears in 6 sentences as: translation model (5) translation models (1)
In Head-Driven Hierarchical Phrase-based Translation
  1. Chiang’s hierarchical phrase-based (HPB) translation model utilizes synchronous context free grammar (SCFG) for translation derivation (Chiang, 2005; Chiang, 2007) and has been widely adopted in statistical machine translation (SMT).
    Page 1, “Introduction”
  2. However, the two approaches are not mutually exclusive, as we could also include a set of syntax-driven features into our translation model.
    Page 1, “Introduction”
  3. Like Chiang (2005) and Chiang (2007), our HD-HPB translation model adopts a synchronous context free grammar, a rewriting system which generates source and target side string pairs simultaneously using a context-free grammar.
    Page 2, “Head-Driven HPB Translation Model”
  4. For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007).
    Page 2, “Head-Driven HPB Translation Model”
  5. Merging two neighboring non-terminals into a single nonterminal, NRRs enable the translation model to explore a wider search space.
    Page 3, “Head-Driven HPB Translation Model”
  6. We present a head-driven hierarchical phrase-based (HD-HPB) translation model, which adopts head information (derived through unlabeled dependency analysis) in the definition of non-terminals to better differentiate among translation rules.
    Page 4, “Conclusion”

See all papers in Proc. ACL 2012 that mention translation model.

BLEU

Appears in 5 sentences as: BLEU (6)
In Head-Driven Hierarchical Phrase-based Translation
  1. Experiments on Chinese-English translation on four NIST MT test sets show that the HD-HPB model significantly outperforms Chiang’s model with average gains of 1.91 points absolute in BLEU.
    Page 1, “Abstract”
  2. For evaluation, the NIST BLEU script (version 12) with the default settings is used to calculate the BLEU scores.
    Page 4, “Experiments”
  3. Table 3 lists the translation performance with BLEU scores.
    Page 4, “Experiments”
  4. Table 3 shows that our HD-HPB model significantly outperforms Chiang’s HPB model with an average improvement of 1.91 in BLEU (and similar improvements over Moses HPB).
    Page 4, “Experiments”
  5. Table 3: BLEU (%) scores of different models.
    Page 4, “Experiments”

phrase-based

Appears in 5 sentences as: Phrase-based (1) phrase-based (4)
In Head-Driven Hierarchical Phrase-based Translation
  1. This paper presents an extension of Chiang’s hierarchical phrase-based (HPB) model, called Head-Driven HPB (HD-HPB), which incorporates head information in translation rules to better capture syntax-driven information, as well as improved reordering between any two neighboring non-terminals at any stage of a derivation to explore a larger reordering search space.
    Page 1, “Abstract”
  2. Chiang’s hierarchical phrase-based (HPB) translation model utilizes synchronous context free grammar (SCFG) for translation derivation (Chiang, 2005; Chiang, 2007) and has been widely adopted in statistical machine translation (SMT).
    Page 1, “Introduction”
  3. Phrase-based Translation
    Page 1, “Introduction”
  4. For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007).
    Page 2, “Head-Driven HPB Translation Model”
  5. We present a head-driven hierarchical phrase-based (HD-HPB) translation model, which adopts head information (derived through unlabeled dependency analysis) in the definition of non-terminals to better differentiate among translation rules.
    Page 4, “Conclusion”

NIST

Appears in 4 sentences as: NIST (5)
In Head-Driven Hierarchical Phrase-based Translation
  1. Experiments on Chinese-English translation on four NIST MT test sets show that the HD-HPB model significantly outperforms Chiang’s model with average gains of 1.91 points absolute in BLEU.
    Page 1, “Abstract”
  2. Experiments on Chinese-English translation using four NIST MT test sets show that our HD-HPB model significantly outperforms Chiang’s HPB as well as a SAMT-style refined version of HPB.
    Page 2, “Introduction”
  3. We train our model on a dataset with ~1.5M sentence pairs from the LDC dataset. We use the 2002 NIST MT evaluation test data (878 sentence pairs) as the development data, and the 2003, 2004, 2005, and 2006-news NIST MT evaluation test data (919, 1788, 1082, and 616 sentence pairs, respectively) as the test data.
    Page 3, “Experiments”
  4. For evaluation, the NIST BLEU script (version 12) with the default settings is used to calculate the BLEU scores.
    Page 4, “Experiments”

phrase pairs

Appears in 4 sentences as: phrase pair (1) phrase pairs (3)
In Head-Driven Hierarchical Phrase-based Translation
  1. For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007).
    Page 2, “Head-Driven HPB Translation Model”
  2. We extract HD-HRs and NRRs based on initial phrase pairs, respectively.
    Page 2, “Head-Driven HPB Translation Model”
  3. We look for initial phrase pairs that contain other phrases and then replace sub-phrases with POS tags corresponding to their heads.
    Page 2, “Head-Driven HPB Translation Model”
  4. Given an initial phrase pair on the source side, there are four possible positional relationships for their target side translations (we use Y as a variable for non-terminals on the source side while all non-terminals on the target side are labeled as X):
    Page 2, “Head-Driven HPB Translation Model”
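
The initial phrase pair step quoted above can be illustrated with a minimal sketch of the standard alignment-consistency criterion used in phrase-based extraction: a source span and a target span form a phrase pair if at least one alignment link lies inside both spans and no link connects a word inside one span to a word outside the other. The function names, the 0-based index convention, and the `max_len` default are assumptions made for this illustration, and the additional constraints Chiang places on initial phrases are not modeled here.

```python
from typing import List, Set, Tuple

# An alignment is a set of (source_index, target_index) links, 0-based.
Alignment = Set[Tuple[int, int]]


def is_consistent(links: Alignment, s_lo: int, s_hi: int,
                  t_lo: int, t_hi: int) -> bool:
    """Return True if the inclusive spans form a consistent phrase pair."""
    has_link = False
    for s, t in links:
        s_in = s_lo <= s <= s_hi
        t_in = t_lo <= t <= t_hi
        if s_in != t_in:
            # A link crosses the phrase-pair boundary.
            return False
        if s_in and t_in:
            has_link = True
    return has_link


def extract_initial_phrase_pairs(src_len: int, tgt_len: int,
                                 links: Alignment,
                                 max_len: int = 10) -> List[Tuple[int, int, int, int]]:
    """Enumerate all consistent (s_lo, s_hi, t_lo, t_hi) spans up to max_len words."""
    pairs = []
    for s_lo in range(src_len):
        for s_hi in range(s_lo, min(src_len, s_lo + max_len)):
            for t_lo in range(tgt_len):
                for t_hi in range(t_lo, min(tgt_len, t_lo + max_len)):
                    if is_consistent(links, s_lo, s_hi, t_lo, t_hi):
                        pairs.append((s_lo, s_hi, t_lo, t_hi))
    return pairs


# Toy alignment (hypothetical): word 0 aligns monotonically, words 1 and 2 swap.
toy_links = {(0, 0), (1, 2), (2, 1)}
toy_pairs = extract_initial_phrase_pairs(3, 3, toy_links)
```

The brute-force enumeration is quadratic in both sentence lengths and is meant only to make the criterion concrete; a practical extractor would derive the target span directly from the links of each source span.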

POS tags

Appears in 4 sentences as: POS tag (1) POS tags (3)
In Head-Driven Hierarchical Phrase-based Translation
  1. Here, each Chinese word is attached with its POS tag and Pinyin.
    Page 2, “Introduction”
  2. Instead of collapsing all non-terminals in the source language into a single symbol X as in Chiang (2007), given a word sequence f_i^j from position i to position j, we first find heads and then concatenate the POS tags of these heads as f_i^j’s nonterminal symbol.
    Page 2, “Head-Driven HPB Translation Model”
  3. We look for initial phrase pairs that contain other phrases and then replace sub-phrases with POS tags corresponding to their heads.
    Page 2, “Head-Driven HPB Translation Model”
  4. Examining translation rules extracted from the training data shows that there are 72,366 types of non-terminals with respect to 33 types of POS tags.
    Page 4, “Experiments”
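
The head-based labeling of non-terminals described in this list can be sketched as follows. The sketch assumes that a word counts as a head of a span when its parent in the dependency tree lies outside the span (the root, with parent -1, is always a head), and that the POS tags of the heads are joined with a hyphen. The function name, the parent-array encoding, the separator, and the toy sentence are all assumptions made for this illustration rather than details confirmed by the sentences quoted here.

```python
from typing import List


def head_label(pos_tags: List[str], parents: List[int],
               i: int, j: int, sep: str = "-") -> str:
    """Build a nonterminal label for the inclusive span [i, j].

    parents[k] is the index of word k's dependency parent, or -1 for the root.
    A word is treated as a head of the span if its parent is outside the span.
    """
    heads = [k for k in range(i, j + 1) if not (i <= parents[k] <= j)]
    return sep.join(pos_tags[k] for k in heads)


# Hypothetical five-word sentence: word 3 is the root verb.
toy_tags = ["NR", "P", "NR", "VV", "NN"]
toy_parents = [3, 3, 1, -1, 3]

label_pp = head_label(toy_tags, toy_parents, 1, 2)    # only word 1 is a head
label_left = head_label(toy_tags, toy_parents, 0, 2)  # words 0 and 1 are heads
label_all = head_label(toy_tags, toy_parents, 0, 4)   # only the root is a head
```

Because a span may contain several heads, the number of distinct labels can grow far beyond the number of POS tags, which is consistent with the large inventory of non-terminal types reported in the last item of the list.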

word alignment

Appears in 4 sentences as: word alignment (2) word alignments (2)
In Head-Driven Hierarchical Phrase-based Translation
  1. Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence.
    Page 2, “Introduction”
  2. Given the word alignment in Figure 1, Table 1 demonstrates the difference between hierarchical rules in Chiang (2007) and HD-HRs defined here.
    Page 2, “Head-Driven HPB Translation Model”
  3. For Moses HPB, we use “grow-diag-final-and” to obtain symmetric word alignments, 10 for the maximum phrase length, and the recommended default values for all other parameters.
    Page 3, “Experiments”
  4. We obtain the word alignments by running
    Page 3, “Experiments”

Chinese-English

Appears in 3 sentences as: Chinese-English (3)
In Head-Driven Hierarchical Phrase-based Translation
  1. Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence.
    Page 2, “Introduction”
  2. Experiments on Chinese-English translation using four NIST MT test sets show that our HD-HPB model significantly outperforms Chiang’s HPB as well as a SAMT-style refined version of HPB.
    Page 2, “Introduction”
  3. Experimental results on Chinese-English translation across four test sets demonstrate significant improvements of the HD-HPB model over both Chiang’s HPB and a source-side SAMT-style refined version of HPB.
    Page 5, “Conclusion”

sentence pairs

Appears in 3 sentences as: sentence pair (1) sentence pairs (4)
In Head-Driven Hierarchical Phrase-based Translation
  1. Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence.
    Page 2, “Introduction”
  2. For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007).
    Page 2, “Head-Driven HPB Translation Model”
  3. We train our model on a dataset with ~1.5M sentence pairs from the LDC dataset. We use the 2002 NIST MT evaluation test data (878 sentence pairs) as the development data, and the 2003, 2004, 2005, and 2006-news NIST MT evaluation test data (919, 1788, 1082, and 616 sentence pairs, respectively) as the test data.
    Page 3, “Experiments”

significantly outperforms

Appears in 3 sentences as: significantly outperforms (3)
In Head-Driven Hierarchical Phrase-based Translation
  1. Experiments on Chinese-English translation on four NIST MT test sets show that the HD-HPB model significantly outperforms Chiang’s model with average gains of 1.91 points absolute in BLEU.
    Page 1, “Abstract”
  2. Experiments on Chinese-English translation using four NIST MT test sets show that our HD-HPB model significantly outperforms Chiang’s HPB as well as a SAMT-style refined version of HPB.
    Page 2, “Introduction”
  3. Table 3 shows that our HD-HPB model significantly outperforms Chiang’s HPB model with an average improvement of 1.91 in BLEU (and similar improvements over Moses HPB).
    Page 4, “Experiments”

translation probabilities

Appears in 3 sentences as: translation probabilities (2) translation probability (1)
In Head-Driven Hierarchical Phrase-based Translation
  1. P_hd-hr(t|s) and P_hd-hr(s|t), translation probabilities for HD-HRs;
    Page 3, “Head-Driven HPB Translation Model”
  2. P_lex(t|s) and P_lex(s|t), lexical translation probabilities for HD-HRs;
    Page 3, “Head-Driven HPB Translation Model”
  3. Pty_hd-hr = exp(-1), rule penalty for HD-HRs; P_nrr(t|s), translation probability for NRRs;
    Page 3, “Head-Driven HPB Translation Model”
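
To show how features of this kind are typically combined, here is a minimal sketch that assumes the log-linear formulation standard in HPB systems: each rule contributes the weighted sum of the logs of its feature values. With a rule penalty fixed at exp(-1), every rule then adds exactly minus its penalty weight, so the penalty effectively counts the rules used in a derivation. The feature names, weights, and probabilities below are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List


def derivation_log_score(rules: List[Dict[str, float]],
                         weights: Dict[str, float]) -> float:
    """Sum weight * log(feature value) over every feature of every rule."""
    total = 0.0
    for features in rules:
        for name, value in features.items():
            total += weights[name] * math.log(value)
    return total


# Hypothetical two-rule derivation; each rule carries a penalty of exp(-1).
toy_rules = [
    {"p_t_given_s": 0.5, "penalty": math.exp(-1)},
    {"p_t_given_s": 0.25, "penalty": math.exp(-1)},
]
toy_weights = {"p_t_given_s": 1.0, "penalty": 2.0}

toy_score = derivation_log_score(toy_rules, toy_weights)
```

In this toy case the penalty term contributes -2.0 per rule, so a tuned penalty weight lets the decoder trade off derivations with many small rules against derivations with fewer, larger ones.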
