A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation
Sun, Jun and Zhang, Min and Tan, Chew Lim

Article Structure

Abstract

The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of sub-trees.

Introduction

Current research in statistical machine translation (SMT) mostly falls into either the phrase-based or the syntax-based paradigm.

Non-Contiguous Tree Sequence Alignment-based Model

In this section, we give a formal definition of SncTSSG and accordingly propose the alignment-based translation model.

Tree Sequence Pair Extraction

In training, in addition to the contiguous tree sequence pairs, we extract the noncontiguous ones as well.

The Pisces decoder

We implement our decoder Pisces by simulating the span based CYK parser constrained by the rules of SncTSSG.

Experiments

5.1 Experimental Settings

Conclusions and Future Work

In this paper, we present a noncontiguous tree sequence alignment model based on SncTSSG to enhance the modeling of noncontiguous phrases and of the reordering caused by noncontiguous constituents with large gaps.

Topics

word alignment

Appears in 11 sentences as: word alignment (6) word alignments (1) words aligned (4)
In A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation
  1. (2006) statistically report that discontinuities are very useful for translational equivalence analysis using binary branching structures under word alignment and parse tree constraints.
    Page 1, “Introduction”
Data structure: p[j1, j2] to store tree sequence pairs covering source span [j1, j2]
1: foreach source span [j1, j2], do
2: find a target span [i1, i2] with minimal length covering all the target words aligned to [j1, j2]
3: if all the target words in [i1, i2] are aligned with source words only in [j1, j2], then
4: Pair each source tree sequence covering [j1, j2] with those in target covering [i1, i2] as a contiguous tree sequence pair
    Page 4, “Tree Sequence Pair Extraction”
  3. 7: create sub-span set s([i1,i2]) to cover all the target words aligned to [j1,j2]
    Page 4, “Tree Sequence Pair Extraction”
13: find a source span [j1, j2] with minimal length covering all the source words aligned to [i1, i2]
    Page 4, “Tree Sequence Pair Extraction”
15: create sub-span set s([j1, j2]) to cover all the source words aligned to [i1, i2]
16: Pair each source tree sequence covering s([j1,
    Page 4, “Tree Sequence Pair Extraction”
(2006) also report that allowing gaps on one side only is enough to eliminate the hierarchical alignment failure under word alignment and one-side parse tree constraints.
    Page 4, “Tree Sequence Pair Extraction”
We use the m-to-n word alignments produced by GIZA++ to extract the tree sequence pairs.
    Page 6, “Experiments”
The STSSG or any contiguous translational equivalence based model is unable to obtain the corresponding target output for this idiom word via the noncontiguous word alignment and treats it as an out-of-vocabulary (OOV) word.
    Page 8, “Experiments”
  9. On the contrary, the SncTSSG based model can capture the noncontiguous tree sequence pair consistent with the word alignment and further provide a reasonable target translation.
    Page 8, “Experiments”
Besides, SncTSSG is less sensitive to word alignment errors when extracting translation candidates than the contiguous translational equivalence based models.
    Page 8, “Experiments”
Although this lower sensitivity to word alignment errors enables SncTSSG to capture additional noncontiguous language phenomena, it also induces many redundant noncontiguous rules.
    Page 8, “Conclusions and Future Work”
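The span-pairing procedure quoted in sentences 2-5 above can be sketched in a few lines of Python. This is a hypothetical minimal illustration, not the authors' implementation: it performs only the word-alignment consistency check (steps 1-4), and the pairing of actual tree sequences over the resulting spans is omitted. The function name and data representation are assumptions.

```python
# Sketch of the span-pairing consistency check: for each source span,
# find the minimal target span covering all aligned target words, then
# accept the pair only if every target word in that span aligns back
# exclusively into the source span.

def consistent_span_pairs(alignment, src_len):
    """alignment: set of (src_pos, tgt_pos) word-alignment links."""
    pairs = []
    for j1 in range(src_len):
        for j2 in range(j1, src_len):
            tgt = [i for (j, i) in alignment if j1 <= j <= j2]
            if not tgt:
                continue
            i1, i2 = min(tgt), max(tgt)  # minimal covering target span
            # all target words in [i1, i2] must align only into [j1, j2]
            if all(j1 <= j <= j2 for (j, i) in alignment if i1 <= i <= i2):
                pairs.append(((j1, j2), (i1, i2)))
    return pairs

# toy alignment: src 0 - tgt 0, src 1 - tgt 2, src 2 - tgt 1
links = {(0, 0), (1, 2), (2, 1)}
print(consistent_span_pairs(links, 3))
```

In this toy example the crossing links make the source span [0, 1] inconsistent (its minimal target span [0, 2] aligns back to source position 2), which is exactly the case step 3 of the pseudocode rejects.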

See all papers in Proc. ACL 2009 that mention word alignment.


phrase-based

Appears in 8 sentences as: phrase-based (8)
Current research in statistical machine translation (SMT) mostly falls into either the phrase-based or the syntax-based paradigm.
    Page 1, “Introduction”
Between them, the phrase-based approach (Marcu and Wong, 2002; Koehn et al., 2003; Och and Ney, 2004) allows local reordering and contiguous phrase translation.
    Page 1, “Introduction”
  3. However, it is hard for phrase-based models to learn global reorderings and to deal with noncontiguous phrases.
    Page 1, “Introduction”
  4. Consequently, this distortional operation, like phrase-based models, is much more flexible in the order of the target constituents than the traditional syntax-based models which are limited by the syntactic structure.
    Page 6, “The Pisces decoder”
  5. We compare the SncTSSG based model against two baseline models: the phrase-based and the STSSG-based models.
    Page 6, “Experiments”
For the phrase-based model, we use Moses (Koehn et al., 2007) with its default settings; for the STSSG and SncTSSG based models we use our decoder Pisces with the following parameter settings: d = 4, h = 6, c = 6, l = 6, α = 50, β = 50.
    Page 6, “Experiments”
Table 3 explores the contribution of the noncontiguous translational equivalences to phrase-based models (the rules in Table 3 have no grammar tags, but a gap <***> is allowed in the last three rows).
    Page 7, “Experiments”
The experimental results show that our model outperforms the baseline models, verifying the effectiveness of noncontiguous translational equivalences for noncontiguous phrase modeling in both syntax-based and phrase-based systems.
    Page 8, “Conclusions and Future Work”
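Sentence 7 above describes phrasal rules whose source side contains a gap marker <***>. The following is a hypothetical sketch of matching one such rule against an input sentence, assuming the gap matches one or more arbitrary tokens (the paper actually bounds gap size with a parameter); the function name and token representation are illustrative only.

```python
# Hypothetical sketch: match a noncontiguous phrasal rule with a single
# gap marker '<***>' on its source side against a tokenized sentence.
# The gap is assumed to cover one or more arbitrary tokens.

def match_gapped_rule(rule_src, sentence):
    """rule_src, sentence: lists of tokens; '<***>' marks the gap.
    Returns the token span filling the gap, or None if no match."""
    g = rule_src.index('<***>')
    left, right = rule_src[:g], rule_src[g + 1:]
    n = len(sentence)
    for start in range(n - len(left) - len(right)):
        if sentence[start:start + len(left)] != left:
            continue
        # try every non-empty gap filler, then require the right context
        for gap_end in range(start + len(left) + 1, n - len(right) + 1):
            if sentence[gap_end:gap_end + len(right)] == right:
                return sentence[start + len(left):gap_end]
    return None

print(match_gapped_rule(['shut', '<***>', 'down'],
                        ['please', 'shut', 'the', 'server', 'down']))
```

The matched gap filler is what a gapped translation rule would translate and reorder as a unit, which is precisely what contiguous phrase pairs cannot express.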


translation model

Appears in 8 sentences as: Translation Model (1) translation model (7)
  1. The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of sub-trees.
    Page 1, “Abstract”
  2. This paper goes further to present a translation model based on noncontiguous tree sequence alignment, where a noncontiguous tree sequence is a sequence of sub-trees and gaps.
    Page 1, “Abstract”
Bod (2007) also finds that discontinuous phrasal rules yield significant improvement in a linguistically motivated STSG-based translation model.
    Page 1, “Introduction”
We illustrate the rule extraction with an example from the tree-to-tree translation model based on tree sequence alignment (Zhang et al., 2008a), without loss of generality to most syntactic tree based models.
    Page 1, “Introduction”
  5. To address this issue, we propose a syntactic translation model based on noncontiguous tree sequence alignment.
    Page 2, “Introduction”
In this section, we give a formal definition of SncTSSG and accordingly propose the alignment-based translation model.
Page 2, “Non-Contiguous Tree Sequence Alignment-based Model”
  7. 2.2 SncTSSG based Translation Model
Page 4, “Non-Contiguous Tree Sequence Alignment-based Model”
In the experiments, we train the translation model on the FBIS corpus (7.2M (Chinese) + 9.2M (English) words) and train a 4-gram language model on the Xinhua portion of the English Gigaword corpus (181M words) using the SRILM Toolkit (Stolcke, 2002).
    Page 6, “Experiments”


BLEU

Appears in 5 sentences as: BLEU (5)
System   Model     BLEU
Moses    cBP       23.86
         STSSG     25.92
         SncTSSG   26.53
    Page 6, “Experiments”
ID   Rule Set                      BLEU
1    cR (STSSG)                    25.92
2    cR w/o ncPR                   25.87
3    cR w/o ncPR + tgtncR          26.14
4    cR w/o ncPR + srcncR          26.50
5    cR w/o ncPR + src&tgtncR      26.51
6    cR + tgtncR                   26.11
7    cR + srcncR                   26.56
8    cR + src&tgtncR (SncTSSG)     26.53
    Page 7, “Experiments”
2) Moreover, comparing Exp 6, 7, 8 against Exp 3, 4, 5 respectively, we find that the rules derived from noncontiguous tree sequence pairs generally subsume the ability of the rules derived from the contiguous tree sequence pairs, given the slight change in BLEU score.
    Page 7, “Experiments”
  4. System Rule Set BLEU
    Page 7, “Experiments”
Max gaps allowed (source / target)   Rule #       BLEU
0 / 0                                1,661,045    25.92
1 / 1                                +841,263     26.53
2 / 2                                +447,161     26.55
3 / 3                                +17,782      26.56
∞ / ∞                                +8,223       26.57
    Page 7, “Experiments”
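The scores in the tables above are BLEU, the geometric mean of clipped n-gram precisions times a brevity penalty. A minimal single-reference sketch follows; real evaluations (including this paper's) use corpus-level BLEU, typically with multiple references and smoothing, so this toy version is illustrative only and returns 0 when any n-gram precision is 0.

```python
# Simplified single-reference sentence-level BLEU: geometric mean of
# clipped n-gram precisions (n = 1..4) times a brevity penalty.
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        clipped = sum(min(c, ref[g]) for g, c in cand.items())
        if clipped == 0:
            return 0.0  # unsmoothed: any zero precision zeroes the score
        precisions.append(clipped / sum(cand.values()))
    # brevity penalty punishes candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

sent = 'the cat sat on the mat'.split()
print(bleu(sent, sent))  # identical candidate and reference score 1.0
```

Under this metric, the roughly +0.6 BLEU gain from STSSG (25.92) to SncTSSG (26.53) reflects more candidate n-grams matching the references.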


parse tree

Appears in 4 sentences as: parse tree (3) parse trees (1)
  1. (2006) statistically report that discontinuities are very useful for translational equivalence analysis using binary branching structures under word alignment and parse tree constraints.
    Page 1, “Introduction”
  2. Figure 2: A word-aligned parse tree pair
Page 3, “Non-Contiguous Tree Sequence Alignment-based Model”
Given the source and target sentences f_1^J and e_1^I, as well as the corresponding parse trees T(f_1^J) and T(e_1^I), our approach directly approximates the posterior probability Pr(T(e_1^I) | T(f_1^J)) based on the log-linear framework:
Page 4, “Non-Contiguous Tree Sequence Alignment-based Model”
(2006) also report that allowing gaps on one side only is enough to eliminate the hierarchical alignment failure under word alignment and one-side parse tree constraints.
    Page 4, “Tree Sequence Pair Extraction”
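The log-linear framework quoted in sentence 3 scores a hypothesis as Pr(T(e_1^I) | T(f_1^J)) ∝ exp(Σ_m λ_m h_m), so the decoder only needs to rank hypotheses by the weighted sum of feature values. A minimal sketch, with hypothetical feature names and weights (the paper's actual feature set includes rule probabilities, bi-lexical probabilities, and the target language model):

```python
# Minimal log-linear scoring sketch: the unnormalized model score is
# exp(sum of lambda_m * h_m), so ranking hypotheses only requires the
# weighted feature sum. Feature names and weights are placeholders.
import math

def loglinear_score(features, weights):
    """features: name -> h_m value; weights: name -> lambda_m."""
    return sum(weights[name] * value for name, value in features.items())

hyp = {'log_rule_prob': -2.3, 'log_lm': -4.1, 'word_penalty': 5.0}
lam = {'log_rule_prob': 1.0, 'log_lm': 0.5, 'word_penalty': -0.1}
score = loglinear_score(hyp, lam)
prob_unnorm = math.exp(score)  # unnormalized posterior; the shared
# normalization constant cancels when comparing hypotheses
```

Because the normalizer is identical across hypotheses for a fixed input, decoding maximizes the weighted sum directly and never computes it.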


proposed model

Appears in 4 sentences as: proposed model (4)
Compared with the contiguous tree sequence-based model, the proposed model can handle noncontiguous phrases with arbitrarily large gaps by means of noncontiguous tree sequence alignment.
    Page 1, “Abstract”
  2. Experimental results on the NIST MT-05 Chi-nese-English translation task show that the proposed model statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
With the help of the noncontiguous tree sequence, the proposed model can capture noncontiguous phrases while avoiding the constraint of requiring large applicable contexts, and enhances noncontiguous constituent modeling.
    Page 2, “Introduction”
  4. As for the above example, the proposed model enables the noncontiguous tree sequence pair indexed as TSPS in Fig.
    Page 2, “Introduction”


language model

Appears in 3 sentences as: language model (3)
2) The bi-lexical translation probabilities; 3) The target language model
Page 4, “Non-Contiguous Tree Sequence Alignment-based Model”
On the other hand, to simplify the computation of the language model, we only compute it for source-side contiguous translational hypotheses, neglecting gaps in the target side if any.
    Page 6, “The Pisces decoder”
In the experiments, we train the translation model on the FBIS corpus (7.2M (Chinese) + 9.2M (English) words) and train a 4-gram language model on the Xinhua portion of the English Gigaword corpus (181M words) using the SRILM Toolkit (Stolcke, 2002).
    Page 6, “Experiments”
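Sentence 3 trains a 4-gram language model with SRILM. The following toy illustrates what "scoring with an n-gram LM" means, using maximum-likelihood bigram estimates with add-one smoothing over a tiny corpus; this is a hypothetical stand-in for SRILM's smoothed 4-gram model, not its actual estimation method, and all names are illustrative.

```python
# Toy bigram language model with add-one smoothing: a stand-in for the
# 4-gram SRILM model, showing how an LM prefers fluent word orders.
import math
from collections import Counter

def train_bigram(corpus):
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ['<s>'] + sent + ['</s>']
        unigrams.update(toks[:-1])          # history counts
        bigrams.update(zip(toks[:-1], toks[1:]))
    vocab = len(set(unigrams) | {'</s>'})
    return unigrams, bigrams, vocab

def logprob(sent, model):
    unigrams, bigrams, vocab = model
    toks = ['<s>'] + sent + ['</s>']
    # add-one smoothed conditional probabilities P(b | a)
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
               for a, b in zip(toks[:-1], toks[1:]))

model = train_bigram([['the', 'cat', 'sat'], ['the', 'dog', 'sat']])
# a seen word order scores higher than a scrambled one
assert logprob(['the', 'cat', 'sat'], model) > logprob(['sat', 'cat', 'the'], model)
```

This preference for attested word orders is why the decoder's language model feature rewards well-ordered target output, and why the paper skips LM computation across target-side gaps, where the local word order is not yet determined.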


translation task

Appears in 3 sentences as: translation task (3)
  1. Experimental results on the NIST MT-05 Chi-nese-English translation task show that the proposed model statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
6&7 as well as Exp 3&4, shows that non-contiguity on the target side in the Chinese-English translation task is not as useful as on the source side when constructing the noncontiguous phrasal rules.
    Page 7, “Experiments”
We also find that in the Chinese-English translation task, gaps are more effective on the Chinese side than on the English side.
    Page 8, “Conclusions and Future Work”
