A Tree Sequence Alignment-based Tree-to-Tree Translation Model
Zhang, Min and Jiang, Hongfei and Aw, Aiti and Li, Haizhou and Tan, Chew Lim and Li, Sheng

Article Structure

Abstract

This paper presents a translation model that is based on tree sequence alignment, where a tree sequence refers to a single sequence of sub-trees that covers a phrase.

Introduction

The phrase-based modeling method (Koehn et al., 2003; Och and Ney, 2004a) is a simple but powerful mechanism for machine translation, since it models local reorderings and translations of multi-word expressions well.

Related Work

Many techniques for linguistically syntax-based SMT have been proposed in the literature.

Tree Sequence Alignment Model

3.1 Tree Sequence Translation Rule

Rule Extraction

Rules are extracted from word-aligned, bi-parsed sentence pairs ⟨T(f_1^J), T(e_1^I), A⟩, which are …
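
To make the input representation above concrete, here is a minimal Python sketch of a word-aligned, bi-parsed sentence pair ⟨T(f_1^J), T(e_1^I), A⟩, together with the standard alignment-consistency check used when extracting phrase or rule spans. The class and function names are illustrative assumptions made for this summary, not identifiers from the paper, and the check shown is the generic phrase-extraction constraint rather than the paper's full rule-extraction procedure.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class TreeNode:
    label: str                                   # syntactic category or POS tag
    children: List["TreeNode"] = field(default_factory=list)
    span: Tuple[int, int] = (0, 0)               # [start, end) word indices covered by this node

    def is_leaf(self) -> bool:
        return not self.children

@dataclass
class BiparsedPair:
    src_tree: TreeNode                           # T(f_1^J): parse tree of the source sentence
    tgt_tree: TreeNode                           # T(e_1^I): parse tree of the target sentence
    alignment: List[Tuple[int, int]]             # A: word-alignment links (source index, target index)

def consistent(pair: BiparsedPair, src_span: Tuple[int, int], tgt_span: Tuple[int, int]) -> bool:
    """Generic consistency check: no alignment link may cross the candidate box,
    and at least one link must fall inside it."""
    (s0, s1), (t0, t1) = src_span, tgt_span
    inside = False
    for s, t in pair.alignment:
        in_src = s0 <= s < s1
        in_tgt = t0 <= t < t1
        if in_src != in_tgt:                     # a link leaves the box on one side only
            return False
        inside = inside or in_src
    return inside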

Decoding

Given T(f_1^J), the decoder is to find the best derivation θ that generates ⟨T(f_1^J), T(e_1^I)⟩.
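
Read as a search problem, the sentence above corresponds to the following objective. This is a hedged LaTeX paraphrase of the excerpt (the derivation is written θ, matching the notation recovered above), not an equation quoted from the paper:

\hat{\theta} \;=\; \operatorname*{arg\,max}_{\theta}\; \Pr\bigl(\theta,\, T(e_1^I) \,\big|\, T(f_1^J)\bigr)

The output translation e_1^I is then read off the target side of the best derivation \hat{\theta}.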

Experiments

6.1 Experimental Settings

Conclusions and Future Work

In this paper, we present a tree sequence alignment-based translation model to combine the strengths of phrase-based and syntax-based methods.

Topics

parse tree

Appears in 18 sentences as: parse tree (14) parse trees (4)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. A tree sequence refers to an ordered subtree sequence that covers a phrase or a consecutive tree fragment in a parse tree.
    Page 1, “Introduction”
  2. Yamada and Knight (2001) use a noisy-channel model to transfer a target parse tree into a source sentence.
    Page 2, “Related Work”
  3. (2006) propose a feature-based discriminative model for target-language syntactic structure prediction, given a source parse tree.
    Page 2, “Related Work”
  4. (2006) create an xRS rule headed by a pseudo, non-syntactic nonterminal symbol that subsumes the phrase and its corresponding multi-headed syntactic structure; and one sibling xRS rule that explains how the pseudo symbol can be combined with other genuine non-terminals for acquiring the genuine parse trees.
    Page 2, “Related Work”
  5. Although the solution is shown to be effective empirically, it only utilizes the source-side syntactic phrases of the input parse tree during decoding.
    Page 2, “Related Work”
  6. Figure 1: A word-aligned parse tree pair of a Chinese sentence and its English translation
    Page 3, “Related Work”
  7. source and target parse trees T(f_1^J) and T(e_1^I) in Fig. 1 …
    Page 3, “Tree Sequence Alignment Model”
  8. Fig. 2 illustrates two examples of tree sequences derived from the two parse trees.
    Page 3, “Tree Sequence Alignment Model”
  9. and their parse trees T(f_1^J) and T(e_1^I), the tree …
    Page 3, “Tree Sequence Alignment Model”
  10. 1) Pr(T(f_1^J) | f_1^J) = 1 since we only use the best source and target parse tree pairs in training.
    Page 4, “Tree Sequence Alignment Model”
  11. parse tree T(f_1^J), there are multiple derivations that could lead to the same target tree T(e_1^I); the mapping probability Pr(T(e_1^I) | T(f_1^J)) is obtained by summing over the probabilities of all derivations (see the sketch after this list).
    Page 4, “Tree Sequence Alignment Model”
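
Items 10 and 11 above can be transcribed into LaTeX as follows, with Θ denoting the set of derivations that map T(f_1^J) to T(e_1^I); this only restates what the two excerpts say (a single best parse pair per training sentence, and marginalization over derivations):

\Pr\bigl(T(f_1^J) \mid f_1^J\bigr) = 1,
\qquad
\Pr\bigl(T(e_1^I) \mid T(f_1^J)\bigr) \;=\; \sum_{\theta \in \Theta} \Pr\bigl(\theta,\, T(e_1^I) \,\big|\, T(f_1^J)\bigr)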

translation model

Appears in 18 sentences as: Translation Model (2) translation model (15) translation models (2)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. This paper presents a translation model that is based on tree sequence alignment, where a tree sequence refers to a single sequence of sub-trees that covers a phrase.
    Page 1, “Abstract”
  2. … Tree-to-Tree Translation Model
    Page 1, “Introduction”
  3. In this paper, we propose a tree-to-tree translation model that is based on tree sequence alignment.
    Page 1, “Introduction”
  4. Ding and Palmer (2005) propose a syntax-based translation model based on a probabilistic synchronous dependency insertion grammar.
    Page 2, “Related Work”
  5. (2005) propose a dependency treelet-based translation model.
    Page 2, “Related Work”
  6. (2007b) present an STSG-based tree-to-tree translation model.
    Page 2, “Related Work”
  7. Bod (2007) reports that the unsupervised STSG-based translation model performs much better than the supervised one.
    Page 2, “Related Work”
  8. (2007) integrate supertags (a kind of lexicalized syntactic description) into the target side of the translation model and language model …
    Page 2, “Related Work”
  9. We go beyond the single subtree mapping model to propose a tree sequence alignment-based translation model.
    Page 3, “Related Work”
  10. 3.2 Tree Sequence Translation Model
    Page 3, “Tree Sequence Alignment Model”
  11. the tree sequence-to-tree sequence translation model is formulated as: … (a hedged reconstruction of the formula appears after this list)
    Page 3, “Tree Sequence Alignment Model”
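
The formula referred to in item 11 is not reproduced in this excerpt. A decomposition that is consistent with the excerpts under "parse tree" above (the best-parse assumption Pr(T(f_1^J) | f_1^J) = 1, and Pr(T(e_1^I) | T(f_1^J)) obtained by summing over derivations) is the usual chain through the two parse trees; treat this as a hedged reconstruction, not a quotation of the paper's equation:

\Pr\bigl(e_1^I \mid f_1^J\bigr)
  \;=\; \Pr\bigl(T(f_1^J) \mid f_1^J\bigr)\;
        \Pr\bigl(T(e_1^I) \mid T(f_1^J)\bigr)\;
        \Pr\bigl(e_1^I \mid T(e_1^I)\bigr)

With the first factor fixed to 1, the model reduces to the tree-to-tree mapping probability times the probability of reading the target sentence off its parse tree.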

phrase-based

Appears in 16 sentences as: Phrase-based (1) phrase-based (15)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. The model leverages the strengths of both phrase-based and linguistically syntax-based methods.
    Page 1, “Abstract”
  2. The phrase-based modeling method (Koehn et al., 2003; Och and Ney, 2004a) is a simple but powerful mechanism for machine translation, since it models local reorderings and translations of multi-word expressions well.
    Page 1, “Introduction”
  3. It is designed to combine the strengths of phrase-based and syntax-based methods.
    Page 1, “Introduction”
  4. Experiment results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
    Page 1, “Introduction”
  5. However, most of them fail to make good use of the non-syntactic phrases that have proven useful in phrase-based methods (Koehn et al., 2003).
    Page 2, “Related Work”
  6. Chiang (2005)’s hierarchical phrase-based model achieves significant performance improvement.
    Page 2, “Related Work”
  7. In the last two years, many research efforts were devoted to integrating the strengths of phrase-based and syntax-based methods.
    Page 2, “Related Work”
  8. … model under the phrase-based translation framework, resulting in good performance improvement.
    Page 2, “Related Work”
  9. However, phrases are utilized independently in the phrase-based method without depending on any contexts.
    Page 2, “Related Work”
  10. In this paper, an alternative solution is presented to combine the strengths of phrase-based and syntax-based methods.
    Page 2, “Related Work”
  11. We use seven basic features that are analogous to the commonly used features in phrase-based systems (Koehn, 2004): 1) bidirectional rule mapping probabilities; 2) bidirectional lexical rule translation probabilities; 3) the target language model; 4) the number of rules used; and 5) the number of target words (a scoring sketch follows this list).
    Page 4, “Tree Sequence Alignment Model”
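
Feature lists like the one in item 11 are normally combined in a log-linear model, with one tuned weight per feature. The sketch below shows that standard combination in Python; the feature names, example values, and uniform weights are placeholders invented for this summary (in practice the weights would be tuned, e.g. with minimum error rate training), so this illustrates the scoring scheme rather than the authors' decoder code.

import math
from typing import Dict

def derivation_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Log-linear score: a weighted sum of (log-)feature values."""
    return sum(weights[name] * value for name, value in features.items())

# Illustrative feature values for one candidate derivation.
example_features = {
    "log_rule_prob_src2tgt": math.log(0.42),   # rule mapping probability, source-to-target
    "log_rule_prob_tgt2src": math.log(0.35),   # rule mapping probability, target-to-source
    "log_lex_prob_src2tgt": math.log(0.10),    # lexical rule translation probability, source-to-target
    "log_lex_prob_tgt2src": math.log(0.08),    # lexical rule translation probability, target-to-source
    "log_lm_prob": -12.7,                      # target language model log-probability
    "rule_count": 3.0,                         # number of rules used
    "target_word_count": 9.0,                  # number of target words
}
example_weights = {name: 1.0 for name in example_features}  # placeholder weights

print(round(derivation_score(example_features, example_weights), 3))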

lexicalized

Appears in 9 sentences as: lexicalized (12)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. (2007) integrate supertags (a kind of lexicalized syntactic description) into the target side of the translation model and language model …
    Page 2, “Related Work”
  2. (2006) treat all bilingual phrases as lexicalized tree-to-string rules, including those non-syntactic phrases in the training corpus.
    Page 2, “Related Work”
  3. In addition, we define two new features: 1) the number of lexical words in a rule to control the model’s preference for lexicalized rules over unlexicalized rules …
    Page 4, “Tree Sequence Alignment Model”
  4. We first generate all fully lexicalized source and target tree sequences using a dynamic programming algorithm and then iterate over all generated source and target tree sequences …
    Page 4, “Rule Extraction”
  5. Table 1: # of rules used in the testing (d = 4, h = 6) (BP: bilingual phrase (used in Moses), TR: tree rule (only 1 tree), TSR: tree sequence rule (> 1 tree), L: fully lexicalized, P: partially lexicalized, U: unlexicalized)
    Page 6, “Experiments”
  6. lexicalized rules), in which the lexicalized TSRs model all non-syntactic phrase pairs with rich syntactic information.
    Page 6, “Experiments”
  7. It suggests that they are complementary to each other since the lexicalized TSRs are used to model non-syntactic phrases while the other two kinds of TSRs can generalize the lexicalized rules to unseen phrases.
    Page 7, “Experiments”
  8. 2) The lexicalized TSRs make the major contribution since they can capture non-syntactic phrases with syntactic structure features.
    Page 7, “Experiments”
  9. Note that the reordering between lexical words and nonterminal leaf nodes is not considered here) and Discontinuous Phrase Rules (DPR: refers to those rules having at least one nonterminal leaf node between two lexicalized leaf nodes) in our … (see the classification sketch after this list)
    Page 7, “Experiments”
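
The legend of Table 1 (item 5) and the DPR definition in item 9 are mechanical enough to express in code. The sketch below classifies a rule's source-side leaf string as fully lexicalized (L), partially lexicalized (P), or unlexicalized (U), and tests the DPR condition; the leaf encoding (nonterminal leaves written as "X:NP"-style strings) is an assumption made for this sketch, not the paper's data format.

from typing import List

def is_nonterminal(leaf: str) -> bool:
    # Assumed encoding: nonterminal (substitutable) leaves are written as "X:<category>".
    return leaf.startswith("X:")

def lexicalization_type(leaves: List[str]) -> str:
    """Classify as 'L' (fully lexicalized), 'P' (partially lexicalized) or 'U' (unlexicalized)."""
    has_lex = any(not is_nonterminal(l) for l in leaves)
    has_nonterm = any(is_nonterminal(l) for l in leaves)
    if has_lex and has_nonterm:
        return "P"
    return "L" if has_lex else "U"

def is_discontinuous_phrase_rule(leaves: List[str]) -> bool:
    """DPR: at least one nonterminal leaf node between two lexicalized leaf nodes."""
    lex_positions = [i for i, l in enumerate(leaves) if not is_nonterminal(l)]
    if len(lex_positions) < 2:
        return False
    return any(is_nonterminal(l) for l in leaves[lex_positions[0] + 1 : lex_positions[-1]])

print(lexicalization_type(["the", "X:NP", "of", "X:NP"]))   # -> P
print(is_discontinuous_phrase_rule(["the", "X:NP", "of"]))  # -> True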

BLEU

Appears in 5 sentences as: BLEU (5)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. BLEU (%)
    Page 7, “Experiments”
  2. Rule Type: TR (STSG) | TR+TSR_L | TR+TSR_L+TSR_P | TR+TSR; BLEU (%): 24.71 | 25.72 | 25.93 | 26.07
    Page 7, “Experiments”
  3. Rule Type / BLEU (%): TR+TSR, 26.07; (TR+TSR) w/o SRR, 24.62; (TR+TSR) w/o DPR, 25.78
    Page 7, “Experiments”
  4. It clearly indicates that SRRs are very effective in reordering structures, improving performance by 1.45 BLEU points (26.07 − 24.62).
    Page 7, “Experiments”
  5. BLEU (%)
    Page 8, “Experiments”

NIST

Appears in 4 sentences as: NIST (5)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
  2. Experiment results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
    Page 1, “Introduction”
  3. We used sentences with less than 50 characters from the NIST MT-2002 test set as our development set and the NIST MT-2005 test set as our test set.
    Page 6, “Experiments”
  4. The experimental results on the NIST MT-2005 Chinese-English translation task demonstrate the effectiveness of the proposed model.
    Page 8, “Conclusions and Future Work”

proposed model

Appears in 4 sentences as: proposed model (4)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. The proposed model adopts tree sequence as the basic translation unit and utilizes tree sequence alignments to model the translation process.
    Page 1, “Introduction”
  2. Figs. 1 and 3 show how the proposed model works.
    Page 4, “Tree Sequence Alignment Model”
  3. For the SCFG/STSG and our proposed model, we used the same settings except for the parameters d and h (d = 1 and h = 2 for the SCFG; d = 1 and h = 6 for the STSG; d = 4 and h = 6 for our model).
    Page 6, “Experiments”
  4. The experimental results on the NIST MT-2005 Chinese-English translation task demonstrate the effectiveness of the proposed model .
    Page 8, “Conclusions and Future Work”

word alignment

Appears in 4 sentences as: word alignment (4) word alignments (1)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. We used GIZA++ (Och and Ney, 2004) and the “grow-diag-final” heuristic to generate m-to-n word alignments (a symmetrization sketch follows this list).
    Page 6, “Experiments”
  2. (2006) reports that discontinuities are very useful for translational equivalence analysis using binary-branching structures under word alignment and parse tree constraints, while they are of almost no use under word alignment constraints only.
    Page 7, “Experiments”
  3. In addition, word alignment is a hard constraint in our rule extraction.
    Page 8, “Conclusions and Future Work”
  4. We will study direct structure alignments to reduce the impact of word alignment errors.
    Page 8, “Conclusions and Future Work”
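
For reference, item 1 mentions symmetrizing the two directional GIZA++ alignments with “grow-diag-final”. The function below is a sketch of that textbook heuristic (start from the intersection, grow along neighbouring links from the union, then add remaining links that touch an unaligned word); it follows the commonly published description rather than the exact script used by the authors.

from typing import Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)

def grow_diag_final(s2t: Set[Link], t2s: Set[Link]) -> Set[Link]:
    """Symmetrize two directional word alignments with grow-diag-final."""
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    union = s2t | t2s
    alignment = set(s2t & t2s)                 # start from the intersection
    aligned_src = {s for s, _ in alignment}
    aligned_tgt = {t for _, t in alignment}

    # grow-diag: repeatedly add neighbouring union links that touch an unaligned word
    added = True
    while added:
        added = False
        for s, t in sorted(alignment):
            for ds, dt in neighbors:
                ns, nt = s + ds, t + dt
                if ((ns, nt) in union and (ns, nt) not in alignment
                        and (ns not in aligned_src or nt not in aligned_tgt)):
                    alignment.add((ns, nt))
                    aligned_src.add(ns)
                    aligned_tgt.add(nt)
                    added = True

    # final: add any remaining directional link that still touches an unaligned word
    for links in (s2t, t2s):
        for s, t in sorted(links):
            if (s, t) not in alignment and (s not in aligned_src or t not in aligned_tgt):
                alignment.add((s, t))
                aligned_src.add(s)
                aligned_tgt.add(t)
    return alignment

# Toy usage: two directional alignments for a three-word sentence pair.
print(sorted(grow_diag_final({(0, 0), (1, 2)}, {(0, 0), (1, 1), (2, 2)})))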

baseline systems

Appears in 3 sentences as: baseline systems (3)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
  2. We set three baseline systems: Moses (Koehn et al., 2007), and SCFG-based and STSG-based tree-to-tree translation models (Zhang et al., 2007).
    Page 6, “Experiments”
  3. In this subsection, we first report the rule distributions and compare our model with the three baseline systems.
    Page 6, “Experiments”

Chinese-English

Appears in 3 sentences as: Chinese-English (3)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
  2. Experiment results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
    Page 1, “Introduction”
  3. The experimental results on the NIST MT-2005 Chinese-English translation task demonstrate the effectiveness of the proposed model.
    Page 8, “Conclusions and Future Work”

significantly outperforms

Appears in 3 sentences as: significantly outperforms (3)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
  2. Experiment results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
    Page 1, “Introduction”
  3. 1) Our tree sequence-based model significantly outperforms (p < 0.01) previous phrase-based and linguistically syntax-based methods.
    Page 6, “Experiments”

translation task

Appears in 3 sentences as: translation task (3)
In A Tree Sequence Alignment-based Tree-to-Tree Translation Model
  1. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
    Page 1, “Abstract”
  2. Experiment results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
    Page 1, “Introduction”
  3. The experimental results on the NIST MT-2005 Chinese-English translation task demonstrate the effectiveness of the proposed model.
    Page 8, “Conclusions and Future Work”
