A Syntax-Driven Bracketing Model for Phrase-Based Translation
Xiong, Deyi and Zhang, Min and Aw, Aiti and Li, Haizhou

Article Structure

Abstract

Syntactic analysis influences the way in which the source sentence is translated.

Introduction

The phrase-based approach is widely adopted in statistical machine translation (SMT).

The Acquisition of Bracketing Instances

In this section, we formally define the bracketing instance, which comes in two types: the binary bracketing instance and the unary bracketing instance.

The Syntax-Driven Bracketing Model 3.1 The Model

Our interest is to automatically detect phrase bracketing using rich contextual information.

Experiments

We carried out the MT experiments on Chinese-to-English translation, using the system of Xiong et al. (2006) as our baseline system.

Analysis

In this section, we present analysis to perceive the influence mechanism of the SDB model on phrase translation by studying the effects of syntax-driven features and differences of 1-best translation outputs.

Conclusion

In this paper, we presented a syntax-driven bracketing model that automatically learns bracketing knowledge from training corpus.

Topics

parse tree

Appears in 14 sentences as: parse tree (13) parse tree: (1) parse trees (1)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. Consider the following Chinese fragment with its parse tree:
    Page 1, “Introduction”
  2. However, the parse tree of the source fragment constrains the phrase “ER “13”” to be translated as a unit.
    Page 1, “Introduction”
  3. Without considering syntactic constraints from the parse tree, the decoder makes wrong decisions not only on phrase movement but also on the lexical selection for the multi-meaning word “75’”.
    Page 1, “Introduction”
  4. For each of these instances, we automatically extract relevant syntactic features from the source parse tree as bracketing evidences.
    Page 2, “Introduction”
  5. Let c and e be the source sentence and the target sentence, W be the word alignment between them, T be the parse tree of c. We define a binary bracketing instance as a tuple $\langle b, \tau(c_{i..j}), \tau(c_{j+1..k}), \tau(c_{i..k}) \rangle$ where $b \in \{\text{bracketable}, \text{unbracketable}\}$, $c_{i..j}$ and $c_{j+1..k}$ are two neighboring source phrases and $\tau(T, s)$ ($\tau(s)$ for short) is a subtree function which returns the minimal subtree covering the source sequence $s$ from the source parse tree T. Note that $\tau(c_{i..k})$ includes both $\tau(c_{i..j})$ and $\tau(c_{j+1..k})$.
    Page 3, “The Acquisition of Bracketing Instances”
  6. 1: Input: sentence pair (c, e), the parse tree T of c and the word alignment W between c and e 2: $\Re := \emptyset$ 3: for each $(i, j, k) \in c$ do 4: if there exist a target phrase $e_{p..q}$ aligned to $c_{i..j}$ and $e_{p'..q'}$ aligned to $c_{j+1..k}$ then
    Page 3, “The Acquisition of Bracketing Instances”
  7. These features capture syntactic “horizontal context” which demonstrates the expansion trend of the source phrase $s$, $s_1$ and $s_2$ on the parse tree.
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  8. The tree path $\sigma(s_1)..\sigma(s)$ connecting $\sigma(s_1)$ and $\sigma(s)$, $\sigma(s_2)..\sigma(s)$ connecting $\sigma(s_2)$ and $\sigma(s)$, and $\sigma(s)..\rho$ connecting $\sigma(s)$ and the root node $\rho$ of the whole parse tree are used as features.
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  9. These features provide syntactic “vertical context” which shows the generation history of the source phrases on the parse tree.
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  10. We removed 15,250 sentences, for which the Chinese parser failed to produce syntactic parse trees.
    Page 5, “Experiments”
  11. This proportion, which we call consistent constituent matching (CCM) rate, reflects the extent to which the translation output respects the source parse tree.
    Page 7, “Analysis”

See all papers in Proc. ACL 2009 that mention parse tree.

phrase-based

Appears in 14 sentences as: Phrase-Based (1) phrase-based (13)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. Previous efforts add syntactic constraints to phrase-based translation by directly rewarding/punishing a hypothesis whenever it matches/violates source-side constituents.
    Page 1, “Abstract”
  2. The phrase-based approach is widely adopted in statistical machine translation (SMT).
    Page 1, “Introduction”
  3. In such a process, original phrase-based decoding (Koehn et al., 2003) does not take advantage of any linguistic analysis, which, however, is broadly used in rule-based approaches.
    Page 1, “Introduction”
  4. Since it is not linguistically motivated, original phrase-based decoding might produce ungrammatical or even wrong translations.
    Page 1, “Introduction”
  5. The output is generated from a phrase-based system which does not involve any syntactic analysis.
    Page 1, “Introduction”
  6. The SDB model maintains and protects the strength of the phrase-based approach in a better way than the CMVC does.
    Page 2, “Introduction”
  7. In section 3 we elaborate the syntax-driven bracketing model, including feature generation and the integration of the SDB model into phrase-based SMT.
    Page 2, “Introduction”
  8. 3.3 The Integration of the SDB Model into Phrase-Based SMT
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  9. We integrate the SDB model into phrase-based SMT to help decoder perform syntax-driven phrase translation.
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  10. In this paper, we implement the SDB model in a state-of-the-art phrase-based system which adapts a binary bracketing transduction grammar (BTG) (Wu, 1997) to phrase translation and reordering, described in (Xiong et al., 2006).
    Page 5, “The Syntax-Driven Bracketing Model 3.1 The Model”
  11. The SDB model, however, is not only limited to phrase-based SMT using BTG rules.
    Page 5, “The Syntax-Driven Bracketing Model 3.1 The Model”

BLEU

Appears in 8 sentences as: BLEU (8)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. Our experimental results display that our SDB model achieves a substantial improvement over the baseline and significantly outperforms XP+ according to the BLEU metric (Papineni et al., 2002).
    Page 2, “Introduction”
  2. In addition, our analysis shows further evidences of the performance gain from a different perspective than that of BLEU.
    Page 2, “Introduction”
  3. Statistical significance in BLEU score differences was tested by paired bootstrap re-sampling (Koehn, 2004).
    Page 5, “Experiments”
  4. Like (Marton and Resnik, 2008), we find that the XP+ feature obtains a significant improvement of 1.08 BLEU over the baseline.
    Page 5, “Experiments”
  5. However, using all syntax-driven features described in section 3.2, our SDB models achieve larger improvements of up to 1.67 BLEU.
    Page 5, “Experiments”
  6. The constituent boundary matching feature (CBMF) is a very important feature, which by itself achieves significant improvement over the baseline (up to 1.13 BLEU).
    Page 6, “Analysis”
  7. 5.2 Beyond BLEU
    Page 6, “Analysis”
  8. Since BLEU is not sufficient
    Page 6, “Analysis”
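
The significance test mentioned in sentence 3 above, paired bootstrap re-sampling, can be sketched roughly as follows. This is a minimal illustration, not the authors' evaluation code: the function name `paired_bootstrap`, its parameters, and the toy `mean` metric are assumptions made for the example. For real BLEU the `metric` callable would aggregate per-sentence n-gram statistics into a corpus score.

```python
import random


def paired_bootstrap(stats_a, stats_b, metric, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which system A scores above system B.

    stats_a, stats_b: per-sentence statistics for the two systems, aligned by
                      sentence index (hypothetical input format).
    metric:           callable mapping a list of per-sentence statistics
                      to a single corpus-level score.
    """
    assert len(stats_a) == len(stats_b)
    rng = random.Random(seed)
    n = len(stats_a)
    wins = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement; use the SAME indices
        # for both systems so the comparison stays paired.
        idx = [rng.randrange(n) for _ in range(n)]
        if metric([stats_a[i] for i in idx]) > metric([stats_b[i] for i in idx]):
            wins += 1
    return wins / n_samples


def mean(xs):
    """Toy stand-in for a corpus-level metric."""
    return sum(xs) / len(xs)


# Toy per-sentence scores: system A is better on every sentence.
win_rate = paired_bootstrap([0.4, 0.5, 0.6], [0.3, 0.4, 0.5], mean)
```

A win rate of at least 0.95 is commonly read as significance at the 0.05 level under this procedure.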

syntactic contexts

Appears in 6 sentences as: syntactic contexts (6)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. whether the current phrase can be translated as a unit or not within particular syntactic contexts (Fox, 2002), than that of constituent matching/violation.
    Page 2, “Introduction”
  2. It is able to reward non-syntactic translations by assigning an adequate probability to them if these translations are appropriate to particular syntactic contexts on the source side, rather than always punish them.
    Page 2, “Introduction”
  3. We consider this task as a binary-class classification problem: whether the current source phrase $s$ is bracketable ($b$) within particular syntactic contexts ($\tau(s)$).
    Page 3, “The Syntax-Driven Bracketing Model 3.1 The Model”
  4. If two neighboring sub-phrases $s_1$ and $s_2$ are given, we can use more inner syntactic contexts to complete this binary classification task.
    Page 3, “The Syntax-Driven Bracketing Model 3.1 The Model”
  5. new feature into the log-linear translation model: PSDB (b|T, This feature is computed by the SDB model described in equation (3) or equation (4), which estimates a probability that a source span is to be translated as a unit within particular syntactic contexts.
    Page 5, “The Syntax-Driven Bracketing Model 3.1 The Model”
  6. By allowing appropriate violations to translate non-syntactic phrases according to particular syntactic contexts, our SDB model better inherits the strength of phrase-based approach than XP+.
    Page 7, “Analysis”

log-linear

Appears in 4 sentences as: log-linear (4)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. These constituent matching/violation counts are used as a feature in the decoder’s log-linear model and their weights are tuned via minimal error rate training (MERT) (Och, 2003).
    Page 1, “Introduction”
  2. Similar to previous methods, our SDB model is integrated into the decoder’s log-linear model as a feature so that we can inherit the idea of soft constraints.
    Page 2, “Introduction”
  3. new feature into the log-linear translation model: PSDB (b|T, This feature is computed by the SDB model described in equation (3) or equation (4), which estimates a probability that a source span is to be translated as a unit within particular syntactic contexts.
    Page 5, “The Syntax-Driven Bracketing Model 3.1 The Model”
  4. We want to further study the happenings after we integrate the constraint feature (our SDB model and Marton and Resnik’s XP+) into the log-linear translation model.
    Page 6, “Analysis”
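
As a rough illustration of how a bracketing probability could enter a log-linear model as one more weighted feature, consider the sketch below. It is not the authors' decoder: the feature names (`tm`, `lm`, `sdb`), the weights, and the choice to use the log of the probability as the feature value are all assumptions made for the example. In the paper the weights are tuned with MERT rather than set by hand.

```python
import math


def loglinear_score(features, weights):
    """Log-linear model score: weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())


def with_sdb_feature(features, p_sdb):
    """Return a copy of the feature dict extended with an SDB feature.

    p_sdb is a probability in (0, 1] that the source span is translated
    as a unit; the log-probability is used as the feature value
    (an assumption of this sketch).
    """
    extended = dict(features)
    extended["sdb"] = math.log(p_sdb)
    return extended


# Hypothetical feature values and weights for a single hypothesis.
base = {"tm": -2.0, "lm": -3.0}
weights = {"tm": 1.0, "lm": 0.5, "sdb": 2.0}
score = loglinear_score(with_sdb_feature(base, 0.5), weights)
```

Because the feature is weighted rather than enforced, a low bracketing probability lowers a hypothesis score without ruling the hypothesis out, which is the soft-constraint idea mentioned in sentence 2.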

subtrees

Appears in 4 sentences as: subtrees (5)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. From a binary bracketing instance, we derive a unary bracketing instance $\langle b, \tau(c_{i..k}) \rangle$, ignoring the subtrees $\tau(c_{i..j})$ and $\tau(c_{j+1..k})$.
    Page 3, “The Acquisition of Bracketing Instances”
  2. These features are to capture the relationship between a source phrase $s$ and $\tau(s)$ or $\tau(s)$’s subtrees.
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  3. There are three different scenarios: 1) exact match, where $s$ exactly matches the boundaries of $\tau(s)$ (figure 3(a)), 2) inside match, where $s$ exactly spans a sequence of $\tau(s)$’s subtrees (figure 3(b)), and 3) crossing, where $s$ crosses the boundaries of one or two subtrees of $\tau(s)$ (figure 3(c)).
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
  4. The source phrase $s_2$ exactly spans two subtrees VV and AS of VP, therefore CBMF is “VP-I”.
    Page 4, “The Syntax-Driven Bracketing Model 3.1 The Model”
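
The three matching scenarios in sentence 3 above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' feature extractor: the `Node` class, the helper names, and the toy tree are hypothetical, and only the `-I` suffix is confirmed by the excerpt ("VP-I"); the `-M` and `-C` suffixes for exact match and crossing are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    start: int = 0  # index of first covered word
    end: int = 0    # index of last covered word (inclusive)


def index_spans(node, pos=0):
    """Assign word spans to every node; return the next free word index."""
    if not node.children:          # childless node covers exactly one word
        node.start = node.end = pos
        return pos + 1
    node.start = pos
    for child in node.children:
        pos = index_spans(child, pos)
    node.end = pos - 1
    return pos


def minimal_subtree(node, i, j):
    """Lowest node whose span covers words i..j."""
    for child in node.children:
        if child.start <= i and j <= child.end:
            return minimal_subtree(child, i, j)
    return node


def cbmf(root, i, j):
    """Constituent boundary matching label for the source span i..j."""
    t = minimal_subtree(root, i, j)
    if (t.start, t.end) == (i, j):
        return t.label + "-M"      # exact match (suffix is an assumption)
    starts = {c.start for c in t.children}
    ends = {c.end for c in t.children}
    if i in starts and j in ends:
        return t.label + "-I"      # inside match: spans whole child subtrees
    return t.label + "-C"          # crossing (suffix is an assumption)


# Toy tree: IP -> NP(NN) VP(VV AS NP(NN)); words are indexed 0..3.
tree = Node("IP", [Node("NP", [Node("NN")]),
                   Node("VP", [Node("VV"), Node("AS"), Node("NP", [Node("NN")])])])
index_spans(tree)
label = cbmf(tree, 1, 2)  # span covering VV and AS
```

On the toy tree, the span covering VV and AS gets the inside-match label on VP, mirroring the "VP-I" example in sentence 4.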

word alignment

Appears in 4 sentences as: word alignment (2) word alignments (2)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. According to the word alignments, we define bracketable and unbracketable instances.
    Page 2, “Introduction”
  2. Let c and e be the source sentence and the target sentence, W be the word alignment between them, T be the parse tree of c. We define a binary bracketing instance as a tuple $\langle b, \tau(c_{i..j}), \tau(c_{j+1..k}), \tau(c_{i..k}) \rangle$ where $b \in \{\text{bracketable}, \text{unbracketable}\}$, $c_{i..j}$ and $c_{j+1..k}$ are two neighboring source phrases and $\tau(T, s)$ ($\tau(s)$ for short) is a subtree function which returns the minimal subtree covering the source sequence $s$ from the source parse tree T. Note that $\tau(c_{i..k})$ includes both $\tau(c_{i..j})$ and $\tau(c_{j+1..k})$.
    Page 3, “The Acquisition of Bracketing Instances”
  3. 1: Input: sentence pair (c, e), the parse tree T of c and the word alignment W between c and e 2: $\Re := \emptyset$ 3: for each $(i, j, k) \in c$ do 4: if there exist a target phrase $e_{p..q}$ aligned to $c_{i..j}$ and $e_{p'..q'}$ aligned to $c_{j+1..k}$ then
    Page 3, “The Acquisition of Bracketing Instances”
  4. To obtain word-level alignments, we ran GIZA++ (Och and Ney, 2000) on the remaining corpus in both directions, and applied the “grow-diag-final” refinement rule (Koehn et al., 2005) to produce the final many-to-many word alignments.
    Page 5, “Experiments”
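
The instance-acquisition loop quoted in sentences 2 and 3 above stops before the step that assigns the label b. The sketch below fills that step with an assumption: two neighboring source phrases are labeled bracketable when their aligned target phrases are adjacent, and unbracketable otherwise. The function names are hypothetical, the subtree lookups are omitted, and target spans are taken as the tightest span consistent with the alignment, as in standard phrase extraction.

```python
def aligned_target_span(alignment, i, j):
    """Tightest target span (p, q) consistent with source span i..j, or None.

    alignment: set of (source_index, target_index) links.
    Consistent means no target word inside p..q links outside i..j.
    """
    targets = [t for (s, t) in alignment if i <= s <= j]
    if not targets:
        return None
    p, q = min(targets), max(targets)
    for (s, t) in alignment:
        if p <= t <= q and not (i <= s <= j):
            return None
    return (p, q)


def binary_instances(n, alignment):
    """Enumerate (left span, right span, label) over a source sentence of n words."""
    instances = []
    for i in range(n):
        for j in range(i, n - 1):
            for k in range(j + 1, n):
                left = aligned_target_span(alignment, i, j)
                right = aligned_target_span(alignment, j + 1, k)
                if left is None or right is None:
                    continue
                # Assumption of this sketch: adjacency on the target side
                # decides the label.
                adjacent = left[1] + 1 == right[0] or right[1] + 1 == left[0]
                label = "bracketable" if adjacent else "unbracketable"
                instances.append(((i, j), (j + 1, k), label))
    return instances


# Toy alignment: source word 0 -> target 0, 1 -> 2, 2 -> 1.
toy = binary_instances(3, {(0, 0), (1, 2), (2, 1)})
```

In the toy alignment, source words 0 and 1 map to non-adjacent target positions, so that pair is unbracketable, while word 0 combined with the span 1..2 is bracketable.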

significant improvement

Appears in 3 sentences as: significant improvement (2) significant improvements (1)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. Although experiments show that this constituent matching/violation counting feature achieves significant improvements on various language-pairs, one issue is that matching syntactic analysis can not always guarantee a good translation, and violating syntactic structure does not always induce a bad translation.
    Page 2, “Introduction”
  2. Like (Marton and Resnik, 2008), we find that the XP+ feature obtains a significant improvement of 1.08 BLEU over the baseline.
    Page 5, “Experiments”
  3. The constituent boundary matching feature (CBMF) is a very important feature, which by itself achieves significant improvement over the baseline (up to 1.13 BLEU).
    Page 6, “Analysis”

significantly outperforms

Appears in 3 sentences as: significantly outperforms (3)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. Our experimental results display that our SDB model achieves a substantial improvement over the baseline and significantly outperforms XP+ according to the BLEU metric (Papineni et al., 2002).
    Page 2, “Introduction”
  2. The binary SDB (BiSDB) model statistically significantly outperforms Marton and Resnik’s XP+ by an absolute improvement of 0.59 (relatively 2%).
    Page 5, “Experiments”
  3. Experiments show that our model achieves substantial improvements over baseline and significantly outperforms (Marton and Resnik, 2008)’s XP+.
    Page 8, “Conclusion”

translation model

Appears in 3 sentences as: translation model (2) translation models (1)
In A Syntax-Driven Bracketing Model for Phrase-Based Translation
  1. new feature into the log-linear translation model: PSDB (b|T, This feature is computed by the SDB model described in equation (3) or equation (4), which estimates a probability that a source span is to be translated as a unit within particular syntactic contexts.
    Page 5, “The Syntax-Driven Bracketing Model 3.1 The Model”
  2. All translation models were trained on the FBIS corpus.
    Page 5, “Experiments”
  3. We want to further study the happenings after we integrate the constraint feature (our SDB model and Marton and Resnik’s XP+) into the log-linear translation model.
    Page 6, “Analysis”
