Handling Ambiguities of Bilingual Predicate-Argument Structures for Statistical Machine Translation
Zhai, Feifei and Zhang, Jiajun and Zhou, Yu and Zong, Chengqing

Article Structure

Abstract

Predicate-argument structure (PAS) has been demonstrated to be very effective in improving SMT performance.

Introduction

Predicate-argument structure (PAS) depicts the relationship between a predicate and its associated arguments, indicating the skeleton structure of a sentence at the semantic level.

PAS-based Translation Framework

The PAS-based translation framework performs translation based on PAS transformation (Zhai et al., 2012).

Inside Context Integration

In this section, we integrate the inside context of the PAS into PASTRs to perform PAS disambiguation.

Maximum Entropy PAS Disambiguation (MEPD) Model

In order to handle the role ambiguities, in this section, we concentrate on utilizing a maximum entropy model to incorporate the context information for PAS disambiguation.

Integrating into the PAS-based Translation Framework

In this section, we integrate our method of PAS disambiguation into the PAS-based translation framework when translating each test sentence.

Related Work

The method of PAS disambiguation for SMT is relevant to the previous work on context-dependent translation.

Experiment

7.1 Experimental Setup

Conclusion and Future Work

In this paper, we focus on the problem of ambiguities for PASs.

Topics

context information

Appears in 11 sentences as: context information (11)
  1. In this way, we incorporate rich context information of PAS for disambiguation.
    Page 1, “Abstract”
  2. In this paper, we propose two novel methods to incorporate rich context information to handle PAS ambiguities.
    Page 2, “Introduction”
  3. The target-side-like PAS is selected only according to the language model and translation probabilities, without considering any context information of PAS.
    Page 3, “PAS-based Translation Framework”
  4. In order to handle the role ambiguities, in this section, we concentrate on utilizing a maximum entropy model to incorporate the context information for PAS disambiguation.
    Page 4, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  5. They combine rich context information to do disambiguation for words or phrases, and achieve improved translation performance.
    Page 5, “Related Work”
  6. By incorporating the rich context information as features, they chose better rules for translation and yielded stable improvements on translation quality.
    Page 5, “Related Work”
  7. They also combine the context information in the model.
    Page 5, “Related Work”
  8. Different from their work, we incorporate the context information to do PAS disambiguation based on the entire PAS.
    Page 5, “Related Work”
  9. Specifically, after integrating the inside context information of PAS into transformation, we can see that system IC-PASTR significantly outperforms system PASTR by 0.71 BLEU points.
    Page 6, “Experiment”
  10. Conversely, by considering the context information, the PASTR+MEPD system chooses a correct rule for translation:
    Page 8, “Experiment”
  11. The two methods successfully incorporate the rich context information into the translation process.
    Page 8, “Conclusion and Future Work”


maximum entropy

Appears in 9 sentences as: maximum entropy (9)
  1. Then we propose two novel methods to handle the two PAS ambiguities for SMT accordingly: 1) inside context integration; 2) a novel maximum entropy PAS disambiguation (MEPD) model.
    Page 1, “Abstract”
  2. As to the role ambiguity, we design a novel maximum entropy PAS disambiguation (MEPD) model to combine various context features, such as context words of PAS.
    Page 2, “Introduction”
  3. Thus to overcome this problem, we design two novel methods to cope with the PAS ambiguities: inside-context integration and a maximum entropy PAS disambiguation (MEPD) model.
    Page 3, “PAS-based Translation Framework”
  4. In order to handle the role ambiguities, in this section, we concentrate on utilizing a maximum entropy model to incorporate the context information for PAS disambiguation.
    Page 4, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  5. The maximum entropy model is the classical way to handle this problem:
    Page 4, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  6. We train a maximum entropy classifier for each Sp via an off-the-shelf MaxEnt toolkit.
    Page 4, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  7. Note that since the training procedure of maximum entropy classifier is really fast, it does not take much time to train these classifiers.
    Page 5, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  8. (2010) designed maximum entropy (ME) classifiers to do better rule selection for the hierarchical phrase-based model and the tree-to-string model, respectively.
    Page 5, “Related Work”
  9. Towards the MEPD model, we design a maximum entropy model for each ambiguous source-side PAS.
    Page 8, “Conclusion and Future Work”
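
The excerpts above describe training a per-predicate maximum entropy classifier over context features to score candidate target-side-like PASs. A minimal log-linear scoring sketch of that idea (the feature names, candidate labels, and weights below are invented for illustration; the paper's actual feature set includes context words of the PAS and syntax tags such as st(Ei)):

```python
import math

def maxent_probs(features, labels, weights):
    """P(label | features) under a log-linear (maximum entropy) model:
    score(y) = sum of weights for each active (feature, y) pair,
    normalized with a softmax over the candidate labels."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in features) for y in labels}
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}

# Hypothetical candidate target-side-like PAS orderings for one predicate,
# and hypothetical learned weights on (feature, label) pairs.
labels = ["[A0][P][A1]", "[A1][P][A0]"]
weights = {("ctx_word=yesterday", "[A0][P][A1]"): 1.2,
           ("st(A1)=NP", "[A1][P][A0]"): 0.4}
probs = maxent_probs(["ctx_word=yesterday", "st(A1)=NP"], labels, weights)
```

In a real system one such classifier would be trained per ambiguous source-side PAS, and its probability would enter the decoder as one more log-linear feature.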


development set

Appears in 5 sentences as: development set (5)
  1. The development set and test set come from the NIST evaluation test data (from 2003 to 2005).
    Page 5, “Experiment”
  2. Finally, the development set includes 595 sentences from NIST MT03 and the test set contains 1,786 sentences from NIST MT04 and MT05.
    Page 6, “Experiment”
  3. We perform SRL on the source part of the training set, development set and test set by the Chinese SRL system used in (Zhuang and Zong, 2010b).
    Page 6, “Experiment”
  4. Moreover, according to our statistics, among all PASs appearing in the development set and test set, 56.7% of them carry gap strings.
    Page 6, “Experiment”
  5. The statistics are conducted on the combination of development set and test set.
    Page 7, “Experiment”


parse tree

Appears in 5 sentences as: parse tree (4) parse trees (2)
  1. The stag sequence dominates the corresponding syntactic tree fragments in the parse tree.
    Page 3, “Inside Context Integration”
  2. (2012) attached the IC to its neighboring elements based on parse trees.
    Page 3, “Inside Context Integration”
  3. These features include st(Ei), i.e., the highest syntax tag for each argument, and fst(PAS) which is the lowest father node of Sp in the parse tree.
    Page 5, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  4. To relieve the negative effect of SRL errors, we get the multiple SRL results by providing the SRL system with 3-best parse trees of Berkeley parser (Petrov and Klein, 2007), 1-best parse tree of Bikel parser (Bikel, 2004) and Stanford parser (Klein and Manning, 2003).
    Page 6, “Experiment”
  5. Thus, the system using PASTRs can only attach the long phrase to the predicate “511:” according to the parse tree, and meanwhile, make use of a transformation rule as follows:
    Page 7, “Experiment”


word alignment

Appears in 5 sentences as: word alignment (4) word alignments (1)
  1. We demand that every element and its corresponding target span must be consistent with word alignment.
    Page 4, “Inside Context Integration”
  2. Note that we only apply the source-side PAS and word alignment for IC-PASTR extraction.
    Page 4, “Inside Context Integration”
  3. Thus to get a high recall for PASs, we only utilize word alignment instead of capturing the relation between bilingual elements.
    Page 4, “Inside Context Integration”
  4. t_range(PAS) refers to the target range covering all the words that are reachable from the PAS via word alignment.
    Page 5, “Maximum Entropy PAS Disambiguation (MEPD) Model”
  5. We run GIZA++ and then employ the grow-diag-final-and (gdfa) strategy to produce symmetric word alignments.
    Page 5, “Experiment”
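
The consistency requirement quoted in item 1 is the standard phrase-extraction constraint: no alignment link may connect a word inside the element's source span to a word outside its target span, and the block must contain at least one link. A small sketch (the index conventions and example alignment are assumptions, not taken from the paper):

```python
def consistent(alignment, src_span, tgt_span):
    """True if (src_span, tgt_span) is consistent with the word alignment.

    alignment: set of (i, j) source-to-target index pairs;
    spans are inclusive (start, end) word-index ranges.
    """
    s0, s1 = src_span
    t0, t1 = tgt_span
    for i, j in alignment:
        in_src = s0 <= i <= s1
        in_tgt = t0 <= j <= t1
        if in_src != in_tgt:  # a link crosses the block boundary
            return False
    # require at least one alignment link inside the block
    return any(s0 <= i <= s1 and t0 <= j <= t1 for i, j in alignment)

# Toy alignment: source word 1 aligns to target 2, source 2 to target 1.
align = {(0, 0), (1, 2), (2, 1)}
ok = consistent(align, (1, 2), (1, 2))      # the swap stays inside the block
bad = consistent(align, (0, 1), (0, 1))     # link (1, 2) escapes the block
```

This is the same test used in standard phrase-based rule extraction; here it would gate which (element, target span) pairs are allowed during IC-PASTR extraction.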


BLEU

Appears in 4 sentences as: BLEU (4)
  1. Specifically, after integrating the inside context information of PAS into transformation, we can see that system IC-PASTR significantly outperforms system PASTR by 0.71 BLEU points.
    Page 6, “Experiment”
  2. Moreover, after we import the MEPD model into system PASTR, we get a significant improvement over PASTR (by 0.54 BLEU points).
    Page 6, “Experiment”
  3. We can see that this system further achieves a remarkable improvement over system PASTR (0.95 BLEU points).
    Page 6, “Experiment”
  4. However, from Table 2, we find that system IC-PASTR+MEPD only outperforms system IC-PASTR slightly (0.24 BLEU points).
    Page 6, “Experiment”
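
The scores above are case-insensitive BLEU-4 with a brevity penalty computed against the shortest reference length, as the experiment section states. A compact, unsmoothed sketch of that metric (tokenized input assumed; real evaluations use a standard scorer):

```python
import math
from collections import Counter

def bleu4(hyp, refs):
    """Case-insensitive BLEU-4 with a shortest-reference brevity penalty."""
    hyp = [w.lower() for w in hyp]
    refs = [[w.lower() for w in r] for r in refs]
    log_prec = 0.0
    for n in range(1, 5):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        max_ref = Counter()  # clipped counts: max count over all references
        for r in refs:
            rc = Counter(tuple(r[i:i + n]) for i in range(len(r) - n + 1))
            for g, c in rc.items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        if clipped == 0:          # no smoothing: any zero precision gives 0
            return 0.0
        log_prec += 0.25 * math.log(clipped / total)
    shortest = min(len(r) for r in refs)   # "shortest length penalty"
    bp = 1.0 if len(hyp) >= shortest else math.exp(1 - shortest / len(hyp))
    return bp * math.exp(log_prec)
```

With this definition a hypothesis identical to a reference scores 1.0, and the 0.24 to 0.95 point gaps in the table correspond to differences of 0.0024 to 0.0095 on this 0-to-1 scale.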


BLEU points

Appears in 4 sentences as: BLEU points (4)
  1. Specifically, after integrating the inside context information of PAS into transformation, we can see that system IC-PASTR significantly outperforms system PASTR by 0.71 BLEU points.
    Page 6, “Experiment”
  2. Moreover, after we import the MEPD model into system PASTR, we get a significant improvement over PASTR (by 0.54 BLEU points).
    Page 6, “Experiment”
  3. We can see that this system further achieves a remarkable improvement over system PASTR (0.95 BLEU points).
    Page 6, “Experiment”
  4. However, from Table 2, we find that system IC-PASTR+MEPD only outperforms system IC-PASTR slightly (0.24 BLEU points).
    Page 6, “Experiment”


translation quality

Appears in 4 sentences as: translation quality (4)
  1. Experiments show that our approach helps to achieve significant improvements on translation quality.
    Page 1, “Abstract”
  2. This harms the translation quality.
    Page 3, “PAS-based Translation Framework”
  3. By incorporating the rich context information as features, they chose better rules for translation and yielded stable improvements on translation quality.
    Page 5, “Related Work”
  4. The translation quality is evaluated by case-insensitive BLEU-4 with shortest length penalty.
    Page 6, “Experiment”


translation system

Appears in 4 sentences as: translation system (4)
  1. Experiments show that the two PAS disambiguation methods significantly improve the baseline translation system.
    Page 2, “Introduction”
  2. For inside context integration, since the format of IC-PASTR is the same as PASTR, we can use the IC-PASTR to substitute PASTR for building a PAS-based translation system directly.
    Page 5, “Integrating into the PAS-based Translation Framework”
  3. In addition, since our method of rule extraction is different from (Zhai et al., 2012), we also use PASTR to construct a translation system as the baseline system, which we call “PASTR”.
    Page 5, “Integrating into the PAS-based Translation Framework”
  4. Therefore, based on this advantage, although the number of matching PASs decreases, IC-PASTR still improves the translation system using PASTR significantly.
    Page 7, “Experiment”


language model

Appears in 3 sentences as: language model (3)
  1. The target-side-like PAS is selected only according to the language model and translation probabilities, without considering any context information of PAS.
    Page 3, “PAS-based Translation Framework”
  2. The weights of the MEPD feature can be tuned by MERT (Och, 2003) together with other translation features, such as the language model.
    Page 5, “Integrating into the PAS-based Translation Framework”
  3. We train a 5-gram language model with the Xinhua portion of English Gigaword corpus and target part of the training data.
    Page 6, “Experiment”
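
Item 3 mentions training a 5-gram language model on the Gigaword and target-side data. A toy maximum-likelihood n-gram counter illustrating the underlying idea (no smoothing or backoff, unlike the real model; the example corpus is invented):

```python
from collections import Counter

def train_ngram_lm(sentences, n):
    """Collect MLE n-gram and context counts with <s>/</s> padding."""
    ngrams, contexts = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] * (n - 1) + sent + ["</s>"]
        for i in range(len(toks) - n + 1):
            gram = tuple(toks[i:i + n])
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    return ngrams, contexts

def prob(ngrams, contexts, gram):
    """MLE probability of the last word given its n-1 word context."""
    return ngrams[gram] / contexts[gram[:-1]] if contexts[gram[:-1]] else 0.0

# Toy bigram model over a two-sentence corpus.
ngrams, contexts = train_ngram_lm([["a", "b"], ["a", "c"]], n=2)
p_b_given_a = prob(ngrams, contexts, ("a", "b"))
```

A production 5-gram model would use the same counts with smoothing (e.g. modified Kneser-Ney) so unseen n-grams do not receive zero probability.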


significant improvement

Appears in 3 sentences as: significant improvement (1) significant improvements (1) significantly improve (1)
  1. Experiments show that our approach helps to achieve significant improvements on translation quality.
    Page 1, “Abstract”
  2. Experiments show that the two PAS disambiguation methods significantly improve the baseline translation system.
    Page 2, “Introduction”
  3. Moreover, after we import the MEPD model into system PASTR, we get a significant improvement over PASTR (by 0.54 BLEU points).
    Page 6, “Experiment”
