Joint Learning of a Dual SMT System for Paraphrase Generation
Sun, Hong and Zhou, Ming

Article Structure

Abstract

SMT has been used in paraphrase generation by translating a source sentence into another (pivot) language and then back into the source.

Introduction

Paraphrasing (at word, phrase, and sentence levels) is a procedure for generating alternative expressions with an identical or similar meaning to the original text.

Paraphrasing with a Dual SMT System

We focus on sentence level paraphrasing and leverage homogeneous machine translation systems for this task bi-directionally.

Experiments and Results

3.1 Experiment Setup

Discussion

4.1 SMT Systems and Pivot Languages

Conclusion

We propose a joint learning method for pivot language-based paraphrase generation.

Topics

SMT systems

Appears in 18 sentences as: SMT System (1) SMT system (7) SMT Systems (1) SMT systems (12)
In Joint Learning of a Dual SMT System for Paraphrase Generation
  1. Existing work that uses two independently trained SMT systems cannot directly optimize the paraphrase results.
    Page 1, “Abstract”
  2. In this paper, we propose a joint learning method of two SMT systems to optimize the process of paraphrase generation.
    Page 1, “Abstract”
  3. In addition, a revised BLEU score (called iBLEU) which measures the adequacy and diversity of the generated paraphrase sentence is proposed for tuning parameters in SMT systems.
    Page 1, “Abstract”
  4. Thus researchers leverage bilingual parallel data for this task and apply two SMT systems (dual SMT system) to translate the original sentences into another pivot language and then translate them back into the original language.
    Page 1, “Introduction”
  5. Context features are added into the SMT system to improve translation correctness against polysemy.
    Page 1, “Introduction”
  6. Previous work employs two separately trained SMT systems whose parameters are tuned for the SMT scheme and therefore cannot directly optimize for the paraphrase purposes, for example, optimizing the diversity against the input.
    Page 2, “Introduction”
  7. To address these issues, in this paper, we propose a joint learning method of two SMT systems for paraphrase generation.
    Page 2, “Introduction”
  8. The jointly-learned dual SMT system: (1) Adapts the SMT systems so that they are tuned specifically for paraphrase generation purposes, e.g., to increase the dissimilarity; (2) Employs a revised BLEU score (named iBLEU, as it’s an input-aware BLEU metric) that measures adequacy and dissimilarity of the paraphrase results at the same time.
    Page 2, “Introduction”
  9. Generating sentential paraphrase with the SMT system is done by first translating a source sentence into another pivot language, and then back into the source.
    Page 2, “Paraphrasing with a Dual SMT System”
  10. Here, we call these two procedures a dual SMT system. (A pipeline sketch follows this list.)
    Page 2, “Paraphrasing with a Dual SMT System”
  11. 2.1 Joint Inference of Dual SMT System
    Page 2, “Paraphrasing with a Dual SMT System”
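
The pivot round-trip described in sentences 9 and 10 above can be sketched as a small pipeline. The snippet below is only an illustration under stated assumptions: translate_to_pivot and translate_from_pivot are hypothetical stand-ins for the forward and backward SMT decoders (returning n-best hypotheses with model scores), and the additive score combination is a simplification of the paper's joint inference.

```python
# Minimal sketch of the pivot round-trip behind the "dual SMT system":
# source -> pivot language -> back to the source language.
# The two translate_* functions are hypothetical placeholders, not the paper's code.
from typing import Dict, List, Tuple


def translate_to_pivot(sentence: str, nbest: int = 10) -> List[Tuple[str, float]]:
    """Hypothetical forward decoder: (pivot-language hypothesis, model score) pairs."""
    raise NotImplementedError("plug in an SMT decoder here")


def translate_from_pivot(sentence: str, nbest: int = 10) -> List[Tuple[str, float]]:
    """Hypothetical backward decoder: (source-language hypothesis, model score) pairs."""
    raise NotImplementedError("plug in an SMT decoder here")


def paraphrase_candidates(source: str, nbest: int = 10) -> List[Tuple[str, float]]:
    """Enumerate round-trip candidates and rank each by the combined forward +
    backward decoder scores (log-linear models, so log scores add)."""
    best: Dict[str, float] = {}
    for pivot, fwd_score in translate_to_pivot(source, nbest):
        for candidate, bwd_score in translate_from_pivot(pivot, nbest):
            score = fwd_score + bwd_score
            if candidate not in best or score > best[candidate]:
                best[candidate] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)
```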

BLEU

Appears in 13 sentences as: BLEU (17)
In Joint Learning of a Dual SMT System for Paraphrase Generation
  1. In addition, a revised BLEU score (called iBLEU) which measures the adequacy and diversity of the generated paraphrase sentence is proposed for tuning parameters in SMT systems.
    Page 1, “Abstract”
  2. The jointly-learned dual SMT system: (1) Adapts the SMT systems so that they are tuned specifically for paraphrase generation purposes, e.g., to increase the dissimilarity; (2) Employs a revised BLEU score (named iBLEU, as it’s an input-aware BLEU metric) that measures adequacy and dissimilarity of the paraphrase results at the same time.
    Page 2, “Introduction”
  3. Two issues are also raised in (Zhao and Wang, 2010) about using automatic metrics: a paraphrase that changes less gets a larger BLEU score, and the evaluations of paraphrase quality and paraphrase rate tend to be incompatible.
    Page 3, “Paraphrasing with a Dual SMT System”
  4. iBLEU(s, r_s, c) = α BLEU(c, r_s) − (1 − α) BLEU(c, s)    (3)  (see the code sketch after this list)
    Page 3, “Paraphrasing with a Dual SMT System”
  5. BLEU(c, r_s) captures the semantic equivalency between the candidates and the references (Finch et al.
    Page 3, “Paraphrasing with a Dual SMT System”
  6. (2005) have shown the capability for measuring semantic equivalency using BLEU score); BLEU(c, s) is the BLEU score computed between the candidate and the source sentence to measure the dissimilarity.
    Page 3, “Paraphrasing with a Dual SMT System”
  7. Table 1 fragment (NIST 2008): without joint learning, BLEU 27.16 and self-BLEU 35.42 (no iBLEU reported); with joint learning at α = 1, BLEU 30.75, self-BLEU 53.51, iBLEU 30.75.
    Page 3, “Experiments and Results”
  8. We show the BLEU score (computed against references) to measure the adequacy and self-BLEU (computed against the source sentence) to evaluate the dissimilarity (lower is better).
    Page 3, “Experiments and Results”
  9. From the results we can see that, when the value of α decreases to place more penalty on self-paraphrase, the self-BLEU score rapidly decays, while as a consequence the BLEU score computed against references also drops considerably.
    Page 3, “Experiments and Results”
  10. This is not achievable with no joint learning, or with the traditional BLEU score, which does not take self-paraphrase into consideration.
    Page 4, “Experiments and Results”
  11. From the results we can see that human evaluations are quite consistent with the automatic evaluation: higher BLEU scores correspond to a larger number of good adequacy and fluency labels, and higher self-BLEU results tend to get lower human evaluations on dissimilarity.
    Page 4, “Experiments and Results”
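
The iBLEU definition in sentence 4 above can be computed on top of any BLEU implementation. The following is a minimal sketch, assuming the sacrebleu package and an illustrative α = 0.8; it is not the authors' implementation, and the exact scores it prints depend on the BLEU variant used.

```python
# Sketch of the iBLEU metric in Eq. (3): iBLEU(s, r_s, c) = α BLEU(c, r_s) − (1 − α) BLEU(c, s).
# sacrebleu and α = 0.8 are assumptions made here for illustration only.
from typing import List

from sacrebleu import sentence_bleu


def ibleu(source: str, references: List[str], candidate: str, alpha: float = 0.8) -> float:
    """Adequacy term rewards overlap with the references; the dissimilarity term
    penalises overlap with the source sentence (self-BLEU)."""
    adequacy = sentence_bleu(candidate, references).score   # BLEU(c, r_s), 0-100 scale
    self_bleu = sentence_bleu(candidate, [source]).score    # BLEU(c, s),   0-100 scale
    return alpha * adequacy - (1.0 - alpha) * self_bleu


# A candidate that copies the source verbatim gets a high self-BLEU and is
# penalised relative to a genuinely rephrased candidate of similar adequacy.
src = "the committee postponed the vote until next week"
refs = ["the panel delayed the vote until next week"]
print(ibleu(src, refs, "the panel put off the vote until next week"))
print(ibleu(src, refs, src))  # verbatim copy: the (1 − α) self-BLEU penalty applies
```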

BLEU score

Appears in 10 sentences as: BLEU score (10) BLEU scores (1)
In Joint Learning of a Dual SMT System for Paraphrase Generation
  1. In addition, a revised BLEU score (called iBLEU) which measures the adequacy and diversity of the generated paraphrase sentence is proposed for tuning parameters in SMT systems.
    Page 1, “Abstract”
  2. The jointly-learned dual SMT system: (1) Adapts the SMT systems so that they are tuned specifically for paraphrase generation purposes, e.g., to increase the dissimilarity; (2) Employs a revised BLEU score (named iBLEU, as it’s an input-aware BLEU metric) that measures adequacy and dissimilarity of the paraphrase results at the same time.
    Page 2, “Introduction”
  3. Two issues are also raised in (Zhao and Wang, 2010) about using automatic metrics: a paraphrase that changes less gets a larger BLEU score, and the evaluations of paraphrase quality and paraphrase rate tend to be incompatible.
    Page 3, “Paraphrasing with a Dual SMT System”
  4. (2005) have shown the capability for measuring semantic equivalency using BLEU score); BLEU(c, s) is the BLEU score computed between the candidate and the source sentence to measure the dissimilarity.
    Page 3, “Paraphrasing with a Dual SMT System”
  5. We show the BLEU score (computed against references) to measure the adequacy and self-BLEU (computed against the source sentence) to evaluate the dissimilarity (lower is better).
    Page 3, “Experiments and Results”
  6. From the results we can see that, when the value of α decreases to place more penalty on self-paraphrase, the self-BLEU score rapidly decays, while as a consequence the BLEU score computed against references also drops considerably.
    Page 3, “Experiments and Results”
  7. This is not achievable with no joint learning, or with the traditional BLEU score, which does not take self-paraphrase into consideration.
    Page 4, “Experiments and Results”
  8. From the results we can see that human evaluations are quite consistent with the automatic evaluation: higher BLEU scores correspond to a larger number of good adequacy and fluency labels, and higher self-BLEU results tend to get lower human evaluations on dissimilarity.
    Page 4, “Experiments and Results”
  9. The first part of iBLEU, which is the traditional BLEU score, helps to ensure the quality of the machine translation results.
    Page 4, “Discussion”
  10. Furthermore, a revised BLEU score that balances between paraphrase adequacy and dissimilarity is proposed in our training process.
    Page 5, “Conclusion”

machine translation

Appears in 8 sentences as: Machine Translation (1) machine translation (7)
In Joint Learning of a Dual SMT System for Paraphrase Generation
  1. Paraphrasing technology has been applied in many NLP applications, such as machine translation (MT), question answering (QA), and natural language generation (NLG).
    Page 1, “Introduction”
  2. As paraphrasing can be viewed as a translation process between the original expression (as input) and the paraphrase results (as output), both in the same language, statistical machine translation (SMT) has been used for this task.
    Page 1, “Introduction”
  3. the noise introduced by machine translation, Zhao et al.
    Page 2, “Introduction”
  4. (2010) propose combining the results of multiple machine translation engines by performing MBR (Minimum Bayes Risk) (Kumar and Byrne, 2004) decoding on the N-best translation candidates. (An MBR selection sketch follows this list.)
    Page 2, “Introduction”
  5. We focus on sentence level paraphrasing and leverage homogeneous machine translation systems for this task bi-directionally.
    Page 2, “Paraphrasing with a Dual SMT System”
  6. We use 2003 NIST Open Machine Translation Evaluation data (NIST 2003) as development data (containing 919 sentences) for MERT and test the performance on NIST 2008 data set (containing 1357 sentences).
    Page 3, “Experiments and Results”
  7. As the method highly depends on machine translation, a natural question arises as to what the impact is of using different pivots or SMT systems.
    Page 4, “Discussion”
  8. The first part of iBLEU, which is the traditional BLEU score, helps to ensure the quality of the machine translation results.
    Page 4, “Discussion”
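
Sentence 4 above mentions MBR decoding over N-best translation candidates. The snippet below is a rough sketch of that selection step, assuming sentence-level BLEU from sacrebleu as the gain function and uniform candidate weights; the cited work's actual setup may differ.

```python
# Sketch of MBR (Minimum Bayes Risk) selection over an N-best candidate list:
# pick the candidate whose average loss (here 1 − sentence BLEU) against all
# other candidates is smallest. sacrebleu and uniform weighting are assumptions.
from typing import List

from sacrebleu import sentence_bleu


def mbr_select(candidates: List[str]) -> str:
    def expected_loss(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        # average (1 − BLEU) of candidate i against every other candidate
        return sum(1.0 - sentence_bleu(candidates[i], [o]).score / 100.0
                   for o in others) / len(others)

    return candidates[min(range(len(candidates)), key=expected_loss)]
```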

NIST

Appears in 5 sentences as: NIST (7)
In Joint Learning of a Dual SMT System for Paraphrase Generation
  1. Our experiments on NIST 2008 testing data with automatic evaluation as well as human judgments suggest that the proposed method is able to enhance the paraphrase quality by adjusting between semantic equivalency and surface dissimilarity.
    Page 1, “Abstract”
  2. We test our method on NIST 2008 testing data.
    Page 2, “Introduction”
  3. We use 2003 NIST Open Machine Translation Evaluation data (NIST 2003) as development data (containing 919 sentences) for MERT and test the performance on NIST 2008 data set (containing 1357 sentences).
    Page 3, “Experiments and Results”
  4. NIST Chinese-to-English evaluation data offers four English human translations for every Chinese sentence.
    Page 3, “Experiments and Results”
  5. Table 1: iBLEU Score Results (NIST 2008)
    Page 3, “Experiments and Results”

Evaluation Metrics

Appears in 3 sentences as: evaluation metric (1) Evaluation Metrics (1) evaluation metrics (1)
In Joint Learning of a Dual SMT System for Paraphrase Generation
  1. MERT integrates the automatic evaluation metrics into the training process to achieve optimal end-to-end performance.
    Page 2, “Paraphrasing with a Dual SMT System”
  2. (Equation 2), where G is the automatic evaluation metric for paraphrasing. (A tuning sketch follows this list.)
    Page 2, “Paraphrasing with a Dual SMT System”
  3. 2.2 Paraphrase Evaluation Metrics
    Page 2, “Paraphrasing with a Dual SMT System”
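
To make the role of the tuning objective G concrete, the sketch below shows one simplified way to tune the dual system's feature weights against a paraphrase metric such as the iBLEU sketch given earlier. Random search stands in for MERT's per-feature line search, and decode_dev is a hypothetical placeholder for decoding the development set with the dual SMT system.

```python
# Toy sketch of using a paraphrase metric as the tuning objective G.
# Random search replaces MERT's exact line search only to keep the sketch short;
# decode_dev and the metric callable are hypothetical placeholders.
import random
from typing import Callable, List, Tuple

Metric = Callable[[str, List[str], str], float]  # metric(source, references, candidate)


def decode_dev(weights: List[float]) -> List[Tuple[str, List[str], str]]:
    """Hypothetical: decode the dev set with the dual SMT system under `weights`
    and return (source, references, best paraphrase candidate) triples."""
    raise NotImplementedError("plug in the dual SMT decoder here")


def tune_weights(num_features: int, metric: Metric, iters: int = 100) -> List[float]:
    best_weights, best_score = [0.0] * num_features, float("-inf")
    for _ in range(iters):
        weights = [random.uniform(-1.0, 1.0) for _ in range(num_features)]
        dev_output = decode_dev(weights)
        # G = average metric value over the development set under these weights
        score = sum(metric(s, refs, c) for s, refs, c in dev_output) / len(dev_output)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights
```

Real MERT would reuse cached n-best lists and optimize each feature weight exactly, but the shape of the loop is the same: decode, score the output with G, and keep the best-performing weights.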
