Improving Statistical Machine Translation with Monolingual Collocation
Liu, Zhanyi and Wang, Haifeng and Wu, Hua and Li, Sheng

Article Structure

Abstract

This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT).

Introduction

Statistical bilingual word alignment (Brown et al., 1993) is the base of most SMT systems.

Collocation Model

Collocation is generally defined as a group of words that occur together more often than by chance (McKeown and Radev, 2000).

Improving Statistical Bilingual Word Alignment

We use the collocation information to improve both one-directional and bidirectional bilingual word alignments.

Improving Phrase Table

A phrase-based SMT system automatically extracts bilingual phrase pairs from the word-aligned bilingual corpus.

Experiments on Word Alignment

5.1 Experimental settings

Experiments on Phrase-Based SMT

6.1 Experimental settings

Experiments on Parsing-Based SMT

We also investigate the effectiveness of the improved word alignments on the parsing-based SMT system, Joshua (Li et al., 2009).

Conclusion

We presented a novel method to use monolingual collocations to improve SMT.

Topics

word alignment

Appears in 46 sentences as: word aligned (4) Word Alignment (4) word alignment (23) word alignments (17) words aligned (2)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT.
    Page 1, “Abstract”
  2. The experimental results show that our method improves the performance of both word alignment and translation quality significantly.
    Page 1, “Abstract”
  3. Statistical bilingual word alignment (Brown et al., 1993) is the base of most SMT systems.
    Page 1, “Introduction”
  4. Although many methods were proposed to improve the quality of word alignments (Wu, 1997; Och and Ney, 2000; Marcu and Wong, 2002; Cherry and Lin, 2003; Liu et al., 2005; Huang, 2009), the correlation of the words in multi-word alignments is not fully considered.
    Page 1, “Introduction”
  5. In phrase-based SMT (Koehn et al., 2003), the phrase boundary is usually determined based on the bidirectional word alignments.
    Page 1, “Introduction”
  6. We first identify potentially collocated words and estimate collocation probabilities from monolingual corpora using a Monolingual Word Alignment (MWA) method (Liu et al., 2009), which does not need any additional resource or linguistic preprocessing, and which outperforms previous methods on the same experimental data.
    Page 1, “Introduction”
  7. Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve phrase table for phrase-based SMT.
    Page 1, “Introduction”
  8. This method adapts the bilingual word alignment algorithm to monolingual scenario to extract collocations only from monolingual corpora.
    Page 2, “Collocation Model”
  9. 2.1 Monolingual word alignment
    Page 2, “Collocation Model”
  10. Then the monolingual word alignment algorithm is employed to align the potentially collocated words in the monolingual sentences.
    Page 2, “Collocation Model”
  11. Following Liu et al. (2009), we employ the MWA Model 3 (corresponding to IBM Model 3) to calculate the probability of the monolingual word alignment sequence, as shown in Eq.
    Page 2, “Collocation Model”
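The MWA pipeline sketched in items 8 to 11 can be illustrated with a minimal sketch (function names and the counting scheme below are illustrative assumptions, not the paper's code; the actual method trains MWA Models 1 to 3, IBM-style, on the replicated corpus):

```python
from collections import Counter

def replicate_corpus(sentences):
    """MWA first step: replicate a monolingual corpus into a
    pseudo-parallel corpus of identical sentence pairs."""
    return [(s, s) for s in sentences]

def estimate_collocation_probs(aligned_pairs):
    """Estimate p(w2 | w1) from counts of aligned (collocated) word
    pairs; a stand-in for the probabilities the MWA models produce."""
    pair_counts = Counter(aligned_pairs)
    head_counts = Counter(w1 for w1, _ in aligned_pairs)
    return {(w1, w2): c / head_counts[w1]
            for (w1, w2), c in pair_counts.items()}

# Toy "alignments"; in practice these come from the trained MWA models.
aligned = [("strong", "tea"), ("strong", "tea"), ("strong", "wind")]
probs = estimate_collocation_probs(aligned)
```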

See all papers in Proc. ACL 2010 that mention word alignment.


phrase-based

Appears in 19 sentences as: Phrase-Based (1) Phrase-based (2) phrase-based (16)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT.
    Page 1, “Abstract”
  2. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
    Page 1, “Abstract”
  3. In phrase-based SMT (Koehn et al., 2003), the phrase boundary is usually determined based on the bidirectional word alignments.
    Page 1, “Introduction”
  4. Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve phrase table for phrase-based SMT.
    Page 1, “Introduction”
  5. Then the phrase collocation probabilities are used as additional features in phrase-based SMT systems.
    Page 1, “Introduction”
  6. The alignment improvement results in an improvement of 2.16 BLEU score on phrase-based SMT system and an improvement of 1.76 BLEU score on parsing-based SMT system.
    Page 1, “Introduction”
  7. If we use phrase collocation probabilities as additional features, the phrase-based SMT performance is further improved by 0.24 BLEU score.
    Page 1, “Introduction”
  8. A phrase-based SMT system automatically extracts bilingual phrase pairs from the word-aligned bilingual corpus.
    Page 4, “Improving Phrase Table”
  9. These collocation probabilities are incorporated into the phrase-based SMT system as features.
    Page 4, “Improving Phrase Table”
  10. Moses (Koehn et al., 2007) is used as the baseline phrase-based SMT system.
    Page 6, “Experiments on Phrase-Based SMT”
  11. 6.2 Effect of improved word alignment on phrase-based SMT
    Page 6, “Experiments on Phrase-Based SMT”
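Items 5 and 9 describe adding phrase collocation probabilities as extra features. In a phrase-based decoder such as Moses, features combine in a standard log-linear model; a minimal sketch (feature names, values, and weights below are hypothetical):

```python
import math

def loglinear_score(features, weights):
    """Log-linear model score: weighted sum of log feature values,
    as combined by phrase-based decoders such as Moses."""
    return sum(weights[name] * math.log(value)
               for name, value in features.items())

# Hypothetical feature values for one translation hypothesis;
# "colloc" is the added phrase collocation probability feature.
features = {"phrase_tm": 0.4, "lex_tm": 0.3, "lm": 0.2, "colloc": 0.6}
weights = {"phrase_tm": 1.0, "lex_tm": 0.5, "lm": 1.0, "colloc": 0.3}
score = loglinear_score(features, weights)
```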


SMT system

Appears in 19 sentences as: SMT system (13) SMT systems (9)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT.
    Page 1, “Abstract”
  2. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
    Page 1, “Abstract”
  3. Statistical bilingual word alignment (Brown et al., 1993) is the base of most SMT systems.
    Page 1, “Introduction”
  4. Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve phrase table for phrase-based SMT.
    Page 1, “Introduction”
  5. Then the phrase collocation probabilities are used as additional features in phrase-based SMT systems.
    Page 1, “Introduction”
  6. The alignment improvement results in an improvement of 2.16 BLEU score on phrase-based SMT system and an improvement of 1.76 BLEU score on parsing-based SMT system.
    Page 1, “Introduction”
  7. This method ignores the correlation of the words in the same alignment unit, so an alignment may include many unrelated words, which influences the performance of SMT systems.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  8. A phrase-based SMT system automatically extracts bilingual phrase pairs from the word-aligned bilingual corpus.
    Page 4, “Improving Phrase Table”
  9. These collocation probabilities are incorporated into the phrase-based SMT system as features.
    Page 4, “Improving Phrase Table”
  10. To train a Chinese-to-English SMT system, we need to perform both Chinese-to-English and
    Page 5, “Experiments on Word Alignment”
  11. We use the FBIS corpus to train the Chinese-to-English SMT systems.
    Page 6, “Experiments on Phrase-Based SMT”


BLEU

Appears in 13 sentences as: BLEU (16)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
    Page 1, “Abstract”
  2. The alignment improvement results in an improvement of 2.16 BLEU score on phrase-based SMT system and an improvement of 1.76 BLEU score on parsing-based SMT system.
    Page 1, “Introduction”
  3. SMT performance is further improved by 0.24 BLEU score.
    Page 2, “Introduction”
  4. Experiments BLEU (%): Baseline 29.62; Our methods: WA-1 with CM-1 30.85, CM-2 31.28, CM-3 31.48; WA-2 with CM-1 31.00, CM-2 31.33, CM-3 31.51; WA-3 with CM-1 31.43, CM-2 31.62, CM-3 31.78
    Page 6, “Experiments on Word Alignment”
  5. We use BLEU (Papineni et al., 2002) as evaluation metrics.
    Page 6, “Experiments on Phrase-Based SMT”
  6. Experiments BLEU (%): Moses 29.62; + Phrase collocation probability 30.47
    Page 7, “Experiments on Phrase-Based SMT”
  7. If the same alignment method is used, the systems using CM-3 achieved the highest BLEU scores.
    Page 7, “Experiments on Phrase-Based SMT”
  8. When the phrase collocation probabilities are incorporated into the SMT system, the translation quality is improved, achieving an absolute improvement of 0.85 BLEU score.
    Page 7, “Experiments on Phrase-Based SMT”
  9. As compared with the baseline system, an absolute improvement of 2.40 BLEU score is achieved.
    Page 7, “Experiments on Phrase-Based SMT”
  10. Experiments BLEU (%): Joshua 30.05; + Improved word alignments 31.81
    Page 8, “Experiments on Parsing-Based SMT”
  11. The system using the improved word alignments achieves an absolute improvement of 1.76 BLEU score, which indicates that the improvements of word alignments are also effective in improving the performance of parsing-based SMT systems.
    Page 8, “Experiments on Parsing-Based SMT”
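The BLEU metric (Papineni et al., 2002) cited in item 5 combines modified n-gram precisions with a brevity penalty. A minimal single-reference, sentence-level sketch (real evaluations aggregate counts at the corpus level and may use multiple references):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Minimal single-reference sentence BLEU: geometric mean of
    modified n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp = Counter(ngrams(hypothesis, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum((hyp & ref).values())            # clipped counts
        total = max(sum(hyp.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # floor avoids log(0)
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

hyp = "the cat sat on the mat".split()
assert abs(bleu(hyp, hyp) - 1.0) < 1e-9  # a perfect match scores 1.0
```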


BLEU score

Appears in 9 sentences as: BLEU score (11) BLEU scores (1)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
    Page 1, “Abstract”
  2. The alignment improvement results in an improvement of 2.16 BLEU score on phrase-based SMT system and an improvement of 1.76 BLEU score on parsing-based SMT system.
    Page 1, “Introduction”
  3. SMT performance is further improved by 0.24 BLEU score.
    Page 2, “Introduction”
  4. If the same alignment method is used, the systems using CM-3 achieved the highest BLEU scores.
    Page 7, “Experiments on Phrase-Based SMT”
  5. When the phrase collocation probabilities are incorporated into the SMT system, the translation quality is improved, achieving an absolute improvement of 0.85 BLEU score.
    Page 7, “Experiments on Phrase-Based SMT”
  6. As compared with the baseline system, an absolute improvement of 2.40 BLEU score is achieved.
    Page 7, “Experiments on Phrase-Based SMT”
  7. The system using the improved word alignments achieves an absolute improvement of 1.76 BLEU score, which indicates that the improvements of word alignments are also effective in improving the performance of parsing-based SMT systems.
    Page 8, “Experiments on Parsing-Based SMT”
  8. The improved word alignment results in an improvement of 2.16 BLEU score on a phrase-based SMT system and an improvement of 1.76 BLEU score on a parsing-based SMT system.
    Page 8, “Conclusion”
  9. When we also used phrase collocation probabilities as additional features, the phrase-based SMT performance was finally improved by 2.40 BLEU score as compared with the baseline system.
    Page 8, “Conclusion”


sentence pair

Appears in 8 sentences as: sentence pair (6) sentence pairs (2)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. The monolingual corpus is first replicated to generate a parallel corpus, where each sentence pair consists of two identical sentences in the same language.
    Page 2, “Collocation Model”
  2. According to the BWA method, given a bilingual sentence pair E = e_1^l and F = f_1^m, the optimal
    Page 3, “Improving Statistical Bilingual Word Alignment”
  3. Thus, the collocation probability of the alignment sequence of a sentence pair can be calculated according to Eq.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  4. model to calculate the word alignment probability of a sentence pair, as shown in Eq.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  5. Then we employ the hill-climbing algorithm (Al-Onaizan et al., 1999) to search for the optimal alignment sequence of a given sentence pair, where the score of an alignment sequence is calculated as in Eq.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  6. Eq. (8) only deals with many-to-one alignments, but the alignment sequence of a sentence pair also includes one-to-one alignments.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  7. To investigate the quality of the generated word alignments, we randomly selected a subset from the bilingual corpus as a test set, including 500 sentence pairs.
    Page 4, “Experiments on Word Alignment”
  8. (11), we also manually labeled a development set including 100 sentence pairs, in the same manner as the test set.
    Page 5, “Experiments on Word Alignment”
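Item 5 mentions hill-climbing search (Al-Onaizan et al., 1999) for the optimal alignment sequence. A generic first-improvement skeleton is sketched below; the neighbor moves and the scoring function are toy stand-ins for the paper's move operators and collocation-augmented alignment score:

```python
def hill_climb(start, neighbors, score):
    """First-improvement hill climbing: move to an improving neighbor
    until no neighbor scores higher than the current alignment."""
    current, best = start, score(start)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(current):
            s = score(cand)
            if s > best:
                current, best, improved = cand, s, True
                break
    return current

def toy_neighbors(a):
    """Hypothetical move operator: re-point one alignment link."""
    for i in range(len(a)):
        for j in range(len(a)):
            if a[i] != j:
                yield a[:i] + (j,) + a[i + 1:]

# Toy objective: reward links whose source and target indices match.
best = hill_climb((2, 0, 1), toy_neighbors,
                  lambda a: sum(1 for i, j in enumerate(a) if i == j))
```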


error rate

Appears in 7 sentences as: error rate (7)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. The evaluation results show that the proposed method in this paper significantly improves multi-word alignment, achieving an absolute error rate reduction of 29%.
    Page 1, “Introduction”
  2. For multi-word alignments, our methods significantly outperform the baseline method in terms of both precision and recall, achieving up to 18% absolute error rate reduction.
    Page 5, “Experiments on Word Alignment”
  3. CM-3, the error rate of multi-word alignment results is further reduced.
    Page 5, “Experiments on Word Alignment”
  4. We can see that WA-1 achieves a lower alignment error rate as compared to the baseline method, since the performance of the improved one-directional alignment method is better than that of GIZA++.
    Page 6, “Experiments on Word Alignment”
  5. Our method using both methods proposed in section 3 produces the best alignment performance, achieving 11% absolute error rate reduction.
    Page 6, “Experiments on Word Alignment”
  6. Koehn's implementation of minimum error rate training (Och, 2003) is used to tune the feature weights on the development set.
    Page 6, “Experiments on Phrase-Based SMT”
  7. The evaluation results showed that the proposed method significantly improved word alignment, achieving an absolute error rate reduction of 29% on multi-word alignment.
    Page 8, “Conclusion”


phrase table

Appears in 7 sentences as: phrase table (7)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT.
    Page 1, “Abstract”
  2. Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve phrase table for phrase-based SMT.
    Page 1, “Introduction”
  3. To improve the phrase table, we calculate phrase collocation probabilities based on word collocation probabilities.
    Page 1, “Introduction”
  4. In Sections 3 and 4, we show how to improve the BWA method and the phrase table using collocation models respectively.
    Page 2, “Introduction”
  5. We also investigate the performance of the system employing both the word alignment improvement and phrase table improvement methods.
    Page 7, “Experiments on Phrase-Based SMT”
  6. Then the collocation information was employed to improve BWA for various kinds of SMT systems and to improve phrase table for phrase-based SMT.
    Page 8, “Conclusion”
  7. To improve the phrase table, we calculate phrase collocation probabilities based on word collocation probabilities.
    Page 8, “Conclusion”
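Items 3 and 7 state that phrase collocation probabilities are computed from word collocation probabilities. One plausible aggregation is sketched below; this specific formula is an assumption for illustration, not the paper's definition:

```python
def phrase_collocation_prob(phrase, word_probs):
    """Hypothetical aggregation: average, over the words of a phrase,
    the strongest collocation probability each word has with another
    word in the same phrase. (The paper assigns single-word phrases a
    fixed value tuned on a development set; here we return 0.0.)"""
    if len(phrase) < 2:
        return 0.0
    scores = []
    for i, w in enumerate(phrase):
        others = phrase[:i] + phrase[i + 1:]
        scores.append(max(word_probs.get((w, o), 0.0) for o in others))
    return sum(scores) / len(scores)

# Toy word collocation probabilities (illustrative values).
word_probs = {("strong", "tea"): 0.6, ("tea", "strong"): 0.4}
p = phrase_collocation_prob(["strong", "tea"], word_probs)
```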


development set

Appears in 6 sentences as: development set (6)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. For a phrase including only one word, we set a fixed collocation probability that is the average of the collocation probabilities of the sentences on a development set.
    Page 4, “Improving Phrase Table”
  2. (11), we also manually labeled a development set including 100 sentence pairs, in the same manner as the test set.
    Page 5, “Experiments on Word Alignment”
  3. By minimizing the AER on the development set, the interpolation coefficients of the collocation probabilities on CM-1 and CM-2 were set to 0.1 and 0.9.
    Page 5, “Experiments on Word Alignment”
  4. We used the NIST MT-2002 set as the development set and the NIST MT-2004 test set as the test set.
    Page 6, “Experiments on Phrase-Based SMT”
  5. Koehn's implementation of minimum error rate training (Och, 2003) is used to tune the feature weights on the development set.
    Page 6, “Experiments on Phrase-Based SMT”
  6. The feature weights are tuned on the development set using the minimum error rate training method.
    Page 7, “Experiments on Parsing-Based SMT”
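Item 3 describes tuning the interpolation coefficients for CM-1 and CM-2 by minimizing AER on the development set. A grid-search sketch (the error function passed in is a toy stand-in for re-aligning the development set and computing AER):

```python
def interpolate(p1, p2, alpha):
    """Linear interpolation of two collocation models' probabilities."""
    return alpha * p1 + (1 - alpha) * p2

def tune_alpha(dev_error, steps=11):
    """Grid-search the interpolation coefficient that minimizes a
    development-set error function (AER in the paper)."""
    candidates = [i / (steps - 1) for i in range(steps)]
    return min(candidates, key=dev_error)

# Toy error surface whose minimum sits at alpha = 0.1, mirroring the
# CM-1 coefficient the paper reports.
alpha = tune_alpha(lambda a: (a - 0.1) ** 2)
```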


Machine Translation

Appears in 6 sentences as: Machine Translation (6)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT).
    Page 1, “Abstract”
  2. Statistical Machine Translation.
    Page 8, “Conclusion”
  3. The Mathematics of Statistical Machine Translation: Parameter Estimation.
    Page 8, “Conclusion”
  4. Statistical Significance Tests for Machine Translation Evaluation.
    Page 8, “Conclusion”
  5. Moses: Open Source Toolkit for Statistical Machine Translation.
    Page 8, “Conclusion”
  6. Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation.
    Page 8, “Conclusion”


baseline system

Appears in 5 sentences as: baseline system (4) baseline systems (1)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
    Page 1, “Abstract”
  2. From the results of Table 4, it can be seen that the systems using the improved bidirectional alignments achieve higher translation quality than the baseline system.
    Page 7, “Experiments on Phrase-Based SMT”
  3. Figure 3 shows an example: T1 is generated by the system where the phrase collocation probabilities are used and T2 is generated by the baseline system.
    Page 7, “Experiments on Phrase-Based SMT”
  4. As compared with the baseline system, an absolute improvement of 2.40 BLEU score is achieved.
    Page 7, “Experiments on Phrase-Based SMT”
  5. When we also used phrase collocation probabilities as additional features, the phrase-based SMT performance was finally improved by 2.40 BLEU score as compared with the baseline system.
    Page 8, “Conclusion”


Statistical Machine Translation

Appears in 4 sentences as: Statistical Machine Translation (4)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT).
    Page 1, “Abstract”
  2. Statistical Machine Translation.
    Page 8, “Conclusion”
  3. The Mathematics of Statistical Machine Translation: Parameter Estimation.
    Page 8, “Conclusion”
  4. Moses: Open Source Toolkit for Statistical Machine Translation.
    Page 8, “Conclusion”


feature weights

Appears in 3 sentences as: feature weights (3)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. feature weights, respectively.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  2. Koehn's implementation of minimum error rate training (Och, 2003) is used to tune the feature weights on the development set.
    Page 6, “Experiments on Phrase-Based SMT”
  3. The feature weights are tuned on the development set using the minimum error rate training method.
    Page 7, “Experiments on Parsing-Based SMT”


translation model

Appears in 3 sentences as: translation model (3)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. IBM Model 1 only employs the word translation model to calculate the probabilities of alignments.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  2. In IBM Model 2, both the word translation model and position distribution model are used.
    Page 3, “Improving Statistical Bilingual Word Alignment”
  3. IBM Models 3, 4 and 5 consider the fertility model in addition to the word translation model and position distribution model.
    Page 3, “Improving Statistical Bilingual Word Alignment”
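Item 1 notes that IBM Model 1 uses only the word translation model. Its likelihood (up to the constant epsilon) is a product over target words of translation probabilities summed over the source words plus NULL, sketched here with a hypothetical translation table:

```python
def ibm1_prob(src, tgt, t):
    """IBM Model 1 likelihood (up to the constant epsilon): every
    target word is generated by summing the word translation table t
    over all source words plus NULL, with uniform alignment. The
    position and fertility models of Models 2-5 are ignored."""
    src = ["NULL"] + src
    prob = 1.0
    for f in tgt:
        prob *= sum(t.get((e, f), 0.0) for e in src) / len(src)
    return prob

# Toy translation table with hypothetical probabilities.
t = {("house", "maison"): 0.8, ("the", "la"): 0.7}
p = ibm1_prob(["the", "house"], ["la", "maison"], t)
```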


translation quality

Appears in 3 sentences as: translation quality (3)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. The experimental results show that our method improves the performance of both word alignment and translation quality significantly.
    Page 1, “Abstract”
  2. Here, we investigate three different collocation models for translation quality improvement.
    Page 7, “Experiments on Phrase-Based SMT”
  3. When the phrase collocation probabilities are incorporated into the SMT system, the translation quality is improved, achieving an absolute improvement of 0.85 BLEU score.
    Page 7, “Experiments on Phrase-Based SMT”


word pair

Appears in 3 sentences as: word pair (2) word pairs (1)
In Improving Statistical Machine Translation with Monolingual Collocation
  1. Figure 1 shows an example of the potentially collocated word pairs aligned by the MWA method.
    Page 2, “Collocation Model”
  2. Then the probability for each aligned word pair is estimated as follows:
    Page 2, “Collocation Model”
  3. word pair calculated according to Eq.
    Page 4, “Improving Phrase Table”
