Name-aware Machine Translation
Li, Haibo and Zheng, Jing and Ji, Heng and Li, Qi and Wang, Wen

Article Structure

Abstract

We propose a Name-aware Machine Translation (MT) approach that tightly integrates name processing into the MT model by jointly annotating parallel corpora, extracting name-aware translation grammar and rules, adding a name phrase table, and performing name-translation-driven decoding.

Introduction

A shrinking fraction of the world’s Web pages are written in English, therefore the ability to access pages across a range of languages is becoming increasingly important.

Baseline MT

As our baseline, we apply a high-performing Chinese-English MT system (Zheng, 2008; Zheng et al., 2009) based on hierarchical phrase-based translation framework (Chiang, 2005).

Name-aware MT

We tightly integrate name processing into the above baseline to construct a NAMT model.

Name-aware MT Evaluation

Traditional MT evaluation metrics such as BLEU (Papineni et al., 2002) and Translation Edit Rate (TER) (Snover et al., 2006) assign the same weights to all tokens equally.

Experiments

In this section we present the experimental results of NAMT compared to the baseline MT.

Related Work

Two types of humble strategies were previously attempted to build name translation components which operate in tandem and loosely integrate into conventional statistical MT systems:

Conclusions and Future Work

We developed a name-aware MT framework which tightly integrates name tagging and name translation into training and decoding of MT.

Topics

BLEU

Appears in 19 sentences as: BLEU (20)
In Name-aware Machine Translation
  1. The current dominant automatic MT scoring metrics (such as Bilingual Evaluation Understudy (BLEU) (Papineni et al., 2002)) treat all words equally, but names have relatively low frequency in text (about 6% in newswire and only 3% in web documents) and thus are vastly outnumbered by function words and common nouns, etc.
    Page 1, “Introduction”
  2. The scaling factors for all features are optimized by minimum error rate training algorithm to maximize BLEU score (Och, 2003).
    Page 2, “Baseline MT”
  3. Traditional MT evaluation metrics such as BLEU (Papineni et al., 2002) and Translation Edit Rate (TER) (Snover et al., 2006) assign the same weights to all tokens equally.
    Page 4, “Name-aware MT Evaluation”
  4. In order to properly evaluate the translation quality of NAMT methods, we propose to modify the BLEU metric so that they can dynamically assign more weights to names during evaluation.
    Page 4, “Name-aware MT Evaluation”
  5. BLEU considers the correspondence between a system translation and a human translation:
    Page 4, “Name-aware MT Evaluation”
  6. BLEU = BP · exp( Σ_{n=1}^{N} w_n log p_n )    (1)
    Page 4, “Name-aware MT Evaluation”
  7. As in BLEU metric, we first count the maximum number of times an n-gram occurs in any single reference translation.
    Page 4, “Name-aware MT Evaluation”
  8. Based on BLEU score, we design a name-aware BLEU metric as follows.
    Page 4, “Name-aware MT Evaluation”
  9. Finally the name-aware BLEU score is defined as:
    Page 5, “Name-aware MT Evaluation”
  10. We can see that except for the BOLT3 data set with BLEU metric, our NAMT approach consistently outperformed the baseline system for all data sets with all metrics, and provided up to 23.6% relative error reduction on name translation.
    Page 6, “Experiments”
  11. According to Wilcoxon Matched-Pairs Signed-Ranks Test, the improvement is not significant with BLEU metric, but is significant at 98% confidence level with all of the other metrics.
    Page 6, “Experiments”
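
As an illustration of the BLEU computation quoted in entries 5–8 (clipped n-gram counting, a geometric mean of modified precisions, and a brevity penalty), here is a minimal Python sketch. It is a re-implementation for exposition, not the authors' code; the function names and smoothing choices are my own.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU = BP * exp(sum_n w_n log p_n), with w_n = 1/N."""
    log_p_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each n-gram count by its maximum count in any single reference.
        max_ref = Counter()
        for ref in references:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        p_n = clipped / total
        if p_n == 0.0:  # no smoothing in this sketch
            return 0.0
        log_p_sum += (1.0 / max_n) * math.log(p_n)
    # Brevity penalty (shortest reference; standard BLEU uses the closest length).
    c, r = len(candidate), min(len(ref) for ref in references)
    bp = 1.0 if c > r else math.exp(1.0 - r / c)
    return bp * math.exp(log_p_sum)
```

For a candidate identical to its single reference this returns 1.0; the name-aware variant in entries 7–9 changes only how the clipped counts are accumulated.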

word alignment

Appears in 17 sentences as: Word Alignment (2) word alignment (15)
In Name-aware Machine Translation
  1. Experiments on Chinese-English translation demonstrated the effectiveness of our approach on enhancing the quality of overall translation, name translation and word alignment over a high-quality MT baselinel.
    Page 1, “Abstract”
  2. names in parallel corpora, updating word segmentation, word alignment and grammar extraction (Section 3.1).
    Page 2, “Introduction”
  3. We pair two entities from two languages, if they have the same entity type and are mapped together by word alignment .
    Page 3, “Name-aware MT”
  4. First, we replace tagged name pairs with their entity types, and then use Giza++ and symmetrization heuristics to regenerate word alignment .
    Page 3, “Name-aware MT”
  5. Since the name tags appear very frequently, the existence of such tags yields improvement in word alignment quality.
    Page 3, “Name-aware MT”
  6. It is necessary to incorporate word alignment as additional constraints because the order of names is often changed after translation.
    Page 4, “Name-aware MT”
  7. Therefore, it is important to use name-replaced corpora for rule extraction to fully take advantage of improved word alignment .
    Page 6, “Experiments”
  8. 5.4 Word Alignment
    Page 7, “Experiments”
  9. It is also important to investigate the impact of our NAMT approach on improving word alignment .
    Page 7, “Experiments”
  10. We conducted the experiment on the Chinese-English Parallel Treebank (Li et al., 2010) with ground-truth word alignment .
    Page 7, “Experiments”
  11. Table 3: Impact of Joint Bilingual Name Tagging on Word Alignment (%).
    Page 7, “Experiments”
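
Entry 4 above describes replacing tagged name pairs with their entity types before regenerating word alignment with Giza++. A toy sketch of that replacement step; the `<PER>`-style placeholder format and the span representation are assumptions for illustration.

```python
def replace_names_with_types(tokens, name_spans):
    """Replace each tagged name span with its entity-type placeholder
    (e.g. <PER>, <ORG>, <GPE>), so that frequent type tokens can anchor
    realignment. name_spans: list of (start, end, type), end exclusive,
    assumed non-overlapping."""
    out, i = [], 0
    for start, end, etype in sorted(name_spans):
        out.extend(tokens[i:start])
        out.append(f"<{etype}>")
        i = end
    out.extend(tokens[i:])
    return out
```

Running Giza++ itself is outside this sketch; the point, per entry 5, is that the frequently occurring type tokens give the aligner reliable anchors.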

LM

Appears in 12 sentences as: LM (12)
In Name-aware Machine Translation
  1. Optimize name translation and context translation simultaneously and conduct name translation driven decoding with language model ( LM ) based selection (Section 3.2).
    Page 2, “Introduction”
  2. The LM used for decoding is a log-linear combination of four word n-gram LMs which are built on different English
    Page 2, “Baseline MT”
  3. corpora (details described in section 5.1), with the LM weights optimized on a development set and determined by minimum error rate training (MERT), to estimate the probability of a word given the preceding words.
    Page 2, “Baseline MT”
  4. LM1 is a 7-gram LM trained on the target side of Chinese-English and Egyptian Arabic-English parallel text, English monolingual discussion forums data R1-R4 released in BOLT Phase 1 (LDC2012E04, LDC2012E16, LDC2012E21, LDC2012E54), and English Gigaword Fifth Edition (LDC2011T07).
    Page 5, “Experiments”
  5. LM2 is a 7-gram LM trained only on the English monolingual discussion forums data listed above.
    Page 5, “Experiments”
  6. LM3 is a 4-gram LM trained on the web genre among the target side of all parallel text (i.e., web text from pre-BOLT parallel text and BOLT released discussion forum parallel text).
    Page 5, “Experiments”
  7. LM4 is a 4-gram LM trained on the English broadcast news and conversation transcripts released under the DARPA GALE program.
    Page 5, “Experiments”
  8. enable the LM to decide which translations to choose when encountering the names in the texts (Ji et al., 2009).
    Page 8, “Related Work”
  9. The LM selection method often assigns an inappropriate weight to the additional name translation table because it is constructed independently from translation of context words; therefore after weighted voting most correct name translations are not used in the final translation output.
    Page 8, “Related Work”
  10. More importantly, in these approaches the MT model was still mostly treated as a “black-box” because neither the translation model nor the LM was updated or adapted specifically for names.
    Page 8, “Related Work”
  11. Most of the previous name translation work combined supervised transliteration approaches with LM based re-scoring (Knight and Graehl, 1998; Al-Onaizan and Knight, 2002; Huang et al., 2004).
    Page 9, “Related Work”
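
Entries 2–3 describe decoding with a log-linear combination of four word n-gram LMs whose weights are tuned by MERT. At its core, a log-linear combination scores a word by a weighted sum of log probabilities from the component models; below is a minimal sketch (backoff, normalization, and the actual n-gram models are omitted, and the function name is my own).

```python
import math

def loglinear_lm_score(word_probs, weights):
    """Log-linear combination for one word:
    score = sum_i lambda_i * log P_i(word | history).
    word_probs: each component LM's probability for the word;
    weights: the tuned lambdas (e.g. from MERT)."""
    assert len(word_probs) == len(weights)
    return sum(w * math.log(p) for w, p in zip(weights, word_probs))
```

With four component LMs, as in the baseline described here, this would be evaluated once per word with four probabilities and the four tuned weights.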

MT system

Appears in 8 sentences as: MT system (5) MT systems (2) MT systems: (1)
In Name-aware Machine Translation
  1. A typical statistical MT system can only translate 60% of person names correctly (Ji et al., 2009).
    Page 1, “Introduction”
  2. As our baseline, we apply a high-performing Chinese-English MT system (Zheng, 2008; Zheng et al., 2009) based on hierarchical phrase-based translation framework (Chiang, 2005).
    Page 2, “Baseline MT”
  3. For example, the baseline MT system mistakenly translated a person name “3% 21%?
    Page 6, “Experiments”
  4. For example, the following sentence, glossed as “Gao Meimei’s strength really is formidable, I really admire her”, was mistakenly translated into “Gao the strength of the America and the America also really strong , ah , really admire her” by the baseline MT system, because the person name “高美美 (Gao Meimei)” was mistakenly segmented into three words “高 (Gao)”, “美 (the America)” and “美 (the America)”.
    Page 6, “Experiments”
  5. Furthermore, we calculated three Pearson product-moment correlation coefficients between human judgment scores and name-aware BLEU scores of these two MT systems .
    Page 7, “Experiments”
  6. Two types of humble strategies were previously attempted to build name translation components which operate in tandem and loosely integrate into conventional statistical MT systems:
    Page 8, “Related Work”
  7. Preprocessing: identify names in the source texts and propose name translations to the MT system ; the name translation results can be simply but aggressively transferred from the source to the target side using word alignment, or added into phrase table in order to
    Page 8, “Related Work”
  8. Some statistical MT systems (e.g.
    Page 8, “Related Work”

cross-lingual

Appears in 7 sentences as: Cross-lingual (1) cross-lingual (6)
In Name-aware Machine Translation
  1. This need can be addressed in part by cross-lingual information access tasks such as entity linking (McNamee et al., 2011; Cassidy et al., 2012), event extraction (Hakkani-Tur et al., 2007), slot filling (Snover et al., 2011) and question answering (Parton et al., 2009; Parton and McKeown, 2010).
    Page 1, “Introduction”
  2. A key bottleneck of high-quality cross-lingual information access lies in the performance of Machine Translation (MT).
    Page 1, “Introduction”
  3. Traditional name tagging approaches for single languages cannot address this requirement because they were all built on data and resources which are specific to each language without using any cross-lingual features.
    Page 2, “Name-aware MT”
  4. We developed a bilingual joint name tagger (Li et al., 2012) based on conditional random fields that incorporates both monolingual and cross-lingual features and conducts joint inference, so that name tagging from two languages can mutually enhance each other and therefore inconsistent results can be corrected simultaneously.
    Page 2, “Name-aware MT”
  5. However, for cross-lingual information processing applications, we should acknowledge that certain informationally critical words are more important than other common words.
    Page 4, “Name-aware MT Evaluation”
  6. Cross-lingual information transfer
    Page 8, “Experiments”
  7. Postprocessing: in a cross-lingual information retrieval or question answering framework, online query names can be utilized to obtain translation and post-edit MT output (Parton et al., 2009; Ma and McKeown, 2009; Parton and McKeown, 2010; Parton et al., 2012).
    Page 8, “Related Work”

Chinese-English

Appears in 6 sentences as: Chinese-English (6)
In Name-aware Machine Translation
  1. Experiments on Chinese-English translation demonstrated the effectiveness of our approach on enhancing the quality of overall translation, name translation and word alignment over a high-quality MT baselinel.
    Page 1, “Abstract”
  2. As our baseline, we apply a high-performing Chinese-English MT system (Zheng, 2008; Zheng et al., 2009) based on hierarchical phrase-based translation framework (Chiang, 2005).
    Page 2, “Baseline MT”
  3. We used a large Chinese-English MT training corpus from various sources and genres (including newswire, web text, broadcast news and broadcast conversations) for our experiments.
    Page 5, “Experiments”
  4. LM1 is a 7-gram LM trained on the target side of Chinese-English and Egyptian Arabic-English parallel text, English monolingual discussion forums data R1-R4 released in BOLT Phase 1 (LDC2012E04, LDC2012E16, LDC2012E21, LDC2012E54), and English Gigaword Fifth Edition (LDC2011T07).
    Page 5, “Experiments”
  5. We conducted the experiment on the Chinese-English Parallel Treebank (Li et al., 2010) with ground-truth word alignment.
    Page 7, “Experiments”
  6. Experiments on Chinese-English translation demonstrated the effectiveness of our approach over a high-quality MT baseline in both overall translation and name translation, especially for formal genres.
    Page 9, “Conclusions and Future Work”

n-gram

Appears in 6 sentences as: n-gram (7)
In Name-aware Machine Translation
  1. The LM used for decoding is a log-linear combination of four word n-gram LMs which are built on different English
    Page 2, “Baseline MT”
  2. where w_n is a set of positive weights summing to one, usually uniformly set as w_n = 1/N; c is the length of the system translation and r is the length of the reference translation; and p_n is the modified n-gram precision defined as: p_n = ( Σ_C Σ_{n-gram ∈ C} Count_clip(n-gram) ) / ( Σ_C Σ_{n-gram ∈ C} Count(n-gram) )
    Page 4, “Name-aware MT Evaluation”
  3. As in BLEU metric, we first count the maximum number of times an n-gram occurs in any single reference translation.
    Page 4, “Name-aware MT Evaluation”
  4. The weight of an n-gram in reference translation is the sum of weights of all tokens it contains.
    Page 4, “Name-aware MT Evaluation”
  5. Next, we compute the weighted modified n-gram precision with Count_weight_clip(n-gram) as follows:
    Page 4, “Name-aware MT Evaluation”
  6. The Count_clip(n-gram) in Equation 3 is substituted with the above Count_weight_clip(n-gram).
    Page 5, “Name-aware MT Evaluation”
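
Entries 4–6 define the name-aware modification: an n-gram's weight is the sum of the weights of the tokens it contains, and the clipped counts are accumulated with those weights. A sketch of that weighted counting follows; the token weights and the default weight of 1.0 for non-name tokens are illustrative choices, not values from the paper.

```python
from collections import Counter

def weighted_clipped_counts(candidate, reference, token_weight, n):
    """Count_weight_clip sketch: each candidate n-gram, clipped by its count
    in the reference, contributes the sum of its tokens' weights (name tokens
    weighted higher than ordinary words)."""
    def grams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = grams(candidate), grams(reference)
    total = 0.0
    for gram, c in cand.items():
        clipped = min(c, ref[gram])  # Counter returns 0 for unseen n-grams
        total += clipped * sum(token_weight.get(tok, 1.0) for tok in gram)
    return total
```

With uniform weights of 1.0 this reduces to the ordinary clipped count, which is the sense in which the metric generalizes BLEU.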

BLEU score

Appears in 6 sentences as: BLEU score (3) BLEU scores (3)
In Name-aware Machine Translation
  1. The scaling factors for all features are optimized by minimum error rate training algorithm to maximize BLEU score (Och, 2003).
    Page 2, “Baseline MT”
  2. Based on BLEU score , we design a name-aware BLEU metric as follows.
    Page 4, “Name-aware MT Evaluation”
  3. Finally the name-aware BLEU score is defined as:
    Page 5, “Name-aware MT Evaluation”
  4. In order to investigate the correlation between name-aware BLEU scores and human judgment results, we asked three bilingual speakers to judge our translation output from the baseline system and the NAMT system, on a Chinese subset of 250 sentences (each sentence has two corresponding translations from baseline and NAMT) extracted randomly from 7 test corpora.
    Page 7, “Experiments”
  5. We computed the name-aware BLEU scores on the subset and also the aggregated average scores from human judgments.
    Page 7, “Experiments”
  6. Furthermore, we calculated three Pearson product-moment correlation coefficients between human judgment scores and name-aware BLEU scores of these two MT systems.
    Page 7, “Experiments”

human judgment

Appears in 5 sentences as: human judgement (1) human judgment (3) human judgments (1)
In Name-aware Machine Translation
  1. In order to investigate the correlation between name-aware BLEU scores and human judgment results, we asked three bilingual speakers to judge our translation output from the baseline system and the NAMT system, on a Chinese subset of 250 sentences (each sentence has two corresponding translations from baseline and NAMT) extracted randomly from 7 test corpora.
    Page 7, “Experiments”
  2. We computed the name-aware BLEU scores on the subset and also the aggregated average scores from human judgments .
    Page 7, “Experiments”
  3. Figure 2 shows that NAMT consistently achieved higher scores with both name-aware BLEU metric and human judgement .
    Page 7, “Experiments”
  4. Furthermore, we calculated three Pearson product-moment correlation coefficients between human judgment scores and name-aware BLEU scores of these two MT systems.
    Page 7, “Experiments”
  5. Given the sample size and the correlation coefficient value, the high significance value of 0.99 indicates that name-aware BLEU tracks human judgment well.
    Page 7, “Experiments”
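
Entries 4–5 rely on Pearson product-moment correlation coefficients between human judgment scores and name-aware BLEU scores. For reference, the standard formula can be sketched as below; this is the textbook definition, not code from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation: covariance of the two score lists
    divided by the product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A value near +1, as reported here, means the metric's ranking of translations closely follows the human ranking.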

parallel corpora

Appears in 5 sentences as: parallel corpora (5)
In Name-aware Machine Translation
  1. We propose a Name-aware Machine Translation (MT) approach which can tightly integrate name processing into MT model, by jointly annotating parallel corpora , extracting name-aware translation grammar and rules, adding name phrase table and name translation driven decoding.
    Page 1, “Abstract”
  2. names in parallel corpora , updating word segmentation, word alignment and grammar extraction (Section 3.1).
    Page 2, “Introduction”
  3. We built a NAMT system from such name-tagged parallel corpora .
    Page 3, “Name-aware MT”
  4. The realigned parallel corpora are used to train our NAMT system based on SCFG.
    Page 3, “Name-aware MT”
  5. However, the original parallel corpora contain many high-frequency names, which can already be handled well by the baseline MT.
    Page 3, “Name-aware MT”

phrase table

Appears in 5 sentences as: phrase table (5)
In Name-aware Machine Translation
  1. We propose a Name-aware Machine Translation (MT) approach which can tightly integrate name processing into MT model, by jointly annotating parallel corpora, extracting name-aware translation grammar and rules, adding name phrase table and name translation driven decoding.
    Page 1, “Abstract”
  2. Finally, the extracted 9,963 unique name translation pairs were also used to create an additional name phrase table for NAMT.
    Page 4, “Name-aware MT”
  3. Finally, based on LMs, our decoder exploits the dynamically created phrase table from name translation, competing with originally extracted rules, to find the best translation for the input sentence.
    Page 4, “Name-aware MT”
  4. For better comparison with NAMT, besides the original baseline, we develop the other baseline system by adding name translation table into the phrase table (NPhrase).
    Page 6, “Experiments”
  5. Preprocessing: identify names in the source texts and propose name translations to the MT system; the name translation results can be simply but aggressively transferred from the source to the target side using word alignment, or added into phrase table in order to
    Page 8, “Related Work”

sentence pair

Appears in 5 sentences as: sentence pair (2) sentence pair: (1) sentence pairs (2)
In Name-aware Machine Translation
  1. Given a parallel sentence pair we first apply Giza++ (Och and Ney, 2003) to align words, and apply this join-
    Page 2, “Name-aware MT”
  2. For example, given the following sentence pair:
    Page 3, “Name-aware MT”
  3. Both sentence pairs are kept in the combined data to build the translation model.
    Page 3, “Name-aware MT”
  4. The training corpus includes 1,686,458 sentence pairs .
    Page 5, “Experiments”
  5. For example, in the following sentence pair : “lfiifi‘ EF' , filfifflfiié/ET XE Efii fiiifilifl‘]... (in accordance with the tripartite agreement reached by China, Laos and the UNH CR on )...”, even though the tagger can successfully label “Edi/ET XE Efi/UNHCR” as an organization because it is a common Chinese name, English features based on previous GPE contexts still incorrectly predicted “UNH CR” as a GPE name.
    Page 8, “Experiments”

TER

Appears in 4 sentences as: TER (4)
In Name-aware Machine Translation
  1. Traditional MT evaluation metrics such as BLEU (Papineni et al., 2002) and Translation Edit Rate ( TER ) (Snover et al., 2006) assign the same weights to all tokens equally.
    Page 4, “Name-aware MT Evaluation”
  2. Besides the new name-aware MT metric, we also adopt two traditional metrics, TER to evaluate the overall translation performance and Named Entity Weak Accuracy (NEWA) (Hermjakob et al., 2008) to evaluate the name translation performance.
    Page 5, “Experiments”
  3. TER measures the amount of edits required to change a system output into one of the reference translations.
    Page 5, “Experiments”
  4. TER = # of edits / average # of reference words
    Page 5, “Experiments”
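
Entries 3–4 define TER as the number of edits needed to turn the system output into a reference, divided by the average number of reference words. A rough sketch using plain word-level edit distance follows; full TER additionally counts block shifts as single edits, which this approximation omits, so it is an upper bound rather than the exact metric.

```python
def ter_approx(hyp, refs):
    """Rough TER sketch: minimum word-level edit distance (insertions,
    deletions, substitutions only) to any reference, over the average
    reference length."""
    def edit_distance(a, b):
        # One-row dynamic program for Levenshtein distance over word lists.
        d = list(range(len(b) + 1))
        for i in range(1, len(a) + 1):
            prev, d[0] = d[0], i
            for j in range(1, len(b) + 1):
                prev, d[j] = d[j], min(d[j] + 1,        # deletion
                                       d[j - 1] + 1,    # insertion
                                       prev + (a[i - 1] != b[j - 1]))  # substitution
        return d[len(b)]
    avg_len = sum(len(r) for r in refs) / len(refs)
    return min(edit_distance(hyp, r) for r in refs) / avg_len
```

Lower is better: a perfect match scores 0, and one substitution in a three-word reference scores 1/3.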

translation quality

Appears in 4 sentences as: translation quality (4)
In Name-aware Machine Translation
  1. Additionally, we also propose a new MT metric to appropriately evaluate the translation quality of informative words, by assigning different weights to different words according to their importance values in a document.
    Page 1, “Abstract”
  2. In order to properly evaluate the translation quality of NAMT methods, we propose to modify the BLEU metric so that they can dynamically assign more weights to names during evaluation.
    Page 4, “Name-aware MT Evaluation”
  3. Furthermore, using external name translation table only did not improve translation quality in most test sets except for BOLT2.
    Page 6, “Experiments”
  4. Although the proposed model has significantly enhanced translation quality , some challenges remain.
    Page 7, “Experiments”

Machine Translation

Appears in 3 sentences as: Machine Translation (2) machine translation (1)
In Name-aware Machine Translation
  1. We propose a Name-aware Machine Translation (MT) approach which can tightly integrate name processing into MT model, by jointly annotating parallel corpora, extracting name-aware translation grammar and rules, adding name phrase table and name translation driven decoding.
    Page 1, “Abstract”
  2. A key bottleneck of high-quality cross-lingual information access lies in the performance of Machine Translation (MT).
    Page 1, “Introduction”
  3. In contrast, our name pair mining approach described in this paper does not require any machine translation or transliteration features.
    Page 9, “Related Work”

evaluation metric

Appears in 3 sentences as: evaluation metric (2) evaluation metrics (1)
In Name-aware Machine Translation
  1. Propose a new MT evaluation metric which can discriminate names and noninformative words (Section 4).
    Page 2, “Introduction”
  2. Traditional MT evaluation metrics such as BLEU (Papineni et al., 2002) and Translation Edit Rate (TER) (Snover et al., 2006) assign the same weights to all tokens equally.
    Page 4, “Name-aware MT Evaluation”
  3. We also proposed a new name-aware evaluation metric .
    Page 9, “Conclusions and Future Work”

translation model

Appears in 3 sentences as: translation model (3)
In Name-aware Machine Translation
  1. Some of these names carry special meanings that may influence translations of the neighboring words, and thus replacing them with non-terminals can lead to information loss and weaken the translation model .
    Page 3, “Name-aware MT”
  2. Both sentence pairs are kept in the combined data to build the translation model .
    Page 3, “Name-aware MT”
  3. More importantly, in these approaches the MT model was still mostly treated as a “black-box” because neither the translation model nor the LM was updated or adapted specifically for names.
    Page 8, “Related Work”

translation system

Appears in 3 sentences as: translation system (3)
In Name-aware Machine Translation
  1. Then we apply a state-of-the-art name translation system (Ji et al., 2009) to translate names into the target language.
    Page 3, “Name-aware MT”
  2. The name translation system is composed of the following steps: (1) Dictionary matching based on 150,041 name translation pairs; (2) Statistical name transliteration based on a structured perceptron model and a character based MT model (Dayne and Shahram, 2007); (3) Context information extraction based re-ranking.
    Page 3, “Name-aware MT”
  3. For those names with fewer than five instances in the training data, we use the name translation system to provide translations; for the rest of the names, we leave them to the baseline MT model to handle.
    Page 4, “Name-aware MT”
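
Entry 3 describes a frequency threshold: names with fewer than five instances in the training data go to the dedicated name translation system, while frequent names are left to the baseline MT model. That routing amounts to a simple dispatch, sketched below; the `name_translator` callable and the return format are illustrative assumptions, not the paper's interface.

```python
def route_name(name, train_counts, name_translator, threshold=5):
    """Route a tagged name: rare names (fewer than `threshold` training
    instances) go to the dedicated name translator; frequent names are
    left for the baseline MT model, which already handles them well."""
    if train_counts.get(name, 0) < threshold:
        return ("name_translator", name_translator(name))
    return ("baseline_mt", None)
```

Unseen names have a training count of zero, so they fall to the name translator as well.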

baseline system

Appears in 3 sentences as: baseline system (3)
In Name-aware Machine Translation
  1. For better comparison with NAMT, besides the original baseline, we develop the other baseline system by adding name translation table into the phrase table (NPhrase).
    Page 6, “Experiments”
  2. We can see that except for the BOLT3 data set with BLEU metric, our NAMT approach consistently outperformed the baseline system for all data sets with all metrics, and provided up to 23.6% relative error reduction on name translation.
    Page 6, “Experiments”
  3. In order to investigate the correlation between name-aware BLEU scores and human judgment results, we asked three bilingual speakers to judge our translation output from the baseline system and the NAMT system, on a Chinese subset of 250 sentences (each sentence has two corresponding translations from baseline and NAMT) extracted randomly from 7 test corpora.
    Page 7, “Experiments”
