Translating Dialectal Arabic to English
Sajjad, Hassan and Darwish, Kareem and Belinkov, Yonatan

Article Structure

Abstract

We present a dialectal Egyptian Arabic to English statistical machine translation system that leverages dialectal to Modern Standard Arabic (MSA) adaptation.

Introduction

Yonatan Belinkov CSAIL Massachusetts Institute of Technology belinkov@mit.edu

Previous Work

Our work is related to research on MT from a resource poor language (to other languages) by pivoting on a closely related resource rich language.

Proposed Methods 3.1 Egyptian to EG’ Conversion

As mentioned previously, dialects differ from MSA in vocabulary, morphology, and phonology.

Conclusion

We presented an Egyptian to English MT system.

Topics

BLEU

Appears in 13 sentences as: BLEU (13)
In Translating Dialectal Arabic to English
  1. The transformation reduces the out-of-vocabulary (OOV) words from 5.2% to 2.6% and gives a gain of 1.87 BLEU points.
    Page 1, “Abstract”
  2. Further, adapting large MSA/English parallel data increases the lexical coverage, reduces OOVs to 0.7% and leads to an absolute BLEU improvement of 2.73 points.
    Page 1, “Abstract”
  3. - We built a phrasal Machine Translation (MT) system on adapted Egyptian/English parallel data, which outperformed a non-adapted baseline by 1.87 BLEU points.
    Page 2, “Introduction”
  4. Train LM BLEU OOV
    Page 3, “Previous Work”
  5. The system trained on AR (B1) performed poorly compared to the one trained on EG (B2) with a 6.75 BLEU points difference.
    Page 3, “Previous Work”
  6. S1, which used only EG’ for training, showed an improvement of 1.67 BLEU points over the best baseline system (B4).
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  7. Phrase merging that preferred phrases learnt from EG’ data over AR data performed the best with a BLEU score of 16.96.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  8. Egyptian sentence “wbyHtrmwA AlnAs AltAnyp” produced “(OOV) the second people” (BLEU = 0.31).
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  9. Conversion changed “wbyHtrmwA” to “watrmwA” and “AltAnyp” to “AlvAnyp”, leading to “and they respect other people” (BLEU = 1).
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  10. In further analysis, we examined 1% of the sentences with the largest difference in BLEU score.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  11. Out of these, more than 70% were cases where the EG’ model achieved a higher BLEU score.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
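
The OOV percentages quoted in this list can be read as the share of test tokens that never occur in the training data. The following is a minimal sketch of that computation, assuming whitespace-tokenized text and a token-level (rather than type-level) count; the function name and the toy sentences are hypothetical and are not taken from the paper.

```python
def oov_rate(train_sentences, test_sentences):
    """Percentage of test tokens that do not occur in the training data."""
    # Vocabulary of all tokens seen on the training side.
    vocab = {tok for sent in train_sentences for tok in sent.split()}
    test_tokens = [tok for sent in test_sentences for tok in sent.split()]
    if not test_tokens:
        return 0.0
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return 100.0 * unseen / len(test_tokens)


# Toy data (assumed): one training sentence and one test sentence.
train = ["AlnAs AlvAnyp"]
test = ["wbyHtrmwA AlnAs AltAnyp"]
rate = oov_rate(train, test)  # two of the three test tokens are unseen
```

Under this reading, converting dialectal tokens to forms already present in the training vocabulary lowers the count of unseen tokens and therefore the rate.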

See all papers in Proc. ACL 2013 that mention BLEU.

parallel data

Appears in 9 sentences as: parallel data (10)
In Translating Dialectal Arabic to English
  1. Further, adapting large MSA/English parallel data increases the lexical coverage, reduces OOVs to 0.7% and leads to an absolute BLEU improvement of 2.73 points.
    Page 1, “Abstract”
  2. Later, we applied an adaptation method to incorporate MSA/English parallel data.
    Page 2, “Introduction”
  3. - We built a phrasal Machine Translation (MT) system on adapted Egyptian/English parallel data, which outperformed a non-adapted baseline by 1.87 BLEU points.
    Page 2, “Introduction”
  4. - We used phrase-table merging (Nakov and Ng, 2009) to utilize MSA/English parallel data with the available in-domain parallel data.
    Page 2, “Introduction”
  5. This can be done by either translating between the related languages using word-level translation, character level transformations, and language specific rules (Durrani et al., 2010; Hajic et al., 2000; Nakov and Tiedemann, 2012), or by concatenating the parallel data for both languages (Nakov and Ng, 2009).
    Page 2, “Previous Work”
  6. These translation methods generally require parallel data, for which hardly any exists between dialects and MSA.
    Page 2, “Previous Work”
  7. Their best Egyptian/English system was trained on dialect/English parallel data.
    Page 2, “Previous Work”
  8. We word-aligned the parallel data using GIZA++ (Och and Ney, 2003), and symmetrized the alignments using the grow-diag-final-and heuristic (Koehn et al., 2003).
    Page 2, “Previous Work”
  9. adapted parallel data showed an improvement of 1.87 BLEU points over our best baseline.
    Page 5, “Conclusion”

LM

Appears in 8 sentences as: LM (8)
In Translating Dialectal Arabic to English
  1. Our LM experiments also affirmed the importance of in-domain English LMs.
    Page 2, “Previous Work”
  2. Train LM BLEU OOV
    Page 3, “Previous Work”
  3. Table 1: Baseline results using the EG and AR training sets with GW and EGen corpora for LM training
    Page 3, “Previous Work”
  4. In case of more than one LM, we tuned their weights on a development set using Minimum Error Rate Training (Och and Ney, 2003).
    Page 3, “Previous Work”
  5. We built several baseline systems as follows: - B1 used AR for training a translation model and GW for LM.
    Page 3, “Previous Work”
  6. Then we used a trigram LM that we built from the aforementioned Aljazeera articles to pick the most likely candidate in context.
    Page 3, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  7. We simply multiplied the character-level transformation probability with the LM probability, giving them equal weight.
    Page 3, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  8. - S1 and S2 trained on the EG’ with EGen and with both EGen and GW for LM training, respectively.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
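
Sentences 6 and 7 describe picking a conversion candidate by multiplying a character-level transformation probability with a trigram LM probability at equal weight. The sketch below illustrates that scoring rule only. The candidate list, the probability values, and the toy trigram table are all assumptions made for illustration, and the paper's actual models are not reproduced here. Probabilities are assumed to be strictly positive so that logarithms are defined.

```python
import math


def pick_candidate(candidates, context, trigram_prob):
    """Return the candidate that maximizes P_transform * P_LM (equal weight).

    candidates   : dict mapping candidate word -> transformation probability
    context      : tuple holding the two preceding words
    trigram_prob : callable (w1, w2, w3) -> probability of w3 given w1, w2
    """
    def score(item):
        word, p_transform = item
        p_lm = trigram_prob(context[0], context[1], word)
        # Adding logs is equivalent to multiplying the two probabilities.
        return math.log(p_transform) + math.log(p_lm)

    best_word, _ = max(candidates.items(), key=score)
    return best_word


# Toy trigram table with a small floor for unseen trigrams (assumed values).
TOY_TRIGRAMS = {("<s>", "AlnAs", "AlvAnyp"): 0.2}


def toy_trigram_prob(w1, w2, w3, floor=1e-6):
    return TOY_TRIGRAMS.get((w1, w2, w3), floor)


# Hypothetical candidates for one dialectal word and their transformation scores.
candidates = {"AltAnyp": 0.6, "AlvAnyp": 0.4}
best = pick_candidate(candidates, ("<s>", "AlnAs"), toy_trigram_prob)
```

In this toy case the candidate with the lower transformation probability still wins, because the LM strongly prefers it in context. That is the effect the equal-weight product is meant to capture.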

phrase table

Appears in 8 sentences as: phrase table (5) phrase tables (3)
In Translating Dialectal Arabic to English
  1. - Only added the phrase with its translations and their probabilities from the AR phrase table.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  2. - Only added the phrase with its translations and their probabilities from the EG’ phrase table.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  3. - Added translations of the phrase from both phrase tables and left the choice to the decoder.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  4. We added three additional features to the new phrase table to avail the information about the origin of phrases (as in Nakov and Ng (2009)).
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  5. We built separate phrase tables from the two corpora and merged them.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  6. For SALL, we kept phrases from both phrase tables.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  7. Table 2 summarizes results of using EG’ and phrase table merging.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  8. Using phrase table merging that combined AR and EG’ training data in a way that preferred adapted dialectal data yielded an extra 0.86 BLEU points.
    Page 5, “Conclusion”
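
The three merging options in sentences 1 to 3 (prefer AR, prefer EG’, or keep both) and the origin features in sentence 4 can be sketched as follows. This is an illustrative simplification, not the authors' implementation: phrase tables are modeled as plain dictionaries, each entry carries a single probability, and the three origin features are assumed to be binary flags (entry from EG’, entry from AR, source phrase present in both tables). The exact feature values used in the paper or in Nakov and Ng (2009) may differ.

```python
def merge_phrase_tables(eg_table, ar_table, prefer="EG"):
    """Merge two phrase tables keyed by source phrase.

    Each table maps source phrase -> {target phrase: probability}.
    prefer="EG"   : when a phrase is in both tables, keep only EG' entries
    prefer="AR"   : when a phrase is in both tables, keep only AR entries
    prefer="BOTH" : keep entries from both tables; the decoder chooses
    Returns a list of (source, target, prob, (from_eg, from_ar, in_both)).
    """
    if prefer not in {"EG", "AR", "BOTH"}:
        raise ValueError("prefer must be 'EG', 'AR', or 'BOTH'")
    merged = []
    for source in sorted(set(eg_table) | set(ar_table)):
        in_eg, in_ar = source in eg_table, source in ar_table
        in_both = in_eg and in_ar
        # Decide which table(s) contribute entries for this source phrase.
        origins = []
        if in_eg and (not in_both or prefer in {"EG", "BOTH"}):
            origins.append(("EG", eg_table[source]))
        if in_ar and (not in_both or prefer in {"AR", "BOTH"}):
            origins.append(("AR", ar_table[source]))
        for origin, translations in origins:
            flags = (int(origin == "EG"), int(origin == "AR"), int(in_both))
            for target, prob in translations.items():
                merged.append((source, target, prob, flags))
    return merged


# Toy tables (assumed entries and probabilities).
eg = {"AlnAs": {"people": 0.9}}
ar = {"AlnAs": {"the people": 0.7}, "AlvAnyp": {"second": 0.8}}
merged_eg = merge_phrase_tables(eg, ar, prefer="EG")
merged_both = merge_phrase_tables(eg, ar, prefer="BOTH")
```

With prefer="EG" the overlapping phrase keeps only its EG’ translation, while phrases found in a single table are always carried over. This mirrors the setting that the excerpts report as performing best.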

BLEU points

Appears in 6 sentences as: BLEU points (6)
In Translating Dialectal Arabic to English
  1. The transformation reduces the out-of-vocabulary (OOV) words from 5.2% to 2.6% and gives a gain of 1.87 BLEU points.
    Page 1, “Abstract”
  2. - We built a phrasal Machine Translation (MT) system on adapted Egyptian/English parallel data, which outperformed a non-adapted baseline by 1.87 BLEU points.
    Page 2, “Introduction”
  3. The system trained on AR (B1) performed poorly compared to the one trained on EG (B2) with a 6.75 BLEU points difference.
    Page 3, “Previous Work”
  4. S1, which used only EG’ for training, showed an improvement of 1.67 BLEU points over the best baseline system (B4).
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  5. adapted parallel data showed an improvement of 1.87 BLEU points over our best baseline.
    Page 5, “Conclusion”
  6. Using phrase table merging that combined AR and EG’ training data in a way that preferred adapted dialectal data yielded an extra 0.86 BLEU points.
    Page 5, “Conclusion”

language modeling

Appears in 6 sentences as: language modeling (3) language models (3)
In Translating Dialectal Arabic to English
  1. They used two language models built from the English GigaWord corpus and from a large web crawl.
    Page 2, “Previous Work”
  2. For language modeling, we used either EGen or the English side of the AR corpus plus the English side of NIST12 training data and English GigaWord v5.
    Page 2, “Previous Work”
  3. - B2-B4 systems used identical training data, namely EG, with the GW, EGen, or both for B2, B3, and B4 respectively for language modeling.
    Page 3, “Previous Work”
  4. Using EG data for training both the translation and language models was effective.
    Page 3, “Previous Work”
  5. Using both language models (S2) led to a slight improvement.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  6. Also, we believe that improving English language modeling to match the genre of the translated sentences can have significant positive impact on translation quality.
    Page 5, “Conclusion”

morphological analyzer

Appears in 6 sentences as: morphological analysis (2) morphological analyzer (4)
In Translating Dialectal Arabic to English
  1. Sawaf (2010) proposed a dialect to MSA normalization that used character-level rules and morphological analysis .
    Page 2, “Previous Work”
  2. We tokenized Egyptian and Arabic according to the ATB tokenization scheme using the MADA+TOKAN morphological analyzer and tokenizer v3.1 (Roth et al., 2008).
    Page 2, “Previous Work”
  3. Perhaps a morphological analyzer, or just a part-of-speech tagger, could enforce (or probabilistically encourage) a match in parts of speech.
    Page 5, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  4. In particular, using a morphological analyzer seems like a promising possibility.
    Page 5, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  5. One approach could be to run a morphological analyzer for dialectal Arabic (e.g.
    Page 5, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  6. For future work, we want to expand our work to other dialects, while utilizing dialectal morphological analysis to improve conversion.
    Page 5, “Conclusion”

BLEU score

Appears in 3 sentences as: BLEU score (3)
In Translating Dialectal Arabic to English
  1. Phrase merging that preferred phrases learnt from EG’ data over AR data performed the best with a BLEU score of 16.96.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  2. In further analysis, we examined 1% of the sentences with the largest difference in BLEU score.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
  3. Out of these, more than 70% were cases where the EG’ model achieved a higher BLEU score.
    Page 4, “Proposed Methods 3.1 Egyptian to EG’ Conversion”
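
The analysis in sentences 2 and 3 selects the sentences whose BLEU scores differ most between two systems and then counts how often the adapted system wins. A rough sketch of that selection, assuming per-sentence BLEU scores are already available for both systems; the function name, the rounding of the 1% cutoff, and the toy scores are assumptions rather than details from the paper.

```python
import math


def top_difference_share(baseline_scores, adapted_scores, fraction=0.01):
    """Among the sentences with the largest absolute score difference,
    return (number examined, share where the adapted system scores higher)."""
    if len(baseline_scores) != len(adapted_scores):
        raise ValueError("score lists must have the same length")
    if not baseline_scores:
        raise ValueError("score lists must not be empty")
    diffs = [a - b for a, b in zip(adapted_scores, baseline_scores)]
    # Examine at least one sentence even for very small test sets.
    k = max(1, math.ceil(fraction * len(diffs)))
    top = sorted(diffs, key=abs, reverse=True)[:k]
    share = sum(1 for d in top if d > 0) / k
    return k, share


# Toy per-sentence scores (assumed); a large fraction keeps the example small.
baseline = [0.31, 0.5, 0.4, 0.2]
adapted = [1.0, 0.5, 0.3, 0.25]
examined, share = top_difference_share(baseline, adapted, fraction=0.5)
```

Here two sentences are examined and the adapted system is better on one of them, so the share is 0.5. On a real test set the default fraction of 0.01 corresponds to the 1% cutoff described above.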

in-domain

Appears in 3 sentences as: in-domain (3)
In Translating Dialectal Arabic to English
  1. - We used phrase-table merging (Nakov and Ng, 2009) to utilize MSA/English parallel data with the available in-domain parallel data.
    Page 2, “Introduction”
  2. In contrast, we showed that training on in-domain dialectal data irrespective of its small size is better than training on large MSA/English data.
    Page 2, “Previous Work”
  3. Our LM experiments also affirmed the importance of in-domain English LMs.
    Page 2, “Previous Work”
