Lexical normalisation | Second, IV words within a threshold TC character edit distance of the given OOV word are calculated, as is widely used in spell checkers. |
Lexical normalisation | Third, the double metaphone algorithm (Philips, 2000) is used to decode the pronunciation of all IV words, and IV words within a threshold Tp edit distance of the given OOV word under phonemic transcription, are included in the confusion set; this allows us to capture OOV words such as earthquick “earthquake”. |
Lexical normalisation | The recall for lexical edit distance with TC ≤ 2 is moderately high, but it is unable to detect the correct candidate for about one quarter of words. |
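The character-level step described above can be illustrated with a minimal sketch. It assumes unit-cost insertions, deletions and substitutions (plain Levenshtein distance); the function names, the toy lexicon and the default threshold of 2 are hypothetical choices for illustration, not the cited system's implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Character edit distance with unit-cost insertion, deletion, substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def confusion_set(oov: str, vocabulary, t_c: int = 2):
    """Return the IV words within t_c character edits of the OOV word."""
    return sorted(w for w in vocabulary if levenshtein(oov, w) <= t_c)


# Hypothetical toy lexicon of in-vocabulary (IV) words.
vocab = {"earthquake", "earthquakes", "tomorrow", "together"}
near = confusion_set("togther", vocab)       # one deletion away from "together"
missed = confusion_set("earthquick", vocab)  # three edits from "earthquake"
```

Under these assumptions "earthquick" falls outside a threshold of 2, which is the kind of case the phonemic step is meant to recover.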
Machine Translation as a Decipherment Task | Evaluation: All the MT systems are run on the Spanish test data and the quality of the resulting English translations is evaluated using two different measures: (1) Normalized edit distance score (Navarro, 2001),6 and (2) BLEU (Papineni et |
Machine Translation as a Decipherment Task | 6 When computing edit distance, we account for substitutions, insertions, and deletions, as well as the local-swap edit operations required to convert a given English string into the (gold) reference translation. |
Machine Translation as a Decipherment Task | Results: Figure 3 compares the results of various MT systems (using parallel versus decipherment training) on the two test corpora in terms of edit distance scores (a lower score indicates closer match to the gold translation). |
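An edit distance that also counts local swaps can be sketched as follows. This is the optimal string alignment variant of Damerau-Levenshtein distance, swapped in here as one plausible reading of the footnote; dividing by the longer sequence length is an assumed normalisation, since the source does not spell out the exact formula. The function names are hypothetical, and the sketch works on either characters or word tokens.

```python
from typing import Sequence


def osa_distance(a: Sequence, b: Sequence) -> int:
    """Edit distance with substitution, insertion, deletion and adjacent swap."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Local swap: the last two items appear in reversed order.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def normalized_edit_distance(hyp: Sequence, ref: Sequence) -> float:
    """Assumed normalisation: divide by the longer length; lower is better."""
    longest = max(len(hyp), len(ref))
    return osa_distance(hyp, ref) / longest if longest else 0.0


# Word-level example: one adjacent swap separates hypothesis and reference.
score = normalized_edit_distance("the house red".split(), "the red house".split())
```

With this normalisation an exact match scores 0.0, in line with the statement that a lower score indicates a closer match to the gold translation.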
Related Work | The use of edit distance is a typical example, which exploits operations of character deletion, insertion and substitution. |
Related Work | Some methods generate candidates within a fixed range of edit distance or different ranges for strings with different lengths (Li et al., 2006; Whitelaw et al., 2009). |
Related Work | Other methods make use of weighted edit distance to enhance the representation power of edit distance (Ristad and Yianilos, 1998; Oncina and Sebban, 2005; McCallum et al., 2005; Ahmad and Kondrak, 2005). |
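The idea of a weighted edit distance can be illustrated with a small sketch in which each substitution has its own cost. The cited works learn such costs from data; here the cost table is hand-set and purely hypothetical, as are the function name and the default cost of 1.0 for unlisted operations.

```python
def weighted_edit_distance(a, b, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Edit distance where substitution costs come from a lookup table.

    sub_cost maps (source_char, target_char) pairs to a cost; pairs that
    are not listed fall back to 1.0 (an assumption made for this sketch).
    """
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * del_cost]
        for j, cb in enumerate(b, start=1):
            sub = 0.0 if ca == cb else sub_cost.get((ca, cb), 1.0)
            curr.append(min(prev[j] + del_cost,      # delete ca
                            curr[j - 1] + ins_cost,  # insert cb
                            prev[j - 1] + sub))      # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical table: confusing the vowels "a" and "o" is treated as cheap.
vowel_costs = {("a", "o"): 0.25, ("o", "a"): 0.25}
cheap = weighted_edit_distance("cat", "cot", vowel_costs)
default = weighted_edit_distance("cat", "cot", {})
```

With the table above, "cat" and "cot" are closer than under unit costs, so a candidate ranking based on this distance would prefer vowel confusions over arbitrary substitutions.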