How to Speak a Language without Knowing It
Xing Shi, Kevin Knight, and Heng Ji

Article Structure

Abstract

We develop a system that lets people overcome language barriers by letting them speak a language they do not know.

Introduction

Can people speak a language they don’t know?

Evaluation

Our system’s input is Chinese.

Data

We seek to imitate phonetic transformations found in phrasebooks, so phrasebooks themselves are a good source of training data.

Model

We model Chinese-to-Chinglish translation with a cascade of weighted finite-state transducers (wFST), shown in Figure 2.
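The cascade idea can be sketched with ordinary Python functions standing in for the transducers. This is a hedged, minimal stand-in, not the paper's actual FSTs A-D: the stages, pronunciation tables, and weights below are invented for illustration, and composition is approximated by multiplying path weights the way wFST composition would.

```python
def compose(stages, inp):
    """Cascade weighted string-to-string stages (a crude wFST stand-in).

    Each stage maps an input string to a list of (output, weight)
    candidates; composing the cascade multiplies weights along each
    path, as weighted FST composition would.
    """
    paths = [(inp, 1.0)]
    for stage in stages:
        paths = [(out, w * w2) for s, w in paths for out, w2 in stage(s)]
    return sorted(paths, key=lambda p: -p[1])

# Toy stages (assumed tiny tables): English word -> phonemes -> pinyin.
word_to_phones = lambda s: [("N AY T", 1.0)] if s == "night" else []
phones_to_pinyin = lambda s: ([("nai te", 0.7), ("na yi te", 0.3)]
                              if s == "N AY T" else [])

print(compose([word_to_phones, phones_to_pinyin], "night"))
# best path: ("nai te", 0.7)
```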

Training

FSTs A, C, and D are unweighted, and remain so throughout this paper.

Experiments

Our first evaluation (Table 4) is intrinsic, measuring our Chinglish output against references from

Conclusions

Our work aims to help people speak foreign languages they don’t know, by providing native phonetic spellings that approximate the sounds of foreign phrases.

Topics

edit distance

Appears in 9 sentences as: Edit Distance (2) edit distance (6) edit distances (1)
In How to Speak a Language without Knowing It
  1. We compute the normalized edit distance between the system’s output and a human-generated Chinglish reference.
    Page 2, “Evaluation”
  2. We measure the normalized edit distance against an English reference.
    Page 2, “Evaluation”
  3. the test portion of our phrasebook, using edit distance.
    Page 4, “Experiments”
  4. The average edit distance of phoneme-phrase model and that of hybrid training/decoding model are close, indicating that long phoneme-phrase pairs can emulate word-pinyin mappings.
    Page 4, “Experiments”
  5. Model                          Edit Distance
     Reference English              0.477
     Phoneme based                  0.696
     Hybrid training and decoding   0.496
    Page 4, “Experiments”
  6. Then we measure edit distance between the human transcription and the reference English from our phrasebook.
    Page 4, “Experiments”
  7. Model                          Edit Distance
     Word based                     0.925
     Word-based hybrid training     0.925
     Phoneme based                  0.937
     Phoneme-phrase based           0.896
     Hybrid training and decoding   0.898
    Page 5, “Experiments”
  8. Numbers are average edit distance between recognized English and reference English.
    Page 5, “Experiments”
  9. Speech recognition is more fragile than human transcription, so edit distances are greater.
    Page 5, “Experiments”

See all papers in Proc. ACL 2014 that mention edit distance.
phrase pairs

Appears in 3 sentences as: phrase pairs (3)
In How to Speak a Language without Knowing It
  1. Second, we extract phoneme phrase pairs consistent with these alignments.
    Page 3, “Training”
  2. From the example above, we pull out phrase pairs like:
    Page 3, “Training”
  3. We add these phrase pairs to FST B, and call this the phoneme-phrase-based model.
    Page 3, “Training”

Viterbi

Appears in 3 sentences as: Viterbi (3)
In How to Speak a Language without Knowing It
  1. To do this, we first take our phrasebook triples and construct sample string pairs <Epron, Pinyin-split> by pronouncing the phrasebook English with FST A, and by pronouncing the phrasebook Chinglish with FSTs D and C. Then we run the EM algorithm to learn FST B parameters (Table 3) and Viterbi alignments, such as:
    Page 3, “Training”
  2. First, we obtain Viterbi alignments using the phoneme-based model, e.g.:
    Page 3, “Training”
  3. EM learns values for parameters like P(nai te|night), plus Viterbi alignments such as:
    Page 3, “Training”
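The Viterbi decode that produces such alignments can be sketched as a dynamic program: given a table of parameters like P(nai te|night) learned by EM, find the highest-probability monotone segmentation of the pinyin-split string, one span per English phoneme. The probability table and example here are invented for illustration, and EM training itself is not shown.

```python
import math

def viterbi_align(epron, psplit, prob, max_span=2):
    """Best monotone alignment of English phonemes to pinyin-split tokens.

    prob[(phoneme, span)] is an assumed EM-learned probability of the
    phoneme producing that tuple of pinyin-split tokens. Returns one
    (start, end) span per phoneme, or None if no alignment exists.
    """
    n, m = len(epron), len(psplit)
    NEG = float("-inf")
    # best[i][j] = best log-prob of aligning epron[:i] to psplit[:j]
    best = [[NEG] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for k in range(1, max_span + 1):
                if j - k < 0 or best[i - 1][j - k] == NEG:
                    continue
                p = prob.get((epron[i - 1], tuple(psplit[j - k:j])), 0.0)
                if p <= 0.0:
                    continue
                score = best[i - 1][j - k] + math.log(p)
                if score > best[i][j]:
                    best[i][j] = score
                    back[i][j] = j - k
    if best[n][m] == NEG:
        return None
    # Recover spans by following backpointers from (n, m).
    spans, j = [], m
    for i in range(n, 0, -1):
        k = back[i][j]
        spans.append((k, j))
        j = k
    return list(reversed(spans))

# Assumed toy parameters for "night" (N AY T) -> "n ai t e":
table = {("N", ("n",)): 0.9, ("AY", ("ai",)): 0.8,
         ("T", ("t", "e")): 0.5, ("T", ("t",)): 0.4}
print(viterbi_align(["N", "AY", "T"], ["n", "ai", "t", "e"], table))
# [(0, 1), (1, 2), (2, 4)]
```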
