Letter-Phoneme Alignment: An Exploration
Jiampojamarn, Sittichai and Kondrak, Grzegorz

Article Structure

Abstract

Letter-phoneme alignment is usually generated by a straightforward application of the EM algorithm.

Introduction

Letter-to-phoneme (L2P) conversion (also called grapheme-to-phoneme conversion) is the task of predicting the pronunciation of a word given its orthographic form by converting a sequence of letters into a sequence of phonemes.
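
As a concrete illustration of the task's input and output (a hypothetical example with ARPAbet-style phoneme symbols, not drawn from the paper):

    # Hypothetical L2P example: four letters map to three phonemes, so
    # the two sequences differ in length -- this mismatch is why
    # letter-phoneme alignment is needed before training an L2P model.
    word = "cake"
    pronunciation = ["K", "EY", "K"]
    print(list(word), "->", pronunciation)
    # ['c', 'a', 'k', 'e'] -> ['K', 'EY', 'K']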

Background

We define the letter-phoneme alignment task as the problem of inducing links between units that are related by pronunciation.
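
For instance, under a one-to-one model a link set might look like the following sketch (the word and symbols are illustrative; "_" marks a null phoneme):

    # Links for "cake" -> K EY K under a 1-1 model; the silent final "e"
    # links to the null phoneme.
    EPSILON = "_"
    links = [("c", "K"), ("a", "EY"), ("k", "K"), ("e", EPSILON)]

    # Both sides can be read back off the links.
    assert "".join(letter for letter, _ in links) == "cake"
    assert [ph for _, ph in links if ph != EPSILON] == ["K", "EY", "K"]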

EM Alignment

Early EM-based alignment methods (Daelemans and Bosch, 1997; Black et al., 1998; Damper et al., 2005) were generally pure 1-1 models.

Phonetic alignment

The EM-based approaches to L2P alignment treat both letters and phonemes as abstract symbols.
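
Phonetic alignment instead scores candidate links by phonetic similarity computed from multivalued features (see the "dynamic programming" topic below). A rough sketch under toy assumptions; the feature inventory and weights here are invented, not the paper's:

    # Toy multivalued features: each symbol gets (place, manner, voicing)
    # values on a 0-1 scale. ALINE-style systems use a much richer set.
    FEATURES = {
        "p": (1.0, 1.0, 0.0),
        "b": (1.0, 1.0, 1.0),
        "o": (0.3, 0.0, 1.0),
    }
    WEIGHTS = (10.0, 5.0, 2.0)  # per-feature salience

    def similarity(a: str, b: str) -> float:
        """Weighted negative feature distance: higher means more similar."""
        return -sum(w * abs(x - y)
                    for w, x, y in zip(WEIGHTS, FEATURES[a], FEATURES[b]))

    # A stop differing only in voicing outscores a vowel-consonant pair,
    # so phonetically implausible links get low scores automatically.
    assert similarity("p", "b") > similarity("p", "o")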

Constraint-based alignment

One of the advantages of the phonetic alignment is its ability to rule out phonetically implausible letter-phoneme links, such as o:p.
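
In code, constraint-based filtering amounts to consulting a table of allowable letter-phoneme mappings before any link is considered. The entries below are illustrative assumptions, not the actual hand-built list discussed in the paper:

    # Illustrative table of allowable letter-phoneme mappings ("_" is
    # the null phoneme).
    ALLOWED = {
        "o": {"OW", "AO", "AA", "_"},
        "p": {"P", "F", "_"},  # "F" covers "ph" digraph spellings
    }

    def plausible(letter: str, phoneme: str) -> bool:
        """A candidate link survives only if the table allows it."""
        return phoneme in ALLOWED.get(letter, set())

    assert plausible("o", "OW")
    assert not plausible("o", "P")  # exactly the kind of link ruled out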

IP Alignment

The process of manually inducing allowable letter-phoneme mappings is time-consuming and involves a great deal of language-specific knowledge.

Alignment by aggregation

During our development experiments, we observed that the technique that combines IP with EM described in the previous section generally leads to alignment quality improvement in comparison with the IP alignment.

Intrinsic evaluation

For the intrinsic evaluation, we compared the generated alignments to gold standard alignments extracted from the core vocabulary of the Combilex data set (Richmond et al., 2009).

Extrinsic evaluation

In order to investigate the relationship between the alignment quality and L2P performance, we feed the alignments to two different L2P systems.

Conclusion

We investigated several new methods for generating letter-phoneme alignments.

Topics

gold standard

Appears in 8 sentences as: gold standard (8)
In Letter-Phoneme Alignment: An Exploration
  1. The intrinsic evaluation is conducted by comparing the generated alignments to a manually-constructed gold standard.
    Page 1, “Introduction”
  2. For the intrinsic evaluation, we compared the generated alignments to gold standard alignments extracted from the core vocabulary of the Combilex data set (Richmond et al., 2009).
    Page 6, “Intrinsic evaluation”
  3. Since the gold standard includes many links that involve multiple letters, the theoretical upper bound for recall achieved by a one-to-one approach is 90.02%.
    Page 6, “Intrinsic evaluation”
  4. However, it is possible to obtain perfect precision because we count as correct all 1-1 links that are consistent with the MM links in the gold standard.
    Page 6, “Intrinsic evaluation”
  5. Overall, the MM models obtain lower precision but higher recall and F-score than 1-1 models, which is to be expected as the gold standard is defined in terms of MM links.
    Page 6, “Intrinsic evaluation”
  6. Its precision is particularly impressive: on average, only one link in a thousand is not consistent with the gold standard.
    Page 6, “Intrinsic evaluation”
  7. Interestingly, EM-Aggr matches the L2P accuracy obtained with the gold standard alignments.
    Page 7, “Extrinsic evaluation”
  8. However, there is no reason to claim that the gold standard alignments are optimal for the L2P generation task, so that result should not be considered as an upper bound.
    Page 7, “Extrinsic evaluation”

alignment model

Appears in 3 sentences as: alignment model (1) alignment models (1) alignment models: (1)
In Letter-Phoneme Alignment: An Exploration
  1. The following constraints on links are assumed by some or all alignment models:
    Page 2, “Background”
  2. We refer to an alignment model that assumes all three constraints as a pure one-to-one (1-1) model.
    Page 2, “Background”
  3. The TiMBL L2P generation method (Table 2) is applicable only to the 1-1 alignment models.
    Page 7, “Extrinsic evaluation”

dynamic programming

Appears in 3 sentences as: dynamic programming (3)
In Letter-Phoneme Alignment: An Exploration
  1. The 1-1 alignment problem can be formulated as a dynamic programming problem to find the maximum score of alignment, given a probability table of aligning letter and phoneme as a mapping function.
    Page 2, “EM Alignment”
  2. The dynamic programming recursion to find the most likely alignment is the following:
    Page 2, “EM Alignment”
  3. It combines a dynamic programming alignment algorithm with an appropriate scoring scheme for computing phonetic similarity on the basis of multivalued features.
    Page 3, “Phonetic alignment”
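
The recursion itself is not reproduced on this page. Below is a minimal sketch of the 1-1 case in Python, assuming that null links are allowed on the letter side only (so a word must have at least as many letters as phonemes) and using an invented probability table in place of the EM-trained one:

    EPS = "_"  # null phoneme

    def align(letters, phonemes, prob):
        """Best monotonic 1-1 alignment by dynamic programming."""
        n, m = len(letters), len(phonemes)
        # best[i][j]: max probability of aligning the first i letters to
        # the first j phonemes; back[i][j] remembers the last step taken.
        best = [[0.0] * (m + 1) for _ in range(n + 1)]
        back = [[None] * (m + 1) for _ in range(n + 1)]
        best[0][0] = 1.0
        for i in range(n + 1):
            for j in range(m + 1):
                if best[i][j] == 0.0:
                    continue
                if i < n and j < m:  # link letter i to phoneme j
                    s = best[i][j] * prob.get((letters[i], phonemes[j]), 1e-9)
                    if s > best[i + 1][j + 1]:
                        best[i + 1][j + 1] = s
                        back[i + 1][j + 1] = (i, j, phonemes[j])
                if i < n:            # letter i links to the null phoneme
                    s = best[i][j] * prob.get((letters[i], EPS), 1e-9)
                    if s > best[i + 1][j]:
                        best[i + 1][j] = s
                        back[i + 1][j] = (i, j, EPS)
        links, i, j = [], n, m
        while (i, j) != (0, 0):      # walk the back-pointers
            pi, pj, ph = back[i][j]
            links.append((letters[pi], ph))
            i, j = pi, pj
        return best[n][m], links[::-1]

    # Invented probabilities, standing in for the EM-learned table:
    prob = {("c", "K"): 0.9, ("a", "EY"): 0.8, ("k", "K"): 0.9, ("e", EPS): 0.7}
    score, links = align("cake", ["K", "EY", "K"], prob)
    # links == [("c", "K"), ("a", "EY"), ("k", "K"), ("e", "_")]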

F-score

Appears in 3 sentences as: F-score (3)
In Letter-Phoneme Alignment: An Exploration
  1. We report the alignment quality in terms of precision, recall and F-score.
    Page 6, “Intrinsic evaluation”
  2. The F-score corresponding to perfect precision and the upper-bound recall is 94.75%.
    Page 6, “Intrinsic evaluation”
  3. Overall, the MM models obtain lower precision but higher recall and F-score than 1-1 models, which is to be expected as the gold standard is defined in terms of MM links.
    Page 6, “Intrinsic evaluation”
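
The 94.75% figure follows directly from the perfect precision and the 90.02% upper-bound recall quoted under "gold standard" above:

    F_1 = \frac{2PR}{P + R} = \frac{2 \times 1.0 \times 0.9002}{1.0 + 0.9002} \approx 0.9475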

Viterbi

Appears in 3 sentences as: Viterbi (3)
In Letter-Phoneme Alignment: An Exploration
  1. The final many-to-many alignments are created by finding the most likely paths using the Viterbi algorithm based on the learned mapping probability table.
    Page 3, “EM Alignment”
  2. In order to generate the list of best alignments, we use Algorithm 2, which is an adaptation of the standard Viterbi algorithm.
    Page 5, “Alignment by aggregation”
  3. The decoder module uses standard Viterbi for the 1-1 case, and a phrasal decoder (Zens and Ney, 2004) for the MM case.
    Page 7, “Extrinsic evaluation”
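
A rough sketch of the MM case (1-best only, not the n-best list of Algorithm 2): letter and phoneme substrings of up to two units may link, and the most likely path is recovered from a learned mapping probability table. The substring bound, the table entries, and the omission of null links are all assumptions of this sketch:

    MAXL = 2  # maximum substring length on either side (an assumption)

    def viterbi_mm(letters, phonemes, prob):
        """1-best many-to-many alignment over a mapping probability table."""
        n, m = len(letters), len(phonemes)
        # best[(i, j)] = (probability, links) of the best path that has
        # consumed i letters and j phonemes so far.
        best = {(0, 0): (1.0, [])}
        for i in range(n + 1):
            for j in range(m + 1):
                if (i, j) not in best:
                    continue
                score, path = best[(i, j)]
                for di in range(1, MAXL + 1):
                    for dj in range(1, MAXL + 1):
                        if i + di > n or j + dj > m:
                            continue
                        src = letters[i:i + di]
                        tgt = " ".join(phonemes[j:j + dj])
                        p = prob.get((src, tgt), 0.0)
                        if p == 0.0:
                            continue
                        if score * p > best.get((i + di, j + dj), (0.0, []))[0]:
                            best[(i + di, j + dj)] = (score * p, path + [(src, tgt)])
        return best.get((n, m), (0.0, []))

    # Invented table: the digraph "sh" links to SH, and "oe" to UW.
    prob = {("sh", "SH"): 0.9, ("oe", "UW"): 0.8}
    score, links = viterbi_mm("shoe", ["SH", "UW"], prob)
    # links == [("sh", "SH"), ("oe", "UW")]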
