Weak semantic context helps phonetic learning in a model of infant language acquisition
Frank, Stella and Feldman, Naomi H. and Goldwater, Sharon

Article Structure

Abstract

Learning phonetic categories is one of the first steps to learning a language, yet is hard to do using only distributional phonetic information.

Introduction

Infants begin learning the phonetic categories of their native language in their first year (Kuhl et al., 1992; Polka and Werker, 1994; Werker and Tees, 1984).

Background and overview of models

Infants attend to distributional characteristics of their input (Maye et al., 2002, 2008), leading to the hypothesis that phonetic categories could be acquired on the basis of bottom-up distributional learning alone (de Boer and Kuhl, 2003; Vallabha et al., 2007; McMurray et al., 2009).

Lexical-Distributional Model

In this section we describe more formally the generative process for the LD model (Feldman et al., 2013a), a joint Bayesian model over phonetic categories and a lexicon, before describing the TLD extension in the following section.

Topic-Lexical-Distributional Model

The TLD model retains the IGMM vowel phone component, but extends the lexicon of the LD model by adding topic-specific lexicons, which capture the notion that lexeme probabilities are topic-dependent.

Inference: Gibbs Sampling

We use Gibbs sampling to infer three sets of variables in the TLD model: assignments to vowel categories in the lexemes, assignments of tokens to topics, and assignments of tokens to lexemes.

Experiments

6.1 Corpus

Conclusion

Language acquisition is a complex task, in which many heterogeneous sources of information may be useful.

Topics

hyperparameters

Appears in 7 sentences as: hyperparameter (1) Hyperparameters (2) hyperparameters (4)
  1. Squared nodes depict hyperparameters.
    Page 5, “Inference: Gibbs Sampling”
  2. A is the set of hyperparameters used by H_L when generating lexical items (see Section 3.2).
    Page 5, “Inference: Gibbs Sampling”
  3. 5.3 Hyperparameters
    Page 6, “Inference: Gibbs Sampling”
  4. The three hyperparameters governing the HDP over the lexicon, α_ℓ and α_k, and the DP over vowel categories, α_c, are estimated using a slice sampler.
    Page 6, “Inference: Gibbs Sampling”
  5. The remaining hyperparameters for the vowel category and lexeme priors are set to the same values used by Feldman et al.
    Page 6, “Inference: Gibbs Sampling”
  6. Hyperparameters are inferred, which leads to a dominant topic that includes mainly light verbs (have, let, see, do).
    Page 7, “Experiments”
  7. Each condition (model, vowel speakers, consonant set) is run five times, using 1500 iterations of Gibbs sampling with hyperparameter sampling.
    Page 8, “Experiments”
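
The excerpts above mention that the concentration parameters are estimated with a slice sampler but do not spell the procedure out. Below is a minimal, generic sketch of slice-sampling a single Dirichlet process concentration parameter under a Gamma prior, using Neal's stepping-out and shrinkage scheme; the prior, the cluster and data counts, and all variable names are illustrative assumptions, not the paper's settings.

    import numpy as np
    from scipy.special import gammaln

    def log_post_alpha(alpha, K=15, n=500, a=1.0, b=1.0):
        # Log posterior of a DP concentration alpha that produced K clusters
        # over n observations, with a Gamma(a, b) prior (illustrative values,
        # not the paper's).
        if alpha <= 0:
            return -np.inf
        log_prior = (a - 1.0) * np.log(alpha) - b * alpha
        log_lik = K * np.log(alpha) + gammaln(alpha) - gammaln(alpha + n)
        return log_prior + log_lik

    def slice_sample(log_density, x0, width=1.0, n_steps=200, seed=0):
        # Univariate slice sampler with stepping-out and shrinkage (Neal, 2003).
        rng = np.random.default_rng(seed)
        x = x0
        for _ in range(n_steps):
            log_y = log_density(x) + np.log(rng.uniform())   # auxiliary height
            left = x - width * rng.uniform()                  # random initial interval
            right = left + width
            while log_density(left) > log_y:                  # step out to the left
                left -= width
            while log_density(right) > log_y:                 # step out to the right
                right += width
            while True:                                       # shrink until accepted
                x_new = rng.uniform(left, right)
                if log_density(x_new) > log_y:
                    x = x_new
                    break
                if x_new < x:
                    left = x_new
                else:
                    right = x_new
        return x

    alpha_hat = slice_sample(log_post_alpha, x0=1.0)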


topic distributions

Appears in 7 sentences as: topic distribution (2) topic distributions (5)
  1. However, in our simulations we approximate the environmental information by running a topic model (Blei et al., 2003) over a corpus of child-directed speech to infer a topic distribution for each situation.
    Page 1, “Introduction”
  2. These topic distributions are then used as input to our model to represent situational contexts.
    Page 1, “Introduction”
  3. From an acquisition perspective, the observed topic distribution represents the child’s knowledge of the context of the interaction: she can distinguish bathtime from dinnertime, and is able to recognize that some topics appear in certain contexts (e.g. animals on walks, vegetables at dinnertime) and not in others (few vegetables appear at bath-time).
    Page 3, “Background and overview of models”
  4. Conversely, potential minimal pairs that occur in situations with similar topic distributions are more likely to belong to the same topic and thus the same lexeme.
    Page 3, “Background and overview of models”
  5. Although we assume that children infer topic distributions from the nonlinguistic environment, we will use transcripts from CHILDES to create the word/phone learning input for our model.
    Page 3, “Background and overview of models”
  6. We therefore obtain the topic distributions used as input to the TLD model by training an LDA topic model (Blei et al., 2003) on a superset of the child-directed transcript data used for lexical-phonetic learning.
    Page 3, “Background and overview of models”
  7. There are a fixed number of lower level topic-lexicons; these are matched to the number of topics in the LDA model used to infer the topic distributions (see Section 6.4).
    Page 5, “Topic-Lexical-Distributional Model”
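
As a concrete illustration of how per-situation topic distributions of this kind could be obtained, here is a minimal sketch using gensim's LDA implementation on a few toy "situations"; the tokenization, number of topics, and variable names are illustrative assumptions rather than the paper's setup.

    from gensim import corpora, models

    # Each "situation" is a small chunk of a child-directed transcript,
    # already tokenized; these three toy situations stand in for the corpus.
    situations = [
        ["duck", "bath", "water", "splash", "duck"],
        ["dinner", "carrot", "spoon", "eat", "carrot"],
        ["dog", "walk", "leash", "park", "dog"],
    ]

    dictionary = corpora.Dictionary(situations)
    bows = [dictionary.doc2bow(s) for s in situations]

    # Fit LDA; the number of topics here is arbitrary for illustration.
    lda = models.LdaModel(bows, id2word=dictionary, num_topics=2,
                          passes=50, random_state=0)

    # Per-situation topic distributions (theta). In the TLD model these are
    # treated as observed input; the topic-word distributions are discarded.
    thetas = [lda.get_document_topics(bow, minimum_probability=0.0)
              for bow in bows]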


LDA

Appears in 6 sentences as: LDA (7)
  1. training an LDA topic model (Blei et al., 2003) on a superset of the child-directed transcript data we use for lexical-phonetic learning, dividing the transcripts into small sections (the ‘documents’ in LDA) that serve as our distinct situations h. As noted above, the learned document-topic distributions θ are treated as observed variables in the TLD model to represent the situational context.
    Page 4, “Background and overview of models”
  2. The topic-word distributions learned by LDA are discarded, since these are based on the (correct and unambiguous) words in the transcript, whereas the TLD model is presented with phonetically ambiguous versions of these word tokens and must learn to disambiguate them and associate them with topics.
    Page 4, “Background and overview of models”
  3. There are a fixed number of lower level topic-lexicons; these are matched to the number of topics in the LDA model used to infer the topic distributions (see Section 6.4).
    Page 5, “Topic-Lexical-Distributional Model”
  4. The first factor, the prior probability of topic k in document h, is given by θ_hk, obtained from the LDA.
    Page 6, “Inference: Gibbs Sampling”
  5. The input to the TLD model includes a distribution over topics for each situation, which we infer in advance from the full Brent corpus (not only the C1 subset) using LDA .
    Page 7, “Experiments”
  6. Regardless of the specific way in which infants encode semantic information, our method of adding this information by using LDA topics from transcript data was shown to be effective.
    Page 9, “Conclusion”
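
Item 4 above describes how, during Gibbs sampling, a token's topic assignment combines the observed LDA prior for its situation with the token's likelihood under each topic's lexicon. A tiny sketch of that combination, with made-up numbers standing in for the model's actual likelihood terms:

    import numpy as np

    rng = np.random.default_rng(0)

    # Observed LDA prior for situation h: probability of each topic k (theta_hk).
    # Values below are invented for illustration.
    theta_h = np.array([0.6, 0.3, 0.1])

    # Hypothetical likelihood of the current word token under each topic's
    # lexicon; in the TLD model this comes from the topic-specific lexicons,
    # here it is just a placeholder vector.
    token_lik = np.array([0.002, 0.010, 0.001])

    # Sampling probability for the token's topic assignment:
    # prior (theta_hk) times likelihood, renormalized over topics.
    p = theta_h * token_lik
    p /= p.sum()
    new_topic = rng.choice(len(p), p=p)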


topic model

Appears in 4 sentences as: topic model (4)
  1. However, in our simulations we approximate the environmental information by running a topic model (Blei et al., 2003) over a corpus of child-directed speech to infer a topic distribution for each situation.
    Page 1, “Introduction”
  2. To demonstrate the benefit of situational information, we develop the Topic-Lexical-Distributional (TLD) model, which extends the LD model by assuming that words appear in situations analogous to documents in a topic model .
    Page 3, “Background and overview of models”
  3. (2012) found that topics learned from similar transcript data using a topic model were strongly correlated with immediate activities and contexts.
    Page 3, “Background and overview of models”
  4. training an LDA topic model (Blei et al., 2003) on a superset of the child-directed transcript data we use for lexical-phonetic learning, dividing the transcripts into small sections (the ‘documents’ in LDA) that serve as our distinct situations h. As noted above, the learned document-topic distributions θ are treated as observed variables in the TLD model to represent the situational context.
    Page 4, “Background and overview of models”
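
The entries above note that transcripts are divided into small sections that serve as the LDA ‘documents’. A minimal sketch of such a chunking step, assuming each transcript is a list of tokenized utterances and using an arbitrary fixed chunk size (the paper's actual segmentation criterion is not given in these excerpts):

    def chunk_into_situations(utterances, chunk_size=20):
        # Split one transcript (a list of tokenized utterances) into
        # consecutive chunks that play the role of LDA "documents".
        # The fixed chunk size is illustrative, not the paper's criterion.
        situations = []
        for start in range(0, len(utterances), chunk_size):
            chunk = utterances[start:start + chunk_size]
            # Flatten the utterances in this chunk into one bag of words.
            situations.append([w for utt in chunk for w in utt])
        return situations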


content words

Appears in 3 sentences as: content words (3)
  1. We restrict the corpus to content words by retaining only words tagged as adj, n, part and v (adjectives, nouns, particles, and verbs).
    Page 6, “Experiments”
  2. As well as function words, we also remove the five most frequent content words (be, go, get, want, come).
    Page 7, “Experiments”
  3. On average, situations are only 59 words long, reflecting the relative lack of content words in CDS utterances.
    Page 7, “Experiments”
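
A short sketch of the content-word filtering quoted above, assuming tokens come paired with CHILDES-style part-of-speech tags; the tag set and removed words follow the sentences above, while the data structures are illustrative:

    CONTENT_TAGS = {"adj", "n", "part", "v"}                 # adjectives, nouns, particles, verbs
    REMOVED_FREQUENT = {"be", "go", "get", "want", "come"}   # five most frequent content words

    def filter_content_words(tagged_utterances):
        # Keep only content words, then drop the five most frequent ones,
        # mirroring the preprocessing described in the sentences above.
        return [[w for (w, tag) in utt
                 if tag in CONTENT_TAGS and w not in REMOVED_FREQUENT]
                for utt in tagged_utterances]

    # Toy example: one utterance given as (word, tag) pairs.
    utts = [[("want", "v"), ("the", "det"), ("red", "adj"), ("ball", "n")]]
    print(filter_content_words(utts))   # [['red', 'ball']]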
