PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
Johnson, Mark

Article Structure

Abstract

This paper establishes a connection between two apparently very different kinds of probabilistic models.

Introduction

Over the last few years there has been considerable interest in Bayesian inference for complex hierarchical models both in machine learning and in computational linguistics.

Latent Dirichlet Allocation Models

Latent Dirichlet Allocation (LDA) was introduced as an explicit probabilistic counterpart to Latent Semantic Indexing (LSI) (Blei et al., 2003).
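
To make the generative story concrete, below is a minimal sketch of the standard LDA generative process; the corpus size, vocabulary and hyperparameter values are illustrative, and theta and phi are the usual per-document topic mixtures and per-topic word distributions.

    import numpy as np

    rng = np.random.default_rng(0)

    vocab = ["grammar", "tree", "topic", "word", "parse", "rule"]  # illustrative vocabulary
    n_topics, n_docs, doc_len = 2, 3, 8
    alpha, beta = 0.5, 0.1  # symmetric Dirichlet hyperparameters (illustrative values)

    # One word distribution phi_t per topic, drawn from Dirichlet(beta).
    phi = rng.dirichlet(beta * np.ones(len(vocab)), size=n_topics)

    documents = []
    for d in range(n_docs):
        theta_d = rng.dirichlet(alpha * np.ones(n_topics))  # per-document topic mixture
        doc = []
        for _ in range(doc_len):
            z = rng.choice(n_topics, p=theta_d)   # draw a topic for this word
            w = rng.choice(len(vocab), p=phi[z])  # draw a word from that topic
            doc.append(vocab[w])
        documents.append(doc)

    print(documents)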

Probabilistic Context-Free Grammars

Context-Free Grammars are a simple model of hierarchical structure often used to describe natural language syntax.
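
As a concrete illustration, the sketch below defines a toy PCFG and scores a tree as the product of the probabilities of the rules it uses; the grammar and sentence are invented for illustration.

    # Toy PCFG: P(rhs | lhs); the probabilities for each left-hand side sum to one.
    pcfg = {
        "S": {("NP", "VP"): 1.0},
        "NP": {("she",): 0.4, ("the", "dog"): 0.6},
        "VP": {("barks",): 0.3, ("sees", "NP"): 0.7},
    }

    def tree_prob(tree):
        """A tree is (label, child, ...); a leaf is a plain string.
        Its probability is the product of the probabilities of the rules it uses."""
        if isinstance(tree, str):
            return 1.0
        label, *children = tree
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        p = pcfg[label][rhs]
        for child in children:
            p *= tree_prob(child)
        return p

    # 1.0 * 0.4 * 0.7 * 0.6 = 0.168
    print(tree_prob(("S", ("NP", "she"), ("VP", "sees", ("NP", "the", "dog")))))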

LDA topic models as PCFGs

This section explains how to construct a PCFG that generates the same distribution over a collection of documents as an LDA model, and for which Bayesian inference of the PCFG’s rule probabilities yields the same distributions as Bayesian inference for the corresponding LDA model.
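
The sketch below shows one way such an encoding can be generated; the nonterminal names Doc_d and Topic_t and the document-identifier terminal _d are chosen here for illustration. Each document d is presented to the grammar as the string "_d w1 ... wn", the left-recursive rules Doc_d -> Doc_d Topic_t carry the per-document topic weights theta, and the rules Topic_t -> w carry the per-topic word weights phi.

    def lda_as_pcfg(doc_ids, topics, vocab):
        """Emit the rules of a PCFG whose derivations mimic an LDA model.
        Rule probabilities (left unspecified here) play the roles of the LDA
        parameters: theta_{d,t} for Doc_d -> Doc_d Topic_t, and
        phi_{t,w} for Topic_t -> w."""
        rules = []
        for d in doc_ids:
            rules.append(("Sentence", (f"Doc_{d}",)))  # pick a document
            rules.append((f"Doc_{d}", (f"_{d}",)))     # document-identifier terminal
            for t in topics:
                rules.append((f"Doc_{d}", (f"Doc_{d}", f"Topic_{t}")))  # weight theta_{d,t}
        for t in topics:
            for w in vocab:
                rules.append((f"Topic_{t}", (w,)))     # weight phi_{t,w}
        return rules

    for lhs, rhs in lda_as_pcfg(["1", "2"], ["0", "1"], ["grammar", "topic"]):
        print(lhs, "->", " ".join(rhs))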

Adaptor Grammars

Nonparametric Bayesian inference, where the inference task involves learning not just the values of a finite vector of parameters but which parameters are relevant, has been the focus of intense research in machine learning recently.
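
A minimal illustration of this nonparametric flavour is the Chinese restaurant process, where the number of clusters (standing in here for the cached subtrees whose probabilities an adaptor grammar learns) is not fixed in advance but grows with the data; the sketch below is purely illustrative and is not the inference algorithm used for adaptor grammars.

    import random

    random.seed(0)

    def crp_table_sizes(n_customers, alpha=1.0):
        """Chinese restaurant process: each customer joins an existing table with
        probability proportional to its size, or opens a new table with probability
        proportional to alpha, so the number of tables (here, 'relevant parameters')
        is unbounded and grows with the data."""
        tables = []  # tables[k] = number of customers at table k
        for _ in range(n_customers):
            weights = tables + [alpha]
            r = random.uniform(0, sum(weights))
            acc = 0.0
            for k, w in enumerate(weights):
                acc += w
                if r <= acc:
                    break
            if k == len(tables):
                tables.append(1)  # open a new table: a new parameter becomes relevant
            else:
                tables[k] += 1
        return tables

    print(crp_table_sizes(25))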

Topic models with collocations

Here we combine ideas from the unigram word segmentation adaptor grammar above and the PCFG encoding of LDA topic models to present a novel topic model that learns topical collocations.
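
Sketched below is one plausible shape for such a grammar, combining the LDA-as-PCFG encoding above with a word-segmentation-style recursion so that each topic generates multi-word sequences; the rule names, and the choice of which nonterminals are adapted, are assumptions made here for illustration rather than the paper's exact grammar.

    def collocation_topic_grammar(doc_ids, topics, vocab):
        """Rules for an illustrative collocation topic model.  The intention is
        that Topic_t is adapted, so whole Words_t subtrees (i.e. collocations)
        receive their own cached probabilities."""
        rules = []
        for d in doc_ids:
            rules.append(("Sentence", (f"Doc_{d}",)))
            rules.append((f"Doc_{d}", (f"_{d}",)))
            for t in topics:
                rules.append((f"Doc_{d}", (f"Doc_{d}", f"Topic_{t}")))
        for t in topics:
            rules.append((f"Topic_{t}", (f"Words_{t}",)))  # Topic_t adapted: caches collocations
            rules.append((f"Words_{t}", (f"Word_{t}",)))   # a collocation is one or more words
            rules.append((f"Words_{t}", (f"Words_{t}", f"Word_{t}")))
            for w in vocab:
                rules.append((f"Word_{t}", (w,)))
        return rules

    for lhs, rhs in collocation_topic_grammar(["1"], ["0", "1"], ["barack", "obama"]):
        print(lhs, "->", " ".join(rhs))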

Finding the structure of proper names

Grammars offer structural and positional sensitivity that is not exploited in the basic LDA topic models.
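
For instance, a grammar for personal names can give each position in the name its own category, so that (in an adaptor grammar) the inventory of words filling each position is learned from data; the component names and optionality pattern below are assumptions for illustration, not the paper's exact rules.

    def name_structure_grammar(words):
        """Rules for an illustrative grammar of personal-name structure: each
        positional component (Honorific, First, Middle, Surname) is a separate
        category, and in an adaptor grammar each would be adapted so that its
        inventory of realisations is learned from data."""
        rules = [
            ("Name", ("First", "Surname")),
            ("Name", ("Honorific", "First", "Surname")),
            ("Name", ("First", "Middle", "Surname")),
            ("Name", ("Honorific", "First", "Middle", "Surname")),
        ]
        for cat in ("Honorific", "First", "Middle", "Surname"):
            for w in words:
                rules.append((cat, (w,)))
        return rules

    for lhs, rhs in name_structure_grammar(["dr", "mark", "johnson"]):
        print(lhs, "->", " ".join(rhs))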

Conclusion

This paper establishes a connection between two very different kinds of probabilistic models: LDA models of the kind used for topic modelling, and PCFGs, which are a standard model of hierarchical structure in language.

Topics

LDA

Appears in 51 sentences as: LDA (60)
In PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
  1. Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a low-dimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees.
    Page 1, “Abstract”
  2. The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well.
    Page 1, “Abstract”
  3. Exploiting the close relationship between LDA and PCFGs just described, we propose two novel probabilistic models that combine insights from LDA and AG models.
    Page 1, “Abstract”
  4. The first replaces the unigram component of LDA topic models with multi-word sequences or collocations generated by an AG.
    Page 1, “Abstract”
  5. Specifically, we show that an LDA model can be expressed as a certain kind of PCFG,
    Page 1, “Introduction”
  6. so Bayesian inference for PCFGs can be used to learn LDA topic models as well.
    Page 1, “Introduction”
  7. The importance of this observation is primarily theoretical, as current Bayesian inference algorithms for PCFGs are less efficient than those for LDA inference.
    Page 1, “Introduction”
  8. However, once this link is established it suggests a variety of extensions to the LDA topic models, two of which we explore in this paper.
    Page 1, “Introduction”
  9. The first involves extending the LDA topic model so that it generates collocations (sequences of words) rather than individual words.
    Page 1, “Introduction”
  10. The next section reviews Latent Dirichlet Allocation (LDA) topic models, and the following section reviews Probabilistic Context-Free Grammars (PCFGs).
    Page 1, “Introduction”
  11. Section 4 shows how an LDA topic model can be expressed as a PCFG, which provides the fundamental connection between LDA and PCFGs that we exploit in the rest of the paper, and shows how it can be used to define a “sticky topic” version of LDA.
    Page 1, “Introduction”

topic models

Appears in 26 sentences as: topic model (8) topic modelling (1) Topic Models (1) topic models (19) “topic” model (1) “topic models” (1)
In PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
  1. Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a low-dimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees.
    Page 1, “Abstract”
  2. The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well.
    Page 1, “Abstract”
  3. The first replaces the unigram component of LDA topic models with multi-word sequences or collocations generated by an AG.
    Page 1, “Abstract”
  4. so Bayesian inference for PCFGs can be used to learn LDA topic models as well.
    Page 1, “Introduction”
  5. However, once this link is established it suggests a variety of extensions to the LDA topic models, two of which we explore in this paper.
    Page 1, “Introduction”
  6. The first involves extending the LDA topic model so that it generates collocations (sequences of words) rather than individual words.
    Page 1, “Introduction”
  7. The next section reviews Latent Dirichlet Allocation (LDA) topic models, and the following section reviews Probabilistic Context-Free Grammars (PCFGs).
    Page 1, “Introduction”
  8. Section 4 shows how an LDA topic model can be expressed as a PCFG, which provides the fundamental connection between LDA and PCFGs that we exploit in the rest of the paper, and shows how it can be used to define a “sticky topic” version of LDA.
    Page 1, “Introduction”
  9. Section 6 exploits the connection between LDA and PCFGs to propose an AG-based topic model that extends LDA by defining distributions over collocations rather than individual words, and section 7 applies this extension to the problem of finding the structure of proper names.
    Page 1, “Introduction”
  10. Figure 1: A graphical model “plate” representation of an LDA topic model.
    Page 2, “Latent Dirichlet Allocation Models”
  11. Figure 2: A tree generated by the CFG encoding an LDA topic model.
    Page 4, “LDA topic models as PCFGs”

probabilistic models

Appears in 5 sentences as: probabilistic model (1) probabilistic models (4)
In PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
  1. This paper establishes a connection between two apparently very different kinds of probabilistic models.
    Page 1, “Abstract”
  2. Exploiting the close relationship between LDA and PCFGs just described, we propose two novel probabilistic models that combine insights from LDA and AG models.
    Page 1, “Abstract”
  3. This paper establishes a theoretical connection between two very different kinds of probabilistic models: Probabilistic Context-Free Grammars (PCFGs) and a class of models known as Latent Dirichlet Allocation (Blei et al., 2003; Griffiths and Steyvers, 2004) models that have been used for a variety of tasks in machine learning.
    Page 1, “Introduction”
  4. An LDA model is an explicit generative probabilistic model of a collection of documents.
    Page 2, “Latent Dirichlet Allocation Models”
  5. This paper establishes a connection between two very different kinds of probabilistic models: LDA models of the kind used for topic modelling, and PCFGs, which are a standard model of hierarchical structure in language.
    Page 8, “Conclusion”

machine learning

Appears in 3 sentences as: machine learning (3)
In PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
  1. Over the last few years there has been considerable interest in Bayesian inference for complex hierarchical models both in machine learning and in computational linguistics.
    Page 1, “Introduction”
  2. This paper establishes a theoretical connection between two very different kinds of probabilistic models: Probabilistic Context-Free Grammars (PCFGs) and a class of models known as Latent Dirichlet Allocation (Blei et al., 2003; Griffiths and Steyvers, 2004) models that have been used for a variety of tasks in machine learning.
    Page 1, “Introduction”
  3. Nonparametric Bayesian inference, where the inference task involves learning not just the values of a finite vector of parameters but which parameters are relevant, has been the focus of intense research in machine learning recently.
    Page 5, “Adaptor Grammars”

subtrees

Appears in 3 sentences as: subtrees (3)
In PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
  1. Adaptor grammars are an example of this approach (Johnson et al., 2007b), where entire subtrees generated by a “base grammar” can be viewed as distinct rules (in that we learn a separate probability for each subtree).
    Page 5, “Adaptor Grammars”
  2. The inference task is nonparametric if there are an unbounded number of such subtrees.
    Page 5, “Adaptor Grammars”
  3. (Word s i) (Word d 6) (Word b u k). Because the Word nonterminal is adapted (indicated here by underlining), the adaptor grammar learns the probability of entire Word subtrees (e.g., the probability that b u k is a Word); see Johnson (2008) for further details.
    Page 6, “Adaptor Grammars”
