A Provably Correct Learning Algorithm for Latent-Variable PCFGs

We introduce a provably correct learning algorithm for latent-variable PCFGs.

Latent-variable PCFGs (L-PCFGs) (Matsuzaki et al., 2005; Petrov et al., 2006) give state-of-the-art performance on parsing problems.

Recently a number of researchers have developed provably correct algorithms for parameter estimation in latent variable models such as hidden Markov models, topic models, directed graphical models with latent variables, and so on (Hsu et al., 2009; Bailly et al., 2010; Siddiqi et al., 2010; Parikh et al., 2011; Balle et al., 2011; Arora et al., 2013; Dhillon et al., 2012; Anandkumar et al., 2012; Arora et al., 2012; Arora et al., 2013).

This section gives definitions and notation for L-PCFGs, taken from Cohen et al. (2012).

Our goal is to design a learning algorithm for L-PCFGs.

This section describes the matrix decomposition algorithm used in Step 1 of the learning algorithm.

6.1 Recovery of the π and q Parameters

This section describes parsing experiments using the learning algorithm for L-PCFGs.

Appears in 9 sentences as: Learning Algorithm (1) learning algorithm (7) learning algorithms (1)

In *A Provably Correct Learning Algorithm for Latent-Variable PCFGs*

- We introduce a provably correct learning algorithm for latent-variable PCFGs. (Page 1, "Abstract")
- Our goal is to design a learning algorithm for L-PCFGs. (Page 2, "The Learning Algorithm for L-PCFGs")
- 4.2 The Learning Algorithm (Page 3, "The Learning Algorithm for L-PCFGs")
- Figure 1 shows the learning algorithm for L-PCFGs. (Page 3, "The Learning Algorithm for L-PCFGs")
- This section describes the matrix decomposition algorithm used in Step 1 of the learning algorithm. (Page 5, "The Matrix Decomposition Algorithm")
- The learning algorithm for L-PCFGs can be used as an initializer for the EM algorithm for L-PCFGs. (Page 8, "Additional Details of the Algorithm")
- This section describes parsing experiments using the learning algorithm for L-PCFGs. (Page 8, "Experiments on Parsing")
- Table 1: Results on the development data (section 22) and test data (section 23) for various learning algorithms for L-PCFGs. (Page 8, "Experiments on Parsing")
- In this special case, the L-PCFG learning algorithm is equivalent to a simple algorithm, with the following steps: 1) define the matrix Q with entries Q(w1, w2) = count(w1, w2)/N, where count(w1, w2) is the number of times that the bigram (w1, w2) is seen in the data, and N = Σ over (w1, w2) of count(w1, w2). (Page 9, "Experiments on Parsing")
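The bigram-matrix construction in the last bullet can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `bigram_matrix` and the corpus format (a list of tokenized sentences) are assumptions, and the matrix Q is stored sparsely as a dict keyed by bigram rather than as a dense array.

```python
from collections import Counter

def bigram_matrix(corpus):
    """Build Q with entries Q(w1, w2) = count(w1, w2) / N, where
    count(w1, w2) is the number of occurrences of bigram (w1, w2)
    and N is the total number of bigram tokens in the corpus."""
    counts = Counter()
    for sentence in corpus:
        # zip a sentence with itself shifted by one to enumerate bigrams
        for w1, w2 in zip(sentence, sentence[1:]):
            counts[(w1, w2)] += 1
    N = sum(counts.values())
    return {pair: c / N for pair, c in counts.items()}
```

By construction the entries of Q sum to one, i.e. Q is the empirical joint distribution over adjacent word pairs.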

See all papers in *Proc. ACL 2014* that mention learning algorithm.

See all papers in *Proc. ACL* that mention learning algorithm.

Appears in 7 sentences as: language model (1) Language Modeling (1) language modeling (5)

In *A Provably Correct Learning Algorithm for Latent-Variable PCFGs*

- Experiments on parsing and a language modeling problem show that the algorithm is efficient and effective in practice. (Page 1, "Abstract")
- We describe experiments on learning of L-PCFGs, and also on learning of the latent-variable language model of Saul and Pereira (1997). (Page 1, "Introduction")
- 8 Experiments on the Saul and Pereira (1997) Model for Language Modeling (Page 8, "Experiments on Parsing")
- We now describe a second set of experiments, on the Saul and Pereira (1997) model for language modeling. (Page 8, "Experiments on Parsing")
- We performed the language modeling experiments for a number of reasons. (Page 9, "Experiments on Parsing")
- Second, it allows us to test the pivot method on the very large datasets that are available for language modeling. (Page 9, "Experiments on Parsing")
- (2006) in language modeling experiments. (Page 9, "Experiments on Parsing")

Appears in 5 sentences as: optimization problem (4) optimization problems (1)

In *A Provably Correct Learning Algorithm for Latent-Variable PCFGs*

- We show that once the matrix decomposition step has been applied, parameter estimation of the L-PCFG can be reduced to a convex optimization problem that is easily solved by EM. (Page 1, "Introduction")
- Crucially, this is a convex optimization problem, and the EM algorithm will converge to the global maximum of this likelihood function. (Page 3, "The Learning Algorithm for L-PCFGs")
- Now consider the optimization problem in Eq. (Page 4, "The Learning Algorithm for L-PCFGs")
- al., 2012) gives one set of guarantees; the remaining optimization problems we solve are convex maximum-likelihood problems, which are also relatively easy to analyze. (Page 5, "The Learning Algorithm for L-PCFGs")
- This is again a convex optimization problem. (Page 7, "The Matrix Decomposition Algorithm")
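The claim that EM reaches the global maximum once the problem is convex (equivalently, the log-likelihood is concave) can be illustrated on a simpler problem of the same shape: estimating mixture weights when the component distributions are already fixed, much as the decomposition step fixes part of the model before EM runs. This is an analogy, not the paper's estimation step; `em_mixture_weights` and its interface are assumptions made for the sketch.

```python
def em_mixture_weights(points, components, iters=200):
    """EM for the weights of a mixture whose component distributions
    are fixed. The log-likelihood is then concave in the weights, so
    EM converges to the global maximum rather than a local one.
    `points` is a list of observations; `components` is a list of
    functions x -> p_k(x) giving each fixed component's probability."""
    K = len(components)
    lam = [1.0 / K] * K  # start from uniform mixture weights
    for _ in range(iters):
        totals = [0.0] * K
        for x in points:
            probs = [lam[k] * components[k](x) for k in range(K)]
            z = sum(probs)
            for k in range(K):
                totals[k] += probs[k] / z  # posterior responsibility
        # M-step: new weight = average responsibility
        lam = [t / len(points) for t in totals]
    return lam
```

Because the objective is concave in the weights, the starting point does not matter here, which is exactly the property the bullet above highlights for the post-decomposition estimation step.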

Appears in 4 sentences as: objective function (4)

In *A Provably Correct Learning Algorithm for Latent-Variable PCFGs*

- 2) Optimization of a convex objective function using EM. (Page 1, "Introduction")
- Step 2: Use the EM algorithm to find t̂ values that maximize the objective function in Eq. (Page 3, "The Learning Algorithm for L-PCFGs")
- Next, we modify the objective function in Eq. (Page 8, "Additional Details of the Algorithm")
- Thus the new objective function consists of a sum of L × M² terms, each corresponding to a different combination of inside and outside features. (Page 8, "Additional Details of the Algorithm")

Appears in 4 sentences as: topic models (4)

In *A Provably Correct Learning Algorithm for Latent-Variable PCFGs*

- (2012) in the context of learning topic models. (Page 1, "Introduction")
- Recently a number of researchers have developed provably correct algorithms for parameter estimation in latent variable models such as hidden Markov models, topic models, directed graphical models with latent variables, and so on (Hsu et al., 2009; Bailly et al., 2010; Siddiqi et al., 2010; Parikh et al., 2011; Balle et al., 2011; Arora et al., 2013; Dhillon et al., 2012; Anandkumar et al., 2012; Arora et al., 2012; Arora et al., 2013). (Page 1, "Related Work")
- Our work is most directly related to the algorithm for parameter estimation in topic models described by Arora et al. (Page 2, "Related Work")
- (2012) in the context of learning topic models. (Page 3, "The Learning Algorithm for L-PCFGs")

Appears in 3 sentences as: latent variable (2) latent variables (2)

In *A Provably Correct Learning Algorithm for Latent-Variable PCFGs*

- This matrix form has clear relevance to latent variable models. (Page 1, "Introduction")
- Recently a number of researchers have developed provably correct algorithms for parameter estimation in latent variable models such as hidden Markov models, topic models, directed graphical models with latent variables, and so on (Hsu et al., 2009; Bailly et al., 2010; Siddiqi et al., 2010; Parikh et al., 2011; Balle et al., 2011; Arora et al., 2013; Dhillon et al., 2012; Anandkumar et al., 2012; Arora et al., 2012; Arora et al., 2013). (Page 1, "Related Work")
- The training set does not include values for the latent variables; this is the main challenge in learning. (Page 2, "The Learning Algorithm for L-PCFGs")