Nonparametric Learning of Phonological Constraints in Optimality Theory

We present a method to jointly learn features and weights directly from distributional data in a log-linear framework.

Many aspects of human cognition involve the interaction of constraints that push a decision-maker toward different options, whether in something so trivial as choosing a movie or so important as a fight-or-flight response.

Optimality Theory has been used for constraint-based analysis of many areas of language, but we focus on its most successful application: phonology.

3.1 Structure

4.1 Wolof vowel harmony

5.1 Relation to phonotactic learning

A central assumption of Optimality Theory has been the existence of a fixed inventory of universal markedness constraints innately available to the learner, an assumption supported by arguments regarding the computational complexity of constraint identification.

Appears in 6 sentences as: log-linear (6)

In *Nonparametric Learning of Phonological Constraints in Optimality Theory*

- We present a method to jointly learn features and weights directly from distributional data in a log-linear framework. (Page 1, “Abstract”)
- The model uses an Indian Buffet Process prior to learn the feature values used in the log-linear method, and is the first algorithm for learning phonological constraints without presupposing constraint structure. (Page 1, “Abstract”)
- These constraint-driven decisions can be modeled with a log-linear system. (Page 1, “Introduction”)
- We consider this question by examining the dominant framework in modern phonology, Optimality Theory (Prince and Smolensky, 1993, OT), implemented in a log-linear framework, MaxEnt OT (Goldwater and Johnson, 2003), with output forms’ probabilities based on a weighted sum of … (Page 1, “Introduction”)
- In IBPOT, we use the log-linear EVAL developed by Goldwater and Johnson (2003) in their MaxEnt OT system. (Page 2, “Phonology and Optimality Theory 2.1 OT structure”)
- The weight vector w provides weight for both F and M. Probabilities of output forms are given by a log-linear function: (Page 4, “The IBPOT Model”)

Appears in 5 sentences as: MaxEnt (5)

In *Nonparametric Learning of Phonological Constraints in Optimality Theory*

- We consider this question by examining the dominant framework in modern phonology, Optimality Theory (Prince and Smolensky, 1993, OT), implemented in a log-linear framework, MaxEnt OT (Goldwater and Johnson, 2003), with output forms’ probabilities based on a weighted sum of … (Page 1, “Introduction”)
- In IBPOT, we use the log-linear EVAL developed by Goldwater and Johnson (2003) in their MaxEnt OT system. (Page 2, “Phonology and Optimality Theory 2.1 OT structure”)
- MEOT also is motivated by the general MaxEnt framework, whereas most other OT formulations are ad hoc constructions specific to phonology. (Page 2, “Phonology and Optimality Theory 2.1 OT structure”)
- In MaxEnt OT, each constraint has a weight, and the candidates’ scores are the sums of the weights of violated constraints. (Page 3, “Phonology and Optimality Theory 2.1 OT structure”)
- To establish performance for the phonological standard, we use the IBPOT learner to find constraint weights but do not update M. The resultant learner is essentially MaxEnt OT with the weights estimated through Metropolis sampling instead of gradient ascent. (Page 6, “Experiment”)
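
The scoring rule quoted above (each constraint has a weight, and a candidate's score is the sum of the weights of the constraints it violates) can be illustrated with a minimal sketch. It assumes the usual MaxEnt convention that a candidate's probability is proportional to the exponential of its negated score, and that a constraint violated several times contributes its weight once per violation. The function name, candidate labels, and numbers are hypothetical and are not taken from the paper.

```python
import math


def maxent_probabilities(violations, weights):
    """Return a probability for each output candidate of one input form.

    violations: dict mapping candidate -> list of violation counts,
                one count per constraint (hypothetical toy format).
    weights:    list of non-negative constraint weights, same order.
    """
    # Penalty score: weighted sum of violations (higher is worse).
    scores = {
        cand: sum(w * v for w, v in zip(weights, counts))
        for cand, counts in violations.items()
    }
    # Shift by the best score before exponentiating, for numerical stability.
    best = min(scores.values())
    unnormalized = {cand: math.exp(-(s - best)) for cand, s in scores.items()}
    z = sum(unnormalized.values())
    return {cand: u / z for cand, u in unnormalized.items()}


# Hypothetical toy example: two candidates, two constraints.
violations = {"cand_a": [0, 1], "cand_b": [1, 0]}
weights = [2.0, 0.5]
probs = maxent_probabilities(violations, weights)
```

Here `cand_a` violates only the low-weight constraint, so it receives the larger share of the probability mass.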

Appears in 5 sentences as: weight vector (5)

In *Nonparametric Learning of Phonological Constraints in Optimality Theory*

- The IBPOT model defines a generative process for mappings between input and output forms based on three latent variables: the constraint violation matrices F (faithfulness) and M (markedness), and the weight vector w. The cells of the violation matrices correspond to the number of violations of a constraint by a given input-output mapping. (Page 4, “The IBPOT Model”)
- The weight vector w provides weight for both F and M. Probabilities of output forms are given by a log-linear function: (Page 4, “The IBPOT Model”)
- We initialize the model with a randomly-drawn markedness violation matrix M and weight vector w. (Page 4, “The IBPOT Model”)
- After each iteration through M, we use Metropolis-Hastings to update the weight vector w. (Page 4, “The IBPOT Model”)
- Table 1: Data, markedness matrix, weight vector, and joint log-probabilities for the IBPOT and the phonological standard constraints. (Page 7, “Experiment”)
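
The Metropolis-Hastings update of w mentioned above might look roughly like the following sketch. The excerpts do not state the proposal distribution or the prior on w, so this version assumes a symmetric Gaussian random-walk proposal and a flat prior over non-negative weights; the data format, step size, and function names are likewise hypothetical.

```python
import math
import random


def log_likelihood(weights, data):
    """Log-probability of the observed outputs under a log-linear model.

    data: list of (rows, winner, count) where rows holds one list of
    violation counts per candidate, winner is the index of the observed
    candidate, and count is how often it was observed (toy format).
    """
    total = 0.0
    for rows, winner, count in data:
        scores = [sum(w * v for w, v in zip(weights, row)) for row in rows]
        best = min(scores)
        # log of the normalizer sum_c exp(-score_c), computed stably.
        log_z = -best + math.log(sum(math.exp(-(s - best)) for s in scores))
        total += count * (-scores[winner] - log_z)
    return total


def mh_step(weights, data, rng, step=0.1):
    """One random-walk Metropolis-Hastings update of the weight vector."""
    proposal = [w + rng.gauss(0.0, step) for w in weights]
    # Assumed flat prior on non-negative weights: reject anything outside it.
    if any(w < 0 for w in proposal):
        return weights
    log_ratio = log_likelihood(proposal, data) - log_likelihood(weights, data)
    # Symmetric proposal, so the acceptance probability is min(1, ratio).
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal
    return weights


# Hypothetical toy data: one input, two candidates, first one observed 10 times.
rng = random.Random(0)
data = [([[0, 1], [1, 0]], 0, 10)]
w = [1.0, 1.0]
for _ in range(200):
    w = mh_step(w, data, rng)
```

In a full sampler this step would alternate with resampling of the violation matrix M; only the weight update is shown here.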

Appears in 3 sentences as: generative process (3)

In *Nonparametric Learning of Phonological Constraints in Optimality Theory*

- The IBPOT model defines a generative process for mappings between input and output forms based on three latent variables: the constraint violation matrices F (faithfulness) and M (markedness), and the weight vector w. The cells of the violation matrices correspond to the number of violations of a constraint by a given input-output mapping. (Page 4, “The IBPOT Model”)
- Represented constraint sampling: We begin by resampling $M_{jl}$ for all represented constraints $M_{\cdot l}$, conditioned on the rest of the violations ($M_{-(jl)}$, F) and the weights w. This is the sampling counterpart of drawing existing features in the IBP generative process. (Page 4, “The IBPOT Model”)
- This is the sampling counterpart to the Poisson draw for new features in the IBP generative process. (Page 5, “The IBPOT Model”)
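
The two IBP steps referred to above, drawing existing features and a Poisson draw for new ones, can be sketched with the standard Indian Buffet Process prior over binary matrices. This is the textbook generative process, not the IBPOT sampler itself: row i takes each existing feature k with probability m_k / i, where m_k is the number of earlier rows that have it, and then adds Poisson(alpha / i) new features. The function names and the value of alpha are assumptions made for illustration.

```python
import math
import random


def poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's method (fine for small rates)."""
    limit = math.exp(-rate)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_ibp(num_rows, alpha, rng):
    """Sample a binary feature matrix (list of rows) from an IBP prior."""
    counts = []  # counts[k] = number of rows so far that have feature k
    rows = []
    for i in range(1, num_rows + 1):
        # Existing features: take feature k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        # New features: Poisson(alpha / i) columns owned by this row.
        new = poisson(alpha / i, rng)
        counts = [m + r for m, r in zip(counts, row)] + [1] * new
        rows.append(row + [1] * new)
    # Pad earlier rows with zeros so every row has one entry per feature.
    width = len(counts)
    return [r + [0] * (width - len(r)) for r in rows]


# Hypothetical usage: 5 rows, concentration alpha = 2.0.
Z = sample_ibp(5, 2.0, random.Random(1))
```

The number of columns is not fixed in advance, which is what lets a model with this prior learn how many constraints to represent.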
