Learning Syntactic Verb Frames using Graphical Models
Lippincott, Thomas and Korhonen, Anna and Ó Séaghdha, Diarmuid

Article Structure

Abstract

We present a novel approach for building verb subcategorization lexicons using a simple graphical model.

Introduction

Subcategorization frames (SCFs) give a compact description of a verb’s syntactic preferences.

Previous work

Many state-of-the-art SCF acquisition systems take grammatical relations (GRs) as input.

Methodology

In this section we describe the basic components of our study: feature sets, graphical model, inference, and evaluation.

Results

4.1 Verb clustering

Conclusions and future work

Our study reached two important conclusions: first, given the same data as input, an unsupervised probabilistic model can outperform a handcrafted rule-based SCF extractor with a predefined inventory.

Topics

gold standard

Appears in 12 sentences as: gold standard (10) gold standards (2)
In Learning Syntactic Verb Frames using Graphical Models
  1. Both rely on a filtering stage that depends on external resources and/or gold standards to select top-performing thresholds.
    Page 3, “Previous work”
  2. Finally, our task-based evaluation, verb clustering with Levin (1993)’s alternation classes as the gold standard, was previously conducted by Joanis and Stevenson (2003), Korhonen et al.
    Page 3, “Previous work”
  3. We extract instances for the 385 verbs in the union of our two gold standards from the VALEX lexicon’s data set, which was used in previous studies (Sun and Korhonen, 2009; Preiss et al., 2007) and facilitates comparison with that resource.
    Page 4, “Methodology”
  4. Finally, we set our SCF count to 40, about twice the size of the strictly syntactic general-language gold standard we describe in section 3.3.
    Page 5, “Methodology”
  5. Quantitative: cluster gold standard
    Page 5, “Methodology”
  6. Our gold standard is from (Sun and Korhonen, 2009), where 200 verbs were assigned to 17 classes based on their alternation patterns (Levin, 1993).
    Page 5, “Methodology”
  7. The clusters are then compared to the gold standard clusters with the purity-based F-Score from Sun and Korhonen (2009) and the more familiar Adjusted Rand Index (Hubert and Arabie, 1985).
    Page 5, “Methodology”
  8. Qualitative: manual gold standard
    Page 5, “Methodology”
  9. Third, we compared the output for several verbs to a coarsened version of the manually-annotated gold standard used to evaluate VALEX (Preiss et al., 2007).
    Page 6, “Methodology”
  10. The full tables needed to compare verb SCF distributions from our output with the manual gold standard are omitted for reasons of space, but a few examples reinforce the analysis above.
    Page 7, “Results”
  11. The verbs “load” and “fill” show particularly high usage of ditransitive SCFs in the gold standard.
    Page 7, “Results”
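The clustering evaluation described above (K-Means over verb feature vectors, scored against gold-standard classes with the Adjusted Rand Index) can be sketched in a few lines. The verb vectors and class labels below are illustrative assumptions, not the paper's data:

```python
# Minimal sketch of the task-based evaluation: cluster verbs by toy
# feature vectors with K-Means, then score the induced clusters against
# gold-standard classes with the Adjusted Rand Index (Hubert and Arabie,
# 1985). All data here is hypothetical.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# rows = verbs, columns = hypothetical SCF/feature proportions
X = np.array([[0.9, 0.1, 0.0],   # transitive-heavy verbs
              [0.8, 0.2, 0.0],
              [0.1, 0.1, 0.8],   # ditransitive-heavy verbs
              [0.0, 0.2, 0.8]])
gold = [0, 0, 1, 1]              # gold-standard (Levin-style) classes

induced = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# ARI is invariant to cluster relabeling and corrects for chance agreement
print(adjusted_rand_score(gold, induced))
```

The purity-based F-Score of Sun and Korhonen (2009) would be computed from the same cluster/class contingency table; ARI is shown here simply because it has a standard library implementation.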


POS tags

Appears in 12 sentences as: POS tag (1) POS tagging (5) POS tags (7)
  1. Second, by replacing the syntactic features with an approximation based on POS tags, we achieve state-of-the-art performance without relying on error-prone unlexicalized or domain-specific lexicalized parsers.
    Page 2, “Introduction”
  2. Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993).
    Page 3, “Previous work”
  3. Their study employed unsupervised POS tagging and parsing, and measures of selectional preference and argument structure as complementary features for the classifier.
    Page 3, “Previous work”
  4. The CONLL format is a common language for comparing output from dependency parsers: each lexical item has an index, lemma, POS tag, the tGR in which it is the dependent, and the index of the corresponding head.
    Page 3, “Methodology”
  5. Table 2 shows the three variations we tested: the simple tGR type, with parameterization for the POS tags of head and dependent, and with closed-class POS tags (determiners, pronouns and prepositions) lexicalized.
    Page 3, “Methodology”
  6. An unlexicalized parser cannot distinguish these based just on POS tags, while a lexicalized parser requires a large treebank.
    Page 4, “Methodology”
  7. As with tGRs, the closed-class tags can be lexicalized, but there are no corresponding feature sets for param (since they are already built from POS tags) or lim (since there is no similar rule-based approach).
    Page 4, “Methodology”
  8. Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model (Lippincott et al., 2010).
    Page 6, “Results”
  9. Third, lexicalizing the closed-class POS tags introduces semantic information outside the scope of the alternation-based definition of subcategorization.
    Page 6, “Results”
  10. Second, simply treating POS tags within a small window of the verb as pseudo-GRs produces state-of-the-art results without the need for a parsing model.
    Page 8, “Conclusions and future work”
  11. In fact, by integrating results from unsupervised POS tagging (Teichert and Daumé III, 2009) we could render this approach fully domain- and language-independent.
    Page 8, “Conclusions and future work”
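The CONLL format described in the items above can be read with a short parser. A minimal sketch, assuming CoNLL-X column order (index, form, lemma, POS tag, ..., head index, dependency relation) and a made-up example sentence:

```python
# Minimal sketch of reading CoNLL-X formatted dependency output: each
# token line carries an index, lemma, POS tag, the relation in which the
# token is the dependent, and the index of its head. The sample sentence
# and relation labels are hypothetical.
def parse_conll(text):
    sentences = []
    for block in text.strip().split("\n\n"):  # blank line separates sentences
        tokens = []
        for line in block.splitlines():
            cols = line.split("\t")
            tokens.append({
                "index": int(cols[0]),
                "lemma": cols[2],
                "pos": cols[3],
                "head": int(cols[6]),    # 0 = root
                "deprel": cols[7],
            })
        sentences.append(tokens)
    return sentences

sample = ("1\tThe\tthe\tDT\tDT\t_\t2\tdet\n"
          "2\tdog\tdog\tNN\tNN\t_\t3\tnsubj\n"
          "3\tbarks\tbark\tVBZ\tVBZ\t_\t0\troot")
print(parse_conll(sample)[0][1])  # the token for "dog", dependent of "barks"
```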


feature sets

Appears in 10 sentences as: feature set (2) feature sets (8)
  1. In this section we describe the basic components of our study: feature sets, graphical model, inference, and evaluation.
    Page 3, “Methodology”
  2. 3.1 Input and feature sets
    Page 3, “Methodology”
  3. We tested several feature sets either based on, or approximating, the concept of grammatical relation described in section 2.
    Page 3, “Methodology”
  4. We’ll use a simple example sentence to illustrate how our feature sets are extracted from CONLL-formatted data (Nivre et al., 2007).
    Page 3, “Methodology”
  5. We define the feature set for a verb occurrence as the counts of each GR the verb participates in.
    Page 3, “Methodology”
  6. In addition, we tested the effect of limiting the features to subject, object and complement tGRs, indicated by adding the subscript “lim”, for a total of six tGR-based feature sets.
    Page 3, “Methodology”
  7. As with tGRs, the closed-class tags can be lexicalized, but there are no corresponding feature sets for param (since they are already built from POS tags) or lim (since there is no similar rule-based approach).
    Page 4, “Methodology”
  8. Whichever feature set is used, an instance is sim-
    Page 4, “Methodology”
  9. To compare the performance of our feature sets, we chose the simple and familiar K-Means clustering algorithm (Hartigan and Wong, 1979).
    Page 5, “Methodology”
  10. We evaluated SCF lexicons based on the eight feature sets described in section 3.1, as well as the VALEX SCF lexicon described in section 2.
    Page 6, “Results”
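Item 5 above defines the feature set for a verb occurrence as the counts of each GR the verb participates in. A minimal sketch, assuming GRs come as (relation, head index, dependent index) triples; the example GRs are hypothetical:

```python
# Minimal sketch: a verb occurrence's feature set as the count of each GR
# type the verb participates in (as head or dependent). The GR triples
# below are hypothetical.
from collections import Counter

def verb_features(grs, verb_index):
    feats = Counter()
    for rel, head, dep in grs:
        if verb_index in (head, dep):
            feats[rel] += 1
    return feats

# "the dog gave the cat a ball", with "gave" as token 3
grs = [("det", 2, 1), ("nsubj", 3, 2), ("det", 5, 4),
       ("iobj", 3, 5), ("det", 7, 6), ("dobj", 3, 7)]
feats = verb_features(grs, 3)
print(dict(feats))  # nsubj, iobj and dobj counted once each; det ignored
```

Restricting the counted relations to subject, object and complement tGRs would give the “lim” variants mentioned in item 6.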


graphical model

Appears in 7 sentences as: graphical model (3) graphical modeling (1) Graphical models (1) graphical models (2)
  1. We present a novel approach for building verb subcategorization lexicons using a simple graphical model.
    Page 1, “Abstract”
  2. We discuss the advantages of graphical models for this task, in particular the ease of integrating semantic information about verbs and arguments in a principled fashion.
    Page 1, “Abstract”
  3. Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993).
    Page 3, “Previous work”
  4. In this section we describe the basic components of our study: feature sets, graphical model, inference, and evaluation.
    Page 3, “Methodology”
  5. Our graphical modeling approach uses the Bayesian network shown in Figure 1.
    Page 4, “Methodology”
  6. This is an example of how bad decisions made by the parser cannot be fixed by the graphical model, and an area where pGR features have an advantage.
    Page 8, “Results”
  7. Our initial attempt at applying graphical models to subcategorization also suggested several ways to extend and improve the method.
    Page 8, “Conclusions and future work”


lexicalized

Appears in 7 sentences as: lexicalized (6) lexicalizing (1)
  1. Second, by replacing the syntactic features with an approximation based on POS tags, we achieve state-of-the-art performance without relying on error-prone unlexicalized or domain-specific lexicalized parsers.
    Page 2, “Introduction”
  2. The BioLexicon system extracts each verb instance’s GRs using the lexicalized Enju parser tuned to the biomedical domain (Miyao, 2005).
    Page 3, “Previous work”
  3. The BioLexicon system induces its SCF inventory automatically, but requires a lexicalized parsing model, rendering it more sensitive to domain variation.
    Page 3, “Previous work”
  4. Table 2 shows the three variations we tested: the simple tGR type, with parameterization for the POS tags of head and dependent, and with closed-class POS tags (determiners, pronouns and prepositions) lexicalized.
    Page 3, “Methodology”
  5. An unlexicalized parser cannot distinguish these based just on POS tags, while a lexicalized parser requires a large treebank.
    Page 4, “Methodology”
  6. As with tGRs, the closed-class tags can be lexicalized, but there are no corresponding feature sets for param (since they are already built from POS tags) or lim (since there is no similar rule-based approach).
    Page 4, “Methodology”
  7. Third, lexicalizing the closed-class POS tags introduces semantic information outside the scope of the alternation-based definition of subcategorization.
    Page 6, “Results”


parsing model

Appears in 6 sentences as: parsing model (5) parsing models (1)
  1. However, the treebanks necessary for training a high-accuracy parsing model are expensive to build for new domains.
    Page 2, “Introduction”
  2. These typically rely on language-specific knowledge, either directly through heuristics, or indirectly through parsing models trained on treebanks.
    Page 2, “Previous work”
  3. Note that both methods require extensive manual work: the Preiss system involves the a priori definition of the SCF inventory, careful construction of matching rules, and an unlexicalized parsing model.
    Page 3, “Previous work”
  4. The BioLexicon system induces its SCF inventory automatically, but requires a lexicalized parsing model, rendering it more sensitive to domain variation.
    Page 3, “Previous work”
  5. Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model (Lippincott et al., 2010).
    Page 6, “Results”
  6. Second, simply treating POS tags within a small window of the verb as pseudo-GRs produces state-of-the-art results without the need for a parsing model.
    Page 8, “Conclusions and future work”
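The pseudo-GR idea in item 6 above (treating POS tags within a small window of the verb as if they were grammatical relations) can be sketched as follows; the window size and feature naming scheme are assumptions, not taken from the paper:

```python
# Minimal sketch of pseudo-GRs: count POS tags within +/- `window` tokens
# of the verb, keeping track of which side of the verb they fall on.
# Window size and feature names are illustrative assumptions.
from collections import Counter

def pseudo_grs(tagged, verb_index, window=2):
    feats = Counter()
    lo = max(0, verb_index - window)
    hi = min(len(tagged), verb_index + window + 1)
    for i in range(lo, hi):
        if i == verb_index:
            continue                      # skip the verb itself
        side = "L" if i < verb_index else "R"
        feats["%s:%s" % (side, tagged[i][1])] += 1
    return feats

# (token, POS tag) pairs; "chased" is the verb at index 2
tagged = [("the", "DT"), ("dog", "NN"), ("chased", "VBD"),
          ("the", "DT"), ("cat", "NN")]
feats = pseudo_grs(tagged, 2)
print(dict(feats))  # one DT and one NN on each side of the verb
```

Because this needs only a POS tagger, no parsing model (lexicalized or otherwise) is required, which is the point of item 6.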


distributional semantics

Appears in 3 sentences as: distributional semantics (3)
  1. Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993).
    Page 3, “Previous work”
  2. In a sense this is encouraging, as it motivates our most exciting future work: augmenting this simple model to explicitly capture complementary information such as distributional semantics (Blei et al., 2003), diathesis alternations (McCarthy, 2000) and selectional preferences (Ó Séaghdha, 2010).
    Page 8, “Conclusions and future work”
  3. By combining the syntactic classes with unsupervised POS tagging (Teichert and Daumé III, 2009) and the selectional preferences with distributional semantics (Ó Séaghdha, 2010), we hope to produce more accurate results on these complementary tasks while avoiding the use of any supervised learning.
    Page 8, “Conclusions and future work”


rule-based

Appears in 3 sentences as: rule-based (3)
  1. As with tGRs, the closed-class tags can be lexicalized, but there are no corresponding feature sets for param (since they are already built from POS tags) or lim (since there is no similar rule-based approach).
    Page 4, “Methodology”
  2. Table 4: Task-based evaluation of lexicons acquired with each of the eight feature types, and the state-of-the-art rule-based VALEX lexicon.
    Page 6, “Results”
  3. Our study reached two important conclusions: first, given the same data as input, an unsupervised probabilistic model can outperform a handcrafted rule-based SCF extractor with a predefined inventory.
    Page 8, “Conclusions and future work”


treebanks

Appears in 3 sentences as: treebank (1) treebanks (2)
  1. However, the treebanks necessary for training a high-accuracy parsing model are expensive to build for new domains.
    Page 2, “Introduction”
  2. These typically rely on language-specific knowledge, either directly through heuristics, or indirectly through parsing models trained on treebanks.
    Page 2, “Previous work”
  3. An unlexicalized parser cannot distinguish these based just on POS tags, while a lexicalized parser requires a large treebank.
    Page 4, “Methodology”
