Less Grammar, More Features
David Hall, Greg Durrett, and Dan Klein

Article Structure

Abstract

We present a parser that relies primarily on extracting information directly from surface spans rather than on propagating information through enriched grammar structure.

Introduction

Naïve context-free grammars, such as those embodied by standard treebank annotations, do not parse well because their symbols have too little context to constrain their syntactic behavior.

Parsing Model

In order to exploit nonindependent surface features of the input, we use a discriminative formulation.
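
Concretely (as the CRF entries under Topics below spell out), the model is a conditional random field whose features factor over anchored rules r of a small backbone grammar. One standard way to write that objective, reconstructed from those descriptions rather than quoted from the paper, is:

```latex
% P(T | w): probability of tree T for sentence w, factoring over the
% anchored rules r in T; f(r, w) is r's feature vector, theta the weights.
\[
P(T \mid w) \;=\; \frac{1}{Z(w)} \exp\Big( \sum_{r \in T} \theta^\top f(r, w) \Big),
\qquad
Z(w) \;=\; \sum_{T'} \exp\Big( \sum_{r \in T'} \theta^\top f(r, w) \Big).
\]
```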

Surface Feature Framework

To improve the performance of our X-bar grammar, we will add a number of surface feature templates derived only from the words in the sentence.
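
As a concrete illustration of the feature templates described below, here is a minimal Python sketch covering first/last words, neighboring words, a bucketed span length, and a coarse span shape. The function names, bucket boundaries, and shape alphabet are our assumptions for illustration, not the paper's exact specification:

```python
def length_bucket(n: int) -> int:
    """Bin a span length into one of 8 buckets (the paper uses 8 buckets;
    these particular boundaries are an assumption)."""
    for bucket, cap in enumerate([1, 2, 3, 4, 5, 10, 20]):
        if n <= cap:
            return bucket
    return 7  # all longer spans share the final bucket


def span_shape(words: list[str]) -> str:
    """One character per word: X = capitalized, x = lowercase, 8 = digit;
    other leading characters (e.g. punctuation) are kept as-is."""
    def shape(word: str) -> str:
        c = word[0]
        if c.isdigit():
            return "8"
        if c.isalpha():
            return "X" if c.isupper() else "x"
        return c
    return "".join(shape(w) for w in words)


def span_features(sent: list[str], i: int, j: int) -> list[str]:
    """Surface features of the span sent[i:j]: first/last word, the words
    immediately before and after the span, bucketed length, and shape."""
    span = sent[i:j]
    before = sent[i - 1] if i > 0 else "<S>"
    after = sent[j] if j < len(sent) else "</S>"
    return [
        f"FIRST={span[0]}",
        f"LAST={span[-1]}",
        f"BEFORE={before}",
        f"AFTER={after}",
        f"LEN={length_bucket(j - i)}",
        f"SHAPE={span_shape(span)}",
    ]
```

For example, span_features("The cat sat on the mat".split(), 3, 6) yields indicators such as FIRST=on, LAST=mat, and SHAPE=xxx.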

Features

Our goal is to use surface features to replicate the functionality of other annotations, without increasing the state space of our grammar, meaning that the rules rule(r) remain simple, as does the state space used during inference.
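
To keep the grammar itself simple, each such span feature can be conjoined with the identity of the (unannotated) rule it fires on, and with just the parent symbol as a coarser backoff. A hedged sketch, reusing the hypothetical span_features helper above; the exact backoff scheme is our assumption:

```python
def anchored_rule_features(sent: list[str], rule: str, parent: str,
                           i: int, j: int) -> list[str]:
    """Features for an anchored rule: each surface feature of sent[i:j]
    conjoined with the rule identity and, separately, the parent symbol."""
    feats = span_features(sent, i, j)
    return ([f"{f}&RULE={rule}" for f in feats] +
            [f"{f}&PARENT={parent}" for f in feats])
```

Because the backbone grammar stays X-bar, the set of distinct rule values stays small even as the surface feature set grows.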

Annotations

We have built up a strong set of features by this point, but have not yet answered the question of whether or not grammar annotation is useful on top of them.

Other Languages

Historically, many annotation schemes for parsers have required language-specific engineering: for example, lexicalized parsers require a set of head rules and manually-annotated grammars require detailed analysis of the treebank itself (Klein and Manning, 2003).

Sentiment Analysis

Finally, because the system is, at its core, a classifier of spans, it can be used equally well for tasks that do not normally use parsing algorithms.
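
Since the system is a span classifier, adapting it to sentiment mainly means scoring sentiment labels over spans and adding n-gram indicators, as the quotes under "sentiment analysis" below describe. A minimal sketch of those indicators (names are hypothetical):

```python
def sentiment_span_features(sent: list[str], i: int, j: int) -> list[str]:
    """Unigram and bigram indicators over the span sent[i:j], in the
    bag-of-n-grams style of standard sentiment classifiers."""
    span = sent[i:j]
    return ([f"UNI={w}" for w in span] +
            [f"BI={a}_{b}" for a, b in zip(span, span[1:])])
```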

Conclusion

To date, the most successful constituency parsers have largely been generative, and operate by refining the grammar either manually or automatically so that relevant information is available locally to each parsing decision.

Topics

Treebank

Appears in 19 sentences as: Treebank (9) treebank (7) treebanks (4)
In Less Grammar, More Features
  1. Naïve context-free grammars, such as those embodied by standard treebank annotations, do not parse well because their symbols have too little context to constrain their syntactic behavior.
    Page 1, “Introduction”
  2. Our parser can be easily adapted to this task by replacing the X-bar grammar over treebank symbols with a grammar over the sentiment values to encode the output variables and then adding n-gram indicators to our feature set to capture the bulk of the lexical effects.
    Page 2, “Introduction”
  3. Because the X-bar grammar is so minimal, this grammar does not parse very accurately, scoring just 73 F1 on the standard English Penn Treebank task.
    Page 2, “Parsing Model”
  4. Throughout this and the following section, we will draw on motivating examples from the English Penn Treebank, though similar examples could be equally argued for other languages.
    Page 3, “Surface Feature Framework”
  5. There are a great number of spans in a typical treebank; extracting features for every possible combination of span and rule is prohibitive.
    Page 3, “Surface Feature Framework”
  6. Table 1 shows the results of incrementally building up our feature set on the Penn Treebank development set.
    Page 4, “Features”
  7. Because constituents in the treebank can be quite long, we bin our length features into 8 buckets, of …
    Page 4, “Features”
  8. Table 2: Results for the Penn Treebank development set, sentences of length ≤ 40, for different annotation schemes implemented on top of the X-bar grammar.
    Page 6, “Annotations”
  9. Table 3: Final Parseval results for the v = 1, h = 0 parser on Section 23 of the Penn Treebank.
    Page 6, “Annotations”
  10. Finally, Table 3 shows our final evaluation on Section 23 of the Penn Treebank.
    Page 6, “Annotations”
  11. Historically, many annotation schemes for parsers have required language-specific engineering: for example, lexicalized parsers require a set of head rules and manually-annotated grammars require detailed analysis of the treebank itself (Klein and Manning, 2003).
    Page 6, “Other Languages”

Berkeley parser

Appears in 13 sentences as: Berkeley parser (11) Berkeley parsers (1) Berkeley parser’s (1)
In Less Grammar, More Features
  1. While we do not do as well as the Berkeley parser, we will see in Section 6 that our parser does a substantially better job of generalizing to other languages.
    Page 6, “Annotations”
  2. We show that this is indeed the case: on nine languages, our system is competitive with or better than the Berkeley parser, which is the best single …
    Page 6, “Other Languages”
  3. We compare to the Berkeley parser (Petrov and Klein, 2007) as well as two variants.
    Page 7, “Other Languages”
  4. … Bjorkelund et al. (2013) (Berkeley-Rep), which is their best single parser. The “Replaced” system modifies the Berkeley parser by replacing rare words with morphological descriptors of those words computed using language-specific modules, which have been handcrafted for individual languages or are trained with additional annotation layers in the treebanks that we do not exploit.
    Page 7, “Other Languages”
  5. Bjorkelund et al. (2013) only report results on the development set for the Berkeley-Rep model; however, the task organizers also use a version of the Berkeley parser provided with parts of speech from high-quality POS taggers for each language (Berkeley-Tags).
    Page 7, “Other Languages”
  6. Both Berkeley-Rep and Berkeley-Tags make up for some shortcomings of the Berkeley parser’s unknown word model, which is tuned to English.
    Page 7, “Other Languages”
  7. In Table 4, we see that our performance is overall substantially higher than that of the Berkeley parser.
    Page 7, “Other Languages”
  8. On the development set, we outperform the Berkeley parser and match the performance of the Berkeley-Rep parser.
    Page 7, “Other Languages”
  9. Their best parser, and the best overall parser from the shared task, is a reranked product of “Replaced” Berkeley parsers.
    Page 7, “Other Languages”
  10. We outperform both the Berkeley parser and the Berkeley-Tags parser on seven of nine languages, losing only on Arabic and French.
    Page 7, “Other Languages”
  11. These results suggest that the Berkeley parser may be heavily fit to English, particularly in its lexicon.
    Page 7, “Other Languages”

sentiment analysis

Appears in 10 sentences as: sentiment analysis (10)
In Less Grammar, More Features
  1. Finally, we show that, in both syntactic parsing and sentiment analysis, many broad linguistic trends can be captured via surface features.
    Page 1, “Abstract”
  2. Socher et al. (2013) demonstrate that sentiment analysis, which is usually approached as a flat classification task, can be viewed as tree-structured.
    Page 2, “Introduction”
  3. One example is sentiment analysis.
    Page 8, “Sentiment Analysis”
  4. While approaches to sentiment analysis often simply classify the sentence monolithically, treating it as a bag of n-grams (Pang et al., 2002; Pang and Lee, 2005; Wang and Manning, 2012), the recent dataset of Socher et al. (2013) …
    Page 8, “Sentiment Analysis”
  5. One structural difference between sentiment analysis and syntactic parsing lies in where the relevant information is present in a span.
    Page 8, “Sentiment Analysis”
  6. Therefore, we augment our existing model with standard sentiment analysis features that look at unigrams and bigrams in the span (Wang and Manning, 2012).
    Page 8, “Sentiment Analysis”
  7. We evaluated our model on the fine-grained sentiment analysis task presented in Socher et al. (2013).
    Page 8, “Sentiment Analysis”
  8. Their model has high capacity to model complex interactions of words through a combinatory tensor, but it appears that our simpler, feature-driven model is just as effective at capturing the key effects of compositionality for sentiment analysis.
    Page 8, “Sentiment Analysis”
  9. Table 5: Fine-grained sentiment analysis results on the Stanford Sentiment Treebank of Socher et al. (2013).
    Page 9, “Sentiment Analysis”
  10. Moreover, we show that our parser is adaptable to other tree-structured tasks such as sentiment analysis; we outperform the recent system of Socher et al. (2013).
    Page 9, “Conclusion”

lexicalized

Appears in 9 sentences as: Lexicalization (1) lexicalization (2) lexicalize (1) Lexicalized (1) lexicalized (3) lexicalizing (1)
In Less Grammar, More Features
  1. For example, head lexicalization (Eisner, 1996; Collins, 1997; Charniak, 1997), structural annotation (Johnson, 1998; Klein and Manning, 2003), and state-splitting (Matsuzaki et al., 2005; Petrov et al., 2006) are all designed to take coarse symbols like PP and decorate them with additional context.
    Page 1, “Introduction”
  2. Hall and Klein (2012) employed both kinds of annotations, along with lexicalized head word annotation.
    Page 3, “Parsing Model”
  3. Because heads of constituents are often at the beginning or the end of a span, these feature templates can (noisily) capture monolexical properties of heads without having to incur the inferential cost of lexicalized annotations.
    Page 4, “Features”
  4. Annotation, dev F1 (len ≤ 40): v = 0, h = 0: 90.1; v = 1, h = 0: 90.5; v = 0, h = 1: 90.2; v = 1, h = 1: 90.9; Lexicalized: 90.3
    Page 6, “Annotations”
  5. Another commonly-used kind of structural annotation is lexicalization (Eisner, 1996; Collins, 1997; Charniak, 1997).
    Page 6, “Annotations”
  6. Table 2 shows results from lexicalizing the X-bar grammar; it provides meager improvements.
    Page 6, “Annotations”
  7. Lexicalization allows us to capture bilexical relationships along dependency arcs, but it has been previously shown that these add only marginal benefit to Collins’s model anyway (Gildea, 2001).
    Page 6, “Annotations”
  8. Historically, many annotation schemes for parsers have required language-specific engineering: for example, lexicalized parsers require a set of head rules and manually-annotated grammars require detailed analysis of the treebank itself (Klein and Manning, 2003).
    Page 6, “Other Languages”
  9. Our features can also lexicalize on other discourse connectives such as but or however, which often occur at the split point between two spans.
    Page 8, “Sentiment Analysis”

feature templates

Appears in 7 sentences as: feature template (2) feature templates (5)
In Less Grammar, More Features
  1. To improve the performance of our X-bar grammar, we will add a number of surface feature templates derived only from the words in the sentence.
    Page 3, “Surface Feature Framework”
  2. Subsequent lines in Table 1 indicate additional surface feature templates computed over the span, which are then conjoined with the rule identity as shown in Figure 1 to give additional features.
    Page 4, “Features”
  3. Note that many of these features have been used before (Taskar et al., 2004; Finkel et al., 2008; Petrov and Klein, 2008b); our goal here is not to amass as many feature templates as possible, but rather to examine the extent to which a simple set of features can replace a complicated state space.
    Page 4, “Features”
  4. Because heads of constituents are often at the beginning or the end of a span, these feature templates can (noisily) capture monolexical properties of heads without having to incur the inferential cost of lexicalized annotations.
    Page 4, “Features”
  5. … an instance of this feature template.
    Page 5, “Features”
  6. We exploit this by adding an additional feature template similar to our span shape feature from Section 4.4 which uses the (deterministic) tag for each word as its descriptor.
    Page 8, “Sentiment Analysis”
  7. We build up a small set of feature templates as part of a discriminative constituency parser and outperform the Berkeley parser on a wide range of languages.
    Page 9, “Conclusion”

Penn Treebank

Appears in 6 sentences as: Penn Treebank (6)
In Less Grammar, More Features
  1. Because the X-bar grammar is so minimal, this grammar does not parse very accurately, scoring just 73 F1 on the standard English Penn Treebank task.
    Page 2, “Parsing Model”
  2. Throughout this and the following section, we will draw on motivating examples from the English Penn Treebank, though similar examples could be equally argued for other languages.
    Page 3, “Surface Feature Framework”
  3. Table 1 shows the results of incrementally building up our feature set on the Penn Treebank development set.
    Page 4, “Features”
  4. Table 2: Results for the Penn Treebank development set, sentences of length ≤ 40, for different annotation schemes implemented on top of the X-bar grammar.
    Page 6, “Annotations”
  5. Table 3: Final Parseval results for the v = 1, h = 0 parser on Section 23 of the Penn Treebank.
    Page 6, “Annotations”
  6. Finally, Table 3 shows our final evaluation on Section 23 of the Penn Treebank.
    Page 6, “Annotations”

CRF

Appears in 4 sentences as: CRF (4)
In Less Grammar, More Features
  1. Formally, our model is a CRF where the features factor over anchored rules of a small backbone grammar, as shown in Figure 1.
    Page 2, “Introduction”
  2. All of these past CRF parsers do also exploit span features, as did the structured margin parser of Taskar et al. (2004).
    Page 3, “Parsing Model”
  3. Recall that our CRF factors over anchored rules r, where each r has identity rule(r) and anchoring span(r).
    Page 3, “Surface Feature Framework”
  4. As far as we can tell, all past CRF parsers have used “positive” features only.
    Page 3, “Surface Feature Framework”

development set

Appears in 4 sentences as: development set (4)
In Less Grammar, More Features
  1. Table 1 shows the results of incrementally building up our feature set on the Penn Treebank development set.
    Page 4, “Features”
  2. Table 2: Results for the Penn Treebank development set, sentences of length ≤ 40, for different annotation schemes implemented on top of the X-bar grammar.
    Page 6, “Annotations”
  3. Bjorkelund et al. (2013) only report results on the development set for the Berkeley-Rep model; however, the task organizers also use a version of the Berkeley parser provided with parts of speech from high-quality POS taggers for each language (Berkeley-Tags).
    Page 7, “Other Languages”
  4. On the development set, we outperform the Berkeley parser and match the performance of the Berkeley-Rep parser.
    Page 7, “Other Languages”

Shared Task

Appears in 4 sentences as: Shared Task (2) shared task (2)
In Less Grammar, More Features
  1. On the SPMRL 2013 multilingual constituency parsing shared task (Seddah et al., 2013), our system outperforms the top single parser system of Bjorkelund et al.
    Page 1, “Abstract”
  2. Our parser is also able to generalize well across languages with little tuning: it achieves state-of-the-art results on multilingual parsing, scoring higher than the best single-parser system from the SPMRL 2013 Shared Task on a range of languages, as well as on the competition’s average F1 metric.
    Page 2, “Introduction”
  3. We evaluate on the constituency treebanks from the Statistical Parsing of Morphologically Rich Languages Shared Task (Seddah et al., 2013).
    Page 7, “Other Languages”
  4. Their best parser, and the best overall parser from the shared task, is a reranked product of “Replaced” Berkeley parsers.
    Page 7, “Other Languages”

constituency parser

Appears in 3 sentences as: constituency parser (1) constituency parsers (1) constituency parsing (1)
In Less Grammar, More Features
  1. On the SPMRL 2013 multilingual constituency parsing shared task (Seddah et al., 2013), our system outperforms the top single parser system of Bjorkelund et al.
    Page 1, “Abstract”
  2. To date, the most successful constituency parsers have largely been generative, and operate by refining the grammar either manually or automatically so that relevant information is available locally to each parsing decision.
    Page 9, “Conclusion”
  3. We build up a small set of feature templates as part of a discriminative constituency parser and outperform the Berkeley parser on a wide range of languages.
    Page 9, “Conclusion”

feature set

Appears in 3 sentences as: feature set (3)
In Less Grammar, More Features
  1. Our parser can be easily adapted to this task by replacing the X-bar grammar over treebank symbols with a grammar over the sentiment values to encode the output variables and then adding n-gram indicators to our feature set to capture the bulk of the lexical effects.
    Page 2, “Introduction”
  2. Table 1 shows the results of incrementally building up our feature set on the Penn Treebank development set.
    Page 4, “Features”
  3. Table 2 shows the performance of our feature set in grammars with several different levels of structural annotation. Klein and Manning (2003) find large gains (6% absolute improvement, 20% relative improvement) going from v = 0, h = 0 to v = 1, h = 1; however, we do not find the same level of benefit.
    Page 6, “Annotations”

part-of-speech

Appears in 3 sentences as: part-of-speech (3)
In Less Grammar, More Features
  1. An independent classification approach is actually very viable for part-of-speech tagging (Toutanova et al., 2003), but is problematic for parsing — if nothing else, parsing comes with a structural requirement that the output be a well-formed, nested tree.
    Page 2, “Introduction”
  2. … features on preterminal part-of-speech tags.
    Page 4, “Features”
  3. These part-of-speech taggers often incorporate substantial knowledge of each language’s morphology.
    Page 7, “Other Languages”

part-of-speech taggers

Appears in 3 sentences as: part-of-speech taggers (1) part-of-speech tagging (1) part-of-speech tags (1)
In Less Grammar, More Features
  1. An independent classification approach is actually very viable for part-of-speech tagging (Toutanova et al., 2003), but is problematic for parsing — if nothing else, parsing comes with a structural requirement that the output be a well-formed, nested tree.
    Page 2, “Introduction”
  2. … features on preterminal part-of-speech tags.
    Page 4, “Features”
  3. These part-of-speech taggers often incorporate substantial knowledge of each language’s morphology.
    Page 7, “Other Languages”

reranked

Appears in 3 sentences as: reranked (1) rerankers (1) reranking (1)
In Less Grammar, More Features
  1. There have been nonlocal approaches as well, such as tree-substitution parsers (Bod, 1993; Sima’an, 2000), neural net parsers (Henderson, 2003), and rerankers (Collins and Koo, 2005; Charniak and Johnson, 2005; Huang, 2008).
    Page 1, “Introduction”
  2. … it does not use a reranking step or post-hoc combination of parser results.
    Page 7, “Other Languages”
  3. Their best parser, and the best overall parser from the shared task, is a reranked product of “Replaced” Berkeley parsers.
    Page 7, “Other Languages”
