Parser Evaluation Using Derivation Trees: A Complement to evalb
Kulick, Seth and Bies, Ann and Mott, Justin and Kroch, Anthony and Santorini, Beatrice and Liberman, Mark

Article Structure

Abstract

This paper introduces a new technique for phrase-structure parser analysis, categorizing possible treebank structures by integrating regular expressions into derivation trees.

Introduction

Phrase-structure parsing is usually evaluated using evalb (Sekine and Collins, 2008), which provides a score based on matching brackets.
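
As a rough illustration of what "matching brackets" means, the sketch below computes labeled bracket precision, recall, and F1 for one gold/parsed tree pair. It is a simplified stand-in, not evalb itself: the function names (`parse_tree`, `brackets`, `bracket_f1`) are hypothetical, and evalb's parameter-file behavior (for example, how punctuation and label equivalences are handled) is not reproduced.

```python
from collections import Counter


def parse_tree(s):
    """Parse a bracketed tree string into nested (label, children) tuples.

    Leaves (words) are plain strings. Assumes well-formed input.
    """
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read()


def brackets(tree):
    """Return a multiset of (label, start, end) spans for phrasal nodes.

    Preterminals (nodes whose children are all words) are skipped.
    """
    spans = Counter()

    def walk(node, start):
        label, children = node
        if all(isinstance(c, str) for c in children):
            return start + len(children)  # preterminal: just advance
        end = start
        for c in children:
            end = end + 1 if isinstance(c, str) else walk(c, end)
        spans[(label, start, end)] += 1
        return end

    walk(tree, 0)
    return spans


def bracket_f1(gold, parsed):
    """Labeled bracket precision, recall, and F1 for two tree strings."""
    g, p = brackets(parse_tree(gold)), brackets(parse_tree(parsed))
    matched = sum((g & p).values())  # multiset intersection
    precision = matched / sum(p.values()) if p else 0.0
    recall = matched / sum(g.values()) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


gold = "(S (NP (DT the) (NN dog)) (VP (VBD saw) (NP (DT a) (NN cat))))"
parsed = "(S (NP (DT the) (NN dog)) (VP (VBD saw)) (NP (DT a) (NN cat)))"
print(bracket_f1(gold, parsed))
```

In this made-up pair the parser attaches the object NP outside the VP, so three of the four brackets match on each side. A single score like this does not say which kind of structure went wrong, which is the gap the paper's analysis is meant to complement.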

Framework for analyzing parsing performance

We first describe the use of the regexes in tree decomposition, and then give some examples of in-

Analysis of parsing results

We worked with the three datasets as described in the introduction.

Topics

treebank

Appears in 7 sentences as: Treebank (3) treebank (4) treebanking (1)
In Parser Evaluation Using Derivation Trees: A Complement to evalb
  1. This paper introduces a new technique for phrase-structure parser analysis, categorizing possible treebank structures by integrating regular expressions into derivation trees.
    Page 1, “Abstract”
  2. We analyze the performance of the Berkeley parser on OntoNotes WSJ and the English Web Treebank.
    Page 1, “Abstract”
  3. Second, we use a set of regular expressions (henceforth “regexes”) that categorize the possible structures in the treebank.
    Page 1, “Introduction”
  4. After describing in more detail the basic framework, we show some aspects of the resulting analysis of the performance of the Berkeley parser (Petrov et al., 2008) on three datasets: (a) OntoNotes WSJ sections 2-21 (Weischedel et al., 2011) [footnote 1], (b) OntoNotes WSJ section 22, and (c) the “Answers” section of the English Web Treebank (Bies et al., 2012).
    Page 1, “Introduction”
  5. [Footnote 1] We refer only to the WSJ treebank portion of OntoNotes, which is roughly a subset of the Penn Treebank (Marcus et al., 1999) with annotation revisions including the addition of NML nodes.
    Page 1, “Framework for analyzing parsing performance”
  6. We derived the regexes via an iterative process of inspection of tree decomposition on dataset (a), together with taking advantage of the treebanking experience from some of the coauthors.
    Page 2, “Framework for analyzing parsing performance”
  7. The high coverage (%) reinforces the point that there is a limited number of core structures in the treebank.
    Page 4, “Analysis of parsing results”

See all papers in Proc. ACL 2014 that mention treebank.

recursive

Appears in 5 sentences as: recursive (5)
In Parser Evaluation Using Derivation Trees: A Complement to evalb
  1. As described above, we are also interested in the type of linguistic construction represented by that one-level structure, each of which instantiates one of a few types - recursive coordination, simple head-and-sister, etc.
    Page 2, “Framework for analyzing parsing performance”
  2. (c) NP-modr is a regex for a recursive NP with a right modifier.
    Page 2, “Framework for analyzing parsing performance”
  3. (d) VP-crd is also a regex for a recursive structure, in this case for VP coordination, picking out the leftmost conjunct as the head of the structure.
    Page 2, “Framework for analyzing parsing performance”
  4. Also, the attachment score is not relevant for regexes that already express a recursive structure, such as NP-modr.
    Page 3, “Framework for analyzing parsing performance”
  5. attachment score does not apply to the recursive categories, as mentioned above.
    Page 4, “Analysis of parsing results”
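
The idea of labeling a one-level structure with a regex category can be sketched as follows. This is only an illustration under stated assumptions: the two patterns are hypothetical stand-ins that borrow the names NP-modr and VP-crd from the sentences above, the paper's actual regexes are not reproduced in this summary, and the `categorize` and `coverage` helpers are invented for the example. Each one-level structure is flattened to a string of the form `PARENT -> CHILD CHILD ...` before matching.

```python
import re

# Hypothetical patterns: the names come from the paper, the regex bodies do not.
PATTERNS = [
    # Recursive NP followed by one or more right modifiers.
    ("NP-modr", re.compile(r"^NP -> NP( (PP|SBAR|VP|ADJP))+$")),
    # VP coordination; the leftmost VP conjunct is treated as the head.
    ("VP-crd", re.compile(r"^VP -> VP( , VP)*( ,)? CC VP$")),
]


def categorize(parent, children):
    """Return the name of the first pattern matching a one-level structure."""
    rule = f"{parent} -> {' '.join(children)}"
    for name, pattern in PATTERNS:
        if pattern.match(rule):
            return name
    return None


def coverage(structures):
    """Fraction of (parent, children) structures matched by some pattern."""
    if not structures:
        return 0.0
    hits = sum(1 for parent, kids in structures if categorize(parent, kids))
    return hits / len(structures)


examples = [
    ("NP", ["NP", "PP"]),
    ("VP", ["VP", "CC", "VP"]),
    ("VP", ["VP", ",", "VP", ",", "CC", "VP"]),
    ("NP", ["DT", "NN"]),
]
for parent, kids in examples:
    print(parent, kids, "->", categorize(parent, kids))
print("coverage:", coverage(examples))
```

With only two toy patterns, the flat `NP -> DT NN` structure goes uncategorized; a fuller inventory would need patterns for simple head-and-sister structures as well.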

regular expressions

Appears in 3 sentences as: regular expressions (3)
In Parser Evaluation Using Derivation Trees: A Complement to evalb
  1. This paper introduces a new technique for phrase-structure parser analysis, categorizing possible treebank structures by integrating regular expressions into derivation trees.
    Page 1, “Abstract”
  2. Second, we use a set of regular expressions (henceforth “regexes”) that categorize the possible structures in the treebank.
    Page 1, “Introduction”
  3. 2.1 Use of regular expressions
    Page 2, “Framework for analyzing parsing performance”
