Abstract | CCG affords ways to augment treepath-based features to overcome these data sparsity issues. |
Abstract | By adding features over CCG word-word dependencies and lexicalized verbal subcategorization frames (“supertags”), we can obtain an F-score that is substantially better than a previous CCG-based SRL system and competitive with the current state of the art. |
Combinatory Categorial Grammar | Rather than using standard part-of-speech tags and grammatical rules, CCG encodes much of the combinatory potential of each word by assigning a syntactically informative category. |
Combinatory Categorial Grammar | Further, CCG has the advantage of a transparent interface between the way the words combine and their dependencies with other words. |
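The two points above can be illustrated with a minimal sketch of CCG function application. This is a toy implementation for exposition only, not the C&C parser's machinery, and the lexicon entries are illustrative: a transitive verb carries the category (S\NP)/NP, which states directly that it combines first with an NP to its right and then with an NP to its left, so each combination step also identifies a word-word dependency.

```python
# Toy sketch of CCG function application (illustrative only; not the
# C&C parser's implementation). A category like "(S\NP)/NP" encodes a
# word's combinatory potential: "/NP" = seeks NP to the right,
# "\NP" = seeks NP to the left.

def forward_apply(left, right):
    """Forward application (>): X/Y  Y  =>  X."""
    if left.endswith("/" + right):
        return left[: -len("/" + right)].strip("()")
    return None

def backward_apply(left, right):
    """Backward application (<): Y  X\\Y  =>  X."""
    if right.endswith("\\" + left):
        return right[: -len("\\" + left)].strip("()")
    return None

# Hypothetical mini-lexicon (not taken from CCGbank).
lexicon = {"Brutus": "NP", "admires": "(S\\NP)/NP", "CCG": "NP"}

# "admires CCG": (S\NP)/NP + NP => S\NP  (object dependency filled)
vp = forward_apply(lexicon["admires"], lexicon["CCG"])
# "Brutus" + S\NP => S                   (subject dependency filled)
s = backward_apply(lexicon["Brutus"], vp)
```

Each successful application consumes one argument slot of the functor category, which is exactly the transparent interface between combination and dependencies referred to above.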
Introduction | Brutus uses the CCG parser of (Clark and Curran, 2007, henceforth the C&C parser), Charniak’s parser (Charniak, 2001) for additional CFG-based features, and the MALT parser (Nivre et al., 2007) for dependency features, while (Punyakanok et al., 2008) use results from an ensemble of parses from Charniak’s parser and a Collins parser (Collins, 2003; Bikel, 2004). |
Introduction | We do not employ a similar strategy due to the differing notions of constituency represented in our parsers (CCG having a much more fluid notion of constituency and the MALT parser taking a different approach entirely). |
Introduction | In the following, we briefly introduce the CCG grammatical formalism and motivate its use in SRL (Sections 2–3). |
Potential Advantages to using CCG | There are many potential advantages to using the CCG formalism in SRL. |
Abstract | We compare the CCG parser of Clark and Curran (2007) with a state-of-the-art Penn Treebank (PTB) parser. |
Abstract | An accuracy comparison is performed by converting the CCG derivations into PTB trees. |
Abstract | We show that the conversion is extremely difficult to perform, but are able to fairly compare the parsers on a representative subset of the PTB test section, obtaining results for the CCG parser that are statistically no different to those for the Berkeley parser. |
Introduction | The second approach is to apply statistical methods to parsers based on linguistic formalisms, such as HPSG, LFG, TAG, and CCG, with the grammar being defined manually or extracted from a formalism-specific treebank. |
Introduction | The formalism-based parser we use is the CCG parser of Clark and Curran (2007), which is based on CCGbank (Hockenmaier and Steedman, 2007), a CCG version of the Penn Treebank. |
Introduction | The comparison focuses on accuracy and is performed by converting CCG derivations into PTB phrase-structure trees. |
The CCG to PTB Conversion | Converting gold-standard CCG derivations into the GRs in DepBank resulted in an F-score of only 85%; hence the upper bound on the performance of the CCG parser, using this evaluation scheme, was only 85%. |
The CCG to PTB Conversion | First, the corresponding derivations in the treebanks are not isomorphic: a CCG derivation is not simply a relabelling of the nodes in the PTB tree; in many constructions, such as coordination and control structures, the trees have a different shape as well as different labels. |
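The non-isomorphism can be made concrete with coordination. The bracketings below are simplified illustrations (not taken verbatim from the treebanks): PTB coordination is typically flat, while CCG derivations are strictly binary-branching, so no relabelling of nodes can map one tree onto the other.

```python
# Simplified, illustrative bracketings: PTB coordination is flat
# (one node with three children), while the CCG analysis is binary.
ptb = ("NP", ("NP", "apples"), ("CC", "and"), ("NP", "pears"))
ccg = ("NP", ("NP", "apples"),
             ("NP[conj]", ("conj", "and"), ("NP", "pears")))

def shape(tree):
    """Recover the bare branching structure, discarding all labels."""
    if len(tree) == 2 and isinstance(tree[1], str):
        return "*"                       # lexical leaf
    return tuple(shape(child) for child in tree[1:])

# Different shapes, not merely different labels, so conversion
# requires restructuring rather than relabelling.
assert shape(ptb) != shape(ccg)
```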