Abstract | Via an oracle experiment, we show that the upper bound on accuracy of a CCG parser is significantly lowered when its search space is pruned using a supertagger, though the supertagger also prunes many bad parses. |
CCG and Supertagging | CCG is a lexicalized grammar formalism encoding for each word lexical categories that are either basic (e.g.
CCG and Supertagging | As can be inferred from even this small example, a key difficulty in parsing CCG is that the number of categories quickly becomes extremely large, and there are typically many ways to analyze every span of a sentence. |
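The combinatory core that creates this ambiguity can be sketched as plain string manipulation; this is a toy illustration, with hypothetical category strings and naive slash-splitting rather than any real parser's category machinery:

```python
def strip_parens(cat):
    """Drop one pair of outer brackets, e.g. "(S\\NP)" -> "S\\NP"."""
    return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat

def forward_apply(left, right):
    """Forward application (>): X/Y combined with Y gives X."""
    if "/" in left:
        x, y = left.rsplit("/", 1)
        if y == right:
            return strip_parens(x)
    return None

def backward_apply(left, right):
    """Backward application (<): Y combined with X\\Y gives X."""
    if "\\" in right:
        x, y = right.rsplit("\\", 1)
        if y == left:
            return strip_parens(x)
    return None

# "eats pizza": (S\NP)/NP + NP -> S\NP, then "John" + S\NP -> S
vp = forward_apply("(S\\NP)/NP", "NP")
assert vp == "S\\NP"
assert backward_apply("NP", vp) == "S"
```

Even this fragment shows why the search space grows: any adjacent pair of categories matching either rule yields a new analysis for the span.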
Experiments | We evaluated on CCGbank (Hockenmaier and Steedman, 2007), a rightmost normal-form CCG version of the Penn Treebank. |
Experiments | :nce our combined model represents the best CCG rsing results under any setting. |
Integrated Supertagging and Parsing | Even allowing for the observation of Fowler and Penn (2010) that our practical CCG is context-free, this problem still reduces to the construction of Bar-Hillel et al. |
Introduction | Accurate and efficient parsing of Combinatory Categorial Grammar ( CCG ; Steedman, 2000) is a longstanding problem in computational linguistics, due to the complexities associated with its mild context sensitivity.
Introduction | Even for practical CCGs that are strongly context-free (Fowler and Penn, 2010), parsing is much harder than with Penn Treebank-style context-free grammars, with vast numbers of nonterminal categories leading to increased grammar constants.
Introduction | Where a typical Penn Treebank grammar may have fewer than 100 nonterminals (Hockenmaier and Steedman, 2002), we found that a CCG grammar derived from CCGbank contained over 1500. |
Abstract | This paper presents the first dependency model for a shift-reduce CCG parser. |
Abstract | Modelling dependencies is desirable for a number of reasons, including handling the “spurious” ambiguity of CCG; fitting well with the theory of CCG ; and optimizing for structures which are evaluated at test time. |
Abstract | Standard CCGBank tests show the model achieves up to 1.05 labeled F-score improvements over three existing, competitive CCG parsing models. |
Introduction | Combinatory Categorial Grammar ( CCG ; Steedman (2000)) is able to derive typed dependency structures (Hockenmaier, 2003; Clark and Curran, 2007), providing a useful approximation to the underlying predicate-argument relations of “who did what to whom”. |
Introduction | To date, CCG remains the most competitive formalism for recovering “deep” dependencies arising from many linguistic phenomena such as raising, control, extraction and coordination (Rimell et al., 2009; Nivre et al., 2010). |
Introduction | To achieve its expressiveness, CCG exhibits so-called “spurious” ambiguity, permitting many nonstandard surface derivations which ease the recovery of certain dependencies, especially those arising from type-raising and composition. |
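The effect can be demonstrated concretely: the same predicate-argument result is reachable by plain application or by type-raising the subject and composing it with the verb. A sketch, using a tuple as an illustrative stand-in for the semantics of "likes":

```python
B = lambda f: lambda g: lambda x: f(g(x))   # composition
T = lambda x: lambda f: f(x)                # type raising

# Semantics of "likes": takes the object first, then the subject.
like = lambda obj: lambda subj: (subj, "likes", obj)

# Derivation 1: (likes pizza) combines with john by plain application.
d1 = like("pizza")("john")
# Derivation 2: type-raise john, compose with likes, then apply to pizza.
d2 = B(T("john"))(like)("pizza")

assert d1 == d2 == ("john", "likes", "pizza")
```

Both derivations denote the same structure, which is why the ambiguity is called "spurious".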
Detection | Algorithm The detection algorithm considers all phrases that our CCG grammar A (Section 4) can parse, uses a learned classifier to further filter this set, and finally resolves conflicts between any overlapping predictions. |
Formal Overview | We use a log-linear CCG (Steedman, 1996; Clark and Curran, 2007) to rank possible meanings z ∈ Z for each mention m in a document D, as described in Section 4.
Introduction | For both tasks, we make use of a hand-engineered Combinatory Categorial Grammar ( CCG ) to construct a set of meaning representations that identify the time being described. |
Introduction | For the relatively closed-class time expressions, we demonstrate that it is possible to engineer a high quality CCG lexicon. |
Parsing Time Expressions | First, we use a CCG to generate an initial logical form for the mention. |
Parsing Time Expressions | Figure 1: A CCG parse tree for the mention “one week ago.” The tree includes forward (>) and backward (<) application, as well as two type-shifting operations.
Parsing Time Expressions | CCG is a linguistically motivated categorial formalism for modeling a wide range of language phenomena (Steedman, 1996; Steedman, 2000). |
Introduction | Syntactic information is provided by CCGbank, a conversion of the Penn Treebank into the CCG formalism (Hockenmaier and Steedman, 2002a). |
Parser Design | This section describes the Combinatory Categorial Grammar ( CCG ) parsing model used by ASP. |
Parser Design | The input to the parser is a part-of-speech tagged sentence, and the output is a syntactic CCG parse tree, along with zero or more logical forms representing the semantics of subspans of the sentence. |
Parser Design | ASP uses a lexicalized and semantically-typed Combinatory Categorial Grammar ( CCG ) (Steedman, 1996). |
Prior Work | This paper combines two lines of prior work: broad coverage syntactic parsing with CCG and semantic parsing. |
Prior Work | Broad coverage syntactic parsing with CCG has produced both resources and successful parsers. |
Prior Work | These parsers are trained and evaluated using CCGbank (Hockenmaier and Steedman, 2002a), an automatic conversion of the Penn Treebank into the CCG formalism. |
Abstract | We are interested in parsing constituency-based grammars such as HPSG and CCG using a small amount of data specific for the target formalism, and a large quantity of coarse CFG annotations from the Penn Treebank. |
Abstract | We evaluate our approach on three constituency-based grammars — CCG, HPSG, and LFG — augmented with the Penn Treebank.
Introduction | A natural candidate for such coarse annotations is context-free grammar (CFG) from the Penn Treebank, while the target formalism can be any constituency-based grammars, such as Combinatory Categorial Grammar ( CCG ) (Steedman, 2001), Lexical Functional Grammar (LFG) (Bresnan, 1982) or Head-Driven Phrase Structure Grammar (HPSG) (Pollard and Sag, 1994). |
Introduction | We evaluate our approach on three constituency-based grammars — CCG, HPSG, and LFG.
Related Work | There have been several attempts to map annotations in coarse grammars like CFG to annotations in richer grammars, like HPSG, LFG, or CCG .
Related Work | For instance, Hockenmaier and Steedman (2002) made thousands of POS and constituent modifications to the Penn Treebank to facilitate transfer to CCG . |
The Learning Problem | Recall that our goal is to learn how to parse the target formalisms while using two annotated sources: a small set of sentences annotated in the target formalism (e.g., CCG ), and a large set of sentences with coarse annotations. |
The Learning Problem | For simplicity we focus on the CCG formalism in what follows. |
Abstract | This paper describes a method of inducing wide-coverage CCG resources for Japanese. |
Abstract | Our method first integrates multiple dependency-based corpora into phrase structure trees and then converts the trees into CCG derivations. |
Introduction | combinatory categorial grammar ( CCG ) (Steedman, 2001). |
Introduction | Our work is basically an extension of a seminal work on CCGbank (Hockenmaier and Steedman, 2007), in which the phrase structure trees of the Penn Treebank (PTB) (Marcus et al., 1993) are converted into CCG derivations and a wide-coverage CCG lexicon is then extracted from these derivations. |
Introduction | Moreover, the relation between chunk-based dependency structures and CCG derivations is not obvious. |
Abstract | This model leverages the CCG combinatory operators to guide a nonlinear transformation of meaning within a sentence. |
Background | In this paper we focus on CCG , a linguistically expressive yet computationally efficient grammar formalism. |
Background | CCG relies on combinatory logic (as opposed to lambda calculus) to build its expressions. |
Background | CCG has been described as having a transparent surface between the syntactic and the semantic
Introduction | in this field includes the Combinatory Categorial Grammar ( CCG ), which also places increased emphasis on syntactic coverage (Szabolcsi, 1989). |
Introduction | We achieve this goal by employing the CCG formalism to consider compositional structures at any point in a parse tree. |
Introduction | CCG is attractive both for its transparent interface between syntax and semantics, and a small but powerful set of combinatory operators with which we can parametrise our nonlinear transformations of compositional meaning. |
Abstract | We propose an improved, bottom-up method for converting CCG derivations into PTB-style phrase structure trees. |
Background | Our focus is on CCG to PTB conversion (Clark and Curran, 2009). |
Background | 2.1 Combinatory Categorial Grammar ( CCG ) |
Background | The lower half of Figure 1 shows a CCG derivation (Steedman, 2000) in which each word is assigned a category, and combinatory rules are applied to adjacent categories until only one remains. |
Introduction | Converting the Penn Treebank (PTB, Marcus et al., 1993) to other formalisms, such as HPSG (Miyao et al., 2004), LFG (Cahill et al., 2008), LTAG (Xia, 1999), and CCG (Hockenmaier, 2003), is a complex process that renders linguistic phenomena in formalism-specific ways. |
Introduction | Clark and Curran (2009) developed a CCG to PTB conversion that treats the CCG derivation as a phrase structure tree and applies handcrafted rules to every pair of categories that combine in the derivation. |
Introduction | Because their approach does not exploit the generalisations inherent in the CCG formalism, they must resort to ad-hoc rules over nonlocal features of the CCG constituents being combined (when a fixed pair of CCG categories correspond to multiple PTB structures).
Conclusion | Because the lexicon is the grammar in CCG , learning new word-category associations is grammar generalization and is of interest for grammar acquisition. |
Data | CCGbank was created by semiautomatically converting the Penn Treebank to CCG derivations (Hockenmaier and Steedman, 2007). |
Data | CCG-TUT was created by semiautomatically converting dependencies in the Italian Turin University Treebank to CCG derivations (Bos et al., 2009). |
Experiments | As such, these supertags are outside of the categorial system: their use in derivations requires phrase structure rules that are not derivable from the CCG combinatory rules. |
Experiments | EMG 1’s higher recall and precision indicate the tag transition distributions do capture general patterns of linkage between adjacent CCG categories, while EM ensures that the data filters out combinable, but unnecessary, bitags. |
Grammar informed initialization for supertagging | In contrast, supertags are detailed, structured labels; a universal set of grammatical rules defines how categories may combine with one another to project syntactic structure.2 Because of this, properties of the CCG formalism itself can be used to constrain learning—prior to considering any particular language, grammar or data set.
Grammar informed initialization for supertagging | 2Note that supertags can be lexical categories of CCG (Steedman, 2000), elementary trees of Tree-adjoining Grammar (Joshi, 1988), or types in a feature hierarchy as in Head-driven Phrase Structure Grammar (Pollard and Sag, 1994).
Introduction | A more challenging task is learning supertaggers for lexicalized grammar formalisms such as Combinatory Categorial Grammar ( CCG ) (Steedman, 2000). |
Introduction | Yet, this is an important task since creating grammars and resources for CCG parsers for new domains and languages is highly labor- and knowledge-intensive.
Introduction | Baldridge (2008) uses grammar-informed initialization for HMM tag transitions based on the universal combinatory rules of the CCG formalism to obtain 56.1% accuracy on ambiguous word tokens, a large improvement over the 33.0% accuracy obtained with uniform initialization for tag transitions. |
Abstract | We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. |
Abstract | We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text. |
Adaptive Supertagging | CCG supertaggers are about 92% accurate when assigning a single lexical category to each word (Clark and Curran, 2004). |
Background | Figure 1 gives two sentences and their CCG derivations, showing how some of the syntactic ambiguity is transferred to the supertagging component in a lexicalised grammar. |
Background | Figure 1: Two CCG derivations with PP ambiguity. |
Background | Clark and Curran (2004) applied supertagging to CCG , using a flexible multi-tagging approach. |
Data | We have used Sections 02-21 of CCGbank (Hockenmaier and Steedman, 2007), the CCG version of the Penn Treebank (Marcus et al., 1993), as training data for the newspaper domain.
Data | For supertagger evaluation, one thousand sentences were manually annotated with CCG lexical categories and POS tags. |
Introduction | Parsing with lexicalised grammar formalisms, such as Lexicalised Tree Adjoining Grammar and Combinatory Categorial Grammar ( CCG ; Steedman, 2000), can be made more efficient using a supertagger.
Introduction | In this paper, we focus on the CCG parser and supertagger described in Clark and Curran (2007). |
Introduction | Since the CCG lexical category set used by the supertagger is much larger than the Penn Treebank POS tag set, the accuracy of supertagging is much lower than POS tagging; hence the CCG supertagger assigns multiple supertags1 to a word, when the local context does not provide enough information to decide on the correct supertag. |
Abstract | We compare the CCG parser of Clark and Curran (2007) with a state-of-the-art Penn Treebank (PTB) parser. |
Abstract | An accuracy comparison is performed by converting the CCG derivations into PTB trees. |
Abstract | We show that the conversion is extremely difficult to perform, but are able to fairly compare the parsers on a representative subset of the PTB test section, obtaining results for the CCG parser that are statistically no different to those for the Berkeley parser. |
Introduction | The second approach is to apply statistical methods to parsers based on linguistic formalisms, such as HPSG, LFG, TAG, and CCG , with the grammar being defined manually or extracted from a formalism-specific treebank. |
Introduction | The formalism-based parser we use is the CCG parser of Clark and Curran (2007), which is based on CCGbank (Hockenmaier and Steedman, 2007), a CCG version of the Penn Treebank. |
Introduction | The comparison focuses on accuracy and is performed by converting CCG derivations into PTB phrase-structure trees. |
The CCG to PTB Conversion | shows that converting gold-standard CCG derivations into the GRs in DepBank resulted in an F-score of only 85%; hence the upper bound on the performance of the CCG parser, using this evaluation scheme, was only 85%. |
The CCG to PTB Conversion | First, the corresponding derivations in the treebanks are not isomorphic: a CCG derivation is not simply a relabelling of the nodes in the PTB tree; there are many constructions, such as coordination and control structures, where the trees are a different shape, as well as having different labels. |
Abstract | We adapt techniques from supertagging — a relatively recent technique that performs complex lexical tagging before full parsing (Bangalore and Joshi, 1999; Clark, 2002) — for chart realization in OpenCCG, an open-source NLP toolkit for CCG .
Background | The OpenCCG surface realizer is based on Steedman’s (2000) version of CCG elaborated with Baldridge and Kruijff’s multi-modal extensions for lexically specified derivation control (Baldridge, 2002; Baldridge and Kruijff, 2003) and hybrid logic dependency semantics (Baldridge and Kruijff, 2002).
Background | (2007) describe an ongoing effort to engineer a grammar from the CCGbank (Hockenmaier and Steedman, 2007) — a corpus of CCG derivations derived from the Penn Treebank — suitable for realization with OpenCCG. |
Background | Changes to the derivations are necessary to reflect the lexicalized treatment of coordination and punctuation assumed by the multi-modal version of CCG that is implemented in OpenCCG. |
Conclusion | We have introduced a novel type of supertagger, which we have dubbed a hypertagger, that assigns CCG category labels to elementary predications in a structured semantic representation with high accuracy at several levels of tagging ambiguity in a fashion reminiscent of (Bangalore and Rambow, 2000). |
Conclusion | We have also shown that, by integrating this hypertagger with a broad-coverage CCG chart realizer, considerably faster realization times are possible (approximately twice as fast as compared with a realizer that performs simple lexical lookups) with higher BLEU, METEOR and exact string match scores. |
Introduction | In lexicalized grammatical formalisms such as Lexicalized Tree Adjoining Grammar (Schabes et al., 1988, LTAG), Combinatory Categorial Grammar (Steedman, 2000, CCG ) and Head-Driven Phrase-Structure Grammar (Pollard and Sag, 1994, HPSG), it is possible to separate lexical category assignment — the assignment of informative syntactic categories to linguistic objects such as words or lexical predicates — from the combinatory processes that make use of such categories — such as parsing and surface realization. |
Introduction | combination thereof, as in the CCG parser in (Hockenmaier, 2003) or the chart realizer in (Carroll and Oepen, 2005). |
Introduction | Supertagging has been more recently extended to a multitagging paradigm in CCG (Clark, 2002; Curran et al., 2006), leading to extremely efficient parsing with state-of-the-art dependency recovery (Clark and Curran, 2007). |
The Approach | In the next section, we show that a supertagger for CCG realization, or hypertagger, can reduce the problem of search errors by focusing the search space on the most likely lexical categories. |
Abstract | The standard set of rules defined in Combinatory Categorial Grammar ( CCG ) fails to provide satisfactory analyses for a number of syntactic structures found in natural languages. |
Abstract | These structures can be analyzed elegantly by augmenting CCG with a class of rules based on the combinator D (Curry and Feys, 1958). |
Combinatory Categorial Grammar | CCG uses a universal set of syntactic rules based on the B, T, and S combinators of combinatory logic (Curry and Feys, 1958): (2) B: ((Bf)g)w = f(gw); T: (Tx)f = fx; S: ((Sf)g)w = fw(gw). CCG functors are functions over strings of symbols,
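These combinator equations translate directly into higher-order functions; a minimal Python rendering, checking each equation on small numeric examples:

```python
B = lambda f: lambda g: lambda w: f(g(w))      # B: ((Bf)g)w = f(gw)
T = lambda x: lambda f: f(x)                   # T: (Tx)f = fx
S = lambda f: lambda g: lambda w: f(w)(g(w))   # S: ((Sf)g)w = fw(gw)

inc = lambda n: n + 1
dbl = lambda n: 2 * n
add = lambda a: lambda b: a + b

assert B(inc)(dbl)(3) == inc(dbl(3)) == 7
assert T(3)(inc) == inc(3) == 4
assert S(add)(dbl)(3) == add(3)(dbl(3)) == 9
```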
Combinatory Categorial Grammar | The rules of this multimodal version of CCG (Baldridge, 2002; Baldridge and Kruijff, 2003) are derived as theorems of a Categorial Type Logic (CTL, Moortgat (1997)). |
Combinatory Categorial Grammar | This treats CCG as a compilation of CTL proofs, providing a principled, grammar-internal basis for restrictions on the CCG rules, transferring language-particular restrictions on rule application to the lexicon, and allowing the CCG rules to be viewed as grammatical universals (Baldridge and Kruijff, 2003; Steedman and Baldridge, To Appear). |
Introduction | Combinatory Categorial Grammar ( CCG , Steedman (2000)) is a compositional, semantically transparent formalism that is both linguistically expressive and computationally tractable. |
Introduction | A distinctive aspect of CCG is that it provides a very flexible notion of constituency. |
Introduction | that even with its flexibility, CCG as standardly defined is not permissive enough for certain linguistic constructions and greater incrementality. |
Background | We use these brackets to determine new gold-standard CCG derivations in Section 3. |
Background | Combinatory Categorial Grammar ( CCG ) (Steedman, 2000) is a type-driven, lexicalised theory of |
Background | This is an advantage of CCG , allowing it to recover long-range dependencies without the need for postprocessing, as is the case for many other parsers. |
Conversion Process | This section describes the process of converting the Vadas and Curran (2007a) data to CCG derivations. |
Introduction | CCGbank (Hockenmaier and Steedman, 2007) is the primary English corpus for Combinatory Categorial Grammar ( CCG ) (Steedman, 2000) and was created by a semiautomatic conversion from the Penn Treebank. |
Introduction | However, CCG is a binary branching grammar, and as such, cannot leave NP structure underspecified. |
Introduction | Phrase Structure with CCG
Abstract | CCG affords ways to augment treepath-based features to overcome these data sparsity issues. |
Abstract | By adding features over CCG word-word dependencies and lexicalized verbal subcategorization frames (“supertags”), we can obtain an F-score that is substantially better than a previous CCG-based SRL system and competitive with the current state of the art. |
Combinatory Categorial Grammar | Rather than using standard part-of-speech tags and grammatical rules, CCG encodes much of the combinatory potential of each word by assigning a syntactically informative category. |
Combinatory Categorial Grammar | Further, CCG has the advantage of a transparent interface between the way the words combine and their dependencies with other words. |
Introduction | Brutus uses the CCG parser of (Clark and Curran, 2007, henceforth the C&C parser), Charniak’s parser (Charniak, 2001) for additional CFG-based features, and the MALT parser (Nivre et al., 2007) for dependency features, while (Punyakanok et al., 2008) use results from an ensemble of parses from Charniak’s Parser and a Collins parser (Collins, 2003; Bikel, 2004).
Introduction | We do not employ a similar strategy due to the differing notions of constituency represented in our parsers ( CCG having a much more fluid notion of constituency and the MALT parser using a different approach entirely). |
Introduction | In the following, we briefly introduce the CCG grammatical formalism and motivate its use in SRL (Sections 2—3). |
Potential Advantages to using CCG | There are many potential advantages to using the CCG formalism in SRL. |
Abstract | Combinatory Categorial Grammar ( CCG ) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and cross-linguistic variation is isolated in the lexicon. |
Abstract | In this paper, we show that the weak generative capacity of this ‘pure’ form of CCG is strictly smaller than that of CCG with grammar-specific rules, and of other mildly context-sensitive grammar formalisms, including Tree Adjoining Grammar (TAG).
Abstract | Our result also carries over to a multi-modal extension of CCG . |
Introduction | Combinatory Categorial Grammar ( CCG ) (Steedman, 2001; Steedman and Baldridge, 2010) is an expressive grammar formalism with formal roots in combinatory logic (Curry et al., 1958) and links to the type-logical tradition of categorial grammar (Moortgat, 1997). |
Introduction | It is well-known that CCG can generate languages that are not context-free (which is necessary to capture natural languages), but can still be parsed in polynomial time. |
Introduction | Specifically, Vijay-Shanker and Weir (1994) identified a version of CCG that is weakly equivalent to Tree Adjoining Grammar (TAG) (Joshi and Schabes, 1997) and other mildly context-sensitive grammar formalisms, and can generate non-context-free languages such as a^n b^n c^n.
Abstract | The definition of combinatory categorial grammar ( CCG ) in the literature varies quite a bit from author to author. |
Abstract | However, the differences between the definitions are important in terms of the language classes of each CCG . |
Abstract | We prove that a wide range of CCGs are strongly context-free, including the CCG of CCGbank and of the parser of Clark and Curran (2007). |
Introduction | Combinatory categorial grammar ( CCG ) is a variant of categorial grammar which has attracted interest for both theoretical and practical reasons. |
Introduction | On the practical side, we have corpora with CCG derivations for each sentence (Hockenmaier and Steedman, 2007), a wide-coverage parser trained on that corpus (Clark and Curran, 2007) and a system for converting CCG derivations into semantic representations (Bos et al., 2004). |
Introduction | However, despite being treated as a single unified grammar formalism, each of these authors use variations of CCG which differ primarily on which combinators are included in the grammar and the restrictions that are put on them. |
Background and motivation | Formalisms like HPSG (Pollard and Sag, 1994), LFG (Kaplan and Bresnan, 1982), and CCG (Steedman, 2000) are linguistically motivated in the sense that they attempt to explain and predict the limited variation found in the grammars of natural languages. |
Background and motivation | Combinatory Categorial Grammar ( CCG ; Steedman, 2000) is a lexicalised grammar, which means that all grammatical dependencies are specified in the lexical entries and that the production of derivations is governed by a small set of rules. |
Background and motivation | A CCG grammar consists of a small number of schematic rules, called combinators. |
Combining CCGbank corrections | When Hockenmaier and Steedman (2002) went to acquire a CCG treebank from the PTB, this posed a problem. |
Combining CCGbank corrections | There is no equivalent way to leave these structures underspecified in CCG , because derivations must be binary branching. |
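The forced choice can be seen in a one-line binarizer: a flat PTB noun phrase must commit to some bracketing before it can receive a CCG derivation. The right-branching default below is an arbitrary illustrative choice, which is precisely the problem with converting underspecified structure:

```python
def binarize(words):
    """Right-branching binarization of a flat constituent."""
    if len(words) == 1:
        return words[0]
    if len(words) == 2:
        return (words[0], words[1])
    return (words[0], binarize(words[1:]))

# A flat three-word NP is forced into nested binary structure:
assert binarize(["crude", "oil", "prices"]) == ("crude", ("oil", "prices"))
```

Here the binarizer silently decides that "crude" modifies "oil prices" rather than "oil", exactly the bracketing decision the flat annotation avoids making.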
Combining CCGbank corrections | This distinction is represented in the surface syntax in CCG , because the category of a verb must specify its argument structure. |
Introduction | We then describe a novel CCG analysis of NP predicate-argument structure, which we implement using NomBank (Meyers et al., 2004). |
Introduction | We then train and evaluate a parser for these changes, to investigate their impact on the accuracy of a state-of-the-art statistical CCG parser.
Noun predicate-argument structure | 4.1 CCG analysis |
Parsing Evaluation | Some of the changes we have made correct problems that have caused the performance of a statistical CCG parser to be overestimated. |
Abstract | Besides using traditional dependency parsers, we also use the dependency structures transformed from PCFG trees and predicate-argument structures (PASs) which are generated by an HPSG parser and a CCG parser.
Experiments | In terms of root word comparison, we observe that MST and CCG share 87.3% identical root words, a consequence of borrowing roots from MST for CCG.
Experiments | For example, CCG is worse than Malt in terms of P/R yet with a higher BLEU score. |
Gaining Dependency Structures | 2.5 CCG parsing |
Gaining Dependency Structures | We also use the predicate-argument dependencies generated by the CCG parser developed by Clark and Curran (2007). |
Gaining Dependency Structures | The algorithm for generating a word-level dependency tree is simpler than processing the PASs included in the HPSG trees, since the word-level predicate-argument relations are already included in the output of the CCG parser.
Introduction | A semantic dependency representation of the whole sentence, predicate-argument structures (PASs), is also included in the output trees of (1) a state-of-the-art head-driven phrase structure grammar (HPSG) (Pollard and Sag, 1994; Sag et al., 2003) parser, Enju1 (Miyao and Tsujii, 2008), and (2) a state-of-the-art CCG parser2 (Clark and Curran, 2007).
Background | In all heavily lexicalised formalisms, such as LTAG, CCG , LFG and HPSG, the lexicon plays a key role in parsing.
Background | Originally described by Bangalore and Joshi (1994) for use in LTAG parsing, it has also been used very successfully for CCG (Clark, 2002).
Background | The supertags used in each formalism differ, being elementary trees in LTAG and CCG categories for CCG . |
Further Work | This would be similar to the CCG supertagging mechanism and is likely to give generous speedups at the possible expense of precision, but it would be illuminating to discover how this tradeoff plays out in our setup. |
Parser Restriction | This could be considered a form of supertagging as used in LTAG and CCG . |
Parser Restriction | While POS taggers such as TreeTagger are common, and some supertaggers are available, notably that of Clark and Curran (2007) for CCG , no standard supertagger exists for HPSG.
Discussion | To contrast, consider CCG (Steedman, 2000), in which semantic parsing is driven from the lexicon. |
Discussion | In DCS, we start with lexical triggers, which are more basic than CCG lexical entries. |
Discussion | (2008) induces first-order formulae using CCG in a small domain assuming observed lexical semantics. |
Experiments | Note that having lexical triggers is a much weaker requirement than having a CCG lexicon, and far easier to obtain than logical forms. |
Introduction | CCG is one instantiation (Steedman, 2000), which is used by many semantic parsers, e.g., Zettlemoyer and Collins (2005). |
Experiments | This difference is consistent with the result obtained by a shift-reduce CCG parser (Zhang and Clark, 2011). |
Experiments | CCG and HPSG parsers also favor the dependency-based metrics for evaluation (Clark and Curran, 2007b; Miyao and Tsujii, 2008).
Experiments | Previous work on Chinese CCG and HPSG parsing unanimously agrees that obtaining the deep analysis of Chinese is more challenging (Yu et al., 2011; Tse and Curran, 2012).
Related Work | CCG, HPSG, LFG and TAG, provides valuable, richer linguistic information, and researchers thus pay more and more attention to it.
Related Work | Phrase structure trees in CTB have been semiautomatically converted to deep derivations in the CCG (Tse and Curran, 2010), LFG (Guo et al., 2007), TAG (Xia, 2001) and HPSG (Yu et al., 2010) formalisms. |
Extending a Semantic Parser Using a Schema Alignment | Using a fixed CCG grammar and a procedure based on unification in second-order logic, UBL learns a lexicon A from the training data which includes entries like: |
Extending a Semantic Parser Using a Schema Alignment | Example CCG Grammar Rules |
Previous Work | technique does require manual specification of rules that construct CCG lexical entries from dependency parses. |
Previous Work | In comparison, we fully automate the process of constructing CCG lexical entries for the semantic parser by making it a prediction task. |
Analysis and Discussion | Given that with the more refined SVM ranker, the Berkeley parser worked nearly as well as all three parsers together using the complete feature set, the prospects for future work on a more realistic scenario using the OpenCCG parser in an SVM ranker for self-monitoring now appear much more promising, either using OpenCCG’s reimplementation of Hockenmaier & Steedman’s generative CCG model, or using the Berkeley parser trained on OpenCCG’s enhanced version of the CCGbank, along the lines of Fowler and Penn (2010).
Background | In the figure, nodes correspond to discourse referents labeled with lexical predicates, and dependency relations between nodes encode argument structure (gold standard CCG lexical categories are also shown); note that semantically empty function words such as infinitival-to are missing. |
Background | The model takes as its starting point two probabilistic models of syntax that have been developed for CCG parsing, Hockenmaier & Steedman’s (2002) generative model and Clark & Curran’s (2007) normal-form model.
Related Work | Approaches to surface realization have been developed for LFG, HPSG, and TAG, in addition to CCG , and recently statistical dependency-based approaches have been developed as well; see the report from the first surface realization shared task.
Experiments | 7The comparison is imperfect for two reasons: first, the CCGBank contains only 99.44% of the original PTB sentences (Hockenmaier and Steedman, 2007); second, because PropBank was annotated over CFGs, after converting to CCG only 99.977% of the argument spans were exact matches (Boxwell and White, 2008).
Experiments | (2011) (B’11) uses additional supervision in the form of a CCG tag dictionary derived from supervised data with (tdc) and without (tc) a cutoff. |
Related Work | (2011) describe a method for training a semantic role labeler by extracting features from a packed CCG parse chart, where the parse weights are given by a simple ruleset. |
Related Work | (2011) require an oracle CCG tag dictionary extracted from a treebank. |
Experiment Setup 4.1 Corpus | SCF and DR: These more linguistically informed features are constructed based on the grammatical relations generated by the C&C CCG parser (Clark and Curran, 2007). |
Experiment Setup 4.1 Corpus | We extract features on the basis of the output generated by the C&C CCG parser. |
Results and Discussion | One explanation for the poor performance could be that we use all the frames generated by the CCG parser in our experiment. |
Results and Discussion | To see if Levin-selected SCFs are more effective for AVC, we match each SCF generated by the C&C CCG parser (CCG-SCF) to one of 78 Levin-defined SCFs, and refer to the resulting SCF set as unfiltered-Levin-SCF. |
A graph-based representation for LCFRS productions | If x1 and x2 do not occur both in the same string y1 or y2, then we say that there is a gap between x1 and x2.
A graph-based representation for LCFRS productions | If x1 <p x2 and there is no gap between x1 and x2, then we write [x1, x2] to denote the set {x1, x2} ∪ {x | x ∈ Vp, x1 <p x <p x2}.
A graph-based representation for LCFRS productions | Note that the first condition means that r and r' are disjoint sets and, for any pair of vertices x ∈ r and x' ∈ r', either there is a gap between x and x' or else there exists some x2 ∈ Vp such that x <p x2 <p x' and x2 ∉ r ∪ r'.
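The gap condition from these definitions can be sketched directly over the component strings of a fan-out-2 production (the names and list representation here are illustrative):

```python
def has_gap(components, a, b):
    """A gap exists between symbols a and b iff they do not occur
    together in the same component string (y1 or y2)."""
    return not any(a in comp and b in comp for comp in components)

# Fan-out-2 production with components y1 = x1 x3 and y2 = x2:
y1, y2 = ["x1", "x3"], ["x2"]
assert has_gap([y1, y2], "x1", "x2")        # split across components
assert not has_gap([y1, y2], "x1", "x3")    # same component, no gap
```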