Experimental Results | We also show that the combination of unlexicalized, open extraction in O-CRF and lexicalized, supervised extraction in R1-CRF improves precision and F-measure compared to a standalone RE system. |
Experimental Results | The lexicalized R1 -CRF extractor is able to recover from this error; the presence of the word “Acquire” is enough to recog- |
Experimental Results | We found that while RESOLVER improves the relative recall of O-CRF by nearly 50%, O-CRF locates fewer synonyms per relation compared to its lexicalized counterpart. |
Hybrid Relation Extraction | We now describe an ensemble-based or hybrid approach to RE that leverages the different views offered by open, self-supervised extraction in O-CRF, and lexicalized, supervised extraction in R1-CRF. |
Introduction | The relationship between standard RE systems and the new Open IE paradigm is analogous to the relationship between lexicalized and unlexicalized parsers. |
Introduction | Statistical parsers are usually lexicalized (i.e. |
Introduction | In this paper, we examine the tradeoffs between relation-specific ("lexicalized") extraction and relation-independent ("unlexicalized") extraction and reach an analogous conclusion. |
Relation Extraction | To compare the behavior of open, or “unlexicalized,” extraction to relation-specific, or “lexicalized” extraction, we developed a CRF-based extractor under the traditional RE paradigm. |
Abstract | Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. |
Abstract | Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder. |
Experiment Results | In all experiments we use phrase-orientation lexicalized reordering (Galley and Manning, 2008), which models monotone, swap, and discontinuous orientations with respect to both the previous and the next phrase pair. |
Introduction | Most phrase-based systems are equipped with a distance reordering cost feature to tune the system towards the right amount of reordering, but then also a lexicalized reordering |
Introduction | It does not have the expressive lexicalized reordering model and distance cost features of the phrase-based system. |
Introduction | If we look at the leaves of a Hiero derivation tree, the lexicals also form a segmentation of the source and target sentence, thus also form a discontinuous phrase-based translation path. |
Phrasal-Hiero Model | 2.2 Training: Lexicalized Reordering Table |
Phrasal-Hiero Model | Phrasal-Hiero needs a phrase-based lexicalized reordering table to calculate the features. |
Phrasal-Hiero Model | The lexicalized reordering table could be from a discontinuous phrase-based system. |
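The orientation classes these snippets refer to (monotone, swap, discontinuous) can be sketched as a simple classification of adjacent phrase-pair source spans. This is an illustrative Python sketch under that reading, not the papers' implementation:

```python
# Sketch: classify the orientation of a phrase pair relative to the
# previously translated one, as in phrase-orientation lexicalized
# reordering models (monotone / swap / discontinuous).

def orientation(prev_src_span, cur_src_span):
    """Spans are (start, end) source indices, end exclusive."""
    if cur_src_span[0] == prev_src_span[1]:
        return "monotone"        # current phrase directly follows the previous one
    if cur_src_span[1] == prev_src_span[0]:
        return "swap"            # current phrase directly precedes the previous one
    return "discontinuous"       # a gap or longer-distance jump

print(orientation((0, 2), (2, 5)))  # monotone
print(orientation((3, 5), (1, 3)))  # swap
print(orientation((0, 2), (4, 6)))  # discontinuous
```

Counting these orientations per phrase pair over word-aligned training data is what populates a lexicalized reordering table.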
Abstract | Recently, it was shown (Kuhlmann and Satta: Tree-adjoining grammars are not closed under strong lexicalization. |
Abstract | Thus, simple context-free tree grammars strongly lexicalize tree adjoining grammars and themselves. |
Introduction | A good overview of TAG, their formal properties, their linguistic motivation, and their applications is presented by Joshi and Schabes (1992) and Joshi and Schabes (1997), in which strong lexicalization is also discussed. |
Introduction | In general, lexicalization is the process of transforming a grammar into an equivalent one (potentially expressed in another formalism) such that each production contains a lexical item (or anchor). |
Introduction | alphabet, each production of a lexicalized grammar produces at least one letter of the generated string. |
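The definition above — every production of a lexicalized grammar contains a lexical item (anchor) — can be checked mechanically. A minimal sketch, with an invented production representation:

```python
# Sketch of the definition above: a grammar is lexicalized if every
# production contains at least one lexical item (terminal anchor).
# Productions are (lhs, rhs) pairs with rhs a list of symbols; 'terminals'
# is the alphabet. These names are illustrative, not from the paper.

def is_lexicalized(productions, terminals):
    return all(any(sym in terminals for sym in rhs)
               for _lhs, rhs in productions)

g1 = [("S", ["NP", "loves", "NP"]), ("NP", ["John"])]
g2 = [("S", ["NP", "VP"]), ("VP", ["loves", "NP"]), ("NP", ["John"])]
terms = {"loves", "John"}
print(is_lexicalized(g1, terms))  # True
print(is_lexicalized(g2, terms))  # False: S -> NP VP has no anchor
```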
Computational Complexity | We study in this section the complexity of several decision problems on MLIGs, prominently of emptiness and membership problems, in the general (Section 4.2), k-bounded (Section 4.3), and lexicalized cases (Section 4.4). |
Computational Complexity | 4.4 Lexicalized Case |
Introduction | 2. the effects of two linguistically motivated restrictions on such formalisms, lexicalization and boundedness/rankedness. |
Multiset-Valued Linear Indexed Grammars | Two restrictions on dominance links have been suggested in an attempt to reduce their complexity, sometimes in conjunction: lexicalization and k-boundedness. |
Multiset-Valued Linear Indexed Grammars | We can combine the two restrictions, thus defining the class of k-bounded lexicalized MLIGs. |
Multiset-Valued Linear Indexed Grammars | Lexicalization. Lexicalization in UVG-dls reflects the strong dependence between syntactic constructions (vectors of productions representing an extended domain of locality) and lexical anchors. |
Related Formalisms | Adding a terminal symbol in each production would result in a lexicalized grammar, still with a non-semilinear language. |
Related Formalisms | Lexicalization has now its usual definition: for every vector ({pi,1, . |
Abstract | This paper describes how external resources can be used to improve parser performance for heavily lexicalised grammars, looking at both robustness and efficiency. |
Background | In all heavily lexicalised formalisms, such as LTAG, CCG, LFG and HPSG, the lexicon plays a key role in parsing. |
Conclusion | The work reported here shows the benefits that can be gained by utilising external resources to annotate parser input in highly lexicalised grammar formalisms. |
Conclusion | Even something as simple and readily available (for languages likely to have lexicalised grammars) as a POS tagger can massively increase the parser coverage on unseen text. |
Introduction | Heavily lexicalised grammars have been used in applications such as machine translation and information extraction because they can produce semantic structures which provide more information than less informed parsers. |
Introduction | of Lexicalised Grammars |
Introduction | improving parser performance in these two areas, by annotating the input given to one such deep parser, the PET parser (Callmeier, 2000), which uses lexicalised grammars developed under the HPSG formalism (Pollard and Sag, 1994). |
Parser Restriction | ing a deep parser with a lexicalised grammar are the precision and depth of the analysis produced, but this depth comes from making many fine distinctions which greatly increases the parser search space, making parsing slow. |
Parser Restriction | Increasing efficiency is important for enabling these heavily lexicalised grammars to bring the benefits of their deep analyses to applications, but simi- |
Unknown Word Handling | These results show very clearly one of the potential drawbacks of using a highly lexicalised grammar formalism like HPSG: unknown words are one of the main causes of parse failure, as quantified in Baldwin et al. |
Annotations | Annotation (Dev, len ≤ 40) — v=0, h=0: 90.1; v=1, h=0: 90.5; v=0, h=1: 90.2; v=1, h=1: 90.9; Lexicalized: 90.3 |
Annotations | Another commonly-used kind of structural annotation is lexicalization (Eisner, 1996; Collins, 1997; Charniak, 1997). |
Annotations | Table 2 shows results from lexicalizing the X-bar grammar; it provides meager improvements. |
Features | Because heads of constituents are often at the beginning or the end of a span, these feature templates can (noisily) capture monolexical properties of heads without having to incur the inferential cost of lexicalized annotations. |
Introduction | For example, head lexicalization (Eisner, 1996; Collins, 1997; Charniak, 1997), structural annotation (Johnson, 1998; Klein and Manning, 2003), and state-splitting (Matsuzaki et al., 2005; Petrov et al., 2006) are all designed to take coarse symbols like PP and decorate them with additional context. |
Other Languages | Historically, many annotation schemes for parsers have required language-specific engineering: for example, lexicalized parsers require a set of head rules and manually-annotated grammars require detailed analysis of the treebank itself (Klein and Manning, 2003). |
Parsing Model | Hall and Klein (2012) employed both kinds of annotations, along with lexicalized head word annotation. |
Sentiment Analysis | Our features can also lexicalize on other discourse connectives such as but or however, which often occur at the split point between two spans. |
Experiments | Table 1: # of rules used in the testing (d = 4, h = 6) (BP: bilingual phrase (used in Moses), TR: tree rule (only 1 tree), TSR: tree sequence rule (> 1 tree), L: fully lexicalized, P: partially lexicalized, U: unlexicalized) |
Experiments | lexicalized rules), in which the lexicalized TSRs model all non-syntactic phrase pairs with rich syntactic information. |
Experiments | It suggests that they are complementary to each other since the lexicalized TSRs are used to model non-syntactic phrases while the other two kinds of TSRs can generalize the lexicalized rules to unseen phrases. |
Related Work | (2007) integrate supertags (a kind of lexicalized syntactic description) into the target side of translation model and language mod- |
Related Work | (2006) treat all bilingual phrases as lexicalized tree-to-string rules, including those non-syntactic phrases in training corpus. |
Rule Extraction | We first generate all fully lexicalized source and target tree sequences using a dynamic programming algorithm and then iterate over all generated source and |
Tree Sequence Alignment Model | In addition, we define two new features: 1) the number of lexical words in a rule to control the model’s preference for lexicalized rules over unlexicalized |
BabelNet | Importantly, each vertex v ∈ V contains a set of lexicalizations of the concept for different languages, e.g. |
BabelNet | We call the resulting set of multilingual lexicalizations of a given concept a babel synset. |
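A babel synset as described here is, in essence, a per-language map of lexicalizations of one concept. An illustrative sketch (the class and the example lemmas are invented, not BabelNet's actual API or data):

```python
# Sketch of a babel synset: one concept, a set of lexicalizations
# per language. Field names and example data are invented.
from collections import defaultdict

class BabelSynset:
    def __init__(self):
        self.lexicalizations = defaultdict(set)  # language -> lexicalizations

    def add(self, lang, lemma):
        self.lexicalizations[lang].add(lemma)

    def senses(self, lang):
        return sorted(self.lexicalizations[lang])

balloon = BabelSynset()
balloon.add("en", "balloon")
balloon.add("de", "Ballon")
balloon.add("it", "pallone")
balloon.add("it", "palloncino")
print(balloon.senses("it"))  # ['palloncino', 'pallone']
```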
BabelNet | An overview of BabelNet is given in Figure 1 (we label vertices with English lexicalizations): unlabeled edges are obtained from links in the Wikipedia pages (e.g. |
Experiment 2: Translation Evaluation | However, it does not say anything about the precision of the additional lexicalizations provided by BabelNet. |
Experiment 2: Translation Evaluation | those mapped with our method illustrated in Section 3.2), 200 synsets whose lexicalizations exist in Wikipedia only. |
Experiment 2: Translation Evaluation | lexicalizations) were appropriate given the corresponding WordNet gloss and/or Wikipage. |
Methodology | By repeating this step for each English lexicalization in a babel synset, we obtain a collection of sentences for the babel synset (see left part of Figure 1). |
Methodology | Note that we had no translation for Catalan and French in the first phase, because the inter-language link was not available, and we also obtain new lexicalizations for the Spanish and Italian languages. |
Introduction | Second, by replacing the syntactic features with an approximation based on POS tags, we achieve state-of-the-art performance without relying on error-prone unlexicalized or domain-specific lexicalized parsers. |
Methodology | Table 2 shows the three variations we tested: the simple tGR type, with parameterization for the POS tags of head and dependent, and with closed-class POS tags (determiners, pronouns and prepositions) lexicalized. |
Methodology | An unlexicalized parser cannot distinguish these based just on POS tags, while a lexicalized parser requires a large treebank. |
Methodology | As with tGRs, the closed-class tags can be lexicalized, but there are no corresponding feature sets for param (since they are already built from POS tags) or lim (since there is no similar rule-based approach). |
Previous work | The BioLexicon system extracts each verb instance’s GRs using the lexicalized Enju parser tuned to the biomedical domain (Miyao, 2005). |
Previous work | The BioLexicon system induces its SCF inventory automatically, but requires a lexicalized parsing model, rendering it more sensitive to domain variation. |
Results | Third, lexicalizing the closed-class POS tags introduces semantic information outside the scope of the alternation-based definition of subcategorization. |
Comparison to BabySRL | The difference in transitive settings stems from increased lexicalization, as is apparent from their results alone; the model presented here initially performs close to their weakly lexicalized model, though training impedes agent-prediction accuracy due to an increased probability of non-canonical objects. |
Comparison to BabySRL | In sum, the unlexicalized model presented in this paper is able to achieve greater labelling accuracy than the lexicalized BabySRL models in intransitive settings, though this model does perform slightly worse in the less common transitive setting. |
Discussion | This could also be an area where a lexicalized model could do better. |
Discussion | In future, it would be interesting to incorporate lexicalization into the model presented in this paper, as this feature seems likely to bridge the gap between this model and BabySRL in transitive settings. |
Discussion | Lexicalization should also help further distinguish modifiers from arguments and improve the overall accuracy of the model. |
Evaluation | Since the model is not lexicalized, these roles correspond to the semantic roles most commonly associated with subject and object. |
Abstract | Our model is purely lexicalized and can be integrated into any MT decoder. |
Introduction | In this paper we use a basic neural network architecture and a lexicalized probability model to create a powerful MT decoding feature. |
Model Variations | Although there has been a substantial amount of past work in lexicalized joint models (Marino et al., 2006; Crego and Yvon, 2010), nearly all of these papers have used older statistical techniques such as Kneser-Ney or Maximum Entropy. |
Model Variations | Le’s model also uses minimal phrases rather than being purely lexicalized , which has two main downsides: (a) a number of complex, handcrafted heuristics are required to define phrase boundaries, which may not transfer well to new languages, (b) the effective vocabulary size is much larger, which substantially increases data sparsity issues. |
Model Variations | The model is purely lexicalized, which avoids both data sparsity and implementation complexity. |
Abstract | Combinatory Categorial Grammar (CCG) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and cross-linguistic variation is isolated in the lexicon. |
Combinatory Categorial Grammar | This is what makes pure CCG a lexicalized grammar formalism (Steedman and Baldridge, 2010). |
Conclusion | This means that these formalisms cannot be fully lexicalized , in the sense that certain languages can only be described by selecting language-specific rules. |
Introduction | This shows that the generative capacity of at least first-order CCG crucially relies on its ability to restrict rule instantiations, and is at odds with the general conception of CCG as a fully lexicalized formalism, in which all grammars use one and the same set of universal rules. |
Introduction | This means that word order in CCG cannot be fully lexicalized with the current formal tools; some ordering constraints must be specified via language-specific combination rules and not in lexicon entries. |
Conclusion and Future Work | Lexicalized PCFGs (where head words annotate phrasal nodes) have proved a key tool for high-performance PCFG parsing, but their performance is hampered by the sparse lexical dependencies exhibited in the Penn Treebank. |
Experiment Setup 4.1 Corpus | We first build a lexicalized frame for the verb break: NP1(he)-V-NP2(door)-PP(with:hammer). |
Experiment Setup 4.1 Corpus | Based on the lexicalized frame, we construct an SCF NP1-NP2-PPwith for break. |
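The frame-to-SCF step described above can be sketched as dropping the lexical heads while retaining prepositions. The frame encoding below is invented for illustration; it is not the paper's data format:

```python
# Sketch: collapse a lexicalized frame into a subcategorization frame (SCF)
# by dropping lexical heads, keeping the preposition of each PP slot.

def frame_to_scf(frame):
    """frame: list of (slot, head) pairs, e.g. ('PP', 'with:hammer')."""
    slots = []
    for slot, head in frame:
        if slot == "V":
            continue  # the verb itself is not part of the SCF label
        if slot == "PP":
            prep = head.split(":")[0]
            slots.append("PP" + prep)  # keep the preposition, drop the noun
        else:
            slots.append(slot)  # drop the lexical head entirely
    return "-".join(slots)

frame = [("NP1", "he"), ("V", "break"), ("NP2", "door"), ("PP", "with:hammer")]
print(frame_to_scf(frame))  # NP1-NP2-PPwith
```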
Integration of Syntactic and Lexical Information | Dependency relation (DR): Our way to overcome data sparsity is to break lexicalized frames into lexicalized slots (a.k.a. |
Related Work | Lexicalized frames are usually obtained |
Architecture of BRAINSUP | A beam search in the space of all possible lexicalizations of a syntactic pattern promotes the words with the highest likelihood of satisfying the user specification. |
Architecture of BRAINSUP | With the compatible patterns selected, we can initiate a beam search in the space of all possible lexicalizations of the patterns, i.e., the space of all sentences that can be generated by respecting the syntactic constraints encoded by each pattern. |
Architecture of BRAINSUP | Figure 2: A partially lexicalized sentence with a highlighted empty slot marked with X. |
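The beam search over lexicalizations of a pattern described above can be sketched as filling empty slots left-to-right while keeping the top-k partial sentences under a scoring function. The scorer and candidate words below are invented placeholders, not BRAINSUP's actual models:

```python
# Sketch: beam search over lexicalizations of a syntactic pattern.
# Each empty slot (here, a POS label) is filled in turn; only the
# top-scoring partial lexicalizations survive to the next step.

def beam_lexicalize(slots, candidates, score, beam_size=2):
    """slots: POS labels to fill; candidates: POS -> word list."""
    beam = [([], 0.0)]  # (partial word sequence, cumulative score)
    for pos in slots:
        expanded = [(words + [w], s + score(w, words))
                    for words, s in beam
                    for w in candidates[pos]]
        expanded.sort(key=lambda h: h[1], reverse=True)
        beam = expanded[:beam_size]  # prune to the beam width
    return beam[0][0]

cands = {"JJ": ["bright", "dull"], "NN": ["idea", "rock"]}
score = lambda w, ctx: {"bright": 2, "idea": 2}.get(w, 0)
print(beam_lexicalize(["JJ", "NN"], cands, score))  # ['bright', 'idea']
```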
A Latent Variable Parser | Lexicalization has been shown to be useful in more general parsing applications due to lexical dependencies in constituent parsing (e.g. |
A Latent Variable Parser | However, topological fields explain a higher level of structure pertaining to clause-level word order, and we hypothesize that lexicalization is unlikely to be helpful. |
Experiments | We hypothesized earlier that lexicalization is unlikely to give us much improvement in performance, because topological fields work on a domain that is higher than that of lexical dependencies such as subcategorization frames. |
Experiments | However, given the locally independent nature of legitimate parentheticals, a limited form of lexicalization or some other form of stronger contextual information might be needed to improve identification performance. |
Introduction | In Dubey and Keller (2003), PCFG parsing of NEGRA is improved by using sister-head dependencies, which outperforms standard head lexicalization as well as an unlexicalized model. |
Elementary Trees to String Grammar | The frontier nodes can be merged, lexicalized, or even deleted in the tree-to-string rule associated with γ, as long as the alignments for the nonterminals are book-kept in the derivations. |
Elementary Trees to String Grammar | The NP tree in Figure 2 happens to be an "inside-out" style alignment, and a context-free grammar such as ITG (Wu, 1997) cannot explain this structure well without necessary lexicalization. |
Elementary Trees to String Grammar | With lexicalization, a Hiero-style rule "de X Aly AlAnfjAr → to ignite X" is potentially a better alternative for translating the NP tree. |
The Projectable Structures | The transformations could be as simple as merging two adjacent nonterminals into one bracket to accommodate non-contiguity on the target side, or lexicalizing those words which have fork-style, many-to-many alignment, or unaligned content words to enable the rest of the span to be generalized into nonterminals. |
The Projectable Structures | It should be easier to model the swapping of (NOUN ADJ) using the tree (NP NOUN, ADJ) instead of the original bigger tree of (NP-SBJ Azmp, NOUN, ADJ) with one lexicalized node. |
Building a Discourse Parser | [Figure: discourse parsing pipeline — EDUs and syntax trees from syntax parsing (Charniak's parser) pass through tokenization and lexicalization (SPADE), yielding tokenized EDUs and lexicalized syntax trees.] |
Abstract | In lexicalized grammatical formalisms, it is possible to separate lexical category assignment from the combinatory processes that make use of such categories, such as parsing and realization. |
Background | Changes to the derivations are necessary to reflect the lexicalized treatment of coordination and punctuation assumed by the multi-modal version of CCG that is implemented in OpenCCG. |
Introduction | In lexicalized grammatical formalisms such as Lexicalized Tree Adjoining Grammar (Schabes et al., 1988, LTAG), Combinatory Categorial Grammar (Steedman, 2000, CCG) and Head-Driven Phrase-Structure Grammar (Pollard and Sag, 1994, HPSG), it is possible to separate lexical category assignment — the assignment of informative syntactic categories to linguistic objects such as words or lexical predicates — from the combinatory processes that make use of such categories — such as parsing and surface realization. |
The Approach | Our implementation makes use of three general types of features: lexicalized features, which are simply the names of the parent and child elementary predication nodes, graph structural features, such as the total number of edges emanating from a node, the number of argument and non-argument dependents, and the names of the relations of the dependent nodes to the parent node, and syntactico-semantic attributes of nodes, such as the tense and number. |
Abstract | We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. |
Background | Lexicalised grammars typically contain a much smaller set of rules than phrase-structure grammars, relying on tags (supertags) that contain a more detailed description of each word’s role in the sentence. |
Background | Figure 1 gives two sentences and their CCG derivations, showing how some of the syntactic ambiguity is transferred to the supertagging component in a lexicalised grammar. |
Introduction | Parsing with lexicalised grammar formalisms, such as Lexicalised Tree Adjoining Grammar and Combinatory Categorial Grammar (CCG; Steedman, 2000), can be made more efficient using a supertagger. |
Generation Systems | This set subdivides into non-lexicalized and lexicalized transformations. |
Generation Systems | Most transformation rules (335 out of 374 on average) are lexicalized for a specific verb lemma and mostly transform nominalizations as in rule (4-b) and particles (see Section 3.2). |
Introduction | Applying a strictly sequential pipeline on our data, we observe incoherent system output that is related to an interaction of generation levels, very similar to the interleaving between sentence planning and lexicalization in Example (1). |
The Data Set | Nominalizations are mapped to their verbal base forms on the basis of lexicalized rules for the nominalized lemmas observed in the corpus. |
Abstract | Natural language parsing has typically been done with small sets of discrete categories such as NP and VP, but this representation does not capture the full syntactic nor semantic richness of linguistic phrases, and attempts to improve on this by lexicalizing phrases or splitting categories only partly address the problem at the cost of huge feature spaces and sparseness. |
Introduction | Second, lexicalized parsers (Collins, 2003; Charniak, 2000) associate each category with a lexical item. |
Introduction | However, this approach necessitates complex shrinkage estimation schemes to deal with the sparsity of observations of the lexicalized categories. |
Introduction | Another approach is lexicalized parsers (Collins, 2003; Charniak, 2000) that describe each category with a lexical item, usually the head word. |
Comparative Study | 4.1 Moses lexicalized reordering model |
Comparative Study | Figure 2: lexicalized reordering model illustration. |
Comparative Study | Our implementation is the same as the default behavior of the Moses lexicalized reordering model. |
Introduction | The classifier can be trained with maximum likelihood like Moses lexicalized reordering (Koehn et al., 2007) and hierarchical lexicalized reordering model (Galley and Manning, 2008) or be trained under maximum entropy framework (Zens and Ney, 2006). |
Experiments | Figure 4: Coverage of lexicalized STSG rules on bilingual phrases. |
Experiments | We extracted phrase pairs from the training data to investigate how many phrase pairs can be captured by lexicalized tree-to-tree rules that contain only terminals. |
Rule Extraction | If the tree contains only one node, it must be a lexicalized frontier node; |
Rule Extraction | If the tree contains more than one node, its leaves are either non-lexicalized frontier nodes or lexicalized non-frontier nodes. |
Introduction | We define and automatically extract a lexicalized approximation of the latter. |
Verb Classification Models | (ii) The second type of tree uses lexicals as central nodes on which both GR and POS-Tag are added as the rightmost children. |
Verb Classification Models | For LEX type, we apply a lexical similarity learned with LSA to only pairs of lexicals associated with the same POS-Tag. |
Conclusions | Implicit) or overt, lexicalized as discourse connectives (i.e. |
The PDTB annotation scheme | • AltLex: when insertion of a connective leads to redundancy due to the presence of an alternatively lexicalized expression, as in (2). |
The PDTB annotation scheme | • NoRel: when neither a lexicalized discourse relation nor entity-based coherence is present. |
Evaluation | Nevertheless, as it does not have a lexicalized strategy, it is not able to filter out incorrect candidates; the precision is therefore very low (the worst). |
Multiword expressions | They are often divided into two main classes: multiword expressions defined through linguistic idiomaticity criteria (lexicalized phrases in the terminology of Sag et al. |
Multiword expressions | They used a Tree Substitution Grammar instead of a Probabilistic Context-free Grammar (PCFG) with latent annotations in order to capture lexicalized rules as well as general rules. |
Adaptive Online MT | For example, simple indicator features like lexicalized reordering classes are potentially useful yet bloat the feature set and, in the worst case, can negatively impact |
Experiments | The baseline “dense” model contains 19 features: the nine Moses baseline features, the hierarchical lexicalized reordering model of Galley and Manning (2008), the (log) count of each rule, and an indicator for unique rules. |
Experiments | Discriminative reordering (LO): indicators for eight lexicalized reordering classes, including the six standard monotone/swap/discontinuous classes plus the two simpler Moses monotone/non-monotone classes. |
Applications of Creative Retrieval | The Google ngrams can be seen as a lexicalized idea space, embedded within a larger sea of noise. |
Applications of Creative Retrieval | Each creative query is a jumping off point in a space of lexicalized ideas that is implied by a large corpus, with each successive match leading the user deeper into the space. |
Related Work and Ideas | While some techniques may suggest conventional metaphors that have become lexicalized in a language, they are unlikely to identify relatively novel expressions. |
CCG and Supertagging | CCG is a lexicalized grammar formalism encoding for each word lexical categories that are either basic (e.g. |
Conclusion and Future Work | Though we have focused on CCG in this work we expect these methods to be equally useful for other linguistically motivated but computationally complex formalisms such as lexicalized tree adjoining grammar. |
Integrated Supertagging and Parsing | (2010) and lexicalized CFG parsing by Rush et al. |
Evaluation | Of especial interest are deep lexicalized rules such as |
Introduction | One approach is to use word alignments (where these can be reliably estimated, as in our testbed application) to align subtrees and extract rules (Och and Ney, 2004; Galley et al., 2004) but this leaves open the question of finding the right level of generality of the rules — how deep the rules should be and how much lexicalization they should involve — necessitating resorting to heuristics such as minimality of rules, and leading to |
The STSG Model | Second, the ability to have rules deeper than one level provides a principled way of modeling lexicalization, whose importance has been emphasized (Galley and McKeown, 2007; Yamangil and Nelken, 2008). |
DNN for word alignment | For the distortion td, we could use a lexicalized distortion model: |
DNN for word alignment | But we found in our initial experiments on small-scale data that lexicalized distortion does not produce better alignments than the simple jump-distance based model. |
DNN for word alignment | So we drop the lexicalized |
Abstract | Like TextRunner, WOE’s extractor eschews lexicalized features and handles an unbounded set of semantic relations. |
Conclusion | We are also interested in merging lexicalized and open extraction methods; the use of some domain-specific lexical features might help to improve WOE’s practical performance, but the best way to do this is unclear. |
Related Work | First, Jiang and Zhai's results are tested for traditional IE, where local lexicalized tokens might contain sufficient information to trigger a correct classification. |
Abstract | By adding features over CCG word-word dependencies and lexicalized verbal subcategorization frames (“supertags”), we can obtain an F-score that is substantially better than a previous CCG-based SRL system and competitive with the current state of the art. |
Potential Advantages to using CCG | Another advantage of a CCG-based approach (and lexicalist approaches in general) is the ability to encode verb-specific argument mappings. |
Results | We argue that, especially in the heavily lexicalized CCG framework, headword evaluation is more appropriate, reflecting the emphasis on headword combinatorics in the CCG formalism. |
Conclusions and Future Work | First, we defined two simple discourse-aware similarity metrics (lexicalized and unlexicalized), which use the all-subtree kernel to compute similarity between discourse parse trees in accordance with the Rhetorical Structure Theory. |
Experimental Results | As expected, DR-LEX performs better than DR since it is lexicalized (at the unigram level), and also gives partial credit to correct structures. |
Our Discourse-Based Measures | We experiment with TKs applied to two different representations of the discourse tree: non-lexicalized (DR), and lexicalized (DR-LEX). |
Generating from the KBGen Knowledge-Base | To generate from the KBGen data, we induce a Feature-Based Lexicalised Tree Adjoining Grammar (FB-LTAG; Vijay-Shanker and Joshi, 1988) augmented with a unification-based semantics (Gardent and Kallmeyer, 2003) from the training data. |
Generating from the KBGen Knowledge-Base | 4.1 Feature-Based Lexicalised Tree Adjoining Grammar |
Generating from the KBGen Knowledge-Base | To extract a Feature-Based Lexicalised Tree Adjoining Grammar (FB-LTAG) from the KBGen data, we parse the sentences of the training corpus; project the entity and event variables to the syntactic projection of the strings they are aligned with; and extract the elementary trees of the resulting FB-LTAG from the parse tree using semantic information. |
Fine-grained rule extraction | Head-driven phrase structure grammar (HPSG) is a lexicalist grammar framework. |
Fine-grained rule extraction | Based on TC, we can easily build a tree-to-string translation rule by further completing the right-hand-side string by sorting the spans of Tc’s leaf nodes, lexicalizing the terminal node’s span(s), and assigning a variable to each nonterminal node’s span. |
Related Work | Two kinds of supertags, from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar (CCG), have been used as lexical syntactic descriptions (Hassan et al., 2007) for phrase-based SMT (Koehn et al., 2007). |
Introduction | While recent work has introduced increasingly powerful features (Feng and Hirst, 2012) and inference techniques (Joty et al., 2013), discourse relations remain hard to detect, due in part to a long tail of "alternative lexicalizations" that can be used to realize each relation (Prasad et al., 2010). |
Model | (2010) show that there is a long tail of alternative lexicalizations for discourse relations in the Penn Discourse Treebank, posing obvious challenges for approaches based on directly matching lexical features observed in the training data. |
Related Work | Prior learning-based work has largely focused on lexical, syntactic, and structural features, but the close relationship between discourse structure and semantics (Forbes-Riley et al., 2006) suggests that shallow feature sets may struggle to capture the long tail of alternative lexicalizations that can be used to realize discourse relations (Prasad et al., 2010; Marcu and Echihabi, 2002). |
Discussion | More generally, the parser used in these evaluations differs from other reported parsers in that it is not lexicalized. |
Discussion | However, we see that this language model performs well despite its lack of lexicalization . |
Discussion | This indicates that lexicalization is not a requisite part of syntactic parser performance with respect to predicting linguistic complexity, corroborating the evidence of Demberg and Keller’s (2008) ‘unlexicalized’ (POS-generating, not word-generating) parser. |