Approaches | We create binary indicator features for each model using feature templates. |
Approaches | Our feature template definitions build on those used by the top-performing systems in the CoNLL-2009 Shared Task (Zhao et al., 2009). |
Approaches | Template Creation. Feature templates are defined over triples of (property, positions, order). |
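Approaches | For concreteness, below is a minimal sketch of instantiating one such (property, positions, order) triple as a binary indicator feature; the data layout, property names, and padding token are illustrative assumptions, not the paper's code. |

```python
# A minimal sketch of instantiating one (property, positions, order) triple
# as a binary indicator feature; data layout and names are assumptions.
def instantiate(sentence, template, anchor):
    prop, offsets, ordered = template
    values = []
    for off in offsets:
        i = anchor + off
        values.append(sentence[i][prop] if 0 <= i < len(sentence) else "<PAD>")
    if not ordered:               # unordered templates ignore position order
        values = sorted(values)
    # The returned string names one binary indicator feature: it has value 1
    # for inputs that produce this exact string and 0 otherwise.
    return f"{prop}@{offsets}={'_'.join(values)}"

sent = [{"form": "dogs", "pos": "NNS"}, {"form": "bark", "pos": "VBP"}]
print(instantiate(sent, ("pos", (-1, 0), True), anchor=1))  # pos@(-1, 0)=NNS_VBP
```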
Experiments | 4.2 Feature Template Sets |
Experiments | Further, the best-performing low-resource features found in this work are those based on coarse feature templates and selected by information gain. |
Experiments | #FT indicates the number of feature templates used (unigrams+bigrams). |
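Experiments | A minimal sketch of the selection criterion mentioned above, scoring one template by information gain over labeled training data; the `instances`, `labels`, and `extract` names are hypothetical. |

```python
import math
from collections import Counter

# A minimal sketch of scoring one feature template by information gain:
# IG(template) = H(labels) - H(labels | template value).
def information_gain(instances, labels, extract):
    def entropy(ys):
        n = len(ys)
        return -sum(c / n * math.log2(c / n) for c in Counter(ys).values())

    by_value = {}
    for x, y in zip(instances, labels):
        by_value.setdefault(extract(x), []).append(y)
    conditional = sum(len(ys) / len(labels) * entropy(ys)
                      for ys in by_value.values())
    return entropy(labels) - conditional  # higher = more predictive template
```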
Related Work | features from Zhao et al. (2009), who build feature templates from combinations of word properties, syntactic positions (including head and children), and semantic properties; and features from Björkelund et al. |
Conclusion | We build up a small set of feature templates as part of a discriminative constituency parser and outperform the Berkeley parser on a wide range of languages. |
Features | Subsequent lines in Table 1 indicate additional surface feature templates computed over the span, which are then conjoined with the rule identity as shown in Figure 1 to give additional features. |
Features | Note that many of these features have been used before (Taskar et al., 2004; Finkel et al., 2008; Petrov and Klein, 2008b); our goal here is not to amass as many feature templates as possible, but rather to examine the extent to which a simple set of features can replace a complicated state space. |
Features | Because heads of constituents are often at the beginning or the end of a span, these feature templates can (noisily) capture monolexical properties of heads without having to incur the inferential cost of lexicalized annotations. |
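Features | A minimal sketch of conjoining surface span templates with a rule identity, in the spirit of the description above; the three templates shown (first word, last word, bucketed length) are illustrative, not the full Table 1. |

```python
# A minimal sketch of conjoining surface span templates with the rule
# identity; the specific templates shown are illustrative assumptions.
def span_features(words, i, j, rule):
    feats = [f"RULE={rule}"]               # rule identity on its own
    surface = [
        f"FIRST_WORD={words[i]}",          # noisily captures a span-initial head
        f"LAST_WORD={words[j - 1]}",       # ...or a span-final one
        f"LEN={min(j - i, 10)}",           # bucketed span length
    ]
    feats += [f"{rule}&{s}" for s in surface]   # conjoin each with the rule
    return feats

print(span_features("the dog barked loudly .".split(), 0, 2, "NP->DT NN"))
```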
Sentiment Analysis | We exploit this by adding an additional feature template similar to our span shape feature from Section 4.4, which uses the (deterministic) tag for each word as its descriptor. |
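Sentiment Analysis | A minimal sketch of that tag-descriptor variant: the span's shape string is built from a deterministic per-word tag; `LEXICON` here is a hypothetical word-to-tag map, not the paper's resource. |

```python
# A minimal sketch of the tag-descriptor variant: the span's shape string is
# built from a deterministic per-word tag; LEXICON is a hypothetical map.
LEXICON = {"great": "+", "awful": "-", "not": "~"}

def tag_shape(words, i, j):
    return "TAGSHAPE=" + "".join(LEXICON.get(w.lower(), ".") for w in words[i:j])

print(tag_shape(["not", "a", "great", "film"], 0, 4))  # TAGSHAPE=~.+.
```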
Surface Feature Framework | To improve the performance of our X-bar grammar, we will add a number of surface feature templates derived only from the words in the sentence. |
Experimental Setup | Features. For the arc feature vector φ_hm, we use the same set of feature templates as MST v0.5.1. |
Experimental Setup | For the head and modifier vectors φ_h and φ_m, we show the complete set of feature templates used by our model in Table 1. |
Experimental Setup | Finally, we use a set of feature templates similar to Turbo v2.1 for third-order parsing. |
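Experimental Setup | A minimal sketch of how the three vectors above could enter an arc score: a linear term over the traditional arc vector φ_hm plus a low-rank interaction of the head/modifier vectors; the matrices U and V, the dimensions, and the exact combination are illustrative assumptions. |

```python
import numpy as np

# A minimal sketch of scoring an arc from the three feature vectors: a linear
# term over phi_hm plus a low-rank head-modifier interaction (assumptions).
def arc_score(phi_hm, phi_h, phi_m, w, U, V):
    linear = w @ phi_hm                   # classic template-based arc score
    low_rank = (U @ phi_h) @ (V @ phi_m)  # head-modifier interaction, rank r
    return linear + low_rank

rng = np.random.default_rng(0)
d, r = 8, 3                               # feature dimension, rank
score = arc_score(rng.random(d), rng.random(d), rng.random(d),
                  rng.random(d), rng.random((r, d)), rng.random((r, d)))
```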
Introduction | A predominant way to counter the high dimensionality of features is to manually design or select a meaningful set of feature templates, which are used to generate different types of features (McDonald et al., 2005a; Koo and Collins, 2010; Martins et al., 2013). |
Problem Formulation | Table 1: Word feature templates used by our model. |
Experimental Setup | Table 1: POS tag feature templates. |
Features | First- to Third-Order Features. The feature templates of first- to third-order features are mainly drawn from previous work on graph-based parsing (McDonald and Pereira, 2006), transition-based parsing (Nivre et al., 2006), and dual decomposition-based parsing (Martins et al., 2011). |
Features | The feature templates are inspired by previous feature-rich POS tagging work (Toutanova et al., 2003). |
Features | In our work, we use feature templates up to 5-grams. |
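Features | A minimal sketch of tag n-gram templates up to 5-grams; the exact template inventory (and any conjunction with word forms) is an assumption. |

```python
# A minimal sketch of tag n-gram templates up to 5-grams; the inventory and
# naming scheme are assumptions, not the paper's exact templates.
def tag_ngram_features(tags, i, max_n=5):
    feats = []
    for n in range(1, max_n + 1):
        if i - n + 1 >= 0:
            feats.append(f"T{n}={'_'.join(tags[i - n + 1 : i + 1])}")
    return feats

print(tag_ngram_features(["DT", "JJ", "NN", "VBZ", "RB"], 4))
# ['T1=RB', 'T2=VBZ_RB', ..., 'T5=DT_JJ_NN_VBZ_RB']
```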
Joint POS Tagging and Parsing with Nonlocal Features | However, some feature templates in Table 1 become unavailable, because POS tags for the lookahead words are not specified yet under the joint framework. |
Joint POS Tagging and Parsing with Nonlocal Features | However, all the feature templates given in Table 1 are just simple structural features. |
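Joint POS Tagging and Parsing with Nonlocal Features | A minimal sketch of skipping templates whose inputs do not exist yet in the joint setting: any template that touches a tag the system has not assigned is simply not fired; the (name, offsets, extractor) template representation is hypothetical. |

```python
# A minimal sketch: `assigned_tags` holds only the tags fixed so far, so
# templates whose offsets reach past that frontier are not fired. The
# (name, offsets, extractor) template representation is hypothetical.
def fire_available(templates, assigned_tags, position):
    feats = []
    for name, offsets, extract in templates:
        if any(position + off >= len(assigned_tags) for off in offsets):
            continue  # needs a lookahead tag the joint system has not set
        feats.append(f"{name}={extract(assigned_tags, position, offsets)}")
    return feats
```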
Transition-based Constituent Parsing | Type Feature Templates |
Transition-based Constituent Parsing | Table 1 lists the feature templates used in our baseline parser, which are adopted from Zhang and Clark (2009). |
Character-Level Dependency Tree | Feature templates |
Character-Level Dependency Tree | Table 1: Feature templates encoding intra-word dependencies. |
Character-Level Dependency Tree | It adjusts the weights of segmentation and POS-tagging features, because the number of feature templates for these two tasks is much smaller than for parsing. |
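Character-Level Dependency Tree | A minimal sketch of that rebalancing: scores from segmentation and tagging features are scaled by a constant so the much larger parsing feature set does not dominate; the constant and the feature-name prefixes are illustrative assumptions. |

```python
# A minimal sketch of rebalancing: segmentation and tagging feature scores
# are scaled up so the much larger parsing feature set does not dominate;
# the constant and the name prefixes are illustrative assumptions.
SEG_TAG_SCALE = 2.0  # would be tuned on development data

def model_score(active_features, weights):
    score = 0.0
    for f in active_features:
        scale = SEG_TAG_SCALE if f.startswith(("SEG:", "TAG:")) else 1.0
        score += scale * weights.get(f, 0.0)
    return score
```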