Experiment | The features were the ones that had been counted, using the feature templates in Table 1, at least four times for all of the (i, j) position pairs in the training sentences.
Experiment | We conjoined the features with three types of label pairs, (C, L), (L, N), or (C, N), as instances of the feature template (α, β), to produce features for SEQUENCE.
Experiment | We used the following feature templates to produce features for the outbound model: ⟨s_{i−2}⟩, ⟨s_{i−1}⟩, ⟨s_i⟩, ⟨s_{i+1}⟩, ⟨s_{i+2}⟩, ⟨t_i⟩, (t_{i−1}, t_i), (t_i, t_{i+1}), and (s_i, t_i).
Proposed Method | Table 1: Feature templates.
Proposed Method | Table 1 shows the feature templates used to produce the features. |
Proposed Method | A feature is an instance of a feature template.
Features | In this section, we first introduce how different types of feature templates are designed, and then show an example of how the features help transfer the syntactic structure information. |
Features | Note that the same feature templates are used for all the target grammar formalisms. |
Features | We define the following feature templates: f_binary for binary derivations, f_unary for unary derivations, and f_root for the root nodes.
Background | The baseline feature templates of joint S&T are the ones used in (Ng and Low, 2004; Jiang et al., 2008), as shown in Table 1. Λ = {λ_1, λ_2, ..., λ_K} ∈ R^K are the weight parameters to be learned.
Background | Table 1: The feature templates of joint S&T.
Method | The feature templates are from Zhao et al. |
Method | The same feature templates in (Wang et al., 2011) are used, i.e., "+n-gram+cluster+leXicon". |
Method | The feature templates introduced in Section 3.1 are used. |
Related Work | But overall, our approach differs in three important aspects: first, novel feature templates are defined for measuring the similarity between vertices. |
Experimental Assessment | For the arc-eager parser, we use the feature template of Zhang and Nivre (2011). |
Experimental Assessment | It turns out that our feature template, described in §4.3, is the exact merge of the templates used for the arc-eager and the arc-standard parsers.
Model and Training | These features are then combined together into complex features, according to some feature template, and joined with the available transition types.
Model and Training | Our feature template is an extended version of the feature template of Zhang and Nivre (2011), originally developed for the arc-eager model. |
Additional Experiments | The sparse feature templates resulted here in a total of 4.9 million possible features, of which again only a fraction were active, as shown in Table 2. |
Experiments | Table 2: Active sparse feature templates |
Experiments | These feature templates resulted in a total of 3.4 million possible features, of which only a fraction were active for the respective tuning set and optimizer, as shown in Table 2. |
Experiments | Table 3 shows the feature templates we set initially, which generate 722 999 637 features.
Experiments | Feature Templates |
Experiments | Table 3: Feature templates for CRF training.
Experiments | We used 5-fold cross-validation on the training data to tune the included feature templates and optimize the training parameters.
Experiments | The following feature templates are used to generate features from the above words. |
Experiments | To evaluate the importance of the different types of features, the same experiment was rerun multiple times, each time including or excluding exactly one feature template.
Character-based Chinese Parsing | Table 1 shows the feature templates of our model. |
Character-based Chinese Parsing | The feature templates in bold are novel and are designed to encode head character information.
Experiments | We find that parsing accuracy decreases by about 0.6% when the head character related features (the bold feature templates in Table 1) are removed, which demonstrates the usefulness of these features.
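Several of the snippets above describe the same basic mechanism: a feature is an instance of a feature template, produced by filling the template's slots with the concrete words and tags observed at a position. A minimal illustrative sketch of this instantiation step (the function name and the particular template set are hypothetical, not taken from any of the cited papers):

```python
def extract_features(words, prev_tag, i):
    """Instantiate a few n-gram-style feature templates at position i.

    Each returned string is a feature: a template name joined with the
    concrete values the template takes at this position. Conjoined
    templates (e.g. word combined with the previous tag) are formed the
    same way, by concatenating the slot values.
    """
    # Pad out-of-range positions so boundary tokens still fire features.
    def word_at(j):
        return words[j] if 0 <= j < len(words) else "<PAD>"

    return [
        f"s[i-1]={word_at(i - 1)}",            # previous word
        f"s[i]={word_at(i)}",                  # current word
        f"s[i+1]={word_at(i + 1)}",            # next word
        f"t[i-1]={prev_tag}",                  # previous tag
        f"s[i]|t[i-1]={word_at(i)}|{prev_tag}",  # conjoined template
    ]

feats = extract_features(["the", "cat", "sat"], "DT", 1)
# feats now holds one feature per template, e.g. "s[i]=cat"
```

A linear model (e.g. a CRF or a structured perceptron, as in several of the papers quoted above) then assigns one learned weight to each such feature string, which is why the number of possible features grows with the number of templates times the vocabulary of their slot values.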