Experiments | We apply 1-best and k-best sequential decoding algorithms to five NLP tagging tasks: Penn Treebank (PTB) POS tagging, CoNLL 2000 joint POS tagging and chunking, CoNLL 2003 joint POS tagging, chunking, and named entity tagging, HPSG supertagging (Matsuzaki et al., 2007), and a search query named entity recognition (NER) dataset.
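Experiments | As a minimal sketch of what 1-best sequential decoding looks like here, the following Viterbi routine is illustrative only: `score(prev, tag, i)` stands in for a hypothetical trained local scoring function and `tags` for the joint tag set; neither identifier comes from the paper, and k-best decoding generalizes this by keeping the top k back-pointers per cell rather than one.

    # 1-best (Viterbi) sequential decoding over a tag set.
    # `score(prev, tag, i)` is an assumed local scorer (e.g., from a
    # trained linear model); `prev` is None at the first position.
    def viterbi_decode(n, tags, score):
        # best[i][t]: score of the best tag prefix ending at position i with tag t
        best = [{t: float("-inf") for t in tags} for _ in range(n)]
        back = [{t: None for t in tags} for _ in range(n)]
        for t in tags:
            best[0][t] = score(None, t, 0)
        for i in range(1, n):
            for t in tags:
                # choose the best previous tag for tag t at position i
                prev = max(tags, key=lambda p: best[i - 1][p] + score(p, t, i))
                best[i][t] = best[i - 1][prev] + score(prev, t, i)
                back[i][t] = prev
        # backtrack from the best-scoring final tag
        t = max(best[n - 1], key=best[n - 1].get)
        seq = [t]
        for i in range(n - 1, 0, -1):
            t = back[i][t]
            seq.append(t)
        return list(reversed(seq))

    # Toy usage: a scorer that rewards alternating tags.
    print(viterbi_decode(3, ["A", "B"], lambda p, t, i: 1.0 if p != t else 0.0))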
Experiments | As in Kaji et al. (2010), we combine the POS tags and chunk tags to form joint tags for the CoNLL 2000 dataset, e.g., NN|B-NP.
Experiments | Similarly, we combine the POS tags, chunk tags, and named entity tags to form joint tags for the CoNLL 2003 dataset, e.g., PRP$|I-NP|O.
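Experiments | A minimal sketch of the joint-tag construction described above, assuming per-task tags arrive as separate CoNLL-style columns; the helper name is illustrative, not from the paper.

    def make_joint_tag(*tags):
        # Concatenate per-task tags into a single joint label.
        return "|".join(tags)

    # CoNLL 2000: POS + chunk
    assert make_joint_tag("NN", "B-NP") == "NN|B-NP"
    # CoNLL 2003: POS + chunk + named entity
    assert make_joint_tag("PRP$", "I-NP", "O") == "PRP$|I-NP|O"

Decoding over such joint tags recovers all component tag sequences in a single pass, at the cost of a larger tag set.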
Empirical Evaluation | We use the Europarl corpus (Koehn, 2005), the CoNLL 2009 distributions of the Penn Treebank WSJ corpus (Marcus et al., 1993) for English, and the SALSA corpus (Burchardt et al., 2006) for German.
Empirical Evaluation | As is standard for unsupervised SRL, we use the entire CoNLL training sets for evaluation and use held-out sets for model selection and parameter tuning.
Empirical Evaluation | Although the CoNLL 2009 dataset already provides predicted dependency structures, we could not reproduce them, so we re-parsed the data in order to use the same parser to annotate Europarl.
Introduction | Our model admits efficient inference: the estimation time on the CoNLL 2009 data (Hajic et al., 2009) and the Europarl v.6 bitext (Koehn, 2005) does not exceed 5 hours on a single processor, and the inference algorithm is highly parallelizable, reducing inference time further.
Capturing Paradigmatic Relations via Word Clustering | Table: tagging accuracies with Brown and MKCLS cluster features versus the baseline, by data set (CoNLL: 94.48%).
Combining Both | Table 10: Tagging accuracies on the test data (CoNLL).
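Combining Both | A minimal sketch of how word-cluster identifiers are typically injected as tagging features, assuming a Brown-style word-to-bit-string mapping; `brown_paths` and the prefix lengths are illustrative assumptions, not the paper's exact setup.

    # Brown clustering assigns each word a bit-string path in a binary
    # hierarchy; prefixes of the path give clusters at coarser granularities.
    brown_paths = {"the": "0010", "dog": "110101", "runs": "111001"}  # toy mapping

    def cluster_features(word, prefix_lengths=(2, 4, 6)):
        path = brown_paths.get(word)
        if path is None:
            return ["cluster=UNK"]
        return ["cluster[:%d]=%s" % (k, path[:k])
                for k in prefix_lengths if len(path) >= k]

    print(cluster_features("dog"))
    # ['cluster[:2]=11', 'cluster[:4]=1101', 'cluster[:6]=110101']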
State-of-the-Art | For detailed analysis and evaluation, we conduct further experiments following the setting of the CoNLL 2009 shared task. |
State-of-the-Art | For the following experiments, we report results only on the development data of the CoNLL setting.
Experiments and Analysis | CTB6 is used as the Chinese data set in the CoNLL 2009 shared task (Hajic et al., 2009). |
Experiments and Analysis | We list the top three systems of the CoNLL 2009 shared task in Table 8, showing that our approach also advances the state-of-the-art parsing accuracy on this data set.
Experiments and Analysis | The parsing accuracies of the top systems may be underestimated since the accuracy of the provided POS tags in CoNLL 2009 is only 92.38% on the test set, while the POS tagger used in our experiments reaches 94.08%. |