Abstract | Pure statistical parsing systems achieve high in-domain accuracy but perform poorly out-domain. |
Dependency Parsing with HPSG | With the extra features, we hope that the training of the statistical model will not overfit the in-domain data, but will be able to deal with domain-independent linguistic phenomena as well. |
Experiment Results & Error Analyses | With both parsers, we see slight performance drops with both HPSG feature models on in-domain tests (WSJ), compared with the original models. |
Experiment Results & Error Analyses | When we look at the performance difference between in-domain and out-domain tests for each feature model, we observe that the drop is significantly smaller for the extended models with HPSG features. |
Experiment Results & Error Analyses | Admittedly, the results on PCHEMTB are lower than the best reported results in the CoNLL 2007 Shared Task, but we should note that we are not using any in-domain unlabeled data. |
Experimental Evaluation | Listed together with their PARSEVAL F-measures, these are: gold-standard parses from the treebank (GoldSyn, 100%); a parser trained on WSJ plus a small number of in-domain training sentences required to achieve good performance, 20 for CLANG (Syn20, 88.21%) and 40 for GEOQUERY (Syn40, 91.46%); and a parser trained on no in-domain data (Syn0, 82.15% for CLANG and 76.44% for GEOQUERY). |
Experimental Evaluation | ones trained on more in-domain data) improved our approach. |
Experimental Evaluation | Table 5: Performance on GEO250 (20 in-domain sentences are used in SYN20 to train the syntactic parser). |