Abstract | The model outperforms most systems participating in the English track of the CoNLL’12 shared task.
Evaluation | We use the data provided for the English track of the CoNLL’12 shared task on multilingual coreference resolution (Pradhan et al., 2012), which is a subset of the upcoming OntoNotes 5.0 release and comes with various annotation layers provided by state-of-the-art NLP tools.
Evaluation | We evaluate our system with the coreference resolution evaluation metrics that were used for the CoNLL shared tasks on coreference, which are MUC (Vilain et al., 1995), B³ (Bagga and Baldwin, 1998) and CEAFe (Luo, 2005).
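As an illustration of the link-based MUC metric cited above, the following is a minimal sketch of its recall/precision computation over gold ("key") and predicted ("response") coreference chains, where each chain is a collection of mention identifiers. Function names are our own, and the sketch ignores the mention-alignment subtleties that the official CoNLL scorer handles; it is not the shared-task scoring program itself.

```python
def muc_links(chains, other_chains):
    """Sum of |S| - p(S) over chains S, where p(S) is the number of
    partitions of S induced by the other side's chains (mentions not
    covered by any chain on the other side each form a singleton)."""
    numer, denom = 0, 0
    for chain in chains:
        parts = set()       # indices of other-side chains intersecting S
        uncovered = 0       # mentions of S absent from the other side
        for mention in chain:
            for i, other in enumerate(other_chains):
                if mention in other:
                    parts.add(i)
                    break
            else:
                uncovered += 1
        numer += len(chain) - (len(parts) + uncovered)
        denom += len(chain) - 1
    return numer, denom

def muc_f1(key, response):
    rn, rd = muc_links(key, response)   # recall = rn / rd
    pn, pd = muc_links(response, key)   # precision = pn / pd
    r = rn / rd if rd else 0.0
    p = pn / pd if pd else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0
```

For example, if the key contains the single chain {1, 2, 3} but the response splits it into {1, 2} and {3}, recall is 1/2, precision is 1, and MUC F1 is 2/3.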
Evaluation | CoNLL’12 shared task, which are denoted as best and median respectively.
Introduction | On the English data of the CoNLL’12 shared task, the model outperforms most systems that participated in the shared task.
Related Work | (2012) and ranked second in the English track at the CoNLL’12 shared task (Pradhan et al., 2012).
Related Work | The top performing system at the CoNLL’12 shared task (Fernandes et al., 2012)
Related Work | (2011), which in turn won the CoNLL’11 shared task.
Conclusion | Our transitive system is more effective at using properties than a pairwise system and a previous entity-level system, and it achieves performance comparable to that of the Stanford coreference resolution system, the winner of the CoNLL 2011 shared task. |
Experiments | We use the datasets, experimental setup, and scoring program from the CoNLL 2011 shared task (Pradhan et al., 2011), based on the OntoNotes corpus (Hovy et al., 2006). |
Experiments | Unfortunately, their publicly-available system is closed-source and performs poorly on the CoNLL shared task dataset, so direct comparison is difficult.
Experiments | Table 1: CoNLL metric scores for our four different systems incorporating noisy oracle data.
Introduction | We evaluate our system on the dataset from the CoNLL 2011 shared task using three different types of properties: synthetic oracle properties, entity phi features (number, gender, animacy, and NER type), and properties derived from unsupervised clusters targeting semantic type information. |
Introduction | Our final system is competitive with the winner of the CoNLL 2011 shared task (Lee et al., 2011). |
Abstract | By incorporating this knowledge into the Dependency Model with Valence, we considerably outperform state-of-the-art results in terms of average attachment score over 20 treebanks from the CoNLL 2006 and 2007 shared tasks.
Conclusions and Future Work | We showed that such prior knowledge about stop-probabilities, incorporated into the standard DMV model, significantly improves unsupervised dependency parsing; since we are not aware of any other fully unsupervised dependency parser with a higher average attachment score over the CoNLL data, we claim a new state-of-the-art result.
Conclusions and Future Work | However, they do not provide scores measured on other CoNLL treebanks. |
Experiments | The first type is the CoNLL treebanks from 2006 (Buchholz and Marsi, 2006) and 2007 (Nivre et al., 2007), which we use for inference and for evaluation.
Experiments | The Wikipedia texts were automatically tokenized and segmented into sentences so that their tokenization was similar to that of the CoNLL evaluation treebanks.
Experiments | respective CoNLL training data.
Experiments | To do so, we use one of the top performing systems from the CoNLL 2012 shared task (Martschat et al., 2012). |
Experiments | These two tasks were performed on documents extracted from the English test part of the CoNLL 2012 shared task (Pradhan et al., 2012). |
Experiments | The coreference resolution system used performs well on the CoNLL 2012 data. |
Coordination Structures in Treebanks | Obviously, there is a certain risk that the CS-related information contained in the source treebanks was slightly biased by the properties of the CoNLL format upon conversion. |
Coordination Structures in Treebanks | the 2nd column of Table 1), but some were originally based on constituents and thus specific converters to the CoNLL format had to be created (for instance, the Spanish phrase-structure trees were converted to dependencies using a procedure described by Civit et al. |
Related work | The primitive format used for CoNLL shared tasks is widely used in dependency parsing, but its weaknesses have already been pointed out (cf. |
Variations in representing coordination structures | 7The primary data sources are the following: Ancient Greek: Ancient Greek Dependency Treebank (Bamman and Crane, 2011), Arabic: Prague Arabic Dependency Treebank 1.0 (Smrž et al., 2008), Basque: Basque Dependency Treebank (larger version than CoNLL 2007 generously pro-
Experiments | (2011), who observe that this is rarely the case with the heterogeneous CoNLL treebanks.
Introduction | In particular, the CoNLL shared tasks on dependency parsing have provided over twenty data sets in a standardized format (Buchholz and Marsi, 2006; Nivre et al., 2007).
Introduction | These data sets can be sufficient if one’s goal is to build monolingual parsers and evaluate their quality without reference to other languages, as in the original CoNLL shared tasks, but there are many cases where heterogeneous treebanks are less than adequate.