Index of papers in Proc. ACL that mention
  • shared task
Wu, Yuanbin and Ng, Hwee Tou
Abstract
Experimental results on the Helping Our Own shared task show that our method is competitive with state-of-the-art systems.
Conclusion
Experiments on the HOO 2011 shared task show that ILP inference achieves state-of-the-art performance on grammatical error correction.
Experiments
We follow the evaluation setup in the HOO 2011 shared task on grammatical error correction (Dale and Kilgarriff, 2011).
Experiments
The development set and test set in the shared task consist of conference and workshop papers taken from the Association for Computational Linguistics (ACL).
Experiments
In the HOO 2011 shared task, participants can submit system edits directly or the corrected plain-text system output.
Inference with Second Order Variables
Corrections are called edits in the HOO 2011 shared task.
Introduction
The task has received much attention in recent years, and was the focus of two shared tasks on grammatical error correction in 2011 and 2012 (Dale and Kilgarriff, 2011; Dale et al., 2012).
Introduction
We evaluate our proposed ILP approach on the test data from the Helping Our Own (HOO) 2011 shared task (Dale and Kilgarriff, 2011).
shared task is mentioned in 11 sentences in this paper.
Zou, Bowei and Zhou, Guodong and Zhu, Qiaoming
Abstract
Evaluation on the *SEM 2012 shared task corpus indicates the usefulness of contextual discourse information in negation focus identification and justifies the effectiveness of our graph model in capturing such global information.
Baselines
Negation focus identification in *SEM’2012 shared tasks is restricted to verbal negations annotated with MNEG in PropBank, with only the constituent belonging to a semantic role selected as negation focus.
Baselines
1 In *SEM’2013, the shared task was changed to focus on "Semantic Textual Similarity".
Baselines
For better illustration of the importance of contextual discourse information, Table 1 shows the statistics of intra- and inter-sentence information necessary for manual negation focus identification, with 100 instances randomly extracted from the held-out dataset of the *SEM'2012 shared task corpus.
Introduction
Evaluation on the *SEM 2012 shared task corpus (Morante and Blanco, 2012) justifies our approach over several strong baselines.
Related Work
Due to the increasing demand on deep understanding of natural language text, negation recognition has been drawing more and more attention in recent years, with a series of shared tasks and workshops, however, with focus on cue detection and scope resolution, such as the BioNLP 2009 shared task for negative event detection (Kim et al., 2009) and the ACL 2010 Workshop for scope resolution of negation and speculation (Morante and Sporleder, 2010), followed by a special issue of Computational Linguistics (Morante and Sporleder, 2012) for modality and negation.
Related Work
However, although Morante and Blanco (2012) proposed negation focus identification as one of the *SEM’2012 shared tasks, only one team (Rosenberg and Bergler, 2012)1 participated in this task.
shared task is mentioned in 15 sentences in this paper.
Packard, Woodley and Bender, Emily M. and Read, Jonathon and Oepen, Stephan and Dridan, Rebecca
Abstract
In this work, we revisit Shared Task 1 from the 2012 *SEM Conference: the automated analysis of negation.
Introduction
Owing to its immediate utility in the curation of scholarly results, the analysis of negation and so-called hedges in biomedical research literature has been the focus of several workshops, as well as the Shared Task at the 2011 Conference on Computational Language Learning (CoNLL).
Introduction
1Our running example is a truncated variant of an item from the Shared Task training data.
Introduction
Though the task-specific concept of scope of negation is not the same as the notion of quantifier and operator scope in mainstream underspecified semantics, we nonetheless find that reviewing the 2012 *SEM Shared Task annotations with reference to an explicit encoding of semantic predicate-argument structure suggests a simple and straightforward operationalization of their concept of negation scope.
Related Work
(2012) describe some amount of tailoring of the Boxer lexicon to include more of the Shared Task scope cues among those that produce the negation operator in the DRSs, but otherwise the system appears to directly take the notion of scope of negation from the DRS and project it out to the string, with one caveat: As with the logical-forms representations we use, the DRS logical forms do not include function words as predicates in the semantics.
Related Work
Since the Shared Task gold standard annotations included such arguably semantically vacuous (see Bender, 2013, p. 107) words in the scope, further heuristics are needed to repair the string-based annotations coming from the DRS-based system.
System Description
From these underspecified representations of possible scopal configurations, a scope resolution component can spell out the full range of fully-connected logical forms (Koller and Thater, 2005), but it turns out that such enumeration is not relevant here: the notion of scope encoded in the Shared Task annotations is not concerned with the relative scope of quantifiers and negation, such as the two possible readings of (2) represented informally below:5
System Description
However, as shown below, the information about fixed scopal elements in an underspecified MRS is sufficient to model the Shared Task annotations.
System Description
5 In other words, a possible semantic interpretation of the (string-based) Shared Task annotation guidelines and data is in terms of a quantifier-free approach to meaning representation, or in terms of one where quantifier scope need not be made explicit (as once suggested by, among others, Alshawi, 1992).
shared task is mentioned in 20 sentences in this paper.
Candito, Marie and Constant, Matthieu
Abstract
In this paper, we investigate various strategies to predict both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of the French Treebank (Abeillé and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013).
Architectures for MWE Analysis and Parsing
We compare these four architectures with each other and also with two simpler architectures used by (Constant et al., 2013) within the SPMRL 13 Shared Task, in which regular and irregular MWEs are not distinguished:
Conclusion
We experimented with strategies to predict both MWE analysis and dependency structure, and tested them on the dependency version of the French Treebank (Abeillé and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013).
Data: MWEs in Dependency Trees
The Shared Task used an enhanced version of the constituency-to-dependency conversion of Candito et al.
Experiments
Moreover, we provide in Table 5 a comparison of our best architecture with reg/irregular MWE distinction with other architectures that do not make this distinction, namely the two best comparable systems designed for the SPMRL Shared Task (Seddah et al., 2013): the pipeline simple parser based on Mate-tools of Constant et al.
Introduction
While the realistic scenario of syntactic parsing with automatic MWE recognition (either done jointly or in a pipeline) has already been investigated in constituency parsing (Green et al., 2011; Constant et al., 2012; Green et al., 2013), the French dataset of the SPMRL 2013 Shared Task (Seddah et al., 2013) only recently provided the opportunity to evaluate this scenario within the framework of dependency syntax.2 In such a scenario, a system predicts dependency trees with marked groupings of tokens into MWEs.
Introduction
In this paper, we investigate various strategies for predicting from a tokenized sentence both MWEs and syntactic dependencies, using the French dataset of the SPMRL 13 Shared Task.
Introduction
2The main focus of the Shared Task was on predicting both morphological and syntactic analysis for morphologically-rich languages.
Related work
To our knowledge, the first works3 on predicting both MWEs and dependency trees are those presented to the SPMRL 2013 Shared Task that provided scores for French (which is the only dataset containing MWEs).
Related work
(2013) proposed to combine pipeline and joint systems in a reparser (Sagae and Lavie, 2006), and ranked first at the Shared Task.
Related work
It uses no features or treatment specific to MWEs, as it focuses on the general aim of the Shared Task, namely coping with prediction of morphological and syntactic analysis.
shared task is mentioned in 11 sentences in this paper.
Martschat, Sebastian
Abstract
The model outperforms most systems participating in the English track of the CoNLL’12 shared task.
Evaluation
We use the data provided for the English track of the CoNLL’12 shared task on multilingual coreference resolution (Pradhan et al., 2012) which is a subset of the upcoming OntoNotes 5.0 release and comes with various annotation layers provided by state-of-the-art NLP tools.
Evaluation
We evaluate the model in a setting that corresponds to the shared task’s closed track, i.e.
Evaluation
We evaluate our system with the coreference resolution evaluation metrics that were used for the CoNLL shared tasks on coreference, which are MUC (Vilain et al., 1995), B3 (Bagga and Baldwin, 1998) and CEAFe (Luo, 2005).
Introduction
Quite recently, however, rule-based approaches regained popularity due to Stanford’s multi-pass sieve approach which exhibits state-of-the-art performance on many standard coreference data sets (Raghunathan et al., 2010) and also won the CoNLL-2011 shared task on coreference resolution (Lee et al., 2011; Pradhan et al., 2011).
Introduction
On the English data of the CoNLL’12 shared task, the model outperforms most systems which participated in the shared task.
Related Work
These approaches participated in the recent CoNLL’11 shared task (Pradhan et al., 2011; Sapena et al., 2011; Cai et al., 2011b) with excellent results.
Related Work
(2012) and ranked second in the English track at the CoNLL’12 shared task (Pradhan et al., 2012).
Related Work
The top performing system at the CoNLL’12 shared task (Fernandes et al., 2012)
shared task is mentioned in 16 sentences in this paper.
Zhang, Yi and Wang, Rui
Dependency Parsing with HPSG
For these rules, we refer to the conversion of the Penn Treebank into dependency structures used in the CoNLL 2008 Shared Task , and mark the heads of these rules in a way that will arrive at a compatible dependency backbone.
Dependency Parsing with HPSG
the CoNLL shared task dependency structures, minor systematic differences still exist for some phenomena.
Experiment Results & Error Analyses
To evaluate the performance of our different dependency parsing models, we tested our approaches on several dependency treebanks for English in a similar spirit to the CoNLL 2006-2008 Shared Tasks.
Experiment Results & Error Analyses
In previous years of CoNLL Shared Tasks, several datasets have been created for the purpose of dependency parser evaluation.
Experiment Results & Error Analyses
The same dataset has been used for the domain adaptation track of the CoNLL 2007 Shared Task .
Introduction
In the meantime, the successful continuation of the CoNLL Shared Tasks since 2006 (Buchholz and Marsi, 2006; Nivre et al., 2007a; Surdeanu et al., 2008) has witnessed how easy it has become to train a statistical syntactic dependency parser provided that there is an annotated treebank.
Parser Domain Adaptation
In recent years, two statistical dependency parsing systems, MaltParser (Nivre et al., 2007b) and MSTParser (McDonald et al., 2005b), representing different threads of research in data-driven machine learning approaches, have obtained high publicity for their state-of-the-art performances in open competitions such as the CoNLL Shared Tasks.
shared task is mentioned in 12 sentences in this paper.
Björkelund, Anders and Kuhn, Jonas
Abstract
Our model obtains the best results to date on recent shared task data for Arabic, Chinese, and English.
Background
Nevertheless, the two best systems in the latest CoNLL Shared Task on coreference resolution (Pradhan et al., 2012) were both variants of the mention-pair model.
Conclusion
We evaluated our system on all three languages from the CoNLL 2012 Shared Task and present the best results to date on these data sets.
Experimental Setup
We apply our model to the CoNLL 2012 Shared Task data, which includes a training, development, and test set split for three languages: Arabic, Chinese and English.
Features
As a baseline we use the features from Björkelund and Farkas (2012), who ranked second in the 2012 CoNLL shared task and whose system is publicly available.
Introduction
The combination of this modification with nonlocal features leads to further improvements in the clustering accuracy, as we show in evaluation results on all languages from the CoNLL 2012 Shared Task: Arabic, Chinese, and English.
Related Work
Latent antecedents have recently gained popularity and were used by two systems in the CoNLL 2012 Shared Task, including the winning system (Fernandes et al., 2012; Chang et al., 2012).
Results
As a general baseline, we also include Björkelund and Farkas’ (2012) system (denoted B&F), which was the second best system in the shared task.
shared task is mentioned in 8 sentences in this paper.
Gormley, Matthew R. and Mitchell, Margaret and Van Durme, Benjamin and Dredze, Mark
Approaches
Our feature template definitions build from those used by the top performing systems in the CoNLL-2009 Shared Task, Zhao et al.
Experiments
To compare to prior work (i.e., submissions to the CoNLL-2009 Shared Task), we also consider the joint task of semantic role labeling and predicate sense disambiguation.
Experiments
The CoNLL-2009 Shared Task (Hajic et al., 2009) dataset contains POS tags, lemmas, morphological features, syntactic dependencies, predicate senses, and semantic role annotations for 7 languages: Catalan, Chinese, Czech, English, German, Japanese,4 and Spanish.
Experiments
The CoNLL-2005 and -2008 Shared Task datasets provide English SRL annotation, and for cross dataset comparability we consider only verbal predicates (more details in § 4.4).
shared task is mentioned in 8 sentences in this paper.
Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut
Abstract
We conduct experiments on data sets from the NEWS 2010 shared task on transliteration mining and achieve an F-measure of up to 92%, outperforming most of the semi-supervised systems that were submitted.
Experiments
For English/Arabic, English/Hindi and English/Tamil, our system is better than most of the semi-supervised systems presented at the NEWS 2010 shared task for transliteration mining.
Experiments
On the English/Russian data set, our system achieves 76% F-measure, which is not as good as the systems that participated in the shared task.
Experiments
The Wikipedia InterLanguage Links shared task data contains a much larger proportion of transliterations than a parallel corpus.
Introduction
We compare our unsupervised transliteration mining method with the semi-supervised systems presented at the NEWS 2010 shared task on transliteration mining (Kumaran et al., 2010) using four language pairs.
shared task is mentioned in 6 sentences in this paper.
Turchi, Marco and Anastasopoulos, Antonios and C. de Souza, José G. and Negri, Matteo
Evaluation framework
• One artificial setting (§5) obtained from the WMT12 QE shared task data, in which training/test instances are arranged to reflect homogeneous distributions of the HTER labels.
Evaluation framework
To measure the adaptability of our model to a given test set we compute the Mean Absolute Error (MAE), a metric for regression problems also used in the WMT QE shared tasks.
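As a minimal sketch of the MAE metric referred to above (not the official WMT QE evaluation script; the function name and the HTER-style example values are hypothetical):

def mean_absolute_error(reference, predicted):
    """Average absolute difference between reference and predicted
    per-sentence quality scores (e.g., HTER labels)."""
    assert len(reference) == len(predicted) and reference
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)

# Hypothetical HTER-style labels in [0, 1]
print(mean_absolute_error([0.20, 0.50, 0.10], [0.30, 0.40, 0.10]))  # ~0.0667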
Evaluation framework
The results of previous WMT QE shared tasks have shown that these baseline features are particularly competitive in the regression task (with only few systems able to beat them at WMT12).
Online QE for CAT environments
The tool, which implements a large number of features proposed by participants in the WMT QE shared tasks, has been modified to process one sentence at a time as requested for integration in a CAT environment;
Related work
In the last couple of years, research in the field received a strong boost from the shared tasks organized within the WMT workshop on SMT,2 which is also the framework of our first experiment in §5.
Related work
3For a comprehensive overview of the QE approaches proposed so far we refer the reader to the WMT12 and WMT13 QE shared task reports (Callison-Burch et al., 2012; Bojar et al., 2013).
shared task is mentioned in 6 sentences in this paper.
Nivre, Joakim and McDonald, Ryan
Abstract
By letting one model generate features for the other, we consistently improve accuracy for both models, resulting in a significant improvement of the state of the art when evaluated on data sets from the CoNLL-X shared task .
Experiments
In this section, we present an experimental evaluation of the two guided models based on data from the CoNLL-X shared task, followed by a comparative error analysis including both the base models and the guided models.
Experiments
The data for the experiments are training and test sets for all thirteen languages from the CoNLL-X shared task on multilingual dependency parsing, with training sets ranging in size from 29,000 tokens (Slovene) to 1,249,000 tokens (Czech).
Experiments
Models are evaluated by their labeled attachment score (LAS) on the test set, i.e., the percentage of tokens that are assigned both the correct head and the correct label, using the evaluation software from the CoNLL-X shared task with default settings.4 Statistical significance was assessed using Dan Bikel’s randomized parsing evaluation comparator with the default setting of 10,000 iterations.5
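To make the LAS definition above concrete, here is an illustrative sketch (not the official CoNLL-X evaluation software, and ignoring details such as punctuation handling; the function name and the toy trees are hypothetical):

def labeled_attachment_score(gold, predicted):
    """LAS: percentage of tokens whose predicted head index and
    dependency label both match the gold annotation."""
    assert len(gold) == len(predicted) and gold
    correct = sum(1 for (gh, gl), (ph, pl) in zip(gold, predicted)
                  if gh == ph and gl == pl)
    return 100.0 * correct / len(gold)

# Toy example: one token gets the wrong label, so 3 of 4 tokens are correct.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "punct")]
print(labeled_attachment_score(gold, pred))  # 75.0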
Introduction
Both models have been used to achieve state-of-the-art accuracy for a wide range of languages, as shown in the CoNLL shared tasks on dependency parsing (Buchholz and Marsi, 2006; Nivre et al., 2007), but McDonald and Nivre (2007) showed that a detailed error analysis reveals important differences in the distribution of errors associated with the two models.
Related Work
(2007) to combine six transition-based parsers in the best performing system in the CoNLL 2007 shared task .
shared task is mentioned in 6 sentences in this paper.
Cohn, Trevor and Specia, Lucia
Multitask Quality Estimation 4.1 Experimental Setup
These were used by a highly competitive baseline entry in the WMT12 shared task, and were extracted here using the system provided by that shared task.6 They include simple counts, e.g., the tokens in sentences, as well as source and target language model probabilities.
Multitask Quality Estimation 4.1 Experimental Setup
This is generally a very strong baseline: in the WMT12 QE shared task, only five out of 19 submissions were able to significantly outperform it, and only by including many complex additional features, tree kernels, etc.
Multitask Quality Estimation 4.1 Experimental Setup
WMT12: Single task. We start by comparing GP regression with alternative approaches using the WMT12 dataset on the standard task of predicting a weighted mean quality rating (as was done in the WMT12 QE shared task).
Quality Estimation
For an overview of various algorithms and features we refer the reader to the WMT12 shared task on QE (Callison-Burch et al., 2012).
Quality Estimation
WMT12: This dataset was distributed as part of the WMT12 shared task on QE (Callison-Burch et al., 2012).
shared task is mentioned in 5 sentences in this paper.
Durrett, Greg and Hall, David and Klein, Dan
Conclusion
Our transitive system is more effective at using properties than a pairwise system and a previous entity-level system, and it achieves performance comparable to that of the Stanford coreference resolution system, the winner of the CoNLL 2011 shared task .
Experiments
We use the datasets, experimental setup, and scoring program from the CoNLL 2011 shared task (Pradhan et al., 2011), based on the OntoNotes corpus (Hovy et al., 2006).
Experiments
5 Unfortunately, their publicly-available system is closed-source and performs poorly on the CoNLL shared task dataset, so direct comparison is difficult.
Introduction
We evaluate our system on the dataset from the CoNLL 2011 shared task using three different types of properties: synthetic oracle properties, entity phi features (number, gender, animacy, and NER type), and properties derived from unsupervised clusters targeting semantic type information.
Introduction
Our final system is competitive with the winner of the CoNLL 2011 shared task (Lee et al., 2011).
shared task is mentioned in 5 sentences in this paper.
Lassalle, Emmanuel and Denis, Pascal
Abstract
Our experiments on the CoNLL-2012 Shared Task English datasets (gold mentions) indicate that our method is robust relative to different clustering strategies and evaluation metrics, showing large and consistent improvements over a single pairwise model using the same base features.
Experiments
We evaluated the system on the English part of the corpus provided in the CoNLL-2012 Shared Task (Pradhan et al., 2012), referred to as CoNLL-2012 here.
Experiments
These metrics were recently used in the CoNLL-2011 and -2012 Shared Tasks.
Experiments
The best classifier-decoder combination reaches a score of 67.19, which would place it above the mean score (66.41) of the systems that took part in the CoNLL-2012 Shared Task (gold mentions track).
Introduction
As will be shown based on a variety of experiments on the CoNLL-2012 Shared Task English datasets, these improvements are consistent across different evaluation metrics and for the most part independent of the clustering decoder that was used.
shared task is mentioned in 5 sentences in this paper.
Huang, Fei and Yates, Alexander
Introduction
In recent semantic role labeling (SRL) competitions such as the shared tasks of CoNLL 2005 and CoNLL 2008, supervised SRL systems have been trained on newswire text, and then tested on both an in-domain test set (Wall Street Journal text) and an out-of-domain test set (fiction).
Introduction
We test our open-domain semantic role labeling system using data from the CoNLL 2005 shared task (Carreras and Marquez, 2005).
Introduction
Like the best systems from the CoNLL 2005 shared task (Punyakanok et al., 2008; Pradhan et al., 2005), they also use features from multiple parses to remain robust in the face of parser error.
shared task is mentioned in 4 sentences in this paper.
Zhang, Yuan and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi and Globerson, Amir
Experimental Setup
For the CATiB dataset, we report UAS including punctuation in order to be consistent with the published results in the 2013 SPMRL shared task (Seddah et al., 2013).
Introduction
This is better than the best published results in the 2013 SPMRL shared task (Seddah et al., 2013), including parser ensembles.
Results
To put these numbers into perspective, the bottom part of Table 3 shows the accuracy of the best systems from the 2013 SPMRL shared task on Arabic parsing using predicted information (Seddah et al., 2013).
Results
Bottom part shows UAS of the best systems in the SPMRL shared task.
shared task is mentioned in 4 sentences in this paper.
Hall, David and Durrett, Greg and Klein, Dan
Abstract
On the SPMRL 2013 multilingual constituency parsing shared task (Seddah et al., 2013), our system outperforms the top single parser system of Björkelund et al.
Introduction
Our parser is also able to generalize well across languages with little tuning: it achieves state-of-the-art results on multilingual parsing, scoring higher than the best single-parser system from the SPMRL 2013 Shared Task on a range of languages, as well as on the competition’s average F1 metric.
Other Languages
We evaluate on the constituency treebanks from the Statistical Parsing of Morphologically Rich Languages Shared Task (Seddah et al., 2013).
Other Languages
5 Their best parser, and the best overall parser from the shared task, is a reranked product of "Replaced" Berkeley parsers.
shared task is mentioned in 4 sentences in this paper.
Gómez-Rodríguez, Carlos and Nivre, Joakim
Abstract
In this paper, we present a transition system for 2-planar dependency trees — trees that can be decomposed into at most two planar graphs — and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a state-of-the-art transition-based parser on four data sets from the CoNLL-X shared task.
Empirical Evaluation
In order to get a first estimate of the empirical accuracy that can be obtained with transition-based 2-planar parsing, we have evaluated the parser on four data sets from the CoNLL-X shared task (Buchholz and Marsi, 2006): Czech, Danish, German and Portuguese.
Empirical Evaluation
(2006b) in the original shared task, where the pseudo-projective version of MaltParser was one of the two top performing systems (Buchholz and Marsi, 2006).
Introduction
Although the contributions of this paper are mainly theoretical, we also present an empirical evaluation of the 2-planar parser, showing that it outperforms the projective parser on four data sets from the CoNLL-X shared task (Buchholz and Marsi, 2006).
shared task is mentioned in 4 sentences in this paper.
Guzmán, Francisco and Joty, Shafiq and Màrquez, Lluís and Nakov, Preslav
Experimental Setup
In our experiments, we used the data available for the WMT12 and the WMT11 metrics shared tasks for translations into English.3 This included the output from the systems that participated in the WMT12 and the WMT11 MT evaluation campaigns, both consisting of 3,003 sentences, for four different language pairs: Czech-English (CS-EN), French-English (FR-EN), German-English (DE-EN), and Spanish-English (ES-EN); as well as a dataset with the English references.
Experimental Setup
Table 1: Number of systems (systs), judgments (ranks), unique sentences (sents), and different judges (judges) for the different language pairs, for the human evaluation of the WMT12 and WMT11 shared tasks.
Introduction
We first design two discourse-aware similarity measures, which use DTs generated by a publicly-available discourse parser (Joty et al., 2012); then, we show that they can help improve a number of MT evaluation metrics at the segment- and at the system-level in the context of the WMT11 and the WMT12 metrics shared tasks (Callison-Burch et al., 2011; Callison-Burch et al., 2012).
shared task is mentioned in 3 sentences in this paper.
Huang, Fei and Yates, Alexander
Experiments
Following the CoNLL shared task from 2000, we use sections 15-18 of the Penn Treebank for our labeled training data for the supervised sequence labeler in all experiments (Tjong et al., 2000).
Experiments
The chunker’s accuracy is roughly in the middle of the range of results for the original CoNLL 2000 shared task (Tjong et al., 2000).
Experiments
For our experiment on domain adaptation, we focus on NP chunking and POS tagging, and we use the labeled training data from the CoNLL 2000 shared task as before.
shared task is mentioned in 3 sentences in this paper.
Hasan, Kazi Saidul and Ng, Vincent
Analysis
3 A more detailed analysis of the results of the SemEval-2010 shared task and the approaches adopted by the participating systems can be found in Kim et al.
Evaluation
To score the output of a keyphrase extraction system, the typical approach, which is also adopted by the SemEval-2010 shared task on keyphrase extraction, is (1) to create a mapping between the keyphrases in the gold standard and those in the system output using exact match, and then (2) to score the output using evaluation metrics such as precision (P), recall (R), and F-score (F).
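As a rough sketch of the exact-match scoring described above (not the official SemEval-2010 scorer; the function name, the toy keyphrases, and the assumption that phrases are already normalized, e.g. stemmed and lowercased, are all hypothetical):

def exact_match_prf(gold_keyphrases, predicted_keyphrases):
    """Precision, recall, and F-score under exact string match,
    assuming keyphrases are already normalized."""
    gold, pred = set(gold_keyphrases), set(predicted_keyphrases)
    matched = len(gold & pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

# Toy example: one of two predicted keyphrases matches the gold standard.
print(exact_match_prf({"dependency parsing", "shared task"},
                      {"dependency parsing", "coreference"}))  # (0.5, 0.5, 0.5)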
Evaluation
For example, KP-Miner (El-Beltagy and Rafea, 2010), an unsupervised system, ranked third in the SemEval-2010 shared task with an F-score of 25.2, which is comparable to the best supervised system scoring 27.5.
shared task is mentioned in 3 sentences in this paper.
Luo, Xiaoqiang and Pradhan, Sameer and Recasens, Marta and Hovy, Eduard
BLANC for Imperfect Response Mentions
Table 1: The proposed BLANC scores of the CoNLL-2011 shared task participants.
BLANC for Imperfect Response Mentions
Table 2: The proposed BLANC scores of the CoNLL-2012 shared task participants.
Introduction
The proposed BLANC is applied to the CoNLL 2011 and 2012 shared task participants, and the scores and its correlations with existing metrics are shown in Section 5.
shared task is mentioned in 3 sentences in this paper.
Ma, Xuezhe and Xia, Fei
Data and Tools
We evaluate our approach on three target languages from CoNLL shared task treebanks, which do not appear in Google Universal Treebanks.
Experiments
To make a thorough empirical comparison with previous studies, we also evaluate our system without unlabeled data (-U) on treebanks from CoNLL shared task on dependency parsing (Buchholz and Marsi, 2006; Nivre et al., 2007).
Experiments
Table 6: Parsing results on treebanks from CoNLL shared tasks for eight target languages.
shared task is mentioned in 3 sentences in this paper.
Ma, Ji and Zhang, Yue and Zhu, Jingbo
Abstract
Experiments on the SANCL 2012 shared task show that our approach achieves 93.15% average tagging accuracy, which is the best accuracy reported so far on this data set, higher than those given by ensembled syntactic parsers.
Experiments
Our experiments are conducted on the data set provided by the SANCL 2012 shared task, which aims at building a single robust syntactic analysis system across the web-domain.
Introduction
We conduct experiments on the official data set provided by the SANCL 2012 shared task (Petrov and McDonald, 2012).
shared task is mentioned in 3 sentences in this paper.
Abend, Omri and Reichart, Roi and Rappoport, Ari
Related Work
PB is a standard corpus for SRL evaluation and was used in the CoNLL SRL shared tasks of 2004 (Carreras and Marquez, 2004) and 2005 (Carreras and Marquez, 2005).
Related Work
The CoNLL shared tasks of 2004 and 2005 were devoted to SRL, and studied the influence of different syntactic annotations and domain changes on SRL results.
Related Work
Supervised clause detection was also tackled as a separate task, notably in the CoNLL 2001 shared task (Tjong Kim Sang and Dejean, 2001).
shared task is mentioned in 3 sentences in this paper.
Guinaudeau, Camille and Strube, Michael
Experiments
To do so, we use one of the top performing systems from the CoNLL 2012 shared task (Martschat et al., 2012).
Experiments
These two tasks were performed on documents extracted from the English test part of the CoNLL 2012 shared task (Pradhan et al., 2012).
Experiments
The system was trained on the English training part of the CoNLL 2012 shared task filtered in the same way as the test part.
shared task is mentioned in 3 sentences in this paper.
Zapirain, Beñat and Agirre, Eneko and Màrquez, Lluís
Experimental Setting 3.1 Datasets
The data used in this work is the benchmark corpus provided by the SRL shared task of CoNLL-2005 (Carreras and Marquez, 2005).
Experimental Setting 3.1 Datasets
The system achieves very good performance in the CoNLL-2005 shared task dataset and in the SRL subtask of the SemEval-2007 English lexical sample task (Zapirain et al., 2007).
On the Generalization of Role Sets
This is the setting used in the CoNLL 2005 shared task (Carreras and Marquez, 2005).
shared task is mentioned in 3 sentences in this paper.