Index of papers in Proc. ACL 2010 that mention
  • gold-standard
Navigli, Roberto and Ponzetto, Simone Paolo
Abstract
We conduct experiments on new and existing gold-standard datasets to show the high quality and coverage of the resource.
Experiment 1: Mapping Evaluation
The gold-standard dataset includes 505 nonempty mappings, i.e.
Experiment 2: Translation Evaluation
This is assessed in terms of coverage against gold-standard resources (Section 5.1) and against a manually-validated dataset of translations (Section 5.2).
Experiment 2: Translation Evaluation
Table 2: Size of the gold-standard wordnets.
Experiment 2: Translation Evaluation
We compare BabelNet against gold-standard resources for 5 languages, namely: the subset of GermaNet (Lemnitzer and Kunze, 2002) included in EuroWordNet for German, MultiWordNet (Pianta et al., 2002) for Italian, the Multilingual Central Repository for Spanish and Catalan (Atserias et al., 2004), and WOrdnet Libre du Francais (Benoit and Fiser, 2008, WOLF) for French.
gold-standard is mentioned in 15 sentences in this paper.
Thater, Stefan and Fürstenau, Hagen and Pinkal, Manfred
Experiment: Ranking Word Senses
To compare the predicted ranking to the gold-standard ranking, we use Spearman’s ρ, a standard method to compare ranked lists to each other.
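As an illustration of the rank correlation named in the sentence above, here is a minimal sketch of Spearman's ρ computed as the Pearson correlation of rank vectors, with tied values sharing their average rank. The function names (`average_ranks`, `spearman_rho`) and the sample scores are hypothetical and are not taken from the paper, which does not say how it handles ties.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, gold):
    """Spearman's rho: Pearson correlation between the two rank vectors."""
    if len(predicted) != len(gold):
        raise ValueError("both score lists must have the same length")
    rx, ry = average_ranks(predicted), average_ranks(gold)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one ranking is constant")
    return cov / (sd_x * sd_y)


# Hypothetical model scores and gold-standard judgments for three senses.
model_scores = [0.9, 0.1, 0.5]
gold_scores = [3, 1, 2]
rho = spearman_rho(model_scores, gold_scores)
```

Both lists order the three items identically in this example, so `rho` comes out at 1 up to floating-point rounding; reversing one list gives -1.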
Experiment: Ranking Word Senses
The first column shows the correlation of our model’s predictions with the human judgments from the gold-standard, averaged over all instances.
Experiments: Ranking Paraphrases
We follow E&P and evaluate it only on the second subtask: we extract paraphrase candidates from the gold standard by pooling all annotated gold-standard paraphrases for all instances of a verb in all contexts, and use our model to rank these paraphrase candidates in specific contexts.
Experiments: Ranking Paraphrases
P10 measures the percentage of gold-standard paraphrases in the top-ten list of paraphrases as ranked by the system, and can be defined as follows (McCarthy and Navigli, 2007):
Experiments: Ranking Paraphrases
where M is the list of 10 paraphrase candidates top-ranked by the model, G is the corresponding annotated gold-standard data, and f(s) is the weight of the individual paraphrases.
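The formula itself is not reproduced in this index entry. The sketch below shows one reading consistent with the description of M, G and f(s): the summed weight of gold-standard paraphrases that appear in the model's top ten, divided by the summed weight of all gold-standard paraphrases. That ratio is an assumption, as are the function name `p10`, the dictionary representation of G, and the sample data.

```python
def p10(ranked_candidates, gold_weights, k=10):
    """Weighted share of gold-standard paraphrases found in the top-k list.

    ranked_candidates: paraphrase candidates ordered best-first by the model.
    gold_weights: maps each gold-standard paraphrase s to its weight f(s).
    """
    total = sum(gold_weights.values())
    if total == 0:
        raise ValueError("gold-standard weights must not sum to zero")
    top_k = set(ranked_candidates[:k])  # M: the k top-ranked candidates
    hit = sum(w for s, w in gold_weights.items() if s in top_k)
    return hit / total


# Hypothetical example: two gold-standard paraphrases, one recovered.
gold = {"obtain": 3, "receive": 1}
ranking = ["obtain", "grab", "fetch"]
score = p10(ranking, gold)  # 3 / (3 + 1) = 0.75
```

Under this reading, a system that ranks every gold-standard paraphrase within its top ten scores 1.0 regardless of the order inside that list.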
gold-standard is mentioned in 9 sentences in this paper.
Gerber, Matthew and Chai, Joyce
Conclusions and future work
First, we have created gold-standard implicit argument annotations for a small set of pervasive nominal predicates. Our analysis shows that these annotations add 65% to the role coverage of NomBank.
Evaluation
To factor out errors from standard SRL analyses, the model used gold-standard argument labels provided by PropBank and NomBank.
Evaluation
We also evaluated an oracle model that made gold-standard predictions for candidates within the two-sentence prediction window.
Implicit argument identification
Throughout our study, we used gold-standard discourse relations provided by the Penn Discourse TreeBank (Prasad et al., 2008).
gold-standard is mentioned in 4 sentences in this paper.