Index of papers in Proc. ACL 2013 that mention
  • gold-standard
Kim, Joohyun and Mooney, Raymond
Abstract
Unlike conventional reranking used in syntactic and semantic parsing, gold-standard reference trees are not naturally available in a grounded setting.
Abstract
successful task completion) can be used as an alternative, experimentally demonstrating that its performance is comparable to training on gold-standard parse trees.
Experimental Evaluation
It is calculated by comparing the system’s MR output to the gold-standard MR.
Introduction
Standard reranking requires gold-standard interpretations (e.g.
Introduction
However, grounded language learning does not provide gold-standard interpretations for the training examples.
Introduction
Instead of using gold-standard annotations to determine the correct interpretations, we simply prefer interpretations of navigation instructions that, when executed in the world, actually reach the intended destination.
Modified Reranking Algorithm
Instead, our modified model replaces the gold-standard reference parse with the “pseudo-gold” parse tree
Modified Reranking Algorithm
To circumvent the need for gold-standard reference parses, we select a pseudo-gold parse from the candidates produced by the GEN function.
Modified Reranking Algorithm
In a similar vein, when reranking semantic parses, Ge and Mooney (2006) chose as a reference parse the one which was most similar to the gold-standard semantic annotation.
gold-standard is mentioned in 22 sentences in this paper.
Topics mentioned in this paper:
Bergsma, Shane and Van Durme, Benjamin
Applying Class Attributes
Our first technique provides a simple way to use our identified self-distinguishing attributes in conjunction with a classifier trained on gold-standard data.
Applying Class Attributes
(3) BootStacked: Gold Standard and Bootstrapped Combination Although we show that an accurate classifier can be trained using auto-annotated Bootstrapped data alone, we also test whether we can combine this data with any gold-standard training examples to achieve even better performance.
Conclusion
We presented three effective techniques for leveraging this knowledge within the framework of supervised user characterization: rule-based postprocessing, a learning-by-bootstrapping approach, and a stacking approach that integrates the predictions of the bootstrapped system into a system trained on annotated gold-standard training data.
Conclusion
While our technique has advanced the state-of-the-art on this important task, our approach may prove even more useful on other tasks where training on thousands of gold-standard examples is not even an option.
Introduction
Our bootstrapped system, trained purely from automatically-annotated Twitter data, significantly reduces error over a state-of-the-art system trained on thousands of gold-standard training examples.
Learning Class Attributes
In our gold-standard gender data (Section 5), however, every user has a homepage [by dataset construction]; we might therefore incorrectly classify every user as Male.
Results
A standard classifier trained on 100 gold-standard training examples improves over this baseline, to 72.0%, while one with 2282 training examples achieves 84.0%.
Twitter Gender Prediction
We can therefore benchmark our approach against state-of-the-art supervised systems trained with plentiful gold-standard data, giving us an idea of how well our Bootstrapped system might compare to theoretically top-performing systems on other tasks, domains, and social media platforms where such gold-standard training data is not available.
gold-standard is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Narisawa, Katsuma and Watanabe, Yotaro and Mizuno, Junta and Okazaki, Naoaki and Inui, Kentaro
Related work
In order to prepare a gold-standard data set, we obtained 1,041 sentences by randomly sampling about 1% of the sentences containing numbers (Arabic digits and/or Chinese numerical characters) in a Japanese Web corpus (100 million pages) (Shinzato et al., 2012).
Related work
recall using the gold-standard data set”.
Related work
We built a gold-standard data set for numerical common sense.
gold-standard is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Laparra, Egoitz and Rigau, German
Discussion
But the actual gold-standard annotation is: [arg1 buyers that weren’t disclosed].
Evaluation
For every argument position in the gold-standard, the scorer expects a single predicted constituent to fill in.
Evaluation
The function above relates the set of tokens that form a predicted constituent, Predicted, and the set of tokens that are part of an annotated constituent in the gold-standard, True.
Evaluation
For each missing argument, the gold-standard includes the whole coreference chain of the filler.
Introduction
The following example includes the gold-standard annotations for a traditional SRL process:
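The token-overlap scoring described in the Evaluation excerpts above can be sketched as follows. This is a minimal illustration only: the function name and the use of the Dice coefficient are assumptions, and the paper's actual scorer may use a different overlap formula.

```python
def token_overlap(predicted: set, true: set) -> float:
    """Score a predicted constituent against a gold-standard one by the
    overlap of their token sets (Dice coefficient here; the exact formula
    in the paper's scorer may differ)."""
    if not predicted and not true:
        return 1.0
    return 2 * len(predicted & true) / (len(predicted) + len(true))

# Hypothetical example: the predicted span shares 2 of 3 tokens
# with the gold-standard constituent.
score = token_overlap({"buyers", "that", "disclosed"},
                      {"buyers", "that", "werent"})
```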
gold-standard is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
O'Connor, Brendan and Stewart, Brandon M. and Smith, Noah A.
Experiments
“Express intent to deescalate military engagement”), we elect to measure model quality as lexical scale parity: whether all the predicate paths within one automatically learned frame tend to have similar gold-standard scale scores.
Experiments
(This measures cluster cohesiveness against a one-dimensional continuous scale, instead of measuring cluster cohesiveness against a gold-standard clustering as in VI, Rand index, or purity.)
Experiments
We assign each path a gold-standard scale g(w) by resolving through its matching pattern’s CAMEO code.
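The "lexical scale parity" idea in the excerpts above (checking whether paths in one learned frame carry similar gold-standard scale scores g(w)) can be sketched as mean within-frame variance, lower meaning more cohesive. The function name and the choice of variance as the cohesiveness statistic are assumptions; the paper's exact measure may differ.

```python
from statistics import mean, pvariance

def scale_parity(frames):
    """Mean within-frame population variance of gold-standard scale
    scores: each element of `frames` is the list of g(w) scores for
    the paths clustered into one frame. Lower = more cohesive.
    (Illustrative statistic only; not the paper's exact measure.)"""
    return mean(pvariance(scores) for scores in frames if len(scores) > 1)

# Hypothetical frames: one tight cluster of scores, one loose one.
parity = scale_parity([[1.0, 1.1, 0.9], [-2.0, 2.0]])
```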
gold-standard is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Choi, Jinho D. and McCallum, Andrew
Selectional branching
Among all transition sequences generated by Mr_1, training instances from only T1 and Tg are used to train Mr, where T1 is the one-best sequence and Tg is a sequence giving the most accurate parse output compared to the gold-standard tree.
Transition-based dependency parsing
This decision is consulted by gold-standard trees during training and a classifier during decoding.
Transition-based dependency parsing
Table 3 shows a transition sequence generated by our parsing algorithm using gold-standard decisions.
gold-standard is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Manshadi, Mehdi and Gildea, Daniel and Allen, James
Introduction
Our rich set of features significantly improves the performance of the QSD model, even though we give up the gold-standard dependency features (Sect.
Related work
To find the gain that can be obtained with gold-standard parses, we used MA11’s system with their hand-annotated and the equivalent automatically generated features.
Task definition
For example if G3 in Figure 1 is a gold-standard DAG and G1 is a candidate DAG, TC-based metrics count 2 > 3 as another match, even though it is entailed from 2 > 1 and 1 > 3.
gold-standard is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wu, Yuanbin and Ng, Hwee Tou
Experiments
The gold-standard edits are with → to and ε → the.
Experiments
Given a set of gold-standard edits, the original (ungrammatical) input text, and the corrected system output text, the M2 scorer searches for the system edits that have the largest overlap with the gold-standard edits.
Experiments
The HOO 2011 shared task provides two sets of gold-standard edits: the original gold-standard edits produced by the annotator, and the official gold-standard edits.
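The edit-matching idea in the excerpts above can be sketched as below. This is a toy version only: the real M2 scorer searches a lattice of possible system edits for the maximally matching set, whereas this sketch just counts exact matches; the (start, end, replacement) edit representation is also an assumption.

```python
def edit_f1(gold_edits, system_edits):
    """Toy M2-style scoring: precision/recall/F1 over system edits that
    exactly match a gold-standard edit. Edits are hypothetical
    (start, end, replacement) tuples over source tokens."""
    matched = len(set(gold_edits) & set(system_edits))
    p = matched / len(system_edits) if system_edits else 1.0
    r = matched / len(gold_edits) if gold_edits else 1.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Gold edits as in the example above: "with" -> "to", and an
# insertion of "the" (positions are made up for illustration).
gold = {(3, 4, "to"), (7, 7, "the")}
system = {(3, 4, "to")}          # system found only the first edit
f1 = edit_f1(gold, system)       # P = 1.0, R = 0.5
```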
gold-standard is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: