Abstract | We devise a gold-standard sense- and parse-tree-annotated dataset based on the intersection of the Penn Treebank and SemCor, and experiment with different approaches to both semantic representation and disambiguation.
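Abstract | Constructing such a dataset amounts to intersecting two sets of gold annotations over the same Brown Corpus sentences. The following is a minimal sketch of that intersection step, assuming sentence-keyed annotation listings; the file names and key format are hypothetical, not the authors' actual pipeline.

```python
# Hypothetical sketch: keep only those Brown Corpus sentences that have both
# a Penn Treebank gold parse and a SemCor gold sense annotation.  The file
# names and the (document_id, sentence_id) key scheme are illustrative
# assumptions, not the authors' actual pipeline.

def load_keys(path):
    """Read one 'document_id sentence_id' key per line, e.g. 'br-a01 14'."""
    with open(path, encoding="utf-8") as f:
        return {tuple(line.split()) for line in f if line.strip()}

ptb_keys = load_keys("ptb_brown_sentences.txt")    # sentences with gold parses
semcor_keys = load_keys("semcor_sentences.txt")    # sentences with gold senses

shared = ptb_keys & semcor_keys
print(f"{len(shared)} sentences carry both gold parses and gold senses")
```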
Background | We diverge from this norm in focusing exclusively on a sense-annotated subset of the Brown Corpus portion of the Penn Treebank, in order to investigate the upper bound performance of the models given gold-standard sense information. |
Background | Based on gold-standard sense information, they achieved substantial improvements over a basic parse selection model in the context of the Hinoki treebank.
Experimental setting | One of the main requirements for our dataset is the availability of gold-standard sense and parse tree annotations. |
Experimental setting | The gold-standard sense annotations allow us to perform upper bound evaluation of the relative impact of a given semantic representation on parsing and PP attachment performance, to contrast with the performance in more realistic semantic disambiguation settings. |
Experimental setting | The gold-standard parse tree annotations are required in order to carry out evaluation of parser and PP attachment performance. |
Integrating Semantics into Parsing | We experiment with different ways of tackling WSD, using both gold-standard data and automatic methods. |
Introduction | We explore a number of disambiguation strategies, including the use of hand-annotated (gold-standard) senses, the
Introduction | These results are achieved using most-frequent-sense information, which surprisingly outperforms both gold-standard senses and automatic WSD.
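Introduction | The most-frequent-sense heuristic referred to here is conventionally realised as the first WordNet sense of a lemma. The sketch below contrasts it with a simple automatic WSD baseline (simplified Lesk) using NLTK; the example word and sentence are illustrative only, and this is not the paper's own implementation.

```python
# Requires: nltk, plus the 'wordnet' data package.
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

sentence = "The fisherman sat on the bank of the river".split()

# Most-frequent-sense heuristic: WordNet lists synsets in order of frequency
# in SemCor, so the first synset is the conventional MFS baseline.
mfs = wn.synsets("bank", pos=wn.NOUN)[0]

# A simple automatic WSD baseline (simplified Lesk) for comparison.
auto = lesk(sentence, "bank", pos="n")

print("MFS :", mfs.name(), "-", mfs.definition())
print("Lesk:", auto.name(), "-", auto.definition())
```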
Abstract | Statistical parsing of noun phrase (NP) structure has been hampered by a lack of gold-standard data. |
Abstract | We correct these errors in CCGbank using a gold-standard corpus of NP structure, resulting in a much more accurate corpus. |
Background | Recently, Vadas and Curran (2007a) annotated internal NP structure for the entire Penn Treebank, providing a large gold-standard corpus for NP bracketing. |
Background | We use these brackets to determine new gold-standard CCG derivations in Section 3. |
Background | PropBank (Palmer et al., 2005) is used as a gold-standard to inform these decisions, similar to the way that we use the Vadas and Curran (2007a) data. |
DepBank evaluation | Clark and Curran (2007a) report an upper bound on performance, using gold-standard CCGbank dependencies, of 84.76% F-score. |
DepBank evaluation | Firstly, we show the figures achieved using gold-standard CCGbank derivations in Table 7. |
DepBank evaluation | Table 7: DepBank gold-standard evaluation |
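DepBank evaluation | The upper bound is a dependency F-score computed against the gold standard. The toy sketch below shows how precision, recall, and F-score follow from comparing a parser's dependency set with a gold set; the tuples and counts are made up for illustration and do not come from CCGbank or DepBank.

```python
# Toy illustration of labelled-dependency F-score: each dependency is a
# (head, relation, dependent) tuple; the tuples below are made up.
gold = {("saw", "nsubj", "John"), ("saw", "dobj", "Mary"), ("saw", "punct", ".")}
test = {("saw", "nsubj", "John"), ("saw", "dobj", "Mary"), ("Mary", "det", "the")}

correct = len(gold & test)
precision = correct / len(test)
recall = correct / len(gold)
f_score = 2 * precision * recall / (precision + recall)

print(f"P={precision:.2%}  R={recall:.2%}  F={f_score:.2%}")
```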
Experiments | Table 3: Parsing results with gold-standard POS tags |
Experiments | Table 4 shows that, unsurprisingly, performance is lower without the gold-standard data.
Experiments | We can see that parsing F-score drops by about 2% compared to using gold-standard POS and NER data; however, the NER features still improve performance by about 0.3%.
Evaluation | Rather than inspecting a random sample of classes, the evaluation validates the results against a reference set of 40 gold-standard classes that were manually assembled as part of previous work (Pasca, 2007). |
Evaluation | To evaluate the precision of the extracted instances, the manual label of each gold-standard class (e.g., SearchEngine) is mapped into a class label extracted from text (e.g., search engines). |
Evaluation | As shown in the first two columns of Table 3, the mapping into extracted class labels succeeds for 37 of the 40 gold-standard classes. |
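Evaluation | The mapping step essentially normalises a CamelCase manual label into a lowercase phrase and matches it against the extracted class labels. The sketch below illustrates one plausible way to do this; the helper names, the example label set, and the naive pluralisation rule are assumptions for illustration, not the evaluation's actual procedure.

```python
import re

def label_variants(manual_label):
    """Split a CamelCase manual label into a lowercase phrase and a naive
    plural, e.g. 'SearchEngine' -> {'search engine', 'search engines'}."""
    phrase = " ".join(re.findall(r"[A-Z][a-z]*", manual_label)).lower()
    return {phrase, phrase + "s"}   # naive pluralisation, for illustration only

# Hypothetical set of class labels extracted from text.
extracted_labels = {"search engines", "digital cameras", "rivers"}

def map_to_extracted(manual_label):
    matches = label_variants(manual_label) & extracted_labels
    return next(iter(matches), None)

print(map_to_extracted("SearchEngine"))   # -> 'search engines'
```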