Investigations on Word Senses and Word Usages
Erk, Katrin and McCarthy, Diana and Gaylord, Nicholas

Article Structure

Abstract

The vast majority of work on word senses has relied on predefined sense inventories and an annotation schema where each word instance is tagged with the best fitting sense.

Introduction

The vast majority of work on word sense tagging has assumed that predefined word senses from a dictionary are an adequate proxy for the task, although of course there are issues with this enterprise both in terms of cognitive validity (Hanks, 2000; Kilgarriff, 1997; Kilgarriff, 2006) and adequacy for computational linguistics applications (Kilgarriff, 2006).

Related Work

Manual word sense assignment is difficult for human annotators (Krishnamurthy and Nicholls, 2000).

Annotation

We conducted two experiments through an online annotation interface.

Analyses

This section reports on analyses of the annotated data.

Discussion

Validity of annotation scheme.

Conclusions

We have introduced a novel annotation paradigm for word sense annotation that allows for graded judgments and for some variation between annotators.

Topics

word sense

Appears in 18 sentences as: Word Sense (1) word sense (16) word senses (3)
In Investigations on Word Senses and Word Usages
  1. The vast majority of work on word senses has relied on predefined sense inventories and an annotation schema where each word instance is tagged with the best fitting sense.
    Page 1, “Abstract”
  2. The responses from both experiments correlate with the overlap of paraphrases from the English lexical substitution task which bodes well for the use of substitutes as a proxy for word sense.
    Page 1, “Abstract”
  3. The vast majority of work on word sense tagging has assumed that predefined word senses from a dictionary are an adequate proxy for the task, although of course there are issues with this enterprise both in terms of cognitive validity (Hanks, 2000; Kilgarriff, 1997; Kilgarriff, 2006) and adequacy for computational linguistics applications (Kilgarriff, 2006).
    Page 1, “Introduction”
  4. Furthermore, given a predefined list of senses, annotation efforts and computational approaches to word sense disambiguation (WSD) have usually assumed that one best fitting sense should be selected for each usage.
    Page 1, “Introduction”
  5. In the first one, referred to as WSsim (Word Sense Similarity), annotators give graded ratings on the applicability of WordNet senses.
    Page 1, “Introduction”
  6. Manual word sense assignment is difficult for human annotators (Krishnamurthy and Nicholls, 2000).
    Page 2, “Related Work”
  7. Reported inter-annotator agreement (ITA) for fine-grained word sense assignment tasks has ranged between 69% (Kilgarriff and Rosenzweig, 2000) for a lexical sample using the HECTOR dictionary and 78.6% using WordNet (Landes et al., 1998) in all-words annotation.
    Page 2, “Related Work”
  8. The task was proposed following a background of discussions in the WSD community as to the adequacy of predefined word senses.
    Page 2, “Related Work”
  9. To the best of our knowledge there has been no study of how the data collected relates to word sense annotations or judgments of semantic similarity.
    Page 2, “Related Work”
  10. WSsim is a word sense annotation task using WordNet senses. Unlike previous word sense annotation projects, we asked annotators to provide judgments on the applicability of every WordNet sense of the target lemma with the instruction:
    Page 3, “Annotation”
  11. In traditional word sense annotation, such bias could be introduced directly through annotation guidelines or indirectly, through tools that make it easier to assign fewer senses.
    Page 3, “Annotation”
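
A minimal sketch of the graded annotation setup described in items 10 and 11: every WordNet sense of the target lemma is presented and the annotator assigns an applicability rating rather than choosing one best sense. The sketch assumes NLTK's WordNet 3.0 data and a hypothetical 1-5 scale; the function names and the example sentence are illustrative, not from the paper.

    from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

    def sense_inventory(lemma, pos=wn.VERB):
        """All WordNet senses (synsets) of the target lemma."""
        return wn.synsets(lemma, pos=pos)

    def collect_ratings(lemma, sentence, rate):
        """Collect a graded applicability judgment for every sense.

        `rate` stands in for the annotation interface: it is shown the
        sentence and a sense definition and returns an integer rating,
        e.g. on a 1-5 scale.
        """
        return {s.name(): rate(sentence, s.definition())
                for s in sense_inventory(lemma)}

    # Dummy rater that calls every sense "barely applicable" (1); a real
    # study would record a human annotator's judgment here.
    example = "They argued the case for hours."
    print(collect_ratings("argue", example, rate=lambda sent, gloss: 1))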

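Item 2 of this list notes that the graded responses correlate with paraphrase overlap from the English lexical substitution task. The sketch below shows one plausible way to relate the two signals: a Jaccard overlap between substitute sets and Spearman's rho against mean graded judgments. The overlap measure and all data are illustrative assumptions, not the paper's actual procedure or results.

    from scipy.stats import spearmanr

    def substitute_overlap(subs_a, subs_b):
        """Jaccard overlap of two substitute sets for two usages of a word."""
        a, b = set(subs_a), set(subs_b)
        return len(a & b) / len(a | b) if a | b else 0.0

    # Hypothetical usage pairs: mean graded similarity judgments and the
    # substitutes proposed for each usage in the pair.
    judgments = [4.2, 1.5, 3.0]
    overlaps = [
        substitute_overlap(["contend", "claim"], ["contend", "maintain"]),
        substitute_overlap(["quarrel", "fight"], ["claim", "maintain"]),
        substitute_overlap(["quarrel", "bicker"], ["fight", "bicker"]),
    ]

    rho, p = spearmanr(judgments, overlaps)
    print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")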

WordNet

Appears in 14 sentences as: WordNet (17)
In Investigations on Word Senses and Word Usages
  1. This paper examines the case for a graded notion of word meaning in two experiments, one which uses WordNet senses in a graded fashion, contrasted with the “winner takes all” annotation, and one which asks annotators to judge the similarity of two usages.
    Page 1, “Abstract”
  2. In the first one, referred to as WSsim (Word Sense Similarity), annotators give graded ratings on the applicability of WordNet senses.
    Page 1, “Introduction”
  3. The first study additionally tests to what extent the judgments on WordNet senses fall into clearcut clusters, while the second study allows us to explore meaning similarity independently of any lexicon resource.
    Page 1, “Introduction”
  4. Reported inter-annotator agreement (ITA) for fine-grained word sense assignment tasks has ranged between 69% (Kilgarriff and Rosenzweig, 2000) for a lexical sample using the HECTOR dictionary and 78.6% using WordNet (Landes et al., 1998) in all-words annotation.
    Page 2, “Related Work”
  5. Although we use WordNet for the annotation, our study is not a study of WordNet per se.
    Page 2, “Related Work”
  6. We choose WordNet because it is sufficiently fine-grained to examine subtle differences in usage, and because traditionally annotated datasets exist to which we can compare our results.
    Page 2, “Related Work”
  7. WSsim is a word sense annotation task using WordNet senses. Unlike previous word sense annotation projects, we asked annotators to provide judgments on the applicability of every WordNet sense of the target lemma with the instruction:
    Page 3, “Annotation”
  8. The SemCor dataset was produced alongside WordNet, so it can be expected to support the WordNet sense distinctions.
    Page 3, “Annotation”
  9. WordNet 1.7.1 was used in the annotation of both SE-3 and SemCor; we used the more current WordNet 3.0 after verifying that the lemmas included in this experiment had the same senses listed in both versions.
    Page 3, “Annotation”
  10. In the WSsim experiment, annotators rated the applicability of each WordNet 3.0 sense for a given target word occurrence.
    Page 4, “Analyses”
  11. In WordNet, they have 5, 7, and 4 senses, respectively.
    Page 4, “Analyses”
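
Items 9 and 11 above both come down to inspecting a lemma's WordNet sense inventory (checking that the senses match across versions, or simply counting them). A short sketch with NLTK, whose bundled data is WordNet 3.0; the lemmas below are placeholders, since the excerpt does not name the three lemmas with 5, 7, and 4 senses.

    from nltk.corpus import wordnet as wn  # NLTK ships WordNet 3.0 data

    def verb_senses(lemma):
        """Return the WordNet verb senses of `lemma`."""
        return wn.synsets(lemma, pos=wn.VERB)

    # Placeholder lemmas, not the paper's actual targets.
    for lemma in ["argue", "win", "shed"]:
        senses = verb_senses(lemma)
        print(f"{lemma}: {len(senses)} verb senses")
        for s in senses:
            print(f"  {s.name():<12} {s.definition()}")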


ITA

Appears in 3 sentences as: ITA (3)
In Investigations on Word Senses and Word Usages
  1. Reported inter-annotator agreement (ITA) for fine-grained word sense assignment tasks has ranged between 69% (Kilgarriff and Rosenzweig, 2000) for a lexical sample using the HECTOR dictionary and 78.6% using WordNet (Landes et al., 1998) in all-words annotation.
    Page 2, “Related Work”
  2. The use of more coarse-grained senses alleviates the problem: In OntoNotes (Hovy et al., 2006), an ITA of 90% is used as the criterion for the construction of coarse-grained sense distinctions.
    Page 2, “Related Work”
  3. However, intriguingly, for some high-frequency lemmas such as leave, this ITA threshold is not reached even after multiple re-partitionings of the semantic space (Chen and Palmer, 2009).
    Page 2, “Related Work”
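
The ITA figures quoted here (69%, 78.6%, and the 90% OntoNotes criterion) are agreement rates for traditional single-best-sense tagging. Below is a minimal sketch of observed pairwise agreement over made-up tags from two hypothetical annotators; a full study would also report a chance-corrected measure such as kappa.

    from itertools import combinations

    def pairwise_agreement(annotations):
        """Fraction of items on which a pair of annotators chose the same
        sense, averaged over all annotator pairs."""
        scores = []
        for a, b in combinations(annotations, 2):
            scores.append(sum(x == y for x, y in zip(a, b)) / len(a))
        return sum(scores) / len(scores)

    # Hypothetical sense tags for five instances of one lemma.
    annotator_1 = ["add.v.01", "add.v.02", "add.v.01", "add.v.04", "add.v.01"]
    annotator_2 = ["add.v.01", "add.v.01", "add.v.01", "add.v.04", "add.v.02"]

    print(f"ITA = {pairwise_agreement([annotator_1, annotator_2]):.0%}")  # -> 60%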
