Abstract | The proposed models are tested on three different tasks: coarse-grained word sense disambiguation, fine-grained word sense disambiguation, and detection of literal vs. nonliteral usages of potentially idiomatic expressions. |
Experimental Setup | Sense Paraphrases: For word sense disambiguation tasks, the paraphrases of the sense keys are represented by information from WordNet 2.1. |
Experiments | McCarthy (2009) also addresses the issue of performance and cost by comparing supervised word sense disambiguation systems with unsupervised ones. |
Experiments | The reason is that although this system is claimed to be unsupervised, and it performs better than all the participating systems (including the supervised systems) in the SemEval-2007 shared task, it still needs to incorporate a lot of prior knowledge, specifically information about co-occurrences between different word senses, which was obtained from a number of resources (SSI+LKB) including: (i) SemCor (manually annotated); (ii) LDC-DSO (partly manually annotated); (iii) collocation dictionaries which are then disambiguated semi-automatically. |
Experiments | Table 4: Model performance (F-score) for the fine-grained word sense disambiguation task. |
Introduction | Word sense disambiguation (WSD) is the task of automatically determining the correct sense for a target word given the context in which it occurs. |
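To make the task concrete, here is a minimal sketch of a gloss-overlap disambiguator in the spirit of the simplified Lesk algorithm; the two-sense inventory and glosses for "bank" are invented for illustration, not taken from WordNet:

```python
# Minimal gloss-overlap WSD sketch (simplified-Lesk style).
# The tiny sense inventory below is invented for illustration.
SENSES = {
    "bank": {
        "bank.n.01": "financial institution that accepts deposits and lends money",
        "bank.n.02": "sloping land beside a body of water such as a river",
    }
}

def disambiguate(word, context):
    """Pick the sense whose gloss shares the most words with the context."""
    context_words = set(context.lower().split())
    def overlap(sense):
        return len(context_words & set(SENSES[word][sense].split()))
    return max(SENSES[word], key=overlap)

print(disambiguate("bank", "He sat on the bank of the river and watched the water"))
# -> bank.n.02
```

Real systems replace the toy glosses with a full sense inventory and a smarter similarity measure, but the input/output contract is the same: a target word plus its context in, one sense label out.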
Related Work | There is a large body of work on WSD, covering supervised, unsupervised ( word sense induction) and knowledge-based approaches (see McCarthy (2009) for an overview). |
Related Work | Boyd-Graber and Blei (2007) propose an unsupervised approach that integrates McCarthy et al.’s (2004) method for finding predominant word senses into a topic modelling framework. |
Related Work | Topic models have also been applied to the related task of word sense induction. |
The Sense Disambiguation Model | WordNet is a fairly rich resource which provides detailed information about word senses (glosses, example sentences, synsets, semantic relations between senses, etc.). |
The Sense Disambiguation Model | However, this assumption does not hold, as the true distribution of word senses is often highly skewed (McCarthy, 2009). |
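A practical consequence of this skew is the strength of the most-frequent-sense (MFS) baseline: always predicting the predominant sense is hard to beat. A toy sketch with invented sense counts:

```python
from collections import Counter

# Toy sense-annotated occurrences of "bank"; the counts are invented
# to illustrate a skewed sense distribution.
observed = ["bank.n.01"] * 8 + ["bank.n.02"] * 2

counts = Counter(observed)
mfs = counts.most_common(1)[0][0]          # predominant sense
accuracy = counts[mfs] / sum(counts.values())  # accuracy of always predicting it
print(mfs, accuracy)  # -> bank.n.01 0.8
```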
BabelNet | We collect (a) from WordNet, all available word senses (as concepts) and all the semantic pointers between synsets (as relations); (b) from Wikipedia, all encyclopedic entries (i.e. |
Experiment 1: Mapping Evaluation | The final mapping contains 81,533 pairs of Wikipages and word senses they map to, covering 55.7% of the noun senses in WordNet. |
Experiment 2: Translation Evaluation | Table 2 column headers: Language, Word senses, Synsets. |
Experiment 2: Translation Evaluation | In Table 2 we report the number of synsets and word senses available in the gold-standard resources for the 5 languages. |
Experiment 2: Translation Evaluation | We assess the coverage of BabelNet against our gold-standard wordnets both in terms of synsets and word senses. |
Introduction | Recent studies in the difficult task of Word Sense Disambiguation (Navigli, 2009b, WSD) have shown the impact of the amount and quality of lexical knowledge (Cuadros and Rigau, 2006): richer knowledge sources can be of great benefit to both knowledge-lean systems (Navigli and Lapata, 2010) and supervised classifiers (Ng and Lee, 1996; Yarowsky and Florian, 2002). |
Methodology | We denote with w_p^i the i-th sense of a word w with part of speech p. We use word senses to unambiguously denote the corresponding synsets (e.g. |
Methodology | Hereafter, we use word sense and synset interchangeably. |
Methodology | Given a WordNet word sense in our babel synset of interest (e.g. |
Set Expansion | Unlike some other thesauri (such as WordNet and thesaurus.com), entries are not broken down by word sense. |
Set Expansion | Unlike some other thesauri (including WordNet and thesaurus.com), the entries are not broken down by word sense or part of speech. |
Set Expansion | It consists of a large number of synsets; a synset is a set of one or more similar word senses. |
Conclusion | Also, our system is the first unsupervised method that has been applied to Erk and McCarthy’s (2009) graded word sense assignment task, showing a substantial positive correlation with the gold standard. |
Experiment: Ranking Word Senses | In this section, we apply our model to a different word sense ranking task: Given a word w in context, the task is to decide to what extent the different |
Experiment: Ranking Word Senses | (2008), we represent different word senses by the words in the corresponding synsets. |
Experiment: Ranking Word Senses | For each word sense, we compute the centroid of the second-order vectors of its synset members. |
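The centroid-and-similarity step can be sketched as follows; the three-dimensional vectors and two-sense synsets are toy stand-ins for the actual second-order vectors, invented for illustration:

```python
import math

# Toy stand-ins for second-order vectors of synset members (invented values).
vectors = {
    "money":   [1.0, 0.1, 0.0],
    "deposit": [0.9, 0.2, 0.1],
    "river":   [0.0, 1.0, 0.2],
    "shore":   [0.1, 0.9, 0.3],
}
synsets = {
    "bank.n.01": ["money", "deposit"],  # financial sense
    "bank.n.02": ["river", "shore"],    # riverside sense
}

def centroid(words):
    """Componentwise mean of the members' vectors."""
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.dist(a, [0.0] * len(a)) * math.dist(b, [0.0] * len(b)))

context = [0.1, 1.9, 0.5]  # toy vector for the surrounding context
best = max(synsets, key=lambda s: cosine(centroid(synsets[s]), context))
print(best)  # -> bank.n.02
```

For the ranking task, the per-sense cosine scores themselves (rather than only the argmax) would be compared against the graded gold-standard judgments.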
Introduction | In a second experiment, we apply our model to the “word sense similarity task” recently proposed by Erk and McCarthy (2009), which is a refined variant of a word-sense disambiguation task. |
The model | The objective is to incorporate (inverse) selectional preference information from the context (r, w') in such a way as to identify the correct word sense of w. This suggests that the dimensions of the vector representing w should be filtered so that only those compatible with the context remain. |
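Under these assumptions, the filtering step can be sketched as a pointwise product between the target word's vector and a context-preference vector; dimensions the context does not support are zeroed out. Both vectors and the pointwise-product combination are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch: suppress vector dimensions that the context does not support.
# Vectors and the pointwise-product combination are invented for illustration.
def filter_by_context(target_vec, context_pref_vec):
    return [t * c for t, c in zip(target_vec, context_pref_vec)]

target = [0.8, 0.5, 0.6]  # vector for the ambiguous word w
pref   = [0.0, 1.0, 0.5]  # (inverse) selectional preferences from context (r, w')
print(filter_by_context(target, pref))  # -> [0.0, 0.5, 0.3]
```

The filtered vector can then be compared against per-sense vectors, as only the context-compatible components contribute to the similarity.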
Abstract | While pseudo-words originally evaluated word sense disambiguation, they are now commonly used to evaluate selectional preferences. |
History of Pseudo-Word Disambiguation | Pseudo-words were introduced simultaneously by two papers studying statistical approaches to word sense disambiguation (WSD). |
Introduction | One way to mitigate this problem is with pseudo-words, a method for automatically creating test corpora without human labeling, originally proposed for word sense disambiguation (Gale et al., |
Introduction | While pseudo-words are now less often used for word sense disambiguation, they are a common way to evaluate selectional preferences, models that measure the strength of association between a predicate and its argument filler, e.g., that the noun lunch is a likely object of eat. |
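The construction behind pseudo-words is simple to sketch: two real words are conflated into a single ambiguous token, and the word that was replaced becomes the gold label, so no human annotation is needed. The word pair and sentences below are invented for illustration:

```python
# Sketch of pseudo-word test-set creation: conflate two real words into one
# ambiguous token; the replaced word serves as the gold "sense" label.
# The word pair and sentences are invented for illustration.
def make_pseudoword_corpus(sentences, w1, w2):
    pseudo = f"{w1}-{w2}"
    corpus = []
    for sent in sentences:
        tokens = sent.split()
        for i, tok in enumerate(tokens):
            if tok in (w1, w2):
                masked = tokens[:i] + [pseudo] + tokens[i + 1:]
                corpus.append((" ".join(masked), tok))  # (disguised sentence, label)
    return corpus

for item in make_pseudoword_corpus(["she ate a banana", "he closed the door"],
                                   "banana", "door"):
    print(item)
# -> ('she ate a banana-door', 'banana')
# -> ('he closed the banana-door', 'door')
```

A disambiguator (or selectional preference model) is then scored on how often it recovers the original word from the disguised context.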
The S-Space Framework | We divide the algorithms into four categories based on their structural similarity: document-based, co-occurrence, approximation, and Word Sense Induction (WSI) models. |
The S-Space Framework | WSI models likewise build on co-occurrence statistics, but additionally attempt to discover distinct word senses while constructing the vector space. |
Word Space Models | Word Sense Induction Models: Purandare and Pedersen (Purandare and Pedersen, 2004); HERMIT (Jurgens and Stevens, 2010). |
Automatic Metaphor Recognition | This idea originates from a similarity-based word sense disambiguation method developed by Karov and Edelman (1998). |
Metaphor Annotation in Corpora | To reflect two distinct aspects of the phenomenon, metaphor annotation can be split into two stages: identifying metaphorical senses in text (akin to word sense disambiguation) and annotating the source-target domain mappings underlying the production of metaphorical expressions. |
Metaphor Annotation in Corpora | Such annotation can be viewed as a form of word sense disambiguation with an emphasis on metaphoricity. |