Abstract | The vast majority of work on word senses has relied on predefined sense inventories and an annotation schema where each word instance is tagged with the best fitting sense. |
Abstract | The responses from both experiments correlate with the overlap of paraphrases from the English lexical substitution task, which bodes well for the use of substitutes as a proxy for word sense. |
Annotation | WSsim is a word sense annotation task using WordNet senses. Unlike previous word sense annotation projects, we asked annotators to provide judgments on the applicability of every WordNet sense of the target lemma with the instruction: |
Annotation | In traditional word sense annotation, such bias could be introduced directly through annotation guidelines or indirectly, through tools that make it easier to assign fewer senses. |
Introduction | The vast majority of work on word sense tagging has assumed that predefined word senses from a dictionary are an adequate proxy for the task, although of course there are issues with this enterprise both in terms of cognitive validity (Hanks, 2000; Kilgarriff, 1997; Kilgarriff, 2006) and adequacy for computational linguistics applications (Kilgarriff, 2006). |
Introduction | Furthermore, given a predefined list of senses, annotation efforts and computational approaches to word sense disambiguation (WSD) have usually assumed that one best-fitting sense should be selected for each usage. |
Introduction | In the first one, referred to as WSsim (Word Sense Similarity), annotators give graded ratings on the applicability of WordNet senses. |
Related Work | Manual word sense assignment is difficult for human annotators (Krishnamurthy and Nicholls, 2000). |
Related Work | Reported inter-annotator agreement (ITA) for fine-grained word sense assignment tasks has ranged between 69% (Kilgarriff and Rosenzweig, 2000) for a lexical sample using the HECTOR dictionary and 78.6% using WordNet (Landes et al., 1998) in all-words annotation. |
Related Work | The task was proposed following a background of discussions in the WSD community as to the adequacy of predefined word senses. |
Abstract | In this paper, we propose a sense-based translation model to integrate word senses into statistical machine translation. |
Abstract | Our method is significantly different from previous word sense disambiguation reformulated for machine translation, in that the latter in essence neglects word senses. |
Abstract | Results show that the proposed model substantially outperforms not only the baseline but also the previously reformulated word sense disambiguation approach. |
Introduction | Therefore a natural assumption is that word sense disambiguation (WSD) may contribute to statistical machine translation (SMT) by providing appropriate word senses for target translation selection with context features (Carpuat and Wu, 2005). |
Introduction | Carpuat and Wu (2005) adopt a standard formulation of WSD: predicting word senses that are defined on an ontology for ambiguous words. |
Introduction | As they apply WSD to Chinese-to-English translation, they predict word senses from a Chinese ontology HowNet and project the predicted senses to English glosses provided by HowNet. |
Abstract | Our approach can be applied for lexicography, as well as for applications like word sense disambiguation or semantic search. |
Introduction | Two of the fundamental components of a natural language communication are word sense discovery (Jones, 1986) and word sense disambiguation (Ide and Veronis, 1998). |
Introduction | Context plays a vital role in disambiguation of word senses as well as in the interpretation of the actual meaning of words. |
Introduction | For instance, the word “bank” has several distinct interpretations, including that of a “financial institution” and the “shore of a river.” Automatic discovery and disambiguation of word senses from a given text is an important and challenging problem which has been extensively studied in the literature (Jones, 1986; Ide and Veronis, 1998; Schutze, 1998; Navigli, 2009). |
Related work | Word sense disambiguation as well as word sense discovery have both remained key areas of research right from the very early initiatives in natural language processing research. |
Related work | Ide and Veronis (1998) present a very concise survey of the history of ideas used in word sense disambiguation; for a recent survey of the state of the art, one can refer to (Navigli, 2009). |
Related work | Early contributions to automatic word sense discovery were made by Karen Sparck Jones (1986); later, in lexicography, it was extensively used as a preprocessing step for preparing mono- and multilingual dictionaries (Kilgarriff and Tugwell, 2001; Kilgarriff, 2004). |
Abstract | Unsupervised word sense disambiguation (WSD) methods are an attractive approach to all-words WSD due to their non-reliance on expensive annotated data. |
Abstract | Unsupervised estimates of sense frequency have been shown to be very useful for WSD due to the skewed nature of word sense distributions. |
Background and Related Work | There has been a considerable amount of research on representing word senses and disambiguating usages of words in context (WSD), since a means of representing and disambiguating word sense is essential for producing computational systems that understand and produce natural language. |
Background and Related Work | WSD algorithms require word sense information to disambiguate token instances of a given ambiguous word, e.g. |
Background and Related Work | One extremely useful piece of information is the word sense prior or expected word sense frequency distribution. |
Introduction | The automatic determination of word sense information has been a long-term pursuit of the NLP community (Agirre and Edmonds, 2006; Navigli, 2009). |
Introduction | Word sense distributions tend to be Zipfian, and as such, a simple but surprisingly high-accuracy back-off heuristic for word sense disambiguation (WSD) is to tag each instance of a given word with its predominant sense (McCarthy et al., 2007). |
Introduction | Such an approach requires knowledge of predominant senses; however, word sense distributions — and predominant senses too — vary from corpus to corpus. |
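The predominant-sense back-off heuristic described in the excerpt above is easy to make concrete. The sketch below assumes per-word sense counts are available from some annotated or automatically induced source; the words, sense labels, and counts are invented for illustration.

```python
from collections import Counter

def predominant_sense_tagger(sense_counts):
    """Build a tagger that labels every instance of a word with its
    most frequent (predominant) sense, a strong back-off baseline
    given the Zipfian skew of word sense distributions."""
    predominant = {word: counts.most_common(1)[0][0]
                   for word, counts in sense_counts.items()}
    def tag(word):
        # Return None for words with no known sense distribution.
        return predominant.get(word)
    return tag

# Invented corpus counts: 'bank' is mostly the financial sense here.
counts = {"bank": Counter({"bank.n.financial": 90, "bank.n.river": 10})}
tag = predominant_sense_tagger(counts)
```

Because the distribution is skewed, this single lookup already sets a high baseline that context-sensitive WSD systems must beat.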
A Unified Semantic Representation | We propose a representation of any lexical item as a distribution over a set of word senses, referred to as the item’s semantic signature. |
A Unified Semantic Representation | However, traditional forms of word sense disambiguation are difficult for short texts and single words because little or no contextual information is present to perform the disambiguation task. |
Abstract | We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. |
Abstract | Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. |
Abstract | This unified representation shows state-of-the-art performance on three tasks: semantic textual similarity, word similarity, and word sense coarsening. |
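One way to make this common probabilistic representation concrete is to model a semantic signature as a sparse distribution over senses and compare two signatures with cosine similarity. This is a minimal sketch, not necessarily the paper's exact measure; the sense labels and weights are invented.

```python
import math

def cosine(sig_a, sig_b):
    """Cosine similarity between two semantic signatures, i.e.
    sparse distributions over word senses (dict: sense -> weight)."""
    dot = sum(w * sig_b.get(s, 0.0) for s, w in sig_a.items())
    na = math.sqrt(sum(w * w for w in sig_a.values()))
    nb = math.sqrt(sum(w * w for w in sig_b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented signatures for two lexical items over a shared inventory.
sig_coast = {"shore.n.01": 0.7, "coast.n.02": 0.3}
sig_bank = {"bank.n.river": 0.6, "shore.n.01": 0.4}
```

Because both a word sense and a whole document can be mapped to such a distribution, the same comparison applies at every level, which is the point of the unified representation.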
Experiment 1: Textual Similarity | In addition, the system utilizes techniques such as Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007) and makes use of resources such as Wiktionary and Wikipedia, a lexical substitution system based on supervised word sense disambiguation (Biemann, 2013), and a statistical machine translation system. |
Experiment 2: Word Similarity | that of Rapp (2003) uses word senses, an approach that is outperformed by our method. |
Experiment 3: Sense Similarity | WordNet is known to be a fine-grained sense inventory with many related word senses (Palmer et al., 2007). |
Experiment 3: Sense Similarity | word senses (Agirre and Lopez, 2003; McCarthy, 2006). |
Experiment 3: Sense Similarity | We benchmark the accuracy of our similarity measure in grouping word senses against those of Navigli (2006) and Snow et al. |
Introduction | Second, we propose a novel alignment-based method for word sense disambiguation. |
Abstract | Previous research has conflicting conclusions on whether word sense disambiguation (WSD) systems can improve information retrieval (IR) performance. |
Abstract | Together with the senses predicted for words in documents, we propose a novel approach to incorporate word senses into the language modeling approach to IR and also exploit the integration of synonym relations. |
Abstract | Our experimental results on standard TREC collections show that using the word senses tagged by a supervised WSD system, we obtain significant improvements over a state-of-the-art IR system. |
Introduction | Word sense disambiguation (WSD) is the task of identifying the correct meaning of a word in context. |
Introduction | Some of the early research showed a drop in retrieval performance by using word senses (Krovetz and Croft, 1992; Voorhees, 1993). |
Introduction | Some other experiments observed improvements by integrating word senses in IR systems (Schutze and Pedersen, 1995; Gonzalo et al., 1998; Stokoe et al., 2003; Kim et al., 2004). |
Related Work | However, it is hard to judge the effect of word senses because of the overall poor performance of both their baseline method and their system. |
Word Sense Disambiguation | 4.1 Word sense disambiguation system |
Abstract | ConceptResolver performs both word sense induction and synonym resolution on relations extracted from text using an ontology and a small amount of labeled data. |
Abstract | Word sense induction is performed by inferring a set of semantic types for each noun phrase. |
Abstract | When ConceptResolver is run on NELL’s knowledge base, 87% of the word senses it creates correspond to real-world concepts, and 85% of noun phrases that it suggests refer to the same concept are indeed synonyms. |
Introduction | Induce word senses: (i) cluster word senses with semantic type C using the classifier’s predictions. |
Introduction | It first performs word sense induction, using the extracted category instances to create one or more unambiguous word senses for each noun phrase in the knowledge base. |
Abstract | This demonstrates that word sense information can indeed enhance the performance of syntactic disambiguation. |
Background | Our decision to present results specifically for PP attachment in a parsing context reflects both our support for the new research direction for PP attachment established by Atterer and Schutze, and our wish to reinforce the findings of Stetina and Nagao that word sense information significantly enhances PP attachment performance in this new setting. |
Background | There have been a number of attempts to incorporate word sense information into parsing tasks. |
Background | The only successful applications of word sense information to parsing that we are aware of are Xiong et al. |
Experimental setting | We use Bikel’s randomized parsing evaluation comparator (with p < 0.05 throughout) to test the statistical significance of the results using word sense information, relative to the respective baseline parser using only lexical features. |
Integrating Semantics into Parsing | This problem of identifying the correct sense of a word in context is known as word sense disambiguation (WSD: Agirre and Edmonds (2006)). |
Introduction | use of the most frequent sense, and an unsupervised word sense disambiguation (WSD) system. |
Introduction | We provide the first definitive results that word sense information can enhance Penn Treebank parser performance, building on earlier results of Bikel (2000) and Xiong et al. |
Abstract | The proposed models are tested on three different tasks: coarse-grained word sense disambiguation, fine-grained word sense disambiguation, and detection of literal vs. nonliteral usages of potentially idiomatic expressions. |
Experimental Setup | Sense Paraphrases For word sense disambiguation tasks, the paraphrases of the sense keys are represented by information from WordNet 2.1. |
Experiments | McCarthy (2009) also addresses the issue of performance and cost by comparing supervised word sense disambiguation systems with unsupervised ones. |
Experiments | The reason is that although this system is claimed to be unsupervised, and it performs better than all the participating systems (including the supervised systems) in the SemEval-2007 shared task, it still needs to incorporate a lot of prior knowledge, specifically information about co-occurrences between different word senses, which was obtained from a number of resources (SSI+LKB) including: (i) SemCor (manually annotated); (ii) LDC-DSO (partly manually annotated); (iii) collocation dictionaries which are then disambiguated semi-automatically. |
Experiments | Table 4: Model performance (F-score) for the fine-grained word sense disambiguation task. |
Introduction | Word sense disambiguation (WSD) is the task of automatically determining the correct sense for a target word given the context in which it occurs. |
Related Work | There is a large body of work on WSD, covering supervised, unsupervised ( word sense induction) and knowledge-based approaches (see McCarthy (2009) for an overview). |
Related Work | Boyd-Graber and Blei (2007) propose an unsupervised approach that integrates McCarthy et al.’s (2004) method for finding predominant word senses into a topic modelling framework. |
Related Work | Topic models have also been applied to the related task of word sense induction. |
The Sense Disambiguation Model | WordNet is a fairly rich resource which provides detailed information about word senses (glosses, example sentences, synsets, semantic relations between senses, etc.). |
The Sense Disambiguation Model | However, this assumption does not hold, as the true distribution of word senses is often highly skewed (McCarthy, 2009). |
Building a Translation Graph | Undirected edges in the graph denote translations between words: an edge e ∈ E between (w1, l1) and (w2, l2) represents the belief that w1 and w2 share at least one word sense. |
Building a Translation Graph | TRANSGRAPH searched for paths in the graph between two vertices and estimated the probability that the path maintains the same word sense along all edges in the path, even when the edges come from different dictionaries. |
Building a Translation Graph | One formula estimates the probability that two multilingual dictionary entries represent the same word sense, based on the proportion of overlapping translations for the two entries. |
Introduction and Motivation | ing word senses across multiple, independently-authored dictionaries. |
Translation Inference Algorithms | However, if A, B, and C are on a circuit that starts at A, passes through B and C and returns to A, there is a high probability that all nodes on that circuit share a common word sense , given certain restrictions that we enumerate later. |
Translation Inference Algorithms | Each clique in the graph represents a set of vertices that share a common word sense . |
Translation Inference Algorithms | When two cliques intersect in two or more vertices, the intersecting vertices share the word sense of both cliques. |
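The clique reading in these excerpts can be sketched directly: each clique of translation-graph vertices is assumed to carry one shared sense, and a vertex in the intersection of two cliques carries both. The vertices and clique indices below are invented.

```python
def assign_senses(cliques):
    """Map each vertex to the set of sense ids of the cliques it
    belongs to; a vertex in two intersecting cliques gets both
    senses, mirroring the clique-intersection rule."""
    senses = {}
    for sense_id, clique in enumerate(cliques):
        for vertex in clique:
            senses.setdefault(vertex, set()).add(sense_id)
    return senses

# Invented example: English 'bank' sits in a financial clique with
# French 'banque' and in a river clique with French 'rive'.
cliques = [{"bank:en", "banque:fr"}, {"bank:en", "rive:fr"}]
```

The ambiguous vertex ends up with two sense ids while each monosemous translation keeps one, which is exactly the behaviour the clique model is meant to capture.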
BabelNet | We collect (a) from WordNet, all available word senses (as concepts) and all the semantic pointers between synsets (as relations); (b) from Wikipedia, all encyclopedic entries (i.e. |
Experiment 1: Mapping Evaluation | The final mapping contains 81,533 pairs of Wikipages and word senses they map to, covering 55.7% of the noun senses in WordNet. |
Experiment 2: Translation Evaluation | In Table 2 we report the number of synsets and word senses available in the gold-standard resources for the 5 languages. |
Experiment 2: Translation Evaluation | We assess the coverage of BabelNet against our gold-standard wordnets both in terms of synsets and word senses. |
Introduction | Recent studies in the difficult task of Word Sense Disambiguation (Navigli, 2009b, WSD) have shown the impact of the amount and quality of lexical knowledge (Cuadros and Rigau, 2006): richer knowledge sources can be of great benefit to both knowledge-lean systems (Navigli and Lapata, 2010) and supervised classifiers (Ng and Lee, 1996; Yarowsky and Florian, 2002). |
Methodology | We denote by w_p^i the i-th sense of a word w with part of speech p. We use word senses to unambiguously denote the corresponding synsets (e.g.
Methodology | Hereafter, we use word sense and synset interchangeably. |
Methodology | Given a WordNet word sense in our babel synset of interest (e.g. |
FrameNet — Wiktionary Alignment | This is in line with the reported higher complexity of lexical resources with respect to verbs and greater difficulty in alignments and word sense disambiguation (Laparra and Rigau, 2010). |
Method Overview | The second step is the disambiguation of the translated lemmas with respect to the target language Wiktionary in order to retrieve the linguistic information of the corresponding word sense in the target language Wiktionary (Meyer and Gurevych, 2012a). |
Method Overview | For the example sense of complete, we extract lexical information for the word sense of its German translation fertigmachen, for instance a German gloss, an example sentence, register information (colloquial), and synonyms, e. g., beenden. |
Related Work | The first, corpus-based approach is to automatically extract word senses in the target language based on parallel corpora and frame annotations in the source language. |
Related Work | They rely on a knowledge-based word sense disambiguation algorithm to establish the alignment and report F1=0.75 on a gold standard based on Tonelli and Pighin (2009). |
Related Work | Tonelli and Giuliano (2009) align FrameNet senses to Wikipedia entries with the goal of extracting word senses and example sentences in Italian. |
Resource Overview | It groups word senses in frames that represent particular situations. |
Resource Overview | FrameNet release 1.5 contains 1,015 frames and 11,942 word senses. |
Resource Overview | Wiktionary is organized like a traditional dictionary into lexical entries and word senses. |
Set Expansion | Unlike some other thesauri (such as WordNet and thesaurus.com), entries are not broken down by word sense . |
Set Expansion | Unlike some other thesauri (including WordNet and thesaurus.com), the entries are not broken down by word sense or part of speech. |
Set Expansion | It consists of a large number of synsets; a synset is a set of one or more similar word senses . |
Abstract | This paper proposes to solve the bottleneck of finding training data for word sense disambiguation (WSD) in the domain of web queries, where the complete set of an ambiguous word’s senses is unknown. |
Abstract | In this paper, we present a combined active learning and semi-supervised learning method for the case where only positive examples, i.e. those carrying the expected word sense in the web search results, are given. |
Introduction | When retrieving texts from a Web archive, we often suffer from word sense ambiguity, and a WSD system is indispensable. |
Introduction | Because target words are often proper nouns, their word senses are rarely listed in handcrafted lexicons. |
Introduction | In selecting pseudo negative dataset, we predict word sense of each unlabeled example using the |
Conclusion | Also, our system is the first unsupervised method that has been applied to Erk and McCarthy’s (2009) graded word sense assignment task, showing a substantial positive correlation with the gold standard. |
Experiment: Ranking Word Senses | In this section, we apply our model to a different word sense ranking task: Given a word w in context, the task is to decide to what extent the different |
Experiment: Ranking Word Senses | (2008), we represent different word senses by the words in the corresponding synsets. |
Experiment: Ranking Word Senses | For each word sense , we compute the centroid of the second-order vectors of its synset members. |
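The centroid step in this excerpt can be sketched as follows; the second-order vectors below are invented stand-ins for the vectors of a synset's members.

```python
def centroid(vectors):
    """Component-wise mean of the second-order vectors of a sense's
    synset members; the result represents the sense itself."""
    n = len(vectors)
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dims)]

# Invented 3-dimensional second-order vectors for one synset's members.
synset_vectors = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
```

The resulting sense vector can then be compared against a contextualized vector of the target occurrence to rank the word's senses.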
Introduction | In a second experiment, we apply our model to the “word sense similarity task” recently proposed by Erk and McCarthy (2009), which is a refined variant of a word-sense disambiguation task. |
The model | The objective is to incorporate (inverse) selectional preference information from the context (r, w’ ) in such a way as to identify the correct word sense of w. This suggests that the dimensions of should be filtered so that only those compatible with the context remain. |
Conclusion | For instance, TAD could be used to aid word sense induction more generally, or could be applied as part of other tasks such as coreference resolution. |
Introduction | Several NLP tasks, such as word sense disambiguation, word sense induction, and named entity disambiguation, address this ambiguity problem to varying degrees. |
Related Work | the well studied problems of named entity disambiguation (NED) and word sense disambiguation (WSD). |
Related Work | Both named entity and word sense disambiguation are extensively studied, and surveys on each are available (Nadeau and Sekine, 2007; Navigli, 2009). |
Related Work | Another task that shares similarities with TAD is word sense induction (WSI). |
Conclusion | We propose a probabilistic approach to infer the sentiment similarity between word senses with respect to automatically learned hidden emotions. |
Evaluation and Results | Furthermore, we employ Word Sense Disambiguation (WSD) to disambiguate the adjectives in the question and its corresponding answer. |
Hidden Emotional Model | To compute the semantic similarity between word senses, we utilize their synsets as follows: |
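The excerpt stops before spelling the computation out. One simple, commonly used instantiation of synset-based similarity is Jaccard overlap between the two synonym sets; this is an assumption for illustration, not necessarily the authors' exact measure, and the synsets below are invented.

```python
def synset_similarity(synset_a, synset_b):
    """Jaccard overlap between two senses' synsets: the share of
    words the two synonym sets have in common (0.0 to 1.0)."""
    a, b = set(synset_a), set(synset_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Invented synsets for two adjective senses.
happy = {"happy", "glad", "cheerful"}
glad = {"glad", "pleased", "happy"}
```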
Introduction | For this purpose, we propose to model the hidden emotions of word senses . |
Sentiment Similarity through Hidden Emotions | Thus, we assume that the number and types of basic emotions are hidden and not predefined and propose a Probabilistic Sense Sentiment Similarity (PSSS) approach to extract the hidden emotions of word senses to infer their sentiment similarity. |
Abstract | This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation from untagged corpora. |
Conclusions | Thus it was confirmed that this method is valid for finding the optimal combination of word senses with large untagged corpora. |
Evaluation | (2004), which determines the word sense based on sense similarity and distributional similarity to the k nearest neighbor words of a target word. |
Evaluation | (2007), which determines the word sense by maximizing the sum of sense similarity to the k immediate neighbor words of a target word. |
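The neighbor-maximization scheme in the second excerpt can be sketched as follows, with a hypothetical sense-similarity table over invented sense and neighbor labels.

```python
def pick_sense(senses, neighbors, sim):
    """Choose the sense that maximizes the summed similarity to the
    k immediate neighbor words of the target."""
    return max(senses, key=lambda s: sum(sim(s, n) for n in neighbors))

# Invented toy similarity table between candidate senses and
# the target's neighbor words.
table = {("bank.fin", "money"): 0.9, ("bank.fin", "water"): 0.1,
         ("bank.riv", "money"): 0.1, ("bank.riv", "water"): 0.8}
sim = lambda s, n: table.get((s, n), 0.0)
```

Here "bank.fin" scores 0.9 + 0.1 = 1.0 against the neighbors "money" and "water", beating "bank.riv" at 0.9, so the financial sense is selected.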
Introduction | Word Sense Disambiguation (WSD) is a task to identify the intended sense of a word based on its context. |
Introduction | (2002) observed, the domain of the text that a word occurs in is a useful signal for performing word sense disambiguation (e.g. |
Introduction | This sense inventory provides the set of “known” word senses in the form of phrasal translations. |
Introduction | One of our key contributions is the development of a rich set of features based on monolingual text that are indicative of new word senses . |
Related Work | While word senses have been studied extensively in lexical semantics, research has focused on word sense disambiguation, the task of disambiguating words in context given a predefined sense inventory (e.g., Agirre and Edmonds (2006)), and word sense induction, the task of learning sense inventories from text (e. g., Agirre and Soroa (2007)). |
Related Work | In contrast, detecting novel senses has not received as much attention, and is typically addressed within word sense induction, rather than as a distinct SENSESPOTTING task. |
Introduction | Active learning has been applied to several NLP tasks like part-of-speech tagging (Ringger et al., 2007), chunking (Ngai and Yarowsky, 2000), syntactic parsing (Osborne and Baldridge, 2004; Hwa, 2004), Named Entity Recognition (Shen et al., 2004; Laws and Schutze, 2008; Tomanek and Hahn, 2009), Word Sense Disambiguation (Chen et al., 2006; Zhu and Hovy, 2007; Chan and Ng, 2007), text classification (Tong and Koller, 1998) or statistical machine translation (Haffari and Sarkar, 2009), and has been shown to reduce the amount of annotated data needed to achieve a certain classifier performance, sometimes by as much as half. |
Related Work | sentiment analysis, the detection of metaphors, WSD with fine-grained word senses, to name but a few). |
Related Work | Table 1: Distribution of word senses in pool and test sets |
Related Work | The different word senses are evenly distributed over the rejected instances (H1: Commitment 30, drohenl-salsa 38, Run_risk 36; H2: Commitment 3, drohenl-salsa 4, Run_risk 4). |
Four Types of Collective Annotation | In contrast, in tasks such as word sense labelling (Kilgarriff and Palmer, 2000; Palmer et al., 2007; Venhuizen et al., 2013) and PP-attachment annotation (Rosenthal et al., 2010; Jha et al., 2010) coders need to choose a category amongst a set of options specific to each item—the possible senses of each word or the possible attachment points in each sentence with a prepositional phrase. |
Four Types of Collective Annotation | Some authors have combined qualitative and quantitative ratings; e.g., for the Graded Word Sense dataset of Erk et al. |
Related Work | have developed the Wordrobe set of games for annotating named entities, word senses , homographs, and pronouns. |
Related Work | Similarly, crowdsourcing via microworking sites like Amazon’s Mechanical Turk has been used in several annotation experiments related to tasks such as affect analysis, event annotation, sense definition and word sense disambiguation (Snow et al., 2008; Rumshisky, 2011; Rumshisky et al., 2012), amongst others. |
Abstract | Recent work on bilingual Word Sense Disambiguation (WSD) has shown that a resource deprived language (L1) can benefit from the annotation work done in a resource rich language (L2) via parameter projection. |
Conclusion | We presented a bilingual bootstrapping algorithm for Word Sense Disambiguation which allows two resource deprived languages to mutually benefit |
Parameter Projection | (2009) proposed that the various parameters essential for domain-specific Word Sense Disambiguation can be broadly classified into two categories: |
Related Work | Bootstrapping for Word Sense Disambiguation was first discussed in (Yarowsky, 1995). |
Abstract | While pseudo-words originally evaluated word sense disambiguation, they are now commonly used to evaluate selectional preferences. |
History of Pseudo-Word Disambiguation | Pseudo-words were introduced simultaneously by two papers studying statistical approaches to word sense disambiguation (WSD). |
Introduction | One way to mitigate this problem is with pseudo-words, a method for automatically creating test corpora without human labeling, originally proposed for word sense disambiguation (Gale et al., |
Introduction | While pseudo-words are now less often used for word sense disambiguation, they are a common way to evaluate selectional preferences, models that measure the strength of association between a predicate and its argument filler, e.g., that the noun lunch is a likely object of eat. |
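Pseudo-word construction itself is simple to sketch: two real words are conflated into one ambiguous token and the original word becomes the gold label, so a labelled test set falls out of raw text without human annotation. The example words below are arbitrary.

```python
def make_pseudoword_corpus(sentences, w1, w2):
    """Replace every occurrence of w1 or w2 with the pseudo-word
    'w1-w2', recording the original word as the gold sense label."""
    pseudo = f"{w1}-{w2}"
    corpus, gold = [], []
    for sent in sentences:
        out = []
        for tok in sent.split():
            if tok in (w1, w2):
                gold.append(tok)  # the true "sense" of this occurrence
                out.append(pseudo)
            else:
                out.append(tok)
        corpus.append(" ".join(out))
    return corpus, gold

sents = ["eat a banana", "open the door"]
```

A disambiguation or selectional-preference model is then scored on how often it recovers the original word for each pseudo-word occurrence.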
Introduction | This word sense issue has been a universal challenge for a range of Natural Language Processing applications, including sentiment analysis. |
Introduction | End-users of such a lexicon may not wish to deal with Word Sense Disambiguation. |
Related Work | There have been recent studies that address word sense disambiguation issues for sentiment analysis. |
Conclusion | The effectiveness of our model stems from the use of a large domain-general corpus to train the distributional semantic vectors, and the implicit syntactic and word sense information pro- |
Introduction | Contextualization has been found to improve performance in tasks like lexical substitution and word sense disambiguation (Thater et al., 2011). |
Introduction | Second, the contextualization process allows the semantic vectors to implicitly encode disambiguated word sense and syntactic information, without further adding to the complexity of the generative model. |
Abstract | Current approaches for word sense disambiguation and translation selection typically require lexical resources or large bilingual corpora with rich information fields and annotations, which are often infeasible for under-resourced languages. |
Introduction | Word sense disambiguation (WSD) is the task of assigning sense tags to ambiguous lexical items (LIs) in a text. |
Introduction | It can also be viewed as a simplified version of the Cross-Lingual Lexical Substitution (Mihalcea et al., 2010) and Cross-Lingual Word Sense Disambiguation (Lefever and Hoste, 2010) tasks, as defined in SemEval-2010. |
Conclusions | Beyond the immediate usability of its output and its effective use for domain Word Sense Disambiguation (Faralli and Navigli, 2012), we wish to show the benefit of GlossBoot in gloss-driven approaches to ontology learning (Navigli et al., 2011; Velardi et al., 2013) and semantic network enrichment (Navigli and Ponzetto, 2012). |
Introduction | Interestingly, electronic glossaries have been shown to be key resources not only for humans, but also in Natural Language Processing (NLP) tasks such as Question Answering (Cui et al., 2007), Word Sense Disambiguation (Duan and Yates, 2010; Faralli and Navigli, 2012) and ontology learning (Navigli et al., 2011; Velardi et al., 2013). |
Related Work | and Curran, 2008; McIntosh and Curran, 2009), learning semantic relations (Pantel and Pennac-chiotti, 2006), extracting surface text patterns for open-domain question answering (Ravichandran and Hovy, 2002), semantic tagging (Huang and Riloff, 2010) and unsupervised Word Sense Disambiguation (Yarowsky, 1995). |
The S-Space Framework | We divide the algorithms into four categories based on their structural similarity: document-based, co-occurrence, approximation, and Word Sense Induction (WSI) models. |
The S-Space Framework | WSI models also rely on co-occurrence statistics, but additionally attempt to discover distinct word senses while building the vector space. |
Word Space Models | Word Sense Induction Models: Purandare and Pedersen (2004); HERMIT (Jurgens and Stevens, 2010). |
Existing algorithms 3.1 Yarowsky | The algorithm is similar to that of Yarowsky (1995) but is better specified and omits word sense disambiguation optimizations. |
Graph propagation | The tasks of Eisner and Karakos (2005) are word sense disambiguation on several English words which have two senses corresponding to two different words in French. |
Graph propagation | There is no difference on the word sense data sets. |
Introduction | Reisinger and Mooney (2010b) introduced a multi-prototype VSM where word sense discrimination is first applied by clustering contexts, and then prototypes are built using the contexts of the sense-labeled words. |
Multi-Prototype Neural Language Model | We present a way to use our learned single-prototype embeddings to represent each context window, which can then be used by clustering to perform word sense discrimination (Schutze, 1998). |
Related Work | The multi-prototype approach has been widely studied in models of categorization in psychology (Rosseel, 2002; Griffiths et al., 2009), while Schutze (1998) used clustering of contexts to perform word sense discrimination. |
Automatic Metaphor Recognition | This idea originates from a similarity-based word sense disambiguation method developed by Karov and Edelman (1998). |
Metaphor Annotation in Corpora | To reflect two distinct aspects of the phenomenon, metaphor annotation can be split into two stages: identifying metaphorical senses in text (akin to word sense disambiguation) and annotating source-target domain mappings underlying the production of metaphorical expressions. |
Metaphor Annotation in Corpora | Such annotation can be viewed as a form of word sense disambiguation with an emphasis on metaphoricity. |