Background | RRR consists of 20,801 training and 3,097 test quadruples of the form (v, n1, p, n2), where the attachment decision is either v or n1. The best published results over RRR are those of Stetina and Nagao (1997), who employ WordNet sense predictions from an unsupervised WSD method within a decision tree classifier.
Background | (2005) experimented with first-sense and hypernym features from HowNet and CiLin (both WordNets for Chinese) in a generative parse model applied to the Chinese Penn Treebank.
Experimental setting | We experimented with a range of semantic representations, all of which are based on WordNet 2.1. |
Experimental setting | As mentioned above, words in WordNet are organised into sets of synonyms, called synsets. |
Experimental setting | Note that these are the two extremes of semantic granularity in WordNet, and we plan to experiment with intermediate representation levels in future research (cf.
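To make the synset representation discussed above concrete, here is a minimal sketch with an invented toy lexicon (the synset identifiers and members are illustrative, not taken from WordNet):

```python
# Toy fragment of a WordNet-style lexicon: at one extreme of granularity sits
# the raw word form; at the other, the synset (set of synonyms) it belongs to.
TOY_SYNSETS = {
    "car.n.01": {"car", "auto", "automobile", "machine"},
    "car.n.02": {"car", "railcar", "railway_car"},
    "tool.n.01": {"tool"},
}

def synsets_of(word):
    """Map a word form to all toy synsets that contain it."""
    return sorted(sid for sid, members in TOY_SYNSETS.items() if word in members)

print(synsets_of("car"))   # a polysemous word belongs to several synsets
print(synsets_of("auto"))  # a synonym maps to the shared synset
```

The point of the sketch is only that a single surface form ("car") fans out to multiple synsets, which is what makes the choice of representation level non-trivial.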
Integrating Semantics into Parsing | Our choice for this work was the WordNet 2.1 lexical database, in which synonyms are grouped into synsets, which are then linked via an ISA hierarchy. |
Integrating Semantics into Parsing | WordNet contains other types of relations such as meronymy, but we did not use them in this research. |
Integrating Semantics into Parsing | mallet, square and steel-wool pad are also descendants of TOOL in WordNet, none of which would conventionally be used as the manner adjunct of eat).
Introduction | We explore several models for semantic representation, based around WordNet (Fellbaum, 1998). |
Abstract | Our NR classification evaluation strictly follows the ACL SemEval-07 Task 4 datasets and protocol, obtaining an F-score of 70.6, as opposed to 64.8 for the best previous work that did not use the manually provided WordNet sense disambiguation tags.
Experimental Setup | Nouns in this pair were manually labeled with their corresponding WordNet 3 labels and the web queries used to |
Experimental Setup | The 15 submitted systems were assigned into 4 categories according to whether they use the WordNet and Query tags (some systems were assigned to more than a single category, since they reported experiments in several settings). |
Experimental Setup | In our evaluation we do not utilize WordNet or Query tags, hence we compare ourselves with the corresponding group (A), containing 6 systems. |
Introduction | To improve results, some systems utilize additional manually constructed semantic resources such as WordNet (WN) (Beamer et al., 2007). |
Introduction | Furthermore, usage of such resources frequently requires disambiguation and connection of the data to the resource (word sense disambiguation in the case of WordNet ). |
Introduction | We evaluated our algorithm on SemEval-07 Task 4 data, showing superior results over participating algorithms that did not utilize WordNet disambiguation tags. |
Related Work | Many relation classification algorithms utilize WordNet.
Related Work | Among the 15 systems presented by the 14 SemEval teams, some utilized the manually provided WordNet tags for the dataset pairs (e.g., (Beamer et al., 2007)). |
Results | Method                           P     R     F     Acc
          Unsupervised clustering (4.3.3)  64.5  61.3  62.0  64.5
          Cluster Labeling (4.3.1)         65.1  69.0  67.2  68.5
          HITS Features (4.3.2)            69.1  70.6  70.6  70.1
          Best Task 4 (no WordNet)         66.1  66.7  64.8  66.0
          Best Task 4 (with WordNet)       79.7  69.8  72.4  76.3
Results | Table 1 shows our results, along with the best Task 4 result not using WordNet labels (Costello, 2007). |
Abstract | Although handcrafted lexical resources, such as WordNet, could provide more reliable related terms, previous studies showed that query expansion using only WordNet leads to very limited performance improvement. |
Introduction | Intuitively, compared with co-occurrence-based thesauri, handcrafted thesauri, such as WordNet, could provide more reliable terms for query expansion.
Introduction | However, previous studies failed to show any significant gain in retrieval performance when queries are expanded with terms selected from WordNet (Voorhees, 1994; Stairmand, 1997). |
Introduction | In this paper, we study several term similarity functions that exploit various information from two lexical resources, i.e., WordNet |
Related Work | Although the use of WordNet in query expansion has been studied by various researchers, the improvement of retrieval performance is often limited. |
Related Work | Voorhees (Voorhees, 1994) expanded queries using a combination of synonyms, hypernyms and hyponyms manually selected from WordNet, and achieved limited improvement (i.e., around −2% to
Related Work | Stairmand (Stairmand, 1997) used WordNet for query expansion, but concluded that the improvement was restricted by the coverage of WordNet; no empirical results were reported.
Abstract | We examine the differences in content between the 1911 and 1987 versions of Roget’s, and we test both versions with each other and WordNet on problems such as synonym identification and word relatedness. |
Abstract | We also present a novel method for measuring sentence relatedness that can be implemented in either version of Roget’s or in WordNet.
Abstract | Although the 1987 version of the Thesaurus is better, we show that the 1911 version performs surprisingly well and that often the differences between the versions of Roget’s and WordNet are not statistically significant.
Introduction | We compare two versions, the 1987 and 1911 editions of the Thesaurus with each other and with WordNet 3.0. |
Introduction | Roget’s Thesaurus has a unique structure, quite different from WordNet, of which the NLP community has yet to take full advantage.
Introduction | In this paper we demonstrate that although the 1911 version of the Thesaurus is very old, it can give results comparable to systems that use WordNet or newer versions of Roget’s Thesaurus. |
Evaluation | Table 1: Class labels found in WordNet in original form, or found in WordNet after removal of leading words, or not found in WordNet at all |
Evaluation | Accuracy of Class Labels: Built over many years of manual construction efforts, lexical gold standards such as WordNet (Fellbaum, 1998) provide wide-coverage upper ontologies of the English language. |
Verb Class Model 2.1 Probabilistic Model | The selectional preferences are expressed in terms of semantic concepts from WordNet, rather than a set of individual words.
Verb Class Model 2.1 Probabilistic Model | 4. selecting a WordNet concept r for each argument slot, e.g. |
Verb Class Model 2.1 Probabilistic Model | and Light (1999) and turn WordNet into a Hidden Markov model (HMM). |
Empirical Evaluation: Simile-derived Representations | Almuhareb and Poesio (2004) used as their experimental basis a sampling of 214 English nouns from 13 of WordNet’s upper-level semantic categories, and proceeded to harvest adjectival features for these noun-concepts from the web using the textual pattern “[a | an | the] * C [is | was]”. |
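The harvesting pattern above can be made concrete with a small sketch; the helper name and the exact query strings are our own illustration, not the authors' code:

```python
# Expand the textual pattern "[a | an | the] * C [is | was]" into the set of
# quoted web queries used to harvest adjectival features for a noun-concept C.
def harvesting_queries(concept):
    determiners = ["a", "an", "the"]
    copulas = ["is", "was"]
    # "*" is the search engine's wildcard, matching the adjectival feature.
    return [f'"{d} * {concept} {c}"' for d in determiners for c in copulas]

for q in harvesting_queries("lion"):
    print(q)
```

Each concept thus yields six queries, and the wildcard position in the returned snippets is where candidate features such as "hungry" or "majestic" are read off.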
Harvesting Knowledge from Similes: English and Chinese | Veale and Hao (2007) use the Google API in conjunction with Princeton WordNet (Fellbaum, 1998) as the basis of their harvesting system. |
Harvesting Knowledge from Similes: English and Chinese | They first extracted a list of antonymous adjectives, such as “hot” or “cold”, from WordNet, the intuition being that explicit similes will tend to exploit properties that occupy an exemplary point on a scale.
Harvesting Knowledge from Similes: English and Chinese | To harvest a comparable body of Chinese similes from the web, we also use the Google API, in conjunction with both WordNet and HowNet (Dong and Dong, 2006). |
Related Work | (1999), in which each of the textual glosses in WordNet (Fellbaum, 1998) is linguistically analyzed to yield a sense-tagged logical form, is an example of the former approach. |
Related Work | Almuhareb and Poesio go on to demonstrate that the values and attributes that are found for word-concepts on the web yield a sufficiently rich representation for these word-concepts to be automatically clustered into a form resembling that assigned by WordNet (see Fellbaum, 1998). |
Tagging and Mapping of Similes | In the case of English similes, Veale and Hao (2007) describe how two English similes “as A as N1” and “as A as N2” will be mutually disambiguating if N1 and N2 are synonyms in WordNet, or if some sense of N1 is a hypernym or hyponym of some sense of N2 in WordNet.
Tagging and Mapping of Similes | For instance, though HowNet has a much shallower hierarchical organization than WordNet , it compensates by encapsulating the meaning of different word senses using simple logical formulae of semantic primitives, or sememes, that are derived from the meaning of common Chinese characters. |
Tagging and Mapping of Similes | WordNet and HowNet thus offer two complementary levels or granularities of generalization that can be exploited as the context demands. |
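The synonym/hypernym disambiguation test that Veale and Hao describe can be sketched over a toy hypernym graph (the senses and hierarchy below are invented for illustration, standing in for real WordNet lookups):

```python
# Two nouns from parallel similes mutually disambiguate if some pair of their
# senses coincides or stands in a hypernym/hyponym relation.
TOY_HYPERNYM = {            # sense -> its direct hypernym sense
    "viper.n.01": "snake.n.01",
    "snake.n.01": "reptile.n.01",
    "reptile.n.01": "animal.n.01",
}
TOY_SENSES = {"snake": ["snake.n.01"], "viper": ["viper.n.01"]}

def ancestors(sense):
    """All hypernym senses above a sense, nearest first."""
    out = []
    while sense in TOY_HYPERNYM:
        sense = TOY_HYPERNYM[sense]
        out.append(sense)
    return out

def mutually_disambiguating(n1, n2):
    for s1 in TOY_SENSES.get(n1, []):
        for s2 in TOY_SENSES.get(n2, []):
            if s1 == s2 or s1 in ancestors(s2) or s2 in ancestors(s1):
                return True
    return False

print(mutually_disambiguating("snake", "viper"))  # snake subsumes viper
```

The matching pair of senses, once found, fixes the intended sense of both nouns at once, which is what makes pairs of similes with a shared adjective so useful.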
Method | where: infl1 and infl2 are inflected variants of noun1 and noun2 generated using the Java WordNet Library; THAT is a complementizer and can be that, which, or who; and * stands for 0 or more (up to 8) instances of Google’s star operator.
Method | Finally, we lemmatize the main verb using WordNet’s morphological analyzer Morphy (Fellbaum, 1998). |
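Morphy's behavior can be approximated with a simplified sketch; the real analyzer also consults an exception list and the full WordNet lexicon, for which a toy verb set stands in here:

```python
# Simplified WordNet-style suffix "detachment" rules for verbs, of the kind
# Morphy applies: strip a suffix, propose a replacement, and keep the first
# candidate that the lexicon actually contains.
VERB_RULES = [("ies", "y"), ("es", "e"), ("es", ""), ("ed", "e"),
              ("ed", ""), ("ing", "e"), ("ing", ""), ("s", "")]
KNOWN_VERBS = {"carry", "make", "walk", "run"}  # toy stand-in for the lexicon

def morphy_like(verb):
    if verb in KNOWN_VERBS:          # already a base form
        return verb
    for suffix, repl in VERB_RULES:
        if verb.endswith(suffix):
            candidate = verb[: len(verb) - len(suffix)] + repl
            if candidate in KNOWN_VERBS:   # lexicon check
                return candidate
    return None

print(morphy_like("carries"), morphy_like("making"), morphy_like("walked"))
```

The lexicon check is what distinguishes this from naive stemming: "making" yields "make" (not "mak") because only the former survives the lookup.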
Related Work | (2005) apply both classic (SVM and decision trees) and novel supervised models (semantic scattering and iterative semantic specialization), using WordNet , word sense disambiguation, and a set of linguistic features. |
Related Work | Their approach is highly resource intensive (uses WordNet , CoreLex and Moby’s thesaurus), and is quite sensitive to the seed set of verbs: on a collection of 453 examples and 19 relations, they achieved 52.6% accuracy with 84 seed verbs, but only 46.7% with 57 seed verbs. |
Relational Similarity Experiments | We further experimented with the SemEval’07 task 4 dataset (Girju et al., 2007), where each example consists of a sentence, a target semantic relation, two nominals to be judged on whether they are in that relation, manually annotated WordNet senses, and the Web query used to obtain the sentence: |
Relational Similarity Experiments | WordNet(e1) = "vessel%1:06:00::", WordNet(e2) = "tool%1:06:00::", Content-Container(e2, e1) = "true", Query = "contents of the * were a"
Relational Similarity Experiments | The SemEval competition defines four types of systems, depending on whether the manually annotated WordNet senses and the Google query are used: A (WordNet=no, Query=no), B (WordNet=yes, Query=no), C (WordNet=no, Query=yes), and D (WordNet=yes, Query=yes). |
Abstract | This study presents a novel approach to the problem of system portability across different domains: a sentiment annotation system that integrates a corpus-based classifier trained on a small set of annotated in-domain data and a lexicon-based system trained on WordNet . |
Introduction | In this paper, we present a novel approach to the problem of system portability across different domains by developing a sentiment annotation system that integrates a corpus-based classifier with a lexicon-based system trained on WordNet . |
Introduction | The information contained in lexicographical sources, such as WordNet, reflects a lay person’s general knowledge about the world, while domain-specific knowledge can be acquired through classifier training on a small set of in-domain data.
Introduction | The final, third part of the paper presents our system, composed of an ensemble of two classifiers —one trained on WordNet glosses and synsets and the other trained on a small in-domain training set. |
Lexicon-Based Approach | A lexicon-based approach capitalizes on the fact that dictionaries, such as WordNet (Fellbaum, 1998), contain a comprehensive and domain-independent set of sentiment clues that exist in general English. |
Lexicon-Based Approach | One of the limitations of general lexicons and dictionaries, such as WordNet (Fellbaum, 1998), as training sets for sentiment tagging systems is that they contain only definitions of individual words and, hence, only unigrams could be effectively learned from dictionary entries. |
Lexicon-Based Approach | Since the structure of WordNet glosses is fairly different from that of other types of corpora, we developed a system that used the list of human-annotated adjectives from (Hatzivassiloglou and McKeown, 1997) as a seed list and then learned additional unigrams |
Empirical Evaluation | In Section 3.3, we developed three ways to compute the weight of an edge in the sentence quotation graph, i.e., clue words, semantic similarity based on WordNet, and cosine similarity.
Empirical Evaluation | The above experiments show that the widely used cosine similarity and the more sophisticated semantic similarity in WordNet are less accurate than the basic CWS in the summarization framework. |
Extracting Conversations from Multiple Emails | We explore three types of cohesion measures: (1) clue words that are based on stems, (2) semantic distance based on WordNet |
Extracting Conversations from Multiple Emails | 3.3.2 Semantic Similarity Based on WordNet |
Extracting Conversations from Multiple Emails | We use the well-known lexical database WordNet to get the semantic similarity of two words. |
Introduction | Most established resources (e.g., WordNet) represent only the main and widely accepted relationships such as hypernymy and meronymy.
Related Work | There is a large body of related work that deals with discovery of basic relationship types represented in useful resources such as WordNet, including hypernymy (Hearst, 1992; Pantel et al., 2004; Snow et al., 2006), synonymy (Davidov and Rappoport, 2006; Widdows and Dorow, 2002) and meronymy (Berland and Charniak, 1999; Girju et al., 2006).
Related Work | Several algorithms use manually-prepared resources, including WordNet (Moldovan et al., 2004; Costello et al., 2006) and Wikipedia (Strube and Ponzetto, 2006). |
Related Work | Evaluation for hypernymy and synonymy usually uses WordNet (Lin and Pantel, 2002; Widdows and Dorow, 2002; Davidov and Rappoport, 2006).
Context and Answer Detection | here, we use the product of sim(xu, Qi) and sim(xv, ·) to estimate the possibility of (u, v) being a context-answer pair, where sim(·, ·) is the semantic similarity calculated on WordNet as described in Section 3.5.
Context and Answer Detection | The semantic similarity between words is computed based on Wu and Palmer’s measure (Wu and Palmer, 1994) using WordNet (Fellbaum, 1998). The similarity between contiguous sentences will be used to capture the dependency for CRFs.
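Wu and Palmer's measure itself is easy to state; the following self-contained sketch computes it over an invented toy taxonomy rather than WordNet's actual hypernym hierarchy:

```python
# Wu-Palmer similarity: sim(a, b) = 2 * depth(lcs) / (depth(a) + depth(b)),
# where lcs is the lowest common subsumer and depth is counted from the root.
PARENT = {"dog": "canine", "wolf": "canine", "canine": "carnivore",
          "cat": "feline", "feline": "carnivore", "carnivore": "animal"}

def path_to_root(node):
    """The node followed by all its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def wu_palmer(a, b):
    pa, pb = path_to_root(a), path_to_root(b)
    lcs = next(n for n in pa if n in pb)      # lowest common subsumer
    depth = lambda n: len(path_to_root(n))    # root has depth 1
    return 2.0 * depth(lcs) / (depth(a) + depth(b))

print(wu_palmer("dog", "wolf"))  # siblings under "canine": 0.75
print(wu_palmer("dog", "cat"))   # meet only at "carnivore": 0.5
```

The measure rewards pairs whose common subsumer sits deep in the hierarchy, so sibling concepts score higher than concepts that only meet near the root.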
Context and Answer Detection | - Similarity with the question using WordNet |
Approach | Wherever applicable, we explore different syntactic and semantic representations of the textual content, e.g., extracting the dependency-based representation of the text or generalizing words to their WordNet supersenses (WNSS) (Ciaramita and Altun, 2006).
Approach | In all these representations we skip stop words and normalize all words to their WordNet lemmas. |
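This generalization step can be sketched as follows, with a hand-made toy mapping standing in for the real WordNet supersense (lexicographer-file) lookup:

```python
# Generalize lemmas to WordNet-style supersense labels such as noun.food or
# verb.motion, skipping stop words as described above. The mapping here is an
# invented toy stand-in for the real lexicographer-file lookup.
TOY_SUPERSENSE = {"apple": "noun.food", "bread": "noun.food",
                  "run": "verb.motion", "dog": "noun.animal"}

def generalize(lemmas, stop_words=frozenset({"the", "a", "an"})):
    """Skip stop words; replace each remaining lemma by its supersense when known."""
    return [TOY_SUPERSENSE.get(t, t) for t in lemmas if t not in stop_words]

print(generalize(["the", "dog", "run"]))  # ['noun.animal', 'verb.motion']
```

Words without a supersense entry pass through unchanged, so the representation degrades gracefully for out-of-vocabulary tokens.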
The Corpus | Each word was morphologically simplified using the morphological functions of the WordNet library.
The Corpus | These tags, defined by WordNet lexicographers, provide a broad semantic categorization for nouns and verbs and include labels for nouns such as food, animal, body and feeling, and for verbs labels such as communication, contact, and possession. |