Index of papers in Proc. ACL 2008 that mention
  • WordNet
Agirre, Eneko and Baldwin, Timothy and Martinez, David
Background
RRR consists of 20,081 training and 3,097 test quadruples of the form (v, n1, p, n2), where the attachment decision is either v or n1. The best published results over RRR are those of Stetina and Nagao (1997), who employ WordNet sense predictions from an unsupervised WSD method within a decision tree classifier.
Background
(2005) experimented with first-sense and hypernym features from HowNet and CiLin (both WordNets for Chinese) in a generative parse model applied to the Chinese Penn Treebank.
Experimental setting
We experimented with a range of semantic representations, all of which are based on WordNet 2.1.
Experimental setting
As mentioned above, words in WordNet are organised into sets of synonyms, called synsets.
Experimental setting
Note that these are the two extremes of semantic granularity in WordNet, and we plan to experiment with intermediate representation levels in future research (cf.
Integrating Semantics into Parsing
Our choice for this work was the WordNet 2.1 lexical database, in which synonyms are grouped into synsets, which are then linked via an ISA hierarchy.
Integrating Semantics into Parsing
WordNet contains other types of relations such as meronymy, but we did not use them in this research.
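As a concrete picture of the structures mentioned above (synsets, the ISA hierarchy, and the unused meronymy links), here is a minimal sketch using NLTK's WordNet interface rather than the WordNet 2.1 setup the authors describe:

    from nltk.corpus import wordnet as wn

    # A synset groups synonymous lemmas.
    cat = wn.synsets("cat", pos=wn.NOUN)[0]
    print(cat.lemma_names())        # e.g. ['cat', 'true_cat']

    # Synsets are linked by ISA (hypernym/hyponym) edges ...
    print(cat.hypernyms())          # e.g. [Synset('feline.n.01')]

    # ... and by other relations such as meronymy, which the authors did not use.
    print(cat.part_meronyms())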
Integrating Semantics into Parsing
mallet, square and steel-wool pad are also descendants of TOOL in WordNet, none of which would conventionally be used as the manner adjunct of cat).
Introduction
We explore several models for semantic representation, based around WordNet (Fellbaum, 1998).
WordNet is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Davidov, Dmitry and Rappoport, Ari
Abstract
Our NR classification evaluation strictly follows the ACL SemEval-07 Task 4 datasets and protocol, obtaining an f-score of 70.6, as opposed to 64.8 of the best previous work that did not use the manually provided WordNet sense disambiguation tags.
Experimental Setup
Nouns in this pair were manually labeled with their corresponding WordNet 3 labels and the web queries used to
Experimental Setup
The 15 submitted systems were assigned into 4 categories according to whether they use the WordNet and Query tags (some systems were assigned to more than a single category, since they reported experiments in several settings).
Experimental Setup
In our evaluation we do not utilize WordNet or Query tags, hence we compare ourselves with the corresponding group (A), containing 6 systems.
Introduction
To improve results, some systems utilize additional manually constructed semantic resources such as WordNet (WN) (Beamer et al., 2007).
Introduction
Furthermore, usage of such resources frequently requires disambiguation and connection of the data to the resource (word sense disambiguation in the case of WordNet).
Introduction
We evaluated our algorithm on SemEval-07 Task 4 data, showing superior results over participating algorithms that did not utilize WordNet disambiguation tags.
Related Work
Many relation classification algorithms utilize WordNet.
Related Work
Among the 15 systems presented by the 14 SemEval teams, some utilized the manually provided WordNet tags for the dataset pairs (e.g., (Beamer et al., 2007)).
Results
Method                           P     R     F     Acc
Unsupervised clustering (4.3.3)  64.5  61.3  62.0  64.5
Cluster Labeling (4.3.1)         65.1  69.0  67.2  68.5
HITS Features (4.3.2)            69.1  70.6  70.6  70.1
Best Task 4 (no WordNet)         66.1  66.7  64.8  66.0
Best Task 4 (with WordNet)       79.7  69.8  72.4  76.3
Results
Table 1 shows our results, along with the best Task 4 result not using WordNet labels (Costello, 2007).
WordNet is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Fang, Hui
Abstract
Although handcrafted lexical resources, such as WordNet, could provide more reliable related terms, previous studies showed that query expansion using only WordNet leads to very limited performance improvement.
Introduction
Intuitively, compared with co-occurrence-based thesauri, handcrafted thesauri, such as WordNet, could provide more reliable terms for query expansion.
Introduction
However, previous studies failed to show any significant gain in retrieval performance when queries are expanded with terms selected from WordNet (Voorhees, 1994; Stairmand, 1997).
Introduction
In this paper, we study several term similarity functions that exploit various information from two lexical resources, i.e., WordNet
Related Work
Although the use of WordNet in query expansion has been studied by various researchers, the improvement of retrieval performance is often limited.
Related Work
Voorhees (Voorhees, 1994) expanded queries using a combination of synonyms, hypernyms and hyponyms manually selected from WordNet, and achieved limited improvement (i.e., around -2% to
Related Work
Stairmand (Stairmand, 1997) used WordNet for query expansion, but they concluded that the improvement was restricted by the coverage of WordNet, and no empirical results were reported.
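A minimal sketch of where such expansion terms come from (synonyms, hypernyms and hyponyms of a query term), using NLTK's WordNet interface; the helper name is illustrative, and the manual selection step Voorhees describes is not modeled:

    from nltk.corpus import wordnet as wn

    def expansion_candidates(term, pos=wn.NOUN):
        """Collect synonym, hypernym, and hyponym lemmas for one query term."""
        candidates = set()
        for synset in wn.synsets(term, pos=pos):
            for related in [synset] + synset.hypernyms() + synset.hyponyms():
                candidates.update(name.replace("_", " ") for name in related.lemma_names())
        candidates.discard(term)
        return candidates

    print(sorted(expansion_candidates("boat")))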
WordNet is mentioned in 23 sentences in this paper.
Topics mentioned in this paper:
Kennedy, Alistair and Szpakowicz, Stan
Abstract
We examine the differences in content between the 1911 and 1987 versions of Roget’s, and we test both versions with each other and WordNet on problems such as synonym identification and word relatedness.
Abstract
We also present a novel method for measuring sentence relatedness that can be implemented in either version of Roget’s or in WordNet.
Abstract
Although the 1987 version of the Thesaurus is better, we show that the 1911 version performs surprisingly well and that often the differences between the versions of Roget’s and WordNet are not statistically significant.
Introduction
We compare two versions, the 1987 and 1911 editions of the Thesaurus, with each other and with WordNet 3.0.
Introduction
Roget’s Thesaurus has a unique structure, quite different from WordNet, of which the NLP community has yet to take full advantage.
Introduction
In this paper we demonstrate that although the 1911 version of the Thesaurus is very old, it can give results comparable to systems that use WordNet or newer versions of Roget’s Thesaurus.
WordNet is mentioned in 51 sentences in this paper.
Topics mentioned in this paper:
Paşca, Marius and Van Durme, Benjamin
Evaluation
WordNet?
Evaluation
Table 1: Class labels found in WordNet in original form, or found in WordNet after removal of leading words, or not found in WordNet at all
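A sketch of the lookup behind Table 1, under the simple (assumed) reading that leading words are stripped one at a time until the remainder matches a WordNet noun entry; the label and helper name are hypothetical, and multiword entries are looked up with underscores as in NLTK:

    from nltk.corpus import wordnet as wn

    def wordnet_match(label):
        """Return (matched form, how) for a class label, or (None, 'not found')."""
        words = label.lower().split()
        for start in range(len(words)):
            candidate = "_".join(words[start:])
            if wn.synsets(candidate, pos=wn.NOUN):
                how = "original form" if start == 0 else "after removing leading words"
                return candidate, how
        return None, "not found"

    print(wordnet_match("famous scientists"))   # hypothetical class label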
Evaluation
Accuracy of Class Labels: Built over many years of manual construction efforts, lexical gold standards such as WordNet (Fellbaum, 1998) provide wide-coverage upper ontologies of the English language.
WordNet is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Schulte im Walde, Sabine and Hying, Christian and Scheible, Christian and Schmid, Helmut
Verb Class Model 2.1 Probabilistic Model
The selectional preferences are expressed in terms of semantic concepts from WordNet, rather than a set of individual words.
Verb Class Model 2.1 Probabilistic Model
4. selecting a WordNet concept r for each argument slot, e.g.
Verb Class Model 2.1 Probabilistic Model
and Light (1999) and turn WordNet into a Hidden Markov model (HMM).
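The model itself is defined in the paper, but its basic ingredient, the chain of WordNet concepts sitting above a given argument noun (the states an HMM over the hierarchy would pass through before emitting the word), can be inspected with NLTK:

    from nltk.corpus import wordnet as wn

    # Hypernym paths from the top of the hierarchy down to one sense of "wine":
    # these are candidate concepts r that a selectional-preference model could select.
    wine = wn.synset("wine.n.01")
    for path in wine.hypernym_paths():
        print(" -> ".join(s.name() for s in path))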
WordNet is mentioned in 21 sentences in this paper.
Topics mentioned in this paper:
Veale, Tony and Hao, Yanfen and Li, Guofu
Empirical Evaluation: Simile-derived Representations
Almuhareb and Poesio (2004) used as their experimental basis a sampling of 214 English nouns from 13 of WordNet’s upper-level semantic categories, and proceeded to harvest adjectival features for these noun-concepts from the web using the textual pattern “[a | an | the] * C [is | was]”.
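A small sketch of how that textual pattern expands into concrete web queries for a given concept noun C; only the query construction is shown, not the harvesting or counting of adjectives:

    def almuhareb_poesio_queries(concept):
        """Instantiate the pattern "[a | an | the] * C [is | was]" for a concept noun."""
        return ['"%s * %s %s"' % (det, concept, verb)
                for det in ("a", "an", "the")
                for verb in ("is", "was")]

    for query in almuhareb_poesio_queries("car"):
        print(query)    # "a * car is", "a * car was", "an * car is", ...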
Harvesting Knowledge from Similes: English and Chinese
Veale and Hao (2007) use the Google API in conjunction with Princeton WordNet (Fellbaum, 1998) as the basis of their harvesting system.
Harvesting Knowledge from Similes: English and Chinese
They first extracted a list of antonymous adjectives, such as “hot” or “cold”, from WordNet, the intuition being that explicit similes will tend to exploit properties that occupy an exemplary point on a scale.
Harvesting Knowledge from Similes: English and Chinese
To harvest a comparable body of Chinese similes from the web, we also use the Google API, in conjunction with both WordNet and HowNet (Dong and Dong, 2006).
Related Work
(1999), in which each of the textual glosses in WordNet (Fellbaum, 1998) is linguistically analyzed to yield a sense-tagged logical form, is an example of the former approach.
Related Work
Almuhareb and Poesio go on to demonstrate that the values and attributes that are found for word-concepts on the web yield a sufficiently rich representation for these word-concepts to be automatically clustered into a form resembling that assigned by WordNet (see Fellbaum, 1998).
Tagging and Mapping of Similes
In the case of English similes, Veale and Hao (2007) describe how two English similes “as A as N1” and “as A as N2” will be mutually disambiguating if N1 and N2 are synonyms in WordNet, or if some sense of N1 is a hypernym or hyponym of some sense of N2 in WordNet.
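A sketch of that disambiguation test with NLTK, here taking the hypernym/hyponym relation transitively; the helper name is illustrative:

    from nltk.corpus import wordnet as wn

    def mutually_disambiguating(n1, n2):
        """True if n1 and n2 share a synset or stand in a hypernym/hyponym relation."""
        senses1 = wn.synsets(n1, pos=wn.NOUN)
        senses2 = wn.synsets(n2, pos=wn.NOUN)
        if set(senses1) & set(senses2):                 # synonyms: a shared synset
            return True
        for a in senses1:
            ancestors_a = {h for path in a.hypernym_paths() for h in path}
            for b in senses2:
                ancestors_b = {h for path in b.hypernym_paths() for h in path}
                if a in ancestors_b or b in ancestors_a:
                    return True
        return False

    print(mutually_disambiguating("oak", "tree"))       # True: oak ISA tree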
Tagging and Mapping of Similes
For instance, though HowNet has a much shallower hierarchical organization than WordNet , it compensates by encapsulating the meaning of different word senses using simple logical formulae of semantic primitives, or sememes, that are derived from the meaning of common Chinese characters.
Tagging and Mapping of Similes
WordNet and HowNet thus offer two complementary levels or granularities of generalization that can be exploited as the context demands.
WordNet is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Nakov, Preslav and Hearst, Marti A.
Method
where: infl1 and infl2 are inflected variants of noun1 and noun2 generated using the Java WordNet Library; THAT is a complementizer and can be that, which, or who; and * stands for 0 or more (up to 8) instances of Google’s star operator.
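A rough sketch of how such query strings could be assembled; the inflected variants here come from the inflect package rather than the Java WordNet Library used in the paper, only singular/plural variants are generated, and the ordering of the two nouns around the complementizer is shown for illustration only:

    import inflect                          # stand-in for the Java WordNet Library's inflection

    _inflect = inflect.engine()

    def star_queries(noun1, noun2, max_stars=8):
        """Build quoted queries combining inflected nouns, THAT, and Google star operators."""
        queries = []
        for n1 in {noun1, _inflect.plural(noun1)}:
            for n2 in {noun2, _inflect.plural(noun2)}:
                for that in ("that", "which", "who"):
                    for stars in range(max_stars + 1):
                        queries.append('"%s"' % " ".join([n1, that] + ["*"] * stars + [n2]))
        return queries

    print(star_queries("committee", "member")[:5])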
Method
Finally, we lemmatize the main verb using WordNet’s morphological analyzer Morphy (Fellbaum, 1998).
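Morphy is exposed directly in NLTK, so this lemmatization step can be reproduced along the following lines:

    from nltk.corpus import wordnet as wn

    # WordNet's Morphy maps an inflected verb form to its base lemma.
    for verb in ("is", "included", "running"):
        print(verb, "->", wn.morphy(verb, wn.VERB))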
Related Work
(2005) apply both classic (SVM and decision trees) and novel supervised models (semantic scattering and iterative semantic specialization), using WordNet, word sense disambiguation, and a set of linguistic features.
Related Work
Their approach is highly resource intensive (uses WordNet, CoreLex and Moby’s thesaurus), and is quite sensitive to the seed set of verbs: on a collection of 453 examples and 19 relations, they achieved 52.6% accuracy with 84 seed verbs, but only 46.7% with 57 seed verbs.
Relational Similarity Experiments
We further experimented with the SemEval’07 task 4 dataset (Girju et al., 2007), where each example consists of a sentence, a target semantic relation, two nominals to be judged on whether they are in that relation, manually annotated WordNet senses, and the Web query used to obtain the sentence:
Relational Similarity Experiments
WordNet(el) = "vessel%l:06:OO::", WordNet(e2) = "tool%l:O6:OO::", Content—Container(e2, el) = "true", Query = "contents of the * were a"
Relational Similarity Experiments
The SemEval competition defines four types of systems, depending on whether the manually annotated WordNet senses and the Google query are used: A (WordNet=no, Query=no), B (WordNet=yes, Query=no), C (WordNet=no, Query=yes), and D (WordNet=yes, Query=yes).
WordNet is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Andreevskaia, Alina and Bergler, Sabine
Abstract
This study presents a novel approach to the problem of system portability across different domains: a sentiment annotation system that integrates a corpus-based classifier trained on a small set of annotated in-domain data and a lexicon-based system trained on WordNet .
Introduction
In this paper, we present a novel approach to the problem of system portability across different domains by developing a sentiment annotation system that integrates a corpus-based classifier with a lexicon-based system trained on WordNet.
Introduction
The information contained in lexicographical sources, such as WordNet , reflects a lay person’s general knowledge about the world, while domain-specific knowledge can be acquired through classifier training on a small set of in-domain data.
Introduction
The final, third part of the paper presents our system, composed of an ensemble of two classifiers: one trained on WordNet glosses and synsets and the other trained on a small in-domain training set.
Lexicon-Based Approach
A lexicon-based approach capitalizes on the fact that dictionaries, such as WordNet (Fellbaum, 1998), contain a comprehensive and domain-independent set of sentiment clues that exist in general English.
Lexicon-Based Approach
One of the limitations of general lexicons and dictionaries, such as WordNet (Fellbaum, 1998), as training sets for sentiment tagging systems is that they contain only definitions of individual words and, hence, only unigrams could be effectively learned from dictionary entries.
Lexicon-Based Approach
Since the structure of WordNet glosses is fairly different from that of other types of corpora, we developed a system that used the list of human-annotated adjectives from (Hatzivassiloglou and McKeown, 1997) as a seed list and then learned additional unigrams
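A minimal sketch of that kind of expansion step, under the assumed reading that a seed adjective contributes its synset co-members and the words of its glosses as candidate sentiment unigrams (NLTK interface; the helper name is illustrative):

    from nltk.corpus import wordnet as wn

    def expand_seed(adjective):
        """Candidate unigrams drawn from the synsets and glosses of a seed adjective."""
        candidates = set()
        for synset in wn.synsets(adjective, pos=wn.ADJ):
            candidates.update(name.lower() for name in synset.lemma_names())
            candidates.update(word.strip(".,;()").lower() for word in synset.definition().split())
        candidates.discard(adjective)
        return candidates

    print(sorted(expand_seed("brilliant"))[:20])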
WordNet is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Carenini, Giuseppe and Ng, Raymond T. and Zhou, Xiaodong
Empirical Evaluation
In Section 3.3, we developed three ways to compute the weight of an edge in the sentence quotation graph, i.e., clue words, semantic similarity based on WordNet and cosine similarity.
Empirical Evaluation
The above experiments show that the widely used cosine similarity and the more sophisticated semantic similarity in WordNet are less accurate than the basic CWS in the summarization framework.
Extracting Conversations from Multiple Emails
We explore three types of cohesion measures: (1) clue words that are based on stems, (2) semantic distance based on WordNet
Extracting Conversations from Multiple Emails
3.3.2 Semantic Similarity Based on WordNet
Extracting Conversations from Multiple Emails
We use the well-known lexical database WordNet to get the semantic similarity of two words.
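One common way to turn WordNet into a word-to-word score is to take the best similarity over the two words' noun senses; a sketch with NLTK's path-based similarity, standing in for the measure the paper defines in Section 3.3.2:

    from nltk.corpus import wordnet as wn

    def word_similarity(w1, w2):
        """Maximum path-based similarity over all noun-sense pairs of w1 and w2."""
        scores = [s1.path_similarity(s2)
                  for s1 in wn.synsets(w1, pos=wn.NOUN)
                  for s2 in wn.synsets(w2, pos=wn.NOUN)]
        return max((s for s in scores if s is not None), default=0.0)

    print(word_similarity("meeting", "discussion"))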
WordNet is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Davidov, Dmitry and Rappoport, Ari
Introduction
Most established resources (e.g., WordNet) represent only the main and widely accepted relationships such as hypernymy and meronymy.
Related Work
There is a large body of related work that deals with discovery of basic relationship types represented in useful resources such as WordNet, including hypernymy (Hearst, 1992; Pantel et al., 2004; Snow et al., 2006), synonymy (Davidov and Rappoport, 2006; Widdows and Dorow, 2002) and meronymy (Berland and Charniak, 1999; Girju et al., 2006).
Related Work
Several algorithms use manually-prepared resources, including WordNet (Moldovan et al., 2004; Costello et al., 2006) and Wikipedia (Strube and Ponzetto, 2006).
Related Work
Evaluation for hypernymy and synonymy usually uses WordNet (Lin and Pantel, 2002; Widdows and Dorow, 2002; Davidov and Rappoport, 2006).
WordNet is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ding, Shilin and Cong, Gao and Lin, Chin-Yew and Zhu, Xiaoyan
Context and Answer Detection
here, we use the product of sim(xu, Qi) and sim(xv, ·) to estimate the possibility of being a context-answer pair for (u, v), where sim(·, ·) is the semantic similarity calculated on WordNet as described in Section 3.5.
Context and Answer Detection
The semantic similarity between words is computed based on Wu and Palmer’s measure (Wu and Palmer, 1994) using WordNet (Fellbaum, 1998). The similarity between contiguous sentences will be used to capture the dependency for CRFs.
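Wu and Palmer's measure is available directly in NLTK, so the synset-level computation underneath the word and sentence similarities can be sketched as follows:

    from nltk.corpus import wordnet as wn

    # Wu-Palmer similarity between two synsets; a word-level score can be taken
    # as the maximum over the two words' sense pairs.
    dog, cat = wn.synset("dog.n.01"), wn.synset("cat.n.01")
    print(dog.wup_similarity(cat))      # a depth-based score in (0, 1]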
Context and Answer Detection
- Similarity with the question using WordNet
WordNet is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Surdeanu, Mihai and Ciaramita, Massimiliano and Zaragoza, Hugo
Approach
Wherever applicable, we explore different syntactic and semantic representations of the textual content, e.g., extracting the dependency-based representation of the text or generalizing words to their WordNet supersenses (WNSS) (Ciaramita and Altun, 2006).
Approach
In all these representations we skip stop words and normalize all words to their WordNet lemmas.
The Corpus
Each word was morphologically simplified using the morphological functions of the WordNet library.
The Corpus
These tags, defined by WordNet lexicographers, provide a broad semantic categorization for nouns and verbs and include labels for nouns such as food, animal, body and feeling, and for verbs labels such as communication, contact, and possession.
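Both steps, normalizing a word to its WordNet lemma and reading off the lexicographer-file ("supersense") label such as noun.food or verb.communication, can be sketched with NLTK; the supersense tagger cited by the authors is a separate tool, so this only shows what the labels look like:

    from nltk.corpus import wordnet as wn

    for word, pos in (("batteries", wn.NOUN), ("communicated", wn.VERB)):
        lemma = wn.morphy(word, pos) or word        # normalization to the WordNet lemma
        for synset in wn.synsets(lemma, pos=pos)[:2]:
            print(word, "->", lemma, "->", synset.lexname())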
WordNet is mentioned in 4 sentences in this paper.
Topics mentioned in this paper: