Experiments | Experiment results show that the similarity function based on synset definitions is the most effective.
Experiments | First, the similarity function based on synset definitions is the most effective one.
Experiments | As shown in Table 2, the similarity function based on synset definitions, i.e., sdef, is most effective. |
Introduction | We find that the most effective way to utilize the information from WordNet is to compute the term similarity based on the overlap of synset definitions. |
Term Similarity based on Lexical Resources | Every node in WordNet is a synset, i.e., a set of synonyms.
Term Similarity based on Lexical Resources | The definition of a synset, which is referred to as its gloss, is also provided.
Term Similarity based on Lexical Resources | For a query term, all the synsets in which the term appears can be returned, along with the definitions of those synsets.
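This lookup can be sketched with a toy inventory; the synset ids, lemma sets, and glosses below are illustrative stand-ins, not real WordNet data:

```python
# Minimal sketch of a synset lookup over a toy inventory (ids and glosses
# are hypothetical, not the real WordNet database).
MINI_WORDNET = {
    "dog.n.01": {"lemmas": {"dog", "domestic_dog"},
                 "gloss": "a domesticated carnivorous mammal"},
    "frank.n.02": {"lemmas": {"frank", "hotdog", "dog"},
                   "gloss": "a smooth-textured sausage"},
}

def synsets_of(term):
    """Return (synset_id, gloss) for every synset containing the term."""
    return [(sid, entry["gloss"])
            for sid, entry in MINI_WORDNET.items()
            if term in entry["lemmas"]]

print(synsets_of("dog"))
```

With a real installation, the same query is what a WordNet API's synset-lookup call performs: all synsets containing the term, each with its gloss.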
Abstract | As a first step to automatically construct full Wordnets, we propose approaches to generate Wordnet synsets for languages both resource-rich and resource-poor, using publicly available Wordnets, a machine translator and/or a single bilingual dictionary. |
Abstract | Our algorithms translate synsets of existing Wordnets to a target language T, then apply a ranking method on the translation candidates to find the best translations in T. Our approaches are applicable to any language which has at least one existing bilingual dictionary translating from English to it.
Introduction | One of our goals is to automatically generate high-quality synsets, each of which is a set of cognitive synonyms, for Wordnets having the same structure as the PWN in several languages.
Introduction | In particular, given public Wordnets aligned to the PWN (such as the FinnWordNet (FWN) (Linden, 2010) and the JapaneseWordNet (JWN) (Isahara et al., 2008)) and the Microsoft Translator, we build Wordnet synsets for arb, asm, dis, ajz and vie.
Proposed approaches | In this section, we propose approaches to create Wordnet synsets for a target language T using existing Wordnets and the MT and/or a single bilingual dictionary.
Proposed approaches | We take advantage of the fact that every synset in PWN has a unique offset-POS, referring to the offset for a synset with a particular part-of-speech (POS) from the beginning of its data file.
Proposed approaches | Each synset may have one or more words, each of which may be in one or more synsets.
Evaluation 1: Agreement with Sentiment Lexicons | The construction of the connotation graph, denoted by GWORD+SENSE, which includes words and synsets, has been described in Section 2.
Introduction | 1Hence a sense in WordNet is defined by a synset (= synonym set), which is the set of words sharing the same sense.
Network of Words and Senses | As shown in Figure 1, it contains two types of nodes: (i) lemmas (i.e., words, 115K) and (ii) synsets (63K), and four types of edges: (t1) predicate-argument (179K), (t2) argument-argument (144K), (t3) argument-synset (126K), and (t4) synset-synset (3.4K) edges.
Network of Words and Senses | The argument-synset edges capture the synonymy between argument nodes through the corresponding synsets.
Network of Words and Senses | Finally, the synset-synset edges depict the antonym relations between synset pairs. |
Pairwise Markov Random Fields and Loopy Belief Propagation | More formally, we denote the connotation graph GWORD+SENSE by G = (V, E), in which a total of n word and synset nodes V = {v1, . . . , vn}
Pairwise Markov Random Fields and Loopy Belief Propagation | and synsets connected with typed edges, and prior knowledge (i.e., probabilities) of (some or all) nodes belonging to each class,
KurdNet: State-of-the-Art | • Expand: in this model, the synsets are built in correspondence with the WordNet synsets and the semantic relations are directly imported.
KurdNet: State-of-the-Art | • Merge: in this model, the synsets and relations are first built independently and then they are aligned with WordNet’s.
KurdNet: State-of-the-Art | synsets) that play a major role in the wordnets.
Experiment 1: Oxford Lexical Predicates | As our set C of semantic classes we selected the standard set of 3,299 core nominal synsets available in WordNet.8 However, our approach is flexible and can be used with classes of an arbitrary level of granularity. |
Large-Scale Harvesting of Semantic Predicates | As explained below, we assume the set C to be made up of representative synsets from WordNet. |
Large-Scale Harvesting of Semantic Predicates | This way we avoid building a new taxonomy and shift the problem to that of projecting the Wikipedia pages —associated with annotated filling arguments — to synsets in WordNet. |
Large-Scale Harvesting of Semantic Predicates | We exploit an existing mapping implemented in BabelNet (Navigli and Ponzetto, 2012), a wide-coverage multilingual semantic network that integrates Wikipedia and WordNet.3 Based on a disambiguation algorithm, BabelNet establishes a mapping μ : Wikipages → Synsets which links about 50,000 pages to their most suitable WordNet senses.4
Extending with non-wordnet data | Most had around 550 senses (synsets and their lemmas): for example, for Portuguese: Englishnzl inglés.
Extending with non-wordnet data | de Melo and Weikum (2009) also use this data (and data from a variety of other sources) to build an enhanced wordnet, in addition adding new synsets for concepts that are not in wordnet.
Linking Multiple Wordnets | Open class words (nouns, verbs, adjectives and adverbs) are grouped into concepts represented by sets of synonyms (synsets).
Linking Multiple Wordnets | Synsets are linked by semantic relations such as hyponymy and meronymy.
Linking Multiple Wordnets | The majority of freely available wordnets take the basic structure of the PWN and add new lemmas (words) to the existing synsets: the extend model (Vossen, 2005).
Introduction | Section 3 describes the Synset aligned multilingual dictionary which facilitates parameter projection. |
Parameter Projection | i E Candidate Synsets J 2 Set ofdisambigaated words 6i 2 BelongingnessToDominantConcept(Si) V; = P(Si|w0rd) Wij = CorpusCooccurrencdSi, Sj) >|< l/WNConceptualDistance(Si, Sj) >|< l/WNSemanticGraphDistance(Si, 33-) |
Related Work | At the heart of our work lies parameter projection facilitated by a synset-aligned multilingual dictionary.
Synset Aligned Multilingual Dictionary | One important departure in this framework from the traditional dictionary is that synsets are linked, and after that the words inside the synsets are linked. |
Synset Aligned Multilingual Dictionary | The basic mapping is thus between synsets and thereafter between the words. |
Synset Aligned Multilingual Dictionary | After the synsets are linked, cross linkages are set up manually from the words of a synset to the words of a linked synset of the pivot language. |
Experimental Setup | To obtain the paraphrases, we use the word forms, glosses and example sentences of the synset itself and of a set of selected reference synsets (i.e., synsets linked to the target synset by specific semantic relations; see Table 1).
Experimental Setup | We excluded the ‘hypernym reference synsets’, since information common to all of the child synsets may confuse the disambiguation process.
Experimental Setup | In the latter case, each sense can be represented by its synset as well as its reference synsets.
Experiments | We think that there are three reasons for this: first, adjectives and adverbs have fewer reference synsets for paraphrases compared with nouns and verbs (see Table 1); second, adjectives and adverbs tend to convey less key semantic content in the document, so they are more difficult to capture by the topic model; and third, adjectives and adverbs are a small portion of the test set, so their performances are statistically unstable. |
Experiments | MII+ref is the result of including the reference synsets, while MII-ref excludes the reference synsets.
Related Work | Topics and synsets are then inferred together. |
The Sense Disambiguation Model | WordNet is a fairly rich resource which provides detailed information about word senses (glosses, example sentences, synsets, semantic relations between senses, etc.).
BabelNet | We collect (a) from WordNet, all available word senses (as concepts) and all the semantic pointers between synsets (as relations); (b) from Wikipedia, all encyclopedic entries (i.e. |
BabelNet | We call the resulting set of multilingual lexicalizations of a given concept a babel synset.
Methodology | A concept in WordNet is represented as a synonym set (called a synset), i.e.
Methodology | For instance, the concept wind is expressed by the following synset: |
Methodology | We denote with w_i^p the i-th sense of a word w with part of speech p. We use word senses to unambiguously denote the corresponding synsets (e.g.
Evaluation framework | The aligner constructs a WordNet dictionary for the purpose of synset alignment. |
Evaluation framework | The CW cluster is then aligned to WordNet synsets by comparing the clusters with the WordNet graph, and the synset with the maximum alignment score is returned as the output.
Evaluation framework | In summary, the aligner tool takes as input the CW cluster and returns a WordNet synset id that corresponds to the cluster words. |
Related work | A few approaches suggested by (Bond et al., 2009; Paakko' and Linden, 2012) attempt to augment WordNet synsets primarily using methods of annotation. |
Discussion | between the two extremes of full synsets and SFs.
Experimental setting | As mentioned above, words in WordNet are organised into sets of synonyms, called synsets.
Experimental setting | Each synset in turn belongs to a unique semantic file (SF). |
Experimental setting | We experiment with both full synsets and SFs as instances of fine-grained and coarse-grained semantic representation, respectively. |
Integrating Semantics into Parsing | Our choice for this work was the WordNet 2.1 lexical database, in which synonyms are grouped into synsets , which are then linked via an ISA hierarchy. |
Integrating Semantics into Parsing | With any lexical semantic resource, we have to be careful to choose the appropriate level of granularity for a given task: if we limit ourselves to synsets we will not be able to capture broader generalisations, such as the one between knife and scissors;1 on the other hand, by grouping words related at a higher level in the hierarchy we could find that we make overly coarse groupings (e.g.
Integrating Semantics into Parsing | 1In WordNet 2.1, knife and scissors are sister synsets, both of which have TOOL as their 4th hypernym.
Results | In this case, synsets slightly outperform SF. |
FrameNet — Wiktionary Alignment | The PPR measure (Agirre and Soroa, 2009) maps the glosses of the two senses to a semantic vector space spanned by WordNet synsets and then compares them using the chi-square measure.
FrameNet — Wiktionary Alignment | where M is a transition probability matrix between the n WordNet synsets, c is a damping factor, and vppr is a vector of size n representing the probability of jumping to the node i associated with each vi.
FrameNet — Wiktionary Alignment | For personalized PageRank, vppr is initialized in a particular way: the initial weight is distributed equally over the m vector components (i.e., synsets) associated with a word in the sense gloss; other components receive a 0 value.
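The personalized PageRank recursion can be sketched as plain power iteration over a tiny synset graph; the matrix, damping value, and jump vector below are illustrative, not the paper's actual setup:

```python
# Sketch of personalized PageRank: p = (1 - c) * v_ppr + c * M p,
# iterated to convergence. The 3-node graph is a toy example.
def personalized_pagerank(M, v_ppr, c=0.85, iters=50):
    """M[i][j] = transition probability from node j to node i
    (columns sum to 1); v_ppr = personalization (jump) distribution."""
    n = len(v_ppr)
    p = v_ppr[:]
    for _ in range(iters):
        p = [(1 - c) * v_ppr[i] + c * sum(M[i][j] * p[j] for j in range(n))
             for i in range(n)]
    return p

# 3-node chain 0-1-2; all jump mass goes to node 0, as when the sense
# gloss contains a single word mapped to synset 0.
M = [[0.0, 0.5, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 0.5, 0.0]]
v_ppr = [1.0, 0.0, 0.0]
p = personalized_pagerank(M, v_ppr)
print(p)
```

Because all jump mass is injected at node 0, the stationary distribution is biased toward node 0 and its neighbors, which is exactly the effect the gloss-based initialization of vppr exploits.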
Related Work | (2008) map FrameNet frames to WordNet synsets based on the embedding of FrameNet lemmas in WordNet. |
Related Work | To create MapNet, Tonelli and Pianta (2009) align FrameNet senses with WordNet synsets by exploiting the textual similarity of their glosses. |
Related Work | The similarity measure is based on stem overlap of the candidates’ glosses expanded by WordNet domains, the WordNet synset, and the set of senses for a FrameNet frame.
Experiments | In step 1, in order to make sure we select a diverse list of words, we consider three attributes of a word: frequency in a corpus, number of parts of speech, and number of synsets according to WordNet. |
Experiments | We also group words by their number of synsets: [0,5], [6,10], [11, 20], and [20, max].
Experiments | (2010), we use WordNet to first randomly select one synset of the first word; we then construct a set of words in various relations to the first word’s chosen synset, including hypernyms, hyponyms, holonyms, meronyms and attributes.
Clustering for Sentiment Analysis | A synonymous set of words in WordNet is called a synset.
Clustering for Sentiment Analysis | Each synset can be considered as a word cluster comprising semantically similar words.
Clustering for Sentiment Analysis | (2011) showed that WordNet synsets can act as good features for document level sentiment classification. |
Discussions | For example, on En-PD, the percentages of features present in the test set but not in the training set, relative to those present in the test set, are 34.17%, 11.24%, 0.31% for words, synsets
Discussions | However, it must be noted that clustering based on unlabelled corpora is less taxing than manually creating paradigmatic-property-based clusters like WordNet synsets.
Analysis and Discussions | 7.2 Effect of Synsets and Antonyms |
Analysis and Discussions | We show the important effect of synsets and antonyms in computing the sentiment similarity of words. |
Analysis and Discussions | This indicates that the synsets of the words can improve the quality of the enriched matrix.
Hidden Emotional Model | To compute the semantic similarity between word senses, we utilize their synsets as follows: |
Hidden Emotional Model | where syn(w) is the synset of w. Let count(wi, wj) be the co-occurrence count of wi and wj, and let count(wi) be the total word count.
Hidden Emotional Model | In addition, note that employing the synset of the words helps to obtain different emotional vectors for each sense of a word.
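The exact similarity equation is elided in the excerpt; the sketch below shows one plausible instantiation of a synset-based sense similarity, averaging the conditional co-occurrence count(wi, wj)/count(wi) over member pairs of the two synsets. All names and counts are toy values:

```python
# Hypothetical sketch of a sense similarity computed from synset members'
# co-occurrence statistics; the actual formula in the paper may differ.
def sense_similarity(syn_a, syn_b, cooc, total):
    """Average of count(w_i, w_j) / count(w_i) over member pairs."""
    pairs = [(wi, wj) for wi in syn_a for wj in syn_b]
    return sum(cooc.get((wi, wj), 0) / total[wi] for wi, wj in pairs) / len(pairs)

cooc = {("happy", "glad"): 8, ("happy", "joyful"): 4}
total = {"happy": 20, "cheerful": 10}
sim = sense_similarity({"happy", "cheerful"}, {"glad", "joyful"}, cooc, total)
print(sim)
```

Using synset members rather than surface words is what yields different similarity values, and hence different emotional vectors, for each sense of an ambiguous word.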
Comparison on applications | We consider 10 measures, noted in the table as J&C (Jiang and Conrath, 1997), Resnik (Resnik, 1995), Lin (Lin, 1998), W&P (Wu and Palmer, 1994), L&C (Leacock and Chodorow, 1998), H&SO (Hirst and St-Onge, 1998), Path (counts edges between synsets), Lesk (Banerjee and Pedersen, 2002), and finally Vector and Vector Pair (Patwardhan, 2003).
Comparison on applications | We mean a concept in Roget’s to be either a Class, Section, ..., or Semicolon Group, while a concept in WordNet is any synset.
Comparison on applications | Likewise, in WordNet if c were a synset, then each ci would be a hyponym synset of c.
Experiments | A task begins with a description of a target synset and its textual definition; following, ten annotation questions are shown. |
Video Game with a Purpose Design | First, by connecting WordNet synsets to Wikipedia pages, most synsets are associated with a set of pictures; while often noisy, these pictures sometimes illustrate the target concept and are an ideal case for validation. |
Video Game with a Purpose Design | Data We created a common set of concepts, C, used in both games, containing sixty synsets selected from all BabelNet synsets with at least fifty associated images.
Video Game with a Purpose Design | Using the same set of synsets, separate datasets were created for the two validation tasks.
Model and Feature Extraction | A lexical item can belong to several synsets, which are associated with different supersenses.
Model and Feature Extraction | For example, the word head (when used as a noun) participates in 33 synsets, three of which are related to the supersense noun.body.
Model and Feature Extraction | Hence, we select all the synsets of the nouns head and brain. |
Experiments | To enable a comparison with the state of the art, we followed Matuschek and Gurevych (2013) and performed an alignment of WordNet synsets (WN) to three different collaboratively-constructed resources: Wikipedia |
Experiments | As mentioned in Section 2.1.1, we build the WN graph by including all the synsets and semantic relations defined in WordNet (e.g., hypernymy and meronymy) and further populate the relation set by connecting a synset to all the other synsets that appear in its disambiguated gloss. |
Resource Alignment | For instance, WordNet can be readily represented as an undirected graph G whose nodes are synsets and edges are modeled after the relations between synsets defined in WordNet (e.g., hypernymy, meronymy, etc.
Resource Alignment | ), and LG is the mapping between each synset node and the set of synonyms which express the concept. |
Resource Alignment | 3'For instance, we calculated that more than 80% of the words in WordNet are monosemous, with over 60% of all the synsets containing at least one of them. |
IndoNet | An element of a common concept hierarchy is defined as <sinid_1, sinid_2, ..., uw_id, sumo_id>, where sinid_i is the synset id of the i-th wordnet, uw_id is the universal word id, and sumo_id is the SUMO term id of the concept.
IndoNet | Each synset of wordnet is directly linked to a concept in ‘common concept hierarchy’. |
Related Work | ILI consists of English synsets and serves as a pivot to link other wordnets. |
Related Work | Because of the small size of the top level ontology, only a few wordnet synsets can be linked directly to the ontological concept and most of the synsets get linked through subsumption relation. |
Word Polarity | Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept (Miller, 1995).
Word Polarity | Synsets are inter-linked by means of conceptual-semantic and lexical relations.
Word Polarity | The simplest approach is to connect words that occur in the same WordNet synset.
Introduction | Distributional models that integrate the visual modality have been learned from texts and images (Feng and Lapata, 2010; Bruni et al., 2012b) or from ImageNet (Deng et al., 2009), e.g., by exploiting the fact that images in this database are hierarchically organized according to WordNet synsets (Leong and Mihalcea, 2011). |
The Attribute Dataset | ImageNet has more than 14 million images spanning 21K WordNet synsets.
The Attribute Dataset | 1Some words had to be modified in order to match the correct synset, e.g., tank_(container) was found as storage_tank.
Assessment of Lexical Resources | This includes the WordNet lexicographer’s file name (e.g., noun.time), synsets, and hypernyms.
Assessment of Lexical Resources | We make extensive use of the file name, but less so from the synsets and hypernyms. |
Assessment of Lexical Resources | However, in general, we find that the file names are too coarse-grained and the synsets and hypernyms too fine-grained for generalizations on the selectors for the complements and the governors. |
Comparative Evaluation | As regards recall, we note that in two cases (i.e., DBpedia returning page super-types from its upper taxonomy, YAGO linking categories to WordNet synsets) the generalizations are neither pages nor categories, and that MENTA returns heterogeneous hypernyms as mixed sets of WordNet synsets, Wikipedia pages and categories.
Comparative Evaluation | MENTA seems to be the closest resource to ours, however, we remark that the hypernyms output by MENTA are very heterogeneous: 48% of answers are represented by a WordNet synset , 37% by Wikipedia categories and 15% are Wikipedia pages. |
Introduction | However, unlike the case with smaller manually-curated resources such as WordNet (Fellbaum, 1998), in many large automatically-created resources the taxonomical information is either missing, mixed across resources, e.g., linking Wikipedia categories to WordNet synsets as in YAGO, or coarse-grained, as in DBpedia whose hypernyms link to a small upper taxonomy. |
Experimental Framework | WordNet is organized into sets of synonyms, called synsets (SS).
Experimental Framework | Each synset in turn belongs to a unique semantic file (SF). |
Experimental Framework | As an example, knife in its tool sense is in the EDGE TOOL USED AS A CUTTING INSTRUMENT singleton synset, and also in the ARTIFACT SF along with thousands of words including cutter.
Experiments | To project WordNet synsets to terms, we used the first (most frequent) term in each synset . |
Experiments | A few WordNet synsets have multiple parents so we only keep the first of each such pair of overlapping trees. |
Experiments | We also discard a few trees with duplicate terms because this is mostly due to the projection of different synsets to the same term, and theoretically makes the tree a graph. |
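The projection and filtering steps described above can be sketched as follows; the toy taxonomy, synset ids, and lemmas are illustrative:

```python
# Sketch: project each synset in a tree to its first (most frequent)
# lemma, and discard the tree if two synsets project to the same term.
def project_tree(root, children, first_lemma):
    """Return the projected terms of the tree rooted at `root`,
    or None if the projection produces duplicate terms."""
    terms, stack = [], [root]
    while stack:
        syn = stack.pop()
        terms.append(first_lemma[syn])
        stack.extend(children.get(syn, []))
    return None if len(terms) != len(set(terms)) else terms

children = {"tool.n.01": ["knife.n.01", "scissors.n.01"]}
first_lemma = {"tool.n.01": "tool", "knife.n.01": "knife",
               "scissors.n.01": "scissors"}
print(project_tree("tool.n.01", children, first_lemma))
```

Returning None for a colliding projection mirrors the discard step: two distinct synsets mapping to one term would turn the tree into a graph.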
Term Weighting and Sentiment Analysis | It consists of WordNet synsets, where each synset is assigned three probability scores that add up to 1: positive, negative, and objective. |
Term Weighting and Sentiment Analysis | These scores are assigned at sense level (synsets in WordNet), and we use the following equations to assess the sentiment scores at the word level.
Term Weighting and Sentiment Analysis | where synset(w) is the set of synsets of w, and SWN_Pos(s), SWN_Neg(s) are the positive and negative scores of a synset in SentiWordNet.
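The word-level aggregation can be sketched in code. The excerpt elides the exact equations, so this assumes the common average-over-senses form; the synset ids and scores are toy values, not real SentiWordNet entries:

```python
# Sketch of word-level sentiment scores from sense-level (synset)
# SentiWordNet scores, averaging over the word's synsets. All data
# below is hypothetical.
def word_scores(word, synsets_of, swn_pos, swn_neg):
    senses = synsets_of[word]
    pos = sum(swn_pos[s] for s in senses) / len(senses)
    neg = sum(swn_neg[s] for s in senses) / len(senses)
    return pos, neg

synsets_of = {"happy": ["happy.a.01", "happy.a.02"]}
swn_pos = {"happy.a.01": 0.875, "happy.a.02": 0.75}
swn_neg = {"happy.a.01": 0.0, "happy.a.02": 0.125}
print(word_scores("happy", synsets_of, swn_pos, swn_neg))
```

Since each synset's three scores sum to 1, the word's objectivity can be recovered as 1 - pos - neg.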
Related Work | The first baseline searches for each head noun in WordNet and labels the noun as category Ck, if it has a hypernym synset corresponding to that category. |
Related Work | We manually identified the WordNet synsets that, to the best of our ability, seem to most closely correspond |
Related Work | We do not report WordNet results for TEST because there did not seem to be an appropriate synset, or for the OTHER category because that is a catchall class.
Evaluation | To create the dataset, we first compiled a list of 50 categories by selecting 50 hyponyms of the synset consumer goods in WordNet. |
System Description | In WordNet, nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms called synsets.
System Description | Each synset in WordNet expresses a different concept and they are connected to each other with lexical, semantic and conceptual relations. |
Set Expansion | It consists of a large number of synsets; a synset is a set of one or more similar word senses.
Set Expansion | The synsets are then connected with hypernym/hyponym links, which represent ISA relationships.
Set Expansion | The number of types of similarity in WordNet tends to be less than that captured by Moby, because synsets in WordNet are (usually) only allowed to have a single parent. |
Experiment: Ranking Word Senses | (2008), we represent different word senses by the words in the corresponding synsets . |
Experiment: Ranking Word Senses | For each word sense, we compute the centroid of the second-order vectors of its synset members. |
Experiment: Ranking Word Senses | Since synsets tend to be small (they may even contain only the target word itself), we additionally add the centroid of the sense’s hypernyms, scaled down by a factor of 10 (chosen as a rough heuristic without any attempt at optimization).
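The sense representation described above can be sketched as follows; the toy 2-d vectors stand in for the second-order co-occurrence vectors of synset members and hypernyms:

```python
# Sketch of the sense vector: centroid of the synset members' vectors
# plus the hypernym centroid scaled down by a factor of 10.
def centroid(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def sense_vector(member_vecs, hypernym_vecs, scale=0.1):
    c = centroid(member_vecs)
    h = centroid(hypernym_vecs)
    return [ci + scale * hi for ci, hi in zip(c, h)]

members = [[1.0, 0.0], [0.0, 1.0]]   # toy second-order vectors
hypernyms = [[2.0, 2.0]]
print(sense_vector(members, hypernyms))
```

The small scale factor lets the hypernym centroid break ties for near-empty synsets without dominating the members' own distributional signal.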