A Unified Semantic Representation | As our sense inventory, we use WordNet 3.0 (Fellbaum, 1998).
A Unified Semantic Representation | The WordNet ontology provides a rich network structure of semantic relatedness, connecting senses directly with their hypernyms, and providing information on semantically similar senses by virtue of their nearby locality in the network.
A Unified Semantic Representation | To extend beyond a single sense, the random walk may be initialized and restarted from a set of senses (seed nodes), rather than just one; this multi-seed walk produces a multinomial distribution over all the senses in WordNet with higher probability assigned to senses that are frequently visited from the seeds. |
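A minimal sketch of such a multi-seed random walk with restart (personalized PageRank) on a tiny hand-made sense graph; the graph, seed set, damping factor, and iteration count are illustrative, not drawn from WordNet:

```python
# Multi-seed random walk with restart (personalized PageRank) over a
# tiny hand-made sense graph.  Graph, seeds, and damping are
# illustrative only, not taken from WordNet.

def personalized_pagerank(graph, seeds, damping=0.85, iters=50):
    """graph: {node: [neighbours]}; restart mass is split over seeds."""
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in graph}
        for n, out in graph.items():
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
        rank = nxt
    return rank  # multinomial distribution over all nodes

senses = {
    "bank#1": ["money#1", "loan#1"],   # financial sense
    "bank#2": ["river#1"],             # riverside sense
    "money#1": ["loan#1", "bank#1"],
    "loan#1": ["money#1", "bank#1"],
    "river#1": ["bank#2"],
}
dist = personalized_pagerank(senses, seeds={"money#1", "loan#1"})
# Senses frequently visited from the seeds get most of the mass.
assert dist["bank#1"] > dist["bank#2"]
```

Restarting from the seed set keeps probability mass concentrated on the region of the graph around the seeds, which is exactly what makes the resulting distribution usable as a sense signature.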
Experiment 1: Textual Similarity | Additionally, because the texts often contain named entities which are not present in WordNet, we incorporated the similarity values produced by four string-based measures, which were used by other teams in the STS task: (1) longest common substring, which takes into account the length of the longest overlapping contiguous sequence of characters (substring) across two strings (Gusfield, 1997), (2) longest common subsequence, which, instead, finds the longest overlapping subsequence of two strings (Allison and Dix, 1986), (3) Greedy String Tiling, which allows reordering in strings (Wise, 1993), and (4) the character/word n-gram similarity proposed by Barrón-Cedeño et al.
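The first two string measures can be sketched as follows; the dynamic programs are standard, while the length-based normalization shown is one common choice and may differ from what the cited systems used:

```python
# Longest common substring and longest common subsequence, two of
# the string-based measures mentioned above.  The normalization by
# the longer string's length is an illustrative choice.

def lc_substring(a, b):
    best = 0
    prev = [0] * (len(b) + 1)      # prev[j]: common suffix length
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def lc_subsequence(a, b):
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def normalized(raw, a, b):
    return raw / max(len(a), len(b)) if (a or b) else 1.0

a, b = "wordnet senses", "wordnets sense"
assert lc_substring(a, b) == 7      # "wordnet"
assert lc_subsequence(a, b) == 13   # "wordnet sense"
```

Note the contrast: the substring measure requires contiguity, so the stray "s" in the second string caps it at 7, while the subsequence measure may skip characters and recovers 13.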
Experiment 1: Textual Similarity | • Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007), where the high-dimensional vectors are obtained on WordNet, Wikipedia and Wiktionary.
Experiments | All these systems incorporated lexical semantics features derived from WordNet and named entity features. |
Experiments | Features used in the experiments can be categorized into six types: identical word matching (I), lemma matching (L), WordNet (WN), enhanced Lexical Semantics (LS), Named Entity matching (NE) and Answer type checking (Ans). |
Experiments | Arguably the most common source of word relations, WordNet (WN) provides the primitive features of whether two words could belong to the same synset in WordNet, whether they could be antonyms, and whether one is a hypernym of the other.
Lexical Semantic Models | Although sets of synonyms can be easily found in thesauri or WordNet synsets, such resources typically cover only strict synonyms. |
Lexical Semantic Models | Traditionally, the WordNet taxonomy has been the linguistic resource for identifying hypernyms and hyponyms, applied broadly to many NLP problems.
Lexical Semantic Models | However, WordNet has a number of well-known limitations, including its rather limited or skewed concept distribution and its limited coverage of the IsA relation (Song et al., 2011).
Related Work | Although lexical semantic information derived from WordNet has been used in some of these approaches, the research has mainly focused on modeling the mapping between the syntactic structures of questions and sentences, produced from syntactic analysis. |
Abstract | Structured resources such as WordNet offer a convenient hierarchical means for converging on a common ground for comparison, but offer little support for the divergent thinking that is needed to creatively view one concept as another. |
Abstract | These lateral views complement the vertical views of WordNet, and support a system for idea exploration called Thesaurus Rex.
Abstract | We show also how Thesaurus Rex supports a novel, generative similarity measure for WordNet . |
Related Work and Ideas | WordNet’s taxonomic organization of noun-senses and verb-senses — in which very general categories are successively divided into increasingly informative subcategories or instance-level ideas — allows us to gauge the overlap in information content, and thus of meaning, of two lexical concepts. |
Related Work and Ideas | Wu & Palmer (1994) use the depth of a lexical concept in the WordNet hierarchy as such a proxy, and thereby estimate the similarity of two lexical concepts as twice the depth of their LCS divided by the sum of their individual depths. |
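Wu & Palmer's formula can be made concrete on a toy taxonomy; the concept names and hierarchy below are illustrative (not WordNet's), with the root counted at depth 1:

```python
# Wu & Palmer similarity on a toy IsA hierarchy:
#   sim(c1, c2) = 2 * depth(LCS) / (depth(c1) + depth(c2))
# The mini taxonomy is illustrative only.

parents = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "artifact": "entity",
}

def path_to_root(c):
    path = []
    while c is not None:
        path.append(c)
        c = parents[c]
    return path  # [c, ..., root]

def depth(c):
    return len(path_to_root(c))  # root has depth 1

def lcs(c1, c2):
    ancestors = set(path_to_root(c1))
    for c in path_to_root(c2):
        if c in ancestors:
            return c  # least common subsumer
    return None

def wup(c1, c2):
    return 2 * depth(lcs(c1, c2)) / (depth(c1) + depth(c2))

assert wup("dog", "cat") == 2 * 2 / (3 + 3)  # LCS is "animal"
```

The measure rewards pairs whose least common subsumer sits deep in the hierarchy, so "dog"/"cat" (sharing "animal") score higher than either would with "artifact", whose only shared subsumer is the root.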
Related Work and Ideas | Rather, when using Resnik’s metric (or that of Lin, or Jiang and Conrath) for measuring the similarity of lexical concepts in WordNet, one can use the category structure of WordNet itself to estimate information content.
Seeing is Believing (and Creating) | This reliance on the consensus viewpoint explains why WordNet (Fellbaum, 1998) has proven so useful as a basis for computational measures of lexico-semantic similarity.
Seeing is Believing (and Creating) | Using WordNet, for instance, a similarity measure can vertically converge on a common superordinate category of both inputs, and generate a single numeric result based on their distance to, and the information content of, this common generalization.
Seeing is Believing (and Creating) | Though WordNet is ideally structured to support vertical, convergent reasoning, its comprehensive nature means it can also be used as a solid foundation for building a more lateral and divergent model of similarity. |
Abstract | Expensive feature engineering based on WordNet senses has been shown to be useful for document level sentiment classification. |
Introduction | WordNet is a byproduct of such an analysis. |
Introduction | In WordNet, paradigms are manually generated based on the principles of lexical and semantic relationship among words (Fellbaum, 1998).
Introduction | WordNets are primarily used to address the problem of word sense disambiguation. |
FrameNet — Wiktionary Alignment | They align senses in WordNet to Wikipedia entries in a supervised setting using semantic similarity measures. |
FrameNet — Wiktionary Alignment | The PPR measure (Agirre and Soroa, 2009) maps the glosses of the two senses to a semantic vector space spanned by WordNet synsets and then compares them using the chi-square measure.
FrameNet — Wiktionary Alignment | The semantic vectors ppr are computed using the personalized PageRank algorithm on the WordNet graph. |
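The comparison step can be sketched as follows; the two gloss vectors below are made-up stand-ins for personalized PageRank distributions over synsets, and the chi-square-style distance shown is one standard formulation:

```python
# Comparing two gloss vectors (multinomial distributions over
# synsets) with a chi-square-style distance.  The vectors are
# illustrative stand-ins for real personalized PageRank output.

def chi_square_distance(p, q):
    return sum((pi - qi) ** 2 / (pi + qi)
               for pi, qi in zip(p, q) if pi + qi > 0)

ppr_gloss1 = [0.5, 0.3, 0.2, 0.0]   # synset distribution, gloss 1
ppr_gloss2 = [0.4, 0.4, 0.1, 0.1]   # synset distribution, gloss 2

d = chi_square_distance(ppr_gloss1, ppr_gloss2)
assert d >= 0.0
assert chi_square_distance(ppr_gloss1, ppr_gloss1) == 0.0
```

Identical distributions score 0, and dimensions where both glosses place no mass are simply skipped, which keeps the measure well-defined on sparse synset vectors.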
Related Work | (2008) map FrameNet frames to WordNet synsets based on the embedding of FrameNet lemmas in WordNet.
Related Work | They use Multi-WordNet, an English-Italian wordnet, to induce an Italian FrameNet lexicon with 15,000 entries.
Related Work | To create MapNet, Tonelli and Pianta (2009) align FrameNet senses with WordNet synsets by exploiting the textual similarity of their glosses. |
Introduction | Furthermore, the learned classes are not directly linked to existing resources such as WordNet (Fellbaum, 1998) or Wikipedia. |
Introduction | • We propose SPred, a novel approach which harvests predicates from Wikipedia and generalizes them by leveraging core concepts from WordNet.
Large-Scale Harvesting of Semantic Predicates | As explained below, we assume the set C to be made up of representative synsets from WordNet . |
Large-Scale Harvesting of Semantic Predicates | We perform this in two substeps: we first link all our disambiguated arguments to WordNet (Section 3.3.1) and then leverage the WordNet taxonomy to populate the semantic classes in C (Section 3.3.2).
Large-Scale Harvesting of Semantic Predicates | 3.3.1 Linking to WordNet |
Abstract | We create an open multilingual wordnet with large wordnets for over 26 languages and smaller ones for 57 languages. |
Abstract | It is made by combining wordnets with open licences, data from Wiktionary and the Unicode Common Locale Data Repository.
Introduction | One of the many attractions of the semantic network WordNet (Fellbaum, 1998) is that there are numerous wordnets being built for different languages.
Introduction | There are, in addition, many projects for groups of languages: EuroWordNet (Vossen, 1998), BalkaNet (Tufis et al., 2004), Asian Wordnet (Charoenporn et al., 2008) and more.
Introduction | Although there are over 60 languages for which wordnets exist in some state of development (Fellbaum and Vossen, 2012, 316), fewer than half of these have released any data, and for those that have, the data is often not freely accessible (Bond and Paik, 2012).
Abstract | It is a linked structure of wordnets of 18 different Indian languages, a Universal Word dictionary and the Suggested Upper Merged Ontology (SUMO).
Introduction | The past couple of decades have seen immense growth in the development of lexical resources such as wordnets, Wikipedia, ontologies, etc.
Introduction | In this paper we present IndoNet, a lexical resource created by merging wordnets of 18 different Indian languages.
Introduction | Suggested Upper Merged Ontology (SUMO) is the largest freely available ontology which is linked to the entire English WordNet (Niles and Pease, 2003). |
Related Work | Over the years wordnet has emerged as the most widely used lexical resource. |
Related Work | Though most of the wordnets are built by following the standards laid down by the English WordNet (Fellbaum, 1998), their conceptualizations differ because of the differences in lexicalization of concepts across languages.
Related Work | Wordnets are available in the following Indian languages: Assamese, Bodo, Bengali, English, Gujarati, Hindi, Kashmiri, Konkani, Kannada, Malayalam, Manipuri, Marathi, Nepali, Punjabi, Sanskrit, Tamil, Telugu and Urdu.
Experiments and evaluation | Table 2 shows the results of the evaluation of our initial thesaurus, achieved by comparing the selected semantic neighbors with two complementary reference resources: WordNet 3.0 synonyms (Miller, 1990) [W], which characterize a semantic similarity based on paradigmatic relations, and the Moby thesaurus (Ward, 1996) [M], which gathers a larger set of types of relations and is more representative of semantic relatedness.
Experiments and evaluation | WordNet provides a restricted number of synonyms for each noun while the Moby thesaurus contains for each entry a large number of synonyms and similar words. |
Experiments and evaluation | As a consequence, the precisions at different cutoffs have a significantly higher value with Moby as reference than with WordNet as reference. |
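Precision at a cutoff k against a reference resource can be computed as below; the neighbour list and both reference sets are illustrative, chosen only to mirror the strict-synonym vs. broad-relatedness contrast described above:

```python
# Precision at cutoff k: the fraction of the top-k extracted
# semantic neighbours found in a reference resource.  All words
# and reference sets below are illustrative.

def precision_at_k(ranked_neighbours, reference, k):
    top = ranked_neighbours[:k]
    return sum(1 for w in top if w in reference) / k

neighbours = ["car", "vehicle", "truck", "wheel", "road"]
strict_ref = {"car", "auto"}                              # few strict synonyms
broad_ref = {"car", "vehicle", "truck", "auto", "motor"}  # relatedness-style

assert precision_at_k(neighbours, strict_ref, 5) == 0.2
assert precision_at_k(neighbours, broad_ref, 5) == 0.6
```

With the same ranked neighbours, the broader reference admits more hits at every cutoff, which is the effect reported above for Moby versus WordNet.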
Introduction | The second approach makes use of a less structured source of knowledge about words such as the definitions of classical dictionaries or the glosses of WordNet.
Introduction | WordNet’s glosses were used to support Lesk-like measures in (Banerjee and Pedersen, 2003) and, more recently, measures were also defined from Wikipedia or Wiktionaries (Gabrilovich and Markovitch, 2007).
Experiments | Many of these templates utilize information from WordNet (Fellbaum, 1998).
Experiments | • WordNet link types (link type list) (e.g., attribute, hypernym, entailment)
Experiments | • Lexicographer filenames (lexnames): top-level categories used in WordNet (e.g., noun.body, verb.cognition)
Experiments | from the dependency parse tree) along with computing similarity in semantic spaces (using WordNet) clearly produces an improvement in the summarization quality (+1.4 improvement in ROUGE-L F-score).
Using the Framework | For each pair of nodes (u, v) in the graph, we compute the semantic similarity score (using WordNet) between every pair of dependency relations (rel: a, b) in u and v as: s(u, v) = Σ WN(a_i, a_j) × WN(b_i, b_j),
Using the Framework | where WN(w_i, w_j) is defined as the WordNet similarity score between words w_i and w_j.
Using the Framework | There exist various semantic relatedness measures based on WordNet (Patwardhan and Pedersen, 2006).
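The pairwise score s(u, v) above can be sketched as follows, with a toy word-similarity table standing in for a real WordNet measure; relation labels, words, and similarity values are all illustrative:

```python
# Node-pair score: each node carries dependency relations
# (rel, a, b); for relations of matching type, the WordNet
# similarities of corresponding arguments are multiplied and
# summed.  A toy lookup table stands in for a WordNet measure.

WN_SIM = {
    ("dog", "cat"): 0.8,
    ("chases", "follows"): 0.6,
}

def wn(w1, w2):
    if w1 == w2:
        return 1.0
    return WN_SIM.get((w1, w2), WN_SIM.get((w2, w1), 0.0))

def score(u_rels, v_rels):
    s = 0.0
    for rel_u, a_i, b_i in u_rels:
        for rel_v, a_j, b_j in v_rels:
            if rel_u == rel_v:  # only compare same relation types
                s += wn(a_i, a_j) * wn(b_i, b_j)
    return s

u = [("nsubj", "chases", "dog")]
v = [("nsubj", "follows", "cat")]
assert score(u, v) == wn("chases", "follows") * wn("dog", "cat")
```

Multiplying the two argument similarities means a pair of relations only contributes strongly when both of its arguments are semantically close, not just one.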
Learning Class Attributes | For these approaches, lists of instances are typically collected from publicly-available resources such as WordNet or Wikipedia (Pasca and Van Durme, 2007; |
Learning Class Attributes | Reisinger and Pasca (2009) considered the related problem of finding the most appropriate class for each attribute; they take an existing ontology of concepts (WordNet) as a class hierarchy and use a Bayesian approach to decide “the correct level of abstraction for each attribute.”
Related Work | These efforts focused exclusively on the meronymy relation as used in WordNet (Miller et al., 1990). |
Related Work | Experts can manually specify the attributes of entities, as in the WordNet project (Miller et al., 1990). |
Related Work | In many ways WordNet can be regarded as a collection of commonsense relationships. |
Evaluation | Precisions are omitted because their difference from the recalls is always the number of failures in looking up WordNet caused by mislabelled lemmata or POS tags, which is the same for all three methods.
Metric Space Implementation | To calculate sense similarities, we used the WordNet similarity package by Pedersen et al. |
Metric Space Implementation | Those texts were parsed using RASP parser (Briscoe et al., 2006) version 3.1, to obtain grammatical relations for the distributional similarity, as well as to obtain lemmata and part-of-speech (POS) tags which are required to look up the sense inventory of WordNet . |
Related Work | (2010) used a WordNet pre-pruning. |
Related Work | Disambiguation is performed by considering only those candidate synsets that belong to the top-k largest connected components of the WordNet graph on the domain corpus.
Simultaneous Optimization of All-words WSD | The cluster centers are located at the means of hypotheses that include miscellaneous unintended alternatives; the estimated probability distribution is thus, roughly speaking, offset toward the center of WordNet, which is not what we want.
Related Work 2.1 WordNet-based Approach | For a given predicate q, the system firstly computes its distribution of argument semantic classes based on WordNet . |
Related Work 2.1 WordNet-based Approach | Clark and Weir (2002) suggest a hypothesis testing method by ascending the noun hierarchy of WordNet . |
Related Work 2.1 WordNet-based Approach | Ciaramita and Johnson (2000) model WordNet as a Bayesian network to solve the “explain away” ambiguity.
Connotation Induction Algorithms | Hard constraints for WordNet relations:
Precision, Coverage, and Efficiency | In particular, we found that it becomes nearly impractical to run the ILP formulation including all words in WordNet plus all words in the argument position in Google Web 1T.
Precision, Coverage, and Efficiency | Therefore we revise those hard constraints to encode various semantic relations (WordNet and semantic coordination) more directly.
ImpAr algorithm | named entities and WordNet Super-Senses.
ImpAr algorithm | (Super-Senses are lexicographic files, in WordNet terminology.)
Related Work | VENSES++ (Tonelli and Delmonte, 2010) applied a rule based anaphora resolution procedure and semantic similarity between candidates and thematic roles using WordNet (Fellbaum, 1998). |
Experimental Design | This algorithm generated a set of synonyms from WordNet and then used the SUBTLEX frequencies to find the most frequent synonym. |
Experimental Design | This measure is taken from WordNet (Fellbaum, 1998). |
Experimental Design | Synonym Count Also taken from WordNet, this is the number of potential synonyms with which a word could be replaced.
Introduction | Distributional models that integrate the visual modality have been learned from texts and images (Feng and Lapata, 2010; Bruni et al., 2012b) or from ImageNet (Deng et al., 2009), e.g., by exploiting the fact that images in this database are hierarchically organized according to WordNet synsets (Leong and Mihalcea, 2011). |
The Attribute Dataset | Images for the concepts in McRae et al.’s (2005) production norms were harvested from ImageNet (Deng et al., 2009), an ontology of images based on the nominal hierarchy of WordNet (Fellbaum, 1998). |
The Attribute Dataset | ImageNet has more than 14 million images spanning 21K WordNet synsets. |
Conclusion | In addition, we introduced the new and useful WordNet, Affect, Length and Negation feature categories.
Evaluation of Word Pairs | 1  WordNet     20.07  34.07  52.96  11.58
Evaluation of Word Pairs | 2  Verb Class  14.24  24.84  49.60  10.04
Evaluation of Word Pairs | 3  MPN         23.84  38.58  49.97  13.16
Evaluation of Word Pairs | 4  Modality    17.49  28.92  13.84  10.72
Evaluation of Word Pairs | 5  Polarity    16.46  26.36  65.15  11.58
Evaluation of Word Pairs | 6  Affect      18.62  31.59  59.80  13.37
Other Features | WordNet Features: We define four features based on WordNet (Fellbaum, 1998): Synonyms, Antonyms, Hypernyms and Hyponyms.
Introduction | processing tools, e.g., syntactic parsers (Wiebe, 2000), information extraction (IE) tools (Riloff and Wiebe, 2003) or rich lexical resources such as WordNet (Esuli and Sebastiani, 2006). |
Related Work | Many researchers have explored using relations in WordNet (Miller, 1995), e.g., Esuli and Sebastiani (2006), Andreevskaia and Bergler (2006) for English, Rao and Ravichandran (2009) for Hindi and French, and Perez-Rosas et al.
Related Work | There is also a mismatch between the formality of many language resources, such as WordNet , and the extremely informal language of social media. |
PARMA | WordNet: WordNet (Miller, 1995) is a database of information (synonyms, hypernyms, etc.)
PARMA | For each entry, WordNet provides a set of synonyms, hypernyms, etc. |
PARMA | Given two spans, we use WordNet to determine semantic similarity by measuring how many synonym (or other) edges are needed to link two |
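Counting the edges needed to link two words can be sketched as a breadth-first search over a toy relation graph; the graph entries are illustrative, not WordNet's, and real systems would mix synonym, hypernym, and other edge types:

```python
# Path-length similarity: count how many relation edges are needed
# to connect two words, via breadth-first search.  The toy graph
# below is illustrative only.

from collections import deque

edges = {
    "buy": {"purchase"},
    "purchase": {"buy", "acquire"},
    "acquire": {"purchase", "get"},
    "get": {"acquire"},
    "sell": {"vend"},
    "vend": {"sell"},
}

def edge_distance(w1, w2):
    if w1 == w2:
        return 0
    seen, queue = {w1}, deque([(w1, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt == w2:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # not connected in the graph

assert edge_distance("buy", "get") == 3
assert edge_distance("buy", "sell") is None
```

A short edge distance is then read as high semantic similarity; unconnected pairs (here "buy"/"sell") fall back to a minimum similarity or a default score.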