Index of papers in Proc. ACL 2014 that mention
  • WordNet
Bansal, Mohit and Burkett, David and de Melo, Gerard and Klein, Dan
Abstract
To train the system, we extract substructures of WordNet and discriminatively learn to reproduce them, using adaptive subgradient stochastic optimization.
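The "adaptive subgradient stochastic optimization" mentioned here is the AdaGrad family of updates (Duchi et al., 2011). A minimal sketch of the core per-coordinate update, assuming a caller-supplied stochastic gradient function `grad` (a hypothetical name, not from the paper):

```python
import numpy as np

def adagrad(grad, w0, examples, lr=0.1, eps=1e-8, epochs=5):
    """AdaGrad: scale each coordinate's step by its accumulated gradient history."""
    w = w0.copy()
    g_sq = np.zeros_like(w)              # running sum of squared (sub)gradients
    for _ in range(epochs):
        for x in examples:
            g = grad(w, x)               # stochastic subgradient on one example
            g_sq += g * g
            w -= lr * g / (np.sqrt(g_sq) + eps)   # per-coordinate step size
    return w
```

Coordinates with large historical gradients take smaller steps, which suits the sparse feature vectors typical of structured prediction over taxonomies.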
Abstract
On the task of reproducing sub-hierarchies of WordNet, our approach achieves a 51% error reduction over a chance baseline, including a 15% error reduction due to the non-hypernym-factored sibling features.
Experiments
We considered two distinct experimental setups, one that illustrates the general performance of our model by reproducing various medium-sized WordNet domains, and another that facilitates comparison to previous work by reproducing the much larger animal subtree provided by Kozareva and Hovy (2010).
Experiments
General setup: In order to test the accuracy of structured prediction on medium-sized full-domain taxonomies, we extracted from WordNet 3.0 all bottomed-out full subtrees which had a tree-height of 3 (i.e., 4 nodes from root to leaf), and contained (10, 50] terms. This gives us 761 non-overlapping trees, which we partition into
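A rough sketch of how such subtrees could be enumerated with NLTK's WordNet interface; this is an illustration only and does not reproduce the authors' exact filtering (for instance, it does not enforce that the 761 trees be non-overlapping):

```python
from nltk.corpus import wordnet as wn

def subtree_height(s):
    """Length of the longest hyponym chain below synset s (0 for a leaf)."""
    hypos = s.hyponyms()
    return 0 if not hypos else 1 + max(subtree_height(h) for h in hypos)

def subtree_nodes(s):
    """All synsets in the full ('bottomed-out') subtree rooted at s."""
    nodes = {s}
    for h in s.hyponyms():
        nodes |= subtree_nodes(h)
    return nodes

# Roots of full subtrees with tree-height 3 and (10, 50] terms.
roots = [s for s in wn.all_synsets('n')
         if subtree_height(s) == 3 and 10 < len(subtree_nodes(s)) <= 50]
```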
Experiments
To project WordNet synsets to terms, we used the first (most frequent) term in each synset.
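With NLTK's WordNet interface, this projection is a one-liner; a minimal sketch:

```python
from nltk.corpus import wordnet as wn

def synset_to_term(synset):
    """Project a synset to its first (most frequent) term."""
    return synset.lemmas()[0].name()

synset_to_term(wn.synset('dog.n.01'))  # -> 'dog'
```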
Introduction
However, currently available taxonomies such as WordNet are incomplete in coverage (Pennacchiotti and Pantel, 2006; Hovy et al., 2009), unavailable in many domains and languages, and
Introduction
Figure 1: An excerpt of WordNet’s vertebrates taxonomy.
Introduction
First, on the task of recreating fragments of WordNet, we achieve a 51% error reduction on ancestor-based F1 over a chance baseline, including a 15% error reduction due to the non-hypernym-factored sibling features.
WordNet is mentioned in 18 sentences in this paper.
Pilehvar, Mohammad Taher and Navigli, Roberto
Abstract
Our approach leverages a similarity measure that enables the structural comparison of senses across lexical resources, achieving state-of-the-art performance on the task of aligning WordNet to three different collaborative resources: Wikipedia, Wiktionary and OmegaWiki.
Experiments
To enable a comparison with the state of the art, we followed Matuschek and Gurevych (2013) and performed an alignment of WordNet synsets (WN) to three different collaboratively-constructed resources: Wikipedia
Experiments
As mentioned in Section 2.1.1, we build the WN graph by including all the synsets and semantic relations defined in WordNet (e.g., hypernymy and meronymy) and further populate the relation set by connecting a synset to all the other synsets that appear in its disambiguated gloss.
Introduction
Notable examples are WordNet, Wikipedia and, more recently, collaboratively-curated resources such as OmegaWiki and Wiktionary (Hovy et al., 2013).
Introduction
When a lexical resource can be viewed as a semantic graph, as with WordNet or Wikipedia, this limit can be overcome by means of alignment algorithms that exploit the network structure to determine the similarity of concept pairs.
Introduction
We report state-of-the-art performance when aligning WordNet to Wikipedia, OmegaWiki and Wiktionary.
Related Work
A good example is WordNet, which has been exploited as a semantic network in dozens of NLP tasks (Fellbaum, 1998).
Resource Alignment
For instance, WordNet can be readily represented as an undirected graph G whose nodes are synsets and edges are modeled after the relations between synsets defined in WordNet (e.g., hypernymy, meronymy, etc.).
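A minimal sketch of materializing such a graph with networkx, using a handful of WordNet's relation types (the paper's exact relation set may differ):

```python
import networkx as nx
from nltk.corpus import wordnet as wn

G = nx.Graph()  # undirected: relation direction is deliberately dropped
for s in wn.all_synsets():
    related = s.hypernyms() + s.part_meronyms() + s.member_meronyms()
    for t in related:
        G.add_edge(s.name(), t.name())  # nodes are synset identifiers
```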
Resource Alignment
Any semantic network with a dense relational structure, providing good coverage of the words appearing in the definitions, is a suitable candidate for H. For this purpose we used the WordNet (Fellbaum, 1998) graph which was further enriched by connecting
Resource Alignment
As an example, assume we are given two semantic signatures computed for two concepts in WordNet and Wiktionary.
WordNet is mentioned in 14 sentences in this paper.
Mitra, Sunny and Mitra, Ritwik and Riedl, Martin and Biemann, Chris and Mukherjee, Animesh and Goyal, Pawan
Abstract
We conduct a thorough evaluation of the proposed methodology, both manually and through comparison with WordNet.
Abstract
Remarkably, in 44% of cases the birth of a novel sense is attested by WordNet, while split and join are confirmed by WordNet in 46% and 43% of cases, respectively.
Evaluation framework
6.2 Automated evaluation with WordNet
Evaluation framework
We chose WordNet for automated evaluation because not only does it have wide coverage of word senses, but it is also maintained and updated regularly to incorporate new senses.
Evaluation framework
For our evaluation, we developed an aligner to align the word clusters obtained with WordNet senses.
Introduction
Remarkably, comparison with the English WordNet indicates that in 44% of cases, as identified by our algorithm, there has been the birth of a completely novel sense; in 46% of cases a new sense has split off from an older sense; and in 43% of cases two or more older senses have merged to form a new sense.
Related work
A few approaches (Bond et al., 2009; Paakko and Linden, 2012) attempt to augment WordNet synsets primarily using methods of annotation.
WordNet is mentioned in 23 sentences in this paper.
Lau, Jey Han and Cook, Paul and McCarthy, Diana and Gella, Spandana and Baldwin, Timothy
Background and Related Work
Typically, word frequency distributions are estimated with respect to a sense-tagged corpus such as SemCor (Miller et al., 1993), a 220,000 word corpus tagged with WordNet (Fellbaum, 1998) senses.
Background and Related Work
The distributional similarity scores of the nearest neighbours are associated with the respective target word senses using a WordNet similarity measure, such as those proposed by Jiang and Conrath (1997) and Banerjee and Pedersen (2002).
Introduction
(2004b) to remove low-frequency senses from WordNet, we focus on finding senses that are unattested in the corpus on the premise that, given accurate disambiguation, rare senses in a corpus contribute to correct interpretation.
Macmillan Experiments
For the purposes of this research, the choice of Macmillan is significant in that it is a conventional dictionary with sense definitions and examples, but no linking between senses. In terms of the original research which gave rise to the sense-tagged dataset, Macmillan was chosen over WordNet for reasons including: (1) the well-documented difficulties of sense tagging with fine-grained WordNet senses (Palmer et al., 2004; Navigli et al., 2007); (2) the regular update cycle of Macmillan (meaning it contains many recently-emerged senses); and (3) the finding in a preliminary sense-tagging task that it better captured Twitter usages than WordNet (and also OntoNotes: Hovy et al.
Macmillan Experiments
The average sense ambiguity of the 20 target nouns in Macmillan is 5.6 (but 12.3 in WordNet).
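The WordNet side of such a statistic is easy to reproduce; a small sketch (the paper's 20 target nouns are not listed in the excerpt, so the words below are placeholders):

```python
from nltk.corpus import wordnet as wn

def mean_wn_ambiguity(nouns):
    """Average number of WordNet noun senses per target word."""
    return sum(len(wn.synsets(n, pos=wn.NOUN)) for n in nouns) / len(nouns)

mean_wn_ambiguity(['bar', 'chip', 'post'])  # placeholder target nouns
```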
Macmillan Experiments
We first notice that, despite the coarser-grained senses of Macmillan as compared to WordNet , the upper bound WSD accuracy using Macmillan is comparable to that of the WordNet-based datasets over the balanced BNC, and quite a bit lower than that of the two domain corpora of Koeling et al.
Methodology
the WordNet hierarchy).
WordNet Experiments
For each domain, annotators were asked to sense-annotate a random selection of sentences for each of 40 target nouns, based on WordNet v1.7.
WordNet Experiments
For each dataset, we use HDP to induce topics for each target lemma, compute the similarity between the topics and the WordNet senses (Equation (1)), and rank the senses based on the prevalence scores (Equation (2)).
WordNet Experiments
It is important to bear in mind that MKWC in these experiments makes use of full-text parsing in calculating the distributional similarity thesaurus, and the WordNet graph structure in calculating the similarity between associated words and different senses.
WordNet is mentioned in 12 sentences in this paper.
Lam, Khang Nhut and Al Tarouti, Feras and Kalita, Jugal
Abstract
Manually constructing a Wordnet is a difficult task, needing years of experts’ time.
Abstract
As a first step to automatically construct full Wordnets, we propose approaches to generate Wordnet synsets for languages both resource-rich and resource-poor, using publicly available Wordnets, a machine translator and/or a single bilingual dictionary.
Abstract
Our algorithms translate synsets of existing Wordnets to a target language T, then apply a ranking method on the translation candidates to find the best translations in T. Our approaches are applicable to any language which has at least one existing bilingual dictionary translating from English to it.
Introduction
Wordnets are intricate and substantive repositories of lexical knowledge and have become important resources for computational processing of natural languages and for information retrieval.
Introduction
Good quality Wordnets are available only for a few "resource-rich" languages such as English and Japanese.
Introduction
Published approaches to automatically build new Wordnets are manual or semiautomatic and can be used only for languages that already possess some lexical resources.
WordNet is mentioned in 82 sentences in this paper.
Kang, Jun Seok and Feng, Song and Akoglu, Leman and Choi, Yejin
Abstract
The key aspect of our method is that it is the first unified approach that assigns the polarity of both word- and sense-level connotations, exploiting the innate bipartite graph structure encoded in WordNet .
Conclusion
We have introduced a novel formulation of lexicon induction operating over both words and senses, by exploiting the innate structure between the words and senses as encoded in WordNet .
Evaluation II: Human Evaluation on ConnotationWordNet
Because senses in WordNet can be tricky to understand, care should be taken in designing the task so that the Turkers will focus only on the corresponding sense of a word.
Evaluation II: Human Evaluation on ConnotationWordNet
Therefore, we provided the part-of-speech tag, the WordNet gloss of the selected sense, and a few examples as given in WordNet.
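A small sketch of pulling those three pieces of information for a sense from NLTK's WordNet interface (the exact presentation shown to the Turkers is not specified in the excerpt):

```python
from nltk.corpus import wordnet as wn

def sense_card(synset):
    """Part of speech, gloss, and example sentences for one WordNet sense."""
    return {'pos': synset.pos(),
            'gloss': synset.definition(),
            'examples': synset.examples()}

sense_card(wn.synset('abound.v.01'))  # one of the two senses of "abound"
```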
Introduction
We introduce ConnotationWordNet, a connotation lexicon over the network of words in conjunction with senses, as defined in WordNet.
Introduction
For example, consider “abound”, for which lexicographers of WordNet prescribe two different senses:
Introduction
Especially if we look up the WordNet entry for “bristle”, there are noticeably more negatively connotative words involved in its gloss and examples.
Network of Words and Senses
Another benefit of our approach is that for various WordNet relations (e.g., antonym relations), which are defined over synsets (not over words), we can add edges directly between corresponding synsets, rather than projecting (i.e., approximating) those relations over words.
WordNet is mentioned in 11 sentences in this paper.
Flati, Tiziano and Vannella, Daniele and Pasini, Tommaso and Navigli, Roberto
Comparative Evaluation
As regards recall, we note that in two cases (i.e., DBpedia returning page super-types from its upper taxonomy, YAGO linking categories to WordNet synsets) the generalizations are neither pages nor categories and that MENTA returns heterogeneous hypernyms as mixed sets of WordNet synsets, Wikipedia pages and categories.
Comparative Evaluation
MENTA seems to be the closest resource to ours; however, we remark that the hypernyms output by MENTA are very heterogeneous: 48% of answers are represented by a WordNet synset, 37% by Wikipedia categories, and 15% by Wikipedia pages.
Introduction
However, unlike the case with smaller manually-curated resources such as WordNet (Fellbaum, 1998), in many large automatically-created resources the taxonomical information is either missing, mixed across resources, e.g., linking Wikipedia categories to WordNet synsets as in YAGO, or coarse-grained, as in DBpedia whose hypernyms link to a small upper taxonomy.
Introduction
(2005) provide a general vector-based method which, however, is incapable of linking pages which do not have a WordNet counterpart.
Introduction
Higher coverage is provided by de Melo and Weikum (2010) thanks to the use of a set of effective heuristics; however, the approach also draws on WordNet and sense frequency information.
Related Work
However, these methods do not link terms to existing knowledge resources such as WordNet, whereas those that explicitly link do so by adding new leaves to the existing taxonomy instead of acquiring wide-coverage taxonomies from scratch (Pantel and Ravichandran, 2004; Snow et al., 2006).
Related Work
Other approaches, such as YAGO (Suchanek et al., 2008; Hoffart et al., 2013), yield a taxonomical backbone by linking Wikipedia categories to WordNet .
Related Work
However, the categories are linked to the first, i.e., most frequent, sense of the category head in WordNet , involving only leaf categories in the linking.
WordNet is mentioned in 13 sentences in this paper.
Bengoetxea, Kepa and Agirre, Eneko and Nivre, Joakim and Zhang, Yue and Gojenola, Koldo
Abstract
This paper presents experiments with WordNet semantic classes to improve dependency parsing.
Experimental Framework
[Table fragment: feature configurations including Base, WordNet, and WordNet Clusters]
Experimental Framework
WordNet.
Experimental Framework
(2011), based on WordNet 2.1.
Introduction
Broadly speaking, we can classify the methods to incorporate semantic information into parsers in two: systems using static lexical semantic repositories, such as WordNet or similar ontologies (Agirre et al., 2008; Agirre et al., 2011; Fujita et al., 2010), and systems using dynamic semantic clusters automatically acquired from corpora (Koo et al., 2008; Suzuki et al., 2009).
Introduction
• Does semantic information in WordNet help
Introduction
• How does WordNet compare to automatically obtained information?
Related work
Broadly speaking, we can classify the attempts to add external knowledge to a parser in two sets: using large semantic repositories such as WordNet and approaches that use information automatically acquired from corpora.
Related work
The results showed a significant improvement, giving the first results over both WordNet and the Penn Treebank (PTB) to show that semantics helps parsing.
Related work
(2011) successfully introduced WordNet classes in a dependency parser, obtaining improvements on the full PTB using gold POS tags, trying different combinations of semantic classes.
WordNet is mentioned in 17 sentences in this paper.
Aliabadi, Purya
Introduction
WordNet (Fellbaum, 2010) has been used in numerous natural language processing tasks such as word sense disambiguation and information extraction with considerable success.
Introduction
Kurdish is a less-resourced language for which, among other resources, no wordnet has been built yet.
KurdNet: State-of-the-Art
1. highlighted the main challenges in building a wordnet for the Kurdish language (including its inherent diversity and morphological complexity),
KurdNet: State-of-the-Art
2. built the first prototype of KurdNet, the Kurdish WordNet (see a summary below), and
KurdNet: State-of-the-Art
There are two well-known models for building wordnets for a language (Vossen, 1998):
WordNet is mentioned in 18 sentences in this paper.
Tsvetkov, Yulia and Boytsov, Leonid and Gershman, Anatole and Nyberg, Eric and Dyer, Chris
Methodology
semantic categories originating in WordNet.
Methodology
Supersenses are called “lexicographer classes” in WordNet documentation (Fellbaum, 1998), http://wordnet…
Methodology
English adjectives do not, as yet, have a similar high-level semantic partitioning in WordNet; thus we use a 13-class taxonomy of adjective supersenses constructed by Tsvetkov et al.
Model and Feature Extraction
WordNet lacks coarse-grained semantic categories for adjectives.
Model and Feature Extraction
For example, the top-level classes in GermaNet include: adj.feeling (e.g., willing, pleasant, cheerful); adj.substance (e.g., dry, ripe, creamy); adj.spatial (e.g., adjacent, gigantic). For each adjective type in WordNet, they produce a vector with classifier posterior probabilities corresponding to degrees of membership of this word in one of the 13 semantic classes, similar to the feature vectors we build for nouns and verbs.
Model and Feature Extraction
Consider an example related to projection of WordNet supersenses.
Related Work
(2013) describe a Concrete Category Overlap algorithm, where co-occurrence statistics and Turney’s abstractness scores are used to determine WordNet supersenses that correspond to literal usage of a given adjective or verb.
Related Work
To implement this idea, they extend MRC imageability scores to all dictionary words using links among WordNet supersenses (mostly hypernym and hyponym relations).
Related Work
Because they heavily rely on WordNet and availability of imageability scores, their approach may not be applicable to low-resource languages.
WordNet is mentioned in 10 sentences in this paper.
Litkowski, Ken
Abstract
The features make extensive use of WordNet .
Assessment of Lexical Resources
Since the PDEP system enables exploration of features from WordNet, FrameNet, and VerbNet, we are able to make some assessment of these resources.
Assessment of Lexical Resources
WordNet played a statistically significant role in the systems developed by Tratz (2011) and Srikumar and Roth (2013).
Assessment of Lexical Resources
This includes the WordNet lexicographer’s file name (e.g., noun.time), synsets, and hypernyms.
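A sketch of extracting these three feature types with NLTK (the actual PDEP feature extraction surely differs in detail):

```python
from nltk.corpus import wordnet as wn

def wn_features(word, pos=wn.NOUN):
    """Lexicographer file names, synsets, and direct hypernyms for a word."""
    feats = {'lexnames': set(), 'synsets': set(), 'hypernyms': set()}
    for s in wn.synsets(word, pos=pos):
        feats['lexnames'].add(s.lexname())        # e.g., 'noun.time'
        feats['synsets'].add(s.name())
        feats['hypernyms'].update(h.name() for h in s.hypernyms())
    return feats

wn_features('morning')  # 'lexnames' includes 'noun.time'
```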
Class Analyses
We are examining the WordNet detour to FrameNet, as described in Burchardt et al.
Introduction
Section 4 describes how we are able to investigate the relationship of WordNet, FrameNet, and VerbNet to this effort and how this examination of preposition behavior can be used in working with these resources.
See http://clg.wlv.ac.uk/projects/DVC
The feature extraction rules are (1) word class (wc), (2) part of speech (pos), (3) lemma (l), (4) word (w), (5) WordNet lexical name (ln), (6) WordNet synonyms (s), (7) WordNet hypernyms (h), (8) whether the word is capitalized (c), and (9) affixes (af).
See http://clg.wlv.ac.uk/projects/DVC
For features such as the WordNet lexical name, synonyms and hypernyms, the number of values may be much larger.
WordNet is mentioned in 8 sentences in this paper.
Vannella, Daniele and Jurgens, David and Scarfini, Daniele and Toscani, Domenico and Navigli, Roberto
Abstract
Frequently, such resources are constructed through automatic mergers of complementary resources, such as WordNet and Wikipedia.
Introduction
Semantic knowledge bases such as WordNet (Fellbaum, 1998), YAGO (Suchanek et al., 2007), and BabelNet (Navigli and Ponzetto, 2010) provide ontological structure that enables a wide range of tasks, such as measuring semantic relatedness (Budanitsky and Hirst, 2006) and similarity (Pilehvar et al., 2013), paraphrasing (Kauchak and Barzilay, 2006), and word sense disambiguation (Navigli and Ponzetto, 2012; Moro et al., 2014).
Introduction
extend WordNet using distributional or structural features to identify novel semantic connections between concepts.
Introduction
The recent advent of large semistructured resources has enabled the creation of new semantic knowledge bases (Medelyan et al., 2009; Hovy et al., 2013) through automatically merging WordNet and Wikipedia (Suchanek et al., 2007; Navigli and Ponzetto, 2010; Niemann and Gurevych, 2011).
Related Work
Rzeniewicz and Szymanski (2013) extend WordNet with commonsense knowledge using a 20 Questions-like game.
Video Game with a Purpose Design
Knowledge base As the reference knowledge base, we chose BabelNet (Navigli and Ponzetto, 2010), a large-scale multilingual semantic ontology created by automatically merging WordNet with other collaboratively-constructed resources such as Wikipedia and OmegaWiki.
Video Game with a Purpose Design
First, by connecting WordNet synsets to Wikipedia pages, most synsets are associated with a set of pictures; while often noisy, these pictures sometimes illustrate the target concept and are an ideal case for validation.
Video Game with a Purpose Design
Second, BabelNet contains the semantic relations from both WordNet and hyperlinks in Wikipedia; these relations are again an ideal case of validation, as not all hyperlinks connect semantically-related pages in Wikipedia.
WordNet is mentioned in 8 sentences in this paper.
Fu, Ruiji and Guo, Jiang and Qin, Bing and Che, Wanxiang and Wang, Haifeng and Liu, Ting
Background
Some have established concept hierarchies based on manually-built semantic resources such as WordNet (Miller, 1995).
Background
Such hierarchies have good structures and high accuracy, but their coverage of fine-grained concepts is limited (e.g., “Ranunculaceae” is not included in WordNet).
Background
(2008) link the categories in Wikipedia onto WordNet .
Introduction
In the WordNet hierarchy, senses are organized according to the “isa” relations.
Related Work
(2006) provides a global optimization scheme for extending WordNet, which is different from the above-mentioned pairwise relationship identification methods.
WordNet is mentioned in 5 sentences in this paper.
Kalchbrenner, Nal and Grefenstette, Edward and Blunsom, Phil
Experiments
[Table fragments: the comparison table lists features used by baseline systems, such as a CCG parser, WordNet hypernyms, head words, a parser, and an SVM.]
WordNet is mentioned in 4 sentences in this paper.
Flanigan, Jeffrey and Thomson, Sam and Carbonell, Jaime and Dyer, Chris and Smith, Noah A.
Automatic Alignments
We use WordNet to generate candidate lemmas, and we also use a fuzzy match of a concept, defined to be a word in the sentence that has the longest string prefix match with that concept’s label, if the match length is ≥ 4.
Automatic Alignments
WordNet lemmas and fuzzy matches are only used if the rule explicitly uses them.
Training
Strips off trailing ‘-[0-9]+’ from the concept (for example, run-01 → run), and matches any exact matching word or WordNet lemma.
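Combining the two excerpts above, a hedged sketch of this matching rule: strip a trailing sense suffix, try an exact or WordNet-lemma match, then fall back to a fuzzy prefix match of length ≥ 4. The simplification here tests one word at a time rather than picking the sentence word with the longest prefix match:

```python
import re
from os.path import commonprefix
from nltk.corpus import wordnet as wn

def concept_matches_word(concept, word):
    """Match an AMR-style concept label against a sentence word."""
    base = re.sub(r'-[0-9]+$', '', concept)       # e.g., 'run-01' -> 'run'
    if word == base:
        return True                               # exact match
    if wn.morphy(word) == base:
        return True                               # WordNet lemma match
    return len(commonprefix([base, word])) >= 4   # fuzzy prefix match

concept_matches_word('run-01', 'runs')  # True via the WordNet lemma 'run'
```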
WordNet is mentioned in 3 sentences in this paper.
Sun, Le and Han, Xianpei
Introduction
If the node is a preterminal node, we capture its lexical semantics by adding features indicating its WordNet sense information.
Introduction
Specifically, the first WordNet sense of the terminal word, and all this sense’s hypernym senses, will be added as features.
Introduction
For example, WordNet senses {New York#1, city#1, district#1,
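A sketch of building such a feature set with NLTK, taking the first sense of a word together with its transitive hypernym senses (which is what the {New York#1, city#1, district#1, ...} example suggests; instance hypernyms are included so that named entities like New York reach city):

```python
from nltk.corpus import wordnet as wn

def first_sense_features(word):
    """First WordNet sense of a word plus the senses on its hypernym paths."""
    synsets = wn.synsets(word)
    if not synsets:
        return set()
    first = synsets[0]                                   # most frequent sense
    rel = lambda s: s.hypernyms() + s.instance_hypernyms()
    return {first.name()} | {h.name() for h in first.closure(rel)}

first_sense_features('New_York')  # includes city.n.01 among the ancestors
```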
WordNet is mentioned in 3 sentences in this paper.
Zou, Bowei and Zhou, Guodong and Zhu, Qiaoming
Baselines
One is word co-occurrence (if word wi and word wj occur in the same sentence or in adjacent sentences, Sim(wi, wj) increases by 1), and the other is WordNet-based similarity (Miller, 1995).
Baselines
Feature 3: total weight of words in the focus candidate using the WordNet similarity.
Baselines
Feature 4: max weight of words in the focus candidate using the WordNet similarity.
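The excerpts do not name the specific WordNet similarity measure; path similarity is one common choice. A minimal sketch of a word-level score taken as the maximum over the two words' sense pairs:

```python
from nltk.corpus import wordnet as wn

def wn_word_similarity(w1, w2):
    """Max path similarity over all WordNet sense pairs (0.0 if no senses)."""
    scores = [s1.path_similarity(s2) or 0.0   # None for cross-POS pairs
              for s1 in wn.synsets(w1) for s2 in wn.synsets(w2)]
    return max(scores, default=0.0)

wn_word_similarity('car', 'bicycle')  # higher for semantically close words
```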
WordNet is mentioned in 3 sentences in this paper.