GlossBoot | (a) Hypernym extraction: for each newly-acquired term/gloss pair (t, g) ∈ Gk, we automatically extract a candidate hypernym h from the textual gloss g.
GlossBoot | To do this we use a simple unsupervised heuristic which just selects the first term in the gloss. We show an example of hypernym extraction for some terms in Table 2 (column 1 reports the term, column 2 the gloss, and column 3 the hypernyms extracted by the first-term hypernym extraction heuristic).
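A minimal sketch of this first-term heuristic, assuming NLTK's tokenizer and POS tagger as stand-ins (GlossBoot's actual term chunking may differ): the candidate hypernym is simply the first noun encountered in the gloss.

```python
# Hedged sketch of the first-term hypernym-extraction heuristic.
# Assumes NLTK with the 'punkt' and 'averaged_perceptron_tagger'
# resources available; not GlossBoot's actual implementation.
import nltk

def extract_hypernym(gloss: str) -> str | None:
    """Return the first noun of the gloss as the candidate hypernym."""
    for token, tag in nltk.pos_tag(nltk.word_tokenize(gloss)):
        if tag.startswith("NN"):  # first noun encountered wins
            return token.lower()
    return None  # no noun found in the gloss

print(extract_hypernym("a device that computes"))  # -> 'device'
```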
GlossBoot | (b) (Term, Hypernym)-ranking: we sort all the glosses in Gk by the number of seed terms found in each gloss.
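A sketch of this ranking step, assuming Gk is materialized as a list of (term, gloss) pairs and `seeds` is the current seed-term set; plain substring matching stands in for whatever matching GlossBoot actually uses.

```python
# Illustrative (term, hypernym)-ranking: sort glosses by how many
# seed terms occur in each gloss, most seed-rich glosses first.
def rank_glosses(gk: list[tuple[str, str]], seeds: set[str]) -> list[tuple[str, str]]:
    def seed_hits(pair: tuple[str, str]) -> int:
        gloss = pair[1].lower()
        return sum(1 for s in seeds if s.lower() in gloss)
    return sorted(gk, key=seed_hits, reverse=True)
```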
Introduction | Given a domain and a language of interest, we bootstrap the glossary learning process with just a few hypernymy relations (such as computer isa device), with the only condition that the (term, hypernym ) pairs must be specific enough to implicitly identify the domain in the target language. |
Related Work | To avoid the use of a large domain corpus, terminologies can be obtained from the Web by using Doubly-Anchored Patterns (DAPs) which, given a (term, hypernym) pair, harvest sentences matching manually-defined patterns like “<hypernym> such as <term> and *” (Kozareva et al., 2008).
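For illustration, a DAP query for the pair (computer, device) can be assembled as below; the wildcard is expanded by the search engine into candidate co-hyponyms. This is a sketch of the pattern shape, not Kozareva et al.'s code.

```python
# Build a Doubly-Anchored Pattern query string for a (term, hypernym)
# pair, following the pattern "<hypernym> such as <term> and *".
def dap_query(term: str, hypernym: str) -> str:
    return f'"{hypernym} such as {term} and *"'

print(dap_query("computer", "device"))  # "device such as computer and *"
```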
Related Work | Similarly to our approach, they drop the requirement of a domain corpus and start from a small number of (term, hypernym) seeds.
Related Work | In contrast, GlossBoot performs the novel task of multilingual glossary learning from the Web by bootstrapping the extraction process with a few (term, hypernym) seeds.
Results and Discussion | Now, an obvious question arises: what if we bootstrapped GlossBoot with fewer hypernym seeds, e.g., just one seed? |
Results and Discussion | To answer this question we replicated our English experiments on each single (term, hypernym) pair in our seed set.
Large-Scale Harvesting of Semantic Predicates | We extract the hypernym from the textual definition of p by applying Word-Class Lattices (Navigli and Velardi, 2010, WCL), a domain-independent hypernym extraction system successfully applied to taxonomy learning from scratch (Velardi et al., 2013) and freely available online (Faralli and Navigli, 2013).
Large-Scale Harvesting of Semantic Predicates | If a hypernym h is successfully extracted and h is linked to a Wikipedia page p′ for which μ(p′) is defined, then we extend the mapping by setting μ(p) := μ(p′). For instance, the mapping provided by BabelNet does not provide any link for the page Peter Spence; thanks to WCL, though, we are able to set the page Journalist as its hypernym, and link it to the WordNet synset journalist.
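The mapping-extension step can be pictured as follows; `mu` (the page-to-synset mapping) and `hypernym_page` (WCL's page-level hypernym output) are illustrative stand-ins for the paper's actual data structures.

```python
# Sketch of extending the page-to-synset mapping mu: if WCL yields a
# hypernym page p' that is already mapped to a WordNet synset, copy
# that mapping to the unmapped page p, i.e. mu(p) := mu(p').
def extend_mapping(p: str, mu: dict[str, str], hypernym_page: dict[str, str]) -> None:
    p_prime = hypernym_page.get(p)  # e.g. "Peter Spence" -> "Journalist"
    if p_prime is not None and p_prime in mu:
        mu[p] = mu[p_prime]
```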
Large-Scale Harvesting of Semantic Predicates | The semantic class for a WordNet synset S is the class c among those in C which is the most specific hypernym of S according to the WordNet taxonomy.
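A minimal sketch of this class assignment with NLTK's WordNet interface, taking "most specific" to mean deepest in the taxonomy; the class inventory C below is illustrative, not the paper's.

```python
# Assign a semantic class to synset s: among the candidate classes,
# keep those that are hypernyms of s and return the deepest one.
from nltk.corpus import wordnet as wn

def semantic_class(s, classes):
    ancestors = set(s.closure(lambda x: x.hypernyms()))
    candidates = [c for c in classes if c in ancestors]
    return max(candidates, key=lambda c: c.min_depth(), default=None)

C = [wn.synset("entity.n.01"), wn.synset("person.n.01")]
print(semantic_class(wn.synset("journalist.n.01"), C))  # Synset('person.n.01')
```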
FrameNet — Wiktionary Alignment | For the joint model, we employed the best single PPR configuration, and a COS configuration that uses the sense gloss extended with Wiktionary hypernyms, synonyms, and the FrameNet frame name and frame definition, to achieve the highest score, an F1-score of 0.739.
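A bag-of-words sketch of such a COS scorer, under the assumption of naive whitespace tokenization (the configuration's actual preprocessing is not specified here): each side's text, e.g. a Wiktionary gloss extended with hypernyms and synonyms versus a FrameNet frame name plus definition, is turned into a term-frequency vector and compared by cosine.

```python
# Cosine similarity between two textual sense representations.
from collections import Counter
from math import sqrt

def cos(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0
```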
Intermediate Resource FNWKxx | We also extract other related lemma-POS pairs, for instance 487 antonyms, 126 hyponyms, and 19 hypernyms.
Resource FNWKde | Relation counts per FrameNet sense / per frame: SYNONYM 17,713 / 13,288; HYPONYM 4,818 / 3,347; HYPERNYM 6,369 / 3,961; ANTONYM 9,626 / 6,737.
Experiments | • WordNet link types (link type list) (e.g., attribute, hypernym, entailment)
Experiments | • WordNet hypernyms
Experiments | Curiously, although hypernyms are commonly used as features in NLP classification tasks, gloss terms, which are rarely used for these tasks, are approximately as useful, at least in this particular context. |
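Features of the two WordNet-based kinds listed above can be gathered along these lines with NLTK; the feature naming is ours and purely illustrative of how link types and hypernyms enter a classifier's feature set.

```python
# Collect WordNet link-type and hypernym features for a lemma.
from nltk.corpus import wordnet as wn

def wordnet_features(lemma: str) -> set[str]:
    feats = set()
    for syn in wn.synsets(lemma):
        for h in syn.hypernyms():
            feats.add(f"hypernym={h.name()}")
        if syn.entailments():  # verb-only link type
            feats.add("linktype=entailment")
        if syn.attributes():
            feats.add("linktype=attribute")
    return feats

print(sorted(wordnet_features("snore")))
```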