Abstract | We study the task of entity linking for tweets, which tries to associate each mention in a tweet with a knowledge base entry. |
Experiments | Following most existing studies, we choose Wikipedia as our knowledge base.
Introduction | In this work, we study the entity linking task for tweets, which maps each entity mention in a tweet to a unique entity, i.e., an entry ID of a knowledge base like Wikipedia. |
Introduction | The entity linking task is generally considered a bridge between unstructured text and a structured, machine-readable knowledge base, and plays a critical role in machine reading programs (Singh et al., 2011).
Introduction | Current entity linking methods are built on top of a large-scale knowledge base such as Wikipedia.
Our Method | - Total is the total number of knowledge base entities;
Related Work | TAP (http://www.w3.org/2002/05/tap) is a shallow knowledge base that contains a broad range of lexical and taxonomic information about popular objects such as music, movies, authors, sports, autos, and health.
Related Work | Shen et al. (2012) propose LIEGE, a framework to link the entities in web lists with the knowledge base, under the assumption that entities mentioned in a Web list tend to be a collection of entities of the same conceptual type.
Task Definition | Here, an entity refers to an item of a knowledge base.
Task Definition | Following most existing work, we use Wikipedia as the knowledge base, and an entity is a definition page in Wikipedia; a mention denotes a sequence of tokens in a tweet that can be potentially linked to an entity.
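Task Definition | To make the definition concrete, here is a minimal, hypothetical sketch of the task's input/output types, not any paper's actual system: `Mention`, `LinkedMention`, `link`, and the toy `priors` dictionary are all invented for illustration, and the linker simply picks the candidate Wikipedia title with the highest prior.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class Mention:
    tweet_id: str
    tokens: Tuple[str, ...]   # token span in the tweet, e.g. ("jordan",)
    start: int                # index of the first token in the tweet

@dataclass
class LinkedMention:
    mention: Mention
    entity: Optional[str]     # Wikipedia page title, or None if unlinkable

def link(mention: Mention, priors: Dict[str, Dict[str, float]]) -> LinkedMention:
    """Toy linker: choose the candidate entity with the highest prior."""
    scored = priors.get(" ".join(mention.tokens), {})
    best = max(scored, key=scored.get) if scored else None
    return LinkedMention(mention, best)

# Toy candidate dictionary: surface form -> {Wikipedia title: prior probability}
priors = {"jordan": {"Michael_Jordan": 0.6, "Jordan": 0.4}}
print(link(Mention("t1", ("jordan",), 3), priors).entity)  # Michael_Jordan
```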
Abstract | Some languages lack large knowledge bases and good discriminative features for Named Entity Recognition (NER) that can generalize to previously unseen named entities.
Abstract | One such language is Arabic, which: a) lacks a capitalization feature; and b) has relatively small knowledge bases, such as Wikipedia.
Abstract | In this work we address both problems by incorporating cross-lingual features and knowledge bases from English using cross-lingual links.
Conclusion | In this paper, we presented different cross-lingual features that can make use of linguistic properties and knowledge bases of other languages for NER. |
Conclusion | We used English as the "helper" language and we exploited the English capitalization feature and an English knowledge base, DBpedia.
Cross-lingual Features | DBpedia is a large collaboratively built knowledge base in which structured information is extracted from Wikipedia (Bizer et al., 2009).
Introduction | - Using cross-lingual links to exploit a large knowledge base, namely English DBpedia, to benefit NER.
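Introduction | A hypothetical sketch of how such a cross-lingual feature could be computed; the `translations` and `dbpedia_entities` lookups are stand-ins for a real translation table and the English DBpedia, not the paper's implementation:

```python
# Idea: for an Arabic token, look up its English translation and check
# (a) whether the translation is capitalized in English and
# (b) whether it appears in English DBpedia.
translations = {"باريس": "Paris", "كتاب": "book"}   # toy Arabic -> English table
dbpedia_entities = {"Paris", "Barack_Obama"}        # toy English DBpedia titles

def cross_lingual_features(arabic_token: str) -> dict:
    english = translations.get(arabic_token)
    return {
        "eng_capitalized": bool(english) and english[0].isupper(),
        "in_english_dbpedia": english is not None
                              and english.replace(" ", "_") in dbpedia_entities,
    }

print(cross_lingual_features("باريس"))
# {'eng_capitalized': True, 'in_english_dbpedia': True}
```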
Abstract | Methods for information extraction (IE) and knowledge base (KB) construction have been intensively studied. |
Candidate Types for Entities | We infer type disjointness constraints from the YAGO2 knowledge base using occurrence statistics.
Conclusion | This paper addressed the problem of detecting and semantically typing newly emerging entities, to support the life-cycle of large knowledge bases.
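Candidate Types for Entities | The exact statistic is not reproduced here; one simple variant, sketched below under that assumption, declares two types disjoint when they (almost) never co-occur on the same entity (`entity_types`, `disjoint_type_pairs`, and `min_support` are invented names):

```python
from collections import Counter
from itertools import combinations

# Toy entity -> type assignments, standing in for YAGO2 facts.
entity_types = {
    "Barack_Obama": {"person", "politician"},
    "Angela_Merkel": {"person", "politician"},
    "Berlin": {"location", "city"},
}

def disjoint_type_pairs(entity_types, min_support=1):
    """Call two types disjoint if they co-occur on fewer than min_support entities."""
    cooc = Counter()
    all_types = set()
    for types in entity_types.values():
        all_types |= types
        for pair in combinations(sorted(types), 2):
            cooc[pair] += 1
    return {pair for pair in combinations(sorted(all_types), 2)
            if cooc[pair] < min_support}

print(disjoint_type_pairs(entity_types))
# e.g. ('city', 'person') and ('location', 'politician') come out disjoint
```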
Detection of New Entities | a) The noun phrase is a known entity that can be directly mapped to the knowledge base.
Detection of New Entities | d) The noun phrase is a new entity not known to the knowledge base at all. |
Detection of New Entities | To decide if a noun phrase is a true entity (i.e., an individual entity that is a member of one or more lexical classes) or a non-entity (i.e., a common noun phrase that denotes a class or a general concept), we base the decision on the following hypothesis (inspired by and generalizing Bunescu (2006)): A given noun phrase, not known to the knowledge base, is a true entity if its headword is singular and is consistently capitalized (i.e., always spelled with the first letter in upper case).
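Detection of New Entities | A minimal sketch of this hypothesis as a decision rule; the function name is mine, and the singularity test and collection of observed spellings are assumed to happen upstream:

```python
def is_true_entity(headword_is_singular: bool, observed_spellings: list) -> bool:
    """Decision rule from the hypothesis above: a noun phrase unknown to the
    knowledge base is accepted as a true entity iff its headword is singular
    and every observed spelling starts with an upper-case letter."""
    if not observed_spellings:
        return False
    consistently_capitalized = all(s[0].isupper() for s in observed_spellings)
    return headword_is_singular and consistently_capitalized

print(is_true_entity(True, ["Clash of Clans", "Clash of Clans"]))  # True
print(is_true_entity(True, ["brexit", "Brexit"]))  # False: casing is inconsistent
```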
Introduction | A large number of knowledge base (KB) construction projects have recently emerged. |
Conclusion and Future Work | In theory, WikiCiKE can be applied to any two wiki knowledge bases of different languages.
Preliminaries | 2.1 Wiki Knowledge Base and Wiki Article |
Preliminaries | We consider each language version of Wikipedia as a wiki knowledge base, which can be represented as $K = \{a_i\}_{i=1}^{p}$, where $a_i$ is a disambiguated article in $K$ and $p$ is the size of $K$.
Preliminaries | For example, $K_S$ indicates the source wiki knowledge base, and $K_T$ denotes the target wiki knowledge base.
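Preliminaries | This formalization translates directly into a small data structure; the following is only an illustrative sketch under that reading, with `Article` and `WikiKB` as invented names mirroring $K = \{a_i\}_{i=1}^{p}$:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Article:
    title: str
    text: str
    infobox: Dict[str, str] = field(default_factory=dict)  # attribute -> value

@dataclass
class WikiKB:
    """A wiki knowledge base K = {a_1, ..., a_p}: a set of disambiguated articles."""
    articles: List[Article]

    @property
    def p(self) -> int:   # the size of K
        return len(self.articles)

# K_S: source (e.g. English) and K_T: target wiki knowledge base
K_S = WikiKB([Article("Paris", "Paris is the capital of France.")])
K_T = WikiKB([])
print(K_S.p, K_T.p)  # 1 0
```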
Conclusion | The learned entity representations are compact and can scale to very large knowledge bases.
Introduction | It is an essential first step for subsequent subtasks in knowledge base construction (Ji and Grishman, 2011), such as populating entities with attributes.
Introduction | However, the one-topic-per-entity assumption makes it impossible to scale to large knowledge bases, as every entity has a separate word distribution P(w|e); besides, the training objective does not directly correspond to disambiguation performance.
Experiments | While the first two can be improved by, say, using a better named entity tagger, incorporating other knowledge bases and building a question classifier, how to solve the third problem is tricky. |
Lexical Semantic Models | Probase is a knowledge base that establishes connections between 2.7 million concepts, discovered automatically by applying Hearst patterns (Hearst, 1992) to 1.68 billion Web pages. |
Lexical Semantic Models | Its abundant concept coverage distinguishes it from other knowledge bases, such as Freebase (Bollacker et al., 2008) and WikiTaxonomy (Ponzetto and Strube, 2007).
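Lexical Semantic Models | Probase is built at web scale with far more machinery than a single regular expression; the sketch below only illustrates what extraction with one classic "such as" Hearst pattern looks like (the regex and function are mine, not Probase's pipeline):

```python
import re

# One classic Hearst pattern: "<concept> such as <instance>(, <instance>)* and <instance>"
SUCH_AS = re.compile(r"(?P<concept>\w+) such as (?P<instances>[\w ,]+)")

def extract_isa_pairs(sentence: str):
    """Return (instance, concept) pairs matched by the 'such as' pattern."""
    pairs = []
    for m in SUCH_AS.finditer(sentence):
        concept = m.group("concept")
        for inst in re.split(r", | and | or ", m.group("instances")):
            if inst.strip():
                pairs.append((inst.strip(), concept))
    return pairs

print(extract_isa_pairs("He likes fruits such as apples, pears and oranges."))
# [('apples', 'fruits'), ('pears', 'fruits'), ('oranges', 'fruits')]
```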