Experimental Setup | Instead, it decides deterministically how to generate a story on the basis of the most likely predicate-argument and predicate-predicate counts in the knowledge base. |
The Story Generator | The generator next constructs several possible stories involving these entities by consulting a knowledge base containing information about dogs and ducks (e.g., dogs bark, ducks swim) and their interactions (e.g., dogs chase ducks, ducks love dogs). |
The Story Generator | Although we are ultimately searching for the best overall story at the document level, we must also find the most suitable sentences that can be generated from the knowledge base (see Figure 4). |
The Story Generator | The space of possible stories can increase dramatically depending on the size of the knowledge base so that an exhaustive tree search becomes computationally prohibitive. |
Abstract | We present an approach to training a joint syntactic and semantic parser that combines syntactic training information from CCGbank with semantic training information from a knowledge base via distant supervision. |
Introduction | We suggest that a large populated knowledge base should play a key role in syntactic and semantic parsing: in training the parser, in resolving syntactic ambiguities when the trained parser is applied to new text, and in its output semantic representation. |
Introduction | Using semantic information from the knowledge base at training and test time will |
Introduction | A semantic representation tied to a knowledge base allows for powerful inference operations, such as identifying the possible entity referents of a noun phrase, that cannot be performed with shallower representations (e.g., frame semantics (Baker et al., 1998) or a direct conversion of syntax to logic (Bos, 2005)). |
Parser Design | These logical forms are constructed using category and relation predicates from a broad coverage knowledge base. |
Parser Design | 3.1 Knowledge Base |
Prior Work | However, these approaches to semantics do not ground the text to beliefs in a knowledge base. |
Prior Work | Finally, some work has looked at applying semantic parsing to answer queries against large knowledge bases, such as YAGO (Yahya et al., 2012) and Freebase (Cai and Yates, 2013b; Cai and Yates, 2013a; Kwiatkowski et al., 2013; Berant et al., 2013). |
Abstract | Most existing relation extraction models make predictions for each entity pair locally and individually, while ignoring implicit global clues available in the knowledge base, sometimes leading to conflicts among local predictions from different entity pairs. |
Abstract | Also, we find that the clues learnt automatically from existing knowledge bases perform comparably to those refined by humans. |
Experiments | It uses Freebase as the knowledge base and the New York Times corpus as the text corpus, including about 60,000 entity tuples in the training set and about 90,000 entity tuples in the testing set. |
Introduction | Identifying predefined kinds of relationships between pairs of entities is crucial for many knowledge base related applications (Suchanek et al., 2013). |
Introduction | Many knowledge bases do not have a well-defined typing system, let alone fine-grained typing taxonomies with corresponding type recognizers, which are crucial for explicitly modeling the typing requirements for the arguments of a relation but are expensive and time-consuming to collect. |
Introduction | We propose to perform joint inference upon multiple local predictions by leveraging implicit clues that encode relation-specific requirements and can be learnt from existing knowledge bases. |
Related Work | Their approach only captures relation dependencies, while we learn implicit relation backgrounds from knowledge bases, including argument type and cardinality requirements. |
The Framework | The clues for detecting these inconsistencies can be learnt from a knowledge base. |
The Framework | As discussed earlier, we exploit from the knowledge base two categories of clues that implicitly capture relations’ backgrounds: their expected argument types and argument cardinalities. Based on these, we can discover two categories of disagreements among the candidate predictions, summarized as argument type inconsistencies and violations of argument uniqueness, which have rarely been considered before. |
The Framework | Most existing knowledge bases represent their knowledge facts as <subject, relation, object> triples, which can be seen as relational facts between entity tuples. |
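A minimal sketch of how such argument-type and cardinality clues can flag disagreements among candidate triple predictions; the relation, expected types, and entities below are illustrative, not drawn from any particular knowledge base:

```python
# Hypothetical clues learnt from a KB: expected argument types per relation,
# and relations whose object must be unique for a given subject.
EXPECTED_TYPES = {"born_in": ("person", "location")}
UNIQUE_OBJECT = {"born_in"}  # a person is born in exactly one place

ENTITY_TYPES = {"Barack_Obama": "person", "Honolulu": "location",
                "Hawaii": "location"}

def find_violations(candidates):
    """Return type inconsistencies and uniqueness violations among candidates."""
    violations, first_object = [], {}
    for subj, rel, obj in candidates:
        s_type, o_type = EXPECTED_TYPES.get(rel, (None, None))
        if s_type and ENTITY_TYPES.get(subj) != s_type:
            violations.append(("type", subj, rel, obj))
        if o_type and ENTITY_TYPES.get(obj) != o_type:
            violations.append(("type", subj, rel, obj))
        if rel in UNIQUE_OBJECT:
            prev = first_object.setdefault((rel, subj), obj)
            if prev != obj:  # conflicts with an earlier candidate prediction
                violations.append(("uniqueness", subj, rel, obj))
    return violations

cands = [("Barack_Obama", "born_in", "Honolulu"),
         ("Barack_Obama", "born_in", "Hawaii")]
print(find_violations(cands))  # flags the second triple as a uniqueness violation
```

A joint inference procedure would then penalize predictions participating in such violations rather than discard them outright.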
Abstract | We study the task of entity linking for tweets, which tries to associate each mention in a tweet with a knowledge base entry. |
Experiments | Following most existing studies, we choose Wikipedia as our knowledge base. |
Introduction | In this work, we study the entity linking task for tweets, which maps each entity mention in a tweet to a unique entity, i.e., an entry ID of a knowledge base like Wikipedia. |
Introduction | The entity linking task is generally considered a bridge between unstructured text and structured machine-readable knowledge bases, and plays a critical role in the machine reading program (Singh et al., 2011). |
Introduction | Current entity linking methods are built on top of a large scale knowledge base such as Wikipedia. |
Our Method | • Total is the total number of knowledge base entities; |
Related Work | TAP (http://www.w3.org/2002/05/tap) is a shallow knowledge base that contains a broad range of lexical and taxonomic information about popular objects like music, movies, authors, sports, autos, health, etc. |
Related Work | (2012) propose LIEGE, a framework to link the entities in web lists with the knowledge base, with the assumption that entities mentioned in a Web list tend to be a collection of entities of the same conceptual type. |
Task Definition | Here, an entity refers to an item of a knowledge base. |
Task Definition | Following most existing work, we use Wikipedia as the knowledge base, and an entity is a definition page in Wikipedia; a mention denotes a sequence of tokens in a tweet that can be potentially linked to an entity. |
Abstract | In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base , such as Freebase, as a source of supervision. |
Abstract | When a sentence and a knowledge base refer to the same entity pair, this approach heuristically labels the sentence with the corresponding relation in the knowledge base. |
Introduction | A particularly attractive approach, called distant supervision (DS), creates labeled data by heuristically aligning entities in text with those in a knowledge base , such as Freebase (Mintz et al., 2009). |
Introduction | With DS it is assumed that if a sentence contains an entity pair from a knowledge base, the sentence actually expresses the corresponding relation in the knowledge base. |
Knowledge-based Distant Supervision | DS uses a knowledge base to create labeled data for relation extraction by heuristically matching entity pairs. |
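The heuristic can be sketched in a few lines; the KB entry and sentences here are toy examples, and the second sentence illustrates the noisy labels this assumption produces:

```python
# Toy KB mapping entity pairs to relations, as used by distant supervision.
KB = {("Barack_Obama", "Honolulu"): "place_of_birth"}

def ds_label(corpus):
    """Label every sentence whose entity pair appears in the KB."""
    labeled = []
    for sentence, (e1, e2) in corpus:
        relation = KB.get((e1, e2))
        if relation is not None:
            labeled.append((sentence, e1, e2, relation))
    return labeled

corpus = [("Obama was born in Honolulu.", ("Barack_Obama", "Honolulu")),
          ("Obama flew back to Honolulu.", ("Barack_Obama", "Honolulu"))]
print(ds_label(corpus))  # both sentences get labeled place_of_birth,
                         # but only the first actually expresses the relation
```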
Related Work | The increasingly popular approach, called distant supervision (DS), or weak supervision, utilizes a knowledge base to heuristically label a corpus (Wu and Weld, 2007; Bellare and McCallum, 2007; Pal |
Related Work | (2009) who used Freebase as a knowledge base by making the DS assumption and trained relation extractors on Wikipedia. |
Abstract | Large-scale knowledge bases are important assets in NLP. |
Abstract | We propose a cost-effective method of validating and extending knowledge bases using video games with a purpose. |
Introduction | Large-scale knowledge bases are an essential component of many approaches in Natural Language Processing (NLP). |
Introduction | Semantic knowledge bases such as WordNet (Fellbaum, 1998), YAGO (Suchanek et al., 2007), and BabelNet (Navigli and Ponzetto, 2010) provide ontological structure that enables a wide range of tasks, such as measuring semantic relatedness (Budanitsky and Hirst, 2006) and similarity (Pilehvar et al., 2013), paraphrasing (Kauchak and Barzilay, 2006), and word sense disambiguation (Navigli and Ponzetto, 2012; Moro et al., 2014). |
Introduction | Furthermore, such knowledge bases are essential for building unsupervised algorithms when training data is sparse or unavailable. |
Related Work | Last, three two-player games have focused on validating and extending knowledge bases. |
Abstract | When ConceptResolver is run on NELL’s knowledge base, 87% of the word senses it creates correspond to real-world concepts, and 85% of noun phrases that it suggests refer to the same concept are indeed synonyms. |
Background: Never-Ending Language Learner | More information about NELL, including browsable and downloadable versions of its knowledge base, is available from http://rtw.ml.cmu.edu. |
Background: Never-Ending Language Learner | NELL’s knowledge base contains both definitions for predicates and extracted instances of each predicate. |
Background: Never-Ending Language Learner | At present, NELL’s knowledge base defines approximately 500 predicates and contains over half a million extracted instances of these predicates with an accuracy of approximately 0.85. |
ConceptResolver | The final output of sense induction is a sense-disambiguated knowledge base , where each noun phrase has been converted into one or more word senses, and relations hold between pairs of senses. |
Evaluation | For both experiments, we used a knowledge base created by running 140 iterations of NELL. |
Introduction | Many information extraction systems construct knowledge bases by extracting structured assertions from free text (e.g., NELL (Carlson et al., 2010), TextRunner (Banko et al., 2007)). |
Introduction | It first performs word sense induction, using the extracted category instances to create one or more unambiguous word senses for each noun phrase in the knowledge base. |
Introduction | We evaluate ConceptResolver using a subset of NELL’s knowledge base, presenting separate results for the concepts of each semantic type. |
Abstract | We leverage distant supervision using relations from the knowledge base FreeBase, but do not require any manual heuristics or manual seed list selection. |
Introduction | The detection of relations between entities for the automatic population of knowledge bases is very useful for solving tasks such as Entity Disambiguation, Information Retrieval and Question Answering. |
Introduction | The availability of high-coverage, general-purpose knowledge bases enables the automatic identification and disambiguation of entities in text and its applications (Bunescu and Pasca, 2006; Cucerzan, 2007; McNamee and Dang, 2009; Kwok et al., 2001; Pasca et al., 2006; Weld et al., 2008; Pereira et al., 2009; Kasneci et al., 2009). |
Introduction | These systems do not need any manual data or rules, but the relational facts they extract are not immediately disambiguated to entities and relations from a knowledge base. |
Unsupervised relational pattern learning | Similar to other distant supervision methods, our approach takes as input an existing knowledge base containing entities and relations, and a textual corpus. |
Unsupervised relational pattern learning | In this work it is not necessary for the corpus to be related to the knowledge base. |
Unsupervised relational pattern learning | In what follows we assume that all the relations studied are binary and hold between exactly two entities in the knowledge base. |
Abstract | We present a simple, data-driven approach to generation from knowledge bases (KB). |
Conclusion | Using the KBGen benchmark, we then showed that the resulting induced FB-LTAG compares favorably with competing symbolic and statistical approaches when used to generate from knowledge base data. |
Introduction | In this paper we present a grammar based approach for generating from knowledge bases (KB) which is linguistically principled and conceptually simple. |
Introduction | To evaluate our approach, we use the benchmark provided by the KBGen challenge (Banik et al., 2012; Banik et al., 2013), a challenge designed to evaluate generation from knowledge bases, where the input is a KB subset and the expected output is a complex sentence conveying the meaning represented by the input. |
Related Work | With the development of the semantic web and the proliferation of knowledge bases, generation from knowledge bases has attracted increased interest, and so-called ontology verbalisers have been proposed which support the generation of text from (parts of) knowledge bases. |
Related Work | strand of work maps each axiom in the knowledge base to a clause. |
Related Work | The MIAKT project (Bontcheva and Wilks, 2004) and the ONTOGENERATION project (Aguado et al., 1998) use symbolic NLG techniques to produce textual descriptions from some semantic information contained in a knowledge base. |
The KBGen Task | Specifically, the task is to verbalise a subset of a knowledge base. |
The KBGen Task | The KB subsets forming the KBGen input data were preselected from the AURA biology knowledge base (Gunning et al., 2010), a knowledge base about biology which was manually encoded by biology teachers and encodes knowledge about events, entities, properties and relations, where relations include event-to-entity, event-to-event, |
Conclusions | We apply the new model to construct a relation knowledge base (KB), and use it as a complement to the existing manually created KBs. |
Experiments | 5.2 Knowledge Base (KB) Construction |
Experiments | Further, medical knowledge changes extremely quickly, making it hard for people to understand it and to update the knowledge base in a timely manner. |
Experiments | Table 3: Knowledge Base Comparison (Our KB): Recall@20 = 135/742, Recall@50 = 182/742, Recall@3000 = 301/742. |
Identifying Key Medical Relations | To achieve this, we parsed all 80M sentences in our medical corpus, looking for the sentences containing the terms that are associated with the CUI pairs in the knowledge base. |
Identifying Key Medical Relations | For example, we know from the knowledge base that “antibiotic drug” may treat “Lyme disease”. |
Introduction | In candidate answer generation, relations enable the background knowledge base to be used for potential candidate |
Introduction | We also apply our model to build a new medical relation knowledge base as a complement to the existing knowledge bases. |
Conclusions | Although compiling time-aware knowledge bases is an important open challenge (Weikum et al., 2011), it has remained unexplored until very recently (Wang et al., 2011; Talukdar et al., 2012). |
Conclusions | We have also studied the limits of the distant supervision approach to relation extraction, showing empirically that its performance depends not only on the nature of the reference knowledge base and document corpus (Riedel et al., 2010), but also on the relation to be extracted. |
Distant Supervised Relation Extraction | From a reference Knowledge Base (KB), we extract a set of relation triples or seeds: (entity, relation, value), where the relation is one of the target relations. |
Evaluation | It has been shown that this assumption is more often violated when the training knowledge base and the document collection are of different types, e.g., Wikipedia and newswire (Riedel et al., 2010). |
Related Work | Compiling a Knowledge Base of temporally anchored facts is an open research challenge (Weikum et al., 2011). |
Related Work | There have been attempts to extend an existing knowledge base. |
Related Work | While ACE required only identifying time expressions and classifying their relation to events, KBP requires explicitly inferring the start/end time of relations, which is a realistic approach in the context of building time-aware knowledge bases. |
Introduction | The intuition of the paradigm is that one can take advantage of several knowledge bases, such as WordNet, Freebase and YAGO, to automatically label free texts, like the Wikipedia and New York Times corpora, based on some heuristic alignment assumptions. |
Introduction | are not only involved in the relation instances coming from knowledge bases (President-of(Barack Obama, U.S.) and Born-in(Barack Obama, U.S.)) |
Related Work | (2004) used WordNet as the knowledge base to discover more hypernym/hyponym relations between entities from news articles. |
Related Work | (2009) adopted Freebase (Bollacker et al., 2008; Bollacker et al., 2007), a large-scale crowdsourcing knowledge base online which contains billions of relation instances and thousands of relation names, to distantly supervise Wikipedia corpus. |
Related Work | (2012) proposed a novel approach to multi-instance multi-label learning for relation extraction, which jointly modeled all the sentences in texts and all labels in knowledge bases for a given entity pair. |
Abstract | Some languages lack large knowledge bases and good discriminative features for Named Entity Recognition (NER) that can generalize to previously unseen named entities. |
Abstract | One such language is Arabic, which: a) lacks a capitalization feature; and b) has relatively small knowledge bases, such as Wikipedia. |
Abstract | In this work we address both problems by incorporating cross-lingual features and knowledge bases from English using cross-lingual links. |
Conclusion | In this paper, we presented different cross-lingual features that can make use of linguistic properties and knowledge bases of other languages for NER. |
Conclusion | We used English as the “helper” language and we exploited the English capitalization feature and an English knowledge base, DBpedia. |
Cross-lingual Features | DBpedia is a large collaboratively-built knowledge base in which structured information is extracted from Wikipedia (Bizer et al., 2009). |
Introduction | - Using cross-lingual links to exploit a large knowledge base, namely English DBpedia, to benefit NER. |
Abstract | A central challenge in semantic parsing is handling the myriad ways in which knowledge base predicates can be expressed. |
Abstract | Traditionally, semantic parsers are trained primarily from text paired with knowledge base information. |
Abstract | Our goal is to exploit the much larger amounts of raw text not tied to any knowledge base. |
Introduction | We consider the semantic parsing problem of mapping natural language utterances into logical forms to be executed on a knowledge base (KB) (Zelle and Mooney, 1996; Zettlemoyer and Collins, 2005; Wong and Mooney, 2007; Kwiatkowski et al., 2010). |
Introduction | Scaling semantic parsers to large knowledge bases has attracted substantial attention recently (Cai and Yates, 2013; Berant et al., 2013; Kwiatkowski et al., 2013), since it drives applications such as question answering (QA) and information extraction (IE). |
Setup | Our task is as follows: given (i) a knowledge base K, and (ii) a training set of question-answer pairs {(x_i, y_i)}_{i=1}^{n}, output a semantic parser that maps new questions x to answers y via latent logical forms z. |
Setup | A knowledge base K is a set of assertions (e1, p, e2) ∈ E × P × E (e.g., (BillGates, PlaceOfBirth, Seattle)). |
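Concretely, such a KB and the execution of a one-predicate logical form can be sketched as follows; the assertions are illustrative, extending the BillGates example:

```python
# A knowledge base as a set of (e1, p, e2) assertions.
K = {("BillGates", "PlaceOfBirth", "Seattle"),
     ("BillGates", "Founded", "Microsoft"),
     ("LarryPage", "PlaceOfBirth", "Lansing")}

def execute(entity, predicate, kb=K):
    """Execute the logical form predicate(entity, ?x): return all matching x."""
    return {e2 for (e1, p, e2) in kb if e1 == entity and p == predicate}

print(execute("BillGates", "PlaceOfBirth"))  # {'Seattle'}
```

A semantic parser maps a question such as “Where was Bill Gates born?” to such a logical form; the latent variable z in the training setup is exactly a form of this kind.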
Abstract | Methods for information extraction (IE) and knowledge base (KB) construction have been intensively studied. |
Candidate Types for Entities | We infer type disjointness constraints from the YAG02 knowledge base using occurrence statistics. |
Conclusion | This paper addressed the problem of detecting and semantically typing newly emerging entities, to support the life-cycle of large knowledge bases . |
Detection of New Entities | b) The noun phrase is a known entity that can be directly mapped to the knowledge base. |
Detection of New Entities | d) The noun phrase is a new entity not known to the knowledge base at all. |
Detection of New Entities | To decide if a noun phrase is a true entity (i.e., an individual entity that is a member of one or more lexical classes) or a nonentity (i.e., a common noun phrase that denotes a class or a general concept), we base the decision on the following hypothesis (inspired by and generalizing Bunescu (2006)): A given noun phrase, not known to the knowledge base, is a true entity if its headword is singular and is consistently capitalized (i.e., always spelled with the first letter in upper case). |
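The hypothesis can be sketched directly; the singular test below is a crude suffix check standing in for real morphological analysis, and the example phrases are hypothetical:

```python
def is_true_entity(occurrences, headword):
    """Apply the hypothesis: singular headword + consistently capitalized NP."""
    singular = not headword.lower().endswith("s")      # crude stand-in
    consistently_capitalized = all(occ[:1].isupper() for occ in occurrences)
    return singular and consistently_capitalized

# Every observed spelling starts with an upper-case letter -> true entity.
print(is_true_entity(["Lyme Disease", "Lyme Disease"], "disease"))  # True
# Plural headword -> treated as a class, not an individual entity.
print(is_true_entity(["common colds"], "colds"))                    # False
```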
Introduction | A large number of knowledge base (KB) construction projects have recently emerged. |
Fact Candidates | The triple format is the most common representation of facts in knowledge bases. |
Fact Candidates | NELL’s entity typing method has high recall because when entities are not in the knowledge base , it performs on-the-fly type inference using the Web. |
Frequent bigrams | We evaluated FactChecker on three datasets: i) KB Fact Candidates: The first dataset consists of fact candidates taken from the fact extraction pipeline of a state-of-the-art knowledge base, NELL (Carlson et al., 2010). |
Frequent bigrams | ii) Wikipedia Fact Candidates: For the second dataset, we did not restrict the fact candidates to specific topics from a knowledge base; instead, we aimed to evaluate all fact candidates about a given entity. |
Introduction | These projects have produced knowledge bases containing many millions of relational facts between entities. |
Abstract | Wikification for tweets aims to automatically identify each concept mention in a tweet and link it to a concept referent in a knowledge base (e.g., Wikipedia). |
Experiments | We use a Wikipedia dump from May 3, 2013 as our knowledge base, which includes 30 million pages. |
Introduction | concept referent in a knowledge base (KB) (e.g., Wikipedia). |
Related Work | The task of linking concept mentions to a knowledge base has received increased attention over the past several years, from the linking of concept mentions in a single text (Mihalcea and Csomai, 2007; Milne and Witten, 2008b; Milne and Witten, 2008a; Kulkarni et al., 2009; He et al., 2011; Ratinov et al., 2011; Cassidy et al., 2012; Cheng and Roth, 2013), to the linking of a cluster of corefer- |
Concept-based Representation for Medical Records Retrieval | In particular, MetaMap is used to map terms from queries and documents (e.g., medical records) to the semantic concepts from biomedical knowledge bases such as UMLS. |
Conclusions and Future Work | Second, we will study how to leverage other information from knowledge bases to further improve the performance. |
Introduction | In the past decades, significant efforts have been put into constructing biomedical knowledge bases (Aronson and Lang, 2010; Lipscomb, 2000; Corporation, 1999) and developing natural language processing (NLP) tools, such as MetaMap, to utilize the information from the knowledge bases (Aronson, 2001; McInnes et al., 2009). |
Introduction | Indeed, concept-based representation is one of the commonly used approaches that leverage knowledge bases to improve the retrieval performance (Limsopatham et al., 2013d; Limsopatham et al., 2013b). |
Introduction | The basic idea is to represent both queries and documents as “bags of concepts”, where the concepts are identified based on the information from the knowledge bases. |
Abstract | A typical knowledge-based question answering (KB-QA) system faces two challenges: one is to transform natural language questions into their meaning representations (MRs); the other is to retrieve answers from knowledge bases (KBs) using generated MRs. |
Introduction | Knowledge-based question answering (KB-QA) computes answers to natural language (NL) questions based on existing knowledge bases (KBs). |
Introduction | Compared to their work, our method gains an improvement in two aspects: (1) Instead of using facts extracted using the open IE method, we leverage a large scale, high-quality knowledge base; (2) We can handle multiple-relation questions, instead of single-relation queries only, based on our translation based KB-QA framework. |
Introduction | (2013) is one of the latest works that has reported QA results based on a large scale, general domain knowledge base (Freebase), we consider their evaluation result on WEBQUESTIONS as our baseline. |
Conclusion and Future Work | In theory, WikiCiKE can be applied to any two wiki knowledge bases of different languages. |
Preliminaries | 2.1 Wiki Knowledge Base and Wiki Article |
Preliminaries | We consider each language version of Wikipedia as a wiki knowledge base, which can be represented as K = {a_i}_{i=1}^{p}, where a_i is a disambiguated article in K and p is the size of K. |
Preliminaries | For example, K_S indicates the source wiki knowledge base, and K_T denotes the target wiki knowledge base. |
Abstract | We also compare our results with existing knowledge bases to outline the similarities and differences of the granularity and diversity of the harvested knowledge. |
Introduction | • A comparison of the results with some large existing knowledge bases. |
Results | Initially, we decided to conduct an automatic evaluation comparing our results to knowledge bases that have been extracted in a similar way (i.e., through pattern application over unstructured text). |
Results | In this section, we compare the performance of our approach with the semantic knowledge base Yago, which contains 2 million entities, 95% of which were manually confirmed to be correct. |
Background | The set of generated inference rules can be regarded as the knowledge base KB. |
Evaluation | • PSL-no-DIR: Our PSL system without distributional inference rules (empty knowledge base). |
PSL for STS | Given the logical forms for a pair of sentences, a text T and a hypothesis H, and given a set of weighted rules derived from the distributional semantics (as explained in section 2.6) composing the knowledge base KB, we build a PSL model that supports determining the truth value of H in the most probable interpretation (i.e. |
PSL for STS | KB: The knowledge base is a set of lexical and phrasal rules generated from distributional semantics, along with a similarity score for each rule (section 2.6). |
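The truth value of such a weighted rule can be sketched with the Łukasiewicz relaxation that PSL uses, where an implication a → b has truth value min(1, 1 − I(a) + I(b)) and its distance to satisfaction is max(0, I(a) − I(b)); the rule and the truth values below are illustrative:

```python
def implication(a, b):
    """Lukasiewicz truth value of the soft implication a -> b."""
    return min(1.0, 1.0 - a + b)

def distance_to_satisfaction(weight, a, b):
    """Weighted distance to satisfaction of the rule a -> b, as in PSL."""
    return weight * max(0.0, a - b)

# Hypothetical phrasal rule car(x) -> vehicle(x), weighted by a
# distributional similarity score of 0.8.
print(implication(0.9, 0.6))                    # partially satisfied (~0.7)
print(distance_to_satisfaction(0.8, 0.9, 0.6))  # penalty of ~0.24
```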
Abstract | Distant supervision usually utilizes only unlabeled data and existing knowledge bases to learn relation extraction models. |
Guided DS | Our goal is to jointly model human-labeled ground truth and structured data from a knowledge base in distant supervision. |
Introduction | It automatically labels its own training data by heuristically aligning a knowledge base of facts with an unlabeled corpus. |
Introduction | Table 1: Classic errors in the training data generated by a toy knowledge base of only one entry personTitle(Abu Zubaydah, leader). |
Conclusion and Future Work | Facebook would be an ideal ground-truth knowledge base. |
Introduction | Inspired by the concept of distant supervision, we collect training tweets by matching attribute ground truth from an outside “knowledge base” such as Facebook or Google Plus. |
Model | Lists of universities and companies are taken from the knowledge base NELL. |
Related Work | Figure 1: Illustration of Google Plus “knowledge base”. |
Abstract | We evaluate OntoUSP by using it to extract a knowledge base from biomedical abstracts and answer questions. |
Background 2.1 Ontology Learning | Besides, many of them either bootstrap from heuristic patterns (e.g., Hearst patterns (Hearst, 1992)) or build on existing structured or semistructured knowledge bases (e.g., WordNet (Fellbaum, 1998) and Wikipedia), and are thus limited in coverage. |
Background 2.1 Ontology Learning | Our approach can also leverage existing ontologies and knowledge bases to conduct semi-supervised ontology induction (e.g., by incorporating existing structures as hard constraints or penalizing deviation from them). |
Experiments | These MAP parses formed the knowledge base (KB). |
Bayesian Logic Programs | Given a knowledge base as a BLP, standard logical inference (SLD resolution) is used to automatically construct a Bayes net for a given problem. |
Experimental Evaluation | The final knowledge base included all unique rules learned from any subset. |
Introduction | Since manually developing such a knowledge base is difficult and arduous, an effective alternative is to automatically learn such rules by mining a substantial database of facts that an IE system has already automatically extracted from a large corpus of text (Nahm and Mooney, 2000). |
Learning BLPs to Infer Implicit Facts | Typically, an ILP system takes a set of positive and negative instances for a target relation, along with a background knowledge base (in our case, other facts extracted from the same document) from which the positive instances are potentially inferable. |
Automatic Metaphor Interpretation | Veale and Hao (2008), however, did not evaluate to what extent their knowledge base of Talking Points and the associated reasoning framework are useful for interpreting metaphorical expressions occurring in text. |
Automatic Metaphor Recognition | If the system fails to recognize metonymy, it proceeds to search the knowledge base for a relevant analogy in order to discriminate metaphorical relations from anomalous ones. |
Automatic Metaphor Recognition | met* then searches its knowledge base for a triple containing a hypernym of both the actual argument and the desired argument and finds (thing, use, energy_source), which represents the metaphorical interpretation. |
Metaphor Resources | One of the first attempts to create a multipurpose knowledge base of source-target domain mappings is the Master Metaphor List (Lakoff et al., 1991). |
Abstract | Extracting contexts and answers together with the questions will yield not only a coherent forum summary but also a valuable QA knowledge base. |
Introduction | Another motivation of detecting contexts and answers of the questions in forum threads is that it could be used to enrich the knowledge base of community-based question and answering (CQA) services such as Live QnA and Yahoo! |
Introduction | To enrich the knowledge base , not only the answers, but also the contexts are critical; otherwise the answer to a question such as How much is the taxi would be useless without context in the database. |
Abstract | Answering natural language questions using the Freebase knowledge base has recently been explored as a platform for advancing the state of the art in open domain semantic parsing. |
Abstract | Those efforts map questions to sophisticated meaning representations that are then matched against viable answer candidates in the knowledge base. |
Introduction | Question answering (QA) from a knowledge base (KB) has a long history within natural language processing, going back to the 1960s and 1970s, with systems such as Baseball (Green Jr et al., 1961) and Lunar (Woods, 1977). |
Introduction | The creation and use of machine-readable knowledge has not only engaged researchers (Mitchell, 2005; Mirkin et al., 2009; Poon et al., 2010) in developing huge, broad-coverage knowledge bases (Hovy et al., 2013; Suchanek and Weikum, 2013), but has also reached big industry players such as Google (Singhal, 2012) and IBM (Ferrucci, 2012), which are moving fast towards large-scale knowledge-oriented systems. |
Introduction | The creation of very large knowledge bases has been made possible by the availability of collaboratively-curated online resources such as Wikipedia and Wiktionary. |
Related Work | A second project, MENTA (de Melo and Weikum, 2010), creates one of the largest multilingual lexical knowledge bases by interconnecting more than 13M articles in 271 languages. |
Learning Quality Knowledge | A good knowledge base should have the capacity to handle this ambiguity. |
Learning Quality Knowledge | Such patterns compose our knowledge base as shown below. |
Learning Quality Knowledge | As the knowledge is extracted from each cluster individually, we represent our knowledge base as a set of clusters, where each cluster consists of a set of frequent 2-patterns mined using FPM, e.g.,
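The per-cluster frequent 2-pattern mining described above can be illustrated with a toy sketch. This is our own simplification under assumed inputs (lists of pattern elements per cluster), not the paper's actual FPM implementation; the item names and the `min_support` threshold are illustrative.

```python
from itertools import combinations
from collections import Counter

def frequent_2_patterns(cluster, min_support=2):
    """Count co-occurring item pairs within one cluster of itemsets and
    keep those meeting min_support (a toy stand-in for full FPM)."""
    counts = Counter()
    for itemset in cluster:
        # Sort and deduplicate so each pair is counted once per itemset.
        for pair in combinations(sorted(set(itemset)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}

cluster = [["flight", "delay", "gate"],
           ["flight", "delay", "crew"],
           ["flight", "gate"]]
print(frequent_2_patterns(cluster))
# → {('delay', 'flight'): 2, ('flight', 'gate'): 2}
```

Running this per cluster, rather than over the whole corpus, keeps the mined 2-patterns specific to one sense or topic, which is how the clustered representation handles ambiguity.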
Experiments and Results | The reference knowledge base is derived from an October 2008 dump of English Wikipedia, which includes 818,741 nodes. |
Introduction | The Entity Linking (EL) task consists in linking name mentions of named entities (NEs) found in a document to their corresponding entities in a reference Knowledge Base (KB). |
Related Work | It relies on the Wikipedia-derived YAGO2 (Hoffart et al., 2011a) knowledge base.
Experiments | While the first two can be improved by, say, using a better named entity tagger, incorporating other knowledge bases, and building a question classifier, solving the third problem is tricky.
Lexical Semantic Models | Probase is a knowledge base that establishes connections between 2.7 million concepts, discovered automatically by applying Hearst patterns (Hearst, 1992) to 1.68 billion Web pages. |
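The Hearst patterns mentioned above can be sketched with a minimal example. This is not Probase's actual extraction pipeline; it implements only the single classic "X such as Y" pattern over simplified English text, and the regex and function names are our own.

```python
import re

# One classic Hearst pattern: "<concept> such as <list of instances>".
# A minimal sketch assuming simple, well-formed English sentences.
PATTERN = re.compile(r"(\w+) such as ([\w, ]+?)(?:\.|$)")

def hearst_pairs(text):
    """Return (instance, concept) pairs found by the 'such as' pattern."""
    pairs = []
    for m in PATTERN.finditer(text):
        concept = m.group(1)  # word immediately preceding "such as"
        for inst in re.split(r", | and ", m.group(2)):
            if inst:
                pairs.append((inst, concept))
    return pairs

print(hearst_pairs("We study animals such as dogs, cats and ducks."))
# → [('dogs', 'animals'), ('cats', 'animals'), ('ducks', 'animals')]
```

Applied at Web scale with many such patterns, repeated matches of the same (instance, concept) pair give frequency evidence for the isa connections a resource like Probase stores.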
Lexical Semantic Models | Its abundant concept coverage distinguishes it from other knowledge bases, such as Freebase (Bollacker et al., 2008) and WikiTaxonomy (Ponzetto and Strube, 2007).
Conclusion | The learned entity representations are compact and can scale to very large knowledge bases.
Introduction | It is an essential first step for subsequent subtasks in knowledge base construction (Ji and Grishman, 2011), such as populating entities with attributes.
Introduction | However, the one-topic-per-entity assumption makes it impossible to scale to large knowledge bases, as every entity has a separate word distribution P(w|e); besides, the training objective does not directly correspond to disambiguation performance.
Abstract | We apply the models to resolve entity types in new queries and to assign prior type distributions over an existing knowledge base . |
Conclusion | Our proposed models can be efficiently trained using an EM algorithm and can be further used to assign prior type distributions to entities in an existing knowledge base and to insert new entities into it. |
Joint Model of Types and User Intents | Fitting to an existing Knowledge Base: Although in general our model decodes type distributions for arbitrary entities, in many practical cases it is beneficial to constrain the types to those admissible in a fixed knowledge base (such as Freebase). |
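Constraining decoded type distributions to a fixed knowledge base, as described above, amounts to restricting probability mass to the admissible types and renormalizing. The sketch below is our own illustration of that idea under assumed inputs; the function name, the fallback behavior, and the Freebase-style type names are not from the paper.

```python
def constrain_to_kb(type_dist, admissible):
    """Restrict a decoded type distribution to KB-admissible types.

    type_dist:  dict mapping type name -> probability
    admissible: set of types the fixed KB allows for this entity
    """
    kept = {t: p for t, p in type_dist.items() if t in admissible}
    total = sum(kept.values())
    if total == 0:
        # No admissible mass: fall back to uniform over admissible types.
        return {t: 1.0 / len(admissible) for t in admissible}
    # Renormalize so the constrained distribution sums to 1.
    return {t: p / total for t, p in kept.items()}

dist = {"/film/actor": 0.5, "/music/artist": 0.3, "/misc/unknown": 0.2}
print(constrain_to_kb(dist, {"/film/actor", "/music/artist"}))
```

The renormalization keeps the relative preferences of the unconstrained model while guaranteeing that every decoded type exists in the knowledge base.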
Association Model | Clicked results in a vertical search engine are edges between queries and entities e in the vertical’s knowledge base.
Association Model | Throughout our models, we make the simplifying assumption that the knowledge base E is complete. |
Introduction | In this paper, we focus instead on associating surface contexts with entities that refer to a particular entry in a knowledge base such as Freebase, IMDB, Amazon’s product catalog, or The Library of Congress. |
Abstract | Many researchers are trying to use information extraction (IE) to create large-scale knowledge bases from natural language text on the Web. |
Conclusion | Many researchers are trying to use IE to create large-scale knowledge bases from natural language text on the Web, but existing relation-specific techniques do not scale to the thousands of relations encoded in Web text, while relation-independent techniques suffer from lower precision and recall and do not canonicalize the relations.
Heuristic Generation of Training Data | Wikipedia is an ideal starting point for our long-term goal of creating a massive knowledge base of extracted facts for two reasons.
Extracting Rules from Wikipedia | Our goal is to utilize the broad knowledge of Wikipedia to extract a knowledge base of lexical reference rules. |
Extracting Rules from Wikipedia | We note that the last three extraction methods should not be considered Wikipedia-specific, since many Web-like knowledge bases contain redirects, hyperlinks, and disambiguation mechanisms.
Introduction | To perform such inferences, systems need large scale knowledge bases of LR rules. |