The Tradeoffs Between Open and Traditional Relation Extraction
Banko, Michele and Etzioni, Oren

Article Structure

Abstract

Traditional Information Extraction (IE) takes a relation name and hand-tagged examples of that relation as input.

Introduction

Relation Extraction (RE) is the task of recognizing the assertion of a particular relationship between two or more entities in text.

The Nature of Relations in English

How are relationships expressed in English sentences?

Relation Extraction

Given a relation name, labeled examples of the relation, and a corpus, traditional Relation Extraction (RE) systems output instances of the given relation found in the corpus.

Hybrid Relation Extraction

Since O-CRF and R1-CRF have complementary views of the extraction process, it is natural to wonder whether they can be combined to produce a more powerful extractor.

Experimental Results

The following experiments demonstrate the benefits of Open IE for two tasks: open extraction and targeted extraction.

Related Work

TEXTRUNNER, the first Open IE system, is part of a body of work that reflects a growing interest in avoiding relation-specificity during extraction.

Conclusions and Future Work

Our experiments have demonstrated the promise of relation-independent extraction using the Open IE paradigm.

Topics

lexicalized

Appears in 11 sentences as: lexicalized (8) “lexicalized” (3)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. The relationship between standard RE systems and the new Open IE paradigm is analogous to the relationship between lexicalized and unlexicalized parsers.
    Page 1, “Introduction”
  2. Statistical parsers are usually lexicalized (i.e.
    Page 1, “Introduction”
  3. In this paper, we examine the tradeoffs between relation-specific (“lexicalized”) extraction and relation-independent (“unlexicalized”) extraction and reach an analogous conclusion.
    Page 2, “Introduction”
  4. Is it possible to combine Open IE with a “lexicalized” RE system to improve performance?
    Page 2, “Introduction”
  5. In the targeted extraction case, we compare the performance of O-CRF to a traditional RE system and find that without any relation-specific input, O-CRF obtains the same precision with lower recall compared to a lexicalized extractor trained using hundreds, and sometimes thousands, of labeled examples per relation.
    Page 2, “Introduction”
  6. We present H-CRF, an ensemble-based extractor that learns to combine the output of the lexicalized and unlexicalized RE systems and achieves a 10% relative increase in precision with comparable recall over traditional RE.
    Page 2, “Introduction”
  7. To compare the behavior of open, or “unlexicalized,” extraction to relation-specific, or “lexicalized” extraction, we developed a CRF-based extractor under the traditional RE paradigm.
    Page 5, “Relation Extraction”
  8. We now describe an ensemble-based or hybrid approach to RE that leverages the different views offered by open, self-supervised extraction in O-CRF, and lexicalized, supervised extraction in R1-CRF.
    Page 5, “Hybrid Relation Extraction”
  9. We also show that the combination of unlexicalized, open extraction in O-CRF and lexicalized, supervised extraction in R1-CRF improves precision and F-measure compared to a standalone RE system.
    Page 6, “Experimental Results”
  10. The lexicalized R1-CRF extractor is able to recover from this error; the presence of the word “Acquire” is enough to recog-
    Page 7, “Experimental Results”
  11. We found that while RESOLVER improves the relative recall of O-CRF by nearly 50%, O-CRF locates fewer synonyms per relation compared to its lexicalized counterpart.
    Page 7, “Experimental Results”

See all papers in Proc. ACL 2008 that mention lexicalized.


CRF

Appears in 9 sentences as: CRF (9)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. Figure 1: Relation Extraction as Sequence Labeling: A CRF is used to identify the relationship, born in, between Kafka and Prague
    Page 4, “Relation Extraction”
  2. The resulting set of labeled examples are described using features that can be extracted without syntactic or semantic analysis and used to train a CRF, a sequence model that learns to identify spans of tokens believed to indicate explicit mentions of relationships between entities.
    Page 4, “Relation Extraction”
  3. The entity pair serves to anchor each end of a linear-chain CRF, and both entities in the pair are assigned a fixed label of ENT.
    Page 4, “Relation Extraction”
  4. O-CRF was built using the CRF implementation provided by MALLET (McCallum, 2002), as well as part-of-speech tagging and phrase-chunking tools available from OPENNLP.
    Page 4, “Relation Extraction”
  5. The CRF is then used to label instances of relations for each possible entity pair, subject to the constraints mentioned previously.
    Page 5, “Relation Extraction”
  6. We refer to this system as R1-CRF.
    Page 5, “Relation Extraction”
  7. Due to the sequential nature of our RE task, H-CRF employs a CRF as the meta-learner, as opposed to a decision tree or regression-based classifier.
    Page 5, “Hybrid Relation Extraction”
  8. To obtain the probability at each position of a linear-chain CRF, the constrained forward-backward technique described in (Culotta and McCallum, 2004) is used.
    Page 5, “Hybrid Relation Extraction”
  9. (2006) used a CRF for RE, yet their task differs greatly from open extraction.
    Page 8, “Related Work”
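
The sequence-labeling view in excerpts 1 and 3 can be illustrated with a small decoding sketch. It does not train or run a CRF; it only shows how a labeled token sequence, anchored by two ENT tokens, could be turned into a relation tuple. The label names B-REL, I-REL, and O, the function name decode_relation, and the example sentence are assumptions made for illustration; only the ENT label and the Kafka / born in / Prague example come from the excerpts.

```python
from typing import List, Optional, Tuple


def decode_relation(
    tokens: List[str], labels: List[str]
) -> Optional[Tuple[str, str, str]]:
    """Turn one labeled token sequence into (entity1, relation, entity2).

    Assumed label scheme (hypothetical, not confirmed by the excerpts):
    ENT marks the two anchoring entities, B-REL / I-REL mark relation
    tokens, and O marks everything else.
    """
    ent_positions = [i for i, label in enumerate(labels) if label == "ENT"]
    if len(ent_positions) < 2:
        return None  # no entity pair to anchor the sequence

    first, last = ent_positions[0], ent_positions[-1]
    # Collect the tokens between the two entities that are tagged as relation.
    relation_tokens = [
        token
        for token, label in zip(tokens[first + 1:last], labels[first + 1:last])
        if label in ("B-REL", "I-REL")
    ]
    if not relation_tokens:
        return None  # the pair is present but no relation span was labeled
    return tokens[first], " ".join(relation_tokens), tokens[last]


# Hypothetical sentence built around the example named in Figure 1.
tokens = ["Kafka", ",", "a", "writer", "born", "in", "Prague"]
labels = ["ENT", "O", "O", "O", "B-REL", "I-REL", "ENT"]
result = decode_relation(tokens, labels)
print(result)
```

In a full system the labels would come from the trained CRF rather than being written by hand.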


Relation Extraction

Appears in 9 sentences as: Relation Extraction (5) relation extraction (4)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. Traditional Relation Extraction
    Page 1, “Abstract”
  2. Relation Extraction (RE) is the task of recognizing the assertion of a particular relationship between two or more entities in text.
    Page 1, “Introduction”
  3. In this section, we show that many relationships are consistently expressed using a compact set of relation-independent lexico-syntactic patterns, and quantify their frequency based on a sample of 500 sentences selected at random from an IE training corpus developed by (Bunescu and Mooney, 2007). This observation helps to explain the success of open relation extraction, which learns a relation-independent extraction model as described in Section 3.1.
    Page 2, “The Nature of Relations in English”
  4. Given a relation name, labeled examples of the relation, and a corpus, traditional Relation Extraction (RE) systems output instances of the given relation found in the corpus.
    Page 3, “Relation Extraction”
  5. Figure 1: Relation Extraction as Sequence Labeling: A CRF is used to identify the relationship, born in, between Kafka and Prague
    Page 4, “Relation Extraction”
  6. Linear-chain CRFs have been applied to a variety of sequential text processing tasks including named-entity recognition, part-of-speech tagging, word segmentation, semantic role identification, and recently relation extraction (Culotta et al., 2006).
    Page 4, “Relation Extraction”
  7. The set of features used by O-CRF is largely similar to those used by O-NB and other state-of-the-art relation extraction systems. They include part-of-speech tags (predicted using a separately trained maximum-entropy model), regular expressions (e.g., detecting capitalization, punctuation, etc.), context words, and conjunctions of features occurring in adjacent positions within six words to the left and six words to the right of the current word.
    Page 4, “Relation Extraction”
  8. 4.2 Stacked Relation Extraction
    Page 5, “Hybrid Relation Extraction”
  9. We also plan to explore the capacity of Open IE to automatically provide labeled training data, when traditional relation extraction is a more appropriate choice.
    Page 8, “Conclusions and Future Work”


part-of-speech

Appears in 4 sentences as: part-of-speech (4)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. The set of features used by O-CRF is largely similar to those used by O-NB and other state-of-the-art relation extraction systems. They include part-of-speech tags (predicted using a separately trained maximum-entropy model), regular expressions (e.g., detecting capitalization, punctuation, etc.), context words, and conjunctions of features occurring in adjacent positions within six words to the left and six words to the right of the current word.
    Page 4, “Relation Extraction”
  2. O-CRF was built using the CRF implementation provided by MALLET (McCallum, 2002), as well as part-of-speech tagging and phrase-chunking tools available from OPENNLP.
    Page 4, “Relation Extraction”
  3. A large number of false negatives on the part of O-CRF can be attributed to its lack of lexical features, which are often crucial when part-of-speech tagging errors are present.
    Page 7, “Experimental Results”
  4. nize the positive instance, despite the incorrect part-of-speech tag.
    Page 7, “Experimental Results”
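
The feature description in excerpt 1 can be sketched as a per-token feature function. This is a minimal illustration rather than the feature set of O-CRF itself: the function name token_features, the feature-name strings, the punctuation regular expression, and the use_words switch are all assumptions. The six-word window and the feature families (part-of-speech tags, capitalization and punctuation patterns, context words, conjunctions of adjacent features) follow the excerpt. Part-of-speech tags are passed in, since the excerpt says they are predicted by a separate model.

```python
import re
from typing import Dict, List

WINDOW = 6  # six words to the left and right of the current word
PUNCT = re.compile(r"[^\w\s]+")  # assumed pattern for punctuation-only tokens


def token_features(
    tokens: List[str], pos_tags: List[str], i: int, use_words: bool = False
) -> Dict[str, bool]:
    """Binary features for position i, drawn from a +/- WINDOW context.

    use_words is a hypothetical switch: False keeps the features free of
    word identities, True adds context words as lexical features.
    """
    feats: Dict[str, bool] = {}
    for offset in range(-WINDOW, WINDOW + 1):
        j = i + offset
        if not 0 <= j < len(tokens):
            continue  # the window runs past the sentence boundary
        feats[f"pos[{offset}]={pos_tags[j]}"] = True
        if tokens[j][:1].isupper():
            feats[f"cap[{offset}]"] = True
        if PUNCT.fullmatch(tokens[j]):
            feats[f"punct[{offset}]"] = True
        if use_words:
            feats[f"word[{offset}]={tokens[j].lower()}"] = True
        # Conjunction of the POS tags at two adjacent window positions.
        if offset < WINDOW and j + 1 < len(tokens):
            name = f"pos[{offset}]|pos[{offset + 1}]"
            feats[f"{name}={pos_tags[j]}|{pos_tags[j + 1]}"] = True
    return feats


# Hypothetical tagged sentence; the tags are supplied by hand here.
tokens = ["Kafka", "was", "born", "in", "Prague", "."]
pos_tags = ["NNP", "VBD", "VBN", "IN", "NNP", "."]
feats = token_features(tokens, pos_tags, 2)
lexical_feats = token_features(tokens, pos_tags, 2, use_words=True)
```

Keeping word identities behind a switch makes it easy to compare what a model sees with and without lexical features, which is the contrast excerpt 3 draws when part-of-speech tagging errors occur.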


part-of-speech tagging

Appears in 4 sentences as: part-of-speech tag (1) part-of-speech tagging (2) part-of-speech tags (1)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. The set of features used by O-CRF is largely similar to those used by O-NB and other state-of-the-art relation extraction systems. They include part-of-speech tags (predicted using a separately trained maximum-entropy model), regular expressions (e.g., detecting capitalization, punctuation, etc.), context words, and conjunctions of features occurring in adjacent positions within six words to the left and six words to the right of the current word.
    Page 4, “Relation Extraction”
  2. O-CRF was built using the CRF implementation provided by MALLET (McCallum, 2002), as well as part-of-speech tagging and phrase-chunking tools available from OPENNLP.
    Page 4, “Relation Extraction”
  3. A large number of false negatives on the part of O-CRF can be attributed to its lack of lexical features, which are often crucial when part-of-speech tagging errors are present.
    Page 7, “Experimental Results”
  4. nize the positive instance, despite the incorrect part-of-speech tag.
    Page 7, “Experimental Results”


extraction system

Appears in 3 sentences as: extraction system (2) extraction systems (1)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. Second, when the number of target relations is small, and their names are known in advance, we show that O-CRF is able to match the precision of a traditional extraction system, though at substantially lower recall.
    Page 1, “Abstract”
  2. The unique nature of the open extraction task has led us to develop O-CRF, an open extraction system that uses the power of graphical models to identify relations in text.
    Page 3, “Relation Extraction”
  3. The set of features used by O-CRF is largely similar to those used by O-NB and other state-of-the-art relation extraction systems. They include part-of-speech tags (predicted using a separately trained maximum-entropy model), regular expressions (e.g., detecting capitalization, punctuation, etc.), context words, and conjunctions of features occurring in adjacent positions within six words to the left and six words to the right of the current word.
    Page 4, “Relation Extraction”


graphical models

Appears in 3 sentences as: graphical models (3)
In The Tradeoffs Between Open and Traditional Relation Extraction
  1. The unique nature of the open extraction task has led us to develop O-CRF, an open extraction system that uses the power of graphical models to identify relations in text.
    Page 3, “Relation Extraction”
  2. Whereas classifiers predict the label of a single variable, graphical models model multiple, in-
    Page 3, “Relation Extraction”
  3. Conditional Random Fields (CRFs) (Lafferty et al., 2001), are undirected graphical models trained to maximize the conditional probability of a finite set of labels Y given a set of input observations X.
    Page 4, “Relation Extraction”
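
For reference, the conditional probability mentioned in excerpt 3 is usually written as follows for a linear-chain CRF. This is the standard textbook form, not a formula quoted from the paper: the feature functions f_k, the weights \lambda_k, the sequence length T, and the normalizer Z(X) are notation assumed here, while Y and X are the labels and observations named in the excerpt.

```latex
P(Y \mid X) = \frac{1}{Z(X)}
  \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, X, t) \Big),
\qquad
Z(X) = \sum_{Y'} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, X, t) \Big)
```

Training chooses the weights \lambda_k that maximize this probability over the labeled examples; Z(X) sums over all candidate label sequences Y' so that the probabilities for a given X sum to one.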
