Encoding Relation Requirements for Relation Extraction via Joint Inference
Chen, Liwei and Feng, Yansong and Huang, Songfang and Qin, Yong and Zhao, Dongyan

Article Structure

Abstract

Most existing relation extraction models make predictions for each entity pair locally and individually, while ignoring implicit global clues available in the knowledge base, sometimes leading to conflicts among local predictions from different entity pairs.

Introduction

Identifying predefined kinds of relationships between pairs of entities is crucial for many knowledge base related applications (Suchanek et al., 2013).

Related Work

Since traditional supervised relation extraction methods (Soderland et al., 1995; Zhao and Grishman, 2005) require manual annotations and are often domain-specific, many recent efforts focus on semi-supervised or unsupervised methods (Banko et al., 2007; Fader et al., 2011).

The Framework

Our framework takes a set of entity pairs and their supporting sentences as its input.

Experiments

4.1 Datasets

Conclusions

In this paper, we make use of the global clues derived from the KB to help resolve disagreements among local relation predictions, thereby reducing incorrect predictions and improving the performance of relation extraction.

Topics

relation extraction

Appears in 16 sentences as: relation extraction (11) relation extractor (4) relation extractors (1) relations extracted (1)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. Most existing relation extraction models make predictions for each entity pair locally and individually, while ignoring implicit global clues available in the knowledge base, sometimes leading to conflicts among local predictions from different entity pairs.
    Page 1, “Abstract”
  2. Experimental results on three datasets, in both English and Chinese, show that our framework outperforms the state-of-the-art relation extraction models when such clues are applicable to the datasets.
    Page 1, “Abstract”
  3. In the literature, relation extraction (RE) is usually investigated in a classification style, where relations are simply treated as isolated class labels, while their definitions or background information are sometimes ignored.
    Page 1, “Introduction”
  4. On the other hand, most previous relation extractors process each entity pair (we will use entity pair and entity tuple interchangeably in the rest of the paper) locally and individually, i.e., the extractor makes decisions solely based on the sentences containing the current entity pair and ignores other related pairs, and therefore has difficulty capturing possible disagreements among different entity pairs.
    Page 1, “Introduction”
  5. In this paper, we will address how to derive and exploit two categories of these clues: the expected types and the cardinality requirements of a relation’s arguments, in the scenario of relation extraction.
    Page 1, “Introduction”
  6. Specifically, the joint inference framework operates on the output of a sentence level relation extractor as input, derives 5 types of constraints from an existing KB to implicitly capture
    Page 1, “Introduction”
  7. Since traditional supervised relation extraction methods (Soderland et al., 1995; Zhao and Grishman, 2005) require manual annotations and are often domain-specific, many recent efforts focus on semi-supervised or unsupervised methods (Banko et al., 2007; Fader et al., 2011).
    Page 2, “Related Work”
  8. To bridge the gaps between the relations extracted from open information extraction and the canonicalized relations in KBs, Yao et al.
    Page 2, “Related Work”
  9. Since we will focus on open-domain relation extraction, we still follow the distant supervision paradigm to collect our training data guided by a KB, and train the local extractor accordingly.
    Page 3, “The Framework”
  10. Traditionally, both lexical features and syntactic features are used in relation extraction.
    Page 3, “The Framework”
  11. In addition to lexical and syntactic features, we also use n-gram features to train our preliminary relation extraction model.
    Page 3, “The Framework”
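
The n-gram features mentioned above can be illustrated with a minimal sketch; the tokenized sentence, the `ng=` feature-name scheme, and the between-entities window are assumptions for illustration, not the paper's actual feature templates.

```python
# Sketch of word n-gram feature extraction between two entity mentions,
# a common feature family for sentence-level relation extraction.
# The "ng=" prefix and the toy sentence are illustrative assumptions.
def ngram_features(tokens, e1_idx, e2_idx, max_n=2):
    """Collect word n-grams (up to max_n) from the tokens strictly
    between the two entity mentions."""
    lo, hi = sorted((e1_idx, e2_idx))
    between = tokens[lo + 1:hi]
    feats = set()
    for n in range(1, max_n + 1):
        for i in range(len(between) - n + 1):
            feats.add("ng=" + "_".join(between[i:i + n]))
    return feats

tokens = ["Obama", "was", "born", "in", "Hawaii"]
feats = ngram_features(tokens, 0, 4)
# yields {'ng=was', 'ng=born', 'ng=in', 'ng=was_born', 'ng=born_in'}
```

Such sparse features would then be fed, together with lexical and syntactic features, into the confidence-scoring local extractor.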

See all papers in Proc. ACL 2014 that mention relation extraction.

See all papers in Proc. ACL that mention relation extraction.

ILP

Appears in 15 sentences as: ILP (16)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. We use integer linear programming (ILP) as the solver and evaluate our framework on English and Chinese datasets.
    Page 2, “Introduction”
  2. de Lacalle and Lapata (2013) encode general domain knowledge as FOL rules in a topic model while our instantiated constraints are directly operated in an ILP model.
    Page 2, “Related Work”
  3. In this paper, we propose to solve the problem by using an ILP tool, IBM ILOG CPLEX.
    Page 5, “The Framework”
  4. By adopting ILP, we can combine the local information, including MaxEnt confidence scores, with the implicit relation backgrounds that are embedded in the global consistencies of the entity tuples.
    Page 5, “The Framework”
  5. It tends to result in high recall, and its weakness of low precision is effectively remedied by the ILP model.
    Page 6, “Experiments”
  6. Our ILP model and its variants all outperform Mintz++ in precision on both datasets, indicating that our approach helps filter out incorrect predictions from the output of the MaxEnt model.
    Page 6, “Experiments”
  7. Compared to ILP-2cand and the original ILP, ILP-1cand leads to slightly lower precision but much lower recall, showing that selecting more candidates may help us collect more potentially correct predictions.
    Page 7, “Experiments”
  8. Comparing ILP-2cand and the original ILP, the latter hardly makes any improvement in precision, but is slightly higher in recall, indicating that using three candidates can still collect some more potentially correct predictions, although the number may be limited.
    Page 7, “Experiments”
  9. In order to study how our framework improves the performance on the DBpedia dataset and the Chinese dataset, we further investigate the number of incorrect predictions eliminated by ILP and the number of incorrect predictions corrected by ILP.
    Page 7, “Experiments”
  10. Table 1: Details of the improvements made by ILP in the DBpedia and Chinese datasets.
    Page 7, “Experiments”
  11. Predictions newly introduced by ILP, which were NA in Mintz++.
    Page 7, “Experiments”
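
As a concrete illustration of the joint-inference step, the tiny sketch below picks a 0/1 assignment over candidate predictions that maximizes total confidence subject to a uniqueness constraint. Exhaustive enumeration stands in for the ILP solver (the paper uses IBM ILOG CPLEX), and the candidates, scores, and relation names are made-up examples.

```python
from itertools import product

# Toy joint inference: select candidate predictions so that total
# confidence is maximized while a cardinality constraint holds
# (each subject may have at most one "born_in" object).
candidates = [  # (subject, relation, object, confidence) -- all invented
    ("Obama", "born_in", "Hawaii", 0.8),
    ("Obama", "born_in", "Kenya", 0.4),   # conflicts with the line above
    ("Obama", "employed_by", "US_gov", 0.6),
]
unique_subject_relations = {"born_in"}

def feasible(selection):
    chosen = [c for c, keep in zip(candidates, selection) if keep]
    for rel in unique_subject_relations:
        subjects = [s for s, r, o, _ in chosen if r == rel]
        if len(subjects) != len(set(subjects)):  # same subject picked twice
            return False
    return True

# Brute force over all 0/1 assignments; a real ILP solver does this
# efficiently via branch-and-cut.
best = max(
    (sel for sel in product([0, 1], repeat=len(candidates)) if feasible(sel)),
    key=lambda sel: sum(c[3] for c, keep in zip(candidates, sel) if keep),
)
selected = [c[:3] for c, keep in zip(candidates, best) if keep]
# keeps ("Obama", "born_in", "Hawaii") and drops the conflicting "Kenya"
```

The same objective (sum of tuple-level plus maximal mention-level confidences) and the five constraint types from the KB would simply add terms and linear inequalities to this formulation.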

knowledge base

Appears in 11 sentences as: knowledge base (6) knowledge bases (5)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. Most existing relation extraction models make predictions for each entity pair locally and individually, while ignoring implicit global clues available in the knowledge base, sometimes leading to conflicts among local predictions from different entity pairs.
    Page 1, “Abstract”
  2. And we find that the clues learnt automatically from existing knowledge bases perform comparably to those refined by humans.
    Page 1, “Abstract”
  3. Identifying predefined kinds of relationships between pairs of entities is crucial for many knowledge base related applications (Suchanek et al., 2013).
    Page 1, “Introduction”
  4. Many knowledge bases do not have a well-defined typing system, let alone fine-grained typing taxonomies with corresponding type recognizers, which are crucial to explicitly model the typing requirements for arguments of a relation, but rather expensive and time-consuming to collect.
    Page 1, “Introduction”
  5. We propose to perform joint inference upon multiple local predictions by leveraging implicit clues that are encoded with relation-specific requirements and can be learnt from existing knowledge bases.
    Page 1, “Introduction”
  6. Their approach only captures relation dependencies, while we learn implicit relation backgrounds from knowledge bases, including argument type and cardinality requirements.
    Page 2, “Related Work”
  7. The clues for detecting these inconsistencies can be learnt from a knowledge base.
    Page 3, “The Framework”
  8. As discussed earlier, we will exploit from the knowledge base two categories of clues that implicitly capture relations’ backgrounds: their expected argument types and argument cardinalities, based on which we can discover two categories of disagreements among the candidate predictions, summarized as argument type inconsistencies and violations of arguments’ uniqueness, which have been rarely considered before.
    Page 3, “The Framework”
  9. Most existing knowledge bases represent their knowledge facts in the form of <subject, relation, object> triples, which can be seen as relational facts between entity tuples.
    Page 4, “The Framework”
  10. It is rare to find inconsistencies among the triples in the knowledge base.
    Page 4, “The Framework”
  11. It uses Freebase as the knowledge base and the New York Times corpus as the text corpus, including about 60,000 entity tuples in the training set and about 90,000 entity tuples in the testing set.
    Page 6, “Experiments”
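
The argument-cardinality clues can be learnt by simple counting over KB triples: if (almost) every subject of a relation maps to a single object, the relation's object argument is expected to be unique. The toy triples and the 0.9 threshold below are illustrative assumptions, not the paper's exact procedure.

```python
from collections import defaultdict

# Learn a uniqueness (cardinality) clue from KB triples: a relation
# whose subjects (almost) always have exactly one object gets a
# unique-object constraint. Triples and threshold are invented examples.
triples = [
    ("Obama", "born_in", "Hawaii"),
    ("Merkel", "born_in", "Hamburg"),
    ("Obama", "child_of", "Ann_Dunham"),
    ("Obama", "child_of", "Barack_Sr"),
]

def unique_object_relations(triples, threshold=0.9):
    objects = defaultdict(set)              # (subject, relation) -> objects
    for s, r, o in triples:
        objects[(s, r)].add(o)
    per_rel = defaultdict(list)             # relation -> object counts
    for (s, r), objs in objects.items():
        per_rel[r].append(len(objs))
    return {r for r, counts in per_rel.items()
            if sum(c == 1 for c in counts) / len(counts) >= threshold}

unique_object_relations(triples)
# -> {"born_in"}: every subject has one birthplace; child_of allows many
```

A thresholded statistic like this, rather than a hard rule, tolerates the occasional noisy triple while still yielding a usable constraint for the ILP.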

confidence scores

Appears in 7 sentences as: confidence score (1) confidence scores (7)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. We first train a preliminary sentence level extractor which can output confidence scores for its predictions, e.g., a maximum entropy or logistic regression model, and use this local extractor to produce local predictions.
    Page 2, “The Framework”
  2. Now the confidence score of a relation r ∈ R_t being assigned to tuple t can be calculated as:
    Page 3, “The Framework”
  3. The first component is the sum of the original confidence scores of all the selected candidates, and the second one is the sum of the maximal mention-level confidence scores of all the selected candidates.
    Page 5, “The Framework”
  4. The latter is designed to encourage the model to select the candidates with higher individual mention-level confidence scores.
    Page 5, “The Framework”
  5. By adopting ILP, we can combine the local information, including MaxEnt confidence scores, with the implicit relation backgrounds that are embedded in the global consistencies of the entity tuples.
    Page 5, “The Framework”
  6. The preliminary relation extractor of our optimization framework is not limited to the MaxEnt extractor, and can take any sentence level relation extractor with confidence scores.
    Page 8, “Experiments”
  7. Furthermore, the confidence scores which MultiR outputs are not normalized to the same scale, which makes it difficult to set up a confidence threshold to select the candidates.
    Page 8, “Experiments”
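
The paper's exact tuple-level formula is not reproduced in the excerpts above, but two common ways to turn mention-level confidences into a tuple-level score, max and noisy-or, can be sketched as follows; this is a plausible reading for illustration, not necessarily the paper's aggregation.

```python
# Two common aggregations of per-sentence (mention-level) extractor
# confidences into one tuple-level score. Illustrative sketch only.
def max_score(mention_scores):
    """Tuple-level confidence = best single supporting mention."""
    return max(mention_scores)

def noisy_or(mention_scores):
    """The tuple holds if at least one mention expresses the relation,
    assuming independent mentions: 1 - prod(1 - s_i)."""
    p_none = 1.0
    for s in mention_scores:
        p_none *= 1.0 - s
    return 1.0 - p_none

scores = [0.3, 0.5]       # confidences from two supporting sentences
m = max_score(scores)     # 0.5
n = noisy_or(scores)      # 1 - 0.7 * 0.5 = 0.65
```

Either score is usable as the coefficient of a candidate's 0/1 variable in the ILP objective, provided all extractors are normalized to the same scale (the issue noted above for MultiR).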

MaxEnt

Appears in 7 sentences as: MaxEnt (7)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. By adopting ILP, we can combine the local information, including MaxEnt confidence scores, with the implicit relation backgrounds that are embedded in the global consistencies of the entity tuples.
    Page 5, “The Framework”
  2. Our ILP model and its variants all outperform Mintz++ in precision on both datasets, indicating that our approach helps filter out incorrect predictions from the output of the MaxEnt model.
    Page 6, “Experiments”
  3. However, on Riedel’s dataset, Mintz++, the MaxEnt relation extractor, does not perform well, and our framework cannot improve its performance.
    Page 6, “Experiments”
  4. Hence, our framework does not perform well due to the poor performance of the MaxEnt extractor and the lack of clues.
    Page 7, “Experiments”
  5. The preliminary relation extractor of our optimization framework is not limited to the MaxEnt extractor, and can take any sentence level relation extractor with confidence scores.
    Page 8, “Experiments”
  6. The results are not as high as when we use MaxEnt as the preliminary extractor.
    Page 8, “Experiments”
  7. Furthermore, our framework generalizes to other local sentence-level extractors in addition to the MaxEnt model.
    Page 9, “Conclusions”

distant supervision

Appears in 4 sentences as: Distant supervision (1) distant supervision (3)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. Distant supervision (DS) is a semi-supervised RE framework and has attracted much attention (Bunescu, 2007; Mintz et al., 2009; Yao et al., 2010; Surdeanu et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012).
    Page 2, “Related Work”
  2. (2013) utilize relation cardinality to create negative samples for distant supervision while we use both implicit type clues and relation cardinality expectations to discover possible inconsistencies among local predictions.
    Page 2, “Related Work”
  3. Since we will focus on open-domain relation extraction, we still follow the distant supervision paradigm to collect our training data guided by a KB, and train the local extractor accordingly.
    Page 3, “The Framework”
  4. We also use two distant supervision approaches for the comparison.
    Page 6, “Experiments”
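
The distant supervision paradigm used to collect training data can be shown in miniature: any sentence mentioning both entities of a KB triple is labeled with the triple's relation, and unmatched pairs become NA negatives. All names and sentences below are toy examples.

```python
# Minimal distant-supervision labeling: align KB triples with sentences
# that contain both entities. Entities and sentences are invented.
kb = {("Obama", "Hawaii"): "born_in"}
sentences = [  # (entity1, entity2, sentence text)
    ("Obama", "Hawaii", "Obama was born in Hawaii ."),
    ("Obama", "Hawaii", "Obama visited Hawaii last week ."),  # noisy match
    ("Obama", "Kenya", "Obama traveled to Kenya ."),
]

def label(sentences, kb):
    labeled = []
    for e1, e2, sent in sentences:
        rel = kb.get((e1, e2), "NA")   # unmatched pairs become negatives
        labeled.append((sent, rel))
    return labeled

labeled = label(sentences, kb)
```

The second sentence illustrates the well-known DS noise problem: it mentions both entities but does not express born_in, which is exactly the kind of local error the global ILP constraints are meant to catch.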

optimization problem

Appears in 3 sentences as: optimization problem (3)
In Encoding Relation Requirements for Relation Extraction via Joint Inference
  1. We formalize this procedure as a constrained optimization problem, which can be solved by many optimization frameworks.
    Page 2, “Introduction”
  2. This is an NP-hard optimization problem.
    Page 5, “The Framework”
  3. After the optimization problem is solved, we will obtain a list of selected candidate relations for each tuple, which will be our final output.
    Page 5, “The Framework”
