Distant Supervision for Relation Extraction with Matrix Completion
Fan, Miao and Zhao, Deli and Zhou, Qiang and Liu, Zhiyuan and Zheng, Thomas Fang and Chang, Edward Y.

Article Structure

Abstract

The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features.

Introduction

Relation Extraction (RE) is the process of generating structured relation knowledge from unstructured natural language texts.

Related Work

The idea of distant supervision was first proposed in the field of bioinformatics (Craven and Kumlien, 1999).

Model

We apply a new technique in the field of applied mathematics, i.e., low-rank matrix completion with convex optimization.

Algorithm

The matrix rank minimization problem is NP-hard.

Experiments

To conduct reliable experiments, we tune and estimate the parameters for our approaches, DRMC-b and DRMC-l, and compare them with four other landmark methods (Mintz et al., 2009; Hoffmann et al., 2011; Surdeanu et al., 2012; Riedel et al., 2013) on two public datasets.

Discussion

We have mentioned that the basic alignment assumption of distant supervision (Mintz et al., 2009) tends to generate noisy (noisy features and

Conclusion and Future Work

In this paper, we contributed two noise-tolerant optimization models, DRMC-b and DRMC-l, for the distantly supervised relation extraction task from a novel perspective.

Topics

relation instances

Appears in 11 sentences as: relation instance (3) relation instances (8)
In Distant Supervision for Relation Extraction with Matrix Completion
  1. The relation instances are the triples related to President Barack Obama in the Freebase, and the relation mentions are some sentences describing him in the Wikipedia.
    Page 1, “Introduction”
  2. According to convention, we regard a structured triple r(e_i, e_j) as a relation instance, which is composed of a pair of entities <e_i, e_j> and a relation name r with respect to them.
    Page 1, “Introduction”
  3. Not all relation mentions express the corresponding relation instances.
    Page 2, “Introduction”
  4. For example, the second relation mention in Figure 1 does not explicitly describe any relation instance, so features extracted from this sentence can be noisy.
    Page 2, “Introduction”
  5. However, the incomplete knowledge base does not contain the corresponding relation instance (Senate-of (Barack Obama, U .
    Page 2, “Introduction”
  6. As we are stepping into the big data era, the explosion of unstructured Web texts stimulates us to build more powerful models that can automatically extract relation instances from large-scale online natural language corpora without hand-labeled annotation.
    Page 3, “Related Work”
  7. (2009) adopted Freebase (Bollacker et al., 2008; Bollacker et al., 2007), a large-scale crowdsourcing knowledge base online which contains billions of relation instances and thousands of relation names, to distantly supervise Wikipedia corpus.
    Page 3, “Related Work”
  8. Finally, we can obtain the Top-N predicted relation instances by ranking the values of P(r_j | p_i).
    Page 4, “Model”
  9. In each iteration, we obtain a recovered matrix and average the F1 scores from Top-5 to Top-all predicted relation instances to measure the performance.
    Page 6, “Experiments”
  10. In practical applications, we are also concerned with the precision on Top-N predicted relation instances.
    Page 7, “Experiments”
  11. Table 3: Precision of NFE-13, DRMC-b and DRMC-l on Top-100, Top-200 and Top-500 predicted relation instances.
    Page 8, “Experiments”
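
The Top-N ranking and Top-N precision mentioned in the sentences above can be illustrated with a minimal sketch. It assumes the model has already produced one probability per candidate relation instance; the function names, the entity pairs, the relation names and all numbers below are hypothetical and are not taken from the paper.

```python
def top_n(probabilities, n):
    """Return the n candidate relation instances with the highest probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def precision_at_n(probabilities, gold, n):
    """Fraction of the Top-N predicted relation instances found in the gold set."""
    predicted = top_n(probabilities, n)
    if not predicted:
        return 0.0
    hits = sum(1 for instance, _ in predicted if instance in gold)
    return hits / len(predicted)


# Hypothetical candidates: (relation name, entity 1, entity 2) -> probability.
PROBABILITIES = {
    ("rel_a", "e1", "e2"): 0.92,
    ("rel_b", "e1", "e2"): 0.81,
    ("rel_c", "e1", "e3"): 0.40,
    ("rel_a", "e2", "e3"): 0.15,
}
# Hypothetical gold relation instances used only to score the ranking.
GOLD = {("rel_a", "e1", "e2"), ("rel_b", "e1", "e2")}

best = top_n(PROBABILITIES, 3)
precision = precision_at_n(PROBABILITIES, GOLD, 3)
```

Here two of the three highest-ranked candidates are in the hypothetical gold set, so the Top-3 precision is 2/3.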

See all papers in Proc. ACL 2014 that mention relation instances.

relation extraction

Appears in 10 sentences as: Relation Extraction (1) Relation extraction (1) relation extraction (8)
In Distant Supervision for Relation Extraction with Matrix Completion
  1. The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features.
    Page 1, “Abstract”
  2. Relation Extraction (RE) is the process of generating structured relation knowledge from unstructured natural language texts.
    Page 1, “Introduction”
  3. Figure 1: Training corpus generated by the basic alignment assumption of distantly supervised relation extraction.
    Page 1, “Introduction”
  4. In essence, distantly supervised relation extraction is an incomplete multi-label classification task with sparse and noisy features.
    Page 2, “Introduction”
  5. To the best of our knowledge, we are the first to apply this technique to relation extraction with distant supervision.
    Page 2, “Introduction”
  6. It is the abbreviation for Distant supervision for Relation extraction with Matrix Completion.
    Page 3, “Related Work”
  7. (2012) proposed a novel approach to multi-instance multi-label learning for relation extraction, which jointly modeled all the sentences in texts and all labels in knowledge bases for a given entity pair.
    Page 3, “Related Work”
  8. Our models for relation extraction are based on the theoretic framework proposed by Goldberg et al.
    Page 3, “Model”
  9. In this paper, we contributed two noise-tolerant optimization models, DRMC-b and DRMC-l, for the distantly supervised relation extraction task from a novel perspective.
    Page 9, “Conclusion and Future Work”
  10. Our proposed models also leave open questions for the distantly supervised relation extraction task.
    Page 9, “Conclusion and Future Work”

knowledge bases

Appears in 7 sentences as: knowledge base (3) knowledge bases (4)
In Distant Supervision for Relation Extraction with Matrix Completion
  1. m knowledge bases 2.
    Page 1, “Introduction”
  2. The intuition behind the paradigm is that one can take advantage of several knowledge bases, such as WordNet, Freebase and YAGO, to automatically label free texts, like Wikipedia and New York Times corpora, based on some heuristic alignment assumptions.
    Page 1, “Introduction”
  3. >) are not only involved in the relation instances coming from knowledge bases (President-of(Barack Obama, U.S.) and Born-in (Barack Obama, U .
    Page 1, “Introduction”
  4. However, the incomplete knowledge base does not contain the corresponding relation instance (Senate-of (Barack Obama, U .
    Page 2, “Introduction”
  5. (2004) used WordNet as the knowledge base to discover more hypernym/hyponym relations between entities from news articles.
    Page 3, “Related Work”
  6. (2009) adopted Freebase (Bollacker et al., 2008; Bollacker et al., 2007), a large-scale crowdsourcing knowledge base online which contains billions of relation instances and thousands of relation names, to distantly supervise Wikipedia corpus.
    Page 3, “Related Work”
  7. (2012) proposed a novel approach to multi-instance multi-label learning for relation extraction, which jointly modeled all the sentences in texts and all labels in knowledge bases for a given entity pair.
    Page 3, “Related Work”

distant supervision

Appears in 6 sentences as: Distant supervision (1) distant supervision (5)
In Distant Supervision for Relation Extraction with Matrix Completion
  1. Therefore, the distant supervision paradigm may generate incomplete labeling corpora.
    Page 2, “Introduction”
  2. To the best of our knowledge, we are the first to apply this technique to relation extraction with distant supervision.
    Page 2, “Introduction”
  3. The idea of distant supervision was first proposed in the field of bioinformatics (Craven and Kumlien, 1999).
    Page 3, “Related Work”
  4. It is the abbreviation for Distant supervision for Relation extraction with Matrix Completion.
    Page 3, “Related Work”
  5. However, they did not address the data noise introduced by the basic assumption of distant supervision.
    Page 3, “Related Work”
  6. We have mentioned that the basic alignment assumption of distant supervision (Mintz et al., 2009) tends to generate noisy (noisy features and
    Page 8, “Discussion”
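
The basic alignment assumption referred to in these sentences can be sketched in a few lines. This is only an illustration under simple assumptions: entity matching is done by plain substring search, the function name `align` is hypothetical, and the two example sentences are invented for the sketch rather than copied from Figure 1 of the paper.

```python
from collections import defaultdict


def align(kb_triples, sentences):
    """Pair every sentence that mentions both entities of a knowledge-base
    triple with all relation names the knowledge base holds for that pair."""
    relations = defaultdict(set)
    for relation, entity_1, entity_2 in kb_triples:
        relations[(entity_1, entity_2)].add(relation)

    mentions = defaultdict(list)
    for sentence in sentences:
        for entity_1, entity_2 in relations:
            # Naive matching: both entity strings occur somewhere in the sentence.
            if entity_1 in sentence and entity_2 in sentence:
                mentions[(entity_1, entity_2)].append(sentence)

    return {pair: (sorted(relations[pair]), found) for pair, found in mentions.items()}


# Relation instances named in the source text.
KB = [
    ("President-of", "Barack Obama", "U.S."),
    ("Born-in", "Barack Obama", "U.S."),
]
# Hypothetical relation mentions, invented for this sketch.
SENTENCES = [
    "Barack Obama is the President of the U.S.",
    "Barack Obama spoke about U.S. trade policy.",
]

labeled = align(KB, SENTENCES)
```

Both sentences receive both labels, although the second one expresses neither relation. This is the kind of noisy labeling the alignment assumption produces.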

named entity

Appears in 3 sentences as: name entity (1) named entity (2)
In Distant Supervision for Relation Extraction with Matrix Completion
  1. Because we cannot tell in advance which kinds of features are effective, we have to use NLP toolkits, such as Stanford CoreNLP, to extract a variety of textual features, e.g., named entity tags, part-of-speech tags and lexicalized dependency paths.
    Page 2, “Introduction”
  2. Other studies (Takamatsu et al., 2012; Min et al., 2013; Zhang et al., 2013; Xu et al., 2013) addressed more specific issues, such as how to construct the negative class in learning or how to adopt more information, such as name entity tags, to improve the performance.
    Page 3, “Related Work”
  3. Three kinds of features, namely, lexical, syntactic and named entity tag features, were extracted from relation mentions.
    Page 6, “Experiments”

SVD

Appears in 3 sentences as: SVD (3)
In Distant Supervision for Relation Extraction with Matrix Completion
  1. We first perform the singular value decomposition (SVD) (Golub and Kahan, 1965) of A, and then cut down each singular value.
    Page 5, “Algorithm”
  2. Shrinkage step: $U \Sigma V^{\top} = \mathrm{SVD}(A)$, $Z = U \max(\Sigma - \tau_Z \mu, 0) V^{\top}$. end while end foreach
    Page 5, “Algorithm”
  3. Shrinkage step: $U \Sigma V^{\top} = \mathrm{SVD}(A)$, $Z = U \max(\Sigma - \tau_Z \mu, 0) V^{\top}$.
    Page 5, “Algorithm”
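
The shrinkage step quoted above, which decomposes A and then cuts down each singular value, can be sketched with NumPy. This is a minimal illustration, not the authors' implementation: the function name `shrink` and its single `threshold` argument are assumptions that stand in for whatever amount the algorithm subtracts from the singular values, and the example matrix is made up.

```python
import numpy as np


def shrink(A, threshold):
    """Soft-threshold the singular values of A and rebuild the matrix."""
    U, singular_values, Vt = np.linalg.svd(A, full_matrices=False)
    # Subtract the threshold and clip at zero, so small singular values vanish.
    shrunk = np.maximum(singular_values - threshold, 0.0)
    # Scale the columns of U by the shrunk values, then multiply by V^T.
    return (U * shrunk) @ Vt


# Hypothetical matrix with singular values 3 and 1.
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
Z = shrink(A, 1.5)
```

With a threshold of 1.5, the singular value 3 becomes 1.5 and the singular value 1 is clipped to 0, so the result has rank 1. Driving small singular values to zero is how this step favors low-rank solutions.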
