Unsupervised Relation Discovery with Sense Disambiguation
Yao, Limin and Riedel, Sebastian and McCallum, Andrew

Article Structure

Abstract

To discover relation types from text, most methods cluster shallow or syntactic patterns of relation mentions, but consider only one possible sense per pattern.

Introduction

Relation extraction (RE) is the task of determining semantic relations between entities mentioned in text.

Our Approach

We induce pattern senses by clustering the entity pairs associated with a pattern, and discover semantic relations by clustering these sense clusters.
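
The approach is thus a two-stage clustering pipeline over (pattern, entity pair) tuples. Below is a minimal Python sketch of that data flow; induce_senses and merge_senses are illustrative placeholders for the topic-model and hierarchical-clustering stages detailed under Topics below, not the paper's implementation.

    from collections import defaultdict

    # Each relation mention is a (pattern, entity_pair) tuple, e.g.
    # ("X beat Y", ("Federer", "Nadal")).
    mentions = [
        ("X beat Y",     ("Federer", "Nadal")),
        ("X beat Y",     ("Police", "Protester")),
        ("X defeated Y", ("Federer", "Nadal")),
    ]

    # Group the entity pairs observed with each pattern.
    pairs_by_pattern = defaultdict(list)
    for pattern, pair in mentions:
        pairs_by_pattern[pattern].append(pair)

    def induce_senses(pattern, pairs):
        # Placeholder for the sense-induction stage (a topic model in the
        # paper): partition a pattern's entity pairs into sense clusters.
        return [(pattern, 0, pairs)]          # trivially: one sense per pattern

    def merge_senses(sense_clusters):
        # Placeholder for hierarchical agglomerative clustering of
        # sense clusters into semantic relations.
        return [[c] for c in sense_clusters]  # trivially: no merging

    sense_clusters = [s for p, pairs in pairs_by_pattern.items()
                        for s in induce_senses(p, pairs)]
    relations = merge_senses(sense_clusters)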

Experiments

We carry out experiments on New York Times articles from years 2000 to 2007 (Sandhaus, 2008).

Evaluations

4.1 Automatic Evaluation against Freebase

Related Work

There has been considerable interest in unsupervised relation discovery, including clustering approaches, generative models, and other methods.

Conclusion

We explore senses of paths to discover semantic relations.

Topics

entity type

Appears in 21 sentences as: entity type (12) entity types (10)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. fine-grained entity types of two arguments, to handle polysemy.
    Page 2, “Introduction”
  2. However, such fine-grained entity types come at a high cost.
    Page 2, “Introduction”
  3. It is difficult to discover a high-quality set of fine-grained entity types due to unknown criteria for developing such a set.
    Page 2, “Introduction”
  4. In particular, the optimal granularity of entity types depends on the particular pattern we consider.
    Page 2, “Introduction”
  5. In addition, there are senses that just cannot be determined by entity types alone: Take the meaning of “A beat B” where A and B are both persons; this could mean A physically beats B, or it could mean that A defeated B in a competition.
    Page 2, “Introduction”
  6. In this paper we address the problem of polysemy, while we circumvent the problem of finding fine-grained entity types.
    Page 2, “Introduction”
  7. Experimental results show that our approach improves over the baselines, and that using global features achieves better performance than using entity type based features.
    Page 2, “Introduction”
  8. Local+Type: This system adds entity type features to the previous system.
    Page 5, “Experiments”
  9. This allows us to compare performance of using global features against entity type features.
    Page 5, “Experiments”
  10. To determine entity types, we link named entities to Wikipedia pages using the Wikifier (Ratinov et al., 2011) package and extract categories from the Wikipedia page.
    Page 5, “Experiments”
  11. As we argued in Section 1, it is difficult to determine the right granularity of the entity types to use.
    Page 6, “Experiments”

semantic relations

Appears in 13 sentences as: semantic relation (4) semantic relations (9)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. We merge these sense clusters into semantic relations using hierarchical agglomerative clustering.
    Page 1, “Abstract”
  2. Relation extraction (RE) is the task of determining semantic relations between entities mentioned in text.
    Page 1, “Introduction”
  3. We induce pattern senses by clustering the entity pairs associated with a pattern, and discover semantic relations by clustering these sense clusters.
    Page 2, “Our Approach”
  4. We take each sense cluster of a pattern as an atomic cluster, and use hierarchical agglomerative clustering to organize them into semantic relations.
    Page 2, “Our Approach”
  5. Therefore, a semantic relation comprises a set of sense clusters of patterns.
    Page 2, “Our Approach”
  6. Note that one pattern can fall into different semantic relations when it has multiple senses.
    Page 2, “Our Approach”
  7. After discovering sense clusters of paths, we employ hierarchical agglomerative clustering (HAC) to discover semantic relations from these sense clusters.
    Page 3, “Our Approach”
  8. Our approach produces sense clusters for each path and semantic relation clusters of the whole data.
    Page 3, “Our Approach”
  9. DIRT calculates distributional similarities between different paths to find paths which bear the same semantic relation.
    Page 5, “Experiments”
  10. Pairwise metrics measure how often two tuples which are clustered in one semantic relation are labeled with the same Freebase label.
    Page 6, “Evaluations”
  11. Both use distributional similarity to find patterns representing similar semantic relations.
    Page 8, “Related Work”
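
Excerpts 4 and 7 above describe merging the per-pattern sense clusters into relations with hierarchical agglomerative clustering (HAC). Below is a minimal sketch using SciPy, assuming each sense cluster is summarized by a feature count vector (e.g., over argument words and entity pairs); the vectors, linkage, and threshold are illustrative, not the paper's settings.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import pdist

    # Each row is a feature vector for one sense cluster of a path; toy values.
    sense_vectors = np.array([
        [3, 0, 1, 0],   # "X beat Y" (sports sense)
        [0, 4, 0, 1],   # "X beat Y" (violence sense)
        [2, 0, 2, 0],   # "X defeated Y"
    ], dtype=float)

    # Average-linkage HAC on cosine distances between sense clusters.
    dist = pdist(sense_vectors, metric="cosine")
    tree = linkage(dist, method="average")

    # Cut the dendrogram at a distance threshold to obtain relations; sense
    # clusters of the same path may land in different relations.
    relation_ids = fcluster(tree, t=0.5, criterion="distance")
    print(relation_ids)   # e.g. [1 2 1]: senses 0 and 2 form one relation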

generative model

Appears in 10 sentences as: generative model (6) Generative models (1) generative models (4)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. We compare our approach with several baseline systems, including a generative model approach, a clustering method that does not disambiguate between senses, and our approach with different features.
    Page 2, “Introduction”
  2. The two theme features are extracted from generative models, and each is a topic number.
    Page 3, “Our Approach”
  3. We compare our approach against several baseline systems, including a generative model approach and variations of our own approach.
    Page 5, “Experiments”
  4. Rel-LDA: Generative models have been successfully applied to unsupervised relation extraction (Rink and Harabagiu, 2011; Yao et al., 2011).
    Page 5, “Experiments”
  5. The generative model approach with 300 topics achieves similar precision to the hierarchical clustering approach.
    Page 6, “Evaluations”
  6. With more topics, the precision increases; however, the recall of the generative model is much lower than that of other approaches.
    Page 6, “Evaluations”
  7. The generative model approach produces more coherent clusters when the number of relation topics increases.
    Page 7, “Evaluations”
  8. There has been considerable interest in unsupervised relation discovery, including clustering approaches, generative models, and other methods.
    Page 8, “Related Work”
  9. Our approach employs generative models for path sense disambiguation, which achieves better performance than directly applying generative models to unsupervised relation discovery.
    Page 8, “Related Work”
  10. Experimental results show our approach discovers precise relation clusters and outperforms a generative model approach and a clustering method which does not address sense disambiguation.
    Page 8, “Conclusion”

sense disambiguation

Appears in 10 sentences as: Sense Disambiguation (1) sense disambiguation (9)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. Experimental results show that our proposed approach discovers dramatically more accurate clusters than models without sense disambiguation, and that incorporating global features, such as the document theme, is crucial.
    Page 1, “Abstract”
  2. 2.1 Sense Disambiguation
    Page 2, “Our Approach”
  3. For the sense disambiguation model, we set the number of topics (senses) to 50.
    Page 5, “Experiments”
  4. One sense per path (HAC): This system uses only hierarchical clustering to discover relations, skipping sense disambiguation.
    Page 5, “Experiments”
  5. Without using sense disambiguation, the performance of hierarchical clustering decreases significantly, losing 17% in precision in the pairwise measure and 15% in terms of B3.
    Page 6, “Evaluations”
  6. The clusters produced by HAC (without sense disambiguation) are coherent if all the paths in one relation take a particular sense.
    Page 7, “Evaluations”
  7. Selectional preferences discovery (Ritter et al., 2010; Seaghdha, 2010) can help path sense disambiguation; however, we show that using global features performs better than using entity type features.
    Page 8, “Related Work”
  8. And our sense disambiguation model is inspired by this work.
    Page 8, “Related Work”
  9. Our approach employs generative models for path sense disambiguation, which achieves better performance than directly applying generative models to unsupervised relation discovery.
    Page 8, “Related Work”
  10. Experimental results show our approach discovers precise relation clusters and outperforms a generative model approach and a clustering method which does not address sense disambiguation.
    Page 8, “Conclusion”
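
Excerpt 5 above reports pairwise and B3 scores against Freebase. Below is a minimal sketch of the pairwise metric under its standard definition (the paper's exact handling of tuples without Freebase labels may differ): every pair of tuples placed in the same predicted relation counts as correct if the two tuples share a gold label.

    from itertools import combinations

    def pairwise_precision_recall(predicted, gold):
        """predicted: dict tuple_id -> predicted relation id
           gold:      dict tuple_id -> gold (Freebase) label
           Standard pairwise clustering metric; assumes each evaluated
           tuple has exactly one gold label."""
        ids = [t for t in predicted if t in gold]
        same_pred = {(a, b) for a, b in combinations(ids, 2)
                     if predicted[a] == predicted[b]}
        same_gold = {(a, b) for a, b in combinations(ids, 2)
                     if gold[a] == gold[b]}
        if not same_pred or not same_gold:
            return 0.0, 0.0
        tp = len(same_pred & same_gold)
        return tp / len(same_pred), tp / len(same_gold)

    # Toy example: three tuples, two predicted relations.
    pred = {"t1": 0, "t2": 0, "t3": 1}
    gold = {"t1": "sports.defeat", "t2": "sports.defeat", "t3": "crime.assault"}
    print(pairwise_precision_recall(pred, gold))   # (1.0, 1.0)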

topic model

Appears in 8 sentences as: topic model (5) topic models (3)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. In particular, we employ a topic model to partition entity pairs associated with patterns into sense clusters using local and global features.
    Page 1, “Abstract”
  2. We represent each pattern as a list of entity pairs and employ a topic model to partition them into different sense clusters using local and global features.
    Page 2, “Our Approach”
  3. We employ a topic model to discover senses for each path.
    Page 3, “Our Approach”
  4. It does not employ global topic model features extracted from documents and sentences.
    Page 5, “Experiments”
  5. Local: This system uses our approach (both sense clustering with topic models and hierarchical clustering), but without global features.
    Page 5, “Experiments”
  6. Hachey (2009) uses topic models to perform dimensionality reduction on features when clustering entity pairs into relations.
    Page 8, “Related Work”
  7. For example, varieties of topic models are employed for both open domain (Yao et al., 2011) and in-domain relation discovery (Chen et al., 2011; Rink and Harabagiu, 2011).
    Page 8, “Related Work”
  8. We employ a topic model to partition entity pairs of a path into different sense clusters and use hierarchical agglomerative clustering to merge senses into semantic relations.
    Page 8, “Conclusion”
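
Excerpts 1-3 and 8 above describe partitioning the entity pairs of a single path into sense clusters with a topic model over local and global features. Below is a minimal sketch using standard LDA from gensim as a stand-in for the paper's variant (which emits all features of an entity pair from one hidden sense variable); the feature names are invented for illustration.

    from gensim import corpora, models

    # For one path, each "document" is the feature bag of one entity pair
    # occurrence: argument words plus global theme features.
    entity_pair_features = [
        ["federer", "nadal", "theme:sports", "sent:final"],
        ["police", "protester", "theme:crime", "sent:street"],
        ["djokovic", "murray", "theme:sports", "sent:final"],
    ]

    dictionary = corpora.Dictionary(entity_pair_features)
    corpus = [dictionary.doc2bow(doc) for doc in entity_pair_features]

    # Toy setting with two senses; the paper sets the number of senses to 50.
    lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2,
                          passes=50, random_state=0)

    # Assign each entity pair to its most probable sense (topic).
    senses = [max(lda[bow], key=lambda t: t[1])[0] for bow in corpus]
    print(senses)   # e.g. [0, 1, 0]: two senses of the same path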

dependency paths

Appears in 7 sentences as: dependency path (1) dependency paths (6)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. Such patterns could be sequences of lemmas and part-of-speech tags, or lexicalized dependency paths.
    Page 1, “Introduction”
  2. Whether we use sequences or dependency paths, we will encounter the problem of polysemy.
    Page 1, “Introduction”
  3. We perform experiments on New York Times articles and consider lexicalized dependency paths as patterns in our data.
    Page 2, “Introduction”
  4. We extract dependency paths for each pair of named entities in one sentence.
    Page 3, “Experiments”
  5. for words on the dependency paths.
    Page 4, “Experiments”
  6. Each entity pair and the dependency path which connects them form a tuple.
    Page 4, “Experiments”
  7. Both DIRT and our approach represent dependency paths using their arguments.
    Page 8, “Related Work”
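
Excerpt 4 above mentions extracting the dependency path between each pair of named entities in a sentence. Below is a minimal sketch using spaCy and NetworkX (the paper itself cites Nivre et al., 2004, for parsing); the path format and the en_core_web_sm model are assumptions for illustration.

    import spacy
    import networkx as nx

    # Assumes the small English spaCy model is installed:
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")

    def dependency_path(sentence, e1, e2):
        """Lexicalized shortest path between two entity tokens in the
        undirected dependency graph (direction markers omitted)."""
        doc = nlp(sentence)
        graph = nx.Graph()
        for tok in doc:
            for child in tok.children:
                graph.add_edge(tok.i, child.i)
        index = {tok.text: tok.i for tok in doc}
        path = nx.shortest_path(graph, index[e1], index[e2])
        return "->".join(doc[i].lemma_ for i in path)

    print(dependency_path("Federer beat Nadal in the final.", "Federer", "Nadal"))
    # e.g. Federer->beat->Nadal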

fine-grained

Appears in 6 sentences as: fine-grained (6)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. fine-grained entity types of two arguments, to handle polysemy.
    Page 2, “Introduction”
  2. It is difficult to discover a high-quality set of fine-grained entity types due to unknown criteria for developing such a set.
    Page 2, “Introduction”
  3. In this paper we address the problem of polysemy, while we circumvent the problem of finding fine-grained entity types.
    Page 2, “Introduction”
  4. Since our system predicts clusters that are more fine-grained than the Freebase relations, the measure of recall is underestimated.
    Page 6, “Evaluations”
  5. Since our systems predict more fine-grained clusters than
    Page 7, “Evaluations”
  6. They cluster arguments to fine-grained entity types and rank the associations of a relation with these entity types to discover selectional preferences.
    Page 8, “Related Work”

LDA

Appears in 6 sentences as: LDA (6)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. In our experiments, we use the meta-descriptors of a document as side information and train a standard LDA model to find the theme of a document.
    Page 3, “Our Approach”
  2. This model is a minor variation on standard LDA; the difference is that instead of drawing a single observation from a hidden topic variable, we draw multiple observations from it.
    Page 3, “Our Approach”
  3. To this end we interpret the descriptors as words in documents, and train a standard LDA model based on these documents.
    Page 5, “Experiments”
  4. We also train a standard LDA model to obtain the theme of a sentence.
    Page 5, “Experiments”
  5. The LDA model assigns each word to a topic.
    Page 5, “Experiments”
  6. We compare against one such model: an extension to standard LDA that falls into the framework presented by Yao et al.
    Page 5, “Experiments”
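
Excerpt 2 above is the key difference from standard LDA: one hidden topic (sense) variable emits all observed features of an entity pair rather than a single word. Below is a minimal generative sketch with NumPy; the sense count, vocabulary, and symmetric Dirichlet priors are illustrative assumptions, not the paper's hyperparameters.

    import numpy as np

    rng = np.random.default_rng(0)
    n_senses = 3
    vocab = ["federer", "nadal", "police", "theme:sports", "theme:crime"]

    # Per-path sense proportions and per-sense feature distributions,
    # drawn from symmetric Dirichlet priors (toy hyperparameters).
    theta = rng.dirichlet(np.ones(n_senses))                  # senses of one path
    phi = rng.dirichlet(np.ones(len(vocab)), size=n_senses)   # feature dists

    def generate_entity_pair(n_features=4):
        z = rng.choice(n_senses, p=theta)                     # ONE hidden sense ...
        feats = rng.choice(vocab, size=n_features, p=phi[z])  # ... emits ALL features
        return z, list(feats)

    print(generate_entity_pair())   # e.g. (1, ['police', 'theme:crime', ...])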

named entities

Appears in 4 sentences as: named entities (4)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. Many relation discovery methods rely exclusively on the notion of either shallow or syntactic patterns that appear between two named entities (Bollegala et al., 2010; Lin and Pantel, 2001).
    Page 1, “Introduction”
  2. We extract dependency paths for each pair of named entities in one sentence.
    Page 3, “Experiments”
  3. To determine entity types, we link named entities to Wikipedia pages using the Wikifier (Ratinov et al., 2011) package and extract categories from the Wikipedia page.
    Page 5, “Experiments”
  4. (2004) cluster pairs of named entities according to the similarity of context words intervening between them.
    Page 8, “Related Work”

relation extraction

Appears in 4 sentences as: Relation extraction (1) relation extraction (2) relation extractor (1)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. Relation extraction (RE) is the task of determining semantic relations between entities mentioned in text.
    Page 1, “Introduction”
  2. Here, the relation extractor simultaneously discovers facts expressed in natural language, and the ontology into which they are assigned.
    Page 1, “Introduction”
  3. Rel-LDA: Generative models have been successfully applied to unsupervised relation extraction (Rink and Harabagiu, 2011; Yao et al., 2011).
    Page 5, “Experiments”
  4. Many generative probabilistic models have been applied to relation extraction.
    Page 8, “Related Work”

relation instances

Appears in 4 sentences as: relation instances (4)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. For automatic evaluation, we use relation instances in Freebase as ground truth, and employ two clustering
    Page 2, “Introduction”
  2. Many users also contribute to Freebase by annotating relation instances.
    Page 6, “Evaluations”
  3. One reason is the following: some relation instances should have multiple labels but they have only one label in Freebase.
    Page 6, “Evaluations”
  4. They employ a self-learner to extract relation instances, but no attempt is made to cluster instances into relations.
    Page 8, “Related Work”

natural language

Appears in 3 sentences as: natural language (3)
In Unsupervised Relation Discovery with Sense Disambiguation
  1. Here, the relation extractor simultaneously discovers facts expressed in natural language, and the ontology into which they are assigned.
    Page 1, “Introduction”
  2. Following (Yao et al., 2011), we filter out noisy documents and use natural language packages to annotate the documents, including NER tagging (Finkel et al., 2005) and dependency parsing (Nivre et al., 2004).
    Page 3, “Experiments”
  3. Three graduate students in natural language processing annotate intruding paths.
    Page 6, “Evaluations”
