A Discriminative Hierarchical Model for Fast Coreference at Large Scale
Wick, Michael and Singh, Sameer and McCallum, Andrew

Article Structure

Abstract

Methods that measure compatibility between mention pairs are currently the dominant approach to coreference.

Introduction

Coreference resolution, the task of clustering mentions into partitions representing their underlying real-world entities, is fundamental for high-level information extraction and data integration, including semantic search, question answering, and knowledge base construction.

Background: Pairwise Coreference

Coreference is the problem of clustering mentions such that mentions in the same set refer to the same real-world entity; it is also known as entity disambiguation, record linkage, and de-duplication.

Hierarchical Coreference

Instead of only capturing a single coreference clustering between mention pairs, we can imagine multiple levels of coreference decisions over different

Experiments: Author Coreference

Author coreference is a tremendously important task, enabling improved search and mining of scientific papers by researchers, funding agencies, and governments.

Related Work

Singh et al.

Conclusion

In this paper we present a new hierarchical model for large scale coreference and demonstrate it on the problem of author disambiguation.

Acknowledgments

We would like to thank Veselin Stoyanov for his feedback.

Topics

coreference

Appears in 58 sentences as: coref (1) Coreference (4) coreference (59) coreferent (5)
In A Discriminative Hierarchical Model for Fast Coreference at Large Scale
  1. Methods that measure compatibility between mention pairs are currently the dominant approach to coreference.
    Page 1, “Abstract”
  2. These trees succinctly summarize the mentions, providing a highly compact, information-rich structure for reasoning about entities and coreference uncertainty at massive scales.
    Page 1, “Abstract”
  3. We demonstrate that the hierarchical model is several orders of magnitude faster than pairwise, allowing us to perform coreference on six million author mentions in under four hours on a single CPU.
    Page 1, “Abstract”
  4. Coreference resolution, the task of clustering mentions into partitions representing their underlying real-world entities, is fundamental for high-level information extraction and data integration, including semantic search, question answering, and knowledge base construction.
    Page 1, “Introduction”
  5. For example, coreference is vital for determining author publication lists in bibliographic knowledge bases such as CiteSeer and Google Scholar, where the repository must know if the “R.
    Page 1, “Introduction”
  6. Over the years, various machine learning techniques have been applied to different variations of the coreference problem.
    Page 1, “Introduction”
  7. A commonality in many of these approaches is that they model the problem of entity coreference as a collection of decisions between mention pairs (Bagga and Baldwin, 1999; Soon et al., 2001; McCallum and Wellner, 2004; Singla and Domingos, 2005; Bengtson and Roth, 2008).
    Page 1, “Introduction”
  8. That is, coreference is solved by answering a quadratic number of questions of the form “does mention A refer to the same entity as mention B?” with a compatibility function that indicates how likely A and B are coreferent.
    Page 1, “Introduction”
  9. Recent work has shown that these entity-level properties allow systems to correct coreference errors made from myopic pairwise decisions (Ng, 2005; Culotta et al., 2007; Yang et al., 2008; Rahman and Ng, 2009; Wick et al., 2009), and can even provide a strong signal for unsupervised coreference (Bhattacharya and Getoor, 2006; Haghighi and Klein, 2007; Haghighi and Klein, 2010).
    Page 1, “Introduction”
  10. A second problem, that has received significantly less attention in the literature, is that the pairwise coreference models scale poorly to large collections of mentions especially when the expected
    Page 1, “Introduction”
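
The excerpts above characterize pairwise coreference as a quadratic number of “does A refer to the same entity as B?” decisions combined by transitive closure. A minimal sketch of that formulation, with a toy token-overlap compatibility function and union-find for the closure (the function, threshold, and example mentions are illustrative, not from the paper):

```python
from itertools import combinations

def compatibility(m1, m2):
    """Toy compatibility: Jaccard overlap of token sets (illustrative only)."""
    t1, t2 = set(m1.lower().split()), set(m2.lower().split())
    return len(t1 & t2) / len(t1 | t2)

def pairwise_coref(mentions, threshold=0.5):
    """Answer the pairwise question for every mention pair, then take the
    transitive closure of the positive decisions to form entity clusters."""
    parent = list(range(len(mentions)))

    def find(i):  # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):  # O(n^2) decisions
        if compatibility(mentions[i], mentions[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(mentions)):
        clusters.setdefault(find(i), []).append(mentions[i])
    return list(clusters.values())

# The two identical "R. Power" strings cluster; the others stay singletons.
print(pairwise_coref(["R. Power", "Robert Power", "R. Power", "David Power"]))
```

The quadratic loop is exactly the scaling bottleneck excerpt 10 points at: every new mention must be compared against all existing ones.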

See all papers in Proc. ACL 2012 that mention coreference.


CRF

Appears in 4 sentences as: CRF (4)
In A Discriminative Hierarchical Model for Fast Coreference at Large Scale
  1. For higher accuracy, a graphical model such as a conditional random field (CRF) is constructed from the compatibility functions to jointly reason about the pairwise decisions (McCallum and Wellner, 2004).
    Page 3, “Background: Pairwise Coreference”
  2. We now describe the pairwise CRF for coreference as a factor graph.
    Page 3, “Background: Pairwise Coreference”
  3. Given the pairwise CRF, the problem of coreference is then solved by searching for the setting of the coreference decision variables that has the highest probability according to Equation 1 subject to the
    Page 3, “Background: Pairwise Coreference”
  4. Indeed, inference in the hierarchy is orders of magnitude faster than a pairwise CRF, allowing us to infer accurate coreference on
    Page 8, “Conclusion”
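
In the pairwise CRF described above, the compatibility functions become factors and coreference is found by searching for the highest-scoring setting of the decision variables. A simplified sketch of such a model score, where a clustering is scored by summing toy log-space factors over within-cluster pairs (the factor function and examples are stand-ins for the paper's learned feature functions, and factors over non-coreferent pairs are omitted for brevity):

```python
from itertools import combinations

def log_factor(m1, m2):
    """Toy pairwise factor in log space: positive when mentions share a
    token, negative otherwise (stand-in for a learned feature function)."""
    shared = set(m1.split()) & set(m2.split())
    return 1.0 if shared else -1.0

def model_score(clusters):
    """Unnormalized log-score of a clustering under a simplified pairwise
    CRF: sum the log factors over all pairs of mentions placed in the
    same entity cluster."""
    return sum(log_factor(a, b)
               for cluster in clusters
               for a, b in combinations(cluster, 2))

good = [["R. Power", "Robert Power"], ["D. Smith"]]
bad = [["R. Power", "D. Smith"], ["Robert Power"]]
print(model_score(good), model_score(bad))  # the coherent clustering scores higher
```

MAP inference then amounts to searching over clusterings for the one maximizing this score, subject to transitivity of the decision variables.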


graphical model

Appears in 4 sentences as: graphical model (3) graphical models (1)
In A Discriminative Hierarchical Model for Fast Coreference at Large Scale
  1. For higher accuracy, a graphical model such as a conditional random field (CRF) is constructed from the compatibility functions to jointly reason about the pairwise decisions (McCallum and Wellner, 2004).
    Page 3, “Background: Pairwise Coreference”
  2. The pairwise compatibility functions become the factors in the graphical model .
    Page 3, “Background: Pairwise Coreference”
  3. Figure 2: Pairwise model on six mentions: Open circles are the binary coreference decision variables, shaded circles are the observed mentions, and the black boxes are the factors of the graphical model that encode the pairwise compatibility functions.
    Page 4, “Background: Pairwise Coreference”
  4. Techniques such as lifted inference (Singla and Domingos, 2008) for graphical models exploit redundancy in the data, but typically do not achieve any significant compression on coreference data be-
    Page 7, “Related Work”


subtrees

Appears in 4 sentences as: subtrees (4)
In A Discriminative Hierarchical Model for Fast Coreference at Large Scale
  1. each step of inference is computationally efficient because evaluating the cost of attaching (or detaching) subtrees requires computing just a single compatibility function (as seen in Figure 1).
    Page 3, “Introduction”
  2. Finally, if memory is limited, redundant mentions can be pruned by replacing subtrees with their roots.
    Page 3, “Introduction”
  3. More specifically, for each MH step, we first randomly select two subtrees headed by node-
    Page 5, “Hierarchical Coreference”
  4. Otherwise, ri and rj are subtrees in the same entity tree, then the following proposals are used instead: • Split Right - Make the subtree rj the root of a new entity by detaching it from its parent • Collapse - If ri has a parent, then move ri’s children to ri’s parent and then delete ri.
    Page 6, “Hierarchical Coreference”
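
The Metropolis-Hastings proposals quoted above (splitting a subtree off as a new entity root, or collapsing a node into its parent) can be sketched with a minimal tree structure. The `Node` class and method names here are illustrative, not the paper's implementation:

```python
class Node:
    """A node in an entity tree; leaves hold mentions, internal nodes
    summarize their subtrees."""
    def __init__(self, mention=None):
        self.mention = mention
        self.parent = None
        self.children = []

    def attach(self, child):
        child.parent = self
        self.children.append(child)

    def detach(self):
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

def split_right(rj):
    """Split Right proposal: make subtree rj the root of a new entity
    by detaching it from its parent."""
    rj.detach()
    return rj

def collapse(ri):
    """Collapse proposal: move ri's children to ri's parent, then
    delete ri from the tree."""
    parent = ri.parent
    if parent is None:
        return
    for child in list(ri.children):
        child.detach()
        parent.attach(child)
    ri.detach()
```

As excerpt 1 under this topic notes, scoring such a move only requires evaluating the compatibility of the attached (or detached) subtree's root against its new parent, not a factor per underlying mention pair.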


recursive

Appears in 3 sentences as: recursive (3)
In A Discriminative Hierarchical Model for Fast Coreference at Large Scale
  1. First, the recursive nature of the tree (arbitrary depth and width) allows the model to adapt to different types of data and effectively compress entities of different scales (e.g., entities with more mentions may require a deeper hierarchy to compress).
    Page 3, “Introduction”
  2. This partitioning can be recursive, i.e., each of these sets can be further partitioned, capturing candidate splits for an entity that can facilitate inference.
    Page 5, “Hierarchical Coreference”
  3. In order to represent our recursive model of coreference, we include two types of factors: pairwise factors ψpw that measure compatibility between a child node-record and its parent, and unit-wise factors ψrw that measure compatibilities of the node-records themselves.
    Page 5, “Hierarchical Coreference”
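
The child-parent factors described in excerpt 3 operate on node-records that summarize whole subtrees, which is what makes each inference step cheap: attaching a subtree costs one compatibility evaluation against the parent's summary rather than one per mention pair. A toy sketch with bag-of-token records and cosine similarity as the compatibility (both choices are illustrative assumptions, not the paper's feature set):

```python
from collections import Counter

def record(mentions):
    """A node-record summarizing a subtree as a bag of tokens,
    aggregated over the mentions it covers."""
    bag = Counter()
    for m in mentions:
        bag.update(m.lower().split())
    return bag

def cosine(b1, b2):
    """Toy child-parent compatibility between two bag-of-token records."""
    dot = sum(b1[t] * b2[t] for t in b1)
    n1 = sum(v * v for v in b1.values()) ** 0.5
    n2 = sum(v * v for v in b2.values()) ** 0.5
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Attaching a subtree evaluates ONE factor between two summary records,
# not a factor for every pair of underlying mentions.
parent_rec = record(["R. Power", "Robert Power", "R. K. Power"])
subtree_rec = record(["Robert Power", "R. Power"])
score = cosine(parent_rec, subtree_rec)  # single compatibility evaluation
```

Because a parent's record aggregates its children's, deeper hierarchies compress larger entities, matching the scaling behavior described in excerpt 1.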
