Abstract | We investigate different ways of learning structured perceptron models for coreference resolution when using nonlocal features and beam search. |
Background | Coreference resolution is the task of grouping referring expressions (or mentions) in a text into disjoint clusters such that all mentions in a cluster refer to the same entity. |
Background | In recent years much work on coreference resolution has been devoted to increasing the expressivity of the classical mention-pair model, in which each coreference classification decision is limited to information about the two mentions that make up a pair.
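For illustration, a minimal sketch of the mention-pair setup the excerpt describes, assuming a generic binary classifier and a toy feature set (the mention representation and feature names are assumptions, not those of any cited system):

```python
# Illustrative mention-pair model: each (antecedent, anaphor) pair is classified
# in isolation, using only information local to the two mentions.
from itertools import combinations

def pair_features(m1, m2):
    # m1, m2 are toy mention dicts, e.g. {"text": "the president", "head": "president", "sent": 3}
    return {
        "head_match": m1["head"].lower() == m2["head"].lower(),
        "exact_match": m1["text"].lower() == m2["text"].lower(),
        "sent_dist": abs(m1["sent"] - m2["sent"]),
    }

def pairwise_links(mentions, classifier):
    # classifier.predict(features) -> True if the pair is judged coreferent
    links = []
    for m1, m2 in combinations(mentions, 2):
        if classifier.predict(pair_features(m1, m2)):
            links.append((m1, m2))
    # a separate step (e.g. transitive closure over links) then builds entity clusters
    return links
```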
Background | Nevertheless, the two best systems in the latest CoNLL Shared Task on coreference resolution (Pradhan et al., 2012) were both variants of the mention-pair model. |
Introducing Nonlocal Features | While beam search and early updates have been successfully applied to other NLP applications, our task differs in two important aspects: First, coreference resolution is a much more difficult task, which relies on more (world) knowledge than what is available in the training data. |
Introduction | We show that for the task of coreference resolution the straightforward combination of beam search and early update (Collins and Roark, 2004) falls short of systems with more limited feature sets that allow for exact search.
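As a point of reference, a minimal sketch of beam search with early updates in the spirit of Collins and Roark (2004), assuming per-mention antecedent decisions and a caller-supplied feature function `feats`; the decomposition and all names are illustrative assumptions, not the exact setup of the paper:

```python
# Structured perceptron with beam search and early update (illustrative sketch).
def beam_search_early_update(mentions, gold, weights, feats, beam_size=8):
    """Return (predicted, gold_prefix) at the point of an early update, or
    (best_full, gold) if the gold structure survives the beam.
    gold is a tuple of antecedent decisions, one per mention (-1 = new entity)."""
    gold = tuple(gold)
    beam = [()]  # each hypothesis is a tuple of antecedent decisions made so far
    for i, _ in enumerate(mentions):
        candidates = []
        for hyp in beam:
            for antecedent in range(-1, i):  # -1 starts a new entity
                new_hyp = hyp + (antecedent,)
                score = sum(weights.get(f, 0.0) for f in feats(mentions, new_hyp))
                candidates.append((score, new_hyp))
        candidates.sort(key=lambda x: -x[0])
        beam = [hyp for _, hyp in candidates[:beam_size]]
        gold_prefix = gold[: i + 1]
        if gold_prefix not in beam:
            # Early update: stop decoding and update on the partial structures.
            return beam[0], gold_prefix
    return beam[0], gold

def perceptron_update(weights, mentions, predicted, gold_part, feats):
    # Standard structured perceptron update: add gold features, subtract predicted ones.
    for f in feats(mentions, gold_part):
        weights[f] = weights.get(f, 0.0) + 1.0
    for f in feats(mentions, predicted):
        weights[f] = weights.get(f, 0.0) - 1.0
```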
Introduction | This approach provides a powerful boost to the performance of coreference resolvers, but we find that it does not combine well with the LaSO learning strategy.
Related Work | The perceptron has previously been used to train coreference resolvers either by casting the problem as a binary classification problem that considers pairs of mentions in isolation (Bengtson and Roth, 2008; Stoyanov et al., 2009; Chang et al., 2012, inter alia) or in a structured manner, where a clustering for an entire document is predicted in one go (Fernandes et al., 2012).
Results | For English we also compare it to the Berkeley system (Durrett and Klein, 2013), which, to our knowledge, is the best publicly available system for English coreference resolution (denoted D&K). |
Evaluation and Discussion | Unsurprisingly, coreference resolution performance plays an important role in the final grounding performance (see the grounding performance obtained with manually annotated coreference in the bottom part of Table 1).
Evaluation and Discussion | Due to the simplicity of our current coreference classifier and the flexibility of the human-human dialogue in the data, pairwise coreference resolution achieves only 0.74 precision and 0.43 recall.
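For reference (the F-score is not stated in the excerpt), these figures correspond to:

$$F_1 = \frac{2PR}{P+R} = \frac{2 \times 0.74 \times 0.43}{0.74 + 0.43} \approx 0.54$$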
Evaluation and Discussion | The low recall of coreference resolution makes it difficult to link interrelated referring expressions and resolve them jointly. |
Probabilistic Labeling for Reference Grounding | Our system first processes the data using automatic semantic parsing and coreference resolution.
Probabilistic Labeling for Reference Grounding | We then perform pairwise coreference resolution on the discourse entities to identify the discourse relations between entities from different utterances.
Probabilistic Labeling for Reference Grounding | Based on the semantic parsing and pairwise coreference resolution results, our system further builds a graph representation to capture the collaborative discourse and formulate referential grounding as a probabilistic labeling problem, as described next. |
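To make the graph-labeling framing concrete, here is a heavily simplified sketch: discourse entities are nodes, perceived objects are candidate labels, and unary/pairwise scoring functions are assumed to be supplied by the caller. The greedy inference and all names are illustrative assumptions, not the paper's actual probabilistic formulation:

```python
# Referential grounding as graph labeling (illustrative sketch, not the paper's model):
# assign each discourse-entity node a label drawn from the set of perceived objects.
def label_graph(entities, objects, edges, unary, pairwise):
    """Greedily assign each entity the best-scoring object label.

    unary(e, o): how well entity e's description matches object o.
    pairwise(e1, o1, e2, o2): compatibility of jointly grounding a related pair.
    edges: iterable of (e1, e2) entity pairs related in the discourse graph.
    """
    assignment = {}
    for e in entities:
        def score(o):
            s = unary(e, o)
            for a, b in edges:
                other = b if a == e else (a if b == e else None)
                if other is not None and other in assignment:
                    s += pairwise(e, o, other, assignment[other])
            return s
        assignment[e] = max(objects, key=score)  # joint inference in the actual system
    return assignment
```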
Abstract | In this paper, we propose a model for cross-document coreference resolution that achieves robustness by learning similarity from unlabeled data. |
Conclusions | Our primary contribution consists of new modeling ideas, and associated inference techniques, for the problem of cross-document coreference resolution.
Introduction | In this paper, we propose a method for jointly (1) learning similarity between names and (2) clustering name mentions into entities, the two major components of cross-document coreference resolution systems (Baron and Freedman, 2008; Finin et al., 2009; Rao et al., 2010; Singh et al., 2011; Lee et al., 2012; Green et al., 2012). |
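To make the two components concrete, here is a deliberately non-joint baseline sketch: a fixed character n-gram name similarity plus single-link clustering over a threshold. Both the similarity function and the threshold are illustrative assumptions; the paper's point is to learn the similarity and the clustering jointly rather than fix them as done here:

```python
# Illustrative (non-joint) CDCR baseline: fixed name similarity + single-link clustering.
def char_ngram_sim(a, b, n=3):
    grams = lambda s: {s[i:i + n] for i in range(max(len(s) - n + 1, 1))}
    ga, gb = grams(a.lower()), grams(b.lower())
    return len(ga & gb) / max(len(ga | gb), 1)

def cluster_mentions(mentions, threshold=0.5):
    # mentions: list of name strings; returns entity clusters of those strings.
    clusters = [[m] for m in mentions]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(char_ngram_sim(a, b) >= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] += clusters.pop(j)  # single-link merge
                    merged = True
                    break
            if merged:
                break
    return clusters
```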
Overview and Related Work | Cross-document coreference resolution (CDCR) was first introduced by Bagga and Baldwin (1998b). |
Overview and Related Work | Name similarity is also an important component of within-document coreference resolution, and efforts in that area bear resemblance to our approach.
Approach | Coreference resolution, which could help avoid vague question generation, is discussed in Section 5.
Linguistic Challenges | Here we briefly describe three challenges: negation detection, coreference resolution, and verb forms.
Linguistic Challenges | 5.2 Coreference Resolution |
Linguistic Challenges | Currently, our system does not use any type of coreference resolution.
Data | While previous work uses the Stanford CoreNLP toolkit to identify characters and extract typed dependencies for them, we found this approach to be too slow for the scale of our data (a total of 1.8 billion tokens); in particular, syntactic parsing, with cubic complexity in sentence length, and out-of-the-box coreference resolution (with thousands of potential antecedents) prove to be prohibitively expensive at this scale.
Data | 3.2 Pronominal Coreference Resolution |
Data | While the character clustering stage is essentially performing proper noun coreference resolution, approximately 74% of references to characters in books come in the form of pronouns. To resolve this more difficult class at the scale of an entire book, we train a log-linear discriminative classifier only on the task of resolving pronominal anaphora (i.e., ignoring generic noun phrases such as the paint or the rascal).
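A minimal sketch of a log-linear pronoun resolver of the kind described, using scikit-learn's logistic regression as the log-linear model; the features, the dict-based mention representation, and the candidate window are illustrative assumptions, not the paper's actual feature set:

```python
# Illustrative pronoun resolver: score each candidate antecedent for a pronoun
# with a log-linear model and pick the highest-scoring one.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def antecedent_features(pronoun, candidate):
    return {
        "token_dist": pronoun["position"] - candidate["position"],
        "gender_match": pronoun.get("gender") == candidate.get("gender"),
        "number_match": pronoun.get("number") == candidate.get("number"),
        "cand_is_subject": candidate.get("is_subject", False),
    }

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000)

def train(pairs, labels):          # pairs: (pronoun, candidate) dicts; labels: 0/1
    X = vec.fit_transform(antecedent_features(p, c) for p, c in pairs)
    clf.fit(X, labels)

def resolve(pronoun, candidates):  # candidates: prior mentions within some window
    X = vec.transform(antecedent_features(pronoun, c) for c in candidates)
    scores = clf.predict_proba(X)[:, 1]
    return candidates[scores.argmax()]
```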
Introduction | Bamman et al. (2013) explicitly learn character types (or “personas”) in a dataset of Wikipedia movie plot summaries; and entity-centric models form one dominant approach in coreference resolution (Durrett et al., 2013; Haghighi and Klein, 2010).
Introduction | Coreference resolution aims at identifying natural language expressions (or mentions) that refer to the same entity. |
Introduction | A critically important problem is how to measure the quality of a coreference resolution system. |
Introduction | Therefore, the identical-mention-set assumption limits BLANC-gold’s applicability when gold mentions are not available, or when one wants to have a single score measuring both the quality of mention detection and coreference resolution.
Original BLANC | When T_k = T_r, the Rand Index can be applied directly, since coreference resolution reduces to a clustering problem where mentions are partitioned into clusters (entities):
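The excerpt cuts off before the original formula; for reference, the standard Rand Index over mention pairs (the textbook definition, not a reconstruction of the paper's exact notation) is:

$$\text{RI} = \frac{a + b}{\binom{N}{2}}$$

where $a$ is the number of mention pairs placed in the same entity by both the key and the response, $b$ is the number of pairs placed in different entities by both, and $N$ is the total number of mentions.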
Problem Description | Thus, in order to align event sequences, we need to compute scores corresponding to cross-narrative medical event coreference resolution and cross-narrative temporal relations. |
Problem Description | 4 Cross-Narrative Coreference Resolution and Temporal Relation Learning |
Problem Description | The coreference resolution component achieves 71.5% precision and 82.3% recall.