An Extension of BLANC to System Mentions
Luo, Xiaoqiang and Pradhan, Sameer and Recasens, Marta and Hovy, Eduard

Article Structure

Abstract

BLANC is a link-based coreference evaluation metric for measuring the quality of coreference systems on gold mentions.

Introduction

Coreference resolution aims at identifying natural language expressions (or mentions) that refer to the same entity.

Notations

To facilitate the presentation, we define the notations used in the paper.

Original BLANC

BLANC-gold is adapted from Rand Index (Rand, 1971), a metric for clustering objects.

BLANC for Imperfect Response Mentions

Under the assumption that the key and response mention sets are identical (which implies that T_k = T_r), Equations (2) to (7) make sense.

Topics

coreference

Appears in 30 sentences as: Coreference (2) coreference (34)
In An Extension of BLANC to System Mentions
  1. BLANC is a link-based coreference evaluation metric for measuring the quality of coreference systems on gold mentions.
    Page 1, “Abstract”
  2. Coreference resolution aims at identifying natural language expressions (or mentions) that refer to the same entity.
    Page 1, “Introduction”
  3. A critically important problem is how to measure the quality of a coreference resolution system.
    Page 1, “Introduction”
  4. In particular, MUC measures the degree of agreement between key coreference links (i.e., links among mentions within entities) and response coreference links, while non-coreference links (i.e., links formed by mentions from different entities) are not explicitly taken into account.
    Page 1, “Introduction”
  5. This leads to a phenomenon where coreference systems outputting large entities are scored more favorably
    Page 1, “Introduction”
  6. BLANC (Recasens and Hovy, 2011), on the other hand, considers both coreference links and non-coreference links.
    Page 1, “Introduction”
  7. It calculates recall, precision and F-measure separately on coreference and non-coreference links in the usual way, and defines the overall recall, precision and F-measure as the mean of the respective measures for coreference and non-coreference links.
    Page 1, “Introduction”
  8. Therefore, the identical-mention-set assumption limits BLANC-gold’s applicability when gold mentions are not available, or when one wants to have a single score measuring both the quality of mention detection and coreference resolution.
    Page 1, “Introduction”
  9. Let C_k(i) and C_r(j) be the set of coreference links formed by mentions in k_i and r_j:
    Page 2, “Notations”
  10. Note that when an entity consists of a single mention, its coreference link set is empty.
    Page 2, “Notations”
  11. When T_k = T_r, Rand Index can be applied directly since coreference resolution reduces to a clustering problem where mentions are partitioned into clusters (entities):
    Page 2, “Original BLANC”
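
The link sets and the Rand Index mentioned in the sentences above can be illustrated with a short sketch. It is not the paper's code: the function names `coref_links`, `non_coref_links` and `rand_index`, and the toy entities, are assumptions made for illustration. It also assumes the key and response contain the same mentions, which is the setting where the Rand Index applies directly.

```python
from itertools import combinations


def coref_links(entities):
    """Unordered mention pairs that sit inside the same entity.

    A singleton entity contributes no pair, so its link set is empty.
    """
    return {pair for entity in entities
            for pair in combinations(sorted(entity), 2)}


def non_coref_links(entities):
    """Unordered mention pairs whose two mentions come from different entities."""
    mentions = sorted(m for entity in entities for m in entity)
    return set(combinations(mentions, 2)) - coref_links(entities)


def rand_index(key, response):
    """Fraction of all mention pairs on which key and response agree.

    Assumes identical mention sets with at least two mentions.
    """
    ck, cr = coref_links(key), coref_links(response)
    nk, nr = non_coref_links(key), non_coref_links(response)
    total_links = len(ck) + len(nk)
    return (len(ck & cr) + len(nk & nr)) / total_links


# Hypothetical toy partitions over the mentions a, b, c, d.
key = [{"a", "b", "c"}, {"d"}]
response = [{"a", "b"}, {"c", "d"}]
print(rand_index(key, response))
```

In this toy case the key and response agree on one coreference link and two non-coreference links out of six pairs, so the Rand Index is 0.5.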

See all papers in Proc. ACL 2014 that mention coreference.

F-measure

Appears in 10 sentences as: F-measure (14)
In An Extension of BLANC to System Mentions
  1. It calculates recall, precision and F-measure separately on coreference and non-coreference links in the usual way, and defines the overall recall, precision and F-measure as the mean of the respective measures for coreference and non-coreference links.
    Page 1, “Introduction”
  2. BLANC-gold solves this problem by averaging the F-measure computed over coreference links and the F-measure over non-coreference links.
    Page 2, “Original BLANC”
  3. Using the notations in Section 2, the recall, precision, and F-measure on coreference links are:
    Page 2, “Original BLANC”
  4. Similarly, the recall, precision, and F-measure on non-coreference links are computed as:
    Page 2, “Original BLANC”
  5. Equation (8) indicates that BLANC-gold assigns equal weight to F_c^(g), the F-measure from coreference links, and F_n^(g), the F-measure from non-coreference links.
    Page 3, “Original BLANC”
  6. and we propose to extend the coreference F-measure and non-coreference F-measure as follows.
    Page 3, “BLANC for Imperfect Response Mentions”
  7. Coreference recall, precision and F-measure are changed to:
    Page 3, “BLANC for Imperfect Response Mentions”
  8. Non-coreference recall, precision and F-measure are changed to:
    Page 3, “BLANC for Imperfect Response Mentions”
  9. Since there is no coreference link, BLANC reduces to the non-coreference F-measure F_n.
    Page 4, “BLANC for Imperfect Response Mentions”
  10. Since there is no non-coreference link, BLANC reduces to the coreference F-measure F_c.
    Page 4, “BLANC for Imperfect Response Mentions”
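
The quoted sentences describe the computation without showing the equations, so the following is only a minimal sketch under stated assumptions. It assumes that recall and precision on each link type are the size of the key/response link intersection divided by the number of key links and response links respectively, and that the boundary cases fall back to a single F-measure as sentences 9 and 10 describe. The helper names (`link_sets`, `prf`, `blanc`), the exact boundary conditions, and the toy data are hypothetical, not taken from the paper or the CoNLL scorer.

```python
from itertools import combinations


def link_sets(entities):
    """Return (coreference links, non-coreference links) for a partition."""
    coref = {pair for entity in entities
             for pair in combinations(sorted(entity), 2)}
    mentions = sorted(m for entity in entities for m in entity)
    non_coref = set(combinations(mentions, 2)) - coref
    return coref, non_coref


def prf(key_links, response_links):
    """Recall, precision and F-measure of response links against key links."""
    common = len(key_links & response_links)
    recall = common / len(key_links) if key_links else 0.0
    precision = common / len(response_links) if response_links else 0.0
    total = recall + precision
    f_measure = 2 * recall * precision / total if total > 0 else 0.0
    return recall, precision, f_measure


def blanc(key, response):
    """Average of coreference and non-coreference F-measures (sketch)."""
    ck, nk = link_sets(key)
    cr, nr = link_sets(response)
    if not (ck or cr or nk or nr):
        # Assumption: with no links at all, compare the mention sets directly.
        key_mentions = {m for e in key for m in e}
        response_mentions = {m for e in response for m in e}
        return 1.0 if key_mentions == response_mentions else 0.0
    f_c = prf(ck, cr)[2]
    f_n = prf(nk, nr)[2]
    if not (ck or cr):  # no coreference link on either side
        return f_n
    if not (nk or nr):  # no non-coreference link on either side
        return f_c
    return (f_c + f_n) / 2


# Hypothetical example: the response misses mention "e" and adds a spurious "x".
key = [{"a", "b", "c"}, {"d", "e"}]
response = [{"a", "b"}, {"c", "d", "x"}]
print(round(blanc(key, response), 4))
```

Because the link sets are intersected rather than compared position by position, mentions that appear on only one side simply contribute links that can never match. When the response equals the key, both F-measures are 1 and the score is 1, in line with the stated fallback to the original metric.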

CoNLL

Appears in 7 sentences as: CoNLL (8)
In An Extension of BLANC to System Mentions
  1. The proposed BLANC falls back seamlessly to the original one if system mentions are identical to gold mentions, and it is shown to strongly correlate with existing metrics on the 2011 and 2012 CoNLL data.
    Page 1, “Abstract”
  2. The proposed BLANC is applied to the CoNLL 2011 and 2012 shared task participants, and the scores and their correlations with existing metrics are shown in Section 5.
    Page 2, “Introduction”
  3. We have updated the publicly available CoNLL coreference scorer with the proposed BLANC, and used it to compute the proposed BLANC scores for all the CoNLL 2011 (Pradhan et al., 2011) and 2012 (Pradhan et al., 2012) participants in the official track, where participants had to automatically predict the mentions.
    Page 4, “BLANC for Imperfect Response Mentions”
  4. Table 3: Pearson’s r correlation coefficients between the proposed BLANC and the other coreference measures based on the CoNLL 2011/2012 results.
    Page 5, “BLANC for Imperfect Response Mentions”
  5. Figure 1: Correlation plot between the proposed BLANC and the other measures based on the CoNLL 2011/2012 results.
    Page 5, “BLANC for Imperfect Response Mentions”
  6. However, the CoNLL data sets come from OntoNotes (Hovy et al., 2006), where singleton entities are not annotated, and BLANC has a wider dynamic range on data sets with singletons (Recasens and Hovy, 2011).
    Page 5, “BLANC for Imperfect Response Mentions”
  7. Since BLANC works on imperfect system mentions, we have used it to score the CoNLL 2011 and 2012 coreference systems.
    Page 5, “BLANC for Imperfect Response Mentions”
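
Table 3 reports Pearson's r between the proposed BLANC and other measures. As a reminder of how that coefficient is computed, here is a small sketch using only the standard library. The function name `pearson_r` and the score lists are invented placeholders for illustration; they are not values from the paper's tables.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return covariance / (spread_x * spread_y)


# Placeholder per-system scores (made up, one entry per hypothetical system).
blanc_scores = [0.50, 0.55, 0.60, 0.70]
other_metric_scores = [0.40, 0.45, 0.50, 0.60]
print(pearson_r(blanc_scores, other_metric_scores))
```

A value close to 1 means the two metrics rank and space the systems almost identically, which is what a "strong correlation" claim amounts to.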

coreference resolution

Appears in 4 sentences as: Coreference resolution (1) coreference resolution (3)
In An Extension of BLANC to System Mentions
  1. Coreference resolution aims at identifying natural language expressions (or mentions) that refer to the same entity.
    Page 1, “Introduction”
  2. A critically important problem is how to measure the quality of a coreference resolution system.
    Page 1, “Introduction”
  3. Therefore, the identical-mention-set assumption limits BLANC-gold’s applicability when gold mentions are not available, or when one wants to have a single score measuring both the quality of mention detection and coreference resolution.
    Page 1, “Introduction”
  4. When T_k = T_r, Rand Index can be applied directly since coreference resolution reduces to a clustering problem where mentions are partitioned into clusters (entities):
    Page 2, “Original BLANC”

shared task

Appears in 3 sentences as: shared task (3)
In An Extension of BLANC to System Mentions
  1. The proposed BLANC is applied to the CoNLL 2011 and 2012 shared task participants, and the scores and their correlations with existing metrics are shown in Section 5.
    Page 2, “Introduction”
  2. Table 1: The proposed BLANC scores of the CoNLL-2011 shared task participants.
    Page 4, “BLANC for Imperfect Response Mentions”
  3. Table 2: The proposed BLANC scores of the CoNLL-2012 shared task participants.
    Page 5, “BLANC for Imperfect Response Mentions”
