``Ask Not What Textual Entailment Can Do for You...''
Sammons, Mark and Vydiswaran, V.G.Vinod and Roth, Dan

Article Structure

Abstract

We challenge the NLP community to participate in a large-scale, distributed effort to design and build resources for developing and evaluating solutions to new and existing NLP tasks in the context of Recognizing Textual Entailment.

Introduction

Much of the work in the field of Natural Language Processing is founded on an assumption of semantic compositionality: that there are identifiable, separable components of an unspecified inference process that will develop as research in NLP progresses.

NLP Insights from Textual Entailment

The task of Recognizing Textual Entailment (RTE), as formulated by Dagan et al. (2006), requires automated systems to identify when a human reader would judge that, given one span of text (the Text) and some unspecified (but restricted) world knowledge, a second span of text (the Hypothesis) ...

Annotation Proposal and Pilot Study

As part of our challenge to the NLP community, we propose a distributed OntoNotes-style approach (Hovy et al., 2006) to this annotation effort: distributed, because it should be undertaken by a diverse range of researchers with interests in different semantic phenomena; and similar to the OntoNotes annotation effort because it should not presuppose a fixed, closed ontology of entailment phenomena, but rather, iteratively hypothesize and refine such an ontology using inter-annotator agreement as a guiding principle.
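
Since the proposal leans on inter-annotator agreement as its guiding principle, a minimal sketch of one standard agreement measure, Cohen's kappa, may help make the later agreement figures concrete. The measure and the annotator judgments below are illustrative assumptions, not the paper's reported procedure or data.

```python
# Cohen's kappa: chance-corrected agreement between two annotators.

def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    assert len(a) == len(b) and a
    n = len(a)
    # Observed agreement: fraction of items the annotators label identically.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement: chance of matching given each annotator's label rates.
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Two annotators marking whether each of 8 hypothetical RTE examples
# requires coreference resolution (1 = required, 0 = not).
ann1 = [1, 1, 0, 1, 0, 0, 1, 0]
ann2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(round(cohens_kappa(ann1, ann2), 3))  # -> 0.5
```

Under this kind of measure, an agreement value like the 0.698 reported for coreference indicates substantially better-than-chance consistency between annotators.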

Pilot RTE System Analysis

In this section, we sketch out ways in which the proposed analysis can be applied to learn something about RTE system behavior, even when those systems do not provide anything beyond the output label.
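
One way to read this black-box analysis: if each example carries phenomenon annotations, a system's output labels can be scored separately on the subset of examples that require each phenomenon. The data layout, labels, and figures below are illustrative assumptions, not the paper's actual annotation format or results.

```python
# Per-phenomenon accuracy of a black-box RTE system: group examples by
# the phenomena annotators marked as required, then score the system's
# output labels on each group.
from collections import defaultdict

examples = [
    # (gold label, system label, phenomena marked as required)
    ("entail", "entail", {"coreference", "lexical relation"}),
    ("entail", "no-entail", {"coreference"}),
    ("no-entail", "no-entail", {"numeric reasoning"}),
    ("no-entail", "entail", {"coreference", "numeric reasoning"}),
]

correct = defaultdict(int)
total = defaultdict(int)
for gold, predicted, phenomena in examples:
    for p in phenomena:
        total[p] += 1
        correct[p] += (gold == predicted)

for p in sorted(total):
    print(f"{p}: {correct[p]}/{total[p]} = {correct[p] / total[p]:.2f}")
```

Low accuracy on the subset requiring a given phenomenon suggests, without any access to system internals, that the system does not handle that phenomenon well.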

Discussion

NLP researchers in the broader community continually seek new problems to solve, and pose more ambitious tasks to develop NLP and NLU capabilities, yet recognize that even solutions to problems which are considered “solved” may not perform as well on domains different from the resources used to train and develop them.

Conclusions

In this paper, we have presented a case for a broad, long-term effort by the NLP community to coordinate annotation efforts around RTE corpora, and to evaluate solutions to NLP tasks relating to textual inference in the context of RTE.

Topics

coreference

Appears in 8 sentences as: Coreference (2) coreference (7)
In ``Ask Not What Textual Entailment Can Do for You...''
  1. Tasks such as Named Entity and coreference resolution, syntactic and shallow semantic parsing, and information and relation extraction have been identified as worthwhile tasks and pursued by numerous researchers.
    Page 1, “Introduction”
  2. relevant NLP tasks such as NER, Coreference, parsing, data acquisition and application, and others.
    Page 2, “Introduction”
  3. ported by their designers were the use of structured representations of shallow semantic content (such as augmented dependency parse trees and semantic role labels); the application of NLP resources such as Named Entity recognizers, syntactic and dependency parsers, and coreference resolvers; and the use of special-purpose ad-hoc modules designed to address specific entailment phenomena the researchers had identified, such as the need for numeric reasoning.
    Page 4, “NLP Insights from Textual Entailment”
  4. As the example in figure 1 illustrates, most RTE examples require a number of phenomena to be correctly resolved in order to reliably determine the correct label (the Interaction problem); a perfect coreference resolver might as a result yield little improvement on the standard RTE evaluation, even though coreference resolution is clearly required by human readers in a significant percentage of RTE examples.
    Page 4, “NLP Insights from Textual Entailment”
  5. From the tables it is apparent that good performance on a range of phenomena in our inference model are likely to have a significant effect on RTE results, with coreference being deemed essential to the inference process for 35% of examples, and a number of other phenomena are sufficiently well represented to merit near-future attention (assuming that RTE systems do not already handle these phenomena, a question we address in section 4).
    Page 6, “Annotation Proposal and Pilot Study”
  6. Phenomenon            Occurrence   Agreement
     coreference           35.00%       0.698
     simple rewrite rule   32.62%       0.580
     lexical relation      25.00%       0.738
     implicit relation     23.33%       0.633
     factoid               15.00%       0.412
     parent-sibling        11.67%       0.500
     genitive relation      9.29%       0.608
     nominalization         8.33%       0.514
     event chain            6.67%       0.589
     coerced relation       6.43%       0.540
     passive-active         5.24%       0.583
     numeric reasoning      4.05%       0.847
     spatial reasoning      3.57%       0.720
    Page 6, “Annotation Proposal and Pilot Study”
  7. The results confirmed our initial intuition about some phenomena: for example, that coreference resolution is central to RTE, and that detecting the connecting structure is crucial in discerning negative from positive examples.
    Page 6, “Annotation Proposal and Pilot Study”
  8. Typically, for positive examples this involved overlap between phenomena; for example, Coreference might be expected to resolve implicit relations ...
    Page 6, “Annotation Proposal and Pilot Study”

Named Entity

Appears in 6 sentences as: named entities (1) Named Entity (5)
  1. Tasks such as Named Entity and coreference resolution, syntactic and shallow semantic parsing, and information and relation extraction have been identified as worthwhile tasks and pursued by numerous researchers.
    Page 1, “Introduction”
  2. ported by their designers were the use of structured representations of shallow semantic content (such as augmented dependency parse trees and semantic role labels); the application of NLP resources such as Named Entity recognizers, syntactic and dependency parsers, and coreference resolvers; and the use of special-purpose ad-hoc modules designed to address specific entailment phenomena the researchers had identified, such as the need for numeric reasoning.
    Page 4, “NLP Insights from Textual Entailment”
  3. Phenomenon            Occurrence   Agreement
     Named Entity          91.67%       0.856
     locative              17.62%       0.623
     Numerical Quantity    14.05%       0.905
     temporal               5.48%       0.960
     nominalization         4.05%       0.245
     implicit relation      1.90%       0.651
    Page 6, “Annotation Proposal and Pilot Study”
  4. Phenomenon               Occurrence   Agreement
     missing argument         16.19%       0.763
     missing relation         14.76%       0.708
     excluding argument       10.48%       0.952
     Named Entity mismatch     9.29%       0.921
     excluding relation        5.00%       0.870
     disconnected relation     4.52%       0.580
     missing modifier          3.81%       0.465
     disconnected argument     3.33%       0.764
     Numeric Quant. ...
    Page 6, “Annotation Proposal and Pilot Study”
  5. (b) All top-5 systems make consistent errors in cases where identifying a mismatch in named entities (NE) or numerical quantities (NQ) is important to make the right decision.
    Page 7, “Pilot RTE System Analysis”
  6. However, we believe that if a system could recognize key negation phenomena such as Named Entity mismatch, presence of Excluding arguments, etc.
    Page 8, “Pilot RTE System Analysis”

coreference resolution

Appears in 4 sentences as: coreference resolution (3) coreference resolver (1) coreference resolvers (1)
  1. Tasks such as Named Entity and coreference resolution, syntactic and shallow semantic parsing, and information and relation extraction have been identified as worthwhile tasks and pursued by numerous researchers.
    Page 1, “Introduction”
  2. ported by their designers were the use of structured representations of shallow semantic content (such as augmented dependency parse trees and semantic role labels); the application of NLP resources such as Named Entity recognizers, syntactic and dependency parsers, and coreference resolvers; and the use of special-purpose ad-hoc modules designed to address specific entailment phenomena the researchers had identified, such as the need for numeric reasoning.
    Page 4, “NLP Insights from Textual Entailment”
  3. As the example in figure 1 illustrates, most RTE examples require a number of phenomena to be correctly resolved in order to reliably determine the correct label (the Interaction problem); a perfect coreference resolver might as a result yield little improvement on the standard RTE evaluation, even though coreference resolution is clearly required by human readers in a significant percentage of RTE examples.
    Page 4, “NLP Insights from Textual Entailment”
  4. The results confirmed our initial intuition about some phenomena: for example, that coreference resolution is central to RTE, and that detecting the connecting structure is crucial in discerning negative from positive examples.
    Page 6, “Annotation Proposal and Pilot Study”

Natural Language

Appears in 4 sentences as: Natural Language (3) natural language (1)
  1. Much of the work in the field of Natural Language Processing is founded on an assumption of semantic compositionality: that there are identifiable, separable components of an unspecified inference process that will develop as research in NLP progresses.
    Page 1, “Introduction”
  2. While many have (nearly) immediate application to real world tasks like search, many are also motivated by their potential contribution to more ambitious Natural Language tasks.
    Page 1, “Introduction”
  3. But there is no clear process for identifying potential tasks (other than consensus by a sufficient number of researchers), nor for quantifying their potential contribution to existing NLP tasks, let alone to Natural Language Understanding.
    Page 1, “Introduction”
  4. This is an appropriate time to consider a systematic process for identifying semantic analysis tasks relevant to natural language understanding, and for assessing their potential impact on NLU system performance.
    Page 1, “Introduction”

Question Answering

Appears in 3 sentences as: Question Answering (3)
  1. selves to solve tasks requiring more complex reasoning and synthesis of information; many other tasks must be solved to achieve human-like performance on tasks such as Question Answering .
    Page 1, “Introduction”
  2. Techniques developed for RTE have now been successfully applied in the domains of Question Answering (Harabagiu and Hickl, 2006) and Machine Translation (Pado et al., 2009), (Mirkin et al., 2009).
    Page 1, “Introduction”
  3. The RTE task has been designed specifically to exercise textual inference capabilities, in a format that would make RTE systems potentially useful components in other “deep” NLP tasks such as Question Answering and Machine Translation.
    Page 2, “Introduction”
