Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
Qazvinian, Vahed and Radev, Dragomir R.

Article Structure

Abstract

Identifying background (context) information in scientific articles can help scholars understand major contributions in their research area more easily.

Introduction

In scientific literature, scholars use citations to refer to external sources.

Prior Work

Analyzing the structure of scientific articles and their relations has received a lot of attention recently.

Data

The ACL Anthology Network (AAN) is a collection of papers from the ACL Anthology published in the Computational Linguistics journal and in proceedings from ACL conferences and workshops; it includes more than 14,000 papers spanning four decades (Radev et al., 2009).

Proposed Method

In this section we propose our methodology that enables us to identify the context information of a cited paper.

Experiments

The intrinsic evaluation of our methodology consists of directly comparing the output of our method with the gold standards obtained from the annotated data.

Impact on Survey Generation

We also performed an extrinsic evaluation of our context extraction methodology.

Conclusion

In this paper we proposed a framework based on probabilistic inference to extract sentences that appear in the scientific literature, and which are about a secondary source, but which do not contain explicit citations to that secondary source.

Acknowledgments

The authors would like to thank Arzucan Ozgur from the University of Michigan for annotations.

Topics

context information

Appears in 7 sentences as: context information (7)
In Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
  1. In this paper, we propose a general framework based on probabilistic inference to extract such context information from scientific papers.
    Page 1, “Abstract”
  2. Our experiments show greater pyramid scores for surveys generated using such context information rather than citation sentences alone.
    Page 1, “Abstract”
  3. We refer to such implicit citations that contain information about a specific secondary source but do not explicitly cite it, as sentences with context information or context sentences for short.
    Page 1, “Introduction”
  4. In this section we propose our methodology that enables us to identify the context information of a cited paper.
    Page 4, “Proposed Method”
  5. To find the sentences from a paper that form the context information of a given cited paper, we build an MRF in which a hidden node x_i and an observed node y_i correspond to each sentence.
    Page 5, “Proposed Method”
  6. Our experiments on generating surveys for Question Answering and Dependency Parsing show how surveys generated using such context information along with citation sentences have higher quality than those built using citations alone.
    Page 9, “Conclusion”
  7. Our future goal is to combine summarization and bibliometric techniques towards building automatic surveys that employ context information as an important part of the generated surveys.
    Page 9, “Conclusion”

graphical models

Appears in 4 sentences as: Graphical models (1) graphical models (3)
In Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
  1. In summary, our proposed model is based on the probabilistic inference of these random variables using graphical models.
    Page 1, “Introduction”
  2. In our work we use graphical models to extract context sentences.
    Page 2, “Prior Work”
  3. Graphical models have a number of properties and corresponding techniques and have been used before on Information Retrieval tasks.
    Page 2, “Prior Work”
  4. A particular class of graphical models known as Markov Random Fields (MRFs) are suited for solving inference problems with uncertainty in observed data.
    Page 4, “Proposed Method”
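The excerpts above describe an MRF in which each sentence contributes a hidden label (context or not) and an observed node, with inference performed over the joint model. The paper's actual potentials and inference procedure are not reproduced here; the following is a minimal sketch assuming a binary chain MRF, invented unary potentials (how context-like each sentence looks on its own), and a smoothness pairwise potential that favors adjacent sentences sharing a label. Exact brute-force inference is used instead of the belief-propagation-style methods typical for MRFs, which is only feasible for small toy inputs.

```python
import itertools

def chain_mrf_marginals(unary, pairwise):
    """Exact marginals P(x_i = 1) for a binary chain MRF.

    unary[i][s]   : potential of sentence i taking label s
                    (0 = not context, 1 = context sentence)
    pairwise[s][t]: potential of adjacent sentences taking labels (s, t)

    Brute-force enumeration of all label assignments; exponential in the
    number of sentences, so suitable only as an illustration.
    """
    n = len(unary)
    z = 0.0                 # partition function
    ones = [0.0] * n        # unnormalized mass for x_i = 1
    for labels in itertools.product((0, 1), repeat=n):
        w = 1.0
        for i, s in enumerate(labels):
            w *= unary[i][s]
        for i in range(n - 1):
            w *= pairwise[labels[i]][labels[i + 1]]
        z += w
        for i, s in enumerate(labels):
            if s == 1:
                ones[i] += w
    return [p / z for p in ones]
```

With a smoothness potential (same-label pairs weighted 2.0, different-label pairs 1.0), a sentence with an ambiguous unary potential is pulled toward the labels of its neighbors, which mirrors the intuition that context sentences tend to occur in runs.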

cosine similarity

Appears in 3 sentences as: cosine similarity (3)
In Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
  1. To formalize this assumption we use the sigmoid of the cosine similarity of two sentences to build it.
    Page 6, “Proposed Method”
  2. Intuitively, if a sentence has higher similarity with the reference paper, it should have a higher potential of being in class 1 or C. The flag of each sentence here is a value between 0 and 1 and is determined by its cosine similarity to the reference.
    Page 6, “Proposed Method”
  3. LexRank is a multidocument summarization system, which first builds a cosine similarity graph of all the candidate sentences.
    Page 8, “Impact on Survey Generation”
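The first excerpt above builds a pairwise potential from the sigmoid of the cosine similarity of two sentences. A minimal sketch of that computation, assuming simple bag-of-words term-frequency vectors (the paper's exact sentence representation is not specified here):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sentences as bag-of-words vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def sigmoid_potential(a, b):
    """Sigmoid of the cosine similarity, used as a pairwise potential."""
    return 1.0 / (1.0 + math.exp(-cosine(a, b)))
```

Since cosine similarity lies in [0, 1] for term-frequency vectors, the resulting potential lies in [0.5, ~0.731], so dissimilar sentence pairs are down-weighted rather than zeroed out.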

Dependency Parsing

Appears in 3 sentences as: Dependency Parsing (2) Dependency Parsing” (1)
In Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
  1. Buchholz and Marsi, “CoNLL-X Shared Task On Multilingual Dependency Parsing”, CoNLL 2006
    Page 1, “Introduction”
  2. that contains two sets of cited papers and corresponding citing sentences, one on Question Answering (QA) with 10 papers and the other on Dependency Parsing (DP) with 16 papers.
    Page 8, “Impact on Survey Generation”
  3. Our experiments on generating surveys for Question Answering and Dependency Parsing show how surveys generated using such context information along with citation sentences have higher quality than those built using citations alone.
    Page 9, “Conclusion”

Question Answering

Appears in 3 sentences as: Question Answering (2) question answering (1)
In Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
  1. Lin and Pantel (2001) extract inference rules, which are related to paraphrases (for example, X wrote Y implies X is the author of Y), to improve question answering.
    Page 3, “Data”
  2. that contains two sets of cited papers and corresponding citing sentences, one on Question Answering (QA) with 10 papers and the other on Dependency Parsing (DP) with 16 papers.
    Page 8, “Impact on Survey Generation”
  3. Our experiments on generating surveys for Question Answering and Dependency Parsing show how surveys generated using such context information along with citation sentences have higher quality than those built using citations alone.
    Page 9, “Conclusion”
