A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
Wang, Lu and Raghavan, Hema and Castelli, Vittorio and Florian, Radu and Cardie, Claire

Article Structure

Abstract

We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization.

Introduction

The explosion of the Internet clearly warrants the development of techniques for organizing and presenting information to users in an effective way.

Related Work

Existing research on query-focused multi-document summarization (MDS) largely relies on extractive approaches, where systems usually take as input a set of documents and select the top relevant sentences for inclusion in the final summary.

The Framework

We now present our query-focused MDS framework consisting of three steps: Sentence Ranking, Sentence Compression and Postprocessing.
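
The three steps above suggest a simple top-level control flow. The sketch below only illustrates that flow under assumed interfaces; the ranking, compression and postprocessing functions are hypothetical placeholders passed in as callables, not the authors' code.

```python
# Minimal sketch of the three-step pipeline (Sentence Ranking -> Sentence
# Compression -> Postprocessing). All callables are assumed placeholders.
def summarize(query, sentences, rank, compress, postprocess, word_limit=250):
    ranked = rank(query, sentences)                      # step 1: query-focused ranking
    compressed = [compress(s, query) for s in ranked]    # step 2: compress each sentence
    summary, used = [], 0
    for sent in postprocess(compressed):                 # step 3: coreference handling + reordering
        n = len(sent.split())
        if used + n > word_limit:                        # stop at the word budget (e.g., 250 words)
            break
        summary.append(sent)
        used += n
    return " ".join(summary)
```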

Sentence Compression

Sentence compression is typically formulated as the problem of removing secondary information from a sentence while maintaining its grammaticality and semantic structure (Knight and Marcu, 2000; McDonald, 2006; Galley and McKeown, 2007; Clarke and Lapata, 2008).

Experimental Setup

We evaluate our methods on the DUC 2005, 2006 and 2007 datasets (Dang, 2005; Dang, 2006; Dang, 2007), each of which is a collection of newswire articles.

Results

The results in Table 5 use the official ROUGE software with standard options and report ROUGE-2 (R-2) (measures bigram overlap) and ROUGE-SU4 (R-SU4) (measures unigram and skip-bigram separated by up to four words).

Conclusion

We have presented a framework for query-focused multi-document summarization based on sentence compression.

Topics

sentence compression

Appears in 23 sentences as: Sentence Compression (4) Sentence compression (2) sentence compression (16) Sentence compressions (1) sentence compressor (1)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization.
    Page 1, “Abstract”
  2. Sentence compression techniques (Knight and Marcu, 2000; Clarke and Lapata, 2008) are the standard for producing a compact and grammatical version of a sentence while preserving relevance, and prior research (e.g.
    Page 2, “Introduction”
  3. Similarly, strides have been made to incorporate sentence compression into query-focused MDS systems (Zajic et al., 2006).
    Page 2, “Introduction”
  4. Most attempts, however, fail to produce better results than those of the best systems built on pure extraction-based approaches that use no sentence compression.
    Page 2, “Introduction”
  5. In this paper we investigate the role of sentence compression techniques for query-focused MDS.
    Page 2, “Introduction”
  6. We extend existing work in the area first by investigating the role of learning-based sentence compression techniques.
    Page 2, “Introduction”
  7. Our top-performing sentence compression algorithm incorporates measures of query relevance, content importance, redundancy and language quality, among others.
    Page 2, “Introduction”
  8. Our tree-based methods rely on a scoring function that allows for easy and flexible tailoring of sentence compression to the summarization task, ultimately resulting in significant improvements for MDS, while at the same time remaining competitive with existing methods in terms of sentence compression, as discussed next.
    Page 2, “Introduction”
  9. With these results we believe we are the first to successfully show that sentence compression can provide statistically significant improvements over pure extraction-based approaches for query-focused MDS.
    Page 2, “Introduction”
  10. Our work is more related to the less studied area of sentence compression as applied to (single) document summarization.
    Page 2, “Related Work”
  11. We now present our query-focused MDS framework consisting of three steps: Sentence Ranking, Sentence Compression and Postprocessing.
    Page 3, “The Framework”


beam search

Appears in 9 sentences as: Beam Search (2) Beam search (1) beam search (6)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. An innovative beam search decoder is proposed to efficiently find highly probable compressions.
    Page 1, “Abstract”
  2. As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation.
    Page 5, “Sentence Compression”
  3. A Beam Search Decoder.
    Page 5, “Sentence Compression”
  4. The beam search decoder (see Algorithm 1) takes as input the sentence’s parse tree T = t_0 t_1 ...
    Page 5, “Sentence Compression”
  5. Algorithm 1: Beam search decoder.
    Page 5, “Sentence Compression”
  6. Our BASIC Tree-based Compression instantiates the beam search decoder with postorder traversal and a hypothesis scorer that takes a possible sentence compression—a sequence of nodes (e.g.
    Page 5, “Sentence Compression”
  7. 4.3.1 Improving Beam Search
    Page 6, “Sentence Compression”
  8. Furthermore, our HEAD-driven beam search method with MULTI-scorer beats all systems on DUC 2006 and all systems on DUC 2007 except the best system in terms of R-2 (p < 0.01).
    Page 7, “Results”
  9. Four native speakers who are undergraduate students in computer science (none are authors) performed the task. We compare our system based on HEAD-driven beam search with MULTI-scorer to the best systems in DUC 2006 achieving top ROUGE scores (Jagarlamudi et al., 2006) (Best DUC system (ROUGE)) and top linguistic quality scores (Lacatusu et al., 2006) (Best DUC system (LQ)).
    Page 8, “Results”
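
Taken together, the items above describe a decoder that walks the ordered constituent nodes and labels each as RET(ain), REM(ove) or PAR(tially retain), keeping the N best partial hypotheses at every step. The sketch below is a minimal, generic version of that idea: the label set and the beam width of 2000 come from the quoted setup, but the node representation and the scorer are simplified assumptions, not the authors' implementation.

```python
# Generic beam-search labeling of ordered parse-tree nodes; illustrative only.
LABELS = ("RET", "REM", "PAR")   # retain / remove / partially retain

def beam_search_compress(nodes, score, beam_size=2000):
    """nodes: constituent nodes in traversal order (e.g., postorder);
    score: callable mapping a partial labeling [(node, label), ...] to a float."""
    beam = [[]]                                   # start from one empty hypothesis
    for node in nodes:
        candidates = [hyp + [(node, lab)] for hyp in beam for lab in LABELS]
        candidates.sort(key=score, reverse=True)  # rank partial hypotheses
        beam = candidates[:beam_size]             # keep the N best
    return beam                                   # N-best labelings

# In the paper, the surviving hypotheses are then realized as word strings and
# reranked with a language model trained on Gigaword.
```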


parse tree

Appears in 9 sentences as: parse tree (6) parse tree, (1) parse trees (2)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees.
    Page 1, “Abstract”
  2. Rather than attempt to derive a new parse tree like Knight and Marcu (2000) and Galley and McKeown (2007), we learn to safely remove a set of constituents in our parse tree-based compression model while preserving grammatical structure and essential content.
    Page 3, “Related Work”
  3. Our tree-based compression methods are in line with syntax-driven approaches (Galley and McKeown, 2007), where operations are carried out on parse tree constituents.
    Page 4, “Sentence Compression”
  4. Unlike previous work (Knight and Marcu, 2000; Galley and McKeown, 2007), we do not produce a new parse tree,
    Page 4, “Sentence Compression”
  5. Formally, given a parse tree T of the sentence to be compressed and a tree traversal algorithm, T can be presented as a list of ordered constituent nodes, T = t_0 t_1 ...
    Page 5, “Sentence Compression”
  6. As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation.
    Page 5, “Sentence Compression”
  7. The beam search decoder (see Algorithm 1) takes as input the sentence’s parse tree T = t_0 t_1 ...
    Page 5, “Sentence Compression”
  8. Input: parse tree T, ordering O = o_0 o_1 ...
    Page 5, “Sentence Compression”
  9. Those issues can be addressed by analyzing k-best parse trees, and we leave this for future work.
    Page 8, “Results”
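
Item 5 views the parse tree as an ordered list of constituent nodes produced by a traversal. The snippet below shows one way to obtain such a postorder node list with NLTK; the bracketed example sentence and the helper function are illustrative, not the paper's data structures.

```python
from nltk import Tree

def postorder(tree):
    """Return the constituent nodes of `tree` in postorder (children before parent)."""
    nodes = []
    for child in tree:
        if isinstance(child, Tree):
            nodes.extend(postorder(child))
    nodes.append(tree)
    return nodes

t = Tree.fromstring("(S (NP (DT The) (NN decoder)) (VP (VBZ labels) (NP (NNS nodes))))")
for node in postorder(t):
    print(node.label(), "->", " ".join(node.leaves()))
```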


rule-based

Appears in 7 sentences as: RULE-BASED (1) Rule-Based (1) Rule-based (1) rule-based (3) “Rule-Based” (1)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Rule-Based Features
    Page 4, “The Framework”
  2. Below we describe the sentence compression approaches developed in this research: RULE-BASED COMPRESSION, SEQUENCE-BASED COMPRESSION, and TREE-BASED COMPRESSION.
    Page 4, “Sentence Compression”
  3. 4.1 Rule-based Compression
    Page 4, “Sentence Compression”
  4. Our rule-based approach extends existing work (Conroy et al., 2006; Toutanova et al., 2007) to create the linguistically-motivated compression rules of Table 2.
    Page 4, “Sentence Compression”
  5. For the “Syntactic Tree”, “Dependency Tree” and “Rule-Based” features, we also include features for the two words that precede and the two that follow the current word.
    Page 4, “Sentence Compression”
  6. Its R-SU4 score is also significantly (p < 0.01) better than extractive methods, rule-based and sequence-based compression methods on both DUC 2006 and 2007.
    Page 7, “Results”
  7. For grammatical relation evaluation, our head-driven tree-based system obtains statistically significantly (p < 0.01) better F1 score (Rel-F1) than all the other systems except the rule-based system.
    Page 9, “Results”
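
The rule-based compressor applies linguistically motivated deletion rules (Table 2 of the paper, extending Conroy et al. (2006) and Toutanova et al. (2007)). The snippet below is not those rules; it is a deliberately simplified, surface-level illustration of the general idea (dropping a news dateline and parenthetical asides), whereas the paper's rules operate on syntactic constituents.

```python
import re

# Illustrative trimming rules only; the paper's Table 2 rules are constituent-based.
RULES = [
    (re.compile(r"^[A-Z][A-Z .,]+\(AP\)\s*--?\s*"), ""),  # drop a leading news dateline
    (re.compile(r"\s*\([^)]*\)"), ""),                    # drop parenthetical asides
]

def rule_compress(sentence):
    for pattern, repl in RULES:
        sentence = pattern.sub(repl, sentence)
    return sentence.strip()

print(rule_compress("WASHINGTON (AP) -- The bill (passed last year) was revised."))
# -> "The bill was revised."
```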


statistically significant

Appears in 6 sentences as: statistically significant (3) statistically significantly (3)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g., 8.0% and 5.4% improvements in ROUGE-2, respectively) for the DUC 2006 and 2007 summarization tasks.
    Page 1, “Abstract”
  2. We evaluate the summarization models on the standard Document Understanding Conference (DUC) 2006 and 2007 corpora for query-focused MDS and find that all of our compression-based summarization models achieve statistically significantly better performance than the best DUC 2006 systems.
    Page 2, “Introduction”
  3. With these results we believe we are the first to successfully show that sentence compression can provide statistically significant improvements over pure extraction-based approaches for query-focused MDS.
    Page 2, “Introduction”
  4. Our sentence-compression-based systems (marked with T) show statistically significant improvements over pure extractive summarization for both R-2 and R-SU4 (paired t-test, p < 0.01).
    Page 7, “Results”
  5. In Table 7, our context-aware and head-driven tree-based compression systems show statistically significantly (p < 0.01) higher precisions (Uni-
    Page 9, “Results”
  6. For grammatical relation evaluation, our head-driven tree-based system obtains statistically significantly (p < 0.01) better F1 score (Rel-F1) than all the other systems except the rule-based system.
    Page 9, “Results”
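
The significance claims above rest on paired t-tests over per-topic scores. The snippet below shows how such a comparison can be run with SciPy; the score lists are made-up placeholders, not numbers from the paper.

```python
from scipy import stats

# Hypothetical per-topic ROUGE-2 scores for two systems (placeholders).
compression_based = [0.092, 0.101, 0.088, 0.095, 0.110, 0.097]
pure_extractive   = [0.085, 0.094, 0.090, 0.089, 0.102, 0.091]

t_stat, p_value = stats.ttest_rel(compression_based, pure_extractive)  # paired t-test
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # "significant" here means p < 0.01
```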


language model

Appears in 6 sentences as: Language Model (1) language model (5)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation.
    Page 5, “Sentence Compression”
  2. Given the N-best compressions from the decoder, we evaluate the yield of the trimmed trees using a language model trained on the Gigaword (Graff, 2003) corpus and return the compression with the highest probability.
    Page 6, “Sentence Compression”
  3. Thus, the decoder is quite flexible — its learned scoring function allows us to incorporate features salient for sentence compression while its language model guarantees the linguistic quality of the compressed string.
    Page 6, “Sentence Compression”
  4. Language Model.
    Page 6, “Sentence Compression”
  5. We let score_lm be the probability of W computed by a language model.
    Page 6, “Sentence Compression”
  6. Beam size is fixed at 2000. Sentence compressions are evaluated by a 5-gram language model trained on Gigaword (Graff, 2003) by SRILM (Stolcke, 2002).
    Page 7, “Experimental Setup”
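
Items 1-2 describe reranking the decoder's N-best compressions with an n-gram language model and keeping the most probable string. The sketch below captures that reranking step; the `lm_logprob` callable stands in for the paper's 5-gram Gigaword model (trained with SRILM), and the toy scorer exists only to make the example runnable.

```python
# Pick the most LM-probable compression from an N-best list (illustrative).
def pick_best_compression(candidates, lm_logprob):
    # length-normalise so the model does not trivially prefer the shortest string
    return max(candidates, key=lambda s: lm_logprob(s) / max(len(s.split()), 1))

# Toy stand-in for a real n-gram model: count "plausible" bigrams.
GOOD_BIGRAMS = {("the", "bill"), ("bill", "was"), ("was", "revised")}
def toy_lm(s):
    toks = s.lower().split()
    return float(sum((a, b) in GOOD_BIGRAMS for a, b in zip(toks, toks[1:])))

print(pick_best_compression(["the bill was revised", "bill revised the was"], toy_lm))
```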


coreference

Appears in 6 sentences as: corefer (1) coreference (5)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Finally, the postprocessing stage applies coreference resolution and sentence reordering to build the summary.
    Page 3, “The Framework”
  2. Then we conduct simple query expansion based on the title of the topic and cross-document coreference resolution.
    Page 3, “The Framework”
  3. And for each mention in the query, we add other mentions within the set of documents that corefer with this mention.
    Page 3, “The Framework”
  4. Cross-document coreference resolution, semantic role labeling and relation extraction are accomplished via the methods described in Section 5.
    Page 3, “The Framework”
  5. Postprocessing performs coreference resolution and sentence ordering.
    Page 3, “The Framework”
  6. Documents are processed by a full NLP pipeline, including token and sentence segmentation, parsing, semantic role labeling, and an information extraction pipeline consisting of mention detection, NP coreference, cross-document resolution, and relation detection (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005).
    Page 7, “Experimental Setup”
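
Items 2-3 describe expanding the query with mentions that corefer with its entities across the document set. A minimal version of that expansion, assuming coreference chains are simply available as sets of mention strings (the paper obtains them from the IE pipeline of Florian et al. (2004) and Luo et al. (2004)), could look like the following:

```python
# Expand query mentions with coreferent mentions from cross-document chains.
def expand_query(query_mentions, coref_chains):
    expanded = set(query_mentions)
    for chain in coref_chains:
        if expanded & chain:      # the chain contains a query mention...
            expanded |= chain     # ...so add all of its coreferent mentions
    return expanded

chains = [{"Barack Obama", "Obama", "the president"}, {"the Senate", "it"}]
print(expand_query({"Obama"}, chains))   # adds "Barack Obama" and "the president"
```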


semantic role

Appears in 5 sentences as: semantic role (5)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. semantic role overlap
    Page 3, “The Framework”
  2. We also derive the semantic role overlap and relation instance overlap between the query and each sentence.
    Page 3, “The Framework”
  3. Cross-document coreference resolution, semantic role labeling and relation extraction are accomplished via the methods described in Section 5.
    Page 3, “The Framework”
  4. semantic role label
    Page 4, “The Framework”
  5. Documents are processed by a full NLP pipeline, including token and sentence segmentation, parsing, semantic role labeling, and an information extraction pipeline consisting of mention detection, NP coreference, cross-document resolution, and relation detection (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005).
    Page 7, “Experimental Setup”
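
Item 2 uses semantic-role overlap between the query and each candidate sentence as a ranking feature. The snippet below computes one plausible version of that overlap over (predicate, role, argument-head) triples; the triple representation is an assumption, since the paper only states that SRL output from its pipeline is compared.

```python
# Fraction of the query's SRL triples that also occur in the sentence (illustrative).
def role_overlap(query_roles, sentence_roles):
    query_roles = set(query_roles)
    if not query_roles:
        return 0.0
    return len(query_roles & set(sentence_roles)) / len(query_roles)

q = {("describe", "ARG1", "effects"), ("describe", "ARG1", "steps")}
s = {("describe", "ARG1", "effects"), ("reduce", "ARG1", "pollution")}
print(role_overlap(q, s))   # 0.5
```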


coreference resolution

Appears in 4 sentences as: coreference resolution (4)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Finally, the postprocessing stage applies coreference resolution and sentence reordering to build the summary.
    Page 3, “The Framework”
  2. Then we conduct simple query expansion based on the title of the topic and cross-document coreference resolution.
    Page 3, “The Framework”
  3. Cross-document coreference resolution, semantic role labeling and relation extraction are accomplished via the methods described in Section 5.
    Page 3, “The Framework”
  4. Postprocessing performs coreference resolution and sentence ordering.
    Page 3, “The Framework”


scoring function

Appears in 4 sentences as: scoring function (4)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function.
    Page 1, “Abstract”
  2. Our tree-based methods rely on a scoring function that allows for easy and flexible tailoring of sentence compression to the summarization task, ultimately resulting in significant improvements for MDS, while at the same time remaining competitive with existing methods in terms of sentence compression, as discussed next.
    Page 2, “Introduction”
  3. postorder) as a sequence of nodes in T, the set L of possible node labels, a scoring function S for evaluating each sentence compression hypothesis, and a beam size N. Specifically, O is a permutation on the set {0, 1, ...
    Page 5, “Sentence Compression”
  4. Thus, the decoder is quite flexible — its learned scoring function allows us to incorporate features salient for sentence compression while its language model guarantees the linguistic quality of the compressed string.
    Page 6, “Sentence Compression”


significant improvements

Appears in 4 sentences as: significant improvement (1) significant improvements (3)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g., 8.0% and 5.4% improvements in ROUGE-2, respectively) for the DUC 2006 and 2007 summarization tasks.
    Page 1, “Abstract”
  2. Our tree-based methods rely on a scoring function that allows for easy and flexible tailoring of sentence compression to the summarization task, ultimately resulting in significant improvements for MDS, while at the same time remaining competitive with existing methods in terms of sentence compression, as discussed next.
    Page 2, “Introduction”
  3. With these results we believe we are the first to successfully show that sentence compression can provide statistically significant improvements over pure extraction-based approaches for query-focused MDS.
    Page 2, “Introduction”
  4. Our sentence-compression-based systems (marked with T) show statistically significant improvements over pure extractive summarization for both R-2 and R-SU4 (paired t-test, p < 0.01).
    Page 7, “Results”


beam size

Appears in 4 sentences as: Beam size (1) beam size (2) beam sizes (1)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. postorder) as a sequence of nodes in T, the set L of possible node labels, a scoring function S for evaluating each sentence compression hypothesis, and a beam size N. Specifically, O is a permutation on the set {0, 1, ...
    Page 5, “Sentence Compression”
  2. o_m, L = {RET, REM, PAR}, hypothesis scorer S, beam size N
    Page 5, “Sentence Compression”
  3. Beam size is fixed at 2000. Sentence compressions are evaluated by a 5-gram language model trained on Gigaword (Graff, 2003) by SRILM (Stolcke, 2002).
    Page 7, “Experimental Setup”
  4. We looked at various beam sizes on the held-out data, and observed that the performance peaks around this value.
    Page 7, “Results”


Dependency Tree

Appears in 3 sentences as: Dependency Tree (1) “Dependency Tree (1) “Dependency Tree” (1)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Dependency Tree Features: in NP/VP/ADVP/ADJP chunk?
    Page 4, “The Framework”
  2. “Dependency Tree Features” encode the grammatical relations in which each word is involved as a dependent.
    Page 4, “Sentence Compression”
  3. For the “Syntactic Tree”, “Dependency Tree” and “Rule-Based” features, we also include features for the two words that precede and the two that follow the current word.
    Page 4, “Sentence Compression”


CRF

Appears in 3 sentences as: CRF (3)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. The CRF model is built using the features shown in Table 3.
    Page 4, “Sentence Compression”
  2. During inference, we find the maximally likely sequence Y according to a CRF with parameter θ (Y = argmax_{Y'} P(Y'|X; θ)), while simultaneously enforcing the rules of Table 2 to reduce the hypothesis space and encourage grammatical compression.
    Page 4, “Sentence Compression”
  3. The dataset from Clarke and Lapata (2008) is used to train the CRF and MaxEnt classifiers (Section 4).
    Page 7, “Experimental Setup”
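
The sequence-based compressor labels each word as retained or removed with a CRF over the Table 3 features. The sketch below is a much-simplified stand-in: it uses sklearn-crfsuite rather than the authors' toolkit, only lexical and two-word context-window features, and a single toy training sentence, so it illustrates the setup rather than reproducing it.

```python
import sklearn_crfsuite

def token_features(tokens, i):
    feats = {"word": tokens[i].lower(), "is_cap": tokens[i][0].isupper()}
    for off in (-2, -1, 1, 2):                 # two preceding and two following words
        j = i + off
        feats[f"word[{off}]"] = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
    return feats

def featurize(sentence):
    tokens = sentence.split()
    return [token_features(tokens, i) for i in range(len(tokens))]

# One toy example: RET = keep the word, REM = drop it (labels are illustrative).
X = [featurize("The bill , which passed last year , was revised")]
y = [["RET", "RET", "REM", "REM", "REM", "REM", "REM", "REM", "RET", "RET"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X))
```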


model trained

Appears in 3 sentences as: model trained (3)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation.
    Page 5, “Sentence Compression”
  2. Given the N-best compressions from the decoder, we evaluate the yield of the trimmed trees using a language model trained on the Gigaword (Graff, 2003) corpus and return the compression with the highest probability.
    Page 6, “Sentence Compression”
  3. Beam size is fixed at 2000. Sentence compressions are evaluated by a 5-gram language model trained on Gigaword (Graff, 2003) by SRILM (Stolcke, 2002).
    Page 7, “Experimental Setup”


role labeling

Appears in 3 sentences as: role label (1) role labeling (2)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Cross-document coreference resolution, semantic role labeling and relation extraction are accomplished via the methods described in Section 5.
    Page 3, “The Framework”
  2. semantic role label
    Page 4, “The Framework”
  3. Documents are processed by a full NLP pipeline, including token and sentence segmentation, parsing, semantic role labeling, and an information extraction pipeline consisting of mention detection, NP coreference, cross-document resolution, and relation detection (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005).
    Page 7, “Experimental Setup”


semantic role labeling

Appears in 3 sentences as: semantic role label (1) semantic role labeling (2)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. Cross-document coreference resolution, semantic role labeling and relation extraction are accomplished via the methods described in Section 5.
    Page 3, “The Framework”
  2. semantic role label
    Page 4, “The Framework”
  3. Documents are processed by a full NLP pipeline, including token and sentence segmentation, parsing, semantic role labeling, and an information extraction pipeline consisting of mention detection, NP coreference, cross-document resolution, and relation detection (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005).
    Page 7, “Experimental Setup”


bigram

Appears in 3 sentences as: bigram (3)
In A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
  1. unigram/bigram/skip-bigram (at most four words apart) overlap; unigram/bigram TF/TF-IDF similarity
    Page 3, “The Framework”
  2. 2011; Ouyang et al., 2011), we use the ROUGE-2 score, which measures bigram overlap between a sentence and the abstracts, as the objective for regression.
    Page 3, “The Framework”
  3. The results in Table 5 use the official ROUGE software with standard options and report ROUGE-2 (R-2) (measures bigram overlap) and ROUGE-SU4 (R-SU4) (measures unigram and skip-bigram separated by up to four words).
    Page 7, “Results”
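
Both quoted descriptions come down to counting shared bigrams and shared skip-bigrams with a gap of at most four words. The snippet below computes those counts in their simplest form; the official ROUGE toolkit additionally handles stemming, multiple references and jackknifing (and ROUGE-SU4 also counts unigrams), so this is only a rough illustration.

```python
def bigrams(tokens):
    return set(zip(tokens, tokens[1:]))

def skip_bigrams(tokens, max_gap=4):
    """Ordered word pairs separated by at most `max_gap` intervening words."""
    pairs = set()
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
            pairs.add((w, tokens[j]))
    return pairs

ref  = "the senate passed the bill last year".split()
cand = "the senate passed the revised bill".split()
print(len(bigrams(ref) & bigrams(cand)) / len(bigrams(ref)))                 # ROUGE-2-style recall
print(len(skip_bigrams(ref) & skip_bigrams(cand)) / len(skip_bigrams(ref)))  # SU4-style recall
```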
