SciSurf: Index of 'A Class of Submodular Functions for Document Summarization'

A Class of Submodular Functions for Document Summarization

Lin, Hui and Bilmes, Jeff

Published in Proc. ACL, 2011

Article Structure

Abstract

We design a class of submodular functions meant for document summarization tasks.

Introduction

In this paper, we address the problem of generic and query-based extractive summarization from collections of related documents, a task commonly known as multi-document summarization.

Background on Submodularity

We are given a set of objects V = {211, .

Submodularity in Summarization

3.1 Summarization with knapsack constraint

Monotone Submodular Objectives

Two properties of a good summary are relevance and non-redundancy.

Experiments

The document understanding conference (DUC) (http : //duc .

Conclusion and discussion

In this paper, we show that submodularity naturally arises in document summarization.

Topics

objective function

Appears in 13 sentences as: objective function (10) Objective functions (1) objective functions (2)

In A Class of Submodular Functions for Document Summarization

Of course, none of this is useful if the objective function .7: is inappropriate for the summarization task.
Page 1, “Introduction”
(1999) on the budgeted maximum cover problem to the general submodular framework, and show a practical greedy algorithm with a (1 — 1/ fi)-approximation factor, where each greedy step adds the unit with the largest ratio of objective function gain to scaled cost, while not violating the budget constraint (see (Lin and Bilmes, 2010) for details).
Page 3, “Submodularity in Summarization”
In particular, Carbonell and Goldstein (1998) define an objective function gain of adding element k to set S (k ¢ 8) as:
Page 3, “Submodularity in Summarization”
Although the authors may not have noticed, their objective functions are also submodular, adding more evidence suggesting that submodularity is natural for summarization tasks.
Page 3, “Submodularity in Summarization”
Similar to the MMR approach, in (Lin and Bilmes, 2010), a submodular graph based objective function is proposed where a graph cut function, measuring the similarity of the summary to the rest of document, is combined with a subtracted redundancy penalty function.
Page 4, “Submodularity in Summarization”
The objective function is submodular but again, non-monotone.
Page 4, “Submodularity in Summarization”
We theoretically justify that the performance guarantee of the greedy algorithm holds for this objective function with high probability (Lin and Bilmes, 2010).
Page 4, “Submodularity in Summarization”
Objective functions for extractive summarization usually measure these two separately and then mix them together trading off encouraging relevance and penalizing redundancy.
Page 5, “Monotone Submodular Objectives”
The redundancy penalty usually violates the monotonicity of the objective functions (Carbonell and Goldstein, 1998; Lin and Bilmes, 2010).
Page 5, “Monotone Submodular Objectives”
As shown in Table 1, optimizing this objective function gives a ROUGE-1 F—measure score 32.44%.
Page 7, “Experiments”
Figure 1: ROUGE-1 F-measure scores on DUC—03 when oz and K vary in objective function 51(8) —|— AR1(S),
Page 7, “Experiments”

See all papers in Proc. ACL 2011 that mention objective function.

See all papers in Proc. ACL that mention objective function.

F-measure

Appears in 8 sentences as: F-measure (8)

In A Class of Submodular Functions for Document Summarization

Table 1: ROUGE-1 recall (R) and F-measure (F) results (%) on DUC-04.
Page 7, “Experiments”
On the other hand, when using £1(S) with an 04 < 1 (the value of 04 was determined on DUC-03 using a grid search), a ROUGE-1 F-measure score 38.65%
Page 7, “Experiments”
ROUGE-1 F-measure (%)
Page 7, “Experiments”
Figure 1: ROUGE-1 F-measure scores on DUC—03 when oz and K vary in objective function 51(8) —|— AR1(S),
Page 7, “Experiments”
Table 2: ROUGE-2 recall (R) and F-measure (F) results (%) on DUC-05, where DUC-05 was used as training set.
Page 8, “Experiments”
Table 3: ROUGE-2 recall (R) and F-measure (F) results on DUC-05 (%).
Page 8, “Experiments”
While it is possible to use an approach that is similar to (Toutanova et al., 2007) to learn the coefficients in our objective function, we trained all coefficients to maximize ROUGE-2 F-measure score using the Nelder—Mead (derivative-free) method.
Page 9, “Experiments”
As shown in Table 5, using this objective function with multi-resolution diversity rewards improves our results further, and outperforms the best system in DUC-07 in terms of ROUGE-2 F-measure score.
Page 9, “Experiments”

See all papers in Proc. ACL 2011 that mention F-measure.

See all papers in Proc. ACL that mention F-measure.

development set

Appears in 3 sentences as: development set (3)

In A Class of Submodular Functions for Document Summarization

We used DUC-03 as our development set , and tested on DUC-04 data.
Page 6, “Experiments”
DUC-03 was used as development set .
Page 7, “Experiments”
Figure 1 illustrates how ROUGE-1 scores change when 04 and K vary on the development set (DUC-03).
Page 7, “Experiments”

See all papers in Proc. ACL 2011 that mention development set.

See all papers in Proc. ACL that mention development set.

n-gram

Appears in 3 sentences as: n-gram (3)

In A Class of Submodular Functions for Document Summarization

By definition (Lin, 2004), ROUGE-N is the n-gram recall between a candidate summary and a set of reference summaries.
Page 4, “Submodularity in Summarization”
Precisely, let S be the candidate summary (a set of sentences extracted from the ground set V), Ce : 2V —> Z+ be the number of times n-gram 6 occurs in summary 8, and R,- be the set of n-grams contained in the reference summary i (suppose we have K reference summaries, i.e., i = 1, - - - ,K).
Page 4, “Submodularity in Summarization”
where T6,, is the number of times n-gram 6 occurs in reference summary 2'.
Page 4, “Submodularity in Summarization”

See all papers in Proc. ACL 2011 that mention n-gram.

See all papers in Proc. ACL that mention n-gram.