Subtree Extractive Summarization via Submodular Maximization
Morita, Hajime and Sasano, Ryohei and Takamura, Hiroya and Okumura, Manabu

Article Structure

Abstract

This study proposes a text summarization model that simultaneously performs sentence extraction and compression.

Introduction

Text summarization is often addressed as a task of simultaneously performing sentence extraction and sentence compression (Berg-Kirkpatrick et al., 2011; Martins and Smith, 2009).

Related Work

Submodnlarity is formally defined as a property of a set function for a finite universe V. The function f : 2V —> R maps a subset S g V to a real value.

Budgeted Submodular Maximization with Cost Function

3.1 Problem Definition

Joint Model of Extraction and Compression

We will formalize the unified task of sentence compression and extraction as a budgeted monotone nondecreasing submodular function maximization with a cost function.

Experimental Settings

We evaluate our method on Japanese QA test collections from NTCIR-7 ACLIA1 and NTCIR-8 ACLIA2 (Mitamura et al., 2008; Mitamura et al., 2010).

Results

Table 1 summarizes our results.

Conclusions and Future Work

We formalized a query-oriented summarization, which is a task in which one simultaneously performs sentence compression and extraction, as a new optimization problem: budgeted monotone nondecreasing submodular function maximization with a cost function.

Topics

subtrees

Appears in 44 sentences as: subt (7) subtrees (52)
In Subtree Extractive Summarization via Submodular Maximization
  1. We translate the text summarization task into a problem of extracting a set of dependency subtrees in the document cluster.
    Page 1, “Abstract”
  2. In this study, we avoid this difficulty by reducing the task to one of extracting dependency subtrees from sentences in the source documents.
    Page 1, “Introduction”
  3. The reduction replaces the difficulty of numerous linear constraints with another difficulty wherein two subtrees can share the same word to-
    Page 1, “Introduction”
  4. ken when they are selected from the same sentence, and as a result, the cost of the union of the two subtrees is not always the mere sum of their costs.
    Page 2, “Introduction”
  5. Let V be the finite set of all valid subtrees in the source documents, where valid subtrees are defined to be the ones that can be regarded as grammatical sentences.
    Page 2, “Budgeted Submodular Maximization with Cost Function”
  6. In this paper, we regard subtrees containing the root node of the sentence as valid.
    Page 2, “Budgeted Submodular Maximization with Cost Function”
  7. Accordingly, V denotes a set of all rooted subtrees in all sentences.
    Page 2, “Budgeted Submodular Maximization with Cost Function”
  8. Let us consider the following problem of budgeted monotone nondecreasing submodular function maximization with a cost function: maxsg/ {f(S) : c (S) g L}, where S is a summary represented as a set of subtrees, is the cost function for the set of subtrees , L is our budget, and the submodular function f scores the summary quality.
    Page 2, “Budgeted Submodular Maximization with Cost Function”
  9. The cost function is not always the sum of the costs of the covered subtrees, but depends on the set of the covered elements by the subtrees .
    Page 2, “Budgeted Submodular Maximization with Cost Function”
  10. V is partitioned into exclusive subsets B of valid subtrees, and each subset corresponds to the original sentence from which the valid subtrees derived.
    Page 2, “Budgeted Submodular Maximization with Cost Function”
  11. However, the cost of a union of subtrees from different sentences is simply the sum of the costs of subtrees, while the cost of a union of subtrees from the same sentence is smaller than the sum of the costs.
    Page 2, “Budgeted Submodular Maximization with Cost Function”

See all papers in Proc. ACL 2013 that mention subtrees.

See all papers in Proc. ACL that mention subtrees.

Back to top.

objective function

Appears in 19 sentences as: Objective Function (1) objective function (18) objective functions (1)
In Subtree Extractive Summarization via Submodular Maximization
  1. The algorithm it-eratively adds to the current summary the element 3,- that has the largest ratio of the objective function gain to the additional cost, unless adding it violates the budget constraint.
    Page 3, “Budgeted Submodular Maximization with Cost Function”
  2. After the loop, the algorithm compares Gi with the {3*} that has the largest value of the objective function among all subtrees that are under the budget, and it outputs the summary candidate with the largest value.
    Page 3, “Budgeted Submodular Maximization with Cost Function”
  3. 4.1 Objective Function
    Page 4, “Joint Model of Extraction and Compression”
  4. We designed our objective function by combining this relevance score with a penalty for redundancy and too-compressed sentences.
    Page 5, “Joint Model of Extraction and Compression”
  5. The behavior can be represented by a submodular objective function that reduces word scores depending on those already included in the summary.
    Page 5, “Joint Model of Extraction and Compression”
  6. An optimization problem with this objective function cannot be regarded as an ILP problem because it contains nonlinear terms.
    Page 5, “Joint Model of Extraction and Compression”
  7. vantageous that the submodular maximization can deal with such objective functions .
    Page 5, “Joint Model of Extraction and Compression”
  8. Note that the objective function is such that it can be calculated according to the type of word.
    Page 5, “Joint Model of Extraction and Compression”
  9. Due to the nature of the objective function , we can use dynamic programming to effectively search for the subtree with the maximal density.
    Page 5, “Joint Model of Extraction and Compression”
  10. Consider the objective function Eq.
    Page 5, “Joint Model of Extraction and Compression”
  11. Next, let us consider the objective function that returns the sum of values of submodular word gain functions.
    Page 6, “Joint Model of Extraction and Compression”

See all papers in Proc. ACL 2013 that mention objective function.

See all papers in Proc. ACL that mention objective function.

Back to top.

sentence compression

Appears in 16 sentences as: sentence compression (16)
In Subtree Extractive Summarization via Submodular Maximization
  1. Text summarization is often addressed as a task of simultaneously performing sentence extraction and sentence compression (Berg-Kirkpatrick et al., 2011; Martins and Smith, 2009).
    Page 1, “Introduction”
  2. (2011) formulated a unified task of sentence extraction and sentence compression as an ILP.
    Page 2, “Related Work”
  3. These requirements enable us to represent sentence compression as the extraction of subtrees from a sentence.
    Page 3, “Budgeted Submodular Maximization with Cost Function”
  4. We will formalize the unified task of sentence compression and extraction as a budgeted monotone nondecreasing submodular function maximization with a cost function.
    Page 4, “Joint Model of Extraction and Compression”
  5. In this paper, we address the task of summarization of Japanese text by means of sentence compression and extraction.
    Page 4, “Joint Model of Extraction and Compression”
  6. Therefore, sentence compression can be represented as edge pruning.
    Page 4, “Joint Model of Extraction and Compression”
  7. First, we need a dependency parser of the language in order to represent sentence compression as dependency tree pruning.
    Page 4, “Joint Model of Extraction and Compression”
  8. We extract subtrees from sentences in order to solve the query-oriented summarization problem as a unified one consisting of sentence compression and extraction.
    Page 4, “Joint Model of Extraction and Compression”
  9. With such a similarity, sentence compression extracts nearly only the query terms and fails to contain important information.
    Page 4, “Joint Model of Extraction and Compression”
  10. Let d be the damping rate, count3(w) be the number of sentences containing word 21) in summary S, words(S) be the set of words included in summary 8, qsb(w) be the query relevance score of word w, and 7 be a parameter that adjusts the rate of sentence compression .
    Page 5, “Joint Model of Extraction and Compression”
  11. Since KNP internally has a flag that indicates either an “obligatory case” or an “adj acent case”, we regarded dependency relations flagged by KNP as obligatory in the sentence compression .
    Page 7, “Experimental Settings”

See all papers in Proc. ACL 2013 that mention sentence compression.

See all papers in Proc. ACL that mention sentence compression.

Back to top.

joint model

Appears in 6 sentences as: joint model (3) Joint models (2) joint models (1)
In Subtree Extractive Summarization via Submodular Maximization
  1. Joint models of sentence extraction and compression have a great benefit in that they have a large degree of freedom as far as controlling redundancy goes.
    Page 1, “Introduction”
  2. In contrast, conventional two-stage approaches (Za-jic et al., 2006), which first generate candidate compressed sentences and then use them to generate a summary, have less computational complexity than joint models .
    Page 1, “Introduction”
  3. Joint models can prune unimportant or redundant descriptions without resorting to enumeration.
    Page 1, “Introduction”
  4. Therefore, the joint model can extract an arbitrarily compressed sentence as a subtree without enumerating all candidates.
    Page 4, “Joint Model of Extraction and Compression”
  5. The joint model can remove the redundant part as well as the irrelevant part of a sentence, because the model simultaneously extracts and compresses sentences.
    Page 4, “Joint Model of Extraction and Compression”
  6. In this joint model , we generate a compressed sentence by extracting an arbitrary subtree from a dependency tree of a sentence.
    Page 4, “Joint Model of Extraction and Compression”

See all papers in Proc. ACL 2013 that mention joint model.

See all papers in Proc. ACL that mention joint model.

Back to top.

dependency tree

Appears in 5 sentences as: dependency tree (5)
In Subtree Extractive Summarization via Submodular Maximization
  1. In Japanese, syntactic subtrees that contain the root of the dependency tree of the original sentence often make grammatical sentences.
    Page 4, “Joint Model of Extraction and Compression”
  2. In this joint model, we generate a compressed sentence by extracting an arbitrary subtree from a dependency tree of a sentence.
    Page 4, “Joint Model of Extraction and Compression”
  3. To avoid generating such ungrammatical sentences, we need to detect and retain the obligatory dependency relations in the dependency tree .
    Page 4, “Joint Model of Extraction and Compression”
  4. First, we need a dependency parser of the language in order to represent sentence compression as dependency tree pruning.
    Page 4, “Joint Model of Extraction and Compression”
  5. Moreover, although, in Japanese, obligatory cases distinguish which edges of the dependency tree can be pruned or not, we need another technique to distinguish them in other languages.
    Page 4, “Joint Model of Extraction and Compression”

See all papers in Proc. ACL 2013 that mention dependency tree.

See all papers in Proc. ACL that mention dependency tree.

Back to top.

ILP

Appears in 4 sentences as: ILP (4)
In Subtree Extractive Summarization via Submodular Maximization
  1. Integer linear programming ( ILP ) formulations can represent such flexible constraints, and they are commonly used to model text summarization (McDonald, 2007).
    Page 2, “Related Work”
  2. (2011) formulated a unified task of sentence extraction and sentence compression as an ILP .
    Page 2, “Related Work”
  3. However, it is hard to solve large-scale ILP problems exactly in a practical amount of time.
    Page 2, “Related Work”
  4. An optimization problem with this objective function cannot be regarded as an ILP problem because it contains nonlinear terms.
    Page 5, “Joint Model of Extraction and Compression”

See all papers in Proc. ACL 2013 that mention ILP.

See all papers in Proc. ACL that mention ILP.

Back to top.

score function

Appears in 4 sentences as: score function (5) score functions (1)
In Subtree Extractive Summarization via Submodular Maximization
  1. By formalizing the subtree extraction problem as this new maximization problem, we can treat the constraints regarding the grammaticality of the compressed sentences in a straightforward way and use an arbitrary monotone submodular word score function for words including our word score function (shown later).
    Page 2, “Introduction”
  2. The score function is supermodular as a score function of subtree extraction3, because the union of two subtrees can have extra word pairs that are not included in either subtree.
    Page 5, “Joint Model of Extraction and Compression”
  3. Our score function for a summary 8 is as follows:
    Page 5, “Joint Model of Extraction and Compression”
  4. Since our algorithm requires that the objective function is the sum of word score functions , our proposed method has a restriction that we cannot use an arbitrary monotone submodular function as the objective function for the summary.
    Page 8, “Conclusions and Future Work”

See all papers in Proc. ACL 2013 that mention score function.

See all papers in Proc. ACL that mention score function.

Back to top.

optimization problem

Appears in 3 sentences as: optimization problem (3)
In Subtree Extractive Summarization via Submodular Maximization
  1. We argue that our optimization problem can be regarded as an extraction of subtrees rooted at a given node from a directed graph, instead of from a tree.
    Page 3, “Budgeted Submodular Maximization with Cost Function”
  2. An optimization problem with this objective function cannot be regarded as an ILP problem because it contains nonlinear terms.
    Page 5, “Joint Model of Extraction and Compression”
  3. We formalized a query-oriented summarization, which is a task in which one simultaneously performs sentence compression and extraction, as a new optimization problem : budgeted monotone nondecreasing submodular function maximization with a cost function.
    Page 8, “Conclusions and Future Work”

See all papers in Proc. ACL 2013 that mention optimization problem.

See all papers in Proc. ACL that mention optimization problem.

Back to top.

word pairs

Appears in 3 sentences as: word pairs (3)
In Subtree Extractive Summarization via Submodular Maximization
  1. Although the authors of QSB also provided scores of word pairs to avoid putting excessive penalties
    Page 4, “Joint Model of Extraction and Compression”
  2. on word overlaps, we do not score word pairs .
    Page 5, “Joint Model of Extraction and Compression”
  3. The score function is supermodular as a score function of subtree extraction3, because the union of two subtrees can have extra word pairs that are not included in either subtree.
    Page 5, “Joint Model of Extraction and Compression”

See all papers in Proc. ACL 2013 that mention word pairs.

See all papers in Proc. ACL that mention word pairs.

Back to top.