A Syntax-Free Sequence-oriented Sentence Compression Method | As an alternative to syntactic parsing, we propose two novel features, intra-sentence positional term weighting (IPTW) and the patched language model (PLM), for our syntax-free sentence compressor.
A Syntax-Free Sequence-oriented Sentence Compression Method | 3.1 Sentence Compression as a Combinatorial Optimization Problem
Abstract | Conventional sentence compression methods employ a syntactic parser to compress a sentence without changing its meaning. |
Abstract | Moreover, for the goal of on-demand sentence compression, the time spent in the parsing stage is not negligible.
Analysis of reference compressions | This statistic supports the view that sentence compression methods that strongly depend on syntax are not useful in reproducing reference compressions.
Analysis of reference compressions | We need a sentence compression method that can drop intermediate nodes in the syntactic tree aggressively beyond the tree-scoped boundary. |
Analysis of reference compressions | In addition, sentence compression methods that strongly depend on syntactic parsers have two problems: ‘parse error’ and ‘decoding speed.’ 44% of sentences output by a state-of-the-art Japanese dependency parser contain at least one error (Kudo and Matsumoto, 2005). |
Introduction | In accordance with this idea, conventional sentence compression methods employ syntactic parsers. |
Introduction | Moreover, on-demand sentence compression is made problematic by the time spent in the parsing stage. |
Introduction | This paper proposes a syntax-free sequence-oriented sentence compression method. |
Introduction | We consider three paraphrase applications in our experiments: sentence compression, sentence simplification, and sentence similarity computation. |
Results and Analysis | Results show that the percentages of test sentences that can be paraphrased are 97.2%, 95.4%, and 56.8% for the applications of sentence compression, simplification, and similarity computation, respectively. |
Results and Analysis | Further results show that the average number of unit replacements per sentence is 5.36, 4.47, and 1.87 for sentence compression, simplification, and similarity computation, respectively. |
Results and Analysis | A source sentence s is paraphrased in each application, and we can see that: (1) for sentence compression, the paraphrase t is 8 bytes shorter than s; (2) for sentence simplification, the words wealth and part in t are easier than their sources asset and proportion, especially for non-native speakers; (3) for sentence similarity computation, the reference sentence s’ is listed below t, in which the words appearing in t but not in s are highlighted in blue. |
Statistical Paraphrase Generation | In contrast, SPG has distinct purposes in different applications, such as sentence compression, sentence simplification, etc. |
Statistical Paraphrase Generation | The application in this example is sentence compression.
Statistical Paraphrase Generation | Paraphrase application: sentence compression |
Abstract | We describe our experiments with training algorithms for tree-to-tree synchronous tree-substitution grammar (STSG) for monolingual translation tasks such as sentence compression and paraphrasing. |
Abstract | We formalize nonparametric Bayesian STSG with epsilon alignment in full generality, and provide a Gibbs sampling algorithm for posterior inference tailored to the task of extractive sentence compression. |
Introduction | Such induction of tree mappings has application in a variety of natural-language-processing tasks including machine translation, paraphrase, and sentence compression. |
Introduction | In this work, we explore techniques for inducing synchronous tree-substitution grammars (STSG) using as a testbed application extractive sentence compression. |
Introduction | In this work, we use an extension of the aforementioned models of generative segmentation for STSG induction, and describe an algorithm for posterior inference under this model that is tailored to the task of extractive sentence compression.
Sentence compression | Sentence compression is the task of summarizing a sentence while retaining most of the informational content and remaining grammatical (Jing, 2000). |
Sentence compression | In extractive sentence compression, which we focus on in this paper, an order-preserving subset of the words in the sentence is selected to form the summary; that is, we summarize by deleting words (Knight and Marcu, 2002).
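The deletion-based setup described above amounts to applying a binary keep/drop mask to the token sequence without reordering. A minimal sketch (the function name and example are ours, not any cited system's implementation):

```python
def compress(tokens, keep_mask):
    """Extractive compression: keep an order-preserving subset of tokens.

    keep_mask[i] == 1 means token i is retained; deletion never reorders
    the surviving tokens.
    """
    assert len(tokens) == len(keep_mask)
    return [t for t, k in zip(tokens, keep_mask) if k]

tokens = "the quick brown fox jumped over the lazy dog".split()
mask = [1, 0, 0, 1, 1, 0, 0, 0, 1]
print(" ".join(compress(tokens, mask)))  # → "the fox jumped dog"
```

Supervised systems then learn to predict such a mask from parallel (sentence, compression) pairs.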
Sentence compression | In supervised sentence compression , the goal is to generalize from a parallel training corpus of sentences (source) and their compressions (target) to unseen sentences in a test set to predict their compressions. |
The STSG Model | Our sampling updates are extensions of those used by Cohn and Blunsom (2009) in MT, but are tailored to our task of extractive sentence compression.
Budgeted Submodular Maximization with Cost Function | These requirements enable us to represent sentence compression as the extraction of subtrees from a sentence. |
Experimental Settings | Since KNP internally has a flag that indicates either an “obligatory case” or an “adjacent case”, we regarded dependency relations flagged by KNP as obligatory in sentence compression.
Introduction | Text summarization is often addressed as a task of simultaneously performing sentence extraction and sentence compression (Berg-Kirkpatrick et al., 2011; Martins and Smith, 2009). |
Joint Model of Extraction and Compression | We will formalize the unified task of sentence compression and extraction as a budgeted monotone nondecreasing submodular function maximization with a cost function. |
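The budgeted monotone submodular view mentioned above is typically attacked with a cost-scaled greedy heuristic. Below is a generic sketch with a toy coverage objective; the function names, toy data, and objective are our own illustration, not the paper's actual model:

```python
def budgeted_greedy(items, f, cost, budget):
    """Greedy selection for budgeted monotone submodular maximization.

    Repeatedly adds the item with the best marginal-gain-per-cost ratio
    that still fits within the budget (the standard cost-scaled greedy
    heuristic; approximate, not exact).
    """
    selected, spent = [], 0.0
    remaining = list(items)
    while remaining:
        best, best_ratio = None, 0.0
        base = f(selected)
        for it in remaining:
            c = cost(it)
            if spent + c > budget:
                continue
            gain = f(selected + [it]) - base
            if c > 0 and gain / c > best_ratio:
                best, best_ratio = it, gain / c
        if best is None:
            break
        selected.append(best)
        spent += cost(best)
        remaining.remove(best)
    return selected

# Toy objective: each "sentence" covers a set of concepts; f counts
# distinct covered concepts (monotone submodular); cost is length.
sentences = {"s1": {"a", "b"}, "s2": {"b", "c", "d"}, "s3": {"e"}}
f = lambda S: len(set().union(*[sentences[s] for s in S])) if S else 0
cost = lambda s: len(sentences[s])
print(budgeted_greedy(sentences, f, cost, budget=4))  # → ['s1', 's3']
```

Note that greedy selection is not guaranteed optimal; the literature pairs this heuristic with approximation guarantees for monotone nondecreasing submodular objectives.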
Joint Model of Extraction and Compression | In this paper, we address the task of summarization of Japanese text by means of sentence compression and extraction. |
Joint Model of Extraction and Compression | Therefore, sentence compression can be represented as edge pruning. |
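Edge pruning on a dependency tree can be illustrated with a head-indexed representation: cutting the edge into a node drops that node's entire subtree. The function, indices, and toy tree below are our own invention for illustration:

```python
def prune(heads, drop):
    """Keep token i unless it is, or descends from, a pruned node.

    heads[i] is the index of token i's head (-1 for the root); drop is
    the set of nodes whose incoming dependency edge is cut.
    """
    def kept(i):
        while i != -1:
            if i in drop:
                return False
            i = heads[i]
        return True
    return [i for i in range(len(heads)) if kept(i)]

# "the old dog barked loudly": barked(3) is root; dog(2) heads the/old.
heads = [2, 2, 3, -1, 3]
print(prune(heads, drop={1, 4}))  # → [0, 2, 3], i.e. "the dog barked"
```

Pruning the edge into node 2 instead would remove "the", "old", and "dog" in one cut, which is what makes edge pruning a compact encoding of compression decisions.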
Related Work | (2011) formulated a unified task of sentence extraction and sentence compression as an ILP. |
Abstract | We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. |
Introduction | Sentence compression techniques (Knight and Marcu, 2000; Clarke and Lapata, 2008) are the standard for producing a compact and grammatical version of a sentence while preserving relevance, and prior research (e.g. |
Introduction | Similarly, strides have been made to incorporate sentence compression into query-focused MDS systems (Zajic et al., 2006). |
Introduction | Most attempts, however, fail to produce better results than those of the best systems built on pure extraction-based approaches that use no sentence compression.
Related Work | Our work is more related to the less studied area of sentence compression as applied to (single) document summarization. |
The Framework | We now present our query-focused MDS framework consisting of three steps: Sentence Ranking, Sentence Compression and Postprocessing. |
Abstract | In addition, we propose a multitask learning framework to take advantage of existing data for extractive summarization and sentence compression.
Compressive Summarization | similar manner as described in §2, but with an additional component for the sentence compressor, and slight modifications in the other components.
Compressive Summarization | In addition, we included hard constraints to prevent the deletion of certain arcs, following previous work in sentence compression (Clarke and Lapata, 2008). |
Experiments | (2011), but we augmented the training data with extractive summarization and sentence compression datasets, to help train the |
Experiments | For sentence compression, we adapted the Simple English Wikipedia dataset of Woodsend and Lapata (2011), containing aligned sentences for 15,000 articles from the English and Simple English Wikipedias.
Extractive Summarization | However, extending these models to allow for sentence compression (as will be detailed in §3) breaks the diminishing returns property, making submodular optimization no longer applicable. |
Introduction | For example, such solvers are unable to take advantage of efficient dynamic programming routines for sentence compression (McDonald, 2006). |
Introduction | We propose multitask learning (§4) as a principled way to train compressive summarizers, using auxiliary data for extractive summarization and sentence compression.
MultiTask Learning | The goal is to take advantage of existing data for related tasks, such as extractive summarization (task #2), and sentence compression (task #3). |
MultiTask Learning | For the sentence compression task, the parts correspond to arc-deletion features only.
A Sentence Trimmer with CRFs | In the context of sentence compression, a linear programming based approach such as Clarke and Lapata (2006) is certainly one that deserves consideration.
A Sentence Trimmer with CRFs | Note that a sentence compression can be represented as an array of binary labels, one label marking words to be retained in the compression and the other those to be dropped.
Conclusions | This paper introduced a novel approach to sentence compression in Japanese, which combines a syntactically motivated generation model and CRFs, in or- |
Introduction | For better or worse, much of prior work on sentence compression (Riezler et al., 2003; McDonald, 2006; Turner and Charniak, 2005) turned to a single corpus developed by Knight and Marcu (2002) (K&M, henceforth) for evaluating their approaches. |
Introduction | Despite its limited scale, prior work in sentence compression relied heavily on this particular corpus for establishing results (Turner and Charniak, 2005; McDonald, 2006; Clarke and Lapata, 2006; Galley and McKeown, 2007). |
Introduction | An obvious benefit of using CRFs for sentence compression is that the model provides a general (and principled) probabilistic framework which permits information from various sources to be integrated towards compressing sentences, a property K&M do not share.
The Dependency Path Model | In what follows, we will describe somewhat in detail a prior approach to sentence compression in Japanese which we call the “dependency path model,” or DPM.
Experimental Setup | There are no sentence length or grammaticality constraints, as there is no sentence compression.
Introduction | Sentence compression is often regarded as a promising first step towards ameliorating some of the problems associated with extractive summarization. |
Introduction | Interfacing extractive summarization with a sentence compression module could improve the conciseness of the generated summaries and render them more informative (Jing, 2000; Lin, 2003; Zajic et al., 2007). |
Introduction | Despite the bulk of work on sentence compression and summarization (see Clarke and Lapata 2008 and Mani 2001 for overviews) only a handful of approaches attempt to do both in a joint model (Daume III and Marcu, 2002; Daume III, 2006; Lin, 2003; Martins and Smith, 2009). |
Related work | A few previous approaches have attempted to interface sentence compression with summarization. |
Related work | The latter optimizes an objective function consisting of two parts: an extraction component, essentially a non-greedy variant of maximal marginal relevance (McDonald, 2007), and a sentence compression component, a more compact reformulation of Clarke and Lapata (2008) based on the output of a dependency parser. |
Results | Furthermore, as a standalone sentence compression system it yields state-of-the-art performance, comparable to McDonald’s (2006) discriminative model and superior to Hedge Trimmer (Zajic et al., 2007), a less sophisticated deterministic system.
Abstract | Sentence compression has been shown to benefit from joint inference involving both n-gram and dependency-factored objectives, but this typically requires expensive integer programming.
Abstract | While dynamic programming is viable for bigram-based sentence compression, finding optimal compressed trees within graphs is NP-hard.
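The bigram case is tractable because the objective factors over adjacent retained tokens, admitting a Viterbi-style dynamic program. Below is a minimal sketch under a fixed target length; the function names and toy scorer are ours, not the cited systems' actual models:

```python
def bigram_compress(tokens, score, length):
    """Viterbi-style DP for bigram-factored extractive compression.

    best[j][k] is the maximum score of a compression of exactly k tokens
    whose last retained token is tokens[j]; adjacent retained tokens
    (i, j) contribute score(tokens[i], tokens[j]), so the objective
    factors over output bigrams and the DP runs in O(n^2 * length).
    """
    n = len(tokens)
    assert 1 <= length <= n
    NEG = float("-inf")
    best = [[NEG] * (length + 1) for _ in range(n)]
    back = [[None] * (length + 1) for _ in range(n)]
    for j in range(n):
        best[j][1] = 0.0                 # any token may start the output
        for k in range(2, length + 1):
            for i in range(j):           # candidate previous retained token
                if best[i][k - 1] == NEG:
                    continue
                s = best[i][k - 1] + score(tokens[i], tokens[j])
                if s > best[j][k]:
                    best[j][k], back[j][k] = s, i
    # Pick the best final token, then follow backpointers.
    j = max(range(n), key=lambda m: best[m][length])
    out, k = [], length
    while j is not None:
        out.append(tokens[j])
        j, k = back[j][k], k - 1
    return list(reversed(out))

# Toy scorer favoring the output bigrams ("a","c") and ("c","d"):
score = lambda x, y: 1.0 if (x, y) in {("a", "c"), ("c", "d")} else 0.0
print(bigram_compress(["a", "b", "c", "d"], score, 3))  # → ['a', 'c', 'd']
```

By contrast, coupling such n-gram scores with dependency-tree constraints breaks this chain factorization, which is why the joint objective resorts to ILP or approximate inference.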
Experiments | Following evaluations in machine translation as well as previous work in sentence compression (Unno et al., 2006; Clarke and Lapata, 2008; Martins and Smith, 2009; Napoles et al., 2011b; Thadani and McKeown, 2013), we evaluate system performance using F1 metrics over n-grams and dependency edges produced by parsing system output with RASP (Briscoe et al., 2006) and the Stanford parser. |
Introduction | Sentence compression is a text-to-text generation task in which an input sentence must be transformed into a shorter output sentence which accurately reflects the meaning in the input and also remains grammatically well-formed. |
Introduction | Following an assumption often used in compression systems, the compressed output in this corpus is constructed by dropping tokens from the input sentence without any paraphrasing or reordering. A number of diverse approaches have been proposed for deletion-based sentence compression, including techniques that assemble the output text under an n-gram factorization over the input text (McDonald, 2006; Clarke and Lapata, 2008) or an arc factorization over input dependency parses (Filippova and Strube, 2008; Galanis and Androutsopoulos, 2010; Filippova and Altun, 2013).
Introduction | Our proposed approximation strategies are evaluated using automated metrics in order to address the question: under what conditions should a real-world sentence compression system implementation consider exact inference with an ILP or approximate inference? |
Related Work | Sentence compression is one of the better-studied text-to-text generation problems and has been observed to play a significant role in human summarization (Jing, 2000; Jing and McKeown, 2000). |
Related Work | Most approaches to sentence compression are supervised (Knight and Marcu, 2002; Riezler et al., 2003; Turner and Charniak, 2005; McDonald, 2006; Unno et al., 2006; Galley and McKeown, 2007; Nomoto, 2007; Cohn and Lapata, 2009; Galanis and Androutsopoulos, 2010; Ganitkevitch et al., 2011; Napoles et al., 2011a; Filippova and Altun, 2013) following the release of datasets such as the Ziff-Davis corpus (Knight and Marcu, 2000) and the Edinburgh compression corpora (Clarke and Lapata, 2006; Clarke and Lapata, 2008), although unsupervised approaches—largely based on ILPs—have also received consideration (Clarke and Lapata, 2007; Clarke and Lapata, 2008; Filippova and Strube, 2008).
Abstract | Many methods of text summarization combining sentence selection and sentence compression have recently been proposed. |
Conclusion | Hence, utilizing these for sentence compression has been left for future work. |
Experiment | However, introducing sentence compression to the system greatly improved the ROUGE score (0.354). |
Introduction | There has recently been increasing attention focused on approaches that jointly optimize sentence extraction and sentence compression (Tomita et al., 2009; |
Related work | Extracting a subtree from the dependency tree of words is one approach to sentence compression (Tomita et al., 2009; Qian and Liu, 2013; Morita et al., 2013; Gillick and Favre, 2009). |
Related work | The method of Filippova and Strube (2008) allows the model to extract non-rooted subtrees in sentence compression tasks that compress a single sentence with a given compression ratio. |
Introduction | The first, compression-based method uses a robust sentence compressor with an aggressive compression rate to get to the core of the sentence (Sec. |
Introduction | To the best of our knowledge, this is the first time that this task has been proposed; it can be considered as abstractive sentence compression, in contrast to most existing sentence compression systems which are based on selecting words from the original sentence or rewriting with simpler paraphrase tables. |
Pattern extraction by sentence compression | Sentence compression is a summarization technique that shortens input sentences while preserving the most important content (Grefenstette, 1998; McDonald, 2006; Clarke and Lapata, 2008, inter alia).
Pattern extraction by sentence compression | To our knowledge, this application of sentence compressors is novel. |
Pattern extraction by sentence compression | Sentence compression methods are abundant but very few can be configured to produce output satisfying certain constraints. |
Abstract | We address this challenge with contributions in two folds: first, we introduce the new task of image caption generalization, formulated as visually-guided sentence compression, and present an efficient algorithm based on dynamic beam search with dependency-based constraints.
Related Work | In comparison to prior work on sentence compression, our approach falls somewhere between unsupervised and distant-supervised approaches (e.g., Turner and Charniak (2005), Filippova and Strube (2008)) in that there is no in-domain training corpus to learn generalization patterns directly.
Sentence Generalization as Constraint Optimization | Casting the generalization task as visually-guided sentence compression with lightweight revisions, we formulate a constraint optimization problem that aims to maximize content selection and local linguistic fluency while satisfying constraints derived from dependency parse trees.