Conclusions | Results show that the state of the art is improved on both automatic and manual metrics, with speeds close to those of extractive systems.
Experiments | The bottom rows show the results achieved by our implementation of a pure extractive system (similar to the learned extractive summarizer of Berg-Kirkpatrick et al., 2011); a system that post-combines extraction and compression components trained separately, as in Martins and Smith (2009); and our compressive summarizer trained as a single task, and in the multitask setting. |
Experiments | The ROUGE and Pyramid scores show that the compressive summarizers (when properly trained) yield considerable benefits in content coverage over extractive systems, confirming the results of Berg-Kirkpatrick et al.
Experiments | Our ROUGE-2 score (12.30%) is, to our knowledge, the highest reported on the TAC-2008 dataset, with little harm in grammaticality with respect to an extractive system that preserves the original sentences.
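The ROUGE-2 score cited above is, at its core, bigram recall against reference summaries. A minimal sketch of that computation follows; it is illustrative only, since the official ROUGE toolkit additionally applies stemming, stopword options, and jackknifing across multiple references.

```python
from collections import Counter

def bigrams(tokens):
    # Multiset of adjacent word pairs.
    return Counter(zip(tokens, tokens[1:]))

def rouge2_recall(system, reference):
    # Fraction of reference bigrams recovered by the system summary
    # (clipped counts, so repeated bigrams are not over-credited).
    sys_bg = bigrams(system.split())
    ref_bg = bigrams(reference.split())
    overlap = sum(min(count, sys_bg[bg]) for bg, count in ref_bg.items())
    total = sum(ref_bg.values())
    return overlap / total if total else 0.0

print(round(rouge2_recall("the cat sat on a mat",
                          "the cat sat on the mat"), 2))  # → 0.6
```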
Introduction | Up to now, extractive systems have been the most popular approach to multi-document summarization.
Introduction | However, extractive systems are rather limited in the summaries they can produce. |
Introduction | All the approaches above are based on integer linear programming (ILP) and suffer from slow runtimes compared to extractive systems.
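The ILP objective in this line of work typically selects a subset of sentences maximizing the total weight of covered concepts (e.g. bigrams) under a length budget. The sketch below is an illustrative reconstruction of that objective, with brute-force subset search standing in for an ILP solver; the sentence/concept/weight data are invented, and the exponential search makes plain why exact solving is slow relative to greedy extraction.

```python
from itertools import combinations

def select(sentences, concepts, weights, budget):
    """Pick a subset of sentences maximizing the total weight of
    covered concepts, subject to a word budget.
    sentences: list of str; concepts: one set of concept ids per
    sentence; weights: concept id -> weight; budget: max words."""
    best, best_score = (), -1.0
    n = len(sentences)
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            length = sum(len(sentences[i].split()) for i in subset)
            if length > budget:
                continue  # violates the length constraint
            covered = set().union(*(concepts[i] for i in subset))
            score = sum(weights[c] for c in covered)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score

sents = ["a b c", "a d", "e f g h"]
cons = [{"a", "b"}, {"a", "d"}, {"e"}]
w = {"a": 2, "b": 1, "d": 1, "e": 3}
print(select(sents, cons, w, budget=5))  # → ((0, 1), 4)
```

Coverage counts each concept once no matter how many selected sentences contain it, which is what discourages redundant selections.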
Experiments | In our study, we compared the characteristics of summaries generated by the eight human summarizers with those of the peer summarizers, which are basically extractive systems.
Experiments | Purely extractive systems would thus be expected to score 1.0, as would systems that perform text compression by removing words.
Experiments | Peer 2 shows a relatively high level of aggregation despite being an extractive system.
Introduction | In automatic summarization, centrality has been one of the guiding principles for content selection in extractive systems . |
Related Work | Domain-dependent, template-based summarization systems have been an alternative to extractive systems; they make use of rich knowledge about a domain and information extraction techniques to generate a summary, possibly using a natural language generation system (Radev and McKeown, 1998; White et al., 2001; McKeown et al., 2002).
Related Work | Several studies complement this paper by examining the best possible extractive system using current evaluation measures, such as ROUGE (Lin and Hovy, 2003; Conroy et al., 2006). |
Related Work | They find that the best possible extractive systems score as highly as, or higher than, human summarizers, but it is unclear whether this means the oracle summaries are actually as useful as human ones in an extrinsic setting.
Baseline | As the baseline, we choose a state-of-the-art Chinese event extraction system, as described in Li et al.
Experimentation | For a fair comparison, we adopt the same experimental settings as the state-of-the-art event extraction system (Li et al.).
Experimentation | In addition, all the experiments on argument extraction are performed on the output of the trigger extraction system described in Li et al.
Experimentation | Table 3 shows the performance of the baseline trigger extraction system, while Line 1 in Table 4 shows the results of argument identification and role determination based on this system.
Introduction | Section 3 describes a state-of-the-art Chinese argument extraction system as the baseline. |