Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
Mehdad, Yashar and Carenini, Giuseppe and Ng, Raymond T.

Article Structure

Abstract

We propose a novel abstractive query-based summarization system for conversations, where queries are defined as phrases reflecting a user's information needs.

Introduction

Our lives are increasingly reliant on multimodal conversations with others.

Phrasal Query Abstraction Framework

Our phrasal query abstraction framework generates a grammatical abstract from a conversation following three steps, as shown in Figure 1.

Experimental Setup

In this section, we show the evaluation results of our proposed framework and its comparison to the baselines and a state-of-the-art query-focused extractive summarization system.

Conclusion

We have presented an unsupervised framework for abstractive summarization of spoken and written conversations based on phrasal queries.

Topics

I’m

Appears in 8 sentences as: I’m (7) i’m (1)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. James had not ever had use for something like that so I’m not sure where I would graft that in.
    Page 2, “Introduction”
  2. James said that I’m thinking about moving that to on-activation instead of on-startup anyway as it should still work for a main form - but i still wonder if the on-startup parameter issue should be considered a bug - as it shouldn’t choke.
    Page 2, “Introduction”
  3. - i’m willing to scrap it if there is a better schema hidden in gnue somewhere :)
    Page 3, “Phrasal Query Abstraction Framework”
  4. F: umm, I’m afraid apparant non-sequiturs are always a hazard of doing summaries ;-)
    Page 9, “Experimental Setup”
  5. E: I’m just convulsing my thoughts to the irc log
    Page 9, “Experimental Setup”
  6. umm, I’m afraid apparant non-sequiturs are always a hazard of doing summaries ;-)
    Page 9, “Experimental Setup”
  7. I’m just convulsing my thoughts to the irc log good morning.
    Page 9, “Experimental Setup”
  8. I’m just convulsing my thoughts to the irc log.
    Page 9, “Experimental Setup”

See all papers in Proc. ACL 2014 that mention I’m.

manual evaluation

Appears in 6 sentences as: Manual Evaluation (1) manual evaluation (5)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. Automatic and manual evaluation results over meeting, chat and email conversations show that our approach significantly outperforms baselines and previous extractive models.
    Page 1, “Abstract”
  2. Automatic evaluation on the chat dataset and manual evaluation over the meetings and emails show that our system uniformly and statistically significantly outperforms baseline systems, as well as a state-of-the-art query-based extractive summarization system.
    Page 2, “Introduction”
  3. For manual evaluation of query-based abstracts (meeting and email datasets), we perform a simple user study assessing the following aspects: i) Overall quality given a query (5-point scale)?
    Page 6, “Experimental Setup”
  4. For the manual evaluation, we only compare our full system with LexRank (LR) and Biased LexRank (Biased LR).
    Page 6, “Experimental Setup”
  5. 3.4.2 Manual Evaluation
    Page 8, “Experimental Setup”
  6. Both automatic and manual evaluation of our model show substantial improvement over extraction-based methods, including Biased LexRank, which is considered a state-of-the-art system.
    Page 9, “Conclusion”

extractive system

Appears in 5 sentences as: extractive system (3) extractive systems (2)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. Moreover, we compare our abstractive system with the first part of our framework (utterance extraction in Figure 1), which can be presented as an extractive query-based summarization system (our extractive system).
    Page 6, “Experimental Setup”
  2. We also show the results of the version we use in our pipeline (our pipeline extractive system).
    Page 6, “Experimental Setup”
  3. In contrast, in the stand-alone version (extractive system) we limit the number of retrieved sentences to the desired length of the summary.
    Page 6, “Experimental Setup”
  4. Our extractive query-based method beats all other extractive systems with a higher ROUGE-1 and ROUGE-2, which shows the effectiveness of our utterance extraction model in comparison with other extractive models.
    Page 7, “Experimental Setup”
  5. Considering this marginal improvement and relatively high results of pure extractive systems, we can infer that the Biased LexRank extracted summaries do not carry much query relevant content.
    Page 8, “Experimental Setup”
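
The ROUGE-1 and ROUGE-2 scores mentioned in these sentences measure n-gram overlap between a system summary and a reference summary. The following is a minimal sketch of ROUGE-N recall, assuming a single reference, whitespace tokenization, and no stemming or stopword handling; the function names are illustrative and this is not the evaluation toolkit used in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram can be matched at most as often as it
    # occurs in the candidate (clipping).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

For example, `rouge_n_recall("the cat sat", "the cat sat on the mat", n=1)` counts three matched unigrams out of six in the reference.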

cosine similarity

Appears in 4 sentences as: cosine similarity (5)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. We use the K-means clustering algorithm with cosine similarity as the distance function between sentence vectors composed of tf-idf scores.
    Page 4, “Phrasal Query Abstraction Framework”
  2. 1) Cosine-1st: we rank the utterances in the chat log based on the cosine similarity between the utterance and query.
    Page 6, “Experimental Setup”
  3. 2) Cosine-all: we rank the utterances in the chat log based on the cosine similarity between the utterance and query and then select the utterances with a cosine similarity greater than 0;
    Page 6, “Experimental Setup”
  4. Query Relevance: another interesting observation is that relying only on the cosine similarity (i.e., cosine-all) to measure the query relevance presents a quite strong baseline.
    Page 7, “Experimental Setup”
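
The two cosine baselines quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes whitespace tokenization, a smoothed tf-idf weighting, and that Cosine-1st keeps only the top-ranked utterance (the quoted sentence does not state the selection rule). All function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Return one sparse tf-idf vector (a dict) per whitespace-tokenized text."""
    tokenized = [t.lower().split() for t in texts]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf (an assumption) so terms found in every text keep some weight.
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}
    return [{term: tf * idf[term] for term, tf in Counter(toks).items()}
            for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_query(utterances, query):
    """Return (score, index) pairs sorted by decreasing similarity to the query."""
    vectors = tfidf_vectors(list(utterances) + [query])
    query_vec = vectors[-1]
    scored = [(cosine(vec, query_vec), i) for i, vec in enumerate(vectors[:-1])]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def cosine_first(utterances, query):
    """Assumed Cosine-1st behavior: keep only the top-ranked utterance."""
    ranked = rank_by_query(utterances, query)
    return [utterances[ranked[0][1]]] if ranked else []


def cosine_all(utterances, query):
    """Cosine-all: keep every utterance whose similarity is greater than 0."""
    return [utterances[i] for score, i in rank_by_query(utterances, query) if score > 0]
```

With a query such as `"schema migration"`, an utterance that shares no term with the query scores exactly 0 and is dropped by `cosine_all`.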

development set

Appears in 3 sentences as: development set (3)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. The parameters α and β are tuned on a development set and sum up to 1.
    Page 3, “Phrasal Query Abstraction Framework”
  2. We estimate the percentage of the retrieved utterances based on the development set.
    Page 3, “Phrasal Query Abstraction Framework”
  3. For parameter estimation, we tune all parameters (utterance selection and path ranking) exhaustively with 0.1 intervals using our development set.
    Page 6, “Experimental Setup”
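
Exhaustive tuning at 0.1 intervals of two weights that sum to 1 amounts to a small grid search. The sketch below is an illustration under stated assumptions: `dev_score` is a hypothetical callable that returns a development-set score for a weight pair, and the quoted sentences do not specify which metric is maximized.

```python
def weight_grid(step=0.1):
    """Enumerate (alpha, beta) pairs on a grid of the given step with alpha + beta = 1."""
    n = round(1 / step)
    # Integer arithmetic avoids accumulating floating-point drift.
    return [(i / n, (n - i) / n) for i in range(n + 1)]


def tune(dev_score, step=0.1):
    """Return the weight pair with the highest development-set score.

    dev_score is a hypothetical function (alpha, beta) -> float supplied
    by the caller; ties are resolved in favor of the smaller alpha.
    """
    return max(weight_grid(step), key=lambda pair: dev_score(*pair))
```

At a step of 0.1 the grid contains 11 candidate pairs, from (0.0, 1.0) to (1.0, 0.0).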

significantly outperforms

Appears in 3 sentences as: significantly outperforms (3)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. Automatic and manual evaluation results over meeting, chat and email conversations show that our approach significantly outperforms baselines and previous extractive models.
    Page 1, “Abstract”
  2. Automatic evaluation on the chat dataset and manual evaluation over the meetings and emails show that our system uniformly and statistically significantly outperforms baseline systems, as well as a state-of-the-art query-based extractive summarization system.
    Page 2, “Introduction”
  3. Results indicate that our system significantly outperforms baselines in overall quality and responsiveness, for both meeting and email datasets.
    Page 8, “Experimental Setup”

statistical significance

Appears in 3 sentences as: statistical significance (1) statistically significant (1) statistically significantly (1)
In Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
  1. Automatic evaluation on the chat dataset and manual evaluation over the meetings and emails show that our system uniformly and statistically significantly outperforms baseline systems, as well as a state-of-the-art query-based extractive summarization system.
    Page 2, “Introduction”
  2. Abstractive vs. Extractive: our full query-based abstractive summarization system shows statistically significant improvements over baselines
    Page 7, “Experimental Setup”
  3. The statistical significance tests were calculated by approximate randomization, as described in (Yeh, 2000).
    Page 7, “Experimental Setup”
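
A paired approximate randomization test can be sketched as below. This is a generic illustration, not the authors' script: it assumes per-item scores for two systems on the same test items, uses the absolute difference of mean scores as the test statistic, and picks the trial count and seed arbitrarily.

```python
import random


def approximate_randomization(scores_a, scores_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the difference in mean score.

    scores_a and scores_b hold per-item scores of two systems on the
    same items. In each trial the two scores of every item are swapped
    with probability 0.5 and the statistic is recomputed.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n
    at_least_as_extreme = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:
                a, b = b, a
            diff += a - b
        if abs(diff) / n >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A small p-value suggests that the observed difference is unlikely to arise if the two systems' outputs were interchangeable.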
