Metadata-Aware Measures for Answer Summarization in Community Question Answering
Tomasoni, Mattia and Huang, Minlie

Article Structure

Abstract

This paper presents a framework for automatically processing information coming from community Question Answering (cQA) portals with the purpose of generating a trustful, complete, relevant and succinct summary in response to a question.

Introduction

Community Question Answering (cQA) portals are an example of Social Media where the information need of a user is expressed in the form of a question for which a best answer is picked among the ones generated by other users.

The summarization framework

2.1 Quality as a ranking problem

Experiments

3.1 Datasets and filters

Discussion and Future Directions

We conclude by discussing a few alternatives to the approaches we presented.

Related Work

A work with a similar objective to our own is that of Liu et al.

Conclusions

We presented a framework to generate trustful, complete, relevant and succinct answers to questions posted by users in cQA portals.

Topics

scoring function

Appears in 7 sentences as: scoring function (6) scoring functions (1)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. 2.5 The concept scoring functions
    Page 4, “The summarization framework”
  2. Analogously to what had been done with scoring function (6), the Φ space was augmented with a dimension representing the
    Page 4, “The summarization framework”
  3. The concept score for the same BE in two separate answers is very likely to be different because it belongs to answers with their own Quality and Coverage values: this only makes the scoring function context-dependent and does not interfere with the calculation of the Coverage, Relevance and Novelty measures, which are based on information overlap and will regard two BEs with overlapping equivalence classes as the same, regardless of their scores being different.
    Page 5, “The summarization framework”
  4. At this point a second version of the dataset was created to evaluate the summarization performance under scoring functions (6) and (7); it was generated by manually selecting questions that arouse subjective, human interest from the previous 89,814 question-answer pairs.
    Page 6, “Experiments”
  5. Figure 2: Increase in ROUGE-L, ROUGE-1 and ROUGE-2 performances of the SH system as more measures are taken into consideration in the scoring function, starting from Relevance alone (R) to the complete system (RQNC).
    Page 7, “Experiments”
  6. In order to determine what influence the single measures had on the overall performance, we conducted a final experiment on the filtered dataset (the SH scoring function was used).
    Page 7, “Experiments”
  7. Table 2: A summarized answer composed of five different portions of text generated with the SH scoring function; the chosen best answer is presented for comparison.
    Page 8, “Discussion and Future Directions”
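
The sentences above describe a single scoring function that blends Relevance (R), Quality (Q), Novelty (N) and Coverage (C) into one context-dependent concept score. This page does not reproduce the paper's formulas (6) and (7), so the sketch below is only a hypothetical linear combination illustrating the idea; the weights and measure values are fabricated.

    def concept_score(r, q, n, c, weights=(0.4, 0.3, 0.2, 0.1)):
        """Combine Relevance, Quality, Novelty and Coverage for one BE.

        The weights are illustrative, not the paper's actual combination.
        """
        w_r, w_q, w_n, w_c = weights
        return w_r * r + w_q * q + w_n * n + w_c * c

    # The same BE (same r and n) scores differently depending on the Quality
    # and Coverage of the answer it belongs to -- the context dependence that
    # quoted sentence 3 describes.
    print(concept_score(r=0.8, q=0.9, n=0.5, c=0.7))  # BE inside a strong answer
    print(concept_score(r=0.8, q=0.3, n=0.5, c=0.2))  # same BE inside a weak answer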


feature space

Appears in 6 sentences as: feature space (5) feature spaces (1)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. feature space to capture the following syntactic, behavioral and statistical properties:
    Page 2, “The summarization framework”
  2. The features mentioned above determined a space Ψ; an answer a, in such a feature space, assumed the vectorial form:
    Page 2, “The summarization framework”
  3. To demonstrate it, we conducted a set of experiments on the original unfiltered dataset to establish whether the feature space Ψ was powerful enough to capture the quality of answers; our specific objective was to estimate the
    Page 6, “Experiments”
  4. The Quality assessing component itself could be built as a module that can be adjusted to the kind of Social Media in use; the creation of customized Quality feature spaces would make it possible to handle different sources of UGC (forums, collaborative authoring websites such as Wikipedia, blogs etc.).
    Page 8, “Discussion and Future Directions”
  5. A great obstacle is the lack of systematically available high-quality training examples: a tentative solution could be to make use of clustering algorithms in the feature space; high and low quality clusters could then be labeled by comparison with examples of virtuous behavior (such as Wikipedia’s Featured Articles).
    Page 8, “Discussion and Future Directions”
  6. (2008) which inspired us in the design of the Quality feature space presented in Section 2.1.
    Page 9, “Related Work”
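
Sentence 5 above proposes sidestepping the scarcity of labeled training data by clustering answers directly in the Quality feature space and labeling the clusters afterwards against known high-quality examples. A minimal sketch of that idea with scikit-learn, assuming answers have already been mapped to vectors in Ψ (the three features and all values are fabricated for illustration):

    import numpy as np
    from sklearn.cluster import KMeans

    # Each row is an answer in the Quality feature space Psi; the columns
    # stand in for syntactic, behavioral and statistical features.
    psi = np.array([
        [0.9, 0.8, 0.7],  # long, well-rated, from a reputable user
        [0.1, 0.2, 0.1],  # short, unrated
        [0.8, 0.9, 0.6],
        [0.2, 0.1, 0.2],
    ])

    # Cluster into tentative high/low quality groups; the clusters would then
    # be labeled by comparison with examples of virtuous behavior (e.g.
    # Wikipedia's Featured Articles), as the quoted sentence suggests.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(psi)
    print(labels)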


human annotation

Appears in 5 sentences as: human annotated (1) human annotation (3) human annotators (1)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. Our decision to proceed in an unsupervised direction came from the consideration that any use of external human annotation would have made it impracticable to build an actual system on a larger scale.
    Page 2, “The summarization framework”
  2. A second approach that made use of human annotation to learn a vector of weights V = (v1, v2, v3, v4) that linearly combined the scores was investigated.
    Page 4, “The summarization framework”
  3. In order to learn the weight vector V that would combine the above scores, we asked three human annotators to generate question-biased extractive summaries based on all answers available for a certain question.
    Page 4, “The summarization framework”
  4. We calculated ROUGE-1 and ROUGE-2 scores against human annotation on the filtered version of the dataset presented in Section 3.1.
    Page 7, “Experiments”
  5. Evaluation results on human annotated data showed that our summarized answers constitute a solid complement to best answers voted by the cQA users.
    Page 9, “Conclusions”
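
Sentence 4 above measures system summaries against human-annotated references with ROUGE-1 and ROUGE-2. The authors' exact ROUGE configuration is only footnoted in the paper; as an assumption, the sketch below reproduces this kind of evaluation with the third-party rouge-score package and made-up reference/system texts.

    from rouge_score import rouge_scorer  # pip install rouge-score

    # A human-annotated reference summary and a system summary (fabricated).
    reference = "drink plenty of water and rest to recover from a cold"
    system = "rest and drink a lot of water to get over a cold"

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2"], use_stemmer=True)
    for name, score in scorer.score(reference, system).items():
        print(name, round(score.fmeasure, 3))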


Social Media

Appears in 4 sentences as: Social Media (4)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. Community Question Answering (cQA) portals are an example of Social Media where the information need of a user is expressed in the form of a question for which a best answer is picked among the ones generated by other users.
    Page 1, “Introduction”
  2. Interestingly, a great amount of information is embedded in the metadata generated as a byproduct of users’ action and interaction on Social Media .
    Page 1, “Introduction”
  3. Quality assessing of information available on Social Media had been studied before mainly as a binary classification problem with the objective of detecting low quality content.
    Page 2, “The summarization framework”
  4. The Quality assessing component itself could be built as a module that can be adjusted to the kind of Social Media in use; the creation of customized Quality feature spaces would make it possible to handle different sources of UGC (forums, collaborative authoring websites such as Wikipedia, blogs etc.).
    Page 8, “Discussion and Future Directions”


n-grams

Appears in 3 sentences as: n-grams (3)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. We adopt a representation of concepts alternative to n-grams and propose two concept-scoring functions based on semantic overlap.
    Page 1, “Abstract”
  2. To represent sentences and answers we adopted an alternative approach to classical n-grams that could be defined bag-of-BEs.
    Page 3, “The summarization framework”
  3. Different from n-grams, they are variable in length and depend on parsing techniques, named entity detection, part-of-speech tagging and resolution of syntactic forms such as hyponyms, pronouns, pertainyms, abbreviations and synonyms.
    Page 3, “The summarization framework”
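
The quoted sentences contrast the bag-of-BEs representation with classical n-grams: BEs vary in length and are normalized through parsing, named entity detection and resolution of synonyms, abbreviations and the like, so different surface forms can fall into the same equivalence class and count as overlapping. A rough sketch of that overlap computation, where the equivalence table and the Jaccard-style measure are illustrative assumptions rather than the paper's implementation:

    # Hypothetical equivalence classes: surface forms resolving to one BE.
    EQUIV = {
        "nyc": "new_york",
        "new york": "new_york",
        "big apple": "new_york",
        "doctor": "physician",
    }

    def bag_of_bes(phrases):
        """Map extracted BE surface forms to their equivalence classes."""
        return {EQUIV.get(p.lower(), p.lower()) for p in phrases}

    def overlap(a, b):
        """Jaccard-style information overlap between two bags of BEs."""
        a_bes, b_bes = bag_of_bes(a), bag_of_bes(b)
        return len(a_bes & b_bes) / len(a_bes | b_bes) if a_bes | b_bes else 0.0

    # "NYC" and "new york" share no n-gram, yet overlap fully as BEs, which is
    # exactly what the information-overlap measures rely on.
    print(overlap(["NYC", "doctor"], ["new york", "physician"]))  # -> 1.0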


natural language

Appears in 3 sentences as: Natural Language (1) natural language (2)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. cQA websites are becoming an increasingly popular complement to search engines: overnight, a user can expect a human-crafted, natural language answer tailored to her specific needs.
    Page 1, “Introduction”
  2. BEs are a strong theoretical instrument to tackle the ambiguity inherent in natural language, one that finds successful practical applications in real-world query-based summarization systems.
    Page 3, “The summarization framework”
  3. (2009) with a system that makes use of semantic-aware Natural Language Preprocessing techniques.
    Page 8, “Related Work”


Question Answering

Appears in 3 sentences as: Question Answering (3)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. This paper presents a framework for automatically processing information coming from community Question Answering (cQA) portals with the purpose of generating a trustful, complete, relevant and succinct summary in response to a question.
    Page 1, “Abstract”
  2. Community Question Answering (cQA) portals are an example of Social Media where the information need of a user is expressed in the form of a question for which a best answer is picked among the ones generated by other users.
    Page 1, “Introduction”
  3. Our approach differs in two fundamental aspects: it took into consideration the peculiarities of the input data by exploiting the nature of UGC and available metadata; additionally, along with relevance, we addressed challenges that are specific to Question Answering, such as Coverage and Novelty.
    Page 8, “Related Work”


weight vector

Appears in 3 sentences as: weight vector (3)
In Metadata-Aware Measures for Answer Summarization in Community Question Answering
  1. We trained a Linear Regression classifier to learn the weight vector W = (w1, w2, w3, w4) that would combine the above features.
    Page 2, “The summarization framework”
  2. It was calculated as the dot product between the learned weight vector W and the feature vector Ψa for answer a.
    Page 2, “The summarization framework”
  3. In order to learn the weight vector V that would combine the above scores, we asked three human annotators to generate question-biased extractive summaries based on all answers available for a certain question.
    Page 4, “The summarization framework”
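
Putting the three quoted sentences together: a Linear Regression model learns W = (w1, w2, w3, w4) from answers with known quality, and a new answer's Quality score is then the dot product between W and its feature vector Ψa. A minimal sketch with scikit-learn; the four features and the training targets are fabricated for illustration.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Training answers as rows in the feature space Psi, with quality targets.
    psi_train = np.array([
        [0.9, 0.8, 0.7, 0.6],
        [0.2, 0.1, 0.3, 0.2],
        [0.7, 0.6, 0.8, 0.5],
        [0.1, 0.3, 0.2, 0.1],
    ])
    quality = np.array([0.95, 0.20, 0.80, 0.15])

    reg = LinearRegression().fit(psi_train, quality)
    W = reg.coef_  # the learned weight vector W = (w1, w2, w3, w4)

    # Quality of an unseen answer as the dot product W . Psi_a (plus intercept).
    psi_a = np.array([0.8, 0.7, 0.6, 0.5])
    print(float(W @ psi_a + reg.intercept_))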
