Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
Chan, Wen and Zhou, Xiangdong and Wang, Wei and Chua, Tat-Seng

Article Structure

Abstract

We present a novel answer summarization method for community Question Answering services (cQAs) to address the problem of “incomplete answer”, i.e., the “best answer” of a complex multi-sentence question misses valuable information that is contained in other answers.

Introduction

Community Question Answering services (cQAs) have become valuable resources for users to pose questions of interest and to share their knowledge by providing answers to questions.

Definitions and Related Work

2.1 Definitions

The Summarization Framework

3.1 Conditional Random Fields

Experimental Setting

4.1 Dataset

Experimental Results

5.1 Summarization Results

Conclusions

We proposed a general CRF based community answer summarization method to deal with the incomplete answer problem for deep understanding of complex multi-sentence questions.

Topics

CRF

Appears in 21 sentences as: CRF (21)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. In order to automatically generate a novel and non-redundant community answer summary, we segment the complex original multi-sentence question into several sub questions and then propose a general Conditional Random Field ( CRF ) based answer summary method with group L1 regularization.
    Page 1, “Abstract”
  2. We tackle the answer summary task as a sequential labeling process under the general Conditional Random Fields ( CRF ) framework: every answer sentence in the question thread is labeled as a summary or non-summary sentence, and we concatenate the sentences labeled as summary to form the final summarized answer.
    Page 2, “Introduction”
  3. First, we present a general CRF based framework
    Page 2, “Introduction”
  4. Second, we propose a group L1-regularization approach in the CRF model for automatic optimal feature learning to unleash the potential of the features and enhance the performance of answer summarization.
    Page 2, “Introduction”
  5. The experimental results show that the proposed model improves performance significantly (in terms of precision, recall and F1 measures) as well as in the ROUGE-1, ROUGE-2 and ROUGE-L measures compared to state-of-the-art methods such as Support Vector Machines (SVM), Logistic Regression (LR) and Linear CRF (LCRF) (Shen et al., 2007).
    Page 2, “Introduction”
  6. Then under CRF (Lafferty et al., 2001), the conditional probability of y given x obeys the following distribution: p(y|x) = (1/Z(x)) exp( Σ_{v∈V,l} μ_l g_l(v, y|_v, x) + Σ_{e∈E,k} λ_k f_k(e, y|_e, x) ), where Z(x) is the normalization factor.
    Page 3, “The Summarization Framework”
  7. Therefore, to explore the optimal combination of these features, we propose a group L1 regularization term in the general CRF model (Section 3.3) for feature learning.
    Page 4, “The Summarization Framework”
  8. These sentence-level features can be easily utilized in the CRF framework.
    Page 4, “The Summarization Framework”
  9. Therefore, we propose the following two kinds of contextual factors for selecting the answer sentences in the CRF model.
    Page 5, “The Summarization Framework”
  10. textual factors in our proposed general CRF based
    Page 6, “The Summarization Framework”
  11. As a result, we group the parameters in our CRF model with their related features³ and introduce a group L1-regularization term for selecting the most useful features from the least important ones, where the regularization term becomes,
    Page 6, “The Summarization Framework”
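
The excerpts above label each answer sentence as summary or non-summary under a CRF with node and edge potentials. A minimal sketch (not the authors' implementation) of the conditional probability p(y|x) for binary label sequences, with Z(x) computed by brute-force enumeration over a toy three-sentence thread; all feature values and weights below are hypothetical:

```python
import itertools
import math

def sequence_score(y, node_feats, w_node, w_edge):
    """Node potentials (label-specific weights dotted with sentence
    features) plus transition potentials between adjacent labels."""
    s = 0.0
    for i, label in enumerate(y):
        s += sum(wk * fk for wk, fk in zip(w_node[label], node_feats[i]))
    for i in range(len(y) - 1):
        s += w_edge[y[i]][y[i + 1]]
    return s

def crf_prob(y, node_feats, w_node, w_edge):
    """p(y|x) = exp(score(y, x)) / Z(x), with Z(x) summed over all
    2^n binary labelings (feasible only for tiny toy sequences)."""
    n = len(node_feats)
    z = sum(math.exp(sequence_score(seq, node_feats, w_node, w_edge))
            for seq in itertools.product([0, 1], repeat=n))
    return math.exp(sequence_score(y, node_feats, w_node, w_edge)) / z

# Three answer sentences with two toy sentence-level features each
# (hypothetical values, e.g. similarity-to-question and length).
feats = [[0.9, 0.5], [0.2, 0.8], [0.7, 0.3]]
w_node = {0: [-1.0, 0.1], 1: [1.0, 0.2]}               # per-label weights
w_edge = {0: {0: 0.3, 1: -0.1}, 1: {0: -0.1, 1: 0.3}}  # transition weights

p_summary = crf_prob((1, 0, 1), feats, w_node, w_edge)
print(round(p_summary, 4))
```

Real implementations replace the brute-force Z(x) with the forward algorithm; this sketch only illustrates the distribution's form.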

See all papers in Proc. ACL 2012 that mention CRF.


sentence-level

Appears in 6 sentences as: Sentence-level (1) sentence-level (5)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. In this section, we give a detailed description of the different sentence-level cQA features and the contextual modeling between sentences used in our model for answer summarization.
    Page 3, “The Summarization Framework”
  2. Sentence-level Features
    Page 3, “The Summarization Framework”
  3. These sentence-level features can be easily utilized in the CRF framework.
    Page 4, “The Summarization Framework”
  4. ³We note that every sentence-level feature discussed in Section 3.2 presents a variety of instances (e.g., a sentence with a longer or shorter length is a different instance), and we may call it a sub-feature of the original sentence-level feature in the micro view.
    Page 6, “The Summarization Framework”
  5. which just utilize the independent sentence-level features, do not behave very well here, and there is no statistically significant performance difference between them.
    Page 8, “Experimental Results”
  6. To see how much the different textual and non-textual features contribute to community answer summarization, the accumulated weight of each group of sentence-level features5 is presented in Figure 2.
    Page 9, “Experimental Results”
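
Item 6 above reports the accumulated weight of each group of sentence-level features; the group L1 (group-lasso) penalty Σ_g ||w_g||₂ behind this groups the parameters tied to one feature and can zero out whole groups at once. A hedged sketch of its proximal operator (block soft-thresholding) with hypothetical weights and groups, not the paper's actual optimizer:

```python
import numpy as np

def group_l1_prox(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2: each parameter
    group is scaled toward zero, and groups whose L2 norm falls
    below lam are set exactly to zero (feature-group selection)."""
    w = w.copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        w[idx] = 0.0 if norm <= lam else w[idx] * (1 - lam / norm)
    return w

# Two toy feature groups: the strong group survives (shrunken),
# the weak group is eliminated entirely.
w = np.array([2.0, -1.5, 0.05, -0.03])
groups = [[0, 1], [2, 3]]
w_new = group_l1_prox(w, groups, lam=0.5)
print(w_new)  # weak group -> exactly zero
```

This is why group L1 performs feature selection at the granularity of whole sentence-level features rather than individual sub-feature weights.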


SVM

Appears in 6 sentences as: SVM (6)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. The experimental results show that the proposed model improves performance significantly (in terms of precision, recall and F1 measures) as well as in the ROUGE-1, ROUGE-2 and ROUGE-L measures compared to state-of-the-art methods such as Support Vector Machines ( SVM ), Logistic Regression (LR) and Linear CRF (LCRF) (Shen et al., 2007).
    Page 2, “Introduction”
  2. We adapt the Support Vector Machine ( SVM ) and Logistic Regression (LR), which have been reported to be effective for classification, and the Linear CRF (LCRF), which is used to summarize ordinary text documents in (Shen et al., 2007), as baselines for comparison.
    Page 7, “Experimental Results”
  3. Table 2 shows that our general CRF model based on question segmentation with group L1 regularization outperforms the baselines significantly in all three measures (gCRF-QS-l1 is 13.99% better than SVM in precision, 9.77% better in recall and 11.72% better in F1 score).
    Page 7, “Experimental Results”
  4. We note that both SVM and LR,
    Page 7, “Experimental Results”
  5. Model       Precision  Recall    F1
     SVM         65.93%     61.96%    63.88%
     LR          66.92%(-)  61.31%(-) 63.99%(-)
     LCRF        69.80%(↑)  63.91%(-) 66.73%(↑)
     gCRF        73.77%(↑)  69.43%(↑) 71.53%(↑)
     gCRF-QS     74.78%(↑)  72.51%(↑) 73.63%(↑)
     gCRF-QS-l1  79.92%(↑)  71.73%(-) 75.60%(↑)
    Page 8, “Experimental Results”
  6. Table 2: The Precision, Recall and F1 measures of the baselines SVM, LR, LCRF and our general CRF based models (gCRF, gCRF-QS, gCRF-QS-l1).
    Page 8, “Experimental Results”


F1 score

Appears in 3 sentences as: F1 score (3)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. In our experiments, we also compare the precision, recall and F1 score in the ROUGE-1, ROUGE-2 and ROUGE-L measures (Lin, 2004) for answer summarization performance.
    Page 7, “Experimental Setting”
  2. Table 2 shows that our general CRF model based on question segmentation with group L1 regularization outperforms the baselines significantly in all three measures (gCRF-QS-l1 is 13.99% better than SVM in precision, 9.77% better in recall and 11.72% better in F1 score).
    Page 7, “Experimental Results”
  3. It is observed that our gCRF-QS-l1 model improves precision, recall and F1 score on all three measures of ROUGE-1, ROUGE-2 and ROUGE-L by a significant margin compared to the other baselines, due to the use of local and nonlocal contextual factors and factors based on QS with group L1 regularization.
    Page 8, “Experimental Results”
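
The precision, recall and F1 comparisons quoted above reduce to standard counts over binary summary / non-summary labels. A small self-contained sketch (toy labels, not the paper's data):

```python
def prf1(gold, pred):
    """Precision, recall and F1 for binary labels, where 1 marks a
    summary sentence: tp / (tp+fp), tp / (tp+fn), and their
    harmonic mean."""
    tp = sum(g == 1 and p == 1 for g, p in zip(gold, pred))
    fp = sum(g == 0 and p == 1 for g, p in zip(gold, pred))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical gold vs. predicted summary labels for five sentences.
p, r, f = prf1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
print(p, r, f)
```

The ROUGE measures score n-gram overlap against reference summaries instead of per-sentence labels, but report the same precision/recall/F1 triple.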


Logistic Regression

Appears in 3 sentences as: Logistic Regression (2) logistic regression (1)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. The experimental results show that the proposed model improves performance significantly (in terms of precision, recall and F1 measures) as well as in the ROUGE-1, ROUGE-2 and ROUGE-L measures compared to state-of-the-art methods such as Support Vector Machines (SVM), Logistic Regression (LR) and Linear CRF (LCRF) (Shen et al., 2007).
    Page 2, “Introduction”
  2. predicting asker-rated quality of answers was evaluated by using a logistic regression model.
    Page 3, “Definitions and Related Work”
  3. We adapt the Support Vector Machine (SVM) and Logistic Regression (LR), which have been reported to be effective for classification, and the Linear CRF (LCRF), which is used to summarize ordinary text documents in (Shen et al., 2007), as baselines for comparison.
    Page 7, “Experimental Results”


semantic similarity

Appears in 3 sentences as: Semantic similarity (1) semantic similarity (2)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. Similarity to Question: Semantic similarity to the question and question context.
    Page 4, “The Summarization Framework”
  2. We compute the semantic similarity (Simpson and Crowe, 2005) between sentences or sub questions
    Page 4, “The Summarization Framework”
  3. ²We use the semantic similarity of Equation 2 for all our similarity measurements in this paper.
    Page 5, “The Summarization Framework”
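
The excerpts use the semantic similarity of Simpson and Crowe (2005) between sentences or sub questions. As a stand-in sketch only (not that metric, which incorporates lexical semantics), a bag-of-words cosine similarity shows the shape of such a sentence-pair score in [0, 1]:

```python
from collections import Counter
import math

def cosine_sim(s1, s2):
    """Cosine similarity between two sentences over raw token counts.
    A real semantic similarity would also credit related (non-identical)
    words; this toy version only matches exact tokens."""
    a, b = Counter(s1.lower().split()), Counter(s2.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical question / answer-sentence pair.
print(round(cosine_sim("how to fix a flat tire",
                       "fix a flat bicycle tire"), 3))
```

In the model, a score like this feeds the "similarity to question" sentence-level feature and the contextual factors between sentences.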


statistical significance

Appears in 3 sentences as: statistical significance (3) statistically significant (1)
In Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
  1. The uparrow denotes a performance improvement over the previous method (above) with statistical significance at a p-value of 0.05; the short line ’-’ denotes no statistically significant difference.
    Page 8, “Experimental Results”
  2. which just utilize the independent sentence-level features, do not behave very well here, and there is no statistically significant performance difference between them.
    Page 8, “Experimental Results”
  3. We also find that LCRF, which utilizes the local context information between sentences, performs better than the LR method in precision and F1 with statistical significance.
    Page 8, “Experimental Results”
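
The p < 0.05 comparisons above are pairwise significance tests between systems. A hedged sketch of one common choice for such comparisons, a paired permutation (randomization) test over per-question scores; the paper does not specify its exact test, and the scores below are toy values:

```python
import random

def paired_permutation_test(a, b, n_iter=10000, seed=0):
    """Two-sided paired permutation test: randomly flip the sign of
    each per-item score difference and count how often the mean
    absolute difference matches or exceeds the observed one."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs)) / len(diffs)
    hits = 0
    for _ in range(n_iter):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped)) / len(diffs) >= observed:
            hits += 1
    return hits / n_iter

# Hypothetical per-question F1 scores of two systems.
sys_a = [0.72, 0.68, 0.75, 0.70, 0.74, 0.69, 0.73, 0.71]
sys_b = [0.63, 0.65, 0.62, 0.66, 0.60, 0.64, 0.61, 0.65]
p_value = paired_permutation_test(sys_a, sys_b)
print(p_value < 0.05)
```

A consistent one-sided advantage across items, as here, drives the p-value low even with few samples; mixed-sign differences would not.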
