Conclusions | Evaluation results on human-annotated data showed that our summarized answers constitute a solid complement to the best answers voted by cQA users.
Experiments | We calculated ROUGE-1 and ROUGE-2 scores against human annotation on the filtered version of the dataset presented in Section 3.1.
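For reference, a minimal sketch of the ROUGE-N recall computation against a reference summary; the exact ROUGE configuration used in the paper (stemming, stopword removal, multiple references) is not given in this excerpt, and the example sentences below are made up.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """ROUGE-N recall: fraction of reference n-grams also found in the candidate."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())

# Made-up example: a system summary scored against a human-annotated reference.
candidate = "the battery lasts about ten hours on a full charge".split()
reference = "the battery lasts roughly ten hours per charge".split()
print(rouge_n_recall(candidate, reference, 1))  # ROUGE-1
print(rouge_n_recall(candidate, reference, 2))  # ROUGE-2
```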
The summarization framework | Our decision to proceed in an unsupervised direction came from the consideration that any use of external human annotation would have made it impracticable to build an actual system at a larger scale.
The summarization framework | A second approach, which made use of human annotation to learn a weight vector V = (v1, v2, v3, v4) that linearly combined the scores, was also investigated.
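Written out, the linear combination this describes would be as follows; this is only a sketch, and the symbols s_1, ..., s_4 for the individual component scores of an answer sentence a are our own notation rather than the paper's.

```latex
\mathrm{score}(a) \;=\; \sum_{j=1}^{4} v_j \, s_j(a),
\qquad V = (v_1, v_2, v_3, v_4)
```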
The summarization framework | In order to learn the weight vector V that would combine the above scores, we asked three human annotators to generate question-biased extractive summaries based on all answers available for a certain question. |
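One plausible way to fit such a weight vector from the annotators' extractive summaries is a least-squares regression from the component scores to a binary selected/not-selected target. The sketch below illustrates that idea under that assumption, with made-up numbers; the excerpt does not spell out the paper's actual learning procedure.

```python
import numpy as np

# Hypothetical data: each row holds the four component scores of a candidate
# answer sentence; y marks whether the human annotators included it in their
# question-biased extractive summary (1) or not (0).
X = np.array([
    [0.82, 0.40, 0.91, 0.10],
    [0.15, 0.22, 0.30, 0.05],
    [0.67, 0.55, 0.48, 0.20],
    [0.05, 0.10, 0.12, 0.02],
])
y = np.array([1.0, 0.0, 1.0, 0.0])

# Least-squares fit of the weight vector V = (v1, v2, v3, v4).
V, *_ = np.linalg.lstsq(X, y, rcond=None)

# Rank candidate sentences by the learned linear combination of their scores.
combined = X @ V
ranking = np.argsort(-combined)
print("weights:", V)
print("ranking:", ranking)
```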
Experimental Evaluation and Discussion | In our experiments, human annotators do not give labels.
Experimental Evaluation and Discussion | Figure 9 shows the same comparison in terms of the number of required queries to human annotators.
Experimental Evaluation and Discussion | Number of queries to human annotators