Index of papers in Proc. ACL 2014 that mention
  • manual evaluation
Mehdad, Yashar and Carenini, Giuseppe and Ng, Raymond T.
Abstract
Automatic and manual evaluation results over meeting, chat and email conversations show that our approach significantly outperforms baselines and previous extractive models.
Conclusion
Both automatic and manual evaluations of our model show substantial improvement over extraction-based methods, including Biased LexRank, which is considered a state-of-the-art system.
Experimental Setup
For manual evaluation of query-based abstracts (meeting and email datasets), we perform a simple user study assessing the following aspects: i) Overall quality given a query (5-point scale)?
Experimental Setup
For the manual evaluation, we compare our full system only with LexRank (LR) and Biased LexRank (Biased LR).
Experimental Setup
3.4.2 Manual Evaluation
Introduction
Automatic evaluation on the chat dataset and manual evaluation over the meetings and emails show that our system uniformly and statistically significantly outperforms baseline systems, as well as a state-of-the-art query-based extractive summarization system.
Manual evaluation is mentioned in 6 sentences in this paper.
Mitra, Sunny and Mitra, Ritwik and Riedl, Martin and Biemann, Chris and Mukherjee, Animesh and Goyal, Pawan
Abstract
Manual evaluation indicates that the algorithm could correctly identify 60.4% of the birth cases from a set of 48 randomly picked samples and 57% of the split/join cases from a set of 21 randomly picked samples.
Conclusions
Through manual evaluation, we found that the algorithm could correctly identify 60.4% of the birth cases from a set of 48 random samples and 57% of the split/join cases from a set of 21 randomly picked samples.
Evaluation framework
6.1 Manual evaluation
Evaluation framework
The accuracy as per manual evaluation was found to be 60.4% for the birth cases and 57% for the split/join cases.
Evaluation framework
… correspond to the candidate words; the words obtained in the cluster of each candidate word (we will henceforth use the term 'birth cluster' for these words), which indicated a new sense; the results of the manual evaluation; and the possible sense each birth cluster denotes.
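(A quick sanity check on these percentages, using our own arithmetic rather than counts reported in the excerpts: 0.604 × 48 ≈ 29 correctly identified birth cases and 0.57 × 21 ≈ 12 correctly identified split/join cases.)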
Manual evaluation is mentioned in 6 sentences in this paper.
Sun, Weiwei and Du, Yantao and Kou, Xin and Ding, Shuoyang and Wan, Xiaojun
Abstract
The reliability of this linguistically-motivated GR extraction procedure is highlighted by manual evaluation.
Conclusion
Manual evaluation demonstrates the effectiveness of our method.
GB-grounded GR Extraction
Table 1: Manual evaluation of 209 sentences.
GB-grounded GR Extraction
2.3 Manual Evaluation
GB-grounded GR Extraction
To gain a precise understanding of whether our extraction algorithm works well, we selected 20 files containing 209 sentences in total for manual evaluation.
Introduction
Manual evaluation highlights the reliability of our linguistically-motivated GR extraction algorithm: the overall dependency-based precision and recall are 99.17% and 98.87%.
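(For reference, dependency-based precision and recall here follow the standard set-overlap definitions; E and G below are our own labels for the extracted and gold-standard dependency sets, not notation taken from the paper: precision = |E ∩ G| / |E|, recall = |E ∩ G| / |G|.)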
Manual evaluation is mentioned in 6 sentences in this paper.