Abstract | Automatic and manual evaluation results over meeting, chat and email conversations show that our approach significantly outperforms baselines and previous extractive models. |
Conclusion | Both automatic and manual evaluations of our model show substantial improvement over extraction-based methods, including Biased LexRank, which is considered a state-of-the-art system.
Experimental Setup | For the manual evaluation of query-based abstracts (the meeting and email datasets), we perform a simple user study assessing the following aspects: i) overall quality given the query (rated on a 5-point scale).
Experimental Setup | For the manual evaluation, we compare our full system only with LexRank (LR) and Biased LexRank (Biased LR).
Experimental Setup | 3.4.2 Manual Evaluation |
Introduction | Automatic evaluation on the chat dataset and manual evaluation on the meeting and email datasets show that our system uniformly and statistically significantly outperforms baseline systems, as well as a state-of-the-art query-based extractive summarization system.
Abstract | Manual evaluation indicates that the algorithm correctly identifies 60.4% of the birth cases in a set of 48 randomly picked samples and 57% of the split/join cases in a set of 21 randomly picked samples.
Conclusions | Through manual evaluation we found that the algorithm correctly identified 60.4% of the birth cases in a set of 48 randomly picked samples and 57% of the split/join cases in a set of 21 randomly picked samples.
Evaluation framework | 6.1 Manual evaluation |
Evaluation framework | The accuracy according to the manual evaluation was 60.4% for the birth cases and 57% for the split/join cases.
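Evaluation framework | As a sanity check on these figures (assuming they are simple proportions of the manually inspected samples; the per-case counts below are inferred from the reported percentages, not stated in the source):
$$\text{birth cases: } \frac{29}{48} \approx 60.4\%, \qquad \text{split/join cases: } \frac{12}{21} \approx 57.1\% \approx 57\%$$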
Evaluation framework | The columns correspond to the candidate words; the words obtained in the cluster of each candidate word (we will use the term ‘birth cluster’ for these words henceforth), which indicated a new sense; the results of the manual evaluation; and the possible sense that each birth cluster denotes.
Abstract | The reliability of this linguistically motivated GR extraction procedure is highlighted by manual evaluation.
Conclusion | Manual evaluation demonstrates the effectiveness of our method.
GB-grounded GR Extraction | Table 1: Manual evaluation of 209 sentences. |
GB-grounded GR Extraction | 2.3 Manual Evaluation |
GB-grounded GR Extraction | To assess precisely whether our extraction algorithm works well, we selected 20 files containing 209 sentences in total for manual evaluation.
Introduction | Manual evaluation highlights the reliability of our linguistically motivated GR extraction algorithm: the overall dependency-based precision and recall are 99.17% and 98.87%, respectively.
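Introduction | The source does not reproduce its scoring formulas, but dependency-based precision and recall are conventionally defined over the sets of extracted and gold-standard dependencies (the set names $D_{\text{ext}}$ and $D_{\text{gold}}$ below are our notation, not the paper's):
$$P = \frac{|D_{\text{ext}} \cap D_{\text{gold}}|}{|D_{\text{ext}}|}, \qquad R = \frac{|D_{\text{ext}} \cap D_{\text{gold}}|}{|D_{\text{gold}}|}$$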