Data Annotation | Lin and Hovy (2002) previously showed that information overlap judgment is a difficult task for human annotators.
Data Annotation | For each n-gram, w, in a given headline, we check whether w is part of any nugget in either human annotation.
Data Annotation | Table 2 shows the unigram-, bigram-, and trigram-based average H between the two human annotators (Human1, Human2).
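The n-gram-vs-nugget check described above can be sketched as follows; the helper names and the overlap test are illustrative assumptions, not the paper's implementation (here a nugget is a token sequence, and an n-gram counts as covered if it appears contiguously inside any nugget):

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def in_any_nugget(ngram, nuggets):
    """True if the n-gram occurs contiguously inside any annotated nugget."""
    return any(
        ngram == tuple(nugget[i:i + len(ngram)])
        for nugget in nuggets
        for i in range(len(nugget) - len(ngram) + 1)
    )

# Hypothetical headline and nuggets for illustration only.
headline = "stocks fall on weak earnings".split()
nuggets = [["weak", "earnings", "report"], ["stocks", "fall"]]
covered = [g for g in ngrams(headline, 2) if in_any_nugget(g, nuggets)]
# covered bigrams: ('stocks', 'fall') and ('weak', 'earnings')
```

The per-headline coverage counts computed this way are what an agreement statistic between the two annotators would be built from.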
Corpus Creation | Human annotators often have different interpretations of the same sentence, and a speaker’s opinion/attitude is sometimes ambiguous.
Experiments | We use human-annotated dialogue acts (DAs) as the extraction units.
Experiments | The system-generated summaries are compared to human-annotated extractive and abstractive summaries.
Experiments | We also examined the system output and the human annotation and found several causes of the system errors:
Abstract | We study the cost/benefit tradeoff of using human annotators from different language backgrounds for the proposed evaluation metric, and examine whether providing the original source text helps.
Abstract | The correlation coefficient of the SRL-based evaluation metric driven by bilingual human annotators (0.351) is slightly better than that driven by monolingual human annotators (0.315); however, using bilinguals in the evaluation process is more costly than using monolinguals.
Abstract | The correlation coefficient of the SRL-based evaluation metric driven by bilingual human annotators who also see the source input sentences is 0.315, which is the same as that driven by monolingual human annotators.
UK and XP stand for unknown and X phrase, respectively. | (1) Number of tokens: If the number of tokens in a sentence differed between the human annotation and the system output, the sentence was excluded from the calculation.
UK and XP stand for unknown and X phrase, respectively. | This discrepancy occurred because the system's tokenization sometimes differed from that of the human annotators.
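The token-count filter described above can be sketched as follows; the function and variable names are illustrative assumptions, not taken from the paper:

```python
def comparable_pairs(gold_sents, system_sents):
    """Keep only sentence pairs whose tokenizations have the same length;
    pairs with mismatched token counts are excluded from the calculation."""
    return [
        (g, s)
        for g, s in zip(gold_sents, system_sents)
        if len(g) == len(s)
    ]

# Hypothetical tokenized sentences for illustration only.
gold = [["a", "b", "c"], ["x", "y"]]
system = [["a", "b", "c"], ["x", "y", "z"]]  # second pair: tokenization differs
kept = comparable_pairs(gold, system)
# only the first pair survives the filter
```

Filtering this way keeps the subsequent token-level evaluation well-defined, at the cost of silently dropping sentences where the tokenizers disagree.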
UK and XP stand for unknown and X phrase, respectively. | In this technique, transformation rules are learned by comparing the output of a POS tagger with the human annotation, so as to reduce the differences between the two.