SciSurf: Index of 'Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia'

Topics

manually annotated (4)
turkers (4)
Cosine similarity (3)
machine learning (3)
SVM (3)

Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

Daxenberger, Johannes and Gurevych, Iryna

Published in Proc. ACL, 2014

Article Structure

Abstract

In this study, we analyze links between edits in Wikipedia articles and turns from their discussion page.

Introduction

The process of user interaction in collaborative writing has been the topic of many studies in recent years (Erkens et al., 2005).

Edit-Turn-Pairs

In this section, we will define the basic units of our task, namely edits and turns.

Corpus

With the help of Amazon Mechanical Turk, we crowdsourced annotations on a corpus of edit-turn-pairs from 26 random English Wikipedia articles in various thematic categories.

Machine Learning with Edit-Turn-Pairs

We used DKPro TC (Daxenberger et al., 2014) to carry out the machine learning experiments on edit-turn-pairs.

Related Work

Besides the work by Ferschke et al.

Conclusion

The novelty of this paper is a computational analysis of the relationship between the edit history and the discussion of a Wikipedia article.

Topics

manually annotated

Appears in 4 sentences as: manually annotated (4)

In Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

We manually annotated a corpus of 636 corresponding and non-corresponding edit-turn-pairs.
Page 1, “Abstract”
To assess the reliability of these annotations, one of the coauthors manually annotated a random subset of 100 edit-turn-pairs contained in ETP-gold as corresponding or non-corresponding.
Page 3, “Corpus”
To test this system, we manually annotated a corpus of corresponding and non-corresponding edit-turn-pairs.
Page 5, “Conclusion”
With regard to future work, an extension of the manually annotated corpus is the most important issue.
Page 5, “Conclusion”

turkers

Appears in 4 sentences as: turkers (5)

In Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

It was important to find a reasonable amount of corresponding edit-turn-pairs before the actual annotation could take place, as we needed a certain amount of positive seeds to keep turkers from simply labeling pairs as non-corresponding all the time.
Page 2, “Corpus”
The resulting 750 pairs have each been annotated by five turkers.
Page 3, “Corpus”
The turkers were presented the turn text, the turn topic name, the edit in its context, and the edit comment (if present).
Page 3, “Corpus”
To select good turkers and to block spammers, we carried out a pilot study on a small portion of manually confirmed corresponding and non-corresponding pairs, and required turkers to pass a qualification test.
Page 3, “Corpus”
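
The sentences above say that each pair was labeled by five turkers but do not state how the five labels were combined. As a minimal sketch, assuming a simple majority vote with an optional agreement threshold (the function name `aggregate_votes` and the parameter `min_agreement` are hypothetical and not taken from the paper), the aggregation could look like this:

```python
from collections import Counter

def aggregate_votes(votes, min_agreement=3):
    """Return the most frequent label, or None if too few annotators agree.

    Assumption: majority voting; the paper's actual aggregation rule
    is not given in the sentences indexed here.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None

# Five hypothetical turker labels for one edit-turn-pair.
votes = ["corresponding", "corresponding", "non-corresponding",
         "corresponding", "non-corresponding"]
majority_label = aggregate_votes(votes)
strict_label = aggregate_votes(votes, min_agreement=4)
```

Raising `min_agreement` discards pairs on which the annotators disagree, which trades corpus size for label reliability.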

Cosine similarity

Appears in 3 sentences as: Cosine similarity (2) cosine similarity (1)

In Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

We used the cosine similarity, longest common subsequence, and word n-gram similarity measures.
Page 4, “Machine Learning with Edit-Turn-Pairs”
Cosine similarity was applied on binary weighted term vectors (L2 norm).
Page 4, “Machine Learning with Edit-Turn-Pairs”
Cosine similarity, longest common subsequence, and word n-gram similarity were also applied to measure the similarity between the edit comment and the turn text as well as the similarity between the edit comment and the turn topic name.
Page 4, “Machine Learning with Edit-Turn-Pairs”
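
To illustrate cosine similarity on binary weighted term vectors: when every term weight is 0 or 1, the dot product is the number of shared terms and each L2 norm is the square root of the number of distinct terms. The following is a minimal sketch under that reading; the tokenization (lowercased `\w+` matches) and the function names `binary_terms` and `binary_cosine` are assumptions for illustration, not the paper's implementation.

```python
import math
import re

def binary_terms(text):
    """Set of lowercased word tokens, i.e. the nonzero entries of a binary term vector."""
    return set(re.findall(r"\w+", text.lower()))

def binary_cosine(text_a, text_b):
    """Cosine similarity of two binary term vectors with L2 normalization."""
    a, b = binary_terms(text_a), binary_terms(text_b)
    if not a or not b:
        return 0.0  # an empty text shares no terms with anything
    # Dot product = number of shared terms; each norm = sqrt(number of distinct terms).
    return len(a & b) / math.sqrt(len(a) * len(b))

# Hypothetical edit comment and turn topic name.
score = binary_cosine("added source for birth date", "Birth date source")
```

The same function can be applied to any of the text pairs listed above, for example edit comment versus turn text.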

machine learning

Appears in 3 sentences as: machine learning (3)

In Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

With the help of supervised machine learning, we achieve an accuracy of .87 for this task.
Page 1, “Abstract”
We used DKPro TC (Daxenberger et al., 2014) to carry out the machine learning experiments on edit-turn-pairs.
Page 3, “Machine Learning with Edit-Turn-Pairs”
We have presented a machine learning system to automatically detect corresponding edit-turn-pairs.
Page 5, “Conclusion”

SVM

Appears in 3 sentences as: SVM (3)

In Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

                 Baseline       R. Forest       SVM
Accuracy         .799 ± .031    .866 ± .026†    .858 ± .027†
F1 mac.          NaN            .789 ± .032     .763 ± .033
Precision mac.
Page 4, “Machine Learning with Edit-Turn-Pairs”
A reduction of the feature set as judged by a χ² ranker improved the results for both the Random Forest and the SVM, so we limited our feature set to the 100 best features.
Page 4, “Machine Learning with Edit-Turn-Pairs”
In a 10-fold cross-validation experiment, we tested a Random Forest classifier (Breiman, 2001) and an SVM (Platt, 1998) with polynomial kernel.
Page 4, “Machine Learning with Edit-Turn-Pairs”
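
The χ² feature ranking mentioned above can be sketched as follows. This is an illustrative pure-Python version that assumes binary features and binary labels, where the statistic reduces to the standard 2×2 contingency formula; the names `chi2_score` and `top_k_features` are hypothetical, and this is not the ranker used in the paper's toolkit.

```python
def chi2_score(feature_column, labels):
    """Chi-squared statistic of one binary feature against a binary label."""
    n = len(labels)
    pairs = list(zip(feature_column, labels))
    a = sum(1 for f, y in pairs if f and y)          # feature on, positive class
    b = sum(1 for f, y in pairs if f and not y)      # feature on, negative class
    c = sum(1 for f, y in pairs if not f and y)      # feature off, positive class
    d = n - a - b - c                                # feature off, negative class
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # constant feature or single-class labels carry no signal
    return n * (a * d - b * c) ** 2 / denom

def top_k_features(rows, labels, k):
    """Indices of the k features with the highest chi-squared score."""
    n_features = len(rows[0])
    scores = [chi2_score([row[j] for row in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]

# Toy data: feature 0 matches the label exactly, feature 1 is independent of it.
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
labels = [1, 1, 0, 0]
best = top_k_features(rows, labels, k=1)
```

In a cross-validation setting, a ranking like this would normally be computed on the training folds only, so that the held-out fold does not influence which features are kept.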
