Index of papers in Proc. ACL 2014 that mention
  • Mechanical Turk
Martineau, Justin and Chen, Lu and Cheng, Doreen and Sheth, Amit
Abstract
In this paper we study a large, low quality annotated dataset, created quickly and cheaply using Amazon Mechanical Turk to crowd-source annotations.
Experiments
We then sent these tweets to Amazon Mechanical Turk for annotation.
Experiments
In order to evaluate our approach in real world scenarios, instead of creating a high quality annotated dataset and then introducing artificial noise, we followed the common practice of crowdsourcing, and collected emotion annotations through Amazon Mechanical Turk (AMT).
Experiments
Amazon Mechanical Turk Annotation: we posted the set of 100K tweets to the workers on AMT for emotion annotation.
Introduction
There are generally two ways to collect annotations of a dataset: through a few expert annotators, or through crowdsourcing services (e.g., Amazon’s Mechanical Turk).
Introduction
We employ Amazon’s Mechanical Turk (AMT) to label the emotions of Twitter data, and apply the proposed methods to the AMT dataset with the goals of improving the annotation quality at low cost, as well as learning accurate emotion classifiers.
Mechanical Turk is mentioned in 6 sentences in this paper.
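The excerpts above describe collecting emotion labels for 100K tweets from AMT workers and then trying to improve annotation quality at low cost. The paper's actual quality-improvement methods are not reproduced in this index; purely as an illustrative sketch, the snippet below aggregates several noisy worker labels per tweet by majority vote (all data, names, and the tie-handling policy are hypothetical).

```python
from collections import Counter

def aggregate_by_majority(worker_labels):
    """Aggregate noisy crowd labels by majority vote.

    worker_labels: dict mapping item id -> list of labels from different workers.
    Returns a dict mapping item id -> winning label; items whose top labels
    are tied are skipped as ambiguous.
    """
    aggregated = {}
    for item_id, labels in worker_labels.items():
        counts = Counter(labels).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            continue  # no clear majority among workers for this item
        aggregated[item_id] = counts[0][0]
    return aggregated

# Toy usage: three workers label each tweet with an emotion.
labels = {
    "tweet_1": ["joy", "joy", "anger"],
    "tweet_2": ["sadness", "joy", "anger"],  # three-way tie, dropped
    "tweet_3": ["fear", "fear", "fear"],
}
print(aggregate_by_majority(labels))  # {'tweet_1': 'joy', 'tweet_3': 'fear'}
```

Majority voting is only the simplest baseline for cleaning crowd labels; dropping tied items trades away some data for cleaner annotations.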
Nakashole, Ndapandula and Mitchell, Tom M.
Fact Candidates
3.1 Mechanical Turk Study
Fact Candidates
We deployed an annotation study on Amazon Mechanical Turk (MTurk), a crowdsourcing platform for tasks requiring human input.
Fact Candidates
For training and testing data, we used the labeled data from the Mechanical Turk study.
Introduction
A Mechanical Turk study we carried out revealed that there is a significant correlation between objectivity of language and trustworthiness of sources.
Introduction
To test this hypothesis, we designed a Mechanical Turk study.
Introduction
(3) Objectivity Classifier: Using labeled data from the Mechanical Turk study, we developed and trained an objectivity classifier which performed better than prior proposed lexicons from literature.
Mechanical Turk is mentioned in 6 sentences in this paper.
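The excerpts above mention an objectivity classifier trained on labeled data from the Mechanical Turk study and compared against prior lexicons. The paper's features and model are not given in this index; the sketch below is a generic stand-in only: a bag-of-words logistic regression over crowd-labeled sentences (scikit-learn assumed, example sentences and labels invented).

```python
# Minimal objectivity-classifier sketch (not the paper's model):
# bag-of-words logistic regression over crowd-labeled sentences.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical MTurk-labeled sentences: 1 = objective, 0 = subjective.
sentences = [
    "The company was founded in 1998 in Seattle.",
    "Honestly, this is the worst product I have ever used.",
    "The river is 480 kilometers long.",
    "I think everyone should absolutely love this movie.",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(sentences, labels)
print(clf.predict(["The study surveyed 2,000 participants."]))
```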
Yan, Rui and Gao, Mingkun and Pavlick, Ellie and Callison-Burch, Chris
Crowdsourcing Translation
This data set consists of 1,792 Urdu sentences from a variety of news and online sources, each paired with English translations provided by non-professional translators on Mechanical Turk.
Introduction
Rather than relying on volunteers or gamification, NLP research into crowdsourcing translation has focused on hiring workers on the Amazon Mechanical Turk (MTurk) platform (Callison-Burch, 2009).
Related work
Our setup uses anonymous crowd workers hired on Mechanical Turk, whose motivation to participate is financial.
Related work
Most NLP research into crowdsourcing has focused on Mechanical Turk, following pioneering work by Snow et al.
Related work
Although hiring professional translators to create bilingual training data for machine translation systems has been deemed infeasible, Mechanical Turk has provided a low cost way of creating large volumes of translations (Callison-Burch, 2009; Ambati and Vogel, 2010).
Mechanical Turk is mentioned in 5 sentences in this paper.
Soni, Sandeep and Mitra, Tanushree and Gilbert, Eric and Eisenstein, Jacob
Annotation
We used Amazon Mechanical Turk (AMT) to collect ratings of claims.
Introduction
This dataset was annotated by Mechanical Turk workers who gave ratings for the factuality of the scoped claims in each Twitter message.
Modeling factuality judgments
While these findings must be interpreted with caution, they suggest that readers — at least, Mechanical Turk workers — use relatively little independent judgment to assess the validity of quoted text that they encounter on Twitter.
Related work
(2012) conduct an empirical evaluation of FactBank ratings from Mechanical Turk workers, finding a high degree of disagreement between raters.
Mechanical Turk is mentioned in 4 sentences in this paper.
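The related-work excerpt notes a high degree of disagreement between Mechanical Turk raters on FactBank-style factuality ratings. A standard way to quantify such disagreement is a chance-corrected agreement statistic; the sketch below computes Fleiss' kappa from per-item rating counts (toy data and category names, not from the paper).

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i, j] = number of raters assigning category j
    to item i; every item must have the same number of ratings."""
    counts = np.asarray(counts, dtype=float)
    n_items, _ = counts.shape
    n_raters = counts[0].sum()
    p_j = counts.sum(axis=0) / (n_items * n_raters)  # category prevalence
    P_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_bar, P_e = P_i.mean(), np.square(p_j).sum()
    return (P_bar - P_e) / (1 - P_e)

# Toy ratings: 4 claims, 3 raters each, categories (certain, probable, uncertain).
ratings = [
    [3, 0, 0],
    [1, 2, 0],
    [0, 1, 2],
    [1, 1, 1],
]
print(round(fleiss_kappa(ratings), 3))  # low value indicates weak agreement
```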
Christensen, Janara and Soderland, Stephen and Bansal, Gagan and Mausam
Abstract
In an Amazon Mechanical Turk evaluation, users preferred SUMMA ten times as often as flat MDS and three times as often as timelines.
Experiments
We hired Amazon Mechanical Turk (AMT) workers and assigned two topics to each worker.
Introduction
We conducted an Amazon Mechanical Turk (AMT) evaluation where AMT workers compared the output of SUMMA to that of timelines and flat summaries.
Mechanical Turk is mentioned in 3 sentences in this paper.
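As a small illustrative calculation (not taken from the paper), a "preferred k times as often" pairwise ratio corresponds to a preference share of k / (k + 1), assuming every comparison yields exactly one winner:

```python
# Convert a "preferred k times as often" pairwise ratio into a preference share.
def preference_share(ratio):
    return ratio / (ratio + 1.0)

for system, ratio in [("flat MDS", 10), ("timelines", 3)]:
    print(f"SUMMA vs {system}: preferred in ~{preference_share(ratio):.0%} of comparisons")
# SUMMA vs flat MDS: preferred in ~91% of comparisons
# SUMMA vs timelines: preferred in ~75% of comparisons
```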
Labutov, Igor and Lipson, Hod
Abstract
Using an artificial language vocabulary, we evaluate a set of algorithms for generating code-switched text automatically by presenting it to Mechanical Turk subjects and measuring recall in a sentence completion task.
Experiments
We carried out experiments on the effectiveness of our approach using the Amazon Mechanical Turk platform.
Model
For collecting data about which words are likely to be “predicted” given their content, we developed an Amazon Mechanical Turk task that presented turkers with excerpts of a short story (English translation of “The Man who Repented” by
Mechanical Turk is mentioned in 3 sentences in this paper.
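The abstract excerpt evaluates generated code-switched text by measuring Mechanical Turk subjects' recall in a sentence completion task. The exact protocol is not reproduced in this index; the sketch below only illustrates one plausible recall measure, the fraction of held-out artificial-language words a subject fills in correctly (all words and names are invented).

```python
def completion_recall(target_words, subject_answers):
    """Recall = fraction of blanks whose gold filler the subject reproduces.

    target_words: correct filler for each blank; subject_answers: the subject's
    response for each blank, aligned by position."""
    hits = sum(1 for gold, answer in zip(target_words, subject_answers)
               if answer.strip().lower() == gold.lower())
    return hits / len(target_words)

targets = ["vash", "noro", "kelm"]   # hypothetical artificial-language words
answers = ["vash", "nor", "kelm"]
print(f"recall = {completion_recall(targets, answers):.2f}")  # recall = 0.67
```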
Tan, Chenhao and Lee, Lillian and Pang, Bo
Introduction
In an Amazon Mechanical Turk (AMT) experiment (§4), we found that humans achieved an average accuracy of 61.3%: not that high, but better than chance, indicating that it is somewhat possible for humans to predict greater message spread from different deliveries of the same information.
Introduction
We first ran a pilot study on Amazon Mechanical Turk (AMT) to determine whether humans can identify, based on wording differences alone, which of two topic- and author- controlled tweets is spread more widely.
Introduction
We outperform the average human accuracy of 61% reported in our Amazon Mechanical Turk experiments (for a different data sample); fiTAC+ff+time fails to do so.
Mechanical Turk is mentioned in 3 sentences in this paper.
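The introduction excerpts report an average human accuracy of 61.3% on a balanced two-way choice, described as better than chance. One simple way to check such a claim against the 50% chance level (not necessarily the authors' analysis) is a binomial test; the sketch below uses SciPy with an assumed number of judgments.

```python
from scipy.stats import binomtest

n_judgments = 500                       # hypothetical number of pairwise judgments
n_correct = round(0.613 * n_judgments)  # ~61.3% average human accuracy
result = binomtest(n_correct, n_judgments, p=0.5, alternative="greater")
print(f"{n_correct}/{n_judgments} correct, p = {result.pvalue:.2e}")
```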
Tibshirani, Julie and Manning, Christopher D.
Abstract
Annotation errors can significantly hurt classifier performance, yet datasets are only growing noisier with the increased use of Amazon Mechanical Turk and techniques like distant supervision that automatically generate labels.
Experiments
The data was created by taking various Wikipedia articles and giving them to five Amazon Mechanical Turkers to annotate.
Introduction
Low-quality annotations have become even more common in recent years with the rise of Amazon Mechanical Turk, as well as methods like distant supervision and co-training that involve automatically generating training data.
Mechanical Turk is mentioned in 3 sentences in this paper.