Abstract | Our experimental results closely match the Turkers’ response data, demonstrating that meanings can be learned from Web data and that such meanings can drive pragmatic inference. |
Analysis and discussion | Figure 5: Correlation between agreement among Turkers and whether the system gets the correct answer. |
Analysis and discussion | For each dialogue, we plot a circle at Turker response entropy and either 1 = correct inference or 0 = incorrect inference, except the points are jittered a little vertically to show where the mass of data lies. |
Analysis and discussion | …correlate almost perfectly with the Turkers’ responses.
Corpus description | Given a written dialogue between speakers A and B, Turkers were asked to judge what B’s answer conveys: ‘definite yes’, ‘probable yes’, ‘uncertain’, ‘probable no’, or ‘definite no’.
Corpus description | For each dialogue, we got answers from 30 Turkers, and we took the dominant response as the correct one, though we make extensive use of the full response distributions in evaluating our approach.2 We also computed entropy values for the distribution of answers for each item.
Corpus description | 2120 Turkers were involved (the median number of items done was 28 and the mean 56.5). |
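Corpus description | For concreteness, the dominant response and the response entropy can be computed directly from the raw label counts; the snippet below is a minimal sketch with invented responses, not the authors’ code.

```python
from collections import Counter
from math import log2

# Hypothetical labels from 30 Turkers for one dialogue.
responses = ["definite yes"] * 18 + ["probable yes"] * 9 + ["uncertain"] * 3

counts = Counter(responses)
dominant = counts.most_common(1)[0][0]  # taken as the correct answer

# Shannon entropy (in bits) of the response distribution:
# 0 means perfect agreement; higher values mean more disagreement.
total = sum(counts.values())
entropy = -sum((c / total) * log2(c / total) for c in counts.values())
print(dominant, round(entropy, 3))  # definite yes 1.295
```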
Evaluation and results | In the case of the scalar modifiers experiment, there were just two examples whose dominant response from the Turkers was ‘Uncertain’, so we have left that category out of the results. |
Evaluation and results | We count an inference as successful if it matches the dominant Turker response category. |
Evaluation | Due to the relatively high speed and low cost of Amazon’s Mechanical Turk service, we chose to use Mechanical Turkers as our annotators.
Evaluation | The first and most significant drawback is that it is impossible to force each Turker to label every data point without putting all the terms onto a single web page, which is highly impractical for a large taxonomy. |
Evaluation | Some Turkers may label every compound, but most do not. |
Taxonomy | We then embarked on a series of changes, testing each generation by annotation using Amazon’s Mechanical Turk service, a relatively quick and inexpensive online platform where requesters may publish tasks for anonymous online workers (Turkers) to perform.
Taxonomy | Turkers were asked to select one or, if they deemed it appropriate, two categories for each noun pair. |
Taxonomy | In addition to influencing the category definitions, some taxonomy groupings were altered with the hope that this would improve inter-annotator agreement for cases where Turker disagreement was systematic. |
Experiments | Our experimental procedure was as follows: 162 turkers were partitioned into four groups, each corresponding to a treatment condition: OPT (N=34), HF (N=41), RANDOM (N=43), MAN (N=44).
Experiments | Font size correlates with the score given by judge turkers when evaluating the guesses of other turkers who were presented with the same text, but with the word replaced by a blank.
Experiments | Turkers were solicited to participate in a study that involved “reading a short story with a twist” (title of HIT). |
Model | For collecting data about which words are likely to be “predicted” given their context, we developed an Amazon Mechanical Turk task that presented turkers with excerpts of a short story (English translation of “The Man who Repented” by …)
Model | Turkers were required to type in their best guess, and the number of semantically similar guesses was counted by an average of 6 other turkers.
Model | Turkers who judged the semantic similarity of the guesses of other turkers achieved an average Cohen’s kappa agreement of 0.44, indicating fair to poor agreement.
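Model | For reference, Cohen’s kappa corrects the raw agreement rate for agreement expected by chance; this gloss and notation are ours, not the paper’s:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

Model | Here $p_o$ is the observed proportion of agreement and $p_e$ is the proportion expected if the judges rated independently at their marginal rates, so $\kappa = 0.44$ means the judges closed 44% of the gap between chance agreement and perfect agreement.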
Abstract | …customer-generated truthful reviews, Turker-generated deceptive reviews, and employee (domain-expert) generated deceptive reviews.
Dataset Construction | 3.1 Turker set, using Mechanical Turk |
Dataset Construction | Anyone with basic programming skills can create Human Intelligence Tasks (HITs) and access a marketplace of anonymous online workers (Turkers) willing to complete the tasks.
Dataset Construction | …to create their dataset, such as restricting the task to Turkers who are located in the United States and who maintain an approval rating of at least 90%.
Experiments | Specifically, we reframe it as an intra-domain multi-class classification task, where, given the labeled training data from one domain, we learn a classifier to classify reviews according to their source, i.e., Employee, Turker, or Customer.
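Experiments | As an illustrative sketch of such a three-way source classifier (the bag-of-words features and logistic-regression learner here are generic stand-ins, not the paper’s actual model):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled reviews from a single domain (hotel).
reviews = [
    "Check-in was slow but the room was clean and quiet.",        # Customer
    "An unforgettable stay, pure luxury at every turn!",          # Turker
    "Our award-winning concierge team anticipates every need.",   # Employee
    "Decent breakfast, though the pool was closed for repairs.",  # Customer
    "This hotel is simply the best I have ever experienced!",     # Turker
    "Guests consistently praise our spa and signature dining.",   # Employee
]
labels = ["Customer", "Turker", "Employee"] * 2

# Unigram+bigram tf-idf features feeding a multinomial logistic regression.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(reviews, labels)
print(clf.predict(["The staff was friendly and the view was stunning."]))
```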
Feature-based Additive Model | If we instead used an SVM, for example, we would have to train classifiers one by one (due to the distinct features from different sources) to draw conclusions about the differences between Turker vs. Expert vs. truthful reviews, positive expert vs. negative expert reviews, or reviews from different domains.
Feature-based Additive Model | $y_{\text{source}} \in \{\text{employee}, \text{turker}, \text{customer}\}$
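Feature-based Additive Model | One way to write down such an additive formulation (a SAGE-style sketch in our own notation, not necessarily the paper’s exact parameterization) is to let each word’s log-probability decompose into a shared background term plus additive offsets for the review’s domain and source:

$$p(w \mid d, s) \propto \exp\left( m_w + \eta^{(d)}_w + \eta^{(s)}_w \right), \qquad s \in \{\text{employee}, \text{turker}, \text{customer}\}$$

Feature-based Additive Model | Because the source effect $\eta^{(s)}$ is shared across domains, a single fitted model supports direct comparisons such as Turker vs. Expert without training separate classifiers.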
Introduction | Despite the advantages of soliciting deceptive gold-standard material from Turkers (it is easy, large-scale, and affordable), it is unclear whether Turkers are representative of the general population that generates fake reviews; in other words, Ott et al.’s dataset may correspond to only one type of online deceptive opinion spam: fake reviews generated by people who have never visited the offerings or experienced the entities.
Introduction | In contrast to existing work (Ott et al., 2011; Li et al., 2013b), our new gold standard includes three types of reviews: domain expert deceptive opinion spam (Employee), crowdsourced deceptive opinion spam (Turker), and truthful Customer reviews (Customer).
Related Work | …created a gold-standard collection by employing Turkers to write fake reviews, and follow-up research was based on their data (Ott et al., 2012; Ott et al., 2013; Li et al., 2013b; Feng and Hirst, 2013).
Crowdsourcing Translation | 52 different Turkers took part in the translation task, each translating 138 sentences on average. |
Crowdsourcing Translation | In the editing task, 320 Turkers participated, averaging 56 sentences each. |
Problem Formulation | We form two graphs: the first graph ($G_T$) represents Turkers (translator/editor pairs) as nodes; the second graph ($G_O$) represents candidate translated and edited sentences as nodes.
Problem Formulation | $G_T = (V_T, E_T)$ is a weighted undirected graph representing collaborations between Turkers.
Problem Formulation | The mutual reinforcement framework couples the two random walks on $G_T$ and $G_O$, which in isolation rank Turkers and candidates separately.
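Problem Formulation | A minimal sketch of one such coupled iteration (the matrices, damping factor, and update form below are our own illustration of the mutual-reinforcement idea, not the authors’ exact equations):

```python
import numpy as np

def normalize(M):
    """Row-normalize a nonnegative matrix into transition probabilities."""
    s = M.sum(axis=1, keepdims=True)
    return np.divide(M, s, out=np.zeros_like(M, dtype=float), where=s > 0)

def co_rank(W_T, W_O, A, lam=0.85, iters=100):
    """Couple random walks over G_T (Turkers) and G_O (candidates).

    W_T: Turker-Turker collaboration weights, shape (n_turkers, n_turkers)
    W_O: candidate-candidate similarity weights, shape (n_cands, n_cands)
    A:   Turker-candidate authorship weights, shape (n_turkers, n_cands)
    """
    P_T, P_O = normalize(W_T), normalize(W_O)
    T2C, C2T = normalize(A), normalize(A.T)  # cross-graph transitions
    r_t = np.full(W_T.shape[0], 1.0 / W_T.shape[0])  # Turker scores
    r_o = np.full(W_O.shape[0], 1.0 / W_O.shape[0])  # candidate scores
    for _ in range(iters):
        # Each walk mixes its own graph with evidence from the other side.
        r_t = lam * (P_T.T @ r_t) + (1 - lam) * (C2T.T @ r_o)
        r_o = lam * (P_O.T @ r_o) + (1 - lam) * (T2C.T @ r_t)
        r_t, r_o = r_t / r_t.sum(), r_o / r_o.sum()
    return r_t, r_o
```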
Related work | …different Turkers for a collection of Urdu sentences that had been previously professionally translated by the Linguistic Data Consortium.
Related work | They also hired US-based Turkers to edit the translations, since the translators were largely based in Pakistan and exhibited errors that are characteristic of speakers of English as a second language.
Experimental Results 11 | About 300 unique Turkers participated in the evaluation tasks.
Experimental Results 11 | Otherwise we treat them as ambiguous cases.17 Figure 3 shows a part of the AMT task, where Turkers are presented with questions that help them determine the subtle connotative polarity of each word, and are then asked to rate the degree of connotation on a scale from -5 (most negative) to 5 (most positive).
Experimental Results 11 | 17 We allow Turkers to mark words that can be used with both positive and negative connotations, which results in about 7% of words being excluded from the gold-standard set.
Dataset Construction and Human Performance | Crowdsourcing services such as AMT have made large-scale data annotation and collection efforts financially affordable by granting anyone with basic programming skills access to a marketplace of anonymous online workers (known as Turkers) willing to complete small tasks.
Dataset Construction and Human Performance | To ensure that opinions are written by unique authors, we allow only a single submission per Turker . |
Dataset Construction and Human Performance | We also restrict our task to Turkers who are located in the United States, and who maintain an approval rating of at least 90%. |
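Dataset Construction and Human Performance | As a concrete illustration, both restrictions map onto MTurk’s built-in qualification requirements; the sketch below uses today’s boto3 client (an API that postdates the paper), and the task metadata and question file are hypothetical:

```python
import boto3

# MTurk system qualification type IDs for worker locale and approval rate.
LOCALE_QUAL = "00000000000000000071"
APPROVAL_QUAL = "000000000000000000L0"

mturk = boto3.client("mturk", region_name="us-east-1")
mturk.create_hit(
    Title="Write a hotel review",                 # hypothetical task metadata
    Description="Write one review of the hotel named in the instructions.",
    Reward="1.00",
    MaxAssignments=20,   # a worker may complete a given HIT only once, so one
                         # HIT with many assignments yields unique authors
    AssignmentDurationInSeconds=1800,
    LifetimeInSeconds=86400,
    Question=open("review_task.xml").read(),      # hypothetical question XML
    QualificationRequirements=[
        {"QualificationTypeId": LOCALE_QUAL, "Comparator": "EqualTo",
         "LocaleValues": [{"Country": "US"}]},
        {"QualificationTypeId": APPROVAL_QUAL,
         "Comparator": "GreaterThanOrEqualTo", "IntegerValues": [90]},
    ],
)
```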
Human Perception of Biased Language | Turkers were shown Wikipedia’s definition of a “biased statement” and two example sentences that illustrated the two types of bias, framing and epistemological. |
Human Perception of Biased Language | Before seeing the 10 sentences, turkers were asked to list the languages they spoke, as well as their primary language in primary school.
Human Perception of Biased Language | On average, it took turkers about four minutes to complete each HIT. |
Evaluation | To evaluate the quality of types assigned to emerging entities, we presented turkers with sentences from the news tagged with out-of-KB entities and the types inferred by the methods under test. |
Evaluation | The turkers’ task was to assess the correctness of types assigned to an entity mention.
Evaluation | To make the task easy for the turkers to understand, we combined the extracted entity and type into a sentence.
Annotation | To ensure quality control, we required the Turkers to have at least an 85% HIT approval rating and to reside in the United States, because the Twitter messages in our dataset were related to American politics.
Annotation | We obtained five independent ratings from Turkers satisfying the above qualifications.
Annotation | We also allowed a “Not Applicable” option to capture ratings where the Turkers did not have sufficient knowledge about the statement or where the statement was not really a claim.
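Annotation | A minimal sketch of how five ratings with a “Not Applicable” option might be aggregated (the label names, abstention handling, and majority threshold are our own choices):

```python
from collections import Counter

def aggregate(ratings, min_valid=3):
    """Majority label over Turker ratings, treating 'Not Applicable' as abstention."""
    valid = [r for r in ratings if r != "Not Applicable"]
    if len(valid) < min_valid:  # too few informed ratings to decide
        return None
    label, count = Counter(valid).most_common(1)[0]
    return label if count > len(valid) / 2 else None  # require a strict majority

# Hypothetical ratings for one claim from five Turkers.
print(aggregate(["Support", "Support", "Deny", "Support", "Not Applicable"]))
# -> Support
```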
Corpus | It was important to find a reasonable number of corresponding edit-turn-pairs before the actual annotation could take place, as we needed a certain number of positive seeds to keep turkers from simply labeling pairs as non-corresponding all the time.
Corpus | The resulting 750 pairs have each been annotated by five turkers . |
Corpus | The turkers were presented with the turn text, the turn topic name, the edit in its context, and the edit comment (if present).
Evaluation 11: Human Evaluation on ConnotationWordNet | We first describe the labeling process for sense-level connotation: we selected 350 polysemous words and one of their senses, and each Turker was asked to rate the connotative polarity of a given word (or of a given sense) from -5 to 5, with 0 being neutral.7 For each word, we asked 5 Turkers to rate it, and we took the average of the 5 ratings as the connotative intensity score of the word.
Evaluation 11: Human Evaluation on ConnotationWordNet | 7 Because senses in WordNet can be tricky to understand, care should be taken in designing the task so that the Turkers will focus only on the corresponding sense of a word.
Evaluation 11: Human Evaluation on ConnotationWordNet | As an incentive, each Turker was rewarded $0.07 per HIT, which consisted of 10 words to label.