Towards a General Rule for Identifying Deceptive Opinion Spam
Li, Jiwei and Ott, Myle and Cardie, Claire and Hovy, Eduard

Article Structure

Abstract

Consumers’ purchase decisions are increasingly influenced by user-generated online reviews.

Introduction

Consumers increasingly rely on user-generated online reviews when making purchase decisions (Cone, 2011; Ipsos, 2012).

Related Work

Spam has been historically studied in the contexts of Web text (Gyöngyi et al., 2004; Ntoulas et al., 2006) or email (Drucker et al., 1999).

Dataset Construction

In this section, we report our efforts to gather gold-standard opinion spam datasets.

Feature-based Additive Model

In this section, we briefly describe our model.

Experiments

In this section, we report our experimental results.

General Linguistic Cues of Deceptive Opinion Spam

In this section, we examine a number of general POS and LIWC features that may shed light on a general rule for identifying deceptive opinion

Conclusion and Discussion

In this work, we have developed a multi-domain large-scale dataset containing gold-standard deceptive opinion spam.

Acknowledgement

We thank Wenjie Li and Xun Wang for useful discussions and suggestions.

Topics

Turker

Appears in 19 sentences as: Turker (12) turker (1) Turkers (7)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. customer generated truthful reviews, Turker generated deceptive reviews and employee (domain-expert) generated deceptive reviews.
    Page 1, “Abstract”
  2. Despite the advantages of soliciting deceptive gold-standard material from Turkers (it is easy, large-scale, and affordable), it is unclear whether Turkers are representative of the general population that generate fake reviews, or in other words, Ott et al.’s data set may correspond to only one type of online deceptive opinion spam — fake reviews generated by people who have never been to offerings or experienced the entities.
    Page 1, “Introduction”
  3. In contrast to existing work (Ott et al., 2011; Li et al., 2013b), our new gold standard includes three types of reviews: domain expert deceptive opinion spam (Employee), crowdsourced deceptive opinion spam (Turker), and truthful Customer reviews (Customer).
    Page 2, “Introduction”
  4. created a gold-standard collection by employing Turkers to write fake reviews, and followup research was based on their data (Ott et al., 2012; Ott et al., 2013; Li et al., 2013b; Feng and Hirst, 2013).
    Page 3, “Related Work”
  5. 3.1 Turker set, using Mechanical Turk
    Page 3, “Dataset Construction”
  6. Anyone with basic programming skills can create Human Intelligence Tasks (HITs) and access a marketplace of anonymous online workers (Turkers) willing to complete the tasks.
    Page 3, “Dataset Construction”
  7. to create their dataset, such as restricting task to Turkers located in the United States, and who maintain an approval rating of at least 90%.
    Page 3, “Dataset Construction”
  8. Doctor-Turker: We gathered a total number of 200 positive reviews from Turkers.
    Page 3, “Dataset Construction”
  9. If we instead use SVM, for example, we would have to train classifiers one by one (due to the distinct features from different sources) to draw conclusions regarding the differences between Turker vs Expert vs truthful reviews, positive expert vs negative expert reviews, or reviews from different domains.
    Page 4, “Feature-based Additive Model”
  10. $y_{\text{source}} \in \{\text{employee}, \text{turker}, \text{customer}\}$
    Page 4, “Feature-based Additive Model”
  11. Specifically, we reframe it as an intra-domain multi-class classification task, where given the labeled training data from one domain, we learn a classifier to classify reviews according to their source, i.e., Employee, Turker and Customer.
    Page 5, “Experiments”

See all papers in Proc. ACL 2014 that mention Turker.

Unigram

Appears in 9 sentences as: Unigram (8) unigram (1)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. In the examples in Table 1, we trained a linear SVM classifier on Ott’s Chicago-hotel dataset on unigram features and tested it on a couple of different domains (the details of data acquisition are illustrated in Section 3).
    Page 2, “Introduction”
  2. Table 1: SVM performance on datasets for a classifier trained on Chicago hotel reviews based on Unigram features.
    Page 2, “Introduction”
  3. We train the OVR classifier on three sets of features: LIWC, Unigram, and POS.
    Page 5, “Experiments”
  4. In particular, the three-class classifier is around 65% accurate at distinguishing between Employee, Customer, and Turker for each of the domains using Unigram, significantly higher than random guess.
    Page 5, “Experiments”
  5. Best performance is achieved on Unigram features, consistently outperforming LIWC and POS features in both three-class and two-class settings in the hotel domain.
    Page 5, “Experiments”
  6. Again, we explore 3 feature sets: LIWC, Unigram and POS.
    Page 5, “Experiments”
  7. Among three types of features, Unigram still performs best.
    Page 6, “Experiments”
  8. In the doctor domain, we observe that models trained on Unigram features from the hotels domain do not generalize well to doctor reviews, and the performance is a little bit better than random guess with only 0.55 accuracy.
    Page 6, “Experiments”
  9. For SVM, models trained on POS and LIWC features achieve even lower accuracy than Unigram.
    Page 6, “Experiments”
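
The excerpts above refer to Unigram features without spelling them out. As a minimal sketch (not the authors' pipeline), unigram features can be read as per-review word counts. The tokenizer regex, the helper names `unigram_features` and `vocabulary_overlap`, and the two toy reviews are all assumptions made for illustration; the overlap measure is only a rough way to see how much of a target domain's vocabulary a hotel-trained unigram model would have been exposed to.

```python
import re
from collections import Counter


def unigram_features(text):
    """Lowercase the review, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


def vocabulary_overlap(train_reviews, test_reviews):
    """Fraction of test-domain token occurrences whose unigram appears in training."""
    train_vocab = set()
    for review in train_reviews:
        train_vocab.update(unigram_features(review))

    seen = 0
    total = 0
    for review in test_reviews:
        for token, count in unigram_features(review).items():
            total += count
            if token in train_vocab:
                seen += count
    return seen / total if total else 0.0


# Hypothetical toy reviews, not taken from the dataset.
hotel_reviews = ["The room was clean and the staff was friendly."]
doctor_reviews = ["The doctor was friendly and explained the diagnosis."]

overlap = vocabulary_overlap(hotel_reviews, doctor_reviews)
print(f"share of doctor-review tokens seen in hotel reviews: {overlap:.3f}")
```

Domain-specific words such as "doctor" and "diagnosis" never occur in the toy hotel data, so a classifier keyed on hotel unigrams has no weight for them.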

gold-standard

Appears in 8 sentences as: gold-standard (8)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. Existing approaches for spam detection are usually focused on developing supervised learning-based algorithms to help users identify deceptive opinion spam, which are highly dependent upon high-quality gold-standard labeled data (Jindal and Liu, 2008; Jindal et al., 2010; Lim et al., 2010; Wang et al., 2011; Wu et al., 2010).
    Page 1, “Introduction”
  2. Despite the advantages of soliciting deceptive gold-standard material from Turkers (it is easy, large-scale, and affordable), it is unclear whether Turkers are representative of the general population that generate fake reviews, or in other words, Ott et al.’s data set may correspond to only one type of online deceptive opinion spam — fake reviews generated by people who have never been to offerings or experienced the entities.
    Page 1, “Introduction”
  3. One contribution of the work presented here is the creation of the cross-domain (i.e., Hotel, Restaurant and Doctor) gold-standard dataset.
    Page 2, “Introduction”
  4. created a gold-standard collection by employing Turkers to write fake reviews, and followup research was based on their data (Ott et al., 2012; Ott et al., 2013; Li et al., 2013b; Feng and Hirst, 2013).
    Page 3, “Related Work”
  5. In this section, we report our efforts to gather gold-standard opinion spam datasets.
    Page 3, “Dataset Construction”
  6. Due to the difficulty in obtaining gold-standard data in the literature, there is no doubt that our data set is not perfect.
    Page 4, “Dataset Construction”
  7. In this work, we have developed a multi-domain large-scale dataset containing gold-standard deceptive opinion spam.
    Page 9, “Conclusion and Discussion”
  8. However, it is still very difficult to estimate the practical impact of such methods, as it is very challenging to obtain gold-standard data in the real world.
    Page 9, “Conclusion and Discussion”

SVM

Appears in 6 sentences as: SVM (6)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. In the examples in Table 1, we trained a linear SVM classifier on Ott’s Chicago-hotel dataset on unigram features and tested it on a couple of different domains (the details of data acquisition are illustrated in Section 3).
    Page 2, “Introduction”
  2. Table 1: SVM performance on datasets for a classifier trained on Chicago hotel reviews based on Unigram features.
    Page 2, “Introduction”
  3. If we instead use SVM, for example, we would have to train classifiers one by one (due to the distinct features from different sources) to draw conclusions regarding the differences between Turker vs Expert vs truthful reviews, positive expert vs negative expert reviews, or reviews from different domains.
    Page 4, “Feature-based Additive Model”
  4. We use SVMlight (Joachims, 1999) to train our linear SVM classifiers
    Page 5, “Experiments”
  5. For SVM, models trained on POS and LIWC features achieve even lower accuracy than Unigram.
    Page 6, “Experiments”
  6. tive model, SAGE achieves much better results than SVM, and is around 0.65 accurate in the cross-domain task.
    Page 6, “Experiments”

binary classification

Appears in 3 sentences as: binary classification (1) binary classifications (1) binary classifiers (1)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. We report both OVR performance and the performance of three One-versus-One binary classifiers, trained to distinguish between each pair of classes.
    Page 5, “Experiments”
  2. We also observe that each of the three One-versus-One binary classifications performs significantly better than chance, suggesting that Employee, Customer, and Turker are in fact three different classes.
    Page 5, “Experiments”
  3. For simplicity, we focus on truthful (Customer) versus deceptive (Turker) binary classification rather than a multi-class classification.
    Page 5, “Experiments”
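
The two multi-class setups mentioned above, One-versus-Rest (OVR) and One-versus-One, differ only in how the three source classes are split into binary problems. The sketch below enumerates both decompositions and shows the usual OVR decision rule of picking the most confident binary scorer. It illustrates the general technique, not the authors' code; the function names and the example scores are hypothetical.

```python
from itertools import combinations

CLASSES = ["employee", "turker", "customer"]


def one_vs_one_pairs(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def one_vs_rest_tasks(classes):
    """One binary problem per class: that class against all the others."""
    return [(c, [other for other in classes if other != c]) for c in classes]


def predict_one_vs_rest(scores):
    """Return the class whose binary scorer gives the highest score.

    `scores` maps each class name to the real-valued output of its
    class-versus-rest classifier (for example an SVM decision value).
    """
    return max(scores, key=scores.get)


print(one_vs_one_pairs(CLASSES))
print(one_vs_rest_tasks(CLASSES))
# Hypothetical decision values for one review.
print(predict_one_vs_rest({"employee": -0.2, "turker": 0.7, "customer": 0.1}))
```

With three classes both decompositions happen to yield three binary problems; the count diverges as soon as there are more classes.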

gold standard

Appears in 3 sentences as: gold standard (3)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. In this paper, we explore generalized approaches for identifying online deceptive opinion spam based on a new gold standard dataset, which is comprised of data from three different domains (i.e.
    Page 1, “Abstract”
  2. In contrast to existing work (Ott et al., 2011; Li et al., 2013b), our new gold standard includes three types of reviews: domain expert deceptive opinion spam (Employee), crowdsourced deceptive opinion spam (Turker), and truthful Customer reviews (Customer).
    Page 2, “Introduction”
  3. (2010) propose an alternative strategy to detect deceptive opinion spam in the absence of a gold standard.
    Page 3, “Related Work”

topic models

Appears in 3 sentences as: topic models (3)
In Towards a General Rule for Identifying Deceptive Opinion Spam
  1. (2011), which can be viewed as a combination of topic models (Blei et al., 2003) and generalized additive models (Hastie and Tibshirani, 1990).
    Page 3, “Related Work”
  2. Unlike other derivatives of topic models, SAGE drops the Dirichlet-multinomial assumption and adopts a Laplacian prior, triggering sparsity in topic-word distribution.
    Page 3, “Related Work”
  3. where we assume $y_{\text{sentiment}}$ and $y_{\text{domain}}$ are given for each document $d$. Note that we assume conditional independence between features and words given $y$, similar to other topic models (Blei et al., 2003).
    Page 5, “Feature-based Additive Model”
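
The excerpts describe SAGE as combining topic models with generalized additive models and using a sparsity-inducing Laplacian prior. A common reading of that idea, assumed here rather than confirmed by the excerpts, is that each word's log-probability is a shared background term plus a sum of sparse, label-specific deviations, renormalized with a softmax. The sketch below shows only that additive composition, with no learning and no prior; the function name, vocabulary, and numbers are hypothetical.

```python
import math


def sage_word_distribution(background, deviations):
    """Softmax over background log-frequencies plus summed sparse deviations.

    `background` maps every vocabulary word to a log-frequency.
    `deviations` is a list of sparse dicts (one per label, e.g. source,
    sentiment, domain); words missing from a dict have zero deviation.
    """
    logits = {
        word: base + sum(dev.get(word, 0.0) for dev in deviations)
        for word, base in background.items()
    }
    max_logit = max(logits.values())  # subtract the max for numerical stability
    exps = {word: math.exp(value - max_logit) for word, value in logits.items()}
    normalizer = sum(exps.values())
    return {word: value / normalizer for word, value in exps.items()}


# Hypothetical three-word vocabulary and deviation vectors.
background = {"room": math.log(0.5), "my": math.log(0.3), "diagnosis": math.log(0.2)}
source_deviation = {"my": 0.8}          # sparse: only one word shifts
domain_deviation = {"diagnosis": 1.2}   # sparse: only one word shifts

base = sage_word_distribution(background, [])
shifted = sage_word_distribution(background, [source_deviation, domain_deviation])
print(base)
print(shifted)
```

Because the deviations add in log space, one vector per label value is enough to describe any combination of source, sentiment, and domain, which is the contrast the excerpts draw with training separate SVM classifiers one by one.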
