Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
Dasgupta, Sajib and Ng, Vincent

Article Structure

Abstract

Supervised polarity classification systems are typically domain-specific.

Introduction

Sentiment analysis has recently received a lot of attention in the Natural Language Processing (NLP) community.

Spectral Clustering

In this section, we give an overview of spectral clustering, which is at the core of our algorithm for identifying ambiguous reviews.
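
As a rough, self-contained illustration of the two-way spectral partition involved (a sketch only; the cosine similarity measure, the sign-based split, and all variable names are assumptions rather than the paper's exact formulation), the split can be read off the second eigenvector of a normalized graph Laplacian built from a review similarity matrix:

# Sketch: 2-way spectral partition of reviews from a feature matrix X
# (n_reviews x n_features). Cosine similarity and the sign split of the
# second eigenvector are illustrative choices, not taken from the paper.
import numpy as np

def spectral_partition(X):
    Xn = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    S = Xn @ Xn.T                                      # cosine similarity matrix
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.clip(S.sum(axis=1), 1e-12, None)))
    L = np.eye(len(S)) - D_inv_sqrt @ S @ D_inv_sqrt   # normalized Laplacian
    _, eigvecs = np.linalg.eigh(L)                     # eigenvalues in ascending order
    v2 = eigvecs[:, 1]                                 # second-smallest eigenvector
    return (v2 > 0).astype(int), v2                    # cluster labels, eigenvector values

Reviews whose second-eigenvector value lies near zero sit between the two clusters, which is the intuition behind treating such points as ambiguous in the later steps.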

Our Approach

While spectral clustering addresses a major drawback of k-means clustering, it still cannot be expected to accurately partition the reviews due to the presence of ambiguous reviews.

Evaluation

4.1 Experimental Setup

Conclusions

We have proposed a novel semi-supervised approach to polarity classification.

Topics

labeled data

Appears in 14 sentences as: labeled data (14)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. Experimental results on five sentiment classification datasets demonstrate that our system can generate high-quality labeled data from unambiguous reviews, which, together with a small number of manually labeled reviews selected by the active learner, can be used to effectively classify ambiguous reviews in a discriminative fashion.
    Page 2, “Introduction”
  2. However, in the absence of labeled data, it is not easy to assess feature relevance.
    Page 4, “Our Approach”
  3. Even if labeled data were present, the ambiguous points might be better handled by a discriminative learning system than a clustering algorithm, as discriminative learners are more sophisticated, and can handle ambiguous feature space more effectively.
    Page 4, “Our Approach”
  4. In self-training, we iteratively train a classifier on the data labeled so far, use it to classify the unlabeled instances, and augment the labeled data with the most confidently labeled instances.
    Page 4, “Our Approach” (see the sketch after this list)
  5. In fact, owing to the absence of labeled data, unsupervised clustering algorithms are unable to distinguish between useful and irrelevant features for polarity classification.
    Page 5, “Our Approach”
  6. Each classifier Ci will then be trained transductively, using the 100 manually labeled points and the points in Li as labeled data, and the remaining points (including all
    Page 6, “Our Approach”
  7. Owing to the randomness involved in the choice of labeled data, all baseline results are averaged over ten independent runs for each fold.
    Page 6, “Evaluation”
  8. We implemented Kamvar et al.’s (2003) semi-supervised spectral clustering algorithm, which incorporates labeled data into the clustering framework in the form of must-link and cannot-link constraints.
    Page 6, “Evaluation”
  9. We employ as our second baseline a transductive SVM trained using 100 points randomly sampled from the training folds as labeled data and the remaining 1900 points as unlabeled data.
    Page 7, “Evaluation”
  10. Active learning is the best of the three baselines, presumably because it has the ability to choose the labeled data more intelligently than the other two.
    Page 7, “Evaluation”
  11. Specifically, rows 4 and 5 show the results of the SVM classifier when it is trained on the labeled data obtained after the first step (unsupervised extraction of unambiguous reviews) and the second step (active learning), respectively.
    Page 7, “Evaluation”
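
The self-training procedure quoted in item 4 can be written as a short loop. The sketch below uses a scikit-learn logistic regression and a fixed confidence threshold purely as illustrative stand-ins; neither choice comes from the paper.

# Self-training sketch: retrain, classify the unlabeled pool, and keep only
# the most confidently labeled instances. Classifier and threshold are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    clf = None
    for _ in range(max_rounds):
        clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        probs = clf.predict_proba(X_unlab)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break
        # Augment the labeled data with the most confidently labeled instances.
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, clf.classes_[probs[confident].argmax(axis=1)]])
        X_unlab = X_unlab[~confident]
    return clf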

SVM

Appears in 13 sentences as: SVM (14)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. Specifically, we train a discriminative classifier using the support vector machine (SVM) learning algorithm (Joachims, 1999) on the set of unambiguous reviews, and then apply the resulting classifier to all the reviews in the training folds that are not seeds.
    Page 5, “Our Approach”
  2. As our weakly supervised learner, we employ a transductive SVM.
    Page 6, “Our Approach”
  3. Hence, instead of training just one SVM classifier, we aim to reduce classification errors by training an ensemble of five classifiers, each of which uses all 100 manually labeled reviews and a different subset of the 500 automatically labeled reviews.
    Page 6, “Our Approach”
  4. Transductive SVM.
    Page 7, “Evaluation”
  5. Specifically, we begin by training an inductive SVM on one labeled example from each class, iteratively labeling the most uncertain unlabeled point on each side of the hyperplane and retraining the SVM until 100 points are labeled.
    Page 7, “Evaluation” (see the sketch after this list)
  6. Finally, we train a transductive SVM on the 100 labeled points and the remaining 1900 unlabeled points, obtaining the results in row 3 of Table 1.
    Page 7, “Evaluation”
  7. Specifically, rows 4 and 5 show the results of the SVM classifier when it is trained on the labeled data obtained after the first step (unsupervised extraction of unambiguous reviews) and the second step (active learning), respectively.
    Page 7, “Evaluation”
  8. All the SVM classifiers in this paper are trained using the SVMlight package (Joachims, 1999).
    Page 7, “Evaluation”
  9. Specifically, we used the 500 seeds to guide the selection of active learning points, but trained a transductive SVM using only the active learning points as labeled data (and the rest as unlabeled data).
    Page 7, “Evaluation”
  10. We also experimented with training a transductive SVM using only the 100 least ambiguous seeds (i.e., the points with the largest unsigned
    Page 7, “Evaluation”
  11. Remember that SVM uses only the support vectors to acquire the hyperplane, and since an unambiguous seed is likely to be far away from the hyperplane, it is less likely to be a support vector.
    Page 8, “Evaluation”
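
Item 5 describes uncertainty-based active learning around the SVM hyperplane. A minimal sketch, with scikit-learn's LinearSVC standing in for the SVMlight classifiers actually used and with a label array simulating the human oracle, might look like this:

# Active-learning sketch: start from one labeled example per class, then
# repeatedly query the most uncertain unlabeled point on each side of the
# hyperplane until the labeling budget (100 points in the paper) is spent.
import numpy as np
from sklearn.svm import LinearSVC   # stand-in for the paper's SVMlight

def active_learn(X, oracle_labels, seed_idx, budget=100):
    labeled = list(seed_idx)
    unlabeled = [i for i in range(len(X)) if i not in labeled]
    while len(labeled) < budget and unlabeled:
        clf = LinearSVC().fit(X[labeled], oracle_labels[labeled])
        margin = dict(zip(unlabeled, clf.decision_function(X[unlabeled])))
        pos = [i for i in unlabeled if margin[i] >= 0]
        neg = [i for i in unlabeled if margin[i] < 0]
        picks = []
        if pos:
            picks.append(min(pos, key=lambda i: abs(margin[i])))   # most uncertain, positive side
        if neg:
            picks.append(min(neg, key=lambda i: abs(margin[i])))   # most uncertain, negative side
        for i in picks:                  # these are the points sent to the annotator
            labeled.append(i)
            unlabeled.remove(i)
    return labeled                       # indices of the manually labeled points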

semi-supervised

Appears in 8 sentences as: Semi-supervised (1) semi-supervised (7)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. To address this problem, we propose a semi-supervised approach to sentiment classification where we first mine the unambiguous reviews using spectral techniques and then exploit them to classify the ambiguous reviews via a novel combination of active learning, transductive learning, and ensemble learning.
    Page 1, “Abstract”
  2. In light of the difficulties posed by ambiguous reviews, we differentiate between ambiguous and unambiguous reviews in our classification process by addressing the task of semi-supervised polarity classification via a “mine the easy, classify the hard” approach.
    Page 2, “Introduction”
  3. Recall that the goal of this step is not only to identify the unambiguous reviews, but also to annotate them as POSITIVE or NEGATIVE, so that they can serve as seeds for semi-supervised learning in a later step.
    Page 4, “Our Approach”
  4. Semi-supervised spectral clustering.
    Page 6, “Evaluation”
  5. We implemented Kamvar et al.’s (2003) semi-supervised spectral clustering algorithm, which incorporates labeled data into the clustering framework in the form of must-link and cannot-link constraints.
    Page 6, “Evaluation” (see the sketch after this list)
  6. As we can see, accuracy ranges from 57.3% to 68.7% and ARI ranges from 0.02 to 0.14, which are significantly better than those of semi-supervised spectral learning.
    Page 7, “Evaluation”
  7. We have proposed a novel semi-supervised approach to polarity classification.
    Page 8, “Conclusions”
  8. Since the semi-supervised learner is discriminative, our approach can adopt a richer representation that makes use of more sophisticated features such as bigrams or manually labeled sentiment-oriented words.
    Page 8, “Conclusions”
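
For the Kamvar et al. (2003) baseline in item 5, must-link and cannot-link constraints are typically imposed by overwriting entries of the affinity matrix before the spectral step. The snippet below is a simplified sketch of that idea, not a faithful reproduction of either Kamvar et al.'s algorithm or the authors' implementation.

# Sketch: inject pairwise constraints into an affinity matrix A before
# running spectral clustering on it (simplified view of Kamvar et al. 2003).
import numpy as np

def apply_constraints(A, must_link, cannot_link):
    A = A.copy()
    for i, j in must_link:        # same-label pair: force maximal affinity
        A[i, j] = A[j, i] = 1.0
    for i, j in cannot_link:      # different-label pair: force zero affinity
        A[i, j] = A[j, i] = 0.0
    return A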

unlabeled data

Appears in 7 sentences as: unlabeled data (7)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. Given that we now have a labeled set (composed of 100 manually labeled points selected by active learning and 500 unambiguous points) as well as a larger set of points that are yet to be labeled (i.e., the remaining unlabeled points in the training folds and those in the test fold), we aim to train a better classifier by using a weakly supervised learner to learn from both the labeled and unlabeled data.
    Page 6, “Our Approach” (see the sketch after this list)
  2. points in Lj, where i ≠ j) as unlabeled data.
    Page 6, “Our Approach”
  3. Since the points in the test fold are included in the unlabeled data, they are all classified in this step.
    Page 6, “Our Approach”
  4. We employ as our second baseline a transductive SVM trained using 100 points randomly sampled from the training folds as labeled data and the remaining 1900 points as unlabeled data.
    Page 7, “Evaluation”
  5. This could be attributed to (1) the unlabeled data, which may have provided the transductive learner with useful information that are not accessible to the other learners, and (2) the ensemble, which is more noise-tolerant to the imperfect seeds.
    Page 7, “Evaluation”
  6. Specifically, we used the 500 seeds to guide the selection of active learning points, but trained a transductive SVM using only the active learning points as labeled data (and the rest as unlabeled data).
    Page 7, “Evaluation”
  7. second eigenvector values) in combination with the active learning points as labeled data (and the rest as unlabeled data).
    Page 8, “Evaluation”
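
Item 1, together with item 6 under "labeled data", describes the weakly supervised step: five classifiers, each trained on the 100 manually labeled points plus one subset Li of the 500 automatically labeled seeds, with everything else treated as unlabeled. The sketch below substitutes an inductive LinearSVC for the transductive SVM (which scikit-learn does not provide) purely to show the ensemble structure; in the paper each member is a transductive SVM trained with SVMlight.

# Structural sketch of the ensemble: partition the auto-labeled seeds into
# five subsets L_1..L_5, train member i on the manual labels plus L_i, and
# combine members by majority vote on the remaining (unlabeled) points.
import numpy as np
from sklearn.svm import LinearSVC   # inductive stand-in for a transductive SVM

def ensemble_classify(X_manual, y_manual, X_seed, y_seed, X_rest, n_members=5):
    subsets = np.array_split(np.arange(len(X_seed)), n_members)
    votes = np.zeros((n_members, len(X_rest)), dtype=int)
    for m, idx in enumerate(subsets):
        X_train = np.vstack([X_manual, X_seed[idx]])
        y_train = np.concatenate([y_manual, y_seed[idx]])
        votes[m] = LinearSVC().fit(X_train, y_train).predict(X_rest)
    # Majority vote over the five members (labels assumed to be 0/1).
    return (votes.mean(axis=0) >= 0.5).astype(int)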

sentiment classification

Appears in 5 sentences as: sentiment classification (5)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. To address this problem, we propose a semi-supervised approach to sentiment classification where we first mine the unambiguous reviews using spectral techniques and then exploit them to classify the ambiguous reviews via a novel combination of active learning, transductive learning, and ensemble learning.
    Page 1, “Abstract”
  2. Experimental results on five sentiment classification datasets demonstrate that our system can generate high-quality labeled data from unambiguous reviews, which, together with a small number of manually labeled reviews selected by the active learner, can be used to effectively classify ambiguous reviews in a discriminative fashion.
    Page 2, “Introduction”
  3. Section 2 gives an overview of spectral clustering, which will facilitate the presentation of our approach to unsupervised sentiment classification in Section 3.
    Page 2, “Introduction”
  4. For evaluation, we use five sentiment classification datasets, including the widely-used movie review dataset [MOV] (Pang et al., 2002) as well as four datasets that contain reviews of four different types of product from Amazon [books (BOO), DVDs (DVD), electronics (ELE), and kitchen appliances (KIT)] (Blitzer et al., 2007).
    Page 6, “Evaluation”
  5. First, none of the steps in our approach is designed specifically for sentiment classification.
    Page 8, “Conclusions”

iteratively

Appears in 3 sentences as: iteratively (3)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. In self-training, we iteratively train a classifier on the data labeled so far, use it to classify the unlabeled instances, and augment the labeled data with the most confidently labeled instances.
    Page 4, “Our Approach”
  2. In our algorithm, we start with an initial clustering of all of the data points, and then iteratively remove the α most ambiguous points from the dataset and cluster the remaining points.
    Page 4, “Our Approach” (see the sketch after this list)
  3. Specifically, we begin by training an inductive SVM on one labeled example from each class, iteratively labeling the most uncertain unlabeled point on each side of the hyperplane and retraining the SVM until 100 points are labeled.
    Page 7, “Evaluation”
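
Item 2 is the mining loop itself. Taking a point's ambiguity to be the closeness of its second-eigenvector value to zero (consistent with the "largest unsigned second eigenvector values" being called the least ambiguous seeds under the SVM topic), one plausible rendering, reusing the spectral_partition sketch from the Spectral Clustering section, is the following; alpha, the number of rounds, and the ambiguity measure are illustrative assumptions.

# Sketch: iteratively drop the alpha most ambiguous points and re-cluster
# the remainder; the surviving points are the "unambiguous" seeds.
import numpy as np

def mine_unambiguous(X, partition_fn, alpha=50, n_rounds=5):
    keep = np.arange(len(X))
    for _ in range(n_rounds):
        if len(keep) <= alpha:
            break
        _, v2 = partition_fn(X[keep])        # e.g. spectral_partition above
        order = np.argsort(np.abs(v2))       # smallest |v2| = most ambiguous
        keep = keep[order[alpha:]]           # remove the alpha most ambiguous points
    labels, _ = partition_fn(X[keep])
    return keep, labels                      # unambiguous review indices and their cluster labels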

sentiment analysis

Appears in 3 sentences as: Sentiment analysis (1) sentiment analysis (2)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. Sentiment analysis has recently received a lot of attention in the Natural Language Processing (NLP) community.
    Page 1, “Introduction”
  2. Polarity classification, whose goal is to determine whether the sentiment expressed in a document is “thumbs up” or “thumbs down”, is arguably one of the most popular tasks in document-level sentiment analysis.
    Page 1, “Introduction”
  3. (2007) have investigated a model for jointly performing sentence- and document-level sentiment analysis, allowing the relationship between the two tasks to be captured and exploited.
    Page 1, “Introduction”

text classification

Appears in 3 sentences as: text classification (3)
In Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
  1. Unlike topic-based text classification, where a high accuracy can be achieved even for datasets with a large number of classes (e.g., 20 Newsgroups), polarity classification appears to be a more difficult task.
    Page 1, “Introduction”
  2. One reason topic-based text classification is easier than polarity classification is that topic clusters are typically well-separated from each other, resulting from the fact that word usage differs considerably between two topically-different documents.
    Page 1, “Introduction”
  3. This makes it applicable to other text classification tasks.
    Page 8, “Conclusions”
