Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
Bollegala, Danushka and Weir, David and Carroll, John

Article Structure

Abstract

We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains.

Introduction

Users express opinions about products or services they consume in blog posts, shopping sites, or review sites.

A Motivating Example

To explain the problem of cross-domain sentiment classification, consider the reviews shown in Table 1 for the domains books and kitchen appliances.

Sentiment Sensitive Thesaurus

One solution to the feature mismatch problem outlined above is to use a thesaurus that groups different words that express the same sentiment.
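One way to picture such a thesaurus is as a mapping from each lexical element to other elements ranked by how similarly they behave across domains. A minimal, hypothetical fragment (entries and ordering are purely illustrative, echoing the delicious/excellent example quoted later on this page):

```python
# Hypothetical fragment of a sentiment sensitive thesaurus: each base entry
# maps to lexical elements ranked by their relatedness to it. With such a
# mapping, a kitchen review containing "delicious" can be expanded with
# "excellent", a word a books-domain classifier has seen in training.
thesaurus: dict[str, list[str]] = {
    "delicious": ["excellent", "tasty", "perfect"],
    "boring": ["dull", "disappointing", "slow"],
}
```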

Feature Expansion

Our feature expansion phase augments a feature vector with additional related features selected from the sentiment-sensitive thesaurus.

Experiments

5.1 Dataset

Related Work

Compared to single-domain sentiment classification, which has been studied extensively in previous work (Pang and Lee, 2008; Turney, 2002), cross-domain sentiment classification has only recently received attention in response to advances in the area of domain adaptation.

Conclusions

We have described and evaluated a method to construct a sentiment-sensitive thesaurus to bridge the gap between source and target domains in cross-domain sentiment classification using multiple source domains.

Topics

sentiment classification

Appears in 31 sentences as: Sentiment Classification (1) Sentiment classification (1) sentiment classification (22) sentiment classifier (8) sentiment classifiers (2)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains.
    Page 1, “Abstract”
  2. Unlike previous cross-domain sentiment classification methods, our method can efficiently learn from multiple source domains.
    Page 1, “Abstract”
  3. Our method significantly outperforms numerous baselines and returns results that are better than or comparable to previous cross-domain sentiment classification methods on a benchmark dataset containing Amazon user reviews for different types of products.
    Page 1, “Abstract”
  4. Automatic document level sentiment classification (Pang et al., 2002; Turney, 2002) is the task of classifying a given review with respect to the sentiment expressed by the author of the review.
    Page 1, “Introduction”
  5. For example, a sentiment classifier might classify a user review about a movie as positive or negative depending on the sentiment
    Page 1, “Introduction”
  6. Sentiment classification has been applied in numerous tasks such as opinion mining (Pang and Lee, 2008), opinion summarization (Lu et al., 2009), contextual advertising (Fan and Chang, 2010), and market analysis (Hu and Liu, 2004).
    Page 1, “Introduction”
  7. Supervised learning algorithms that require labeled data have been successfully used to build sentiment classifiers for a specific domain (Pang et al., 2002).
    Page 1, “Introduction”
  8. However, sentiment is expressed differently in different domains, and it is costly to annotate data for each new domain in which we would like to apply a sentiment classifier.
    Page 1, “Introduction”
  9. Work in cross-domain sentiment classification (Blitzer et al., 2007) focuses on the challenge of training a classifier from one or more domains (source domains) and applying the trained classifier in a different domain (target domain).
    Page 1, “Introduction”
  10. A cross-domain sentiment classification system must overcome two main challenges.
    Page 1, “Introduction”
  11. Following previous work, we define cross-domain sentiment classification as the problem of learning a binary classifier (i.e.
    Page 2, “Introduction”

labeled data

Appears in 16 sentences as: labeled data (18)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains.
    Page 1, “Abstract”
  2. Supervised learning algorithms that require labeled data have been successfully used to build sentiment classifiers for a specific domain (Pang et al., 2002).
    Page 1, “Introduction”
  3. positive or negative sentiment) given a small set of labeled data for the source domain, and unlabeled data for both source and target domains.
    Page 2, “Introduction”
  4. In particular, no labeled data is provided for the target domain.
    Page 2, “Introduction”
  5. We use labeled data from multiple source domains and unlabeled data from source and target domains to represent the distribution of features.
    Page 2, “Introduction”
  6. Unlabeled data is cheaper to collect compared to labeled data and is often available in large quantities.
    Page 2, “Introduction”
  7. Figure 3: Effect of source domain labeled data.
    Page 7, “Experiments”
  8. To investigate the impact of the quantity of source domain labeled data on our method, we vary the amount of data from zero to 800 reviews, with equal amounts of positive and negative labeled data.
    Page 7, “Experiments”
  9. Note that source domain labeled data is used both to create the sentiment sensitive thesaurus as well as to train the sentiment classifier.
    Page 7, “Experiments”
  10. positive vs. negative sentiment), a random classifier that does not utilize any labeled data would report a 50% classification accuracy.
    Page 7, “Experiments”
  11. From Figure 3, we see that when we increase the amount of source domain labeled data, the accuracy increases quickly.
    Page 7, “Experiments”

unlabeled data

Appears in 16 sentences as: Unlabeled data (1) unlabeled data (18)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. We automatically create a sentiment sensitive thesaurus using both labeled and unlabeled data from multiple source domains to find the association between words that express similar sentiments in different domains.
    Page 1, “Abstract”
  2. positive or negative sentiment) given a small set of labeled data for the source domain, and unlabeled data for both source and target domains.
    Page 2, “Introduction”
  3. We use labeled data from multiple source domains and unlabeled data from source and target domains to represent the distribution of features.
    Page 2, “Introduction”
  4. Unlabeled data is cheaper to collect compared to labeled data and is often available in large quantities.
    Page 2, “Introduction”
  5. The use of unlabeled data enables us to accurately estimate the distribution of words in source and target domains.
    Page 2, “Introduction”
  6. Our method can learn from a large amount of unlabeled data to leverage a robust cross-domain sentiment classifier.
    Page 2, “Introduction”
  7. Figure 4: Effect of source domain unlabeled data.
    Page 7, “Experiments”
  8. The amount of unlabeled data is held constant, so that any change in classification accuracy
    Page 7, “Experiments”
  9. Figure 5: Effect of target domain unlabeled data.
    Page 7, “Experiments”
  10. To study the effect of source and target domain unlabeled data on the performance of our method, we create sentiment sensitive thesauri using different proportions of unlabeled data.
    Page 7, “Experiments”
  11. The amount of labeled data is held constant and is balanced across multiple domains as outlined in Section 5.1, so any changes in classification accuracy can be directly attributed to the contribution of unlabeled data.
    Page 7, “Experiments”

feature vector

Appears in 15 sentences as: feature vector (10) feature vectors (6)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. The created thesaurus is then used to expand feature vectors to train a binary classifier.
    Page 1, “Abstract”
  2. a unigram or a bigram of word lemma) in a review using a feature vector.
    Page 2, “Introduction”
  3. We model the cross-domain sentiment classification problem as one of feature expansion, where we append additional related features to feature vectors that represent source and target domain reviews in order to reduce the mismatch of features between the two domains.
    Page 2, “Introduction”
  4. thesaurus to expand feature vectors in a binary classifier at train and test times by introducing related lexical elements from the thesaurus.
    Page 2, “Introduction”
  5. (However, the method is agnostic to the properties of the classifier and can be used to expand feature vectors for any binary classifier).
    Page 2, “Introduction”
  6. We describe a method to use the created thesaurus to expand feature vectors at train and test times in a binary classifier.
    Page 2, “Introduction”
  7. For example, if we know that both excellent and delicious are positive sentiment words, then we can use this knowledge to expand a feature vector that contains the word delicious using the word excellent, thereby reducing the mismatch between features in a test instance and a trained model.
    Page 3, “Sentiment Sensitive Thesaurus”
  8. Let us denote the value of a feature w in the feature vector u representing a lexical element u by f(u, w). The vector u can be seen as a compact representation of the distribution of a lexical element u over the set of features that co-occur with u in the reviews.
    Page 4, “Sentiment Sensitive Thesaurus”
  9. From the construction of the feature vector u described in the previous paragraph, it follows that w can be either a sentiment feature or another lexical element that co-occurs with u in some review sentence.
    Page 4, “Sentiment Sensitive Thesaurus”
  10. Next, for two lexical elements u and v (represented by feature vectors u and v, respectively), we compute the relatedness τ(v, u) of the feature v to the feature u as follows,
    Page 4, “Sentiment Sensitive Thesaurus”
  11. Relatedness of a lexical element u to another lexical element v is the fraction of feature weights in the feature vector for the element u that also co-occur with the features in the feature vector for the element v.
    Page 4, “Sentiment Sensitive Thesaurus”
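The relatedness measure quoted in items 8–11 can be made concrete with a short sketch. Reading the description literally, τ(v, u) is the fraction of the feature weight in u's vector that falls on features also present in v's vector. This is a sketch under that assumption only; the excerpts above do not specify how the weights f(u, w) themselves are computed, and the toy values below are hypothetical:

```python
def relatedness(u: dict[str, float], v: dict[str, float]) -> float:
    """tau(v, u): the fraction of feature weight in u's vector that also
    appears (with nonzero weight) in v's vector, per items 8-11 above.
    u and v map each feature w to its weight f(u, w) or f(v, w)."""
    total = sum(u.values())
    if total == 0.0:
        return 0.0
    shared = sum(weight for feature, weight in u.items() if feature in v)
    return shared / total

# Toy vectors: half of u's total weight lies on the one feature shared
# with v, so tau(v, u) = 0.5. The measure is asymmetric by construction.
u = {"excellent": 2.0, "broad": 1.0, "read": 1.0}
v = {"excellent": 1.0, "delicious": 3.0}
print(relatedness(u, v))  # 0.5
```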

unigrams

Appears in 14 sentences as: unigram (5) unigrams (11)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. a unigram or a bigram of word lemma) in a review using a feature vector.
    Page 2, “Introduction”
  2. (unigrams) development, civilization
    Page 3, “A Motivating Example”
  3. We select unigrams and bigrams from each sentence.
    Page 3, “Sentiment Sensitive Thesaurus”
  4. For the remainder of this paper, we will refer to unigrams and bigrams collectively as lexical elements.
    Page 3, “Sentiment Sensitive Thesaurus”
  5. Previous work on sentiment classification has shown that both unigrams and bigrams are useful for training a sentiment classifier (Blitzer et al., 2007).
    Page 3, “Sentiment Sensitive Thesaurus”
  6. {w_1, ..., w_N}, where the elements w_i are either unigrams or bigrams that appear in the review d. We then represent a review d by a real-valued term-frequency vector d ∈ ℝ^N, where the value of the j-th element d_j is set to the total number of occurrences of the unigram or bigram w_j in the review d. To find the suitable candidates to expand a vector d for the review d, we define a ranking score score(u_i, d) for each base entry in the thesaurus as follows:
    Page 5, “Feature Expansion”
  7. Moreover, we weight the relatedness scores for each word w_j by its normalized term-frequency to emphasize the salient unigrams and bigrams in a review.
    Page 5, “Feature Expansion”
  8. This is particularly important because we would like to score base entries u_i considering all the unigrams and bigrams that appear in a review d, instead of considering each unigram or bigram individually.
    Page 5, “Feature Expansion”
  9. We then extend the original set of unigrams and bigrams {w_1, ..., w_N}
    Page 5, “Feature Expansion”
  10. The values of the first N dimensions that correspond to unigrams and bigrams w_i that occur in the review d are set to d_i, their frequency in d. The subsequent k dimensions that correspond to the top-ranked base
    Page 5, “Feature Expansion”
  11. Instead, we consider all unigrams and bigrams in d when selecting the base entries for expansion.
    Page 5, “Feature Expansion”
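Items 6–8 above rank thesaurus base entries against a whole review, weighting each lexical element's relatedness by its normalized term frequency, and items 9–10 append the top k entries as extra dimensions. The sketch below follows that description; the value stored in each appended dimension is an assumption (the excerpts do not state it), and the toy relatedness function is hypothetical:

```python
def score(base_entry: str, counts: dict[str, int], tau) -> float:
    """score(u_i, d): sum over lexical elements w_j in review d of the
    relatedness tau(w_j, u_i), weighted by w_j's normalized term frequency."""
    total = sum(counts.values())
    return sum((c / total) * tau(w, base_entry) for w, c in counts.items())

def expand(counts: dict[str, int], base_entries: list[str], tau, k: int = 3) -> dict:
    """Extend the term-frequency vector d to d' by appending the k
    top-scoring base entries as extra dimensions (the ranking score is
    used as the appended value; an assumption of this sketch)."""
    ranked = sorted(base_entries, key=lambda u: score(u, counts, tau), reverse=True)
    expanded = dict(counts)
    for u in ranked[:k]:
        expanded["expanded:" + u] = score(u, counts, tau)
    return expanded

# Toy relatedness: 1.0 for identical elements or one hypothetical pair.
def tau(w: str, u: str) -> float:
    return 1.0 if w == u or {w, u} == {"delicious", "excellent"} else 0.0

review = {"delicious": 2, "food": 1}
print(expand(review, ["excellent", "boring"], tau, k=1))
# {'delicious': 2, 'food': 1, 'expanded:excellent': 0.666...}
```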

bigrams

Appears in 13 sentences as: bigram (4) bigrams (11)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. a unigram or a bigram of word lemma) in a review using a feature vector.
    Page 2, “Introduction”
  2. (bigrams) survey+development, development+civilization
    Page 3, “A Motivating Example”
  3. We select unigrams and bigrams from each sentence.
    Page 3, “Sentiment Sensitive Thesaurus”
  4. For the remainder of this paper, we will refer to unigrams and bigrams collectively as lexical elements.
    Page 3, “Sentiment Sensitive Thesaurus”
  5. Previous work on sentiment classification has shown that both unigrams and bigrams are useful for training a sentiment classifier (Blitzer et al., 2007).
    Page 3, “Sentiment Sensitive Thesaurus”
  6. {w_1, ..., w_N}, where the elements w_i are either unigrams or bigrams that appear in the review d. We then represent a review d by a real-valued term-frequency vector d ∈ ℝ^N, where the value of the j-th element d_j is set to the total number of occurrences of the unigram or bigram w_j in the review d. To find the suitable candidates to expand a vector d for the review d, we define a ranking score score(u_i, d) for each base entry in the thesaurus as follows:
    Page 5, “Feature Expansion”
  7. Moreover, we weight the relatedness scores for each word w_j by its normalized term-frequency to emphasize the salient unigrams and bigrams in a review.
    Page 5, “Feature Expansion”
  8. This is particularly important because we would like to score base entries u_i considering all the unigrams and bigrams that appear in a review d, instead of considering each unigram or bigram individually.
    Page 5, “Feature Expansion”
  9. We then extend the original set of unigrams and bigrams {w_1, ..., w_N}
    Page 5, “Feature Expansion”
  10. The values of the first N dimensions that correspond to unigrams and bigrams w_i that occur in the review d are set to d_i, their frequency in d. The subsequent k dimensions that correspond to the top-ranked base
    Page 5, “Feature Expansion”
  11. Instead, we consider all unigrams and bigrams in d when selecting the base entries for expansion.
    Page 5, “Feature Expansion”

binary classifier

Appears in 10 sentences as: binary classification (1) binary classifier (9)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. The created thesaurus is then used to expand feature vectors to train a binary classifier.
    Page 1, “Abstract”
  2. Following previous work, we define cross-domain sentiment classification as the problem of learning a binary classifier (i.e.
    Page 2, “Introduction”
  3. thesaurus to expand feature vectors in a binary classifier at train and test times by introducing related lexical elements from the thesaurus.
    Page 2, “Introduction”
  4. (However, the method is agnostic to the properties of the classifier and can be used to expand feature vectors for any binary classifier).
    Page 2, “Introduction”
  5. We describe a method to use the created thesaurus to expand feature vectors at train and test times in a binary classifier.
    Page 2, “Introduction”
  6. Using the extended vectors d’ to represent reviews, we train a binary classifier from the source domain labeled reviews to predict positive and negative sentiment in reviews.
    Page 5, “Feature Expansion”
  7. Because this is a binary classification task (i.e.
    Page 7, “Experiments”
  8. We simply train a binary classifier using unigrams and bigrams as features from the labeled reviews in the source domains and apply the trained classifier on the target domain.
    Page 8, “Experiments”
  9. After selecting salient features, the SCL algorithm is used to train a binary classifier.
    Page 8, “Experiments”
  10. For the Within-Domain baseline, we train a binary classifier using the labeled data from the target domain.
    Page 8, “Experiments”
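Items 1–6 above state that the expanded vectors d' are used to train a binary classifier on the source-domain labeled reviews, and that the method is agnostic to the choice of classifier. A minimal end-to-end sketch with scikit-learn, where logistic regression stands in for "any binary classifier" (the excerpts do not name a learner) and an identity expansion keeps the example self-contained:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def expand_review(counts: dict) -> dict:
    # Stand-in for the thesaurus-based feature expansion sketched earlier;
    # identity here so this example runs on its own.
    return counts

# Tiny hypothetical source-domain training set (1 = positive, 0 = negative).
source_reviews = [{"excellent": 2, "broad": 1}, {"boring": 1, "slow": 2}]
source_labels = [1, 0]

vectorizer = DictVectorizer()
X_train = vectorizer.fit_transform([expand_review(d) for d in source_reviews])
classifier = LogisticRegression().fit(X_train, source_labels)

# The same expansion is applied at test time to a target-domain review.
target_review = {"excellent": 1, "delicious": 1}
X_target = vectorizer.transform([expand_review(target_review)])
print(classifier.predict(X_target))  # e.g. [1]
```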

domain adaptation

Appears in 4 sentences as: domain adaptation (4)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. This can be considered to be a lower bound that does not perform domain adaptation.
    Page 8, “Experiments”
  2. Compared to single-domain sentiment classification, which has been studied extensively in previous work (Pang and Lee, 2008; Turney, 2002), cross-domain sentiment classification has only recently received attention in response to advances in the area of domain adaptation.
    Page 8, “Related Work”
  3. Aue and Gamon (2005) report a number of empirical tests into domain adaptation of sentiment classifiers using an ensemble of classifiers.
    Page 8, “Related Work”
  4. In future, we intend to apply the proposed method to other domain adaptation tasks.
    Page 9, “Conclusions”

POS tags

Appears in 4 sentences as: POS tags (4)
In Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
  1. POS tags: Excellent/JJ and/CC broad/JJ
    Page 3, “A Motivating Example”
  2. We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs).
    Page 3, “Sentiment Sensitive Thesaurus”
  3. In addition to word-level sentiment features, we replace words with their POS tags to create
    Page 3, “Sentiment Sensitive Thesaurus”
  4. POS tags generalize the word-level sentiment features, thereby reducing feature sparseness.
    Page 4, “Sentiment Sensitive Thesaurus”
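Item 2 above mentions a simple POS-based filter that keeps only content words (nouns, verbs, adjectives, and adverbs). Below is a sketch of such a filter using NLTK's off-the-shelf tagger; the excerpts do not say which tokenizer or tagger the authors used, so the tooling here is illustrative:

```python
import nltk  # assumes the required NLTK tokenizer and tagger models are installed

# Penn Treebank tag prefixes for nouns, verbs, adjectives, and adverbs.
CONTENT_PREFIXES = ("NN", "VB", "JJ", "RB")

def content_words(sentence: str) -> list[str]:
    """Keep only content words, mirroring the POS-based word filter above."""
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    return [word for word, tag in tagged if tag.startswith(CONTENT_PREFIXES)]

print(content_words("Excellent and broad survey of the development of civilization"))
# e.g. ['Excellent', 'broad', 'survey', 'development', 'civilization']
```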
