Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
Kim, Jungi and Li, Jin-Ji and Lee, Jong-Hyeok

Article Structure

Abstract

This paper describes an approach to utilizing term weights for sentiment analysis tasks and shows how various term weighting schemes improve the performance of sentiment analysis systems.

Introduction

With the explosion in the amount of commentary on current issues and personal views expressed in weblogs on the Internet, the field that studies how to analyze such remarks and sentiments has grown as well.

Related Work

Representing text with salient features is an important part of a text processing task, and there exist many works that explore various features for this purpose.

Term Weighting and Sentiment Analysis

In this section, we describe the characteristics of terms that are useful in sentiment analysis, and present our sentiment analysis model as part of an opinion retrieval system and an ML sentiment classifier.

Experiment

Our experiments consist of an opinion retrieval task and a sentiment classification task.

Conclusion

In this paper, we proposed various term weighting schemes and showed how such features are modeled in the sentiment analysis task.

Topics

sentiment analysis

Appears in 27 sentences as: Sentiment Analysis (2) Sentiment analysis (1) sentiment analysis (26)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. This paper describes an approach to utilizing term weights for sentiment analysis tasks and shows how various term weighting schemes improve the performance of sentiment analysis systems.
    Page 1, “Abstract”
  2. Previously, sentiment analysis was mostly studied under data-driven and lexicon-based frameworks.
    Page 1, “Abstract”
  3. We propose to model term weighting into a sentiment analysis system utilizing collection statistics, contextual and topic-related characteristics as well as opinion-related properties.
    Page 1, “Abstract”
  4. The field of opinion mining and sentiment analysis involves extracting opinionated pieces of text, determining the polarities and strengths, and extracting holders and targets of the opinions.
    Page 1, “Introduction”
  5. Much research has focused on creating testbeds for sentiment analysis tasks.
    Page 1, “Introduction”
  6. Previous studies for sentiment analysis belong to either the data-driven approach where an annotated corpus is used to train a machine learning (ML) classifier, or to the lexicon-based approach where a pre-compiled list of sentiment terms is utilized to build a sentiment score function.
    Page 1, “Introduction”
  7. This paper introduces an approach to the sentiment analysis tasks with an emphasis on how to represent and evaluate the weights of sentiment terms.
    Page 1, “Introduction”
  8. These term weighting features constitute the sentiment analysis model in our opinion retrieval system.
    Page 1, “Introduction”
  9. Sentiment analysis tasks have also been using various lexical, syntactic, and statistical features (Pang and Lee, 2008).
    Page 2, “Related Work”
  10. Also, syntactic features such as the dependency relationship of words and subtrees have been shown to effectively improve the performances of sentiment analysis (Kudo and Matsumoto, 2004; Gamon, 2004; Matsumoto et al., 2005; Ng et al., 2006).
    Page 2, “Related Work”
  11. While these features are usually employed by data-driven approaches, there are unsupervised approaches for sentiment analysis that make use of a set of terms that are semantically oriented toward expressing subjective statements (Yu and Hatzivassiloglou, 2003).
    Page 2, “Related Work”
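
A lexicon-based system of the kind contrasted with data-driven classifiers in items 6 and 11 can be reduced to a few lines: score a sentence by summing pre-compiled term polarities. The toy lexicon below is hypothetical, purely for illustration; the paper itself uses SentiWordNet.

```python
# Toy pre-compiled sentiment lexicon (hypothetical values, for illustration only).
LEXICON = {"good": 1.0, "great": 1.5, "bad": -1.0, "awful": -1.5}

def sentiment_score(sentence):
    """Sum the polarities of all lexicon terms appearing in the sentence."""
    return sum(LEXICON.get(tok, 0.0) for tok in sentence.lower().split())

print(sentiment_score("A great movie with a good cast"))  # 2.5 (positive)
print(sentiment_score("an awful and bad plot"))           # -2.5 (negative)
```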

generation model

Appears in 12 sentences as: Generation Model (1) generation model (10) generation models (1)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. 3.2.3 Word Generation Model
    Page 4, “Term Weighting and Sentiment Analysis”
  2. Our word generation model p(w | d) evaluates the prominence and the discriminativeness of a word w in a document d.
    Page 4, “Term Weighting and Sentiment Analysis”
  3. Therefore, we estimate the word generation model with popular IR models’ relevance scores of a document d given w as a query.
    Page 5, “Term Weighting and Sentiment Analysis”
  4. Specifically, we explore the statistical term weighting features of the word generation model with Support Vector machine (SVM), faithfully reproducing previous work as closely as possible (Pang et al., 2002).
    Page 5, “Term Weighting and Sentiment Analysis”
  5. We observe that the features of our word generation model are more effective than those of the topic association model.
    Page 6, “Experiment”
  6. Among the features of the word generation model, the largest improvement was achieved with BM25, improving the MAP by 2.27%.
    Page 6, “Experiment”
  7. Since BM25 performs the best among the word generation models, its combination with other features was investigated.
    Page 6, “Experiment”
  8. This demonstrates that the word generation model and the topic association model are complementary to each other.
    Page 6, “Experiment”
  9. Similarly to the TREC experiments, the features of the word generation model perform substantially better than those of the topic association model.
    Page 7, “Experiment”
  10. The best performing feature of the word generation model is VS, achieving a 4.21% improvement over the baseline’s f-measure.
    Page 7, “Experiment”
  11. When combining the best performing feature of the word generation model (VS) with the features of the topic association model, LSA, PMI and DTP all performed worse than or as well as the VS in f-measure evaluation.
    Page 7, “Experiment”
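
Item 7 singles out BM25 as the best of the word generation features. As a sketch of how an IR relevance score can double as a term weight, the standard Okapi BM25 weight of one term in one document is computed below; k1 and b are the common defaults, not the paper's tuned parameters.

```python
import math

def bm25_weight(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
    """Okapi BM25 weight of one term in one document.
    tf: term frequency in d, df: document frequency of the term,
    N: number of documents, dl/avgdl: document length and average length."""
    # +1 inside the log keeps the idf component positive for common terms.
    idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
    tf_norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
    return idf * tf_norm

# The same term counts for more in a short document than in a long one:
print(bm25_weight(tf=3, df=10, N=1000, dl=100, avgdl=150))
print(bm25_weight(tf=3, df=10, N=1000, dl=300, avgdl=150))
```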

f-measure

Appears in 9 sentences as: F-Measure (1) F-measure (3) f-measure (5)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. For performance evaluations of opinion and polarity detection, we use precision, recall, and F-measure, the same measure used to report the official results at the NTCIR MOAT workshop.
    Page 6, “Experiment”
  2. System parameters are optimized for F-measure using the NTCIR6 dataset with lenient evaluations.
    Page 7, “Experiment”
  3. Model     Precision  Recall  F-Measure
     BASELINE  0.305      0.866   0.451
     VS        0.331      0.807   0.470
     BM25      0.327      0.795   0.464
     LM        0.325      0.794   0.461
     LSA       0.315      0.806   0.453
     PMI       0.342      0.603   0.436
     DTP       0.322      0.778   0.455
     VS-LSA    0.335      0.769   0.466
     VS-PMI    0.311      0.833   0.453
     VS-DTP    0.342      0.745   0.469
    Page 7, “Experiment”
  4. The best performing feature of the word generation model is VS, achieving a 4.21% improvement over the baseline’s f-measure.
    Page 7, “Experiment”
  5. Interestingly, this is the tied top performing f-measure over all combinations of our features.
    Page 7, “Experiment”
  6. When combining the best performing feature of the word generation model (VS) with the features of the topic association model, LSA, PMI and DTP all performed worse than or as well as the VS in f-measure evaluation.
    Page 7, “Experiment”
  7. The best performing system was achieved using VS, LSA and DTP at both precision and f-measure evaluations.
    Page 7, “Experiment”
  8. It also achieves the best f-measure over other topic association features.
    Page 8, “Experiment”
  9. DTP achieves a higher relative improvement (3.99% F-measure versus 2.32% MAP), and is more effective for improving the performance in combination with LSA and PMI.
    Page 8, “Experiment”
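
The precision/recall/F-measure triples quoted in item 3 are related by the balanced F-measure, the harmonic mean of precision and recall; a quick check reproduces the BASELINE row.

```python
def f_measure(precision, recall, beta=1.0):
    """F-measure; with beta=1 this is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Reproducing the BASELINE row (precision 0.305, recall 0.866):
print(round(f_measure(0.305, 0.866), 3))  # 0.451
```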

sentiment classification

Appears in 7 sentences as: sentiment classification (5) sentiment classifier (2)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Pang et al. (2002) present empirical results indicating that using term presence over term frequency is more effective in a data-driven sentiment classification task.
    Page 2, “Related Work”
  2. In this section, we describe the characteristics of terms that are useful in sentiment analysis, and present our sentiment analysis model as part of an opinion retrieval system and an ML sentiment classifier .
    Page 2, “Term Weighting and Sentiment Analysis”
  3. Our experiments consist of an opinion retrieval task and a sentiment classification task.
    Page 5, “Experiment”
  4. (2002) to test various ML-based methods for sentiment classification.
    Page 7, “Experiment”
  5. We present the sentiment classification performances in Table 3.
    Page 7, “Experiment”
  6. As in Pang et al. (2002), using the raw tf drops the accuracy of sentiment classification (-13.92%) on the movie-review data.
    Page 7, “Experiment”
  7. Its effectiveness is also verified with a data-driven approach; the accuracy of a sentiment classifier trained on a polarity dataset was improved by various combinations of normalized tf and idf statistics.
    Page 8, “Experiment”
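
The term presence vs. term frequency distinction from item 1 (and the tf accuracy drop in item 6) comes down to clipping counts to binary indicators; a minimal sketch:

```python
from collections import Counter

def tf_features(tokens):
    """Raw term-frequency features."""
    return dict(Counter(tokens))

def presence_features(tokens):
    """Binary term-presence features, following Pang et al. (2002)."""
    return {tok: 1 for tok in set(tokens)}

doc = "good good plot bad acting".split()
print(tf_features(doc))        # {'good': 2, 'plot': 1, 'bad': 1, 'acting': 1}
print(presence_features(doc))  # same vocabulary, every count clipped to 1
```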

classification task

Appears in 6 sentences as: Classification task (1) classification task (5)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Pang et al. (2002) present empirical results indicating that using term presence over term frequency is more effective in a data-driven sentiment classification task.
    Page 2, “Related Work”
  2. Our experiments consist of an opinion retrieval task and a sentiment classification task .
    Page 5, “Experiment”
  3. Since MOAT is a classification task , we use a threshold parameter to draw a boundary between opinionated and non-opinionated sentences.
    Page 6, “Experiment”
  4. 4.3 Classification task — SVM
    Page 7, “Experiment”
  5. 4.3.1 Experimental Setting To test our SVM classifier, we perform the classification task .
    Page 7, “Experiment”
  6. Table 3: Average tenfold cross-validation accuracies of the polarity classification task with SVM.
    Page 7, “Experiment”

sentiment lexicon

Appears in 6 sentences as: sentiment lexicon (4) sentiment lexicons (2)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Such work generally exploits textual features for fact-based analysis tasks or lexical indicators from a sentiment lexicon .
    Page 1, “Abstract”
  2. Accordingly, much research has focused on recognizing terms’ semantic orientations and strength, and compiling sentiment lexicons (Hatzivassiloglou and Mckeown, 1997; Turney and Littman, 2003; Kamps et al., 2004; Whitelaw et al., 2005; Esuli and Sebastiani, 2006).
    Page 2, “Related Work”
  3. The goal of this paper is not to create or choose an appropriate sentiment lexicon, but rather to discover useful term features other than the sentiment properties.
    Page 3, “Term Weighting and Sentiment Analysis”
  4. For this reason, one sentiment lexicon, namely SentiWordNet, is utilized throughout the whole experiment.
    Page 3, “Term Weighting and Sentiment Analysis”
  5. SentiWordNet is an automatically generated sentiment lexicon using a semi-supervised method (Esuli and Sebastiani, 2006).
    Page 3, “Term Weighting and Sentiment Analysis”
  6. The word sentiment model can also make use of other types of sentiment lexicons.
    Page 3, “Term Weighting and Sentiment Analysis”

co-occurrence

Appears in 5 sentences as: co-occurrence (5)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Statistical measures of associations between terms include estimations by the co-occurrence in the whole collection, such as Point-wise Mutual Information (PMI) and Latent Semantic Analysis (LSA).
    Page 2, “Term Weighting and Sentiment Analysis”
  2. Another way is to use co-occurrence statistics
    Page 4, “Term Weighting and Sentiment Analysis”
  3. where K is the maximum window size for the co-occurrence and is arbitrarily set to 3 in our experiments.
    Page 4, “Term Weighting and Sentiment Analysis”
  4. Note that proximal features using co-occurrence and dependency relationships were used in previous work.
    Page 4, “Term Weighting and Sentiment Analysis”
  5. (2006) and Zhang and Ye (2008) used the co-occurrence of a query word and a sentiment word within a certain window size.
    Page 4, “Term Weighting and Sentiment Analysis”
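
The windowed co-occurrence statistic of items 2, 3, and 5 (maximum window size K = 3) can be sketched as counting query-word/sentiment-word pairs no more than K tokens apart; the exact counting scheme here is an assumption, not the paper's formula.

```python
def cooccurrence_count(tokens, query_word, sentiment_word, K=3):
    """Count query/sentiment word occurrence pairs at token distance <= K."""
    q_pos = [i for i, t in enumerate(tokens) if t == query_word]
    s_pos = [i for i, t in enumerate(tokens) if t == sentiment_word]
    return sum(1 for q in q_pos for s in s_pos if 0 < abs(q - s) <= K)

tokens = "the phone camera is great but the phone battery is poor".split()
# Both occurrences of "phone" are exactly 3 tokens from "great":
print(cooccurrence_count(tokens, "phone", "great", K=3))  # 2
```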

LM

Appears in 5 sentences as: LM (5)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. IR models, such as Vector Space (VS), probabilistic models such as BM25, and Language Modeling (LM), albeit in different forms of approach and measure, employ heuristics and formal modeling approaches to effectively evaluate the relevance of a term to a document (Fang et al., 2004).
    Page 5, “Term Weighting and Sentiment Analysis”
  2. In our experiments, we use the Vector Space model with Pivoted Normalization (VS), Probabilistic model (BM25), and Language modeling with Dirichlet Smoothing (LM).
    Page 5, “Term Weighting and Sentiment Analysis”
  3. VS    0.4196   0.4542  0.6600
     BM25  0.4235†  0.4579  0.6600
     LM    0.4158   0.4520  0.6560
     PMI   0.4177   0.4538  0.6620
     LSA   0.4155   0.4526  0.6480
     WP    0.4165   0.4533  0.6640
    Page 6, “Experiment”
  4. Model     Precision  Recall  F-Measure
     BASELINE  0.305      0.866   0.451
     VS        0.331      0.807   0.470
     BM25      0.327      0.795   0.464
     LM        0.325      0.794   0.461
     LSA       0.315      0.806   0.453
     PMI       0.342      0.603   0.436
     DTP       0.322      0.778   0.455
     VS-LSA    0.335      0.769   0.466
     VS-PMI    0.311      0.833   0.453
     VS-DTP    0.342      0.745   0.469
    Page 7, “Experiment”
  5. Differences in effectiveness of VS, BM25, and LM come from parameter tuning and corpus differences.
    Page 8, “Experiment”
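
Item 2's language modeling with Dirichlet smoothing scores a document by the smoothed likelihood of the query; a minimal query-likelihood sketch follows, where mu = 2000 is a conventional default rather than the paper's tuned value.

```python
import math

def lm_dirichlet_score(query, doc_tf, doc_len, coll_prob, mu=2000.0):
    """Dirichlet-smoothed query log-likelihood log p(q|d).
    doc_tf: term frequencies in d; coll_prob: collection probabilities p(w|C)."""
    score = 0.0
    for w in query:
        p = (doc_tf.get(w, 0) + mu * coll_prob[w]) / (doc_len + mu)
        score += math.log(p)
    return score

doc_tf = {"opinion": 3, "retrieval": 1}
coll_prob = {"opinion": 0.001, "retrieval": 0.0005}
print(lm_dirichlet_score(["opinion", "retrieval"], doc_tf, 120, coll_prob))
```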

SVM

Appears in 5 sentences as: SVM (5)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Specifically, we explore the statistical term weighting features of the word generation model with Support Vector Machine (SVM), faithfully reproducing previous work as closely as possible (Pang et al., 2002).
    Page 5, “Term Weighting and Sentiment Analysis”
  2. 4.3 Classification task — SVM
    Page 7, “Experiment”
  3. 4.3.1 Experimental Setting To test our SVM classifier, we perform the classification task.
    Page 7, “Experiment”
  4. Table 3: Average tenfold cross-validation accuracies of the polarity classification task with SVM.
    Page 7, “Experiment”
  5. To closely reproduce the best-performing experiment carried out in Pang et al. (2002) using SVM, we use unigram with the presence feature.
    Page 7, “Experiment”
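
Items 1 and 5 describe the data-driven pipeline: unigram presence features fed to a linear classifier. To keep the sketch self-contained, a simple perceptron stands in for the SVM (an assumption; the paper trains an actual SVM):

```python
def presence_vector(tokens, vocab):
    """Binary unigram-presence feature vector, as in Pang et al. (2002)."""
    present = set(tokens)
    return [1.0 if w in present else 0.0 for w in vocab]

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Tiny linear classifier standing in for the paper's SVM."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != label:  # mistake-driven update
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b

# Four hypothetical labeled reviews (+1 positive, -1 negative):
docs = [("a great great film".split(), 1),
        ("an awful boring film".split(), -1),
        ("great acting".split(), 1),
        ("boring plot".split(), -1)]
vocab = sorted({t for toks, _ in docs for t in toks})
X = [presence_vector(toks, vocab) for toks, _ in docs]
y = [label for _, label in docs]
w, b = train_perceptron(X, y)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x in X]
print(preds)  # [1, -1, 1, -1] — fits the training labels
```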

Language Modeling

Appears in 4 sentences as: language model (1) Language Modeling (1) Language modeling (1) language modeling (1)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. IR models, such as Vector Space (VS), probabilistic models such as BM25, and Language Modeling (LM), albeit in different forms of approach and measure, employ heuristics and formal modeling approaches to effectively evaluate the relevance of a term to a document (Fang et al., 2004).
    Page 5, “Term Weighting and Sentiment Analysis”
  2. In our experiments, we use the Vector Space model with Pivoted Normalization (VS), Probabilistic model (BM25), and Language modeling with Dirichlet Smoothing (LM).
    Page 5, “Term Weighting and Sentiment Analysis”
  3. With proper assumptions and derivations, p(w | d) can be derived into language modeling approaches.
    Page 5, “Term Weighting and Sentiment Analysis”
  4. For the relevance retrieval model, we faithfully reproduce the passage-based language model with pseudo-relevance feedback (Lee et al., 2008).
    Page 5, “Experiment”

WordNet

Appears in 4 sentences as: WordNet (4)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Also, the distance between words in the local context or in thesaurus-like dictionaries such as WordNet may be approximated as such a measure.
    Page 2, “Term Weighting and Sentiment Analysis”
  2. It consists of WordNet synsets, where each synset is assigned three probability scores that add up to 1: positive, negative, and objective.
    Page 3, “Term Weighting and Sentiment Analysis”
  3. These scores are assigned at sense level (synsets in WordNet), and we use the following equations to assess the sentiment scores at the word level.
    Page 3, “Term Weighting and Sentiment Analysis”
  4. Such methods include, but are by no means limited to, semantic similarities between word pairs using lexical resources such as WordNet (Miller, 1995) and data-driven methods with various topic-dependent term weighting schemes on labeled corpora with topics such as MPQA.
    Page 8, “Conclusion”

dependency relations

Appears in 3 sentences as: dependency relations (1) dependency relationship (1) dependency relationships (1)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Also, syntactic features such as the dependency relationship of words and subtrees have been shown to effectively improve the performances of sentiment analysis (Kudo and Matsumoto, 2004; Gamon, 2004; Matsumoto et al., 2005; Ng et al., 2006).
    Page 2, “Related Work”
  2. Another method is to use proximal information of the query and the word, using syntactic structure such as dependency relations of words that provide the graphical representation of the text (Mullen and Collier, 2004).
    Page 2, “Term Weighting and Sentiment Analysis”
  3. Note that proximal features using co-occurrence and dependency relationships were used in previous work.
    Page 4, “Term Weighting and Sentiment Analysis”

Latent Semantic

Appears in 3 sentences as: Latent Semantic (3)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. Statistical measures of associations between terms include estimations by the co-occurrence in the whole collection, such as Point-wise Mutual Information (PMI) and Latent Semantic Analysis (LSA).
    Page 2, “Term Weighting and Sentiment Analysis”
  2. Latent Semantic Analysis (LSA) (Landauer and Dumais, 1997) creates a semantic space from a collection of documents to measure the semantic relatedness of words.
    Page 4, “Term Weighting and Sentiment Analysis”
  3. For LSA, we used the online demonstration mode from the Latent Semantic Analysis page from the University of Colorado at Boulder. For PMI, we used the online API provided by the CogWorks Lab at the Rensselaer Polytechnic Institute.
    Page 4, “Term Weighting and Sentiment Analysis”
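
Item 1 pairs PMI with LSA as collection-level association measures. PMI compares the observed co-occurrence probability of two terms with what independence would predict; the counts below are invented for illustration.

```python
import math

def pmi(count_xy, count_x, count_y, N):
    """Point-wise mutual information from co-occurrence counts over N documents."""
    p_xy = count_xy / N
    p_x, p_y = count_x / N, count_y / N
    return math.log(p_xy / (p_x * p_y), 2)

# Terms that co-occur more often than chance get a positive PMI:
print(pmi(count_xy=50, count_x=100, count_y=400, N=10000))  # ≈ 3.64 bits
```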

named entity

Appears in 3 sentences as: named entities (1) named entity (2)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. The statistical approaches may suffer from data sparseness problems, especially for named entity terms used in the query, and the proximal clues cannot sufficiently cover all term-query associations.
    Page 4, “Term Weighting and Sentiment Analysis”
  2. Mullen and Collier (2004) manually annotated named entities in their dataset (i.e.
    Page 4, “Term Weighting and Sentiment Analysis”
  3. In general, the NTCIR topics are general descriptive words such as “regenerative medicine”, “American economy after the 911 terrorist attacks”, and “lawsuit brought against Microsoft for monopolistic practices.” The TREC topics are more named-entity-like terms such as “CarmaX”, “Wikipedia primary source”, “Jiffy Lube”, “Starbucks”, and “Windows Vista.” We have experimentally shown that LSA is more suited to finding associations between general terms because its training documents are from a general domain. Our PMI measure utilizes a web search engine, which covers a variety of named entity terms.
    Page 8, “Experiment”

synset

Appears in 3 sentences as: synset (3) synsets (3)
In Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
  1. It consists of WordNet synsets, where each synset is assigned three probability scores that add up to 1: positive, negative, and objective.
    Page 3, “Term Weighting and Sentiment Analysis”
  2. These scores are assigned at sense level (synsets in WordNet), and we use the following equations to assess the sentiment scores at the word level.
    Page 3, “Term Weighting and Sentiment Analysis”
  3. where synset(w) is the set of synsets of w, and SWNPos(s) and SWNNeg(s) are the positive and negative scores of a synset s in SentiWordNet.
    Page 3, “Term Weighting and Sentiment Analysis”
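
Item 3's equations aggregate synset-level SentiWordNet scores up to the word level. The excerpt does not fully specify the aggregation, so the averaging below, like the two lexicon entries, is an illustrative assumption.

```python
# Hypothetical SentiWordNet-style entries: synset -> (positive, negative, objective),
# with the three scores of each synset summing to 1.
SWN = {
    "good.a.01": (0.75, 0.00, 0.25),
    "good.n.02": (0.50, 0.125, 0.375),
}

def word_scores(word, synsets_of):
    """Average the synset-level positive/negative scores up to word level."""
    syns = synsets_of[word]
    pos = sum(SWN[s][0] for s in syns) / len(syns)
    neg = sum(SWN[s][1] for s in syns) / len(syns)
    return pos, neg

synsets_of = {"good": ["good.a.01", "good.n.02"]}
print(word_scores("good", synsets_of))  # (0.625, 0.0625)
```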
