Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
Song Feng, Jun Seok Kang, Polina Kuznetsova and Yejin Choi

Article Structure

Abstract

Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond the denotative or surface meaning of text, as seemingly objective statements often allude to the nuanced sentiment of the writer, and even purposefully conjure emotion in the readers’ minds.

Introduction

There has been a substantial body of research in sentiment analysis over the last decade (Pang and Lee, 2008), where a considerable amount of work has focused on recognizing sentiment that is generally explicit and pronounced rather than implied and subdued.

Connotation Induction Algorithms

We develop induction algorithms based on three distinct types of algorithmic framework that have been shown to be successful for the analogous task of sentiment lexicon induction: HITS & PageRank (§2.1), Label/Graph Propagation (§2.2), and Constraint Optimization via Integer Linear Programming (§2.3).

Experimental Result I

We provide comprehensive comparisons over variants of three types of algorithms proposed in §2.

Precision, Coverage, and Efficiency

In this section, we address three important aspects of an ideal induction algorithm: precision, coverage, and efficiency.

Experimental Results II

In this section, we present comprehensive intrinsic (§5.1) and extrinsic (§5.2) evaluations comparing three representative lexicons from §2 & §4: C-LP, OVERLAY, PRED-ARG (CP), and two popular sentiment lexicons: SentiWordNet (Baccianella et al., 2010) and GI+MPQA. Note that C-LP is the largest among all connotation lexicons, including ~70,000 polar words.

Related Work

In an interesting line of work, Mohammad and Turney (2010) use Mechanical Turk to build a lexicon of emotions evoked by words.

Conclusion

We presented a broad-coverage connotation lexicon that determines the subtle nuanced sentiment of even those words that are objective on the surface, including the general connotation of real-world named entities.

Topics

ILP

Appears in 13 sentences as: ILP (14)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. Addressing limitations of graph-based algorithms (§2.2), we propose an induction algorithm based on Integer Linear Programming (ILP).
    Page 4, “Connotation Induction Algorithms”
  2. We formulate insights in Figure 2 using ILP as follows:
    Page 4, “Connotation Induction Algorithms”
  3. Note that a direct comparison against ILP for top N words is tricky, as ILP does not rank results.
    Page 5, “Experimental Result I”
  4. ranks based on the frequency of words for ILP.
    Page 6, “Experimental Result I”
  5. Because of this issue, the performance of top 1k words of ILP should be considered only as a conservative measure.
    Page 6, “Experimental Result I”
  6. Importantly, when evaluated over more than top 5k words, ILP is overall the top performer considering both precision (shown in Table 3) and coverage (omitted for brevity).
    Page 6, “Experimental Result I”
  7. Efficiency: One practical problem with ILP is efficiency and scalability.
    Page 6, “Precision, Coverage, and Efficiency”
  8. In particular, we found that it becomes nearly impractical to run the ILP formulation including all words in WordNet plus all words in the argument position in Google Web 1T.
    Page 6, “Precision, Coverage, and Efficiency”
  9. Interpretation: Unlike ILP, some of the variables result in fractional values.
    Page 7, “Precision, Coverage, and Efficiency”
  10. 4.2 Empirical Comparisons: ILP vs.
    Page 7, “Precision, Coverage, and Efficiency”
  11. Efficiency-wise, LP runs within 10 minutes while ILP takes several hours (see the sketch after this list).
    Page 7, “Precision, Coverage, and Efficiency”
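
The efficiency gap in items 7 and 11 above comes down to variable types. Below is a minimal sketch, not the paper's actual formulation: a toy polarity-assignment problem built with PuLP as an ILP and as its LP relaxation, where only the variable category changes; the words, scores, and helper names are illustrative assumptions.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum

def build_problem(words, scores, integer=True):
    """scores[w] = (pos_score, neg_score); hypothetical prosody-style inputs."""
    cat = "Binary" if integer else "Continuous"  # the only ILP-vs-LP difference
    prob = LpProblem("connotation_induction", LpMaximize)
    x_pos = {w: LpVariable(f"pos_{w}", 0, 1, cat=cat) for w in words}
    x_neg = {w: LpVariable(f"neg_{w}", 0, 1, cat=cat) for w in words}
    # hard constraint: each word takes at most one polarity
    for w in words:
        prob += x_pos[w] + x_neg[w] <= 1
    # soft part: maximize weighted agreement with the input scores
    prob += lpSum(scores[w][0] * x_pos[w] + scores[w][1] * x_neg[w] for w in words)
    return prob

words = ["wine", "war"]
scores = {"wine": (0.9, 0.1), "war": (0.2, 0.8)}
build_problem(words, scores, integer=True).solve()   # exact, but slow at lexicon scale
build_problem(words, scores, integer=False).solve()  # LP relaxation: fast, may be fractional
```

Solving with `integer=False` is the relaxation discussed under "Linear Programming" below; its fractional values then need interpretation.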


sentiment lexicon

Appears in 11 sentences as: Sentiment Lexicon (1) sentiment lexicon (6) sentiment lexicons (4)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. The main contribution of this paper is a broad-coverage connotation lexicon that determines the connotative polarity of even those words with ever so subtle connotation beneath their surface meaning, such as “Literature”, “Mediterranean”, and “wine”. Although there have been a number of previous works that constructed sentiment lexicons (e.g., Esuli and Sebastiani (2006), Wilson et al.
    Page 1, “Introduction”
  2. Although such an assumption played a key role in previous work for the analogous task of learning sentiment lexicon (Velikovich et al., 2010), we expect that the same assumption would be less reliable in drawing subtle connotative sentiments of words.
    Page 2, “Introduction”
  3. We cast the connotation lexicon induction task as a collective inference problem, and consider approaches based on three distinct types of algorithmic framework that have been shown to be successful for conventional sentiment lexicon induction:
    Page 2, “Introduction”
  4. We develop induction algorithms based on three distinct types of algorithmic framework that have been shown to be successful for the analogous task of sentiment lexicon induction: HITS & PageRank (§2.1), Label/Graph Propagation (§2.2), and Constraint Optimization via Integer Linear Programming (§2.3).
    Page 3, “Connotation Induction Algorithms”
  5. 3.1 Comparison against Conventional Sentiment Lexicon
    Page 5, “Experimental Result I”
  6. Note that we consider the connotation lexicon to be inclusive of a sentiment lexicon for two practical reasons: first, it is highly unlikely that any word with non-neutral sentiment (i.e., positive or negative) would carry connotation of the opposite, i.e., conflicting polarity.
    Page 5, “Experimental Result I”
  7. Therefore, sentiment lexicons can serve as a surrogate to measure a subset of connotation words induced by the algorithms, as shown in Table 3 with respect to General Inquirer (Stone and Hunt (1963)) and MPQA (Wilson et al.
    Page 5, “Experimental Result I”
  8. Discussion: Table 3 shows the agreement statistics with respect to two conventional sentiment lexicons.
    Page 5, “Experimental Result I”
  9. Note that doing so will prevent us from evaluating against the same sentiment lexicon used as a seed set.
    Page 6, “Precision, Coverage, and Efficiency”
  10. In this section, we present comprehensive intrinsic (§5.1) and extrinsic (§5.2) evaluations comparing three representative lexicons from §2 & §4: C-LP, OVERLAY, PRED-ARG (CP), and two popular sentiment lexicons: SentiWordNet (Baccianella et al., 2010) and GI+MPQA. Note that C-LP is the largest among all connotation lexicons, including ~70,000 polar words.
    Page 7, “Experimental Results 11”
  11. Some recent work explored the use of constraint optimization framework for inducing domain-dependent sentiment lexicon (Choi and Cardie (2009), Lu et al.
    Page 9, “Related Work”


Turkers

Appears in 9 sentences as: Turker (2) Turkers (6) turkers (2)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. About 300 unique Turkers participated in the evaluation tasks.
    Page 7, “Experimental Results II”
  2. Otherwise we treat them as ambiguous cases. Figure 3 shows a part of the AMT task, where Turkers are presented with questions that help judges determine the subtle connotative polarity of each word, then asked to rate the degree of connotation on a scale from -5 (most negative) to 5 (most positive). (A toy aggregation sketch follows this list.)
    Page 7, “Experimental Results II”
  3. We allow Turkers to mark words that can be used with both positive and negative connotation, which results in about 7% of words that are excluded from the gold standard set.
    Page 7, “Experimental Results II”
  4. θ^vote: The judgement of each Turker is
    Page 8, “Experimental Results II”
  5. Therefore, we also report the degree of agreement among human judges in Table 7, where we compute the agreement of one Turker with respect to the gold standard drawn from the rest of the Turkers, and take the average across all five Turkers.
    Page 8, “Experimental Results II”
  6. Turkers, we consider adjusted versions of the θ^vote and θ^score schemes described above.
    Page 8, “Experimental Results II”
  7. Turkers is not as good as that of the C-LP lexicon.
    Page 8, “Experimental Results II”
  8. The Pearson correlation coefficient among Turkers is 0.28, which corresponds to a small-to-medium positive correlation.
    Page 8, “Experimental Results II”
  9. Note that when the annotation of Turkers is aggregated, we observe agreement as high as 77% with respect to the learned connotation lexicon.
    Page 8, “Experimental Results II”
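
Below is a minimal sketch of the gold-standard aggregation these excerpts describe, assuming five integer ratings per word on the -5..5 scale; the function name and the sign-based collapse to polarity are illustrative assumptions, not the paper's exact θ^vote/θ^score schemes.

```python
def gold_polarity(ratings):
    """Keep a word only if more than half of the judges agree on one polarity.
    Assumption: a rating's sign is its polarity; 0 counts as neutral."""
    n = len(ratings)
    pos = sum(1 for r in ratings if r > 0)
    neg = sum(1 for r in ratings if r < 0)
    if pos > n / 2:
        return "positive"
    if neg > n / 2:
        return "negative"
    return None  # no majority: excluded from the gold standard

print(gold_polarity([3, 2, 4, -1, 1]))   # positive (4 of 5 judges agree)
print(gold_polarity([2, -3, 1, -2, 0]))  # None (2 vs. 2 split)
```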


edge weights

Appears in 8 sentences as: edge weights (8)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. One possible way of constructing such a graph is simply connecting all nodes and assigning edge weights proportional to the word association scores, such as PMI, or distributional similarity.
    Page 3, “Connotation Induction Algorithms”
  2. In particular, we consider an undirected edge between a pair of arguments a1 and a2 only if they occurred together in the “a1 and a2” or “a2 and a1” coordination, and assign edge weights as $w(a_1 - a_2) = \mathrm{CosineSim}(\vec{a}_1, \vec{a}_2) = \frac{\vec{a}_1 \cdot \vec{a}_2}{\|\vec{a}_1\| \|\vec{a}_2\|}$ (see the sketch after this list).
    Page 4, “Connotation Induction Algorithms”
  3. The edge weights in the two subgraphs are normalized so that they are in a comparable range.
    Page 4, “Connotation Induction Algorithms”
  4. They allow only nonnegative edge weights.
    Page 4, “Connotation Induction Algorithms”
  5. We experimented with many different variations of the graph structure and edge weights, including ones that admit any word pairs that occurred together frequently enough.
    Page 4, “Connotation Induction Algorithms”
  6. $\Phi^{neu} = \alpha \sum_{ij} w^{pred}_{ij} \, x^{neu}_{ij}$. Soft constraints (edge weights): The weights in the objective function are set as follows:
    Page 5, “Connotation Induction Algorithms”
  7. The performance of graph propagation varies significantly depending on the graph topology and the corresponding edge weights.
    Page 5, “Experimental Result I”
  8. Although we employ the same graph propagation algorithm, our graph construction is fundamentally different in that we integrate stronger inductive biases into the graph topology and the corresponding edge weights.
    Page 9, “Related Work”
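
Below is a minimal sketch of the coordination edge weight in item 2, assuming each argument is represented by a co-occurrence count vector; the vectors are toy values.

```python
import numpy as np

def edge_weight(vec_a1, vec_a2):
    """w(a1 - a2) = CosineSim(a1, a2) = (a1 . a2) / (||a1|| ||a2||)."""
    denom = np.linalg.norm(vec_a1) * np.linalg.norm(vec_a2)
    return float(np.dot(vec_a1, vec_a2) / denom) if denom else 0.0

# context-count vectors for two arguments seen in an "a1 and a2" coordination
print(edge_weight(np.array([3.0, 0.0, 1.0]), np.array([2.0, 1.0, 1.0])))  # ~0.90
```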


gold standard

Appears in 7 sentences as: Gold Standard (1) gold standard (6)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. We gather the gold standard only for those words for which more than half of the judges agreed on the same polarity.
    Page 7, “Experimental Results II”
  2. We allow Turkers to mark words that can be used with both positive and negative connotation, which results in about 7% of words that are excluded from the gold standard set.
    Page 7, “Experimental Results II”
  3. the gold standard, we consider two different voting schemes:
    Page 8, “Experimental Results II”
  4. The highest agreement is 77%, between C-LP and the gold standard drawn by AMT^vote.
    Page 8, “Experimental Results II”
  5. Therefore, we also report the degree of agreement among human judges in Table 7, where we compute the agreement of one Turker with respect to the gold standard drawn from the rest of the Turkers, and take the average across all five Turkers.
    Page 8, “Experimental Results II”
  6. In order to draw the gold standard from the 4 remaining
    Page 8, “Experimental Results II”
  7. Table 7: Agreement (Accuracy) against AMT-driven Gold Standard.
    Page 8, “Experimental Results II”


distributional similarity

Appears in 5 sentences as: distributional similarities (1) distributional similarity (3)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence”, “human”, and “cheesecake”. We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon.
    Page 1, “Abstract”
  2. Therefore, in order to attain a broad coverage lexicon while maintaining good precision, we guide the induction algorithm with multiple, carefully selected linguistic insights: [1] distributional similarity, [2] semantic parallelism of coordination, [3] selectional preference, and [4] semantic prosody (e.g., Sinclair (1991), Louw (1993), Stubbs (1995), Stefanowitsch and Gries (2003)), and also exploit existing lexical resources as an additional inductive bias.
    Page 2, “Introduction”
  3. The second subgraph is based on the distributional similarities among the arguments.
    Page 3, “Connotation Induction Algorithms”
  4. One possible way of constructing such a graph is simply connecting all nodes and assigning edge weights proportional to the word association scores, such as PMI, or distributional similarity (a toy PMI sketch follows this list).
    Page 3, “Connotation Induction Algorithms”
  5. where $\Phi^{prosody}$ is the score based on semantic prosody, $\Phi^{coord}$ captures the distributional similarity over coordination, and $\Phi^{neu}$ controls the sensitivity of connotation detection between positive (negative) and neutral.
    Page 4, “Connotation Induction Algorithms”
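
Below is a minimal sketch of the PMI association score mentioned in item 4, computed from raw co-occurrence counts; the counts are toy values, not Google Web 1T statistics.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) )."""
    p_xy = count_xy / total
    p_x, p_y = count_x / total, count_y / total
    return math.log2(p_xy / (p_x * p_y))

# a word pair co-occurring 50 times in a 1M-pair corpus, marginals 1k and 2k
print(pmi(count_xy=50, count_x=1_000, count_y=2_000, total=1_000_000))  # ~4.64
```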


human judges

Appears in 5 sentences as: Human Judgements (1) HUMAN JUDGES (1) human judges (2) human judgments (1)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. We provide comparative empirical results over several variants of these approaches with comprehensive evaluations including lexicon-based, human judgments, and extrinsic evaluations.
    Page 2, “Introduction”
  2. §5 presents comprehensive evaluation with human judges and extrinsic evaluations.
    Page 2, “Introduction”
  3. 5.1 Intrinsic Evaluation: Human Judgements
    Page 7, “Experimental Results II”
  4. Therefore, we also report the degree of agreement among human judges in Table 7, where we compute the agreement of one Turker with respect to the gold standard drawn from the rest of the Turkers, and take the average across all five Turkers.
    Page 8, “Experimental Results II”
  5. (Table 7 excerpt) θ^vote: C-LP 77.0, SENTIWN 71.5, HUMAN JUDGES 66.0; θ^score: C-LP 73.0, SENTIWN 69.0, HUMAN JUDGES 69.0
    Page 8, “Experimental Results II”


Linear Programming

Appears in 5 sentences as: Linear Programming (6)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. We develop induction algorithms based on three distinct types of algorithmic framework that have been shown to be successful for the analogous task of sentiment lexicon induction: HITS & PageRank (§2.1), Label/Graph Propagation (§2.2), and Constraint Optimization via Integer Linear Programming (§2.3).
    Page 3, “Connotation Induction Algorithms”
  2. Addressing limitations of graph-based algorithms (§2.2), we propose an induction algorithm based on Integer Linear Programming (ILP).
    Page 4, “Connotation Induction Algorithms”
  3. We therefore explore an alternative approach based on Linear Programming in what follows.
    Page 6, “Precision, Coverage, and Efficiency”
  4. 4.1 Induction using Linear Programming
    Page 6, “Precision, Coverage, and Efficiency”
  5. One straightforward option for a Linear Programming formulation may seem to be using the same Integer Linear Programming formulation introduced in §2.3, only changing the variable definitions to be real values ∈ [0, 1] rather than integers (a toy interpretation sketch follows this list).
    Page 6, “Precision, Coverage, and Efficiency”
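
Because the relaxed variables are real-valued, the solver can return fractional assignments (see "Interpretation" under the ILP topic above). Below is a minimal sketch of one way to read them off; the 0.5 threshold and tie handling are assumptions, not the paper's rule.

```python
def interpret(x_pos, x_neg, threshold=0.5):
    """Map fractional LP values for one word to a polarity label (assumed scheme)."""
    if x_pos > threshold and x_pos > x_neg:
        return "positive"
    if x_neg > threshold and x_neg > x_pos:
        return "negative"
    return "neutral"  # ambiguous fractional mass treated as neutral

print(interpret(0.8, 0.1))   # positive
print(interpret(0.45, 0.4))  # neutral (no clear winner)
```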


objective function

Appears in 5 sentences as: Objective function (2) objective function (3)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. Objective function: We aim to maximize $F = \Phi^{prosody} + \Phi^{coord} + \Phi^{neu}$ (a hedged reconstruction follows this list).
    Page 4, “Connotation Induction Algorithms”
  2. $\Phi^{neu} = \alpha \sum_{ij} w^{pred}_{ij} \, x^{neu}_{ij}$. Soft constraints (edge weights): The weights in the objective function are set as follows:
    Page 5, “Connotation Induction Algorithms”
  3. Objective function: We aim to maximize:
    Page 6, “Precision, Coverage, and Efficiency”
  4. Hard constraints: We add penalties to the objective function if the polarity of a pair of words is not consistent with its corresponding semantic relations.
    Page 6, “Precision, Coverage, and Efficiency”
  5. Notice that $ds^{+}_{ij}$ and $ds^{-}_{ij}$ satisfying the above inequalities will always be negative; hence, in order to maximize the objective function, the LP solver will try to minimize the absolute values of $ds^{+}_{ij}$ and $ds^{-}_{ij}$, effectively pushing $i$ and $j$ toward the same polarity.
    Page 7, “Precision, Coverage, and Efficiency”
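
Putting items 1 and 2 together, a hedged reconstruction of the maximization; the indicator-variable scheme below is an assumption for illustration, not verbatim from the paper.

```latex
% Assumed scheme: each word i gets binary indicators x_i^{+}, x_i^{-}, x_i^{0}
% for positive / negative / neutral connotation.
\[
\begin{aligned}
\max \quad & F \;=\; \Phi^{prosody} + \Phi^{coord} + \Phi^{neu} \\
\text{s.t.} \quad & x_i^{+} + x_i^{-} + x_i^{0} = 1 \qquad \forall i \\
                  & x_i^{+},\; x_i^{-},\; x_i^{0} \in \{0, 1\} \qquad \forall i
\end{aligned}
\]
```

The LP relaxation in §4.1 keeps the same objective and replaces the integrality constraint with $x \in [0, 1]$.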


word pairs

Appears in 5 sentences as: word pairs (5)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. We experimented with many different variations of the graph structure and edge weights, including ones that admit any word pairs that occurred together frequently enough.
    Page 4, “Connotation Induction Algorithms”
  2. R^{syn}: word pairs in synonyms relation.
    Page 4, “Connotation Induction Algorithms”
  3. R^{ant}: word pairs in antonyms relation.
    Page 4, “Connotation Induction Algorithms”
  4. R^{coord}: word pairs in coordination relation.
    Page 4, “Connotation Induction Algorithms”
  5. R^{pred}: word pairs in pred-arg relation.
    Page 4, “Connotation Induction Algorithms”


graph-based

Appears in 4 sentences as: Graph-based (1) graph-based (3)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. Limitations of Graph-based Algorithms
    Page 4, “Connotation Induction Algorithms”
  2. Although graph-based algorithms (§2.1, §2.2) provide an intuitive framework to incorporate various lexical relations, limitations include:
    Page 4, “Connotation Induction Algorithms”
  3. Addressing limitations of graph-based algorithms (§2.2), we propose an induction algorithm based on Integer Linear Programming (ILP).
    Page 4, “Connotation Induction Algorithms”
  4. The [OVERLAY], which is based on both Pred-Arg and Arg-Arg subgraphs (§2.2), achieves the best performance among graph-based algorithms, significantly improving the precision over all other baselines.
    Page 5, “Experimental Result I”


PageRank

Appears in 4 sentences as: PageRank (4)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. We develop induction algorithms based on three distinct types of algorithmic framework that have been shown to be successful for the analogous task of sentiment lexicon induction: HITS & PageRank (§2.1), Label/Graph Propagation (§2.2), and Constraint Optimization via Integer Linear Programming (§2.3).
    Page 3, “Connotation Induction Algorithms”
  2. 2.1 HITS & PageRank
    Page 3, “Connotation Induction Algorithms”
  3. (2011) explored the use of HITS (Kleinberg, 1999) and PageRank (Page et al., 1999) to induce the general connotation of words hinging on the linguistic phenomena of selectional preference and semantic prosody, i.e., connotative predicates influencing the connotation of their arguments (a toy run of both follows this list).
    Page 3, “Connotation Induction Algorithms”
  4. We find that the use of label propagation alone [PRED-ARG (CP)] improves the performance substantially over the comparable graph construction with different graph analysis algorithms, in particular, HITS and PageRank approaches of Feng et al.
    Page 5, “Experimental Result I”
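
Below is a toy run of both algorithms from item 3 over a small directed predicate-to-argument graph; the edges are illustrative stand-ins for the semantic-prosody construction, where connotative predicates point at their arguments.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("suffer", "loss"), ("suffer", "damage"),  # negative-prosody predicate
    ("enjoy", "wine"), ("enjoy", "holiday"),   # positive-prosody predicate
    ("cause", "damage"),
])

pr = nx.pagerank(G, alpha=0.85)  # stationary importance of each node
hubs, auth = nx.hits(G)          # predicates act as hubs, arguments as authorities
print(max(auth, key=auth.get))   # most "endorsed" argument, e.g. "damage"
```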


WordNet

Appears in 3 sentences as: WordNet (3)
In Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
  1. Hard constraints for WordNet relations:
    Page 5, “Connotation Induction Algorithms”
  2. In particular, we found that it becomes nearly impractical to run the ILP formulation including all words in WordNet plus all words in the argument position in Google Web 1T.
    Page 6, “Precision, Coverage, and Efficiency”
  3. Therefore we revise those hard constraints to encode various semantic relations ( WordNet and semantic coordination) more directly.
    Page 6, “Precision, Coverage, and Efficiency”
