Inferring User Political Preferences from Streaming Communications
Volkova, Svitlana and Coppersmith, Glen and Van Durme, Benjamin

Article Structure

Abstract

Existing models for social media personal analytics assume access to thousands of messages per user, even though most users author content only sporadically over time.

Introduction

Inferring latent user attributes such as gender, age, and political preferences (Rao et al., 2011; Zamal et al., 2012; Cohen and Ruths, 2013) automatically from personal communications and social media, including emails, blog posts, or public discussions, has become increasingly popular as the web becomes more social and the volume of available data grows.

Identifying Twitter Social Graph

Twitter users interact with one another and engage in direct communication in different ways, e.g., through retweets, user mentions (e.g., @youtube), or hashtags (e.g., #tcot), in addition to having explicit connections among themselves such as following and friending.

Batch Models

Baseline User Model. As input we are given a set of vertices representing users of interest v_i ∈ V along with feature vectors f derived from content authored by the user of interest.
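To make the baseline concrete, here is a minimal sketch of a binary log-linear (logistic) classifier over a user's feature vector. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sparse dictionary representation, and the toy weights are all hypothetical, and the excerpt does not show how the paper extracts features or trains weights.

```python
import math


def log_linear_prob_r(weights, features, bias=0.0):
    """Return P(R | f) under a binary log-linear (logistic) model.

    `weights` and `features` are sparse dicts keyed by feature name
    (hypothetical representation, chosen for illustration only).
    """
    score = bias + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


def classify(weights, features, bias=0.0, threshold=0.5):
    """Assign Republican (R) or Democratic (D) by thresholding P(R | f)."""
    return "R" if log_linear_prob_r(weights, features, bias) >= threshold else "D"


# Toy weights, made up for illustration; real weights would be learned from data.
toy_weights = {"#tcot": 1.2, "#p2": -1.1}
toy_user = {"#tcot": 1.0}
print(classify(toy_weights, toy_user))
```

A user with no observed features receives a score equal to the bias, so with a zero bias the model returns 0.5 and expresses no preference.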

Streaming Models

We rely on a straightforward Bayes rule update to our batch models in order to simulate a real-time streaming prediction scenario, as a first step beyond the existing models, as shown in Figure 2.
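A sequential Bayes rule update of this kind can be sketched as follows. This is a hedged illustration rather than the paper's exact formulation, which the excerpt does not give: it assumes that each incoming tweet comes with class-conditional likelihoods P(tweet | R) and P(tweet | D), for instance from a batch model, and that tweets are conditionally independent given the class. The function names are hypothetical.

```python
def bayes_update(prior_r, lik_r, lik_d):
    """Fold one tweet's likelihoods into the current belief P(R).

    prior_r: current P(R); lik_r, lik_d: P(tweet | R) and P(tweet | D).
    """
    evidence = prior_r * lik_r + (1.0 - prior_r) * lik_d
    if evidence == 0.0:
        # The tweet is impossible under both classes; keep the prior unchanged.
        return prior_r
    return prior_r * lik_r / evidence


def stream_posteriors(likelihoods, prior_r=0.5):
    """Return P(R) after each tweet in a stream of (lik_r, lik_d) pairs."""
    posterior = prior_r
    history = []
    for lik_r, lik_d in likelihoods:
        # The posterior after tweet t becomes the prior for tweet t + 1.
        posterior = bayes_update(posterior, lik_r, lik_d)
        history.append(posterior)
    return history


# Toy stream: two tweets that favor R, then one uninformative tweet.
print(stream_posteriors([(0.8, 0.2), (0.8, 0.2), (0.5, 0.5)]))
```

Because each posterior is reused as the next prior, the belief can be reported after every tweet, and a tweet with equal likelihoods under both classes leaves the belief unchanged.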

Experimental Setup

We design a set of experiments to analyze static and dynamic models for political affiliation classification defined in Sections 3 and 4.

Static Classification Results

6.1 Modeling User Content Influence

Streaming Classification Results

7.1 Modeling Dynamic Posterior Updates from a User Stream

Related Work

Supervised Batch Approaches. The vast majority of work on predicting latent user attributes in social media applies supervised static SVM models for discrete categorical attributes (e.g., gender) and regression models for continuous attributes (e.g., age), with lexical bag-of-words features, for classifying user gender (Garera and Yarowsky, 2009; Rao et al., 2010; Burger et al., 2011; Van Durme, 2012b), age (Rao et al., 2010; Nguyen et al., 2011; Nguyen et al., 2013) or political orientation.

Conclusions and Future Work

In this paper, we extensively examined state-of-the-art static approaches and proposed novel models with dynamic Bayesian updates for streaming personal analytics on Twitter.

Topics

social media

Appears in 13 sentences as: social media (13)
In Inferring User Political Preferences from Streaming Communications
  1. Existing models for social media personal analytics assume access to thousands of messages per user, even though most users author content only sporadically over time.
    Page 1, “Abstract”
  2. Inferring latent user attributes such as gender, age, and political preferences (Rao et al., 2011; Zamal et al., 2012; Cohen and Ruths, 2013) automatically from personal communications and social media, including emails, blog posts, or public discussions, has become increasingly popular as the web becomes more social and the volume of available data grows.
    Page 1, “Introduction”
  3. In this paper we analyze and go beyond static models, formulating personal analytics in social media as a streaming task.
    Page 1, “Introduction”
  4. The proposed baseline model follows the same trends as the existing state-of-the-art approaches for user attribute classification in social media as described in Section 8.
    Page 3, “Batch Models”
  5. We use log-linear models over reasonable alternatives such as perceptron or SVM, following the practice of a wide range of previous work in related areas (Smith, 2004; Liu et al., 2005; Poon et al., 2009) including text classification in social media (Van Durme, 2012b; Yang and Eisenstein, 2013).
    Page 3, “Batch Models”
  6. Following the streaming nature of social media, we see the scarce available resource as the number of requests allowed per day to the Twitter API.
    Page 3, “Batch Models”
  7. Supervised Batch Approaches. The vast majority of work on predicting latent user attributes in social media applies supervised static SVM models for discrete categorical attributes (e.g., gender) and regression models for continuous attributes (e.g., age), with lexical bag-of-words features, for classifying user gender (Garera and Yarowsky, 2009; Rao et al., 2010; Burger et al., 2011; Van Durme, 2012b), age (Rao et al., 2010; Nguyen et al., 2011; Nguyen et al., 2013) or political orientation.
    Page 8, “Related Work”
  8. Additionally, using social media for mining political opinions (O’Connor et al., 2010a; Maynard and Funk, 2012) or understanding sociopolitical trends and voting outcomes (Tumasjan et al., 2010; Gayo-Avello, 2012; Lampos et al., 2013) is becoming a common practice.
    Page 8, “Related Work”
  9. (2013) propose a bilinear user-centric model for predicting voting intentions in the UK and Australia from social media data.
    Page 8, “Related Work”
  10. text streams for a variety of NLP tasks e.g., real-time opinion mining and sentiment analysis in social media (Pang and Lee, 2008), named entity disambiguation (Sarmento et al., 2009), statistical machine translation (Levenberg et al., 2011), first story detection (Petrovic et al., 2010), and unsupervised dependency parsing (Goyal and Daume, 2011).
    Page 9, “Related Work”
  11. This may also be the effect of data heterogeneity in social media compared to, e.g., political debate text (Thomas et al., 2006).
    Page 9, “Conclusions and Future Work”

See all papers in Proc. ACL 2014 that mention social media.

log-linear

Appears in 4 sentences as: log-linear (4)
In Inferring User Political Preferences from Streaming Communications
  1. Our goal is to assign a category to each user of interest v_i based on f. Here we focus on a binary assignment into the categories Democratic D or Republican R. The log-linear
    Page 3, “Batch Models”
  2. We use log-linear models over reasonable alternatives such as perceptron or SVM, following the practice of a wide range of previous work in related areas (Smith, 2004; Liu et al., 2005; Poon et al., 2009) including text classification in social media (Van Durme, 2012b; Yang and Eisenstein, 2013).
    Page 3, “Batch Models”
  3. The corresponding log-linear model is defined as:
    Page 3, “Batch Models”
  4. We experiment with log-linear models defined in Eq.
    Page 5, “Experimental Setup”

randomly sample

Appears in 4 sentences as: random sample (1) randomly sample (2) randomly sampled (1)
In Inferring User Political Preferences from Streaming Communications
  1. In the Fall of 2012, leading up to the elections, we randomly sampled n = 516 Democratic and m = 515 Republican users.
    Page 2, “Identifying Twitter Social Graph”
  2. For each such user we collect recent tweets and randomly sample their immediate k = 10 neighbors from follower, friend, user mention, reply, retweet and hashtag social circles.
    Page 2, “Identifying Twitter Social Graph”
  3. Similar to the candidate-centric graph, for each user we collect recent tweets and randomly sample user social circles in the Fall of 2012.
    Page 2, “Identifying Twitter Social Graph”
  4. This may be an effect of ‘sparseness’ of relevant user data, in that users talk about politics very sporadically compared to a random sample of their neighbors.
    Page 9, “Conclusions and Future Work”

log-linear models

Appears in 3 sentences as: log-linear model (1) log-linear models (2)
In Inferring User Political Preferences from Streaming Communications
  1. We use log-linear models over reasonable alternatives such as perceptron or SVM, following the practice of a wide range of previous work in related areas (Smith, 2004; Liu et al., 2005; Poon et al., 2009) including text classification in social media (Van Durme, 2012b; Yang and Eisenstein, 2013).
    Page 3, “Batch Models”
  2. The corresponding log-linear model is defined as:
    Page 3, “Batch Models”
  3. We experiment with log-linear models defined in Eq.
    Page 5, “Experimental Setup”
