SciSurf: Index of 'A user-centric model of voting intention from Social Media'

Topics

Social Media (7)
Ground Truth (6)
overfitting (5)
linear regression (4)
feature space (3)
loss function (3)
sentiment analysis (3)

A user-centric model of voting intention from Social Media

Lampos, Vasileios and Preoţiuc-Pietro, Daniel and Cohn, Trevor

Published in Proc. ACL, 2013

Article Structure

Abstract

Social Media contain a multitude of user opinions which can be used to predict real-world phenomena in many domains including politics, finance and health.

Introduction

Web Social Media platforms have ushered in a new era in human interaction and communication.

Data

For the evaluation of the proposed methodologies we have created two data sets of Social Media content with different characteristics based in the UK and Austria respectively.

Methods

The textual content posted on Social Media platforms unarguably contains valuable information, but quite often it is hidden under vast amounts of unstructured user generated input.

Experiments

The proposed models are evaluated on C_uk and C_au which have been introduced in Section 2.

Related Work

The topic of political opinion mining from Social Media has been the focus of various recent research works.

Conclusions and Future Work

We have presented a novel method for text regression that exploits both word and user spaces by solving a bilinear optimisation task, and an extension that applies multitask learning for multi-output inference.

Topics

Social Media

Appears in 7 sentences as: Social Media (7)

In A user-centric model of voting intention from Social Media

Social Media contain a multitude of user opinions which can be used to predict real-world phenomena in many domains including politics, finance and health.
Page 1, “Abstract”
These techniques require very careful filtering of the input texts, as most Social Media posts are irrelevant to the task.
Page 1, “Abstract”
Web Social Media platforms have ushered in a new era in human interaction and communication.
Page 1, “Introduction”
For the evaluation of the proposed methodologies we have created two data sets of Social Media content with different characteristics based in the UK and Austria respectively.
Page 2, “Data”
Data processing is performed using the TrendMiner architecture for Social Media analysis (Preotiuc-Pietro et al., 2012).
Page 2, “Data”
The textual content posted on Social Media platforms unarguably contains valuable information, but quite often it is hidden under vast amounts of unstructured user generated input.
Page 3, “Methods”
The topic of political opinion mining from Social Media has been the focus of various recent research works.
Page 9, “Related Work”


Ground Truth

Appears in 6 sentences as: Ground Truth (3) ground truth (3)

In A user-centric model of voting intention from Social Media

2.3 Ground Truth
Page 2, “Data”
The ground truth for training and evaluating our regression models is formed by voting intention polls from YouGov (UK) and, for the Austrian case study, a collection of Austrian pollsters, as none performed high frequency polling.
Page 2, “Data”
(a) Ground Truth (polls)
Page 7, “Experiments”
Sub-figure 3a is a plot of ground truth as presented in voting inten-
Page 7, “Experiments”
[Figure panel (a): Ground Truth (polls); x-axis: Time]
Page 7, “Experiments”
Sub-figure 4a is a plot of ground truth as presented in voting intention polls (Fig.
Page 7, “Experiments”


overfitting

Appears in 5 sentences as: overfit (1) overfitting (4)

In A user-centric model of voting intention from Social Media

Although flexible, this approach would be doomed to failure due to the sheer size of the resulting feature set, and the propensity to overfit all but the largest of training sets.
Page 3, “Methods”
The ℓ1-norm regularisation has found many applications in several scientific fields as it encourages sparse solutions which reduce the possibility of overfitting and enhance the interpretability of the inferred model (Hastie et al., 2009).
Page 3, “Methods”
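
As a minimal illustration of why an ℓ1 penalty leads to sparse solutions, the sketch below applies the soft-thresholding operator (the proximal operator of the ℓ1 norm) to a small weight vector. The function name `soft_threshold`, the example weights and the threshold value are illustrative assumptions and are not taken from the paper.

```python
def soft_threshold(weights, lam):
    """Shrink every weight towards zero by lam.

    Weights whose magnitude is at most lam become exactly 0.0,
    which is how an l1 penalty produces sparse models.
    """
    shrunk = []
    for w in weights:
        if w > lam:
            shrunk.append(w - lam)
        elif w < -lam:
            shrunk.append(w + lam)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical weights for five terms; small ones are zeroed out.
weights = [0.9, -0.05, 0.2, -1.5, 0.01]
sparse = soft_threshold(weights, lam=0.25)
```

Only the two large weights survive the shrinkage here, which mirrors the interpretability argument in the sentence above: fewer non-zero terms are left to inspect.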
Notice that there is a large performance improvement after the first step (which alone is a linear solver), but overfitting occurs after step 11.
Page 6, “Experiments”
This might be a result of overfitting the model to a single response variable which usually has a smooth behaviour.
Page 8, “Experiments”
On the contrary, the multitask learning property of BGL reduces this type of overfitting providing more statistical evidence for the terms and users and thus, yielding not only a better inference performance, but also a more accurate model.
Page 8, “Experiments”


linear regression

Appears in 4 sentences as: linear regression (4)

In A user-centric model of voting intention from Social Media

Most existing methods treat these problems as linear regression, learning to relate word frequencies and other simple features to a known response variable (e.g., voting intention polls or financial indicators).
Page 1, “Abstract”
The main theme of the aforementioned works is linear regression between word frequencies and a real-world quantity.
Page 1, “Introduction”
5 and 7), where we individually learn {W, β} and then {U, β}; each step of the process is a standard linear regression problem with an ℓ1/ℓ2 regulariser.
Page 5, “Methods”
The first makes a constant prediction of the mean value of the response variable y in the training set (B_μ); the second predicts the last value of y (B_last); and the third baseline (LEN) is a linear regression over the terms using elastic net regularisation.
Page 6, “Experiments”
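
The two simple baselines in the sentence above can be sketched in a few lines. This is a hedged illustration: the function names and the toy poll values are hypothetical, and "last value of y" is assumed here to mean the most recently observed value before each test point; the paper's exact protocol may differ.

```python
def baseline_mean(y_train, n_test):
    """Constant prediction: the mean of the training responses."""
    mean = sum(y_train) / len(y_train)
    return [mean] * n_test


def baseline_last(y_train, y_test):
    """Predict each test value with the previously observed value of y.

    Assumption: test points arrive in time order and each true value
    becomes available before the next prediction is made.
    """
    predictions = []
    previous = y_train[-1]
    for y in y_test:
        predictions.append(previous)
        previous = y
    return predictions


# Toy voting-intention percentages (made up for illustration).
y_train = [30.0, 32.0, 34.0]
y_test = [33.0, 35.0]
mean_predictions = baseline_mean(y_train, len(y_test))
last_predictions = baseline_last(y_train, y_test)
```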


feature space

Appears in 3 sentences as: feature space (3)

In A user-centric model of voting intention from Social Media

In addition, more advanced regularisation functions enable multitask learning schemes that can exploit shared structure in the feature space.
Page 1, “Introduction”
Group LASSO exploits a predefined group structure on the feature space and tries to achieve sparsity in the group-level, i.e.
Page 5, “Methods”
In this optimisation process, we aim to enforce sparsity in the feature space but in a structured manner.
Page 5, “Methods”
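
To make group-level sparsity concrete, the following sketch computes an unweighted ℓ1/ℓ2 penalty (the sum of the Euclidean norms of each group) and applies group soft-thresholding, which zeroes whole groups at once. The grouping, the values and the function names are illustrative assumptions; some formulations also weight each group by its size, which is omitted here.

```python
import math


def group_lasso_penalty(groups):
    """Sum of the l2 norms of each group of weights (an l1/l2 norm)."""
    return sum(math.sqrt(sum(w * w for w in group)) for group in groups)


def group_soft_threshold(group, lam):
    """Shrink a whole group towards zero; drop it if its norm is <= lam."""
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * w for w in group]


# Three hypothetical feature groups, e.g. one group per term across tasks.
groups = [[3.0, 4.0], [0.1, 0.1], [0.0, 2.0]]
penalty = group_lasso_penalty(groups)
shrunk = [group_soft_threshold(g, lam=0.5) for g in groups]
```

In this toy case the second group is removed entirely while the other two are only scaled down, which is the structured sparsity the sentences above describe.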


loss function

Appears in 3 sentences as: loss function (2) loss functions (1)

In A user-centric model of voting intention from Social Media

2 is the standard regularisation loss function, namely the sum squared error over the training instances.
Page 3, “Methods”
Biconvex functions and possible applications have been well studied in the optimisation literature (Quesada and Grossmann, 1995;
Footnote 4: Note that other loss functions could be used here, such as logistic loss for classification, or more generally bilinear
Page 3, “Methods”
The loss function in our evaluation is the standard Mean Square Error (MSE), but to allow a better interpretation of the results, we display its root (RMSE) in tables and figures.
Page 6, “Experiments”
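
The relation between MSE and RMSE mentioned above can be sketched as follows. The helper names and the toy values are assumptions made for illustration; the point is that RMSE is expressed in the same units as the response variable (here, percentage points), which is what makes it easier to interpret.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as y."""
    return math.sqrt(mse(y_true, y_pred))


# Toy polling percentages and predictions (made up for illustration).
y_true = [30.0, 32.0, 34.0]
y_pred = [31.0, 32.0, 32.0]
error_mse = mse(y_true, y_pred)
error_rmse = rmse(y_true, y_pred)
```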


sentiment analysis

Appears in 3 sentences as: sentiment analysis (3)

In A user-centric model of voting intention from Social Media

They also tend to incorporate handcrafted lists of search terms to filter irrelevant content and use sentiment analysis lexicons for extracting opinion bias.
Page 1, “Introduction”
In this paper, we propose a generic method that aims to be independent of the characteristics described above (use of search terms or sentiment analysis tools).
Page 1, “Introduction”
majority of sentiment analysis tools are English-specific (or even American English) and, most importantly, political word lists (or ontologies) change in time, per country and per party; hence, generalisable methods should make an effort to limit reliance on such tools.
Page 9, “Related Work”
