A Motivating Example | (unigrams) …opment, civilization
Feature Expansion | …, w_N}, where the elements w_i are either unigrams or bigrams that appear in the review d. We then represent a review d by a real-valued term-frequency vector d ∈ ℝ^N, where the value of the j-th element d_j is set to the total number of occurrences of the unigram or bigram w_j in the review d. To find suitable candidates to expand a vector d for the review d, we define a ranking score score(u_i, d) for each base entry in the thesaurus as follows:
Feature Expansion | Moreover, we weight the relatedness scores for each word w_j by its normalized term frequency to emphasize the salient unigrams and bigrams in a review.
Feature Expansion | This is particularly important because we would like to score base entries u_i considering all the unigrams and bigrams that appear in a review d, instead of considering each unigram or bigram individually.
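Feature Expansion | A minimal sketch of this weighting, assuming a relatedness function supplied by the thesaurus (the function name and its arguments are illustrative, not the paper's notation):

```python
from collections import Counter

def expansion_score(base_entry, review_elements, relatedness):
    """Score a thesaurus base entry u_i against a review d.

    `review_elements` is the list of unigrams and bigrams extracted from d;
    `relatedness(u, w)` is an assumed callable returning the thesaurus
    relatedness between base entry u and lexical element w.
    """
    counts = Counter(review_elements)          # term frequencies d_j
    total = sum(counts.values())
    # Weight each element's relatedness by its normalized term frequency,
    # so salient unigrams/bigrams in the review dominate the score.
    return sum((freq / total) * relatedness(base_entry, w)
               for w, freq in counts.items())
```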
Introduction | a unigram or a bigram of word lemma) in a review using a feature vector. |
Sentiment Sensitive Thesaurus | We select unigrams and bigrams from each sentence. |
Sentiment Sensitive Thesaurus | For the remainder of this paper, we will refer to unigrams and bigrams collectively as lexical elements. |
Sentiment Sensitive Thesaurus | Previous work on sentiment classification has shown that both unigrams and bigrams are useful for training a sentiment classifier (Blitzer et al., 2007). |
Experimental Setup | 256,873 unique unigrams and 4,494,222 unique bigrams. |
Experimental Setup | We cluster unigrams (i = 1) and bigrams (i = 2).
Experimental Setup | For all experiments, |B_1| = |B_2| (except in cases where |B_2| exceeds the number of unigrams; see below).
Models | The parameters d′, d″, and d‴ are the discounts for unigrams, bigrams, and trigrams, respectively, as defined by Chen and Goodman (1996, p. 20, (26)).
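Models | As a sketch of how such order-specific discounts enter an interpolated absolute-discounting model (notation assumed; this is the general form, not necessarily the paper's exact equation):

```latex
% Interpolated absolute discounting with the per-order discount d''' at the
% trigram level; the bigram and unigram levels use d'' and d' analogously.
p(w_3 \mid w_1 w_2) =
  \frac{\max\{c(w_1 w_2 w_3) - d''',\, 0\}}{c(w_1 w_2)}
  + \gamma(w_1 w_2)\, p(w_3 \mid w_2),
\qquad
\gamma(w_1 w_2) = \frac{d'''\, N_{1+}(w_1 w_2\,\bullet)}{c(w_1 w_2)}
```

Models | Here N_{1+}(w_1 w_2 •) counts the distinct word types observed after the history w_1 w_2, so the discounted probability mass is redistributed through the backoff term.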
Models | B_2) is the set of unigram (resp.
Models | We cluster bigram histories and unigram histories separately and write p_B(w_3 | w_1 w_2) for the bigram cluster model and p_B(w_3 | w_2) for the unigram cluster model.
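Models | One way to read these cluster models (a sketch; the history-clustering functions C_1 and C_2 are assumed names, not the paper's notation) is that each history is mapped to its cluster before conditioning:

```latex
% Bigram histories w_1 w_2 and unigram histories w_2 are mapped to clusters
% before conditioning; C_2 clusters bigram histories, C_1 clusters unigram histories.
p_B(w_3 \mid w_1 w_2) = p\bigl(w_3 \mid C_2(w_1 w_2)\bigr),
\qquad
p_B(w_3 \mid w_2) = p\bigl(w_3 \mid C_1(w_2)\bigr)
```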
Related work | symbol: Σ_w; denotation: sum over all unigrams w
Background | This work differs from previous Bayesian models in that we explicitly model a complex backoff path using a hierarchical prior, such that our model jointly infers distributions over tag trigrams, bigrams, and unigrams, as well as over whole words and their character-level representation.
Experiments | Note that the bigram PYP-HMM outperforms the closely related BHMM (the main difference being that we smooth tag bigrams with unigrams).
The PYP-HMM | The trigram transition distribution, T_{ij}, is drawn from a hierarchical PYP prior which backs off to a bigram B_j and then a unigram U distribution,
The PYP-HMM | This allows the modelling of trigram tag sequences, while smoothing these estimates with their corresponding bigram and unigram distributions. |
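The PYP-HMM | A sketch of the backoff structure just described (the PYP discount/concentration names a and b and the uniform base distribution are assumptions, not necessarily the paper's exact parameterisation):

```latex
% Trigram transition rows back off to a shared bigram distribution, which in
% turn backs off to a single unigram distribution over tags.
T_{ij} \sim \mathrm{PYP}(a_T, b_T, B_j), \qquad
B_j    \sim \mathrm{PYP}(a_B, b_B, U),   \qquad
U      \sim \mathrm{PYP}(a_U, b_U, \mathrm{Uniform})
```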
The PYP-HMM | That is, each table at one level is equivalent to a customer at the next deeper level, creating the invariants that the number of tables serving a tag in context h equals the number of customers for that tag in its backoff context u, and similarly one level further down, where u = t_{l-1} indicates the unigram backoff context of h. The recursion terminates at the lowest level, where the base distribution is static.
Abstract | Previous work in traditional text classification and its variants, such as sentiment analysis, has achieved successful results by using the bag-of-words representation; that is, by treating text as a collection of words with no interdependencies, and training a classifier on a large feature set of word unigrams that appear in the corpus.
Abstract | Few of these tactics would be effectively encapsulated by word unigrams.
Abstract | Many would be better modeled by POS tag unigrams (with no word information) or by longer n-grams consisting of either words, POS tags, or a combination of the two. |
Approach | (a) Word unigrams (b) Word bigrams |
Approach | (a) PoS unigrams (b) PoS bigrams (c) PoS trigrams |
Approach | Word unigrams and bigrams are lower-cased and used in their inflected forms. |
Previous work | The Bayesian Essay Test Scoring sYstem (BETSY) (Rudner and Liang, 2002) uses multinomial or Bernoulli Naive Bayes models to classify texts into different classes (e.g. pass/fail, grades A-F) based on content and style features such as word unigrams and bigrams, sentence length, number of verbs, noun-verb pairs, etc.
Validity tests | (a) word unigrams within a sentence (b) word bigrams within a sentence (c) word trigrams within a sentence |
Automated Approaches to Deceptive Opinion Spam Detection | Specifically, we consider the following three n-gram feature sets, with the corresponding features lowercased and unstemmed: UNIGRAMS, BIGRAMS+, TRIGRAMS+, where the superscript + indicates that the feature set subsumes the preceding feature set.
Automated Approaches to Deceptive Opinion Spam Detection | We consider all three n-gram feature sets, namely UNIGRAMS, BIGRAMS+, and TRIGRAMS+, with corresponding language models smoothed using the interpolated Kneser-Ney method (Chen and Goodman, 1996).
Automated Approaches to Deceptive Opinion Spam Detection | We use SVMlight (Joachims, 1999) to train our linear SVM models on all three approaches and feature sets described above, namely POS, LIWC, UNIGRAMS, BIGRAMS+, and TRIGRAMS+.
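Automated Approaches to Deceptive Opinion Spam Detection | A minimal sketch of building the subsuming n-gram feature sets and training a linear SVM; scikit-learn's LinearSVC stands in for SVMlight here, and all names are illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# The "+" sets subsume the lower-order n-grams; features are lowercased and unstemmed.
FEATURE_SETS = {
    "UNIGRAMS":  (1, 1),
    "BIGRAMS+":  (1, 2),   # unigrams + bigrams
    "TRIGRAMS+": (1, 3),   # unigrams + bigrams + trigrams
}

def build_model(feature_set):
    lo, hi = FEATURE_SETS[feature_set]
    vectorizer = CountVectorizer(ngram_range=(lo, hi), lowercase=True)
    return make_pipeline(vectorizer, LinearSVC())

# Usage (train_texts and train_labels are placeholders):
# model = build_model("BIGRAMS+")
# model.fit(train_texts, train_labels)
```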
Experiments | The DISCRIMINATIVE baseline for this task is a standard maximum entropy discriminative binary classifier over unigrams.
Model | Global Distributions: At the global level, we draw several unigram distributions: a global background distribution θ_B and attribute distributions θ_a for each attribute.
Model | Product Level: For the ith product, we draw property unigram distributions θ_{i,1}, …
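Model | A small sketch of drawing such unigram distributions from symmetric Dirichlet priors (all names, sizes, and the Dirichlet choice itself are assumptions for illustration, not the paper's specification):

```python
import numpy as np

rng = np.random.default_rng(0)
V = 5000          # vocabulary size (assumed)
N_ATTRS = 3       # number of attributes (assumed)
N_PROPS = 4       # property distributions per product (assumed)
ALPHA = 0.1       # symmetric Dirichlet concentration (assumed)

# Global level: background distribution and one unigram distribution per attribute.
theta_bg = rng.dirichlet(ALPHA * np.ones(V))
theta_attr = [rng.dirichlet(ALPHA * np.ones(V)) for _ in range(N_ATTRS)]

# Product level: property unigram distributions for a single product.
def draw_product_properties():
    return [rng.dirichlet(ALPHA * np.ones(V)) for _ in range(N_PROPS)]

product_props = {i: draw_product_properties() for i in range(10)}
```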