Joint Model | While past extractive methods have assigned value to individual sentences and then explicitly represented the notion of redundancy (Carbonell and Goldstein, 1998), recent methods show greater success by using a simpler notion of coverage: bigrams |
Joint Model | Note that there is intentionally a bigram missing from (a). |
Joint Model | contribute content, and redundancy is implicitly encoded in the fact that redundant sentences cover fewer bigrams (Nenkova and Vanderwende, 2005; Gillick and Favre, 2009). |
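The coverage idea in these excerpts can be sketched as a greedy selector: each candidate sentence's gain is the weight of the document bigrams it would newly cover, so a redundant sentence earns nothing extra and is simply never picked. This is an illustrative sketch in the spirit of Gillick and Favre (2009), not their exact ILP formulation; the function names and weighting scheme are our own.

```python
def bigrams(sentence):
    """Return the set of word bigrams in a whitespace-tokenized sentence."""
    toks = sentence.split()
    return {(a, b) for a, b in zip(toks, toks[1:])}

def greedy_summary(sentences, weights, budget):
    """Greedily add the sentence contributing the most uncovered bigram
    weight, subject to a word budget. Redundancy needs no explicit
    penalty: a sentence whose bigrams are already covered adds no new
    weight and is never selected."""
    covered, chosen, length = set(), [], 0
    while True:
        best, best_gain = None, 0.0
        for i, s in enumerate(sentences):
            if i in chosen or length + len(s.split()) > budget:
                continue
            gain = sum(weights.get(b, 0.0) for b in bigrams(s) - covered)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            return [sentences[i] for i in chosen]
        chosen.append(best)
        covered |= bigrams(sentences[best])
        length += len(sentences[best].split())
```

With bigram weights taken from document frequencies, a sentence that duplicates an already-chosen one has zero gain and is skipped automatically.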
Structured Learning | We use bigram recall as our loss function (see Section 3.3). |
Structured Learning | Luckily, our choice of loss function, bigram recall, factors over bigrams.
Structured Learning | We simply modify each bigram's value to include bigram b's contribution to the total loss.
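A minimal sketch of how a loss that factors over bigrams can be folded into per-bigram values; the function names and the exact bookkeeping here are ours, not the paper's.

```python
def bigram_recall(pred_bigrams, ref_bigrams):
    """Fraction of reference bigrams recovered by the predicted summary."""
    if not ref_bigrams:
        return 1.0
    return len(pred_bigrams & ref_bigrams) / len(ref_bigrams)

def loss_augmented_values(values, ref_bigrams):
    """Fold each bigram's share of the loss into its model value.

    Because bigram recall decomposes over bigrams, covering one
    reference bigram earns back exactly 1/|ref| of the recall loss,
    so that share can be added directly to the bigram's value.
    """
    bonus = 1.0 / len(ref_bigrams)
    return {b: v + (bonus if b in ref_bigrams else 0.0)
            for b, v in values.items()}
```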
Background | This work differs from previous Bayesian models in that we explicitly model a complex backoff path using a hierarchical prior, such that our model jointly infers distributions over tag trigrams, bigrams, and unigrams, as well as over whole words and their character-level representations.
Experiments | Results (two evaluation metrics per model): mkcls (Och, 1999) 73.7 / 65.6; MLE 1HMM-LM (Clark, 2003)* 71.2 / 65.5; BHMM (GG07) 63.2 / 56.2; PR (Ganchev et al., 2010)* 62.5 / 54.8; Trigram PYP-HMM 69.8 / 62.6; Trigram PYP-1HMM 76.0 / 68.0; Trigram PYP-1HMM-LM 77.5 / 69.7; Bigram PYP-HMM 66.9 / 59.2; Bigram PYP-1HMM 72.9 / 65.9; Trigram DP-HMM 68.1 / 60.0; Trigram DP-1HMM 76.0 / 68.0; Trigram DP-1HMM-LM 76.8 / 69.8
Experiments | If we restrict the model to bigrams we see a considerable drop in performance. |
The PYP-HMM | The trigram transition distribution, Tij, is drawn from a hierarchical PYP prior which backs off to a bigram Bj and then a unigram U distribution, |
The PYP-HMM | This allows the modelling of trigram tag sequences, while smoothing these estimates with their corresponding bigram and unigram distributions. |
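The trigram-to-bigram-to-unigram backoff path can be illustrated with fixed-discount interpolation. Note this is only a crude stand-in for the hierarchical PYP prior described above, which infers its discount and concentration parameters rather than fixing them; all names here are ours.

```python
from collections import Counter

def interpolated_trigram(tags, d=0.5):
    """Trigram tag model that backs off to bigram and unigram
    estimates via absolute discounting, mirroring the backoff
    path T (trigram) -> B (bigram) -> U (unigram)."""
    uni = Counter(tags)
    bi = Counter(zip(tags, tags[1:]))
    tri = Counter(zip(tags, tags[1:], tags[2:]))
    n = len(tags)

    def p_uni(t):
        return uni[t] / n  # MLE at the bottom of the backoff path

    def p_bi(t, prev):
        hist = sum(c for (a, _), c in bi.items() if a == prev)
        if hist == 0:
            return p_uni(t)
        types = sum(1 for (a, _) in bi if a == prev)
        return (max(bi[(prev, t)] - d, 0) + d * types * p_uni(t)) / hist

    def p_tri(t, p2, p1):
        hist = sum(c for k, c in tri.items() if k[:2] == (p2, p1))
        if hist == 0:
            return p_bi(t, p1)
        types = sum(1 for k in tri if k[:2] == (p2, p1))
        return (max(tri[(p2, p1, t)] - d, 0)
                + d * types * p_bi(t, p1)) / hist

    return p_tri
```

The discounted mass at each level is redistributed according to the next-coarser distribution, which is exactly the smoothing role the bigram and unigram distributions play in the excerpt.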
The PYP-HMM | We formulate the character-level language model as a bigram model over the character sequence comprising word w_i,
A Motivating Example | (bigrams) survey+development, development+civilization
Feature Expansion | ..., w_N}, where the elements w_i are either unigrams or bigrams that appear in the review d. We then represent a review d by a real-valued term-frequency vector d ∈ R^N, where the value of the j-th element d_j is set to the total number of occurrences of the unigram or bigram w_j in the review d. To find suitable candidates to expand a vector d for the review d, we define a ranking score score(u_i, d) for each base entry in the thesaurus as follows:
Feature Expansion | Moreover, we weight the relatedness scores for each word wj by its normalized term-frequency to emphasize the salient unigrams and bigrams in a review. |
Feature Expansion | This is particularly important because we would like to score base entries ui considering all the unigrams and bigrams that appear in a review d, instead of considering each unigram or bigram individually. |
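One plausible instantiation of the ranking score described in these excerpts, relatedness of the base entry to each lexical element, weighted by the element's normalized term frequency and summed over the whole review, might look as follows. The exact formula in the paper may differ; the names here are ours.

```python
def score(base_entry, review_tf, relatedness):
    """Rank a thesaurus base entry u_i against a review.

    review_tf maps each unigram/bigram in the review to its term
    frequency; relatedness maps (base_entry, element) pairs to a
    relatedness value. Weighting by normalized term frequency
    emphasizes salient elements, and summing over all elements scores
    the entry against the review as a whole rather than element by
    element."""
    total = sum(review_tf.values())
    if total == 0:
        return 0.0
    return sum((tf / total) * relatedness.get((base_entry, w), 0.0)
               for w, tf in review_tf.items())
```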
Introduction | a unigram or a bigram of word lemmas) in a review using a feature vector.
Sentiment Sensitive Thesaurus | We select unigrams and bigrams from each sentence. |
Sentiment Sensitive Thesaurus | For the remainder of this paper, we will refer to unigrams and bigrams collectively as lexical elements. |
Sentiment Sensitive Thesaurus | Previous work on sentiment classification has shown that both unigrams and bigrams are useful for training a sentiment classifier (Blitzer et al., 2007). |
Experimental Setup | 256,873 unique unigrams and 4,494,222 unique bigrams.
Experimental Setup | We cluster unigrams (i = 1) and bigrams (i = 2).
Experimental Setup | SRILM does not directly support bigram clustering. |
Models | The parameters d′, d′′, and d′′′ are the discounts for unigrams, bigrams, and trigrams, respectively, as defined by Chen and Goodman (1996, p. 20, Eq. (26)).
Models | bigram) histories that is covered by the clusters.
Models | We cluster bigram histories and unigram histories separately and write pB(w3|w1w2) for the bigram cluster model and pB(w3|w2) for the unigram cluster model.
A Simple Lagrangian Relaxation Algorithm | We now give a Lagrangian relaxation algorithm for integration of a hypergraph with a bigram language model, in cases where the hypergraph satisfies the following simplifying assumption: |
A Simple Lagrangian Relaxation Algorithm | over the original (non-intersected) hypergraph, with leaf-node weights augmented by the Lagrange multipliers and the best-bigram scores from step 1. (3) If the output derivation from step 2 has the same set of bigrams as those from step 1, then we have an exact solution to the problem.
A Simple Lagrangian Relaxation Algorithm | C1 states that each leaf in a derivation has exactly one incoming bigram, and that each leaf not in the derivation has 0 incoming bigrams; C2 states that each leaf in a derivation has exactly one outgoing bigram, and that each leaf not in the derivation has 0 outgoing bigrams.
Background: Hypergraphs | Throughout this paper we make the following assumption when using a bigram language model: |
Background: Hypergraphs | Assumption 3.1 (Bigram start/end assumption).
The Full Algorithm | The set P of trigram paths plays an analogous role to the set B of bigrams in our previous algorithm.
Abstract | To illustrate, consider the following feature set, a bigram and a trigram (each term in the n-gram either has the form word or Atag): |
Abstract | please AVB, and g is the number of bigrams in T, excluding sentence-initial and sentence-final markers.
Abstract | Unigrams and Bigrams: As a different sort of baseline, we considered the results of a bag-of-words based classifier.
Automated Approaches to Deceptive Opinion Spam Detection | Specifically, we consider the following three n-gram feature sets, with the corresponding features lowercased and unstemmed: UNIGRAMS, BIGRAMS+, TRIGRAMS+, where the superscript + indicates that the feature set subsumes the preceding feature set.
Automated Approaches to Deceptive Opinion Spam Detection | We consider all three n-gram feature sets, namely UNIGRAMS, BIGRAMS+, and TRIGRAMS+, with corresponding language models smoothed using the interpolated Kneser-Ney method (Chen and Goodman, 1996).
Automated Approaches to Deceptive Opinion Spam Detection | We use SVMlight (Joachims, 1999) to train our linear SVM models on all three approaches and feature sets described above, namely POS, LIWC, UNIGRAMS, BIGRAMS+, and TRIGRAMS+.
Conclusion and Future Work | Specifically, our findings suggest the importance of considering both the context (e.g., BIGRAMS+) and motivations underlying a deception, rather than strictly adhering to a universal set of deception cues (e.g., LIWC).
Results and Discussion | This suggests that a universal set of keyword-based deception cues (e.g., LIWC) is not the best approach to detecting deception, and a context-sensitive approach (e.g., BIGRAMS+) might be necessary to achieve state-of-the-art deception detection performance.
Results and Discussion | Additional work is required, but these findings further suggest the importance of moving beyond a universal set of deceptive language features (e.g., LIWC) by considering both the contextual (e.g., BIGRAMS+) and motivational parameters underlying a deception as well.
Experiments | We also kept NPs with only 1 modifier to be used for generating <modifier, head noun> bigram counts at training time.
Experiments | For example, the NP “the beautiful blue Macedonian vase” generates the following bigrams: <beautiful blue>, <blue Macedonian>, and <beautiful Macedonian>, along with the 3-gram <beautiful blue Macedonian>.
Experiments | In addition, we also store a table that keeps track of bigram counts for < M, H >, where H is the head noun of an NP and M is the modifier closest to it. |
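The bigram bookkeeping described above might be sketched as follows. The helper names are ours; the pair generation simply mirrors the worked example for "the beautiful blue Macedonian vase", where every ordered modifier pair (not only adjacent ones) is emitted.

```python
from itertools import combinations
from collections import Counter

def modifier_ngrams(modifiers):
    """From an NP's ordered modifier list, emit every ordered modifier
    pair plus the full modifier sequence, e.g. for
    ["beautiful", "blue", "Macedonian"]: <beautiful blue>,
    <beautiful Macedonian>, <blue Macedonian>, and the 3-gram."""
    pairs = list(combinations(modifiers, 2))
    return pairs, tuple(modifiers)

def closest_modifier_bigrams(nps):
    """Count <M, H> pairs, where H is an NP's head noun and M is the
    modifier closest to it (the last one before the head)."""
    counts = Counter()
    for modifiers, head in nps:
        if modifiers:
            counts[(modifiers[-1], head)] += 1
    return counts
```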
Related Work | Shaw and Hatzivassiloglou also use a transitivity method to fill out parts of the Count table where bigrams are not actually seen in the training data but their counts can be inferred from other entries in the table, and they use a clustering method to group together modifiers with similar positional preferences. |
Related Work | Shaw and Hatzivassiloglou report a highest accuracy of 94.93% and a lowest accuracy of 65.93%, but since their methods depend heavily on bigram counts in the training corpus, they are also limited in how informed their decisions can be if modifiers in the test data are not present at training time. |
Approach | (a) Word unigrams (b) Word bigrams |
Approach | (a) PoS unigrams (b) PoS bigrams (c) PoS trigrams |
Approach | Word unigrams and bigrams are lower-cased and used in their inflected forms. |
Previous work | The Bayesian Essay Test Scoring sYstem (BETSY) (Rudner and Liang, 2002) uses multinomial or Bernoulli Naive Bayes models to classify texts into different classes (e.g. pass/fail, grades A–F) based on content and style features such as word unigrams and bigrams, sentence length, number of verbs, noun–verb pairs etc.
Validity tests | (a) word unigrams within a sentence (b) word bigrams within a sentence (c) word trigrams within a sentence |
Experiments | Web page hits for word pairs and trigrams are obtained using a simple heuristic query to the search engine Google. Inflected queries are performed by expanding a bigram or trigram into all its morphological forms.
Experiments | Although Google hit counts are noisier, they have much larger coverage of bigrams and trigrams.
Experiments | This means that if the number of pages indexed by Google doubles, then so do the bigram and trigram frequencies.
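The inflected-query expansion can be sketched as a product over per-lemma surface forms, summing hit counts across all variants. The `hits` callable here stands in for the search-engine page-count query and is hypothetical, as is the inflection table.

```python
from itertools import product

def inflected_hits(ngram, inflections, hits):
    """Sum web hit counts over all morphological variants of an n-gram.

    `inflections` maps a lemma to its surface forms (lemmas absent from
    the table are used as-is); `hits` is a stand-in for querying a
    search engine for a phrase's page count."""
    total = 0
    for combo in product(*(inflections.get(w, [w]) for w in ngram)):
        total += hits(" ".join(combo))
    return total
```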
Related Work | Keller and Lapata (2003) evaluated the utility of using web search engine statistics for unseen bigrams.
Models 2.1 Baseline Models | After CRF-based recovery of the suffix tag sequence, we use a bigram language model trained on a fully segmented version of the training data to recover the original vowels.
Models 2.1 Baseline Models | We used bigrams only, because the suffix vowel harmony alternation depends only upon the preceding phonemes in the word from which it was segmented. |
Models 2.1 Baseline Models | original training data: koskevaa mietintöä käsitellään
segmentation: koske+ +va+ +a mietintö+ +ä käsi+ +te+ +llä+ +ä+ +n (train bigram language model with mapping A = {a, ä})
map final suffix to abstract tag-set: koske+ +va+ +A mietintö+ +A käsi+ +te+ +llä+ +ä+ +n (train CRF model to predict the final suffix)
peeling off final suffix: koske+ +va+ mietintö+ käsi+ +te+ +llä+ +ä+ (train SMT model on this transformation of training data)
(a) Training
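Since the suffix vowel alternation is governed by harmony with the preceding phonemes, even a rule-based toy can illustrate recovering the surface vowel for the abstract tag A. This is only an illustrative stand-in for the paper's bigram language model, and the handling of neutral vowels is simplified.

```python
BACK, FRONT = set("aou"), set("äöy")

def realize_suffix_vowel(stem):
    """Pick the surface vowel (a vs. ä) for an abstract suffix vowel A
    by Finnish vowel harmony: the last back or front vowel of the
    preceding material decides, skipping neutral vowels (i, e)."""
    for ch in reversed(stem.lower()):
        if ch in BACK:
            return "a"
        if ch in FRONT:
            return "ä"
    return "ä"  # stems with only neutral vowels take front harmony
```

For the training example above, a back-vowel stem like "koskeva" selects "a" (koskevaa), while a front-vowel stem like "mietintö" selects "ä" (mietintöä).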