Abstract | We find that bigram dependencies are important for performing well on real data and for learning appropriate deletion probabilities for different contexts.
Experiments 4.1 The data | Best performance for both the Unigram and the Bigram model in the GOLD-p condition is achieved under the left-right setting, in line with the standard analyses of /t/-deletion as primarily being determined by the preceding and the following context.
Experiments 4.1 The data | For the LEARN-p condition, the Bigram model still performs best in the left-right setting but the Unigram model’s performance drops.
Experiments 4.1 The data | Note how the Unigram model always suffers in the LEARN-p condition, whereas the Bigram model’s performance is actually best for LEARN-p in the left-right setting.
Introduction | We find that models that capture bigram dependencies between underlying forms provide considerably more accurate estimates of those probabilities than corresponding unigram or “bag of words” models of underlying forms. |
The computational model | Our models build on the Unigram and the Bigram model introduced in Goldwater et al. |
The computational model | Figure 1 shows the graphical model for our joint Bigram model (the Unigram case is trivially recovered by generating the U_{i,j} directly from L rather than from U_{i,j-1}).
The computational model | Figure 1: The graphical model for our joint model of word-final /t/-deletion and Bigram word segmentation.
Abstract | In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. |
Definitions | similarly define the bigram count N_{ff'} of f, f' ∈ V_f as
Definitions | (a) N_{ff'} are integer counts ≥ 0 of bigrams found in the ciphertext f_1^N.
Definitions | (b) Given the first and last token of the cipher, f_1 and f_N, the bigram counts involving the sentence boundary token $ need to fulfill
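The bigram counts over a ciphertext with sentence-boundary padding can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the function name `bigram_counts` and the convention of padding both ends with `$` are assumptions.

```python
from collections import Counter

def bigram_counts(cipher, boundary="$"):
    # Count N_ff' over the ciphertext f_1..f_N, padding both ends
    # with the sentence-boundary token $ so boundary bigrams exist.
    padded = [boundary] + list(cipher) + [boundary]
    return Counter(zip(padded, padded[1:]))

counts = bigram_counts(["a", "b", "b", "a"])
```

With this padding, exactly one bigram starts at `$` and exactly one ends at `$`, which is the kind of boundary constraint condition (b) states.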
Introduction | Section 5 shows the connection between the quadratic assignment problem and decipherment using a bigram language model. |
Abstract | In this paper, we propose a bigram based supervised method for extractive document summarization in the integer linear programming (ILP) framework. |
Abstract | For each bigram , a regression model is used to estimate its frequency in the reference summary. |
Abstract | The regression model uses a variety of indicative features and is trained discriminatively to minimize the distance between the estimated and the ground truth bigram frequency in the reference summary. |
Introduction | They used bigrams as such language concepts. |
Introduction | Gillick and Favre (Gillick and Favre, 2009) used bigrams as concepts, which are selected from a subset of the sentences, and their document frequency as the weight in the objective function. |
Introduction | In this paper, we propose to find a candidate summary such that the language concepts (e.g., bigrams) in this candidate summary and the reference summary can have the same frequency.
Proposed Method 2.1 Bigram Gain Maximization by ILP | We choose bigrams as the language concepts in our proposed method since they have been successfully used in previous work. |
Proposed Method 2.1 Bigram Gain Maximization by ILP | In addition, we expect that the bigram-oriented ILP is consistent with the ROUGE-2 measure widely used for summarization evaluation.
Abstract | Evaluated on the WSJ corpus, bigram and trigram model perplexities were reduced by up to 23.5% and 14.0%, respectively.
Abstract | Compared to the distant bigram , we show that word-pairs can be more effectively modeled in terms of both distance and occurrence. |
Language Modeling with TD and TO | The prior, which is usually implemented as a unigram model, can also be replaced with a higher-order n-gram model, for instance, the bigram model:
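A maximum-likelihood bigram model of the kind that could serve as such a prior can be sketched as follows; `train_bigram_model` is an illustrative name and the sketch omits the smoothing any practical model would need:

```python
from collections import Counter

def train_bigram_model(tokens):
    # Maximum-likelihood bigram probabilities P(w_i | w_{i-1}).
    history = Counter(tokens[:-1])
    pairs = Counter(zip(tokens, tokens[1:]))
    return {(h, w): c / history[h] for (h, w), c in pairs.items()}

probs = train_bigram_model(["the", "cat", "sat", "the", "cat", "ran"])
```

Each entry is the count of the bigram divided by the count of its history word, e.g. P(sat | cat) = 1/2 here.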
Perplexity Evaluation | As seen from the table, for lower-order n-gram models, the complementary information captured by the TD and TO components reduced the perplexity by up to 23.5% and 14.0%, for bigram and trigram models, respectively.
Perplexity Evaluation | better modeling of word-pairs compared to the distant bigram model.
Perplexity Evaluation | Here we compare the perplexity of both the distance-k bigram model and the distance-k TD model (for values of k ranging from two to ten), when combined with a standard bigram model.
Related Work | The distant bigram model (Huang et al., 1993; Simon et al., 2007) disassembles the n-gram into (n-1) word-pairs, such that each pair is modeled by a distance-k bigram model, where 1 ≤ k ≤ n-1.
Related Work | Each distance-k bigram model predicts the target-word based on the occurrence of a history-word located k positions behind. |
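The distance-k prediction just described can be sketched as a small MLE model; `distance_k_bigram` is an illustrative name, and as above the sketch is unsmoothed:

```python
from collections import Counter

def distance_k_bigram(tokens, k):
    # MLE distance-k bigram model: P(w_i | w_{i-k}), the target word
    # conditioned on the history word located k positions behind.
    pairs = Counter(zip(tokens[:-k], tokens[k:]))
    history = Counter(tokens[:-k])
    return {(h, w): c / history[h] for (h, w), c in pairs.items()}

model = distance_k_bigram(["a", "b", "c", "a", "b", "d"], 2)
```

For k = 1 this reduces to the ordinary bigram model; larger k skips over intervening words.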
Experiments & Results 4.1 Experimental Setup | The measures are evaluated by fixing the window size to 4 and the maximum candidate paraphrase length to 2 (i.e., bigrams).
Experiments & Results 4.1 Experimental Setup | [Figure legend: unigram, bigram, trigram, and quadgram curves]
Experiments & Results 4.1 Experimental Setup | Bipartite: unigram MRR 5.2%, RCL 12.5%; bigram MRR 6.8%, RCL 15.7%. Tripartite: unigram MRR 5.9%, RCL 12.6%; bigram MRR 6.9%, RCL 15.9%. Baseline: bigram MRR 3.9%, RCL 7.7%.
Methodology | In response to these difficulties in differentiating linguistic registers, we compute two different PMI scores for character-based bigrams from two large corpora representing news and microblogs as features. |
Methodology | In addition, we also convert all the character-based bigrams into Pinyin-based bigrams (ignoring tones) and compute the Pinyin-level PMI in the same way.
Methodology | These features capture inconsistent use of a bigram across the two domains, which helps to distinguish informal words.
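Per-corpus character-bigram PMI of the kind used here can be sketched as follows; `char_bigram_pmi` is an illustrative name, probabilities are plain MLE estimates, and the difference of the two corpus scores would form the domain-inconsistency feature:

```python
import math
from collections import Counter

def char_bigram_pmi(text):
    # PMI(c1, c2) = log2( p(c1 c2) / (p(c1) p(c2)) ), with all
    # probabilities estimated by maximum likelihood from one corpus.
    uni = Counter(text)
    bi = Counter(zip(text, text[1:]))
    n_uni, n_bi = sum(uni.values()), sum(bi.values())
    pmi = {}
    for (a, b), c in bi.items():
        pmi[(a, b)] = math.log2((c / n_bi) / ((uni[a] / n_uni) * (uni[b] / n_uni)))
    return pmi

news_pmi = char_bigram_pmi("abab")
```

Computing the same score on a microblog corpus and subtracting would flag bigrams whose association strength differs between registers.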
Change in Description Length | Suppose that the original sequence W_{i-1} is N words long, the selected word types x and y occur k and l times, respectively, and altogether the x-y bigram occurs m times in W_{i-1}.
Change in Description Length | In the new sequence W_i, each of the m bigrams is replaced with an unseen word z = xy.
Regularized Compression | Hence, a new sequence W_i is created in the i-th iteration by merging all the occurrences of some selected bigram (x, y) in the original sequence W_{i-1}.
Regularized Compression | Note that f(x, y) is the bigram frequency, |W_{i-1}| the sequence length of W_{i-1}, and ΔH(W_{i-1}, m) = H(W_i) - H(W_{i-1}) is the difference between the empirical Shannon entropy measured on W_i and W_{i-1}, using maximum likelihood estimates.
Regularized Compression | In the new sequence W_i, each occurrence of the x-y bigram is replaced with a new (conceptually unseen) word z.
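A single merge step of this kind can be sketched as a left-to-right, non-overlapping replacement; `merge_bigram` is an illustrative name, and the left-to-right tie-breaking for overlapping occurrences is an assumption:

```python
def merge_bigram(seq, x, y, z):
    # One compression step: replace every left-to-right,
    # non-overlapping occurrence of the bigram (x, y) with z.
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == x and seq[i + 1] == y:
            out.append(z)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

merged = merge_bigram(["a", "x", "y", "b", "x", "y"], "x", "y", "xy")
```

If the bigram occurs m times, the new sequence is m tokens shorter, which is exactly the length change the description-length analysis above reasons about.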
Evaluation framework | The second clique contains query bigrams that match document bigrams in 2-word ordered windows (‘#1’), λ_2 = 0.1.
Evaluation framework | The third clique uses the same bigrams as clique 2 with an 8-word unordered window (‘#uw8’), λ_3 = 0.05.
Introduction | Integration of the identified catenae in queries also improves IR effectiveness compared to a highly effective baseline that uses sequential bigrams with no linguistic knowledge. |
Baseline parser | bigrams s0ws1w, s0ws1c, s0cs1w, s0cs1c, s0wq0w, s0wq0t, s0cq0w, s0cq0t, q0wq1w, q0wq1t, q0tq1w, q0tq1t, s1wq0w, s1wq0t, s1cq0w, s1cq0t
Semi-supervised Parsing with Large Data | From the dependency trees, we extract bigram lexical dependencies (w1, w2, L/R), where the symbol L (R) means that w1 (w2) is the head of w2 (w1).
Semi-supervised Parsing with Large Data | (2009), we assign categories to bigram and trigram items separately according to their frequency counts. |
Semi-supervised Parsing with Large Data | Hereafter, we refer to the bigram and trigram lexical dependency lists as BLD and TLD, respectively. |
Reranking Features | Bigram . |
Reranking Features | Indicates whether a given bigram of nonterminals/terminals occurs for a given parent nonterminal: f(L1 → L2 : L3) = 1.
Reranking Features | Grandparent Bigram . |
Analysis | with a bigram HMM with four language clusters. |
Inference | where n(t) and n(t, t') are, respectively, unigram and bigram tag counts excluding those containing character w. Conversely, n'(t) and n'(t, t') are, respectively, unigram and bigram tag counts only including those containing character w. The notation a^{(n)} denotes the ascending factorial: a(a + 1) ··· (a + n - 1).
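The ascending factorial a(a + 1) ··· (a + n - 1) that appears in these count ratios is simple to compute directly; this is a minimal sketch with an illustrative function name:

```python
def ascending_factorial(a, n):
    # a^(n) = a (a + 1) ... (a + n - 1), with a^(0) = 1 by convention.
    result = 1
    for i in range(n):
        result *= a + i
    return result
```

Note that for a = 1 it coincides with n!, and in sampler implementations it is usually evaluated in log space to avoid overflow.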
Inference | where n(j, k, t) and n(j, k, t, t') are the numbers of languages currently assigned to cluster k which have more than j occurrences of the unigram (t) and the bigram (t, t'), respectively.
Model | We note that in practice, we implemented a trigram version of the model, but we present the bigram version here for notational clarity.
Introduction | Knowing that the back-transliterated unigram “blacki” and bigram “blacki shred” are unlikely in English can promote the correct WS, “blackish red”.
Use of Language Model | As the English LM, we used Google Web 1T 5-gram Version 1 (Brants and Franz, 2006), limiting it to unigrams occurring more than 2000 times and bigrams occurring more than 500 times. |
Word Segmentation Model | We limit the features to word unigram and bigram features, i.e., φ(y) = Σ_{i=1}^{n} [φ_1(w_i) + φ_2(w_{i-1}, w_i)] for y = w_1 ... w_n.
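The unigram-plus-bigram feature map for a segmentation can be sketched as count features; `segmentation_features` and the `("uni", ...)`/`("bi", ...)` key scheme are illustrative choices, not the paper's representation:

```python
from collections import Counter

def segmentation_features(words):
    # phi(y) for a segmentation y = w_1..w_n: counts of word
    # unigrams (phi_1) and adjacent word bigrams (phi_2).
    feats = Counter()
    for i, w in enumerate(words):
        feats[("uni", w)] += 1
        if i > 0:
            feats[("bi", words[i - 1], w)] += 1
    return feats

phi = segmentation_features(["black", "ish", "red"])
```

A linear model would then score a candidate segmentation as the dot product of this sparse vector with a weight vector.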
Task A: Polarity Classification | We studied the influence of unigrams, bigrams and a combination of the two, and saw that the best performing feature set consists of the combination of unigrams and bigrams . |
Task A: Polarity Classification | In this paper, we will refer from now on to n-grams as the combination of unigrams and bigrams . |
Task B: Valence Prediction | Those include n-grams (unigrams, bigrams, and a combination of the two) and LIWC scores.
Experiments | Besides unigrams and bigrams, the most effective textual feature is URL.
Proposed Features | 3.1.1 Unigrams and Bigrams The most common type of feature for text classification
Proposed Features | feature selection method χ² (Yang and Pedersen, 1997) to select the top 200 unigrams and bigrams as features.
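χ² feature selection over n-gram features can be sketched from a 2x2 term/class contingency table; `chi_square` and `top_k` are illustrative names, and the table cell convention is an assumption:

```python
def chi_square(n11, n10, n01, n00):
    # Chi-square score for a 2x2 term/class contingency table:
    # n11 = in-class docs containing the term, n10 = out-of-class
    # docs containing it, n01/n00 likewise for docs without it.
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    return num / den if den else 0.0

def top_k(scores, k):
    # Keep the k highest-scoring n-gram features.
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:k]]
```

Scoring every unigram and bigram this way and keeping the top 200 mirrors the selection step described in the snippet.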
Phrase Ranking based on Relevance | This thread of research models bigrams by encoding them into the generative process. |
Phrase Ranking based on Relevance | For each word, a topic is sampled first, then its status as a unigram or bigram is sampled, and finally the word is sampled from a topic-specific unigram or bigram distribution. |
Phrase Ranking based on Relevance | In (Tomokiyo and Hurst, 2003), a language model approach is used for bigram phrase extraction. |
Compressive Summarization | (2011), we used stemmed word bigrams as concepts, to which we associate the following concept features (Φ^cov): indicators for document counts, features indicating if each of the words in the bigram is a stop-word, the earliest position in a document each concept occurs, as well as two- and three-way conjunctions of these features.
Experiments | We generated oracle extracts by maximizing bigram recall with respect to the manual abstracts, as described in Berg-Kirkpatrick et al. |
Extractive Summarization | Previous work has modeled concepts as events (Filatova and Hatzivassiloglou, 2004), salient words (Lin and Bilmes, 2010), and word bigrams (Gillick et al., 2008).
Introduction | For example, if a bigram parameter is modified due to the presence of some set of trigrams, and then some or all of those trigrams are pruned from the model, the bigram associated with the modified parameter will be unlikely to have an overall expected frequency equal to its observed frequency anymore. |
Marginal distribution constraints | Thus the unigram distribution is with respect to the bigram model, the bigram model is with respect to the trigram model, and so forth. |
Model constraint algorithm | This can be seen particularly clearly at the unigram state, which has an arc for every unigram (the size of the vocabulary): for every bigram state (also on the order of the vocabulary), in the naive algorithm we must look for every possible arc.
Results | The results in Table 5 use the official ROUGE software with standard options and report ROUGE-2 (R-2) (measures bigram overlap) and ROUGE-SU4 (R-SU4) (measures unigram and skip-bigram separated by up to four words).
The Framework | unigram/bigram/skip-bigram (at most four words apart) overlap; unigram/bigram TF/TF-IDF similarity
The Framework | 2011; Ouyang et al., 2011), we use the ROUGE-2 score, which measures bigram overlap between a sentence and the abstracts, as the objective for regression. |
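A simplified, single-reference version of the ROUGE-2 bigram-overlap score used as the regression target can be sketched as follows; `rouge2_recall` is an illustrative name, and real ROUGE additionally handles stemming, stop-words, and multiple references:

```python
from collections import Counter

def rouge2_recall(candidate, reference):
    # Simplified single-reference ROUGE-2 recall: clipped bigram
    # overlap divided by the number of reference bigrams.
    cand = Counter(zip(candidate, candidate[1:]))
    ref = Counter(zip(reference, reference[1:]))
    overlap = sum(min(c, ref[b]) for b, c in cand.items())
    return overlap / max(sum(ref.values()), 1)

score = rouge2_recall(["the", "cat", "sat"], ["the", "cat", "ran"])
```

Here one of the two reference bigrams is matched, giving recall 0.5.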
Using subcategorization information | The model has access to the basic features stem and tag, as well as the new features based on subcategorization information (explained below), using unigrams within a window of up to four positions to the right and the left of the current position, as well as bigrams and trigrams for stems and tags (current item + left and/or right item).
Using subcategorization information | In addition to the probability/frequency of the respective functions, we also provide the CRF with bigrams containing the two parts of the tuple, |
Using subcategorization information | By providing the parts of the tuple as unigrams, bigrams or trigrams to the CRF, all relevant information is available: verb, noun and the probabilities for the potential functions of the noun in the sentence. |