Ad hoc rule detection | 3.4 Bigram anomalies
Ad hoc rule detection | 3.4.1 Motivation
Ad hoc rule detection | The bigram method examines relationships between adjacent sisters, complementing the whole rule method by focusing on local properties. |
Ad hoc rule detection | But only the final elements have anomalous bigrams: HD:ID IR:IR, IR:IR AN:RO, and AN:RO JR:IR all never occur.
Additional information | This rule is entirely correct, yet the XX:XX position has low whole rule and bigram scores.
Approach | First, the bigram method abstracts a rule to its bigrams.
Evaluation | For example, the bigram method with a threshold of 39 leads to finding 283 errors (455 × .622).
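As a sketch of the abstraction step described above: a rule's daughter sequence is reduced to its adjacent-sister bigrams, which are then checked against an inventory of attested bigrams. All labels and names below are illustrative, not the paper's code.

```python
from collections import Counter

def rule_bigrams(rule):
    """Abstract a rule (sequence of daughter labels) to adjacent-sister bigrams."""
    return list(zip(rule, rule[1:]))

def anomalous_bigrams(rule, attested):
    """Return the bigrams of `rule` never seen in the attested inventory."""
    return [bg for bg in rule_bigrams(rule) if bg not in attested]

# Toy inventory of attested bigrams, built from a tiny set of rules.
treebank_rules = [["HD:ID", "IR:IR"], ["AN:RO", "JR:IR"]]
attested = Counter(bg for r in treebank_rules for bg in rule_bigrams(r))

anoms = anomalous_bigrams(["HD:ID", "IR:IR", "AN:RO"], attested)
```

A rule is flagged exactly when one of its local bigrams is unattested, even if each label is individually common.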
Evaluation | The whole rule and bigram methods reveal greater precision in identifying problematic dependencies, isolating elements with lower UAS and LAS scores than with frequency, along with corresponding greater pre- |
Introduction and Motivation | We propose to flag erroneous parse rules, using information which reflects different grammatical properties: POS lookup, bigram information, and full rule comparisons. |
Introduction | Most methods have employed some variant of Expectation Maximization (EM) to learn parameters for a bigram |
Introduction | Ravi and Knight (2009) achieved the best results thus far (92.3% word token accuracy) via a Minimum Description Length approach using an integer program (IP) that finds a minimal bigram grammar that obeys the tag dictionary constraints and covers the observed data. |
Minimized models for supertagging | The 1241 distinct supertags in the tagset result in 1.5 million tag bigram entries in the model and the dictionary contains almost 3.5 million word/tag pairs that are relevant to the test data. |
Minimized models for supertagging | The set of 45 P08 tags for the same data yields 2025 tag bigrams and 8910 dictionary entries. |
Minimized models for supertagging | Our objective is to find the smallest supertag grammar (of tag bigram types) that explains the entire text while obeying the lexicon’s constraints. |
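A minimal sketch of this objective, substituting greedy set cover for the integer program the work actually uses (all names are ours): select tag bigrams until every adjacent word pair can be explained by at least one selected bigram whose tags the lexicon allows.

```python
from itertools import product

def greedy_bigram_grammar(sentences, lexicon):
    """Greedy cover: every adjacent word pair must be explainable by some
    selected tag bigram whose tags the lexicon allows for those words."""
    items = [(s, i) for s, sent in enumerate(sentences)
             for i in range(len(sent) - 1)]
    coverage = {}  # tag bigram -> set of word-pair positions it can explain
    for s, i in items:
        w1, w2 = sentences[s][i], sentences[s][i + 1]
        for bg in product(lexicon[w1], lexicon[w2]):
            coverage.setdefault(bg, set()).add((s, i))
    grammar, uncovered = set(), set(items)
    while uncovered:
        best = max(coverage, key=lambda bg: len(coverage[bg] & uncovered))
        grammar.add(best)
        uncovered -= coverage[best]
    return grammar

lexicon = {"the": {"DT"}, "can": {"MD", "NN"}, "rusted": {"VBD"}}
grammar = greedy_bigram_grammar([["the", "can", "rusted"]], lexicon)
```

Greedy cover only approximates minimality; the point is that the search space is over tag bigram types, constrained by the tag dictionary.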
Experiments | We follow prior work and use sets of bigrams within words. |
Experiments | In our case, during bipartite matching the set X is the set of bigrams in the language being re-permuted, and Y is the union of bigrams in the other languages. |
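A toy version of this setup, using greedy one-to-one matching by edit distance as a stand-in for true bipartite matching (bigrams are represented as two-character strings; names are illustrative):

```python
def edit_distance(a, b):
    """Plain Levenshtein distance via dynamic programming."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a)][len(b)]

def greedy_match(X, Y):
    """Greedily pair each bigram in X with its closest unused bigram in Y."""
    pairs, free = [], set(Y)
    for x in X:
        y = min(free, key=lambda cand: edit_distance(x, cand))
        pairs.append((x, y))
        free.discard(y)
    return pairs

pairs = greedy_match(["ab", "cd"], ["ab", "ce", "zz"])
```

An optimal bipartite matching (e.g. the Hungarian algorithm) can improve on this greedy pairing when matches conflict.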
Experiments | Besides the heuristic baseline, we tried our model-based approach using Unigrams, Bigrams and Anchored Unigrams, with and without learning the parametric edit distances. |
Message Approximation | Figure 2: Various topologies for approximating messages: (a) a unigram model, (b) a bigram model, (c) the anchored unigram model, and (d) the n-best plus backoff model used in Dreyer and Eisner (2009).
Message Approximation | The first is a plain unigram model, the second is a bigram model, and the third is an anchored unigram topology: a position-specific unigram model for each position up to some maximum length. |
Message Approximation | The second topology we consider is the bigram topology, illustrated in Figure 2(b). |
Simulation 1 | Our reader’s language model was an unsmoothed bigram model created using a vocabulary set con- |
Simulation 1 | From this vocabulary, we constructed a bigram model using the counts from every bigram in the BNC for which both words were in vocabulary (about 222,000 bigrams).
Simulation 1 | Specifically, we constructed the model’s initial belief state (i.e., the distribution over sentences given by its language model) by directly translating the bigram model into a wFSA in the log semiring. |
Simulation 2 | Instead, we begin with the same set of bigrams used in Sim. 1, i.e., those that contain two in-vocabulary words, and trim this set by removing rare bigrams that occur fewer than 200 times in the BNC (except that we do not trim any bigrams that occur in our test corpus).
Simulation 2 | This reduces our set of bigrams to about 19,000. |
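The construction and trimming steps above can be sketched as follows (corpus, counts, and threshold are toy stand-ins for the BNC figures):

```python
from collections import Counter

def bigram_counts(tokens, vocab):
    """Count bigrams in which both words are in-vocabulary."""
    return Counter((w1, w2) for w1, w2 in zip(tokens, tokens[1:])
                   if w1 in vocab and w2 in vocab)

def trim(counts, min_count, keep):
    """Drop rare bigrams unless they are in the must-keep set
    (here, the bigrams occurring in the test corpus)."""
    return {bg: c for bg, c in counts.items()
            if c >= min_count or bg in keep}

corpus = ["a", "b", "a", "b", "c", "a", "b"]
counts = bigram_counts(corpus, vocab={"a", "b", "c"})
model = trim(counts, min_count=2, keep={("b", "c")})
```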
Conditional Random Fields | In the sequel, we distinguish between two types of feature functions: unigram features f_{y,x}, associated with parameters μ_{y,x}, and bigram features f_{y',y,x}, associated with parameters λ_{y',y,x}.
Conditional Random Fields | On the other hand, bigram features {f_{y',y,x}}_{(y',y,x) ∈ Y² × X} are helpful in modelling dependencies between successive labels.
Conditional Random Fields | Assume the set of bigram features {λ_{y',y,x_{t+1}}}_{(y',y) ∈ Y²} is sparse with only r(x_{t+1}) ≪ |Y|² non-null values and define the |Y| × |Y| sparse matrix
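A minimal sketch of how such sparsity can be exploited: only the few non-null λ entries are stored explicitly, and the |Y| × |Y| transition matrix is represented implicitly (assumed names, not the paper's implementation):

```python
import math

def sparse_transition_lookup(lambdas, default=0.0):
    """lambdas: {(y_prev, y): weight}; only the few non-null bigram
    parameters are stored, the rest share a single default value."""
    M = {k: math.exp(v) for k, v in lambdas.items()}
    base = math.exp(default)
    return lambda y_prev, y: M.get((y_prev, y), base)

trans = sparse_transition_lookup({("B", "I"): 1.0})
```

Forward-backward recursions can then touch only the r(x) stored entries plus one shared default, instead of all |Y|² cells.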
Experiments | • Bigrams: an implementation of the bigram classifier for soft pattern matching proposed by Cui et al.
Experiments | The probability is calculated as a mixture of bigram and |
Experiments | WCL-1          99.88  42.09  59.22  76.06
Experiments | WCL-3          98.81  60.74  75.23  83.48
Experiments | Star patterns  86.74  66.14  75.05  81.84
Experiments | Bigrams        66.70  82.70  73.84  75.80
Introduction | where we explicitly distinguish the unigram feature function φ^u and the bigram feature function φ^b. Comparing the form of the two functions, we can see that our discussion on HMMs can be extended to perceptrons by substituting Σ_k w^u_k φ^u_k(x, y_n) and Σ_k w^b_k φ^b_k(x, y_{n-1}, y_n) for log p(x_n|y_n) and log p(y_n|y_{n-1}).
Introduction | For bigram features, we compute their upper bounds offline.
Introduction | The simplest case is that the bigram features are independent of the token sequence x.
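In that token-independent case, the offline upper bound is simply a maximum over previous labels, as in this sketch (weights and names invented for illustration):

```python
def bigram_upper_bounds(weights, labels):
    """weights: {(y_prev, y): score}. For each label y, the best possible
    bigram contribution is the max over y_prev; absent pairs score 0."""
    return {y: max(weights.get((yp, y), 0.0) for yp in labels)
            for y in labels}

weights = {("A", "A"): 0.5, ("B", "A"): 2.0, ("A", "B"): -1.0}
bounds = bigram_upper_bounds(weights, labels=["A", "B"])
```

Such bounds let a decoder prune or defer transitions without scoring every (y_prev, y) pair at every position.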
Experiments and Discussions | We use R-1 (recall against unigrams), R-2 (recall against bigrams), and R-SU4 (recall against skip-4 bigrams).
Experiments and Discussions | Note that R-2 is a measure of bigram recall and sumHLDA of HybHSum2 is built on unigrams rather than bigrams.
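Bigram recall in the style of R-2 can be sketched as follows (a simplified single-reference version, not the official ROUGE implementation):

```python
from collections import Counter

def bigrams(tokens):
    return Counter(zip(tokens, tokens[1:]))

def bigram_recall(candidate, reference):
    """Fraction of the reference's bigrams (with multiplicity) that also
    appear in the candidate."""
    ref, cand = bigrams(reference), bigrams(candidate)
    overlap = sum(min(c, cand[bg]) for bg, c in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

score = bigram_recall("the cat sat on the mat".split(), "the cat sat".split())
```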
Regression Model | We similarly include bigram features in the experiments. |
Regression Model | We also include bigram extensions of DMF features. |
Regression Model | We use sentence bigram frequency, sentence rank in a document, and sentence size as additional fea- |
Clustering-based word representations | The Brown algorithm is a hierarchical clustering algorithm which clusters words to maximize the mutual information of bigrams (Brown et al., 1992). |
Clustering-based word representations | So it is a class-based bigram language model. |
Clustering-based word representations | One downside of Brown clustering is that it is based solely on bigram statistics, and does not consider word usage in a wider context. |
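The quantity Brown clustering maximizes can be illustrated on a toy corpus: the average mutual information of adjacent class bigrams under a given word-to-class map. This sketch computes only that objective; the agglomerative merging itself is omitted.

```python
import math
from collections import Counter

def class_bigram_mi(tokens, word2class):
    """Average mutual information of adjacent class bigrams."""
    classes = [word2class[w] for w in tokens]
    uni = Counter(classes)
    bi = Counter(zip(classes, classes[1:]))
    n_uni, n_bi = sum(uni.values()), sum(bi.values())
    mi = 0.0
    for (c1, c2), c in bi.items():
        p12 = c / n_bi
        mi += p12 * math.log(p12 / ((uni[c1] / n_uni) * (uni[c2] / n_uni)))
    return mi

tokens = ["a", "b", "a", "b", "a", "b"]
mi_split = class_bigram_mi(tokens, {"a": "X", "b": "Y"})   # informative split
mi_merged = class_bigram_mi(tokens, {"a": "C", "b": "C"})  # everything merged
```

Collapsing all words into one class drives the objective to zero, which is why the algorithm prefers clusterings whose class bigrams remain predictive.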
Experiments | Excluding q_{t-1} = q_t bigrams (leading to 0.32M frames from 2.39M frames in “all”) offers a glimpse of expected performance differences were duration modeling to be included in the models.
Limitations and Desiderata | To produce Figures 1 and 2, a small fraction of probability mass was reserved for unseen bigram transitions (as opposed to backing off to unigram probabilities). |
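One simple way to reserve such mass, sketched with an illustrative reserve fraction (not necessarily the authors' exact scheme): a fixed fraction of each context's probability is spread uniformly over its unseen successors.

```python
from collections import Counter

def smoothed_bigram_probs(tokens, vocab, reserve=0.01):
    """Reserve a small fraction of each context's mass, spread uniformly
    over unseen successors in the vocabulary (no unigram backoff)."""
    counts = Counter(zip(tokens, tokens[1:]))
    ctx = Counter(tokens[:-1])
    def prob(w1, w2):
        if counts[(w1, w2)]:
            return (1 - reserve) * counts[(w1, w2)] / ctx[w1]
        unseen = sum(1 for w in vocab if not counts[(w1, w)])
        return reserve / unseen if unseen else 0.0
    return prob

p = smoothed_bigram_probs(["a", "b", "a", "b"], vocab={"a", "b", "c"})
```

Each context's distribution still sums to one: seen transitions share 1 − reserve, unseen ones share the reserve.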
The Extended-Degree-of-Overlap Model | The EDO model mitigates R-specificity because it models each bigram (q_{t-1}, q_t) = (S_i, S_j) as the modified bigram (m, [o_{ij}, n_j]), involving three scalars, each of which is a sum, a commutative (and therefore rotation-invariant) operation.
Automatic Metaphor Recognition | They use the hyponymy relation in WordNet and word bigram counts to predict metaphors at a sentence level.
Automatic Metaphor Recognition | Hereby they calculate bigram probabilities of verb-noun and adjective-noun pairs (including the hyponyms/hypernyms of the noun in question). |
Automatic Metaphor Recognition | However, by using bigram counts over verb-noun pairs, Krishnakumaran and Zhu (2007) lose a great deal of information compared to a system extracting verb-object relations from parsed text.
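The underlying bigram-count idea can be sketched as flagging verb-noun pairs with low conditional probability (threshold and names are illustrative; the hypernym/hyponym expansion is omitted):

```python
from collections import Counter

def flag_candidates(pairs, threshold=0.2):
    """Flag verb-noun pairs whose estimated P(noun | verb) is low."""
    pair_counts = Counter(pairs)
    verb_counts = Counter(v for v, _ in pairs)
    return [(v, n) for (v, n), c in pair_counts.items()
            if c / verb_counts[v] < threshold]

pairs = [("devour", "food")] * 9 + [("devour", "book")]
candidates = flag_candidates(pairs)
```

The rarely attested pair stands out relative to the verb's usual objects, which is the signal the bigram-count approach relies on.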