Index of papers in Proc. ACL 2012 that mention
  • bigram
Sun, Xu and Wang, Houfeng and Li, Wenjie
System Architecture
To derive word features, our system first automatically collects a list of word unigrams and bigrams from the training data.
System Architecture
To avoid overfitting, we only collect the word unigrams and bigrams whose frequency is larger than 2 in the training set.
System Architecture
This list of word unigrams and bigrams is then used as a unigram dictionary and a bigram dictionary to generate word-based unigram and bigram features.
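The dictionary-collection step in these excerpts can be illustrated with a minimal Python sketch (not the authors' code; function and variable names are illustrative): collect word unigrams and bigrams from tokenized training sentences, keep those whose frequency is larger than 2, and use the resulting dictionaries to fire lookup features.

```python
from collections import Counter

def collect_ngram_dicts(sentences, min_count=3):
    """Collect unigram and bigram dictionaries from tokenized sentences,
    keeping only n-grams whose frequency is larger than 2 (count >= 3)."""
    uni, bi = Counter(), Counter()
    for sent in sentences:
        uni.update(sent)
        bi.update(zip(sent, sent[1:]))
    return ({w for w, c in uni.items() if c >= min_count},
            {b for b, c in bi.items() if c >= min_count})

def word_features(sent, i, uni_dict, bi_dict):
    """Dictionary-lookup features for position i: is the current unigram /
    the (current, next) bigram in the collected dictionaries?"""
    feats = {"uni_in_dict": sent[i] in uni_dict}
    if i + 1 < len(sent):
        feats["bi_in_dict"] = (sent[i], sent[i + 1]) in bi_dict
    return feats
```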
bigram is mentioned in 8 sentences in this paper.
Zhao, Qiuye and Marcus, Mitch
Abstract
We consider bigram and trigram templates for generating potentially deterministic constraints.
Abstract
A bigram constraint includes one contextual word (w_{-1} or w_{+1}) or the corresponding morph feature, and a trigram constraint includes both contextual words or their morph features.
Abstract
         precision  recall  F1
bigram   0.993      0.841   0.911
trigram  0.996      0.608   0.755
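The notion of a potentially deterministic bigram constraint can be sketched as follows, assuming a setup where the single contextual word is the previous word; this is an illustrative reconstruction, not the authors' implementation.

```python
from collections import Counter, defaultdict

def deterministic_bigram_constraints(tagged_sents, threshold=1.0):
    """Keep (previous word, current word) contexts whose most frequent tag
    accounts for at least `threshold` of their training occurrences."""
    context_tags = defaultdict(Counter)
    for sent in tagged_sents:                        # sent: list of (word, tag)
        for k in range(1, len(sent)):
            context = (sent[k - 1][0], sent[k][0])   # one contextual word (w_{-1})
            context_tags[context][sent[k][1]] += 1
    constraints = {}
    for context, tags in context_tags.items():
        tag, count = tags.most_common(1)[0]
        if count / sum(tags.values()) >= threshold:
            constraints[context] = tag               # deterministic constraint
    return constraints
```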
bigram is mentioned in 8 sentences in this paper.
Constant, Matthieu and Sigogne, Anthony and Watrin, Patrick
MWE-dedicated Features
We use word unigrams and bigrams in order to capture multiwords present in the training section and to extract lexical cues to discover new MWEs.
MWE-dedicated Features
For instance, the bigram coup de is often the prefix of compounds such as coup de pied (kick), coup de foudre (love at first sight), coup de main (help).
MWE-dedicated Features
We use part-of-speech unigrams and bigrams in order to capture MWEs with irregular syntactic structures that might indicate the id-iomacity of a word sequence.
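A minimal sketch of one such lexical cue, assuming a list of known MWEs is available (illustrative names, not the authors' code): count how often a word bigram such as coup de begins a known MWE, and fire a feature when a token position starts one of these frequent prefixes.

```python
from collections import Counter

def mwe_prefix_bigrams(known_mwes, min_count=2):
    """Word bigrams that frequently begin known MWEs, e.g. ('coup', 'de')
    prefixing 'coup de pied', 'coup de foudre', 'coup de main'."""
    prefixes = Counter(tuple(mwe[:2]) for mwe in known_mwes if len(mwe) >= 2)
    return {bg for bg, c in prefixes.items() if c >= min_count}

def mwe_cue_feature(sent, i, prefix_bigrams):
    """Fires when the token at position i starts a frequent MWE prefix."""
    return i + 1 < len(sent) and (sent[i], sent[i + 1]) in prefix_bigrams
```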
bigram is mentioned in 5 sentences in this paper.
Green, Spence and DeNero, John
A Class-based Model of Agreement
The features are indicators for (character, position, label) triples over a five-character window, plus bigram label transition indicators.
A Class-based Model of Agreement
Bigram transition features gbt encode local agreement relations.
A Class-based Model of Agreement
We trained a simple add-1 smoothed bigram language model over gold class sequences in the same treebank training data:
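An add-1 (Laplace) smoothed bigram language model over class sequences can be sketched as below; this is an illustrative reconstruction, not the authors' code.

```python
from collections import Counter

class AddOneBigramLM:
    """Add-1 (Laplace) smoothed bigram language model over class sequences."""

    def __init__(self, sequences):
        self.uni, self.bi, self.vocab = Counter(), Counter(), set()
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            self.vocab.update(padded)
            self.uni.update(padded[:-1])
            self.bi.update(zip(padded, padded[1:]))

    def prob(self, prev, curr):
        """P(curr | prev) with add-1 smoothing over the class vocabulary."""
        return (self.bi[(prev, curr)] + 1) / (self.uni[prev] + len(self.vocab))
```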
bigram is mentioned in 5 sentences in this paper.
Sun, Weiwei and Uszkoreit, Hans
Capturing Paradigmatic Relations via Word Clustering
The quality is defined based on a class-based bigram language model as follows.
Capturing Paradigmatic Relations via Word Clustering
The objective function is maximizing the likelihood ∏_i P(w_i | w_1, ..., w_{i-1}) of the training data given a partially class-based bigram model of the form
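A common instantiation of such a partially class-based bigram model (as in Brown-style clustering) factors each bigram probability as P(w_i | w_{i-1}) = P(C(w_i) | C(w_{i-1})) · P(w_i | C(w_i)); the sketch below evaluates the training-data log-likelihood under a fixed clustering and is illustrative only.

```python
import math
from collections import Counter

def class_bigram_log_likelihood(sentences, cluster):
    """Training-data log-likelihood under a partially class-based bigram model:
    P(w_i | w_{i-1}) = P(C(w_i) | C(w_{i-1})) * P(w_i | C(w_i))."""
    word_counts, class_counts, class_bigrams = Counter(), Counter(), Counter()
    for sent in sentences:
        classes = [cluster[w] for w in sent]
        word_counts.update(sent)
        class_counts.update(classes)
        class_bigrams.update(zip(classes, classes[1:]))
    ll = 0.0
    for sent in sentences:
        for prev, curr in zip(sent, sent[1:]):
            c_prev, c_curr = cluster[prev], cluster[curr]
            p = (class_bigrams[(c_prev, c_curr)] / class_counts[c_prev]
                 * word_counts[curr] / class_counts[c_curr])
            ll += math.log(p)
    return ll
```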
State-of-the-Art
Word bigrams: w_{-2}w_{-1}, w_{-1}w, ww_{+1}, w_{+1}w_{+2}. In order to better handle unknown words, we extract morphological features: character n-gram prefixes and suffixes for n up to 3.
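These templates are straightforward to reproduce; the sketch below (illustrative, not the authors' code) builds the four word-bigram features together with character prefix and suffix n-grams up to length 3.

```python
def word_and_morph_features(words, i, max_n=3):
    """Word bigram templates w_{-2}w_{-1}, w_{-1}w, ww_{+1}, w_{+1}w_{+2}
    plus character n-gram prefixes and suffixes (n <= 3) of the current word."""
    pad = lambda k: words[i + k] if 0 <= i + k < len(words) else "<pad>"
    w = words[i]
    feats = {
        "bi[-2,-1]": pad(-2) + "_" + pad(-1),
        "bi[-1,0]": pad(-1) + "_" + w,
        "bi[0,+1]": w + "_" + pad(1),
        "bi[+1,+2]": pad(1) + "_" + pad(2),
    }
    for n in range(1, max_n + 1):
        feats["prefix%d" % n] = w[:n]
        feats["suffix%d" % n] = w[-n:]
    return feats
```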
State-of-the-Art
Huang et al. (2009) introduced a bigram HMM model with latent variables (Bigram HMM-LA in the table) for Chinese tagging.
State-of-the-Art
Trigram HMM (Huang et al., 2009): 93.99%
Bigram HMM-LA (Huang et al., 2009): 94.53%
Our tagger: 94.69%
bigram is mentioned in 5 sentences in this paper.
Elsner, Micha and Goldwater, Sharon and Eisenstein, Jacob
Conclusion
We have presented a noisy-channel model that simultaneously learns a lexicon, a bigram language model, and a model of phonetic variation, while using only the noisy surface forms as training data.
Introduction
Previous models with similar goals have learned from an artificial corpus with a small vocabulary (Driesen et al., 2009; Rasanen, 2011) or have modeled variability only in vowels (Feldman et al., 2009); to our knowledge, this paper is the first to use a naturalistic infant-directed corpus while modeling variability in all segments, and to incorporate word-level context (a bigram language model).
Introduction
Our model is conceptually similar to those used in speech recognition and other applications: we assume the intended tokens are generated from a bigram language model and then distorted by a noisy channel, in particular a log-linear model of phonetic variability.
Related work
In contrast, our model uses a symbolic representation for sounds, but models variability in all segment types and incorporates a bigram word-level language model.
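A toy version of this generative story, with a hand-written bigram LM and a placeholder channel standing in for the paper's log-linear model of phonetic variability (all names and probabilities are made up):

```python
import random

def sample_from(dist):
    """Sample a key from a {item: probability} dictionary."""
    r, total = random.random(), 0.0
    for item, p in dist.items():
        total += p
        if r <= total:
            return item
    return item  # fall back to the last item on rounding error

def generate_utterance(bigram_lm, distort, max_len=20):
    """Toy noisy-channel story: intended words are drawn from a bigram LM,
    then each is distorted by a channel function (placeholder for the
    paper's log-linear model of phonetic variability)."""
    intended, prev = [], "<s>"
    for _ in range(max_len):
        word = sample_from(bigram_lm[prev])
        if word == "</s>":
            break
        intended.append(word)
        prev = word
    return intended, [distort(w) for w in intended]

# Made-up bigram LM and a channel that occasionally drops the final segment.
lm = {"<s>": {"yu": 0.5, "wan": 0.5},
      "yu": {"wan": 0.7, "</s>": 0.3},
      "wan": {"</s>": 1.0}}
intended, surface = generate_utterance(
    lm, lambda w: w[:-1] if random.random() < 0.3 else w)
```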
bigram is mentioned in 4 sentences in this paper.
Tang, Hao and Keshet, Joseph and Livescu, Karen
Discussion
In the figure, phone bigram TF-IDF is labeled p2; phonetic alignment with dynamic programming is labeled DP.
Experiments
The TF-IDF features used in the experiments are based on phone bigrams.
Feature functions
In practice, we only consider n-grams of a certain order (e.g., bigrams).
Feature functions
Then for the bigram /l iy/, we have TF_{/l iy/}(f_o) = 1/5 (one out of five bigrams in f_o), and IDF_{/l iy/} = log(2/1) (one word out of two in the dictionary).
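The TF-IDF computation over phone bigrams can be reproduced with a short sketch (illustrative; `dictionary` maps words to phone lists and is an assumed structure):

```python
import math
from collections import Counter

def phone_bigrams(pron):
    """Phone bigrams of a pronunciation given as a list of phones."""
    return list(zip(pron, pron[1:]))

def phone_bigram_tf_idf(bigram, pron, dictionary):
    """TF: fraction of `pron`'s bigrams equal to `bigram`; IDF: log of
    (#dictionary words / #dictionary words whose pronunciation contains it).
    Assumes `bigram` occurs in at least one dictionary pronunciation."""
    bigrams = phone_bigrams(pron)
    tf = Counter(bigrams)[bigram] / len(bigrams)
    df = sum(1 for p in dictionary.values() if bigram in phone_bigrams(p))
    return tf * math.log(len(dictionary) / df)
```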
bigram is mentioned in 4 sentences in this paper.
Huang, Zhiheng and Chang, Yi and Long, Bo and Crespo, Jean-Francois and Dong, Anlei and Keerthi, Sathiya and Wu, Su-Lin
Experiments
In particular, we use unigrams of the current word and its neighboring words, word bigrams, prefixes and suffixes of the current word, capitalization, all-number, and punctuation features, and tag bigrams for the POS, CoNLL 2000, and CoNLL 2003 datasets.
Experiments
For the supertag dataset, we use the same features for the word inputs, plus unigrams and bigrams of the gold POS inputs.
Problem formulation
Bigram features are of the form f_k(y_t, y_{t-1}, x_t), which are concerned with both the previous and the current labels.
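Such a bigram feature is typically an indicator over the label pair and the current observation; a minimal sketch, with an illustrative label set and observation test:

```python
def make_bigram_feature(prev_label, curr_label, obs_test):
    """Indicator f_k(y_t, y_{t-1}, x_t): fires when the label bigram matches
    and the test on the current observation x_t holds."""
    def f_k(y_t, y_prev, x_t):
        return 1.0 if (y_prev == prev_label and y_t == curr_label
                       and obs_test(x_t)) else 0.0
    return f_k

# Illustrative: fires on a B-NP -> I-NP transition over a lowercase token.
f = make_bigram_feature("B-NP", "I-NP", str.islower)
```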
bigram is mentioned in 3 sentences in this paper.
Konstas, Ioannis and Lapata, Mirella
Experimental Design
Consecutive Word/Bigram/Trigram: This feature family targets adjacent repetitions of the same word, bigram or trigram, e.g., 'show me the show me the'.
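A minimal sketch of detecting such adjacent repetitions (illustrative, not the authors' feature extractor):

```python
def consecutive_repetitions(tokens, n):
    """Count positions where an n-gram is immediately repeated,
    e.g. 'show me the show me the' for n = 3."""
    return sum(1 for i in range(len(tokens) - 2 * n + 1)
               if tokens[i:i + n] == tokens[i + n:i + 2 * n])

# consecutive_repetitions("show me the show me the flights".split(), 3) -> 1
```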
Problem Formulation
The weight of this rule is the bigram probability of two records conditioned on their type, multiplied by a normalization factor λ.
Problem Formulation
Rule (6) defines the expansion of field F to a sequence of (binarized) words W, with a weight equal to the bigram probability of the current word given the previous word, the current record, and field.
bigram is mentioned in 3 sentences in this paper.
Sun, Weiwei and Wan, Xiaojun
Structure-based Stacking
• Character unigrams: c_k (i − l ≤ k ≤ i + l)
• Character bigrams: c_k c_{k+1} (i − l ≤ k < i + l)
Structure-based Stacking
• Character label bigrams: c^{ppd}_k c^{ppd}_{k+1} (i − l_{ppd} ≤ k < i + l_{ppd})
Structure-based Stacking
• Bigram features: C(s_k)C(s_{k+1}) (i − l_C ≤ k < i + l_C), T^{ctb}(s_k)T^{ctb}(s_{k+1}) (i − l_{ctb} ≤ k < i + l_{ctb}), T^{ppd}(s_k)T^{ppd}(s_{k+1}) (i − l_{ppd} ≤ k < i + l_{ppd})
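The character-window templates above can be sketched as follows (illustrative; the label-bigram and stacking-specific templates are omitted):

```python
def char_window_features(chars, i, l=2):
    """Character unigrams c_k (i-l <= k <= i+l) and character bigrams
    c_k c_{k+1} (i-l <= k < i+l) around position i."""
    feats = {}
    for k in range(i - l, i + l + 1):
        if 0 <= k < len(chars):
            feats["uni[%d]" % (k - i)] = chars[k]
    for k in range(i - l, i + l):
        if 0 <= k and k + 1 < len(chars):
            feats["bi[%d]" % (k - i)] = chars[k] + chars[k + 1]
    return feats
```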
bigram is mentioned in 3 sentences in this paper.