Index of papers in Proc. ACL 2009 that mention
  • unigram
Mochihashi, Daichi and Yamada, Takeshi and Ueda, Naonori
Inference
While previous work used p(k|Θ) = (1 − p($))^{k−1} p($), this is only true for unigrams.
Inference
the number of tables t_{εw} for w in word unigrams.
Nested Pitman-Yor Language Model
Thus far we have assumed that the unigram G1 is already given, but of course it should also be generated as G1 ~ PY(G0, d, θ).
Nested Pitman-Yor Language Model
Note that this is different from unigrams, which are posterior distributions given the data.
Nested Pitman-Yor Language Model
When a word w is generated from its parent at the unigram node, it means that w
Pitman-Yor process and n-gram models
Suppose we have a unigram word distribution G1 = {p(·)}, where · ranges over each word in the lexicon.
Pitman-Yor process and n-gram models
In this representation, each n-gram context h (including the null context ε for unigrams) is a Chinese restaurant whose customers are the n-gram counts seated over the tables 1 ⋯ t_{hw}.
unigram is mentioned in 7 sentences in this paper.
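The restaurant representation quoted above, where each n-gram context h is a Chinese restaurant with customers seated over tables 1 ⋯ t_{hw} and the unigram G1 itself is drawn as G1 ~ PY(G0, d, θ), can be illustrated with a minimal Pitman-Yor restaurant sketch in Python. This is only a sketch under assumed parameter values, not the authors' blocked sampler; the class and method names (PYRestaurant, seat_customer, prob) and the discount/strength settings are invented for illustration.

    import random
    from collections import defaultdict

    class PYRestaurant:
        """One Chinese restaurant for a single n-gram context h (Pitman-Yor CRP).
        Parameter values below are arbitrary illustrative choices."""
        def __init__(self, discount=0.8, strength=1.0, parent=None):
            self.d = discount                 # discount parameter d
            self.theta = strength             # strength parameter θ
            self.parent = parent              # restaurant of the shorter context (None = base measure)
            self.tables = defaultdict(list)   # word w -> customer counts, one entry per table

        def _num_tables(self):
            return sum(len(ts) for ts in self.tables.values())

        def _num_customers(self):
            return sum(sum(ts) for ts in self.tables.values())

        def prob(self, w, base_prob):
            """Predictive probability of w: discounted counts plus back-off to the parent."""
            parent_p = self.parent.prob(w, base_prob) if self.parent else base_prob
            c, t = self._num_customers(), self._num_tables()
            cw, tw = sum(self.tables[w]), len(self.tables[w])
            return (max(cw - self.d * tw, 0.0)
                    + (self.theta + self.d * t) * parent_p) / (self.theta + c)

        def seat_customer(self, w, base_prob):
            """Seat one customer (one n-gram count) for word w."""
            parent_p = self.parent.prob(w, base_prob) if self.parent else base_prob
            weights = [max(c - self.d, 0.0) for c in self.tables[w]]
            new_table = (self.theta + self.d * self._num_tables()) * parent_p
            r = random.uniform(0.0, sum(weights) + new_table)
            for k, wk in enumerate(weights):
                if r < wk:
                    self.tables[w][k] += 1    # join an existing table serving w
                    return
                r -= wk
            self.tables[w].append(1)          # open a new table for w ...
            if self.parent:                   # ... and send a proxy customer to the parent context
                self.parent.seat_customer(w, base_prob)

    # Unigram restaurant backed by a uniform base measure; a bigram restaurant backs off to it.
    unigram = PYRestaurant()
    bigram = PYRestaurant(parent=unigram)
    bigram.seat_customer("language", base_prob=1e-4)
    print(bigram.prob("language", base_prob=1e-4))

Chaining restaurants this way reproduces standard hierarchical Pitman-Yor smoothing; the nested model in the paper additionally replaces the fixed base measure G0 with a character-level model, which this sketch does not attempt.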
Das, Dipanjan and Smith, Noah A.
Data and Task
Notice the high lexical overlap between the two sentences (unigram overlap of 100% in one direction and 72% in the other).
Data and Task
19 is another true paraphrase pair with much lower lexical overlap (unigram overlap of 50% in one direction and 30% in the other).
Experimental Evaluation
(2006), using features calculated directly from S1 and S2 without recourse to any hidden structure: proportion of word unigram matches, proportion of lemmatized unigram matches, BLEU score (Papineni et al., 2001), BLEU score on lemmatized tokens, F measure (Turian et al., 2003), difference of sentence length, and proportion of dependency relation overlap.
Experimental Evaluation
This is accomplished by eliminating lines 12 and 13 from the definition of pm and redefining p_word to be the unigram word distribution estimated from the Gigaword corpus, as in G0, without the help of WordNet.
QG for Paraphrase Modeling
Here a_w is the Good-Turing unigram probability estimate of a word w from the Gigaword corpus (Graff, 2003).
QG for Paraphrase Modeling
As noted, the distributions pm, the word unigram weights in Eq.
unigram is mentioned in 6 sentences in this paper.
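The directional unigram-overlap figures quoted above (100% in one direction, 72% in the other) are simple to compute. The sketch below uses one common token-level definition, the proportion of tokens of one sentence that also occur in the other; the tokenization, the exact definition, and the toy sentence pair are assumptions, not the authors' feature extraction code.

    def unigram_overlap(src_tokens, tgt_tokens):
        """Proportion of tokens in src that also occur in tgt (directional overlap)."""
        if not src_tokens:
            return 0.0
        tgt_set = set(tgt_tokens)
        return sum(1 for tok in src_tokens if tok in tgt_set) / len(src_tokens)

    # Toy sentence pair; whitespace tokenization and lowercasing stand in for real preprocessing.
    s1 = "the cat sat on the mat".lower().split()
    s2 = "a cat was sitting on the mat".lower().split()
    print(unigram_overlap(s1, s2), unigram_overlap(s2, s1))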
Garera, Nikesh and Yarowsky, David
Corpus Details
As our reference algorithm, we used the current state-of-the-art system developed by Boulis and Ostendorf (2005) using unigram and bigram features in an SVM framework.
Corpus Details
For each conversation side, a training example was created using unigram and bigram features with tf-idf weighting, as done in standard text classification approaches.
Corpus Details
Also, the named entity “Mike” shows up as a discriminative unigram; this may be due to the self-introduction at the beginning of the conversations and “Mike” being a common male name.
unigram is mentioned in 4 sentences in this paper.
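The setup quoted above (word unigram and bigram features with tf-idf weighting, classified with an SVM) is the standard text-classification recipe, so a scikit-learn sketch may help make it concrete. It is an assumed reimplementation, not the authors' or Boulis and Ostendorf's code, and the toy conversation sides and labels are invented placeholders.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Toy stand-ins for conversation sides and speaker-attribute labels (e.g., gender).
    conversation_sides = [
        "hi this is mike how are you doing",
        "hello my name is sarah nice talking to you",
        "yeah mike here i was saying the game last night",
        "well sarah thinks the weather has been lovely",
    ]
    labels = ["male", "female", "male", "female"]

    # Word unigram + bigram features with tf-idf weighting, fed to a linear SVM.
    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LinearSVC(),
    )
    clf.fit(conversation_sides, labels)
    print(clf.predict(["this is mike speaking"]))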
Kruengkrai, Canasai and Uchimoto, Kiyotaka and Kazama, Jun'ichi and Wang, Yiou and Torisawa, Kentaro and Isahara, Hitoshi
Training method
Table 2: Unigram features.
Training method
We broadly classify features into two categories: unigram and bigram features.
Training method
Unigram features: Table 2 shows our unigram features.
unigram is mentioned in 4 sentences in this paper.
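The quoted sentences only state that the features split into unigram and bigram templates (Table 2 itself is not reproduced in this index). Purely to illustrate that split, and not the actual templates of the paper, the sketch below scores a single node with unigram features and an adjacent pair with bigram features; all field names and templates are invented.

    def unigram_features(node):
        """Features of a single node (illustrative templates, not the paper's Table 2)."""
        return [
            f"surface={node['surface']}",
            f"pos={node['pos']}",
            f"length={len(node['surface'])}",
        ]

    def bigram_features(prev_node, node):
        """Features of a pair of adjacent nodes (illustrative)."""
        return [
            f"pos_pair={prev_node['pos']}_{node['pos']}",
            f"surface_pair={prev_node['surface']}_{node['surface']}",
        ]

    prev_node = {"surface": "我", "pos": "PN"}
    node = {"surface": "爱", "pos": "VV"}
    print(unigram_features(node))
    print(bigram_features(prev_node, node))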
DeNero, John and Chiang, David and Knight, Kevin
Computing Feature Expectations
where h(t) is the unigram prefix of bigram t.
Consensus Decoding Algorithms
where T1 is the set of unigrams in the language, and δ(e, t) is an indicator function that equals 1 if t appears in e and 0 otherwise.
Consensus Decoding Algorithms
Figure 1: For the linear similarity measure U(e; e′), which computes unigram precision, the MBR translation can be found by iterating either over sentence pairs (Algorithm 1) or over features (Algorithm 2).
unigram is mentioned in 3 sentences in this paper.
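The indicator δ(e, t) quoted above equals 1 when unigram t occurs in hypothesis e. As a rough illustration only (the paper's exact definition of U(e; e′) is not reproduced in this index, so the direction and normalization below are assumptions), the sketch computes a unigram-precision-style similarity between two hypotheses using that indicator.

    from collections import Counter

    def delta(e_tokens, t):
        """Indicator δ(e, t): 1 if unigram t appears in hypothesis e, else 0."""
        return 1 if t in set(e_tokens) else 0

    def unigram_precision_similarity(e_tokens, e_prime_tokens):
        """Fraction of unigram tokens of e' that also appear in e (one plausible reading
        of a unigram-precision-style similarity; the paper's definition may differ)."""
        if not e_prime_tokens:
            return 0.0
        counts = Counter(e_prime_tokens)
        matched = sum(c for t, c in counts.items() if delta(e_tokens, t))
        return matched / len(e_prime_tokens)

    e = "the cat sat on the mat".split()
    e_prime = "the cat is on a mat".split()
    print(unigram_precision_similarity(e, e_prime))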
Li, Zhifei and Eisner, Jason and Khudanpur, Sanjeev
Experimental Results
As shown in Table 2a, decoding with a single variational n-gram model (VM) as per (14) improves the Viterbi baseline (except the case with a unigram VM), though often not statistically significant.
Experimental Results
The interpolation between a VM and a word penalty feature (“wp”) improves over the unigram
Experimental Results
This is necessarily true, but it is interesting to see that most of the improvement is obtained just by moving from a unigram to a bigram model.
unigram is mentioned in 3 sentences in this paper.
Tsuruoka, Yoshimasa and Tsujii, Jun'ichi and Ananiadou, Sophia
Log-Linear Models
The features used in this experiment were unigrams and bigrams of neighboring words, and unigrams, bigrams and trigrams of neighboring POS tags.
Log-Linear Models
For the features, we used unigrams of neighboring chunk tags, substrings (shorter than 10 characters) of the current word, and the shape of the word (e.g., “IL-2” is converted into “AA-#”), on top of the features used in the text chunking experiments.
Log-Linear Models
For the features, we used unigrams and bigrams of neighboring words, prefixes and suffixes of the current word, and some characteristics of the word.
unigram is mentioned in 3 sentences in this paper.
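The word-shape feature quoted above (“IL-2” is converted into “AA-#”) and the prefix/suffix features are easy to sketch. The mapping below (uppercase → A, lowercase → a, digit → #, everything else kept) is an assumed reading consistent with the quoted example, not the authors' exact implementation, and the prefix/suffix length cap is invented.

    def word_shape(word):
        """Map characters to coarse classes, e.g. "IL-2" -> "AA-#" (assumed mapping)."""
        out = []
        for ch in word:
            if ch.isupper():
                out.append("A")
            elif ch.islower():
                out.append("a")
            elif ch.isdigit():
                out.append("#")
            else:
                out.append(ch)
        return "".join(out)

    def prefix_suffix_features(word, max_len=4):
        """Prefix and suffix substrings of the current word up to max_len characters (illustrative)."""
        feats = []
        for n in range(1, min(max_len, len(word)) + 1):
            feats.append(f"PRE{n}={word[:n]}")
            feats.append(f"SUF{n}={word[-n:]}")
        return feats

    print(word_shape("IL-2"))              # -> AA-#
    print(prefix_suffix_features("expression"))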
Wan, Xiaojun
Empirical Evaluation 4.1 Evaluation Setup
In the above experiments, all features (unigram + bigram) are used.
The Co-Training Approach
The English or Chinese features used in this study include both unigrams and bigrams, and the feature weight is simply set to term frequency.
The Co-Training Approach
For Chinese text, a unigram refers to a Chinese word and a bigram refers to two adjacent Chinese words.
unigram is mentioned in 3 sentences in this paper.
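The footnote quoted above defines a Chinese unigram as a single (segmented) word and a bigram as two adjacent words, with term frequency as the feature weight. A minimal sketch of that feature extraction, assuming the text is already word-segmented, is below; the toy sentence and its segmentation are invented for illustration.

    from collections import Counter

    def tf_unigram_bigram_features(words):
        """Term-frequency weights over word unigrams and adjacent-word bigrams."""
        feats = Counter(words)                                           # unigrams: single words
        feats.update("_".join(pair) for pair in zip(words, words[1:]))   # bigrams: adjacent words
        return feats

    # Pre-segmented toy sentence (segmentation is assumed, not produced here).
    segmented = ["这", "部", "电影", "非常", "好看"]
    print(tf_unigram_bigram_features(segmented))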