Computing Feature Expectations | Unigrams are produced by lexical rules, while higher-order n-grams can be produced either directly by lexical rules or by combining constituents.
Computing Feature Expectations | The n-gram language model score of e similarly decomposes over the h in e that produce n-grams.
Computing Feature Expectations | The linear similarity measure takes the following form, where T_n is the set of n-grams:
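Computing Feature Expectations | The formula itself does not survive in this excerpt. A hedged reconstruction, assuming per-order weights θ_n, an indicator δ(e, t) for whether n-gram t occurs in e, and a count c(e', t) of t in e' (these symbols are assumptions, not necessarily the paper's exact notation), would be

\ell(e, e') = \sum_{n=1}^{N} \theta_n \sum_{t \in T_n} \delta(e, t)\, c(e', t)

Under such a linear form, the expected similarity needed for consensus decoding depends only on expected n-gram counts of the kind E[c(e, t)] discussed below, which is why computing feature expectations suffices.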
Experimental Results | Above, we compare the precision, relative to reference translations, of sets of n-grams chosen in two ways. |
Experimental Results | The left bar is the precision of the n-grams in e*.
Experimental Results | The right bar is the precision of n-grams with E[c(e, t)] > ρ. To justify this comparison, we chose ρ so that both methods of choosing n-grams gave the same n-gram recall: the fraction of n-grams in reference translations that also appeared in e* or had E[c(e, t)] > ρ.
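Experimental Results | As a rough illustration of this selection rule (the expected counts, the threshold, and the reference sentence below are made-up, not the paper's data), one can threshold expected n-gram counts and measure type-level precision and recall against reference n-grams:

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def precision_recall(selected, reference):
    """Type-level precision/recall of a set of n-grams against reference n-grams."""
    selected, reference = set(selected), set(reference)
    hits = selected & reference
    prec = len(hits) / len(selected) if selected else 0.0
    rec = len(hits) / len(reference) if reference else 0.0
    return prec, rec

# Hypothetical expected counts E[c(e, t)] over a forest, and a threshold rho
# chosen so that recall matches that of the n-grams in e*.
expected_counts = {("the", "house"): 0.92, ("house", "is"): 0.41, ("red", "dog"): 0.07}
rho = 0.4
selected = [t for t, c in expected_counts.items() if c > rho]
reference = ngrams("the house is big".split(), 2)
print(precision_recall(selected, reference))
```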
MERT for MBR Parameter Optimization | This linear function contains N + 1 parameters θ_0, θ_1, ..., θ_N, where N is the maximum order of the n-grams involved.
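MERT for MBR Parameter Optimization | The linear function itself is not shown in this excerpt; a plausible form, assumed here from the standard linear approximation to BLEU used in lattice MBR work, is

G(e, e') = \theta_0 \,|e'| + \sum_{n=1}^{N} \theta_n \, \#_n(e, e')

where #_n(e, e') denotes the number of order-n n-gram matches; the N + 1 parameters θ_0, θ_1, ..., θ_N are then tuned with MERT.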
Minimum Bayes-Risk Decoding | First, the set of n-grams is extracted from the lattice. |
Minimum Bayes-Risk Decoding | For a moderately large lattice, there can be several thousand n-grams, and the procedure becomes expensive.
Minimum Bayes-Risk Decoding | For each node t in the lattice, we maintain a quantity Score(w, t) for each n-gram w that lies on a path from the source node to t. Score(w, t) is the highest posterior probability among all edges on the paths that terminate on t and contain n-gram w. The forward pass requires computing the n-grams introduced by each edge; to do this, we propagate n-grams (up to maximum order - 1) terminating on each node.
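Minimum Bayes-Risk Decoding | A minimal sketch of this forward pass, assuming the lattice is given as topologically ordered node ids with incoming edges of the form (start_node, words, edge_posterior); these structures and the toy lattice are illustrative, not the paper's exact representation:

```python
from collections import defaultdict

def ngram_scores(nodes, in_edges, max_order):
    """Score(w, t): highest edge posterior seen for n-gram w on paths into node t."""
    score = {t: defaultdict(float) for t in nodes}
    context = defaultdict(set)  # suffixes (up to max_order - 1 words) of paths ending at each node
    for t in nodes:  # nodes must be in topological order
        if not in_edges.get(t):
            context[t].add(())  # source node: empty history
        for (s, words, post) in in_edges.get(t, []):
            # n-grams already seen on paths into s also lie on paths into t
            for w, sc in score[s].items():
                if sc > score[t][w]:
                    score[t][w] = sc
            for suffix in context[s]:
                seq = suffix + tuple(words)
                # record the n-grams introduced by this edge (those ending on it)
                for n in range(1, max_order + 1):
                    for i in range(max(0, len(suffix) - n + 1), len(seq) - n + 1):
                        w = seq[i:i + n]
                        if post > score[t][w]:
                            score[t][w] = post
                # propagate the path suffix needed to extend n-grams across later edges
                keep = max_order - 1
                context[t].add(seq[len(seq) - keep:] if keep else ())
    return score

# Toy lattice with invented posteriors.
nodes = [0, 1, 2]
in_edges = {1: [(0, ["the", "house"], 0.7)],
            2: [(1, ["is"], 0.7), (0, ["the", "home", "is"], 0.3)]}
print(dict(ngram_scores(nodes, in_edges, 2)[2]))
```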
Collaborative Decoding | Here we do not discriminate among different lexical n-grams and are only concerned with aggregating statistics over all n-grams of the same order.
Collaborative Decoding | G_n^+(e, e') is the n-gram agreement measure function, which counts the number of occurrences in e' of n-grams in e.
Collaborative Decoding | So the corresponding feature value will be the expected number of occurrences in H_k(f) of all n-grams in e:
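Collaborative Decoding | A rough sketch of this feature, assuming the other decoder's hypothesis set H_k(f) is available as (tokens, posterior) pairs; the names, the choice to count each distinct n-gram of e once, and the toy posteriors are illustrative assumptions, not the paper's API:

```python
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def expected_agreement(e_tokens, hyp_set, n):
    """Expected number of occurrences in H_k(f) of the order-n n-grams of e."""
    e_ngrams = set(ngram_counts(e_tokens, n))  # distinct n-grams of e
    total = 0.0
    for h_tokens, posterior in hyp_set:
        h_counts = ngram_counts(h_tokens, n)
        total += posterior * sum(h_counts[g] for g in e_ngrams)
    return total

# Toy hypothesis set with made-up posteriors.
hyps = [("the house is blue".split(), 0.6), ("the home is blue".split(), 0.4)]
print(expected_agreement("the house is blue".split(), hyps, 2))
```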
Discussion | Our method uses n-gram agreement information, and consensus features are integrated into the decoding models.
Experiments | In Table 5 we show, from another angle, the impact of consensus-based features by restricting the maximum order of the n-grams used to compute agreement statistics.
Experiments | One reason could be that the data sparsity of high-order n-grams leads to overfitting on the development data.
Variational Approximate Decoding | whose edges correspond to n-grams (weighted with negative log-probabilities) and whose vertices correspond to (n-1)-grams.
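Variational Approximate Decoding | As a small illustration of this graph view for a trigram model (the probabilities below are invented purely for illustration): each edge is a trigram weighted by its negative log-probability, and its two endpoints are the bigrams it connects.

```python
import math

# Made-up trigram probabilities, illustrative only.
trigram_probs = {
    ("the", "blue", "house"): 0.20,
    ("blue", "house", "is"): 0.50,
}

edges = []
for (w1, w2, w3), p in trigram_probs.items():
    src, dst = (w1, w2), (w2, w3)            # vertices: (n-1)-grams
    edges.append((src, dst, -math.log(p)))   # edge weight: negative log-probability
print(edges)
```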
Variational Approximate Decoding | This may be regarded as favoring n-grams that are likely to appear in the reference translation (because they are likely in the derivation forest). |
Variational Approximate Decoding | However, in order to score well on the BLEU metric for MT evaluation (Papineni et al., 2001), which gives partial credit, we would also like to favor lower-order n-grams that are likely to appear in the reference, even if this means picking some less-likely high-order n-grams.
Variational vs. Min-Risk Decoding | Now, let us divide N, which contains n-gram types of different n, into several subsets N_n, each of which contains only the n-grams with a given length n. We can now rewrite (19) as follows,
Introduction | [Figure residue; recoverable legend entries: Monte Carlo using N-grams, Monte Carlo using words, Relative Entropy using N-grams, Relative Entropy using words; axis label: Rank.]
Related Work | The n-gram-based approaches rely on the counts of character or byte n-grams, which are sequences of n characters or bytes, extracted from a corpus for each reference language.
Related Work | Dunning (1994) proposed a system that uses Markov chains of byte n-grams with Bayesian decision rules to minimize the probability of error.
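Related Work | A toy sketch in the spirit of these n-gram-based identifiers (add-one-smoothed character trigram scores with an argmax decision); this is an illustration under those assumptions, not Dunning's exact Markov-chain formulation, and the training strings are invented:

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train(corpora, n=3):
    """corpora: {language: training text} -> per-language n-gram counts."""
    return {lang: Counter(char_ngrams(text, n)) for lang, text in corpora.items()}

def identify(text, models, n=3):
    best_lang, best_score = None, float("-inf")
    for lang, counts in models.items():
        total, vocab = sum(counts.values()), len(counts) + 1
        # add-one smoothed log-probability of the text's n-grams under this language
        score = sum(math.log((counts[g] + 1) / (total + vocab)) for g in char_ngrams(text, n))
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang

models = train({"en": "the quick brown fox jumps over the lazy dog",
                "de": "der schnelle braune fuchs springt ueber den faulen hund"})
print(identify("the brown dog", models))
```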
Introduction | Miller et al. (1999) and Song and Croft (1999) explore the use of n-grams in retrieval models.
Previous Work | Subsequently, various types of phrases, such as sequential n-grams (Mitra et al., 1997), head-modifier pairs extracted from syntactic structures (Lewis and Croft, 1990; Zhai, 1997; Dillon and Gray, 1983; Strzalkowski et al., 1994), and proximity-based phrases (Turpin and Moffat, 1999), were examined with conventional retrieval models (e.g., the vector space model).
Previous Work | Song and Croft (1999), Miller et al. (1999), Gao et al. (2004), and Metzler and Croft (2005) investigated the effectiveness of the language modeling approach in modeling statistical phrases such as n-grams or proximity-based phrases.