Index of papers in Proc. ACL 2010 that mention
  • probability distribution
Spiegler, Sebastian and Flach, Peter A.
Probabilistic generative model
If a generative model is fully parameterised, it can be reversed to find the underlying word decomposition by forming the conditional probability distribution Pr(Y | X).
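The "reversal" here is Bayes' rule: assuming X is the observed word and Y its underlying decomposition (the roles the excerpt implies), the conditional distribution is

    \Pr(Y \mid X) = \frac{\Pr(X \mid Y)\,\Pr(Y)}{\Pr(X)} \propto \Pr(X \mid Y)\,\Pr(Y)

so a fully parameterised generative model of Pr(X, Y) is enough to score every candidate decomposition.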
Probabilistic generative model
The first component of the equation above is the probability distribution over non-/boundaries Pr(b_i^j).
Probabilistic generative model
We assume that a boundary at position i is inserted independently of the other boundaries (zero-order); the graphemic representation of the word, however, is conditioned on the word length m_j, which means that the probability distribution is in fact Pr(b_i^j | m_j).
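A minimal sketch of the zero-order boundary model this excerpt describes, assuming each potential boundary position is an independent Bernoulli draw whose parameter depends only on the word length m_j; the table theta is a hypothetical parameterisation, not the paper's:

    import random

    def sample_boundaries(m_j, theta):
        # theta[m_j][i]: hypothetical probability Pr(b_i^j = 1 | m_j) of a
        # boundary after position i in a word of length m_j.
        # Each indicator is drawn independently of the others (zero-order).
        return [random.random() < theta[m_j][i] for i in range(m_j - 1)]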
probability distribution is mentioned in 6 sentences in this paper.
Hall, David and Klein, Dan
Learning
Given the expected counts, we now need to normalize them to ensure that the transducer represents a conditional probability distribution (Eisner, 2002; Oncina and Sebban, 2006).
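A minimal sketch of that normalization step, assuming expected counts are keyed by (source state, arc): dividing each count by its source-state total turns the outgoing arcs of every state into a conditional probability distribution.

    from collections import defaultdict

    def normalize_expected_counts(counts):
        # counts: dict mapping (state, arc) -> expected count.
        totals = defaultdict(float)
        for (state, _arc), c in counts.items():
            totals[state] += c
        # After division, each state's outgoing arcs sum to one.
        return {(state, arc): c / totals[state]
                for (state, arc), c in counts.items()
                if totals[state] > 0}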
Message Approximation
An alternative approach might be to simply treat messages as unnormalized probability distributions, and to minimize the KL divergence between some approximating message m̂ and the true message m. However, messages are not always probability distributions and, because the number of possible strings is in principle infinite, they need not sum to a finite number. Instead, we propose to minimize the KL divergence between the “expected” marginal distribution and the approximated “expected” marginal distribution.
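The displayed objective that followed this sentence is not reproduced in the index. For reference only, the generic KL divergence between two normalized discrete distributions (not the paper's objective over "expected" marginals) can be computed as:

    import math

    def kl_divergence(p, q):
        # KL(p || q) for discrete distributions given as dicts; assumes
        # q[x] > 0 wherever p[x] > 0.
        return sum(px * math.log(px / q[x])
                   for x, px in p.items() if px > 0.0)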
Message Approximation
The procedure for calculating these statistics is described in Li and Eisner (2009), which amounts to using an expectation semiring (Eisner, 2001) to compute expected transitions in τ ∘ μ* under the probability distribution τ ∘ μ.
probability distribution is mentioned in 4 sentences in this paper.
Kazama, Jun'ichi and De Saeger, Stijn and Kuroda, Kow and Murata, Masaki and Torisawa, Kentaro
Background
Estimating a conditional probability distribution φ_k = p(· | w_k) as a context profile for each w_k falls into this case.
Background
When the context profiles are probability distributions, we usually utilize measures on probability distributions, such as the Jensen-Shannon (JS) divergence, to calculate similarities (Dagan et al., 1994; Dagan et al., 1997).
Background
The BC (Bhattacharyya coefficient) is also a similarity measure on probability distributions and is suitable for our purposes, as we describe in the next section.
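A sketch of the two measures these excerpts name, for context profiles represented as dicts mapping a context to its probability; this is the textbook form of each measure, not the authors' implementation:

    import math

    def js_divergence(p, q):
        # Jensen-Shannon divergence: symmetrized KL against the mixture m.
        def kl(a, b):
            return sum(ax * math.log(ax / b[x])
                       for x, ax in a.items() if ax > 0.0)
        keys = set(p) | set(q)
        m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in keys}
        pf = {x: p.get(x, 0.0) for x in keys}
        qf = {x: q.get(x, 0.0) for x in keys}
        return 0.5 * kl(pf, m) + 0.5 * kl(qf, m)

    def bhattacharyya_coefficient(p, q):
        # BC(p, q) = sum_x sqrt(p(x) * q(x)); equals 1 iff p == q.
        return sum(math.sqrt(p.get(x, 0.0) * q.get(x, 0.0))
                   for x in set(p) | set(q))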
probability distribution is mentioned in 3 sentences in this paper.
Kummerfeld, Jonathan K. and Roesner, Jessika and Dawborn, Tim and Haggerty, James and Curran, James R. and Clark, Stephen
Background
The C&C supertagger is similar to the Ratnaparkhi (1996) tagger, using features based on words and POS tags in a five-word window surrounding the target word, and defining a local probability distribution over supertags for each word in the sentence, given the previous two supertags.
Background
Alternatively, the Forward-Backward algorithm can be used to efficiently sum over all sequences, giving a probability distribution over supertags for each word that is conditioned only on the input sentence.
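A minimal forward-backward sketch for a first-order tag model, showing how per-word posterior distributions conditioned only on the input arise; emit and trans are hypothetical score tables, and the C&C model conditions on two previous supertags, so this is illustrative rather than the tagger's actual computation:

    def forward_backward(n, tags, emit, trans):
        # emit[i][t]: score of tag t at position i (uniform initial
        # distribution folded into emit[0]); trans[s][t]: score of s -> t.
        fwd = [{t: emit[0][t] for t in tags}]
        for i in range(1, n):
            fwd.append({t: emit[i][t] * sum(fwd[i - 1][s] * trans[s][t]
                                            for s in tags) for t in tags})
        bwd = [dict.fromkeys(tags, 1.0) for _ in range(n)]
        for i in range(n - 2, -1, -1):
            for t in tags:
                bwd[i][t] = sum(trans[t][s] * emit[i + 1][s] * bwd[i + 1][s]
                                for s in tags)
        posteriors = []
        for i in range(n):
            scores = {t: fwd[i][t] * bwd[i][t] for t in tags}
            z = sum(scores.values())
            posteriors.append({t: v / z for t, v in scores.items()})
        return posteriors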
Results
Note that these are all alternative methods for estimating the local log-linear probability distributions used by the Ratnaparkhi-style tagger.
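For reference, a local log-linear (maximum-entropy) distribution over supertags is a softmax over feature scores; the scores dict here stands in for the dot product of weights and features, not the C&C feature set itself:

    import math

    def local_distribution(scores):
        # scores: dict mapping supertag -> weighted feature score for the
        # current word in context. Max-shift for numerical stability.
        m = max(scores.values())
        z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
        return {tag: math.exp(s - z) for tag, s in scores.items()}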
probability distribution is mentioned in 3 sentences in this paper.