A Generative PCFG Model | (1996) who consider the kind of probabilities a generative parser should get from a PoS tagger, and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context; rather, it is used as a common denominator of state indexation for all segmentation possibilities of a surface form.
A Generative PCFG Model | We smooth Prf(p → ⟨s, p⟩) for rare and OOV segments (s ∈ l, l ∈ L, s unseen) using a “per-tag” probability distribution over rare segments, which we estimate using relative frequency estimates for once-occurring segments.
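A minimal sketch of such a per-tag backoff estimate, assuming hypothetical names (per_tag_rare_distribution, emissions); the rare-segment probability for each tag is read off the relative frequency of once-occurring (hapax) segments in training:

```python
from collections import Counter

def per_tag_rare_distribution(emissions):
    """Estimate P(rare segment | tag) by relative frequency over
    once-occurring (hapax) segments.

    emissions: list of (tag, segment) pairs observed in training.
    """
    segment_counts = Counter(seg for _tag, seg in emissions)
    tag_totals = Counter(tag for tag, _seg in emissions)
    hapax_per_tag = Counter(tag for tag, seg in emissions
                            if segment_counts[seg] == 1)
    return {tag: hapax_per_tag[tag] / tag_totals[tag] for tag in tag_totals}

# For an unseen segment proposed under tag t, back off to this estimate:
# rare = per_tag_rare_distribution(training_emissions)
# p_smoothed = rare.get(t, 0.0)
```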
Discussion and Conclusion | The overall performance of our joint framework demonstrates that a probability distribution obtained over mere syntactic contexts using a Treebank grammar and a data-driven lexicon outperforms the upper bounds proposed by previous joint disambiguation systems and achieves segmentation and parsing results on a par with state-of-the-art standalone applications.
Model Preliminaries | Given that the weights on all outgoing arcs sum to one, the weights induce a probability distribution over the lattice paths.
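A small sketch of that property, under an assumed lattice representation (an adjacency map from state to weighted arcs; all names here are illustrative): because outgoing weights at each state sum to one, the product of arc weights along a path defines a proper probability over paths.

```python
def path_probability(lattice, path, start="q0"):
    """Probability of a path through a weighted lattice whose outgoing
    arc weights sum to one at every state.

    lattice: dict mapping a state to a list of (label, weight, next_state) arcs.
    path:    sequence of arc labels followed from the start state.
    """
    prob, state = 1.0, start
    for label in path:
        weight, state = next((w, nxt) for lab, w, nxt in lattice[state]
                             if lab == label)
        prob *= weight
    return prob

# Tiny two-state example: P("b" then "c") = 0.4 * 1.0
toy = {"q0": [("a", 0.6, "q1"), ("b", 0.4, "q1")],
       "q1": [("c", 1.0, "q2")]}
print(path_probability(toy, ["b", "c"]))   # 0.4
```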
Introduction | In the second phase, a conditional probability distribution is estimated that describes the probability that a word was uttered given such event representations. |
Linguistic Mapping | We model this relationship, much like traditional language models, using conditional probability distributions.
Linguistic Mapping | The model assumes that every document is made up of a mixture of topics, and that each word in a document is generated from a probability distribution associated with one of those topics. |
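A minimal generative sketch of that assumption (the two-topic setup and all names are illustrative, not from the source): each word is produced by first drawing a topic from the document's topic mixture and then drawing a word from that topic's word distribution.

```python
import random

def generate_document(topic_mixture, topic_word_dists, length):
    """Generate a document word by word: sample a topic from the document's
    topic mixture, then sample a word from that topic's word distribution."""
    words = []
    for _ in range(length):
        topic = random.choices(list(topic_mixture),
                               weights=list(topic_mixture.values()))[0]
        word_dist = topic_word_dists[topic]
        words.append(random.choices(list(word_dist),
                                    weights=list(word_dist.values()))[0])
    return words

# Illustrative two-topic setup
mixture = {"sports": 0.7, "finance": 0.3}
word_dists = {"sports":  {"game": 0.5, "team": 0.3, "market": 0.2},
              "finance": {"market": 0.6, "stock": 0.3, "game": 0.1}}
print(generate_document(mixture, word_dists, 5))
```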
Introduction | important reason for the success of these models is the fact that they are lexicalized: the probability distributions are also conditioned on the actual words occuring in the utterance, and not only on their parts of speech. |
Language Model 2.1 The General Approach | P was modeled by means of a dedicated probability distribution for each conditioning tag. |
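A rough sketch of keeping one dedicated distribution per conditioning tag, assuming simple relative-frequency estimation (the container and function names are hypothetical): counts are collected separately for each tag and normalized into independent distributions.

```python
from collections import Counter, defaultdict

def train_per_tag_distributions(observations):
    """Estimate a dedicated distribution P(outcome | tag) for each
    conditioning tag, by relative frequency.

    observations: iterable of (tag, outcome) pairs.
    """
    counts = defaultdict(Counter)
    for tag, outcome in observations:
        counts[tag][outcome] += 1
    dists = {}
    for tag, outcome_counts in counts.items():
        total = sum(outcome_counts.values())
        dists[tag] = {out: c / total for out, c in outcome_counts.items()}
    return dists

# dists = train_per_tag_distributions(training_pairs)
# dists["NN"] is the distribution dedicated to the conditioning tag "NN".
```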
Language Model 2.1 The General Approach | The resulting probability distributions were trained on the German TIGER treebank, which consists of about 50,000 sentences of newspaper text.