A Generic Phrase Training Procedure | We first train word alignment models and will use them to evaluate the goodness of a phrase and a phrase pair. |
Abstract | Multiple data-driven feature functions are proposed to capture the quality and confidence of phrases and phrase pairs. |
Introduction | The first problem is referred to as phrase pair extraction, which identifies phrase pairs that are supposed to be translations of each other. |
Introduction | The most widely used approach derives phrase pairs from a word alignment matrix (Och and Ney, 2003; Koehn et al., 2003). |
Introduction | Other methods do not depend on word alignments alone: they directly model phrase alignment in a joint generative way (Marcu and Wong, 2002), pursue an information extraction perspective (Venugopal et al., 2003), or augment the extraction with a model-based phrase pair posterior (Deng and Byrne, 2005). |
Abstract | We combine the strengths of Bayesian modeling and synchronous grammar in unsupervised learning of basic translation phrase pairs. |
Abstract | The structured space of a synchronous grammar is a natural fit for phrase pair probability estimation, though the search space can be prohibitively large. |
Introduction | Computational complexity arises from the exponentially large number of decompositions of a sentence pair into phrase pairs; overfitting is a problem because as EM attempts to maximize the likelihood of its training data, it prefers to directly explain a sentence pair with a single phrase pair. |
Phrasal Inversion Transduction Grammar | Our ITG has two nonterminals: X and C, where X represents compositional phrase pairs that can have recursive structures and C is the preterminal over terminal phrase pairs. |
Phrasal Inversion Transduction Grammar | They split the left-hand side constituent, which represents a phrase pair, into two smaller phrase pairs on the right-hand side and order them according to one of the two possible permutations. |
Phrasal Inversion Transduction Grammar | where $\sum_{e/f} P(e/f) = 1$ is a multinomial distribution over phrase pairs. |
Variational Bayes for ITG | A sparse prior over a multinomial distribution such as the distribution of phrase pairs may bias the estimator toward skewed distributions that generalize better. |
Variational Bayes for ITG | The other is the distribution of the phrase pairs. |
Variational Bayes for ITG | By adjusting $\alpha_C$ to a very small number, we hope to place more posterior mass on parsimonious solutions with fewer but more confident and general phrase pairs. |
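
The effect of a small Dirichlet hyperparameter on a phrase pair multinomial can be illustrated with a minimal sketch. It assumes the standard mean-field variational Bayes update for a multinomial under a symmetric Dirichlet prior, in which each weight is exp(digamma(count + alpha)) divided by exp(digamma(total + K * alpha)); the excerpts above do not give the exact update used in the paper, so this form, the function names (`digamma`, `vb_weights`, `ml_weights`), and the toy expected counts are all assumptions made for illustration.

```python
import math


def digamma(x: float) -> float:
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0.0:
        raise ValueError("digamma is only implemented here for x > 0")
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def ml_weights(expected_counts: dict) -> dict:
    """Maximum-likelihood (plain EM) estimate: relative frequency."""
    total = sum(expected_counts.values())
    return {pair: c / total for pair, c in expected_counts.items()}


def vb_weights(expected_counts: dict, alpha: float) -> dict:
    """Mean-field VB weights under a symmetric Dirichlet(alpha) prior.

    Assumed update (not taken from the source text):
        w(pair) = exp(psi(c(pair) + alpha)) / exp(psi(total + K * alpha))
    The weights are sub-normalized, i.e. they sum to less than 1.
    """
    total = sum(expected_counts.values())
    k = len(expected_counts)
    log_denominator = digamma(total + k * alpha)
    return {
        pair: math.exp(digamma(c + alpha) - log_denominator)
        for pair, c in expected_counts.items()
    }


# Hypothetical expected counts for three phrase pairs (English, French).
toy_counts = {
    ("the house", "la maison"): 10.0,
    ("house", "maison"): 1.0,
    ("the", "la"): 0.1,
}

if __name__ == "__main__":
    ml = ml_weights(toy_counts)
    vb = vb_weights(toy_counts, alpha=0.01)
    for pair in toy_counts:
        print(pair, f"ML={ml[pair]:.4f}", f"VB={vb[pair]:.6f}")
```

Under this assumed update, pairs with small expected counts are discounted far more heavily than frequent ones, which is one way to read the stated hope that a very small hyperparameter moves mass toward fewer, more confident phrase pairs.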