A Generic Phrase Training Procedure | We first train word alignment models and will use them to evaluate the goodness of a phrase and a phrase pair. |
A Generic Phrase Training Procedure | Beginning with a flat lexicon, we train an IBM Model-1 word alignment model with 10 iterations for each translation direction. |
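The flat-lexicon initialization and iterative re-estimation described above can be sketched as follows. This is a minimal, illustrative implementation of IBM Model 1 EM training under the stated setup, not the authors' code; the function name and toy corpus are assumptions:

```python
from collections import defaultdict

def train_ibm1(corpus, iterations=10):
    """EM training of IBM Model 1 translation probabilities t(f|e).

    corpus: list of (source_words, target_words) sentence pairs.
    A NULL source token lets target words align to nothing.
    """
    # Flat (uniform) initialization over the target vocabulary.
    tgt_vocab = {f for _, tgt in corpus for f in tgt}
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        # E-step: collect expected link counts under current t.
        for src, tgt in corpus:
            src = ["NULL"] + src
            for f in tgt:
                # Posterior over which source word generated f.
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t
```

On a small parallel corpus, repeated co-occurrence quickly concentrates probability on the correct translations; training the same routine with source and target swapped gives the model for the other direction.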
A Generic Phrase Training Procedure | We then train HMM word alignment models (Vogel et al., 1996) in two directions simultaneously by merging statistics collected in the |
Features | All these features are data-driven and defined based on models such as statistical word alignment models or language models.
Features | In a statistical generative word alignment model (Brown et al., 1993), it is assumed that (i) a random variable a specifies how each target word f_j is generated by (and is therefore aligned to) a source word e_{a_j}; and (ii) the likelihood function Pr(f, a|e) specifies a generative procedure from the source sentence to the target sentence.
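For concreteness, IBM Model 1's instance of this likelihood, where the alignment variable and the word translation table fully determine Pr(f, a|e), can be sketched as below. This is an illustrative helper under the stated assumptions, not code from the paper; the function name and the table representation are assumptions:

```python
import math

def model1_log_likelihood(src, tgt, a, t, eps=1.0):
    """log Pr(f, a | e) under IBM Model 1's factorization:
        Pr(f, a | e) = eps / (l+1)^m * prod_j t(f_j | e_{a_j}),
    where src has l words, tgt has m words, and a[j] = 0 means
    target word j aligns to the NULL source word.
    t maps (target_word, source_word) pairs to probabilities."""
    src = ["NULL"] + src          # position 0 is the NULL word
    l, m = len(src) - 1, len(tgt)
    ll = math.log(eps) - m * math.log(l + 1)
    for j, f in enumerate(tgt):
        ll += math.log(t[(f, src[a[j]])])
    return ll
```

Richer models such as the HMM replace the uniform 1/(l+1) alignment term with a distortion distribution, but keep the same generative decomposition.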
Features | This distribution is applicable to all word alignment models that follow assumptions (i) and (ii). |
Introduction | We employ features based on word alignment models and alignment matrix. |
Abstract | In this work we analyze a recently proposed agreement-constrained EM algorithm for unsupervised alignment models.
Abstract | We propose and extensively evaluate a simple method for using alignment models to produce alignments better suited to phrase-based MT systems, and show significant gains (as measured by BLEU score) in end-to-end translation systems for six language pairs used in recent MT competitions.
Adding agreement constraints | They suggest how this framework can be used to encourage two word alignment models to agree during training. |
Adding agreement constraints | Most MT systems train an alignment model in each direction and then heuristically combine their predictions. |
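The heuristic combination step can be illustrated with the two standard starting points, intersection and union, of the directional link sets. This is a minimal sketch (real systems typically interpolate between the two, e.g. with grow-diag-final); the function name is an assumption:

```python
def symmetrize(forward, backward):
    """forward and backward: sets of (src_idx, tgt_idx) alignment links
    from the two directional models, mapped to a common orientation.
    Returns (intersection, union): the intersection keeps only links
    both models agree on (high precision), while the union keeps every
    link either model proposes (high recall)."""
    return forward & backward, forward | backward
```

Agreement-constrained training instead pushes the two models toward each other during EM, so that less repair is needed at combination time.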
Introduction | In this work, we show that by changing the way the word alignment models are trained and |
Introduction | We present extensive experimental results evaluating a new training scheme for unsupervised word alignment models: an extension of the Expectation Maximization algorithm that allows effective injection of additional information about the desired alignments into the unsupervised training process.
Phrase-based machine translation | We then train the competing alignment models and compute competing alignments using different decoding schemes. |
Statistical word alignment | 2.1 Baseline word alignment models |
Statistical word alignment | Figure 1 illustrates the mapping between the usual HMM notation and the HMM alignment model.
Statistical word alignment | All word alignment models we consider are normally trained using the Expectation Maximization |
Conclusion | However, our best system does not apply VB to a single probability model, as we found an appreciable benefit from bootstrapping each model from simpler models, much as the IBM word alignment models are usually trained in succession. |
Introduction | As these word-level alignment models restrict the word alignment complexity by requiring each target word to align to zero or one source words, results are improved by aligning both source-to-target as well as target-to-source, |
Introduction | Ideally, such a procedure would remedy the deficiencies of word-level alignment models, including the strong restrictions on the form of the alignment and the strong independence assumption between words.
Phrasal Inversion Transduction Grammar | Our second approach was to constrain the search space using simpler alignment models, which has the further benefit of significantly speeding up training.
Phrasal Inversion Transduction Grammar | First we train a lower-level word alignment model; then we place hard constraints on the phrasal alignment space using confident word links from this simpler model.
Variational Bayes for ITG | alignment models is the EM algorithm (Brown et al., 1993), which iteratively updates parameters to maximize the likelihood of the data.