Experimental results | For the 44 and 230 million token corpora, all sentences are automatically parsed and used to initialize model parameters, while for the 1.3 billion token corpus, we parse the sentences from a portion of the corpus that contains 230 million tokens, then use them to initialize model parameters.
Experimental results | Nevertheless, experimental results show that this approach is effective in providing initial values for the model parameters.
Training algorithm | The objective of maximum likelihood estimation is to maximize the likelihood L(D, p) with respect to the model parameters p.
Training algorithm | and denote T_N as the collection of N-best-list parse trees for sentences over the entire corpus D under model parameters p.
Training algorithm | estimate model parameters.
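As a concrete illustration of the maximum likelihood objective above, here is a minimal Python sketch that estimates a categorical model p over a toy corpus D and evaluates the log-likelihood L(D, p). The categorical model, function names, and toy corpus are illustrative assumptions, not the training algorithm of the source paper (which estimates its parameters with EM-style updates over parse trees).

```python
# Minimal sketch: maximum likelihood estimation for a categorical model p
# over a corpus D. The closed-form MLE is just relative frequency; the
# names and the toy corpus are illustrative, not from the source paper.
import math
from collections import Counter

def mle_categorical(corpus):
    """Return the relative-frequency estimate p that maximizes L(D, p)."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def log_likelihood(corpus, p):
    """log L(D, p) = sum of log p(token) over all tokens in D."""
    return sum(math.log(p[w]) for w in corpus)

D = ["a", "b", "a", "a", "c"]
p_hat = mle_categorical(D)
print(p_hat, log_likelihood(D, p_hat))
```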
Introduction | From these corpora, we estimate translation model parameters: word-to-word translation tables, fertilities, distortion parameters, phrase tables, syntactic transformations, etc.
Introduction | A language model P(e) is typically used in SMT decoding (Koehn, 2009), but here P(e) actually plays a central role in training translation model parameters.
Machine Translation as a Decipherment Task | During decipherment training, our objective is to estimate the model parameters θ in order to maximize the probability of the foreign corpus f. From Equation 4 we have:
Machine Translation as a Decipherment Task | For Bayesian MT decipherment, we set a high prior value on the language model (10^4) and use sparse priors for the IBM 3 model parameters t, n, d, p (0.01, 0.01, 0.01, 0.01).
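The equation following "we have:" above is not reproduced in these excerpts. As a hedged reconstruction of the generic noisy-channel decipherment objective, maximize P(f) = Σ_e P(e) · P_θ(f | e), the sketch below brute-forces that marginal over a small candidate set; the language model, candidate plaintexts, and word-for-word channel factorization are illustrative stand-ins, not the paper's Equation 4 or the full IBM 3 channel.

```python
# Hedged sketch of the noisy-channel decipherment objective:
# maximize P(f) = sum over e of P(e) * P_theta(f | e).
# lm_prob and channel are illustrative stand-ins, not the paper's model.

def corpus_prob(f, candidates, lm_prob, channel):
    """Marginal probability of foreign sentence f over candidate plaintexts e."""
    total = 0.0
    for e in candidates:
        if len(e) != len(f):
            continue  # toy word-for-word channel; IBM 3 also models fertility
        p = lm_prob[e]
        for ei, fi in zip(e, f):
            p *= channel.get((ei, fi), 0.0)  # P_theta(fi | ei)
        total += p
    return total

# Tiny usage example with made-up probabilities.
f = ("la", "maison")
candidates = [("the", "house"), ("a", "house")]
lm_prob = {("the", "house"): 0.6, ("a", "house"): 0.4}
channel = {("the", "la"): 0.5, ("a", "la"): 0.2, ("house", "maison"): 0.7}
print(corpus_prob(f, candidates, lm_prob, channel))  # 0.6*0.5*0.7 + 0.4*0.2*0.7
```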
Word Substitution Decipherment | During decipherment, our goal is to estimate the channel model parameters θ.
Word Substitution Decipherment | These methods are attractive for their ability to manage uncertainty about model parameters and for allowing prior knowledge to be incorporated during inference.
Adding Linguistic Knowledge to the Monte-Carlo Framework | Since our model is a nonlinear approximation of the underlying action-value function of the game, we learn model parameters by applying nonlinear regression to the observed final utilities from the simulated roll-outs. |
Adding Linguistic Knowledge to the Monte-Carlo Framework | The resulting update to model parameters θ is of the form:
Adding Linguistic Knowledge to the Monte-Carlo Framework | We use the same experimental settings across all methods, and all model parameters are initialized to zero. |
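The update referred to as "of the form:" above is not shown in these excerpts. Under the stated setup (nonlinear regression of an action-value approximation against observed final utilities, with parameters initialized to zero), a standard Monte-Carlo regression step would be θ ← θ + α (R − Q_θ(s, a)) ∇_θ Q_θ(s, a). The sigmoid form of Q and the learning rate in the sketch below are assumptions, not the paper's model.

```python
# Hedged sketch of a Monte-Carlo regression update for a nonlinear
# action-value approximation. The form Q(s, a) = sigmoid(theta . x)
# and the learning rate are assumptions, not the source paper's model.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mc_update(theta, x, utility, lr=0.1):
    """One gradient step on (utility - Q)^2 for Q = sigmoid(theta . x)."""
    q = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    grad_scale = (utility - q) * q * (1.0 - q)  # error times dQ/dz
    return [t + lr * grad_scale * xi for t, xi in zip(theta, x)]

theta = [0.0, 0.0, 0.0]  # parameters initialized to zero, as in the text
theta = mc_update(theta, x=[1.0, 0.5, -0.2], utility=1.0)
print(theta)
```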
Association Model | Basic Interpolation: This smoothing model, P_interp(e|q), linearly combines our foreground and background models using a model parameter α:
Association Model | Section 5.2 outlines our procedure for learning the model parameters for both P_interp(e|q) and
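Concretely, the basic interpolation above would be P_interp(e|q) = α · P_fg(e|q) + (1 − α) · P_bg(e|q). The sketch below is a minimal rendering with made-up probabilities; the component names P_fg and P_bg are assumptions, since the excerpt does not name them, and α would be tuned on held-out data per the procedure of Section 5.2.

```python
# Minimal sketch of basic linear interpolation of a foreground and a
# background model. Probability values and names are illustrative only.
def p_interp(p_fg, p_bg, alpha):
    """P_interp(e|q) = alpha * P_fg(e|q) + (1 - alpha) * P_bg(e|q)."""
    return alpha * p_fg + (1.0 - alpha) * p_bg

print(p_interp(p_fg=0.30, p_bg=0.05, alpha=0.7))  # -> 0.225
```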
Experimental Results | 5.2.1 Model Parameters |
Decipherment | These methods are attractive for their ability to manage uncertainty about model parameters and for allowing prior knowledge to be incorporated during inference.
Decipherment | Our goal is to estimate the channel model parameters θ in order to maximize the probability of the observed ciphertext c:
Decipherment | The base distribution P0 represents prior knowledge about the model parameter distributions. |
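A minimal sketch of estimating the channel parameters θ toward maximizing P(c), written as one EM-style re-estimation step. Enumerating candidate plaintexts and the smoothing floor are assumptions made for the sake of a runnable toy; real decipherment systems approximate the sum with sampling or beams, and the Bayesian variant would fold the base distribution P0 in as a prior over the channel tables rather than use plain EM.

```python
# Hedged sketch: one EM-style update of channel parameters
# theta = P(cipher token | plain token) toward maximizing P(c).
# Enumerating candidate plaintexts is an illustrative simplification.
from collections import defaultdict

def em_step(cipher, candidates, lm_prob, theta, floor=1e-9):
    """Re-estimate P(c_i | e_i) from posterior-weighted counts."""
    # E-step: joint probability of each candidate plaintext e with c.
    joints = []
    for e in candidates:
        p = lm_prob[e]
        for ei, ci in zip(e, cipher):
            p *= theta.get((ei, ci), floor)
        joints.append(p)
    z = sum(joints) or 1.0
    # M-step: normalize expected counts into new channel probabilities.
    counts, totals = defaultdict(float), defaultdict(float)
    for e, p in zip(candidates, joints):
        for ei, ci in zip(e, cipher):
            counts[(ei, ci)] += p / z
            totals[ei] += p / z
    return {pair: v / totals[pair[0]] for pair, v in counts.items()}

# Tiny usage example with made-up probabilities.
theta = {("the", "xqz"): 0.5, ("a", "xqz"): 0.5}
print(em_step(("xqz",), [("the",), ("a",)], {("the",): 0.7, ("a",): 0.3}, theta))
```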