Experiments | As described in Section 3.2, the weight of each variable is a linear combination of the language model score, three classifier confidence scores, and three classifier disagreement scores.
Experiments | We use the Web 1T 5-gram corpus (Brants and Franz, 2006) to compute the language model score for a sentence.
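The sentence above describes scoring a sentence with counts from a large n-gram corpus. A minimal sketch of count-based n-gram scoring follows; the tiny count tables and the trigram order are illustrative assumptions, not the actual Web 1T setup.

```python
# Hypothetical sketch: score a sentence as a sum of n-gram log-probabilities
# estimated from raw counts, in the spirit of using Web 1T counts.
# The count tables below are made up for illustration.
import math

trigram_counts = {("the", "cat", "sat"): 8, ("cat", "sat", "on"): 6}
bigram_counts = {("the", "cat"): 10, ("cat", "sat"): 8}

def lm_score(tokens):
    """Sum of log P(w_i | w_{i-2}, w_{i-1}) from raw counts."""
    score = 0.0
    for i in range(2, len(tokens)):
        tri = tuple(tokens[i - 2 : i + 1])
        bi = tuple(tokens[i - 2 : i])
        # Unsmoothed MLE with a count-1 fallback; a real system
        # would use proper smoothing or backoff.
        score += math.log(trigram_counts.get(tri, 1) / bigram_counts.get(bi, 1))
    return score
```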
Experiments | Finally, the language model score, classifier confidence scores, and classifier disagreement scores are normalized to take values in [0, 1], based on the HOO 2011 development data.
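The two sentences above describe normalizing seven scores to [0, 1] and combining them linearly into a variable's weight. A minimal sketch, assuming hypothetical coefficient values and score values (none are from the paper):

```python
# Hypothetical sketch: a variable's weight as a linear combination of one
# language model score, three classifier confidence scores, and three
# classifier disagreement scores, each min-max normalized into [0, 1].

def normalize(score, lo, hi):
    """Min-max normalize a raw score into [0, 1], with bounds
    estimated from development data."""
    if hi == lo:
        return 0.0
    return (score - lo) / (hi - lo)

def variable_weight(lm_score, confidences, disagreements, coeffs):
    """Linear combination of 1 LM score + 3 confidence + 3 disagreement scores."""
    features = [lm_score] + list(confidences) + list(disagreements)
    assert len(features) == len(coeffs) == 7
    return sum(c * f for c, f in zip(coeffs, features))

# Illustrative values only.
w = variable_weight(
    lm_score=normalize(-23.4, lo=-80.0, hi=0.0),
    confidences=[0.9, 0.7, 0.8],    # e.g. ART, PREP, NOUN confidence
    disagreements=[0.1, 0.2, 0.0],  # e.g. ART, PREP, NOUN disagreement
    coeffs=[1.0, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5],
)
```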
Inference with First Order Variables | The language model score h(s′, LM) of s′, based on a large web corpus;
Inference with First Order Variables | Next, to compute the weight of s′, we collect the language model score and confidence scores from the article (ART), preposition (PREP), and noun number (NOUN) classifiers, i.e., E = {ART, PREP, NOUN}.
Inference with Second Order Variables | When measuring the gain due to setting the second order variable that changes cat to cats to 1, the corresponding weight is likely to be small, since A cats will get a low language model score, a low article classifier confidence score, and a low noun number classifier confidence score.
Related Work | Features used in classification include surrounding words, part-of-speech tags, language model scores (Gamon, 2010), and parse tree structures (Tetreault et al., 2010).
Bk ← BESTS(X, Bk−1, W) | beam B (sequences in B are sorted in terms of model score, i.e., w·Φ(B[0]) > w·Φ(B[1]) > …).
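The beam invariant above (sequences kept sorted by model score w·Φ) can be sketched as follows; the feature map and weight names are assumptions for illustration, not the paper's actual feature set.

```python
# Illustrative sketch: keep the top-k candidate action sequences sorted
# by model score w·Φ(y), where Φ is a sparse feature map.
from collections import Counter

def phi(sequence):
    """Toy feature map: counts of single actions and action bigrams."""
    feats = Counter(sequence)
    feats.update(zip(sequence, sequence[1:]))
    return feats

def model_score(w, sequence):
    """Dot product w·Φ(sequence) over sparse features."""
    return sum(w.get(f, 0.0) * v for f, v in phi(sequence).items())

def bests(candidates, w, k):
    """Return the top-k sequences so that w·Φ(B[0]) >= w·Φ(B[1]) >= ..."""
    return sorted(candidates, key=lambda y: model_score(w, y), reverse=True)[:k]
```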
Introduction | [Figure: beam search over steps 1 … k, showing correct (C) and predicted (P) sequences in the beam with their model scores, and a valid update.]
Introduction | The numbers following C/P are model scores.
Introduction | In particular, one needs to guarantee that each parameter update is valid, i.e., the correct action sequence has a lower model score than the predicted one.
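The validity condition above can be sketched as a guard on a perceptron-style update; the function names and the sparse-dictionary representation are assumptions for illustration.

```python
# Sketch: only apply an update when the model currently scores the
# correct action sequence below the predicted one (a "valid" update).

def valid_update(score_correct, score_predicted):
    """An update is valid iff the model prefers the wrong sequence."""
    return score_correct < score_predicted

def perceptron_update(w, phi_correct, phi_predicted, lr=1.0):
    """w += lr * (Φ(correct) - Φ(predicted)); call only when valid."""
    for f, v in phi_correct.items():
        w[f] = w.get(f, 0.0) + lr * v
    for f, v in phi_predicted.items():
        w[f] = w.get(f, 0.0) - lr * v
    return w
```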
Introduction | In the search procedure, the model score must be computed frequently for the search heuristic function, which poses a challenge to decoding efficiency for neural network based translation models.
Introduction | The main reason why cube-pruning works is that the translation model is linear and the model score for the language model is approximately monotonic (Chiang, 2007). |
Introduction | The premise of cube-pruning is that the language model score is approximately monotonic (Chiang, 2007). |
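The monotonicity premise above is what makes the lazy enumeration at the heart of cube pruning work: if combined scores are (approximately) monotonic in each input's rank, the k best combinations of two sorted hypothesis lists can be found without scoring every pair. A minimal sketch of that core idea, using plain score sums rather than a real decoder's hypotheses:

```python
# Sketch: lazily enumerate the k largest a[i] + b[j] from two
# descending-sorted score lists with a priority queue, exploring
# neighbors in the rank grid instead of all |a| x |b| pairs.
import heapq

def k_best_combinations(a, b, k):
    """a, b: score lists sorted descending. Returns the k largest sums."""
    heap = [(-(a[0] + b[0]), 0, 0)]  # max-heap via negated scores
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        neg, i, j = heapq.heappop(heap)
        out.append(-neg)
        # Under monotonicity, the next-best candidates are grid neighbors.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(a[ni] + b[nj]), ni, nj))
    return out
```

When the language model makes the combined score only approximately monotonic, this enumeration becomes a heuristic: it may return slightly suboptimal combinations, which is the trade-off cube pruning accepts.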