A Skeleton-based Approach to MT 2.1 Skeleton Identification | As is standard in SMT, we further assume that 1) the translation process can be decomposed into a derivation of phrase-pairs (for phrase-based models) or translation rules (for syntax-based models); and 2) a linear function is used to assign a model score to each derivation. |
A Skeleton-based Approach to MT 2.1 Skeleton Identification | The above problem can be redefined in a Viterbi fashion: we find the derivation d with the highest model score given the input sentence and its skeleton: |
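The two assumptions above (a derivation of phrase-pairs scored by a linear function, with decoding as an argmax over derivations) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the feature names and weights are invented for the example.

```python
# Minimal sketch of a linear SMT model score over a derivation and the
# Viterbi-style argmax over candidate derivations. Feature names and
# weight values below are illustrative assumptions, not from the paper.

def model_score(derivation_features, weights):
    """Linear model score: weighted sum of the derivation's feature values."""
    return sum(weights[name] * value for name, value in derivation_features.items())

def viterbi_best(derivations, weights):
    """Pick the derivation with the highest model score (argmax)."""
    return max(derivations, key=lambda d: model_score(d, weights))

weights = {"tm": 1.0, "lm": 0.5, "distortion": -0.2}
derivations = [
    {"tm": -2.1, "lm": -4.0, "distortion": 1.0},   # score -4.3
    {"tm": -1.5, "lm": -5.0, "distortion": 0.0},   # score -4.0
]
best = viterbi_best(derivations, weights)
```

In a real decoder the argmax is computed by dynamic programming over partial derivations rather than by enumerating full candidates.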
A Skeleton-based Approach to MT 2.1 Skeleton Identification | The skeleton translation model focuses on the translation of the sentence skeleton, i.e., the solid (red) rectangles; while the full translation model computes the model score for all those phrase-pairs, i.e., all solid and dashed rectangles. |
Abstract | However, by using an SVM ranker to combine the realizer’s model score together with features from multiple parsers, including ones designed to make the ranker more robust to parsing mistakes, we show that significant increases in BLEU scores can be achieved. |
Conclusion | In this paper, we have shown that while using parse accuracy in a simple reranking strategy for self-monitoring fails to improve BLEU scores over a state-of-the-art averaged perceptron realization ranking model, it is possible to significantly increase BLEU scores using an SVM ranker that combines the realizer’s model score together with features from multiple parsers, including ones designed to make the ranker more robust to parsing mistakes that human readers would be unlikely to make. |
Introduction | Therefore, to develop a more nuanced self-monitoring reranker that is more robust to such parsing mistakes, we trained an SVM using dependency precision and recall features for all three parses, their n-best parsing results, and per-label precision and recall for each type of dependency, together with the realizer’s normalized perceptron model score as a feature. |
Reranking with SVMs 4.1 Methods | Similarly, we conjectured that large differences in the realizer’s perceptron model score may more reliably reflect human fluency preferences than small ones, and thus we combined this score with features for parser accuracy in an SVM ranker. |
Reranking with SVMs 4.1 Methods | perceptron model score: the score from the realizer’s model, normalized to [0, 1] for the realizations in the n-best list
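The feature definition above normalizes the realizer's score to [0, 1] within each n-best list. One common way to do this is min-max normalization; the sketch below is an assumption about the exact scheme, not the paper's code.

```python
# Hedged sketch: min-max normalization of realizer model scores to [0, 1]
# over one n-best list. The exact normalization used in the paper is not
# specified here; min-max is an assumption for illustration.

def normalize_scores(scores):
    lo, hi = min(scores), max(scores)
    if hi == lo:                      # degenerate n-best list: all scores equal
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

nbest_scores = [12.3, 9.8, 11.0, 7.5]
normalized = normalize_scores(nbest_scores)   # best -> 1.0, worst -> 0.0
```

Per-list normalization keeps the feature comparable across sentences whose raw perceptron scores live on very different scales.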
Reranking with SVMs 4.1 Methods | We trained different models to investigate the contribution made by different parsers and different types of features, with the perceptron model score included as a feature in all models. |
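A standard way to train an SVM ranker of the kind described above is the pairwise transformation: convert each (better, worse) realization pair into difference vectors and fit a linear classifier on them. The feature layout below (dependency precision/recall plus the normalized perceptron score) is a simplified assumption.

```python
# Hedged sketch of a ranking SVM via the standard pairwise transformation.
# Each feature vector is assumed to be [dep. precision, dep. recall,
# normalized perceptron score]; a real model would include many more
# features (per-label precision/recall, n-best parse features, etc.).

import numpy as np

def pairwise_examples(pairs):
    """Turn (better, worse) feature-vector pairs into +1 / -1 examples."""
    X, y = [], []
    for better, worse in pairs:
        X.append(better - worse); y.append(1)
        X.append(worse - better); y.append(-1)
    return np.array(X), np.array(y)

pairs = [
    (np.array([0.95, 0.93, 0.9]), np.array([0.80, 0.78, 0.6])),
    (np.array([0.90, 0.91, 1.0]), np.array([0.85, 0.70, 0.2])),
]
X, y = pairwise_examples(pairs)
# A linear SVM (e.g. sklearn.svm.LinearSVC) fit on (X, y) yields a weight
# vector w; at test time, realizations are ranked by w . x.
```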
Related Work | The training objective is that the original ngram should obtain a language model score higher than that of the corrupted ngram by a margin of 1.
Related Work | loss(t, t′) = max(0, 1 − f_CW(t) + f_CW(t′)) (1) where t is the original ngram, t′ is the corrupted ngram, and f_CW(·) is a one-dimensional scalar representing the language model score of the input ngram.
Related Work | The output f_CW is the language model score of the input, which is calculated as given in Equation 2, where L is the lookup table of word embeddings, and W1, W2, b1, b2 are the parameters of the linear layers.
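The margin-1 ranking objective described in this passage can be sketched directly as a hinge loss. The scorer `f` below is an arbitrary toy stand-in for the neural scorer, used only to make the loss computation concrete.

```python
# Hedged sketch of the margin-1 ranking objective (Collobert & Weston
# style): the original ngram should score at least 1 higher than the
# corrupted ngram. The scorer f here is a toy stand-in, not the network.

def ranking_loss(f, original, corrupted):
    """Hinge loss: zero iff f(original) >= f(corrupted) + 1."""
    return max(0.0, 1.0 - f(original) + f(corrupted))

def f(ngram):
    """Arbitrary illustrative scorer: number of distinct tokens."""
    return float(len(set(ngram)))

# Corrupted ngram ties with the original, so the margin is violated:
loss_violated = ranking_loss(f, ("the", "cat", "sat"), ("the", "dog", "sat"))
# Corrupted ngram scores 2 below the original, so the loss is zero:
loss_ok = ranking_loss(f, ("the", "cat", "sat"), ("the", "the", "the"))
```

In training, the corrupted ngram is typically produced by replacing the center word with a random vocabulary word, and the loss gradient updates the scorer's parameters.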
Abstract | In this work, we construct a statistical model of grammaticality using various linguistic features (e.g., misspelling counts, parser outputs, n-gram language model scores).
Discussion and Conclusions | While Post found that such a system can effectively distinguish grammatical news text sentences from sentences generated by a language model, measuring the grammaticality of real sentences from language learners seems to require a wider variety of features, including n-gram counts, language model scores, etc.
Experiments | To create further baselines for comparison, we selected the following features that represent ways one might approximate grammaticality if a comprehensive model were unavailable: whether the link parser can fully parse the sentence (complete_link), the Gigaword language model score (gigaword_avglogprob), and the number of misspelled tokens (num_misspelled).
Model Training | L_SGT(W, V, s^[1,n]) = −log( exp(y^[1,n]_oracle) / Σ_{t∈nbest} exp(y_t) ) (7) where y^[1,n]_oracle is the model score of an oracle translation candidate for the span [1, n].
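The loss in Eq. (7) is a negative log-softmax over the n-best candidates' model scores, pushing the oracle candidate's score above the rest. A numerically stable sketch (the score values are illustrative):

```python
# Hedged sketch of a negative log-softmax loss over n-best model scores,
# encouraging the oracle candidate to outscore the other candidates.
# Uses the log-sum-exp trick for numerical stability.

import math

def span_loss(nbest_scores, oracle_index):
    """-log( exp(y_oracle) / sum_t exp(y_t) ), computed stably."""
    m = max(nbest_scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in nbest_scores))
    return -(nbest_scores[oracle_index] - log_z)

loss = span_loss([2.0, 0.5, -1.0], oracle_index=0)
```

Raising the oracle's score relative to the competitors drives this loss toward zero, which is exactly the behavior the training objective rewards.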
Our Model | for SMT performance, such as language model score and distortion model score.
Our Model | The commonly used features, such as translation score, language model score and distortion score, are used as the recurrent input vector x.
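Using feature scores as the recurrent input vector can be sketched as a single RNN step whose input x stacks the per-step SMT scores. The dimensions, the tanh cell, and the random weights below are all assumptions for illustration.

```python
# Hedged sketch: assemble per-step SMT feature scores (translation score,
# language model score, distortion score) into the recurrent input vector
# x and run one RNN step. Hidden size, tanh cell, and weights are assumed.

import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((4, 3)) * 0.1   # input -> hidden
W_hh = rng.standard_normal((4, 4)) * 0.1   # hidden -> hidden

def rnn_step(h, x):
    return np.tanh(W_xh @ x + W_hh @ h)

x = np.array([-1.2, -3.4, -0.5])   # [translation, LM, distortion] scores
h = rnn_step(np.zeros(4), x)
```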
Experiments | Method 4, named REBOL, implements REsponse-Based Online Learning by instantiating y+ and y− to the form described in Section 4: in addition to the model score s, it uses a cost function c based on sentence-level BLEU (Nakov et al., 2012) and tests translation hypotheses for task-based feedback using a binary execution function e.
Response-based Online Learning | (2012), inter alia) and incorporates the current model score, leading to various ramp loss objectives described in Gimpel and Smith (2012).
Response-based Online Learning | The opposite of y+ is the translation y− that leads to negative feedback, has a high model score, and a high cost.
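The characterization of y+ and y− above suggests a ramp-loss-style selection: y+ scores high under the model with low cost, while y− scores high with high cost. The sketch below illustrates that selection; the hypothesis IDs, scores, and costs are invented for the example.

```python
# Hedged sketch of ramp-loss-style selection of y+ and y- from a set of
# hypotheses: y+ maximizes (model score - cost), y- maximizes
# (model score + cost). Values below are illustrative only.

def select_pair(hypotheses):
    """hypotheses: list of (id, model_score, cost), cost in [0, 1]."""
    y_plus = max(hypotheses, key=lambda h: h[1] - h[2])    # high score, low cost
    y_minus = max(hypotheses, key=lambda h: h[1] + h[2])   # high score, high cost
    return y_plus, y_minus

hyps = [("a", 0.9, 0.1), ("b", 0.8, 0.9), ("c", 0.2, 0.0)]
y_plus, y_minus = select_pair(hyps)
```

An update would then move the model's weights toward the features of y+ and away from those of y−, as in perceptron-style ramp loss training.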