Index of papers in Proc. ACL that mention
  • model score
Xiao, Tong and Zhu, Jingbo and Zhang, Chunliang
A Skeleton-based Approach to MT 2.1 Skeleton Identification
As is standard in SMT, we further assume that 1) the translation process can be decomposed into a derivation of phrase-pairs (for phrase-based models) or translation rules (for syntax-based models); and 2) a linear function is used to assign a model score to each derivation.
A Skeleton-based Approach to MT 2.1 Skeleton Identification
The above problem can be redefined in a Viterbi fashion: we find the derivation d with the highest model score given s and τ′:
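The formula itself did not survive extraction; a hedged reconstruction in standard linear-model notation (symbols assumed, not copied from the paper) is:

\[
\hat{d} \;=\; \operatorname*{arg\,max}_{d \in \mathcal{D}(s,\,\tau')} \; \mathbf{w}^{\top}\mathbf{f}(d) \;=\; \operatorname*{arg\,max}_{d \in \mathcal{D}(s,\,\tau')} \; \sum_i w_i\, f_i(d)
\]

where the f_i are feature functions over the phrase-pairs or rules in the derivation d, the w_i are their weights, and D(s, τ′) is the set of derivations admissible for the source sentence s and its skeleton τ′.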
A Skeleton-based Approach to MT 2.1 Skeleton Identification
The skeleton translation model focuses on the translation of the sentence skeleton, i.e., the solid (red) rectangles; while the full translation model computes the model score for all those phrase-pairs, i.e., all solid and dashed rectangles.
model score is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Wu, Yuanbin and Ng, Hwee Tou
Experiments
As described in Section 3.2, the weight of each variable is a linear combination of the language model score, three classifier confidence scores, and three classifier disagreement scores.
Experiments
We use the Web 1T 5-gram corpus (Brants and Franz, 2006) to compute the language model score for a sentence.
Experiments
Finally, the language model score, classifier confidence scores, and classifier disagreement scores are normalized to take values in [0, 1], based on the HOO 2011 development data.
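A minimal sketch of such a weight computation (the coefficient names, component inventory, and min-max normalization bounds are illustrative assumptions, not the paper's exact setup):

    # Sketch: weight of one correction variable as a linear combination of
    # a language model score, classifier confidence scores, and classifier
    # disagreement scores, each first normalized to [0, 1].

    def minmax(value, lo, hi):
        """Rescale a raw score into [0, 1]; lo/hi estimated on dev data."""
        return 0.0 if hi == lo else min(1.0, max(0.0, (value - lo) / (hi - lo)))

    def variable_weight(lm, conf, dis, lam, bounds):
        """lm: raw LM score; conf/dis: dicts keyed by classifier name
        (e.g. 'ART', 'PREP', 'NOUN'); lam: one coefficient per component."""
        total = lam["lm"] * minmax(lm, *bounds["lm"])
        for name, c in conf.items():
            total += lam["conf_" + name] * minmax(c, *bounds["conf_" + name])
        for name, d in dis.items():
            total += lam["dis_" + name] * minmax(d, *bounds["dis_" + name])
        return total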
Inference with First Order Variables
The language model score h(s′, LM) of s′, based on a large web corpus;
Inference with First Order Variables
Next, to compute the variable weight w, we collect the language model score and confidence scores from the article (ART), preposition (PREP), and noun number (NOUN) classifiers, i.e., E = {ART, PREP, NOUN}.
Inference with Second Order Variables
When measuring the gain due to setting the second order variable for changing cat to cats to 1, the corresponding weight w^NOUN is likely to be small, since "A cats" will get a low language model score, a low article classifier confidence score, and a low noun number classifier confidence score.
Related Work
Features used in classification include surrounding words, part-of-speech tags, language model scores (Gamon, 2010), and parse tree structures (Tetreault et al., 2010).
model score is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Duan, Manjuan and White, Michael
Abstract
However, by using an SVM ranker to combine the realizer’s model score together with features from multiple parsers, including ones designed to make the ranker more robust to parsing mistakes, we show that significant increases in BLEU scores can be achieved.
Conclusion
In this paper, we have shown that while using parse accuracy in a simple reranking strategy for self-monitoring fails to improve BLEU scores over a state-of-the-art averaged perceptron realization ranking model, it is possible to significantly increase BLEU scores using an SVM ranker that combines the realizer’s model score together with features from multiple parsers, including ones designed to make the ranker more robust to parsing mistakes that human readers would be unlikely to make.
Introduction
Therefore, to develop a more nuanced self-monitoring reranker that is more robust to such parsing mistakes, we trained an SVM using dependency precision and recall features for all three parses, their n-best parsing results, and per-label precision and recall for each type of dependency, together with the realizer’s normalized perceptron model score as a feature.
Reranking with SVMs 4.1 Methods
Similarly, we conjectured that large differences in the realizer’s perceptron model score may more reliably reflect human fluency preferences than small ones, and thus we combined this score with features for parser accuracy in an SVM ranker.
Reranking with SVMs 4.1 Methods
perceptron model score: the score from the realizer's model, normalized to [0, 1] for the realizations in the n-best list
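A sketch of that normalization step (the function name is hypothetical): rescale the realizer's scores within each n-best list so the feature is comparable across lists.

    def normalize_nbest_scores(scores):
        """Min-max normalize realizer model scores to [0, 1] within one n-best list."""
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0
        return [(s - lo) / span for s in scores]

    # The normalized score then becomes one feature alongside the parser-based
    # precision/recall features in each realization's SVM feature vector.
    feats = normalize_nbest_scores([-10.2, -11.7, -13.9])   # -> [1.0, ~0.59, 0.0]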
Reranking with SVMs 4.1 Methods
We trained different models to investigate the contribution made by different parsers and different types of features, with the perceptron model score included as a feature in all models.
model score is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Lopez, Adam
Experiments
Both show an improved relationship between model score and F-measure.
Oracle Parsing
Digging deeper, we compared parser model score against Viterbi F-score and oracle F-score at a va-
model score is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Ma, Ji and Zhu, Jingbo and Xiao, Tong and Yang, Nan
B_k ← BESTS(X, B_{k-1}, W)
beam B (sequences in B are sorted in terms of model score, i.e., w·φ(B[0]) > w·φ(B[1]) > …).
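A runnable sketch of this selection step (the feature map φ and the candidate set are assumed interfaces): BESTS keeps the k expansions with the highest model score w·φ(y), sorted best-first.

    import heapq

    def score(w, phi_y):
        """Model score w . phi(y), with sparse feature dicts."""
        return sum(w.get(f, 0.0) * v for f, v in phi_y.items())

    def bests(candidates, w, phi, k):
        """Return the beam B_k: the k top-scoring candidates, so that
        w.phi(B[0]) >= w.phi(B[1]) >= ... (ties broken arbitrarily)."""
        return heapq.nlargest(k, candidates, key=lambda y: score(w, phi(y)))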
Introduction
[Figure: a beam-search trace over steps 1…k, with correct (C) and predicted (P) items annotated with their model scores, illustrating a valid update.]
Introduction
The numbers following C/P are model scores.
Introduction
In particular, one needs to guarantee that each parameter update is valid, i.e., the correct action sequence has a lower model score than the predicted one.
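A sketch of that validity guarantee (helper names are hypothetical): apply the perceptron update only when the correct sequence really scores below the prediction.

    def score(w, feats):
        # Sparse dot product, as in the previous sketch.
        return sum(w.get(f, 0.0) * v for f, v in feats.items())

    def maybe_update(w, gold_feats, pred_feats):
        """Valid update: the correct action sequence must have a lower
        model score than the predicted one before we move the weights."""
        if score(w, gold_feats) < score(w, pred_feats):
            for f, v in gold_feats.items():
                w[f] = w.get(f, 0.0) + v
            for f, v in pred_feats.items():
                w[f] = w.get(f, 0.0) - v
            return True
        return False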
model score is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Sim, Khe Chai
A Probabilistic Formulation for HVR
• Acoustic model score: p(O|W)
• Haptic model score: p(H|ℒ)
A Probabilistic Formulation for HVR
• PLI model score: P(ℒ|W)
A Probabilistic Formulation for HVR
• Language model score: P(W)
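One plausible way these four component scores combine under Bayes' rule (the marginalization over the PLI sequence ℒ is an assumption here, not quoted from the paper):

\[
\hat{W} \;=\; \operatorname*{arg\,max}_{W}\; P(W)\, p(O \mid W) \sum_{\mathcal{L}} p(H \mid \mathcal{L})\, P(\mathcal{L} \mid W)
\]

where O is the acoustic observation and H is the haptic input.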
model score is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
liu, lemao and Watanabe, Taro and Sumita, Eiichiro and Zhao, Tiejun
Introduction
In the search procedure, the model score must be computed frequently for the search heuristic function, which makes decoding efficiency a challenge for the neural network based translation model.
Introduction
The main reason why cube-pruning works is that the translation model is linear and the model score for the language model is approximately monotonic (Chiang, 2007).
Introduction
The premise of cube-pruning is that the language model score is approximately monotonic (Chiang, 2007).
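A toy sketch of the cube-pruning idea this premise refers to (Chiang, 2007); the scoring callback and hypothesis types are placeholders. Because the combined score is approximately monotonic over the grid of sorted antecedents, best-first search with a heap finds near-best combinations without enumerating the whole grid.

    import heapq

    def cube_prune(xs, ys, combined_score, k):
        """xs, ys: antecedent hypotheses sorted best-first by model score.
        combined_score(x, y): full score of the combination, including the
        (only approximately monotonic) language model interaction."""
        heap = [(-combined_score(xs[0], ys[0]), 0, 0)]
        seen, out = {(0, 0)}, []
        while heap and len(out) < k:
            neg, i, j = heapq.heappop(heap)
            out.append((xs[i], ys[j], -neg))
            for ni, nj in ((i + 1, j), (i, j + 1)):   # push grid neighbours
                if ni < len(xs) and nj < len(ys) and (ni, nj) not in seen:
                    seen.add((ni, nj))
                    heapq.heappush(heap, (-combined_score(xs[ni], ys[nj]), ni, nj))
        return out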
model score is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Tang, Duyu and Wei, Furu and Yang, Nan and Zhou, Ming and Liu, Ting and Qin, Bing
Related Work
The training objective is that the original ngram is expected to obtain a higher language model score than the corrupted ngram by a margin of 1.
Related Work
loss(t, tʳ) = max(0, 1 − f_cw(t) + f_cw(tʳ))   (1), where t is the original ngram, tʳ is the corrupted ngram, and f_cw(·) is a one-dimensional scalar representing the language model score of the input ngram.
Related Work
The output f_cw is the language model score of the input, which is calculated as given in Equation 2, where L is the lookup table of word embeddings, and w1, w2, b1, b2 are the parameters of the linear layers.
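A numpy sketch in the spirit of such a scorer (the layer sizes, tanh nonlinearity, and exact wiring are assumptions; the paper's Equation 2 did not survive extraction):

    import numpy as np

    rng = np.random.default_rng(0)
    V, d, n, h = 10000, 50, 5, 100    # vocab size, embedding dim, ngram length, hidden units
    L  = rng.normal(size=(V, d))      # lookup table of word embeddings
    w1 = rng.normal(size=(h, n * d)); b1 = np.zeros(h)
    w2 = rng.normal(size=(1, h));     b2 = np.zeros(1)

    def f_cw(ngram_ids):
        """One-dimensional language model score of an ngram of word ids."""
        x = L[ngram_ids].reshape(-1)  # concatenate the n embeddings
        return (w2 @ np.tanh(w1 @ x + b1) + b2).item()

    def rank_loss(t, t_corrupt):
        """Hinge loss: the original ngram should outscore the corrupted
        ngram by a margin of 1 (cf. Equation 1 above)."""
        return max(0.0, 1.0 - f_cw(t) + f_cw(t_corrupt))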
model score is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Green, Spence and DeNero, John
A Class-based Model of Agreement
The agreement model scores sequences of morpho-syntactic word classes, which express grammatical features relevant to agreement.
A Class-based Model of Agreement
However, in MT, we seek a measure of sentence quality q(e) that is comparable across different hypotheses on the beam (much like the n-gram language model score).
A Class-based Model of Agreement
Discriminative model scores have been used as MT features (Galley and Manning, 2009), but we obtained better results by scoring the 1-best class sequences with a generative model.
Inference during Translation Decoding
The agreement model score is one decoder feature function.
Introduction
Our model scores hypotheses during decoding.
model score is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Rush, Alexander M. and Collins, Michael
Background: Hypergraphs
The labels for leaves will be words, and will be important in defining strings and language model scores for those strings.
Conclusion
For each v ∈ V_L, define α_v = max_{p : v_3(p) = v} β(p), where β(p) = h(v_1(p), v_2(p), v_3(p)) − λ_1(v_1(p)) − λ_2(v_2(p)) − Σ_{s ∈ p_1(p)} λ_3(s) − Σ_{s ∈ p_2(p)} λ_4(s). Here h is a function that computes language model scores, and the other terms involve Lagrange multipliers.
Introduction
Informally, the first decoding algorithm incorporates the weights and hard constraints on translations from the synchronous grammar, while the second decoding algorithm is used to integrate language model scores.
Introduction
We compare our method to cube pruning (Chiang, 2007), and find that our method gives improved model scores on a significant number of examples.
The Full Algorithm
In the simple algorithm, the first step was to predict the previous leaf for each leaf v, under a score that combined a language model score with a Lagrange multiplier score (i.e., compute arg max_u δ(u, v), where δ(u, v) = h(u, v) + λ(u)). In this section we describe an algorithm that for each leaf v again predicts the previous leaf, but in addition predicts the full path back to that leaf.
model score is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
DeNero, John and Chiang, David and Knight, Kevin
Abstract
The minimum Bayes risk (MBR) decoding objective improves BLEU scores for machine translation output relative to the standard Viterbi objective of maximizing model score.
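In symbols (standard notation for these two objectives, not quoted from the paper):

\[
\hat{e}_{\text{Viterbi}} = \operatorname*{arg\,max}_{e}\ \mathrm{score}(e \mid f),
\qquad
\hat{e}_{\text{MBR}} = \operatorname*{arg\,min}_{e}\ \sum_{e'} P(e' \mid f)\, \ell(e, e')
\]

where the loss ℓ is derived from BLEU, so MBR prefers the translation with the lowest expected loss (highest consensus) rather than the single highest-scoring derivation.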
Computing Feature Expectations
Translation forests compactly encode an exponential number of output translations for an input sentence, along with their model scores.
Computing Feature Expectations
Decoder states can include additional information as well, such as local configurations for dependency language model scoring.
Computing Feature Expectations
The n-gram language model score of e similarly decomposes over the h in e that produce n-grams.
Experimental Results
Figure 4: Three translations of an example Arabic sentence: its human- generated reference, the translation with the highest model score under Hiero (Viterbi), and the translation chosen by forest-based consensus decoding.
model score is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Razmara, Majid and Foster, George and Sankaran, Baskaran and Sarkar, Anoop
Ensemble Decoding
where ⊕ denotes the mixture operation between two or more model scores.
Ensemble Decoding
• Weighted Max (wmax): where the ensemble score is the weighted max of all model scores.
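A minimal sketch of these two mixture operations (argument shapes are assumed): each takes the scores that the component models assign to one phrase-pair, plus per-model weights λ.

    def wsum(scores, lambdas):
        """Weighted sum: ensemble score is the lambda-weighted sum of model scores."""
        return sum(l * s for l, s in zip(lambdas, scores))

    def wmax(scores, lambdas):
        """Weighted max: ensemble score is the max of the weighted model scores."""
        return max(l * s for l, s in zip(lambdas, scores))

Because the component scores are unnormalized log-linear scores, such mixtures are sensitive to the models being on comparable scales, as the next excerpt notes.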
Ensemble Decoding
Since in log-linear models the model scores are not normalized to form probability distributions, the scores that different models assign to each phrase-pair may not be on the same scale.
Experiments & Results 4.1 Experimental Setup
An interesting observation based on the results in Table 3 is that uniform weights are doing reasonably well given that the component weights are not optimized and therefore model scores may not be on the same scale (refer to the discussion in §3.2).
model score is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Galley, Michel and Manning, Christopher D.
Abstract
This paper applies MST parsing to MT, and describes how it can be integrated into a phrase-based decoder to compute dependency language model scores.
Dependency parsing for machine translation
While it seems that loopy graphs are undesirable when the goal is to obtain a syntactic analysis, that is not necessarily the case when one just needs a language modeling score.
Machine translation experiments
We use the standard features implemented almost exactly as in Moses: four translation features (phrase-based translation probabilities and lexically-weighted probabilities), word penalty, phrase penalty, linear distortion, and language model score.
Machine translation experiments
Dependency language model score computed with the dependency parsing algorithm described in Section 2.
model score is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hao and Gildea, Daniel
Experiments
This figure indicates the advantage of the two-pass decoding strategy in producing translations with a high model score in less time.
Experiments
However, model scores do not directly translate into BLEU scores.
Experiments
Simply by exploring more (200 times the log beam) after-goal items, we can optimize the Viterbi synchronous parse significantly, shown in Figure 3(left) in terms of model score versus search time.
model score is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Shen, Libin and Xu, Jinxi and Weischedel, Ralph
Dependency Language Model
In order to calculate the dependency language model score, or depLM score for short, on the fly for
Implementation Details
Language model score.
Implementation Details
Dependency language model score.
model score is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Heilman, Michael and Cahill, Aoife and Madnani, Nitin and Lopez, Melissa and Mulholland, Matthew and Tetreault, Joel
Abstract
In this work, we construct a statistical model of grammaticality using various linguistic features (e.g., misspelling counts, parser outputs, n-gram language model scores).
Discussion and Conclusions
While Post found that such a system can effectively distinguish grammatical news text sentences from sentences generated by a language model, measuring the grammaticality of real sentences from language learners seems to require a wider variety of features, including n-gram counts, language model scores, etc.
Experiments
To create further baselines for comparison, we selected the following features that represent ways one might approximate grammaticality if a comprehensive model were unavailable: whether the link parser can fully parse the sentence (complete_link), the Gigaword language model score (gigaword_avglogprob), and the number of misspelled tokens (nummisspelled).
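A sketch of these three baselines as a feature extractor (the input representations, and any name other than the quoted feature names, are assumptions):

    def baseline_features(link_parse_complete, token_logprobs, num_misspelled):
        """complete_link: whether the link parser fully parses the sentence;
        gigaword_avglogprob: average per-token LM log-probability;
        nummisspelled: count of misspelled tokens."""
        return {
            "complete_link": 1.0 if link_parse_complete else 0.0,
            "gigaword_avglogprob": sum(token_logprobs) / max(1, len(token_logprobs)),
            "nummisspelled": float(num_misspelled),
        }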
model score is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Model Training
L_SGT(W, V, s_[1,n]) = −log( exp(y^oracle_[1,n]) / Σ_{t ∈ nbest} exp(y_t) )   (7), where y^oracle_[1,n] is the model score of an oracle translation candidate for the span [1, n].
Our Model
for SMT performance, such as language model score and distortion model score.
Our Model
The commonly used features, such as translation score, language model score, and distortion score, are used as the recurrent input vector x.
model score is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Riezler, Stefan and Simianer, Patrick and Haas, Carolin
Experiments
Method 4, named REBOL, implements REsponse-Based Online Learning by instantiating y⁺ and y⁻ in the form described in Section 4: in addition to the model score s, it uses a cost function c based on sentence-level BLEU (Nakov et al., 2012) and tests translation hypotheses for task-based feedback using a binary execution function e.
Response-based Online Learning
(2012), inter alia) and incorporates the current model score, leading to various ramp loss objectives described in Gimpel and Smith (2012).
Response-based Online Learning
The opposite of y⁺ is the translation y⁻ that leads to negative feedback, has a high model score, and a high cost.
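A sketch of how y⁺ and y⁻ might be selected and used in a perceptron-style update (the score, cost, and execution interfaces follow the excerpts; the selection rules and step size are assumptions):

    def select_pair(nbest, score, cost, executes):
        """y+ : positive task feedback, high model score s, low cost c.
           y- : negative task feedback, high model score s, high cost c."""
        pos = [y for y in nbest if executes(y)]
        neg = [y for y in nbest if not executes(y)]
        y_plus  = max(pos, key=lambda y: score(y) - cost(y)) if pos else None
        y_minus = max(neg, key=lambda y: score(y) + cost(y)) if neg else None
        return y_plus, y_minus

    def ramp_update(w, phi, y_plus, y_minus, eta=0.1):
        """Move the weights toward y+ and away from y-."""
        for f, v in phi(y_plus).items():
            w[f] = w.get(f, 0.0) + eta * v
        for f, v in phi(y_minus).items():
            w[f] = w.get(f, 0.0) - eta * v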
model score is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: