Index of papers in Proc. ACL 2012 that mention
  • model trained
Kolachina, Prasanth and Cancedda, Nicola and Dymetman, Marc and Venkatapathy, Sriram
Inferring a learning curve from mostly monolingual data
In scenario S2, the models trained from the seed parallel corpus and the features used for inference (Section 4) provide complementary information.
Inferring a learning curve from mostly monolingual data
Using the models trained for the experiments in Section 3, we estimate the squared extrapolation error at the anchors s_j when using models trained on sizes up to x_i, and set the confidence in the extrapolations for u to its inverse:
Inferring a learning curve from mostly monolingual data
For the cases where a slightly larger in-domain “seed” parallel corpus is available, we introduced an extrapolation method and a combined method yielding high-precision predictions: using models trained on up to 20K sentence pairs we can predict performance on a given test set with a root mean squared error in the order of 1 BLEU point at 75K sentence pairs, and in the order of 2-4 BLEU points at 500K.
Selecting a parametric family of curves
For a certain bilingual test dataset d, we consider a set of observations O_d = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, where y_i is the performance on d (measured using BLEU (Papineni et al., 2002)) of a translation model trained on a parallel corpus of size x_i.
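To make the setup in these excerpts concrete, here is a rough sketch (not the authors' implementation; the power-law-with-bias curve family and the data points below are illustrative assumptions, and only the anchor sizes 75K/500K come from the excerpt above) that fits a parametric curve to observations (x_i, y_i) and extrapolates to larger anchor sizes:

```python
# Rough sketch: fit a parametric learning-curve family to observations
# O_d = {(x_i, y_i)} and extrapolate to larger "anchor" corpus sizes.
# The curve family and all numbers here are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def power_law_curve(x, c, a, alpha):
    # BLEU approaches the ceiling c as the parallel corpus size x grows.
    return c - a * np.power(x, -alpha)

# Observed points: x_i = parallel-corpus size in sentence pairs, y_i = BLEU.
x_obs = np.array([1e3, 2e3, 5e3, 10e3, 20e3])
y_obs = np.array([11.2, 13.0, 15.4, 17.1, 18.6])

params, _ = curve_fit(power_law_curve, x_obs, y_obs,
                      p0=[30.0, 100.0, 0.5], maxfev=10000)

# Extrapolate to the anchor sizes; a confidence weight could then be set
# to the inverse of the squared extrapolation error measured at held-out anchors.
for anchor in (75e3, 500e3):
    print(f"predicted BLEU at {int(anchor)} sentence pairs: "
          f"{power_law_curve(anchor, *params):.1f}")
```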
model trained is mentioned in 4 sentences in this paper.
Liu, Shujie and Li, Chi-Ho and Li, Mu and Zhou, Ming
Experiments and Results
Our baseline decoder is an in-house implementation of Bracketing Transduction Grammar (BTG) (Wu, 1997) in CKY-style decoding, with a lexical reordering model trained with maximum entropy (Xiong et al., 2006).
Experiments and Results
The language model is a 5-gram language model trained with the target sentences in the training data.
Experiments and Results
The language model is a 5-gram language model trained with the Giga-Word corpus plus the English sentences in the training data.
Graph Construction
Note that, due to pruning in both decoding and translation model training, forced alignment may fail, i.e.
model trained is mentioned in 4 sentences in this paper.
Danescu-Niculescu-Mizil, Cristian and Cheng, Justin and Kleinberg, Jon and Lee, Lillian
Hello. My name is Inigo Montoya.
First, we show a concrete sense in which memorable quotes are indeed distinctive: with respect to lexical language models trained on the newswire portions of the Brown corpus [21], memorable quotes have significantly lower likelihood than their non-memorable counterparts.
Hello. My name is Inigo Montoya.
In particular, we analyze a corpus of advertising slogans, and we show that these slogans have significantly greater likelihood at both the word level and the part-of-speech level with respect to a language model trained on memorable movie quotes, compared to a corresponding language model trained on non-memorable movie quotes.
Never send a human to do a machine’s job.
In particular, the newswire section of the Brown corpus is predicted better at the lexical level by the language model trained on non-memorable quotes.
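As a loose illustration of this kind of comparison (toy data and add-one-smoothed unigram models stand in for the paper's lexical language models; none of this is the authors' code), the sketch below scores a phrase under a model trained on "memorable" text and one trained on ordinary text, then compares average per-token log-likelihood:

```python
# Toy sketch: compare a phrase's average per-token log-likelihood under
# two add-one-smoothed unigram models trained on different corpora.
import math
from collections import Counter

def train_unigram(sentences):
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1          # +1 slot for unseen tokens
    return counts, total, vocab

def avg_logprob(phrase, model):
    counts, total, vocab = model
    toks = phrase.lower().split()
    return sum(math.log((counts[t] + 1) / (total + vocab)) for t in toks) / len(toks)

memorable_lm = train_unigram(["inconceivable", "may the force be with you"])
ordinary_lm = train_unigram(["the meeting is at noon",
                             "please see the attached report"])

slogan = "just do it"
print(avg_logprob(slogan, memorable_lm), avg_logprob(slogan, ordinary_lm))
```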
model trained is mentioned in 3 sentences in this paper.
Green, Spence and DeNero, John
Conclusion and Outlook
We achieved best results when the model training data, MT tuning set, and MT evaluation set contained roughly the same genre.
Discussion of Translation Results
For comparison, +POS indicates our class-based model trained on the 11 coarse POS tags only (e.g., “Noun”).
Discussion of Translation Results
The best result—a +1.04 BLEU average gain—was achieved when the class-based model training data, MT tuning set, and MT evaluation set contained the same genre.
model trained is mentioned in 3 sentences in this paper.
Pauls, Adam and Klein, Dan
Experiments
In Table 1, we show the first four samples of length between 15 and 20 generated from our model and a 5-gram model trained on the Penn Treebank.
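For readers who want to see what generating samples from an n-gram model means mechanically, here is a toy sketch (a bigram model over a made-up corpus rather than a 5-gram model over the Penn Treebank; not the authors' code):

```python
# Toy sketch: ancestral sampling from a bigram language model.
import random
from collections import Counter, defaultdict

def train_bigram(sentences):
    model = defaultdict(Counter)
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            model[prev][cur] += 1
    return model

def sample(model, max_len=20):
    out, prev = [], "<s>"
    while len(out) < max_len:
        words, counts = zip(*model[prev].items())
        prev = random.choices(words, weights=counts)[0]
        if prev == "</s>":
            break
        out.append(prev)
    return " ".join(out)

lm = train_bigram(["the market rose sharply",
                   "the company reported earnings",
                   "analysts expected the market to fall"])
print(sample(lm))
```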
Experiments
Table 5: Classification accuracies on the noisy WSJ for models trained on WSJ Sections 2-21 and our 1B token corpus.
Experiments
In Table 4, we also show the performance of the generative models trained on our 1B corpus.
model trained is mentioned in 3 sentences in this paper.
Zweig, Geoffrey and Platt, John C. and Meek, Christopher and Burges, Christopher J.C. and Yessenalina, Ainur and Liu, Qiang
Experimental Results 5.1 Data Resources
Note that the latter are derived from models trained with the Los Angeles Times data, while the Holmes results are derived from models trained with 19th-century novels.
Sentence Completion via Language Modeling
Our baseline model is a Good-Turing smoothed model trained with the CMU language modeling toolkit (Clarkson and Rosenfeld, 1997).
Sentence Completion via Language Modeling
For the SAT task, we used a trigram language model trained on 1.1B words of newspaper data, described in Section 5.1.
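A bare-bones illustration of completion via language-model scoring (the corpus, smoothing, and candidates below are invented and far smaller than the resources described in these excerpts): fill the blank with each candidate, score the resulting sentence under the model, and keep the argmax.

```python
# Toy sketch: sentence completion by scoring each filled-in candidate
# sentence with an add-one-smoothed bigram language model.
import math
from collections import Counter, defaultdict

def train_bigram(sentences):
    bi, uni = defaultdict(Counter), Counter()
    for s in sentences:
        toks = ["<s>"] + s.lower().split() + ["</s>"]
        uni.update(toks)
        for p, c in zip(toks, toks[1:]):
            bi[p][c] += 1
    return bi, uni, len(uni)

def score(sentence, model):
    bi, uni, vocab = model
    toks = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(math.log((bi[p][c] + 1) / (uni[p] + vocab))
               for p, c in zip(toks, toks[1:]))

lm = train_bigram(["he solved the mystery quickly",
                   "the detective examined the evidence"])
question = "the detective solved the ___"
candidates = ["mystery", "banana", "evidence"]
best = max(candidates, key=lambda w: score(question.replace("___", w), lm))
print(best)
```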
model trained is mentioned in 3 sentences in this paper.