Index of papers in Proc. ACL 2013 that mention
  • model trained
Kauchak, David
Abstract
We explore the relationship between normal English and simplified English and compare language models trained on varying amounts of text from each.
Abstract
We find that a combined model using both simplified and normal English data achieves a 23% improvement in perplexity and a 24% improvement on the lexical simplification task over a model trained only on simple data.
Introduction
Finally, many recent text simplification systems have utilized language models trained only on simplified data (Zhu et al., 2010; Woodsend and Lapata, 2011; Coster and Kauchak, 2011a; Wubben et al., 2012); improvements in simple language modeling could translate into improvements for these systems.
Language Model Evaluation: Perplexity
Figure 1: Language model perplexities on the held-out test data for models trained on increasing amounts of data.
Language Model Evaluation: Perplexity
As expected, when trained on the same amount of data, the language models trained on simple data perform significantly better than language models trained on normal data.
Language Model Evaluation: Perplexity
The perplexity for the simple-ALL+normal model, which starts with all available simple data, continues to improve as normal data is added, resulting in a 23% improvement over the model trained with only simple data (from a perplexity of 129 down to 100).
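A minimal sketch of the kind of comparison described above, assuming add-one-smoothed unigram models and toy corpora; the data, the interpolation weight, and the function names are illustrative stand-ins, not the paper's setup.

    import math
    from collections import Counter

    def unigram_probs(tokens, vocab, alpha=1.0):
        """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
        counts = Counter(tokens)
        total = len(tokens) + alpha * len(vocab)
        return {w: (counts[w] + alpha) / total for w in vocab}

    def perplexity(model, tokens):
        """Perplexity = exp of the average negative log-likelihood per token."""
        nll = -sum(math.log(model[w]) for w in tokens) / len(tokens)
        return math.exp(nll)

    # Toy stand-ins for the simplified-English and normal-English training sets.
    simple_train = "the cat sat on the mat the dog sat".split()
    normal_train = "the feline reclined upon the mat while the canine sat".split()
    heldout      = "the cat sat on the mat".split()

    vocab = set(simple_train) | set(normal_train) | set(heldout)
    p_simple = unigram_probs(simple_train, vocab)
    p_normal = unigram_probs(normal_train, vocab)

    lam = 0.7  # interpolation weight; illustrative only
    p_mix = {w: lam * p_simple[w] + (1 - lam) * p_normal[w] for w in vocab}

    for name, model in [("simple", p_simple), ("normal", p_normal), ("simple+normal", p_mix)]:
        print(name, round(perplexity(model, heldout), 2))

On this toy data the simple-only model scores best, mirroring the observation above that simple-trained models outperform normal-trained ones; reproducing the finding that adding normal data helps further requires the real corpora and higher-order models used in the paper.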
Related Work
Similarly, for summarization, systems have employed language models trained only on unsummarized text (Banko et al., 2000; Daume and Marcu, 2002).
model trained is mentioned in 22 sentences in this paper.
Visweswariah, Karthik and Khapra, Mitesh M. and Ramanathan, Ananthakrishnan
Generating reference reordering from parallel sentences
However, as we will see in the experimental results, the quality of a reordering model trained from automatic alignments is very sensitive to the quality of alignments.
Generating reference reordering from parallel sentences
In addition to the original source and target sentence, we also feed the predictions of the reordering model trained in Step 1 to this alignment model (see section 4.2 for details of the model itself).
Generating reference reordering from parallel sentences
Step 3: Finally, we use the predictions of the alignment model trained in Step 2 to train reordering models P(π | w_s, w_t, a) (see section 4.3 for details on the reordering model itself).
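An illustrative data-flow sketch of the three steps quoted above; the classes and function names are hypothetical stand-ins kept trivial so that only the control flow is shown.

    class DummyModel:
        """Placeholder for a trained reordering or alignment model."""
        def __init__(self, name):
            self.name = name
        def predict(self, data):
            # A real model would return reorderings or alignments for `data`.
            return [f"{self.name}-prediction" for _ in data]

    def train_reordering_model(parallel_data, alignments):
        return DummyModel("reordering")

    def train_alignment_model(parallel_data, reordering_predictions):
        return DummyModel("alignment")

    def pipeline(parallel_data, automatic_alignments):
        # Step 1: reordering model trained from automatic (machine) alignments.
        reorder_v1 = train_reordering_model(parallel_data, automatic_alignments)
        # Step 2: alignment model that is additionally fed reorder_v1's predictions.
        align_v2 = train_alignment_model(parallel_data, reorder_v1.predict(parallel_data))
        # Step 3: retrain the reordering model on the improved alignments.
        return train_reordering_model(parallel_data, align_v2.predict(parallel_data))

    pipeline([("src sentence", "tgt sentence")], ["0-0 1-1"])

The point of the loop is the sensitivity noted above: better reordering predictions give the alignment model more to work with in Step 2, and the improved alignments in turn give cleaner training data for the reordering model in Step 3.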
Introduction
A reordering model trained on such incorrect reorderings would obviously perform poorly.
Introduction
Our experiments show that reordering models trained using these improved machine alignments perform significantly better than models trained only on manual word alignments.
Introduction
This results in a 1.8 BLEU point gain in machine translation performance on an Urdu-English machine translation task over a preordering model trained using only manual word alignments.
Results and Discussions
Using fewer features: We compare the performance of a model trained using lexical features for all words (Column 2 of Table 1) with a model trained using lexical features only for the 1000 most frequent words (Column 3 of Table 1).
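A small sketch of the frequency cutoff being compared here: emit word-identity features only for the k most frequent words and back rare words off to a generic marker. The feature names and the tiny example are hypothetical; the paper's feature set is richer.

    from collections import Counter

    def build_frequent_vocab(sentences, k=1000):
        """Keep only the k most frequent words for lexicalised features."""
        counts = Counter(w.lower() for sent in sentences for w in sent)
        return {w for w, _ in counts.most_common(k)}

    def features(word, frequent_vocab):
        # Word-identity feature only for frequent words; rare words back off
        # to a generic marker, which shrinks the model considerably.
        feats = {"is_capitalised": word[0].isupper()}
        feats["word"] = word.lower() if word.lower() in frequent_vocab else "<RARE>"
        return feats

    sents = [["the", "model", "reorders", "Urdu", "words"],
             ["the", "model", "is", "trained", "on", "alignments"]]
    vocab = build_frequent_vocab(sents, k=3)
    print([features(w, vocab) for w in sents[0]])

Restricting lexical features to frequent words shrinks the model substantially; the comparison quoted above measures what that restriction costs in reordering quality.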
Results and Discussions
Table 2: mBLEU scores for Urdu to English reordering using models trained on different data sources and tested on a development set of 8017 Urdu tokens.
Results and Discussions
Table 3: mBLEU with different methods to generate reordering model training data from a machine aligned parallel corpus in addition to manual word alignments.
model trained is mentioned in 9 sentences in this paper.
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann
Experiments
The source-side data statistics for reordering model training are given in Table 2 (the target side has only nine labels).
Experiments
Table 2: tagging-style model training data statistics
Experiments
As we will show later, the models trained with both CRFs and RNN help to improve the translation quality.
Tagging-style Reordering Model
Once the model training is finished, we perform inference on the development and test corpora, which means that we obtain the labels of the source sentences that need to be translated.
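A minimal tagging-style sketch of the train-then-tag workflow described here, using sklearn_crfsuite purely for illustration; the paper trains its own CRF and RNN models, and the tag set and sentences below are made up.

    import sklearn_crfsuite

    def featurize(sentence):
        return [{"word": w.lower(), "suffix2": w[-2:]} for w in sentence]

    train_sents  = [["das", "Modell", "wird", "trainiert"]]
    train_labels = [["L", "L", "R", "R"]]          # illustrative reordering-style tags
    dev_sents    = [["das", "Modell", "hilft"]]

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
    crf.fit([featurize(s) for s in train_sents], train_labels)

    # Inference on the corpora that need to be translated: the predicted tags
    # become the reordering information available to the translation system.
    dev_tags = crf.predict([featurize(s) for s in dev_sents])
    print(dev_tags)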
model trained is mentioned in 5 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Markov Topic Regression - MTR
We use the word-tag posterior probabilities obtained from a CRF sequence model trained on labeled utterances as features.
Semi-Supervised Semantic Labeling
They decode unlabeled queries from target domain (t) using a CRF model trained on the POS-labeled newswire data (source domain (0)).
Semi-Supervised Semantic Labeling
They use a small value for τ to enable the new model to be as close as possible to the initial model trained on source data.
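One way to read the statement above is as a proximity penalty on the adapted weights; the sketch below uses the form ||w - w0||^2 / (2 * tau), which is my paraphrase of the idea rather than the paper's exact objective.

    import numpy as np

    def adapt(w0, grad_new_domain, tau, lr=0.005, steps=2000):
        """Gradient descent on: new-domain loss + ||w - w0||^2 / (2 * tau)."""
        w = w0.copy()
        for _ in range(steps):
            w -= lr * (grad_new_domain(w) + (w - w0) / tau)
        return w

    # Toy new-domain loss 0.5 * ||w - target||^2, whose gradient is (w - target).
    target = np.array([1.0, -1.0])
    grad = lambda w: w - target
    w0 = np.zeros(2)

    print(adapt(w0, grad, tau=0.01))   # small tau: stays close to the initial model w0
    print(adapt(w0, grad, tau=10.0))   # large tau: moves towards the new-domain optimum

With a small tau the penalty dominates and the adapted weights stay near the source-domain model, matching the behaviour described in the sentence above.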
model trained is mentioned in 3 sentences in this paper.
Lan, Man and Xu, Yu and Niu, Zhengyu
Abstract
However, a previous study (Sporleder and Lascarides, 2008) showed that models trained on these synthetic data do not generalize very well to natural (i.e., genuinely implicit) data.
Multitask Learning for Discourse Relation Prediction
However, Sporleder and Lascarides (2008) found that the model trained on synthetic implicit data did not perform as well as expected on natural implicit data.
Related Work
Unlike their previous work, our previous work (Zhou et al., 2010) presented a method to predict the missing connective based on a language model trained on an unannotated corpus.
model trained is mentioned in 3 sentences in this paper.
Roark, Brian and Allauzen, Cyril and Riley, Michael
Experimental results
For all results reported here, we use the SRILM toolkit for baseline model training and pruning, then convert from the resulting ARPA format model to an OpenFst format (Allauzen et al., 2007), as used in the OpenGrm n-gram library (Roark et al., 2012).
Experimental results
The model was trained on the 1996 and 1997 Hub4 acoustic model training sets (about 150 hours of data) using semi-tied covariance modeling and CMLLR-based speaker adaptive training and 4 iterations of boosted MMI.
Introduction
This is done via a marginal distribution constraint which requires the expected frequency of the lower-order n-grams to match their observed frequency in the training data, much as is commonly done for maximum entropy model training.
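Schematically, such a marginal distribution constraint can be written as follows, where h' is a lower-order history, v ranges over the words that extend it to a higher-order history, and \hat{p} is the empirical (relative-frequency) distribution; this is a paraphrase of the idea, not the paper's exact notation.

    \sum_{v} p(v\,h') \, p(w \mid v\,h') \;=\; \hat{p}(h'\,w)

In words: the model's expected frequency of the lower-order n-gram h'w, obtained by summing its higher-order probabilities over all extended histories v h', must reproduce that n-gram's observed frequency, which pins down the lower-order distribution.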
model trained is mentioned in 3 sentences in this paper.
Wang, Lu and Raghavan, Hema and Castelli, Vittorio and Florian, Radu and Cardie, Claire
Experimental Setup
Beam size is fixed at 2000. Sentence compressions are evaluated with a 5-gram language model trained on Gigaword (Graff, 2003) using SRILM (Stolcke, 2002).
Sentence Compression
As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation.
Sentence Compression
Given the N-best compressions from the decoder, we evaluate the yield of the trimmed trees using a language model trained on the Gigaword (Graff, 2003) corpus and return the compression with the highest probability.
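A minimal re-ranking sketch for this step: score each candidate compression with an n-gram language model and keep the highest-scoring one. kenlm and the model path are my substitutions for illustration; the paper uses a 5-gram SRILM model trained on Gigaword.

    import kenlm

    lm = kenlm.Model("gigaword.5gram.arpa")   # hypothetical path to a trained ARPA model

    def best_compression(candidates):
        # kenlm's score() returns a log10 probability; higher is better.
        return max(candidates, key=lm.score)

    nbest = [
        "the committee approved the bill on tuesday",
        "the committee , meeting late in the day , approved the bill on tuesday",
    ]
    print(best_compression(nbest))

Raw log-probabilities favour shorter strings, which suits compression, though in practice one might also length-normalise the scores before comparing candidates.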
model trained is mentioned in 3 sentences in this paper.
Zhang, Longkai and Li, Li and He, Zhengyan and Wang, Houfeng and Sun, Ni
Experiment
When segmenting texts from the target domain using models trained on the source domain, performance is hurt as more falsely segmented instances are added to the training set.
INTRODUCTION
These new features of micro-blogs make Chinese Word Segmentation (CWS) models trained on the source domain, such as a news corpus, fail to perform equally well when transferred to texts from micro-blogs.
Our method
Because of this, the model trained on this unbalanced corpus tends to be biased.
model trained is mentioned in 3 sentences in this paper.