Abstract | In this paper we investigate the effects of applying such a technique to higher-order n-gram models trained on large corpora. |
Distributed Clustering | The quality of class-based models trained using the resulting clusterings did not differ noticeably from those trained using clusterings for which the full vocabulary was considered in each iteration. |
Experiments | Table 1: BLEU scores of the Arabic-English system using models trained on the English en_target data set |
Experiments | We used these models in addition to a word-based 6-gram model created by combining models trained on all four English data sets. |
Experiments | Table 2 shows the BLEU scores of the machine translation system using only this word-based model, the scores after adding the class-based model trained on the en_target data set, and the scores when using all three models. |
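The combination described above scores each hypothesis with several language models at once. A minimal sketch of one common way to do this, linear interpolation of a word-based and a class-based model's probabilities (the weight `lam` here is illustrative, not the system's tuned value):

```python
import math

def interpolate(p_word_model, p_class_model, lam=0.7):
    """Return the log of the linearly interpolated probability
    lam * P_word(w | h) + (1 - lam) * P_class(w | h)."""
    return math.log(lam * p_word_model + (1 - lam) * p_class_model)
```

In a real SMT system the models would more likely enter as separate features of a log-linear model with weights tuned on a development set; the interpolation above is only the simplest combination scheme.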
Introduction | Class-based n-gram models have also been shown to benefit from their reduced number of parameters when scaling to higher-order n-grams (Goodman and Gao, 2000), and even despite the increasing size and decreasing sparsity of language model training corpora (Brants et al., 2007), class-based n-gram models might lead to improvements when increasing the n-gram order. |
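The parameter reduction comes from factoring the word n-gram probability through word classes. A minimal sketch of a class-based bigram model in this spirit, using a hypothetical toy clustering and corpus (all names and counts below are illustrative, not from the paper):

```python
from collections import defaultdict

# Hypothetical word -> class assignment (a toy stand-in for a learned clustering).
word2class = {"the": "DET", "a": "DET", "dog": "NOUN", "cat": "NOUN",
              "runs": "VERB", "sleeps": "VERB"}

corpus = ["the dog runs", "a cat sleeps", "the cat runs"]

class_bigram = defaultdict(int)   # count(c_prev, c)
class_unigram = defaultdict(int)  # count(c)
word_count = defaultdict(int)     # count(w)

for sent in corpus:
    words = sent.split()
    classes = [word2class[w] for w in words]
    for w, c in zip(words, classes):
        word_count[w] += 1
        class_unigram[c] += 1
    for c1, c2 in zip(classes, classes[1:]):
        class_bigram[(c1, c2)] += 1

def prob(w_prev, w):
    """P(w | w_prev) ~= P(c(w) | c(w_prev)) * P(w | c(w)).
    The model stores class transitions plus class membership
    probabilities instead of a full word-bigram table."""
    c_prev, c = word2class[w_prev], word2class[w]
    p_class = class_bigram[(c_prev, c)] / class_unigram[c_prev]
    p_word = word_count[w] / class_unigram[c]
    return p_class * p_word
```

With a vocabulary of V words and C classes (C much smaller than V), the model needs C^n class-transition parameters plus V membership parameters, rather than V^n word n-gram parameters.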
Introduction | We then show that using partially class-based language models trained using the resulting classifications together with word-based language models in a state-of-the-art statistical machine translation system yields improvements despite the very large size of the word-based models used. |
Conclusion | Additionally, our data-driven approach can be applied to any dimension that is meaningful to human judges, and it provides an elegant way to project multiple dimensions simultaneously, by including the relevant dimensions as features of the parameter models’ training data. |
Conclusion | In terms of our research questions in Section 3.1, we show that models trained on expert judges' ratings to project multiple traits in a single utterance generate utterances whose personality is recognized by naive judges. |
Evaluation Experiment | Q1: Is the personality projected by models trained on |
Introduction | Another thread investigates SNLG scoring models trained using higher-level linguistic features to replicate human judgments of utterance quality (Rambow et al., 2001; Nakatsu and White, 2006; Stent and Guo, 2005). |
Parameter Estimation Models | 2.3 Statistical Model Training |
Experiments | For example, the performance of the dep1c and dep2c models trained on 1k sentences is roughly the same as the performance of the dep1 and dep2 models, respectively, trained on 2k sentences. |
Experiments | For example, in scenario 1 the dep2c model trained on 1k sentences is close in performance to the dep1 model trained on 4k sentences, and the dep2c model trained on 4k sentences is close to the dep1 model trained on the entire training set (roughly 40k sentences). |
Experiments | For example, the dep1c model trained on 4k sentences is roughly as good as the dep1 model trained on 8k sentences. |
Experiments | In the experiments, the language model is a Chinese 5-gram language model trained on the Chinese part of the LDC parallel corpus and the Xinhua portion of the Chinese Gigaword corpus, about 27 million words in total. |
Experiments | The Bi-ME model is trained on the FBIS corpus, which is smaller than the corpus used for Mo-ME model training. |
Experiments | We can see that the Bi-ME model achieves better results than the Mo-ME model in both recall and precision, although only a small bilingual corpus is used for Bi-ME model training. |
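The recall and precision comparison above can be made concrete with the standard set-based definitions (the example sets below are illustrative, not the paper's actual outputs):

```python
def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a gold set.
    precision = |predicted ∩ gold| / |predicted|
    recall    = |predicted ∩ gold| / |gold|"""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall
```

A model trained on a smaller corpus can still win on both metrics if the extra signal it uses (here, bilingual evidence) makes its individual predictions more reliable.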