Abstract | Our proposed approach significantly improves the performance of competitive phrase-based systems, leading to consistent improvements between 1 and 4 BLEU points on standard evaluation sets. |
Conclusion | In this work, we presented an approach that can expand a translation model extracted from a sentence-aligned, bilingual corpus using a large amount of unstructured, monolingual data in both source and target languages, which leads to improvements of 1.4 and 1.2 BLEU points over strong baselines on evaluation sets, and in some scenarios gains in excess of 4 BLEU points. |
Evaluation | “HalfMono”, we use only half of the monolingual comparable corpora, and still obtain an improvement of 0.56 BLEU points, indicating that adding more monolingual data is likely to improve the system further. |
Evaluation | In the first setup, we get a huge improvement of 4.2 BLEU points (“SLP+Noisy”) when using the monolingual data and the noisy parallel data for graph construction. |
Evaluation | Furthermore, despite completely unaligned, non-comparable monolingual text on the Urdu and English sides, and a very large language model, we can still achieve gains in excess of 1.2 BLEU points (“SLP”) in a difficult evaluation scenario, which shows that the technique adds a genuine translation improvement over and above naïve memorization of n-gram sequences. |
Introduction | This enhancement alone results in an improvement of almost 1.4 BLEU points. |
Introduction | We evaluated the proposed approach on both Arabic-English and Urdu-English under a range of scenarios (§3), varying the amount and type of monolingual corpora used, and obtained improvements between 1 and 4 BLEU points, even when using very large language models. |
Abstract | When the selected sentence pairs are evaluated on an end-to-end MT task, our methods can increase the translation performance by 3 BLEU points. |
Conclusion | Compared with methods that employ only a language model for data selection, we observe that our methods are able to select high-quality domain-relevant sentence pairs and improve the translation performance by nearly 3 BLEU points. |
Experiments | The results show that General-domain system trained on a larger amount of bilingual resources outperforms the system trained on the in-domain corpus by over 12 BLEU points. |
Experiments | In the end-to-end SMT evaluation, TM selects top 600k sentence pairs of general-domain corpus, but increases the translation performance by 2.7 BLEU points. |
Experiments | Meanwhile, the TM+LM and Bidirectional TM+LM have gained 3.66 and 3.56 BLEU point improvements compared against the general-domain baseline system. |
Conclusion | When applied to English-to-Arabic translation, lattice desegmentation results in a 1.0 BLEU point improvement over one-best desegmentation, and a 1.7 BLEU point improvement over unsegmented translation. |
Results | For English-to-Arabic, 1-best desegmentation results in a 0.7 BLEU point improvement over training on unsegmented Arabic. |
Results | Moving to lattice desegmentation more than doubles that improvement, resulting in a BLEU score of 34.4 and an improvement of 1.0 BLEU point over 1-best desegmentation. |
Results | 1000-best desegmentation also works well, resulting in a 0.6 BLEU point improvement over 1-best. |
Experiments and Results | When we remove it from RZNN, WEPPE based method drops about 10 BLEU points on development data and more than 6 BLEU points on test data. |
Experiments and Results | TCBPPE based method drops about 3 BLEU points on both development and test data sets. |
Introduction | We conduct experiments on a Chinese-to-English translation task to test our proposed methods, and we obtain an improvement of about 1.5 BLEU points over a state-of-the-art baseline system. |
Abstract | On two Chinese-English tasks, our semi-supervised DAE features obtain statistically significant improvements of 1.34/2.45 (IWSLT) and 0.82/1.52 (NIST) BLEU points over the unsupervised DBN features and the baseline features, respectively. |
Conclusions | The results also demonstrate that DNN (DAE and HCDAE) features are complementary to the original features for SMT, and adding them together obtains statistically significant improvements of 3.16 (IWSLT) and 2.06 (NIST) BLEU points over the baseline features. |
Experiments and Results | Adding new DNN features as extra features significantly improves translation accuracy (row 2-17 vs. 1), with the highest increase of 2.45 (IWSLT) and 1.52 (NIST) (row 14 vs. 1) BLEU points over the baseline features. |
Experiments | Our sense-based translation model achieves a substantial improvement of 1.2 BLEU points over the baseline. |
Experiments | If we only integrate sense features into the sense-based translation model, we can still outperform the baseline by 0.62 BLEU points. |
Experiments | From the table, we can find that the sense-based translation model outperforms the reformulated WSD by 0.57 BLEU points. |
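
The English-to-Arabic desegmentation rows above give one absolute score (34.4 BLEU for lattice desegmentation) and several relative gains. The sketch below checks that these figures agree with one another. It assumes every gain is a plain difference of BLEU scores on the same test set. The implied scores for the other systems are derived here, not reported in the excerpts, and may differ from the papers' tables by rounding. All names in the code are hypothetical labels chosen for illustration.

```python
# Figures quoted in the desegmentation excerpts (BLEU points).
LATTICE_BLEU = 34.4               # reported absolute score
GAIN_LATTICE_OVER_1BEST = 1.0     # lattice vs. 1-best desegmentation
GAIN_1BEST_OVER_UNSEG = 0.7       # 1-best vs. unsegmented training
GAIN_1000BEST_OVER_1BEST = 0.6    # 1000-best vs. 1-best


def implied_scores():
    """Derive the absolute scores implied by the quoted gains.

    Assumption: each gain is a simple score difference on one test set.
    """
    one_best = round(LATTICE_BLEU - GAIN_LATTICE_OVER_1BEST, 1)
    unsegmented = round(one_best - GAIN_1BEST_OVER_UNSEG, 1)
    thousand_best = round(one_best + GAIN_1000BEST_OVER_1BEST, 1)
    return {
        "unsegmented": unsegmented,
        "one_best": one_best,
        "thousand_best": thousand_best,
        "lattice": LATTICE_BLEU,
    }


scores = implied_scores()

# Total gain of lattice desegmentation over unsegmented translation.
lattice_over_unseg = round(scores["lattice"] - scores["unsegmented"], 1)

# "More than doubles" holds if the total gain exceeds twice the 1-best gain.
more_than_doubles = lattice_over_unseg > 2 * GAIN_1BEST_OVER_UNSEG

for name, value in scores.items():
    print(f"{name:>14}: {value:.1f} BLEU")
print(f"lattice over unsegmented: +{lattice_over_unseg:.1f}")
print(f"more than doubles the 1-best gain: {more_than_doubles}")
```

Under that assumption, the derived total gain matches the 1.7 BLEU point improvement over unsegmented translation quoted in the Conclusion row.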