Index of papers in Proc. ACL 2014 that mention
  • BLEU points
Saluja, Avneesh and Hassan, Hany and Toutanova, Kristina and Quirk, Chris
Abstract
Our proposed approach significantly improves the performance of competitive phrase-based systems, leading to consistent improvements between 1 and 4 BLEU points on standard evaluation sets.
Conclusion
In this work, we presented an approach that can expand a translation model extracted from a sentence-aligned, bilingual corpus using a large amount of unstructured, monolingual data in both source and target languages, which leads to improvements of 1.4 and 1.2 BLEU points over strong baselines on evaluation sets, and in some scenarios gains in excess of 4 BLEU points.
Evaluation
In the “HalfMono” setting, we use only half of the monolingual comparable corpora, and still obtain an improvement of 0.56 BLEU points, indicating that adding more monolingual data is likely to improve the system further.
Evaluation
In the first setup, we get a huge improvement of 4.2 BLEU points (“SLP+Noisy”) when using the monolingual data and the noisy parallel data for graph construction.
Evaluation
Furthermore, despite completely unaligned, non-comparable monolingual text on the Urdu and English sides, and a very large language model, we can still achieve gains in excess of 1.2 BLEU points (“SLP”) in a difficult evaluation scenario, which shows that the technique adds a genuine translation improvement over and above naïve memorization of n-gram sequences.
Introduction
This enhancement alone results in an improvement of almost 1.4 BLEU points.
Introduction
We evaluated the proposed approach on both Arabic-English and Urdu-English under a range of scenarios (§3), varying the amount and type of monolingual corpora used, and obtained improvements between 1 and 4 BLEU points, even when using very large language models.
“BLEU points” is mentioned in 7 sentences in this paper.
Liu, Le and Hong, Yu and Liu, Hao and Wang, Xing and Yao, Jianmin
Abstract
When the selected sentence pairs are evaluated on an end-to-end MT task, our methods can increase the translation performance by 3 BLEU points.
Conclusion
Compared with the methods which only employ a language model for data selection, we observe that our methods are able to select high-quality domain-relevant sentence pairs and improve the translation performance by nearly 3 BLEU points.
Experiments
The results show that the general-domain system trained on a larger amount of bilingual resources outperforms the system trained on the in-domain corpus by over 12 BLEU points.
Experiments
In the end-to-end SMT evaluation, TM selects only the top 600k sentence pairs of the general-domain corpus, but still increases the translation performance by 2.7 BLEU points.
Experiments
Meanwhile, TM+LM and Bidirectional TM+LM achieve improvements of 3.66 and 3.56 BLEU points, respectively, over the general-domain baseline system.
“BLEU points” is mentioned in 6 sentences in this paper.
Salameh, Mohammad and Cherry, Colin and Kondrak, Grzegorz
Conclusion
When applied to English-to-Arabic translation, lattice desegmentation results in a 1.0 BLEU point improvement over one-best desegmentation, and a 1.7 BLEU point improvement over unsegmented translation.
Results
For English-to-Arabic, 1-best desegmentation results in a 0.7 BLEU point improvement over training on unsegmented Arabic.
Results
Moving to lattice desegmentation more than doubles that improvement, resulting in a BLEU score of 34.4 and an improvement of 1.0 BLEU point over 1-best desegmentation.
Results
1000-best desegmentation also works well, resulting in a 0.6 BLEU point improvement over 1-best.
“BLEU points” is mentioned in 4 sentences in this paper.
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Experiments and Results
When we remove it from R2NN, the WEPPE-based method drops about 10 BLEU points on the development data and more than 6 BLEU points on the test data.
Experiments and Results
The TCBPPE-based method drops about 3 BLEU points on both the development and test data sets.
Introduction
We conduct experiments on a Chinese-to-English translation task to test our proposed methods, and we obtain an improvement of about 1.5 BLEU points over a state-of-the-art baseline system.
“BLEU points” is mentioned in 3 sentences in this paper.
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Abstract
On two Chinese-English tasks, our semi-supervised DAE features obtain statistically significant improvements of 1.34/2.45 (IWSLT) and 0.82/1.52 (NIST) BLEU points over the unsupervised DBN features and the baseline features, respectively.
Conclusions
The results also demonstrate that DNN (DAE and HCDAE) features are complementary to the original features for SMT, and adding them together obtains statistically significant improvements of 3.16 (IWSLT) and 2.06 (NIST) BLEU points over the baseline features.
Experiments and Results
Adding new DNN features as extra features significantly improves translation accuracy (rows 2-17 vs. row 1), with the highest increases of 2.45 (IWSLT) and 1.52 (NIST) BLEU points (row 14 vs. row 1) over the baseline features.
“BLEU points” is mentioned in 3 sentences in this paper.
Xiong, Deyi and Zhang, Min
Experiments
Our sense-based translation model achieves a substantial improvement of 1.2 BLEU points over the baseline.
Experiments
If we only integrate sense features into the sense-based translation model, we can still outperform the baseline by 0.62 BLEU points.
Experiments
From the table, we can see that the sense-based translation model outperforms the reformulated WSD by 0.57 BLEU points.
“BLEU points” is mentioned in 3 sentences in this paper.