Index of papers in Proc. ACL 2012 that mention
  • language pairs
Chen, Boxing and Kuhn, Roland and Larkin, Samuel
Abstract
We compare PORT-tuned MT systems to BLEU-tuned baselines in five experimental conditions involving four language pairs.
BLEU and PORT
For our experiments, we tuned α on Chinese-English data, setting it to 0.25 and keeping this value for the other language pairs.
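The excerpts do not reproduce PORT's definition, so the following is only a minimal illustrative sketch of what a free parameter like α can do: weight precision against recall in a harmonic-mean-style combination, with the fixed value 0.25 quoted above. The function name weighted_pr is hypothetical, and this is not PORT's actual formula.

```python
# Hypothetical sketch of an alpha-weighted precision/recall combination,
# illustrating the kind of free parameter the excerpt describes.
# This is NOT PORT's actual definition, which is given in the paper itself.

def weighted_pr(precision: float, recall: float, alpha: float = 0.25) -> float:
    """Weighted harmonic mean of precision and recall.

    alpha weights precision against recall; alpha = 0.25 matches the
    value the authors report fixing on Chinese-English data.
    """
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return precision * recall / (alpha * precision + (1.0 - alpha) * recall)

# Example: precision 0.6, recall 0.4, alpha fixed at 0.25
print(weighted_pr(0.6, 0.4, alpha=0.25))  # ~0.533
```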
Conclusions
Most importantly, our results show that PORT-tuned MT systems yield better translations than BLEU-tuned systems on several language pairs, according to both automatic metrics and human evaluations.
Conclusions
In future work, we plan to tune the free parameter α for each language pair.
Experiments
Most WMT submissions involve language pairs with similar word order, so the ordering factor v in PORT won’t play a big role.
Experiments
In internal tests we have found no systematic difference in dev-set BLEUs, so we speculate that PORT’s emphasis on reordering yields models that generalize better for these two language pairs.
Experiments
Of the Table 5 language pairs, the one where PORT tuning helps most has the lowest BLEU in Table 4 (German-English); the one where it helps least in Table 5 has the highest BLEU in Table 4 (French-English).
Introduction
However, since PORT is designed for tuning, the most important results are those showing that PORT tuning yields systems with better translations than those produced by BLEU tuning, both as determined by automatic metrics (including BLEU) and according to human judgment, as applied to five data conditions involving four language pairs.
"language pairs" is mentioned in 11 sentences in this paper.
Neubig, Graham and Watanabe, Taro and Mori, Shinsuke and Kawahara, Tatsuya
Abstract
In an evaluation, we demonstrate that character-based translation can achieve results comparable to word-based systems while effectively translating unknown and uncommon words across several language pairs.
Experiments
In order to test the effectiveness of character-based translation, we performed experiments over a variety of language pairs and experimental settings.
Experiments
As previous research has shown that it is more difficult to translate into morphologically rich languages than into English (Koehn, 2005), we perform experiments translating in both directions for all language pairs.
Experiments
This confirms that character-based translation performs well on languages that have long words or ambiguous boundaries, and less well on language pairs with a relatively strong one-to-one correspondence between words.
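As a minimal illustration of what "character-based translation" implies for the data, the sketch below re-tokenizes text into single-character tokens, the preprocessing under which an aligner and translation model operate on characters rather than words. It is a sketch of the data format only, not of the alignment techniques this paper proposes; to_character_tokens and the space-marker convention are hypothetical.

```python
# Minimal sketch of the preprocessing step behind character-based SMT:
# split each sentence into single-character tokens so that downstream
# alignment and translation operate on characters instead of words.
# Original spaces are kept as an explicit marker token so word
# boundaries can be restored after translation.

def to_character_tokens(sentence: str, space_marker: str = "_") -> str:
    """Split a sentence into space-separated character tokens."""
    chars = [space_marker if ch == " " else ch for ch in sentence.strip()]
    return " ".join(chars)

print(to_character_tokens("la casa azul"))
# l a _ c a s a _ a z u l
```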
Introduction
This method is attractive, as it is theoretically able to handle all sparsity phenomena in a single unified framework, but it has only been shown feasible between similar language pairs such as Spanish-Catalan (Vilar et al., 2007), Swedish-Norwegian (Tiedemann, 2009), and Thai-Lao (Sornlertlamvanich et al., 2008), which have a strong co-occurrence between single characters.
Introduction
As Vilar et al. (2007) state and we confirm, accurate translations cannot be achieved when applying traditional translation techniques to character-based translation for less similar language pairs.
Introduction
An evaluation on four language pairs with differing morphological properties shows that for distant language pairs, character-based SMT can achieve translation accuracy comparable to word-based systems.
Related Work on Data Sparsity in SMT
However, while the approach is conceptually attractive, previous research has shown it effective only for closely related language pairs (Vilar et al., 2007; Tiedemann, 2009; Sornlertlamvanich et al., 2008).
Related Work on Data Sparsity in SMT
In this work, we propose effective alignment techniques that allow character-based translation to achieve accurate translation results for both close and distant language pairs.
"language pairs" is mentioned in 10 sentences in this paper.
Vaswani, Ashish and Huang, Liang and Chiang, David
Conclusion
The method is implemented as a modification to the open-source toolkit GIZA++, and we have shown that it significantly improves translation quality across four different language pairs .
Experiments
For each language pair, we extracted grammar rules from the same data that were used for word alignment.
Introduction
These models are unsupervised, making them applicable to any language pair for which parallel text is available.
Introduction
Although manually-aligned data is very valuable, it is only available for a small number of language pairs.
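To make "unsupervised, applicable to any language pair for which parallel text is available" concrete, here is a compact sketch of EM training for IBM Model 1, the simplest member of the model family that GIZA++ implements. It shows only the vanilla baseline, not this paper's modification; train_model1 and the toy bitext are illustrative.

```python
from collections import defaultdict

# Compact EM for IBM Model 1, the simplest of the model family GIZA++
# implements. Unsupervised: it needs only sentence-aligned parallel
# text, hence its applicability to any language pair with such data.

def train_model1(bitext, iterations=10):
    """bitext: list of (source_tokens, target_tokens) pairs."""
    t = defaultdict(lambda: 1.0)  # uniform init over co-occurring pairs
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in bitext:
            for f in tgt:
                z = sum(t[(e, f)] for e in src)  # normalize over alignments
                for e in src:
                    c = t[(e, f)] / z  # expected count of e aligning to f
                    count[(e, f)] += c
                    total[e] += c
        # M-step: re-estimate t(f|e) from expected counts.
        t = defaultdict(float, {ef: count[ef] / total[ef[0]] for ef in count})
    return t

bitext = [("the house".split(), "la casa".split()),
          ("the blue house".split(), "la casa azul".split()),
          ("blue".split(), "azul".split())]
t = train_model1(bitext)
print(round(t[("house", "casa")], 3))  # high probability after EM
```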
"language pairs" is mentioned in 4 sentences in this paper.
Kolachina, Prasanth and Cancedda, Nicola and Dymetman, Marc and Venkatapathy, Sriram
Inferring a learning curve from mostly monolingual data
For each configuration (combination of language pair and domain) and test set in Table 2, a gold curve is fitted using the selected tri-parameter power-law family on a fine grid of corpus sizes.
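A tri-parameter power-law fit of the kind described here can be sketched with a standard least-squares routine. The specific form y = c - a * x**(-b) and the sample (corpus size, BLEU) points below are assumptions for illustration; the paper selects its own family from six candidates.

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch of fitting a tri-parameter power-law learning curve, as the
# excerpt describes for the gold curves. The family y = c - a * x**(-b)
# is a common three-parameter choice; treat this exact form and the
# data points below as illustrative assumptions.

def power_law(x, c, a, b):
    """Score approaches the ceiling c as corpus size x grows."""
    return c - a * np.power(x, -b)

# Hypothetical (corpus size in sentences, BLEU) measurements.
sizes = np.array([10_000, 20_000, 50_000, 100_000, 200_000], dtype=float)
bleu = np.array([18.1, 20.3, 22.8, 24.1, 25.0])

params, _ = curve_fit(power_law, sizes, bleu, p0=[30.0, 100.0, 0.5],
                      maxfev=10_000)
c, a, b = params
print(f"ceiling={c:.1f}, predicted BLEU at 500k: "
      f"{power_law(500_000, c, a, b):.1f}")
```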
Introduction
Our experiments involve 30 distinct language pair and domain combinations and 96 different learning curves.
Selecting a parametric family of curves
for all six families on a test dataset for the English-German language pair.
"language pairs" is mentioned in 3 sentences in this paper.