AL-SMT: Multilingual Setting | The nonnegative weights α_d reflect the importance of the different translation tasks, and Σ_d α_d = 1. The AL-SMT formulation for a single language pair is a special case of this formulation in which only one of the α_d's in the objective function (1) is one and the rest are zero.
AL-SMT: Multilingual Setting | The formulation in Sec. 2.1 for AL in the multilingual setting includes the single language pair setting as a special case (Haffari et al., 2009).
AL-SMT: Multilingual Setting | For a single language pair, we use U and L.
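The weighted multilingual objective described above can be sketched as follows; the function name and the numeric scores are hypothetical stand-ins for the per-language-pair quality terms in the paper's objective (1):

```python
# Sketch of the multilingual AL-SMT objective: a convex combination of
# per-language-pair scores, with nonnegative weights alpha_d summing to 1.
# Setting one weight to 1 and the rest to 0 recovers the single-pair case.

def multilingual_objective(scores, alphas):
    """Weighted sum over D translation tasks.

    scores: per-language-pair quality scores (hypothetical values).
    alphas: nonnegative task weights, assumed to sum to 1.
    """
    assert all(a >= 0 for a in alphas), "weights must be nonnegative"
    assert abs(sum(alphas) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(a * s for a, s in zip(alphas, scores))

# Single-language-pair special case: alpha = (0, 1, 0) selects task 1.
scores = [0.30, 0.42, 0.25]
print(multilingual_objective(scores, [0.0, 1.0, 0.0]))  # -> 0.42
```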
Abstract | We also provide new, highly effective sentence selection methods that improve AL for phrase-based SMT in both the multilingual and single language pair settings.
Introduction | The multilingual setting provides new opportunities for AL over and above a single language pair.
Introduction | In our case, the multiple tasks are individual machine translation tasks for several language pairs.
Introduction | languages to the new language depending on the characteristics of each source-target language pair; hence these tasks compete for annotating the same resource.
Abstract | Our experiments show speedups from MERT and MBR as well as performance improvements from MBR decoding on several language pairs.
Discussion | This may not be optimal in practice for unseen test sets and language pairs, and the resulting linear loss may be quite different from the corpus level BLEU.
Discussion | On an experiment with 40 language pairs, we obtain improvements on 26 pairs, no difference on 8 pairs, and drops on 5 pairs.
Discussion | This was achieved without any need for manual tuning for each language pair.
Experiments | We report results on the nist03 set and present three systems for each language pair: phrase-based (pb), hierarchical (hier), and SAMT; lattice MBR is done for the phrase-based system while HGMBR is used for the other two.
Experiments | For the multi-language case, we train phrase-based systems and perform lattice MBR for all language pairs.
Experiments | When we optimize MBR features with MERT, the number of language pairs with gains/no changes/drops is 22/5/12.
Abstract | PANDICTIONARY contains more than four times as many translations as the largest Wiktionary at precision 0.90, and over 200,000,000 pairwise translations in over 200,000 language pairs at precision 0.8.
Empirical Evaluation | Such people are hard to find and may not even exist for many language pairs (e.g., Basque and Maori).
Empirical Evaluation | For this study we tagged 7 language pairs: Hindi-Hebrew,
Introduction and Motivation | PANDICTIONARY, that could serve as a resource for translation systems operating over a very broad set of language pairs.
Introduction and Motivation | PANDICTIONARY currently contains over 200 million pairwise translations in over 200,000 language pairs at precision 0.8. |
Introduction and Motivation | We describe the design and construction of PANDICTIONARY—a novel lexical resource that spans over 200 million pairwise translations in over 200,000 language pairs at 0.8 precision, a fourfold increase when compared to the union of its input translation dictionaries.
Related Work | lingual corpora, which may scale to several language pairs in future (Haghighi et al., 2008). |
Computing Feature Expectations | (2008) showed that most of the improvement from lattice-based consensus decoding comes from lattice-based expectations, not search: searching over lattices instead of k-best lists did not change results for two language pairs, and improved a third language pair by 0.3 BLEU. |
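To make the consensus-decoding contrast concrete, here is a minimal MBR sketch over a k-best list, the baseline that lattice-based expectations improve on. The toy unigram-overlap gain stands in for sentence-level BLEU, and all names and data are illustrative:

```python
from collections import Counter

def gain(hyp, ref):
    """Toy unigram-overlap gain standing in for sentence BLEU."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    return sum((h & r).values())  # multiset intersection size

def mbr_decode(kbest):
    """Pick the hypothesis maximizing expected gain under the model
    posterior, estimated from a k-best list of (sentence, prob) pairs."""
    def expected_gain(hyp):
        return sum(p * gain(hyp, ref) for ref, p in kbest)
    return max((h for h, _ in kbest), key=expected_gain)

kbest = [("the cat sat", 0.5), ("the cat sits", 0.3), ("a cat sat", 0.2)]
print(mbr_decode(kbest))  # -> "the cat sat"
```

Lattice-based decoding replaces the k-best sum with expectations computed over the full lattice or hypergraph, which is where the cited gains come from.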
Experimental Results | Despite this optimization, our new Algorithm 3 was an average of 80 times faster across systems and language pairs.
Introduction | We also show that using forests outperforms using k-best lists consistently across language pairs . |
Introduction | Unfortunately, large quantities of parallel data are not readily available for some language pairs, limiting the potential use of current SMT systems.
Introduction | It is especially difficult to obtain such a domain-specific corpus for some language pairs, such as Chinese-to-Spanish translation.
Using RBMT Systems for Pivot Translation | For many source-target language pairs, commercial pivot-source and/or pivot-target RBMT systems are available on the market.
Discussion | Zeman and Resnik (2008) assumed that the morphology and syntax of the language pair should be very similar, which holds for the pair they considered: Danish and Swedish, two closely related North European languages.
Introduction | What our method relies on is not the close relation of the chosen language pair but the similarity of the two treebanks; this is the main difference from previous work.
The Related Work | Because fewer language-specific properties are involved, our approach is more readily extended to other language pairs than theirs.