Abstract | Modern automated lexicon generation methods usually require parallel corpora, which are not available for most language pairs. |
Introduction | However, for most language pairs parallel bilingual corpora either do not exist or are at best small and unrepresentative of the general language. |
Introduction | Pivot language approaches deal with the scarcity of bilingual data for most language pairs by relying on the availability of bilingual data for each of the languages in question with a third, pivot, language. |
Lexicon Generation Experiments | We chose a language pair for which basically no parallel corpora exist, and whose languages do not share ancestry or a writing system in a way that can provide cues for alignment. |
Lexicon Generation Experiments | These considerations lead us to believe that our choice of language pair is more challenging than, for example, a pair of European languages. |
NAS Score Properties | For other language pairs lemmatization may be needed. |
Previous Work | The limited availability of parallel corpora of sufficient size for most language pairs restricts the usefulness of these methods. |
Previous Work | (2009) used many input bilingual lexicons to create bilingual lexicons for new language pairs. |
Experiment | Three human annotators who are fluent in the two languages manually annotated N-to-N sentence alignments for each language pair (KR-EN, KR-CH, KR-JP). |
Experiment | By keeping only the sentence chunks whose Korean chunk appears in all language pairs, we were left with 859 sentence chunk pairs. |
Experiment | The subjectivity analysis systems are evaluated on all language pairs using kappa and Pearson’s correlation coefficients. |
Abstract | This indicates that transliteration is useful for more than only translating OOV words for language pairs like Hindi-Urdu. |
Conclusion | In closely related language pairs such as Hindi-Urdu with a significant amount of vocabulary overlap, |
Evaluation | The difference of 2.35 BLEU points between M1 and Pbl indicates that transliteration is useful for more than only translating OOV words for language pairs like Hindi-Urdu. |
Abstract | We present a novel scheme to apply factored phrase-based SMT to a language pair with very disparate morphological structures. |
Experimental Setup and Results | The experience with MERT for this language pair has not been very positive. |
Experimental Setup and Results | In order to alleviate the lack of large-scale parallel corpora for the English-Turkish language pair, we experimented with augmenting the training data with reliable phrase pairs obtained from a previous alignment. |