Abstract | Automatic word alignment is a key step in training statistical machine translation systems. |
Abstract | Despite much recent work on word alignment methods, increases in alignment accuracy often yield little or no improvement in machine translation quality. |
Conclusions | To our knowledge, this is the first extensive evaluation in which improvements in alignment accuracy lead to improvements in machine translation performance.
Introduction | The typical pipeline for a machine translation (MT) system starts with a parallel sentence-aligned corpus and proceeds to align the words in every sentence pair. |
Introduction | Our contribution is a large-scale evaluation of this methodology for word alignments: an investigation of how the produced alignments differ, and of how they can be used to consistently improve machine translation performance (as measured by BLEU score) across many languages, on training corpora of up to a hundred thousand sentences.
Introduction | Section 5 explores how the new alignments lead to consistent and significant improvements in a state-of-the-art phrase-based machine translation system when posterior decoding is used rather than Viterbi decoding.
Phrase-based machine translation | In particular, we fix a state-of-the-art machine translation system and measure its performance when we vary the supplied word alignments.
Phrase-based machine translation | The baseline system uses GIZA Model 4 alignments and the open-source Moses phrase-based machine translation toolkit, and performed close to the best in last year's competition.
Phrase-based machine translation | In addition to the Hansards corpus and the Europarl English-Spanish corpus, we used four other corpora for the machine translation experiments. |
Word alignment results | A natural question is how to tune the threshold in order to improve machine translation quality. |
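At its simplest, posterior decoding with a threshold reduces to keeping every alignment link whose posterior probability clears the threshold, so the threshold directly trades precision against recall. A minimal sketch, assuming link posteriors are already available as a dictionary (all names and numbers here are illustrative, not the system's actual values):

```python
def select_links(posteriors, threshold):
    """Keep every source-target link whose posterior clears the threshold."""
    return {link for link, p in posteriors.items() if p >= threshold}

# Toy posteriors for a 2x2 sentence pair (illustrative numbers only).
post = {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.4, (1, 1): 0.7}

# A higher threshold keeps fewer links (higher precision, lower recall),
# which is why the threshold is a natural tuning knob.
assert select_links(post, 0.5) == {(0, 0), (1, 1)}
assert select_links(post, 0.3) == {(0, 0), (1, 0), (1, 1)}
```

Unlike Viterbi decoding, such thresholding can produce many-to-many alignments or leave words unaligned, which changes the phrase pairs that are later extracted.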
Word alignment results | In the next section we evaluate and compare the effects of the different alignments in a phrase based machine translation system. |
Abstract | We show that combining them with word-based n-gram models in the log-linear model of a state-of-the-art statistical machine translation system leads to improvements in translation quality as indicated by the BLEU score.
Experiments | We use the distributed training and application infrastructure described in (Brants et al., 2007) with modifications to allow the training of predictive class-based models and their application in the decoder of the machine translation system. |
Experiments | Instead we report BLEU scores (Papineni et al., 2002) of the machine translation system using different combinations of word- and class-based models for translation tasks from English to Arabic and Arabic to English. |
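BLEU (Papineni et al., 2002) combines modified n-gram precision with a brevity penalty. A minimal single-reference sketch, restricted to bigrams for brevity (real BLEU uses up to 4-grams and aggregates counts over the whole corpus):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    """Geometric mean of modified n-gram precisions times a brevity penalty
    (single reference, up to max_n-grams; a toy sketch, not full BLEU)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())  # clipped counts
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    bp = 1.0 if len(candidate) > len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "the cat sat on the mat".split()
assert bleu(ref, ref) == 1.0           # exact match scores 1
assert bleu("the cat".split(), ref) < 1.0  # short output is penalized
```

The brevity penalty is what prevents a system from gaming precision by emitting only its most confident words.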
Experiments | A fourth data set, en_web, was used together with the other three data sets to train the large word-based model used in the second machine translation experiment.
Introduction | However, in the area of statistical machine translation, especially in the context of large training corpora, fewer experiments with class-based n-gram models have been performed, with mixed success (Raab, 2006).
Introduction | We then show that using partially class-based language models trained using the resulting classifications together with word-based language models in a state-of-the-art statistical machine translation system yields improvements despite the very large size of the word-based models used. |
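In a standard two-sided class-based model, the word bigram probability is factored through classes: p(w_i | w_{i-1}) ≈ p(c(w_i) | c(w_{i-1})) · p(w_i | c(w_i)), which is what lets the model share statistics across rare words. A toy sketch of that factorization (all tables and class names here are invented for illustration):

```python
def class_bigram_prob(w_prev, w, word2class, class_bigram, emit):
    """p(w | w_prev) ~= p(class(w) | class(w_prev)) * p(w | class(w)).
    All lookup tables are illustrative toy data."""
    c_prev, c = word2class[w_prev], word2class[w]
    return class_bigram[(c_prev, c)] * emit[(c, w)]

word2class = {"paris": "CITY", "london": "CITY", "in": "FUNC"}
class_bigram = {("FUNC", "CITY"): 0.3}
emit = {("CITY", "paris"): 0.5, ("CITY", "london"): 0.5}

# Both city names get the same probability after "in", even if one of
# them was rare in training: the class transition does the sharing.
p = class_bigram_prob("in", "paris", word2class, class_bigram, emit)
assert abs(p - 0.15) < 1e-12
```

A partially class-based model then interpolates (or log-linearly combines) this estimate with the ordinary word-based n-gram probability.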
Abstract | We demonstrate the space-savings of the scheme via machine translation experiments within a distributed language modeling framework. |
Experimental Setup | 4.3 Machine Translation |
Experiments | 5.3 Machine Translation |
Introduction | Language models (LMs) are a core component in statistical machine translation, speech recognition, optical character recognition and many other areas.
Introduction | Efficiency is paramount in applications such as machine translation which make huge numbers of LM requests per sentence. |
Introduction | This paper focuses on machine translation.
Scaling Language Models | In statistical machine translation (SMT), LMs are used to score candidate translations in the target language. |
Abstract | We present a method to transliterate names in the framework of end-to-end statistical machine translation.
Discussion | We have shown that a state-of-the-art statistical machine translation system can benefit from a dedicated transliteration module to improve the translation.
Discussion | Improved named-entity translation accuracy as measured by the NEWA metric in general, and a reduction in dropped names in particular, are clearly valuable both to the human reader of machine-translated documents and to systems using machine translation for further information processing.
End-to-End results | Finally, here are end-to-end machine translation results for three sentences, with and without the transliteration module, along with a human reference translation. |
Introduction | State-of-the-art statistical machine translation (SMT) is bad at translating names that are not very common, particularly across languages with different character sets and sound systems. |
Introduction | This evaluation involves a mixture of entity identification and translation concerns; for example, the scoring system asks for coreference determination, which may or may not be of interest for improving machine translation output.
Abstract | Conventional statistical machine translation (SMT) systems do not perform well on measure word generation due to data sparseness and the potential long distance dependency between measure words and their corresponding head words. |
Abstract | Our model works as a postprocessing procedure over the output of statistical machine translation systems, and can work with any SMT system.
Experiments | We also compared our method with a well-known rule-based machine translation system, SYSTRAN.
Introduction | According to our survey on the measure word distribution in the Chinese Penn Treebank and the test datasets distributed by the Linguistic Data Consortium (LDC) for Chinese-to-English machine translation evaluation, the average occurrence is 0.505 and 0.319 measure words per sentence, respectively.
Introduction | Therefore, in the English-to-Chinese machine translation task we need to make additional efforts to generate the missing measure words in Chinese.
Introduction | In most statistical machine translation (SMT) models (Och et al., 2004; Koehn et al., 2003; Chiang, 2005), some measure words can be generated without modification or additional processing.
Abstract | Large-scale discriminative machine translation promises to further the state-of-the-art, but has failed to deliver convincing gains over current heuristic frequency count systems. |
Challenges for Discriminative SMT | These results could, and should, be applied to other models, discriminative and generative, phrase- and syntax-based, to further advance the state of the art in machine translation.
Discussion and Further Work | Finally, while in this paper we have focussed on the science of discriminative machine translation, we believe that with suitable engineering this model will advance the state-of-the-art.
Evaluation | The development and test data were taken from the 2006 NAACL and 2007 ACL workshops on machine translation, also filtered for sentence length. Tuning of the regularisation parameter and MERT training of the benchmark models were performed on dev2006, while the test set was the concatenation of devtest2006, test2006 and test2007, amounting to 315 development and 1164 test sentences.
Introduction | Statistical machine translation (SMT) has seen a resurgence in popularity in recent years, with progress being driven by a move to phrase-based and syntax-inspired approaches. |
Abstract | Due to the richness of Chinese abbreviations, many of them may not appear in available parallel corpora, in which case current machine translation systems simply treat them as unknown words and leave them untranslated. |
Conclusions | We integrate our method into a state-of-the-art phrase-based baseline translation system, i.e., Moses (Koehn et al., 2007), and show that the integrated system consistently improves the performance of the baseline system on various NIST machine translation test sets. |
Related Work | Though automatically extracting the relations between full-form Chinese phrases and their abbreviations is an interesting and important task for many natural language processing applications (e.g., machine translation, question answering, information retrieval, and so on), not much work is available in the literature.
Related Work | None of the above work has addressed the Chinese abbreviation issue in the context of a machine translation task, which is the primary goal in this paper. |
Related Work | To the best of our knowledge, our work is the first to systematically model Chinese abbreviation expansion to improve machine translation . |
Factored Model | The factored statistical machine translation model uses a log-linear approach to combine its several components, including the language model, the reordering model, the translation models and the generation models.
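The log-linear combination itself is just a weighted sum of log feature scores, and the hypothesis with the highest sum wins. A minimal sketch of that scoring rule (the feature names and weights are illustrative, not the system's actual configuration):

```python
import math

def loglinear_score(features, weights):
    """Weighted sum of log feature values: sum_i lambda_i * log h_i(e, f)."""
    return sum(weights[name] * math.log(h) for name, h in features.items())

weights = {"lm": 0.5, "tm": 0.3, "reordering": 0.2}
hyp_a = {"lm": 0.010, "tm": 0.020, "reordering": 0.5}
hyp_b = {"lm": 0.001, "tm": 0.050, "reordering": 0.5}

# hyp_a wins: its language-model advantage outweighs hyp_b's
# translation-model edge under these weights.
assert loglinear_score(hyp_a, weights) > loglinear_score(hyp_b, weights)
```

The weights (lambdas) are what MERT or similar tuning procedures optimize against a translation metric such as BLEU.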
Introduction | Traditional statistical machine translation methods are based on mapping on the lexical level, which takes place in a local window of a few words. |
Introduction | Our method is based on factored phrase-based statistical machine translation models. |
Introduction | Traditional statistical machine translation models deal with this problem in two ways:
Abstract | Synchronous Tree-Adjoining Grammar (STAG) is a promising formalism for syntax-aware machine translation and simultaneous computation of natural-language syntax and semantics.
Introduction | Recently, the desire to incorporate syntax-awareness into machine translation systems has generated interest in |
Introduction | Without efficient algorithms for processing it, its potential for use in machine translation and TAG semantics systems is limited. |
Synchronous Tree-Adjoining Grammar | In order for STAG to be used in machine translation and other natural-language processing tasks it must be possible to process it efficiently. |
Abstract | In this paper, we propose a novel string-to-dependency algorithm for statistical machine translation.
Conclusions and Future Work | In this paper, we propose a novel string-to-dependency algorithm for statistical machine translation.
Introduction | In recent years, hierarchical methods have been successfully applied to Statistical Machine Translation (Graehl and Knight, 2004; Chiang, 2005; Ding and Palmer, 2005; Quirk et al., 2005). |
Introduction | 1.1 Hierarchical Machine Translation |
Abstract | We improve the quality of statistical machine translation (SMT) by applying models that predict word forms from their stems using extensive morphological and syntactic information from both the source and target languages. |
Introduction | One of the outstanding problems for further improving machine translation (MT) systems is the difficulty of dividing the MT problem into sub-problems and tackling each subproblem in isolation to improve the overall quality of MT. |
Introduction | This paper describes a successful attempt to integrate a subcomponent for generating word inflections into a statistical machine translation (SMT) system.
Machine translation systems and data | We integrated the inflection prediction model with two types of machine translation systems: systems that make use of syntax and surface phrase-based systems. |
Introduction | Statistical machine translation (SMT) is complicated by the fact that words can move during translation. |
Introduction | 1We use the term “syntactic cohesion” throughout this paper to mean what has previously been referred to as “phrasal cohesion”, because the nonlinguistic sense of “phrase” has become so common in the machine translation literature.
Introduction | Phrase-based decoding (Koehn et al., 2003) is a dominant formalism in statistical machine translation.
Conclusion | We believe this general framework could also be applied to other problems involving forests or lattices, such as sequence labeling and machine translation . |
Forest Reranking | For nonlocal features, we adapt cube pruning from forest rescoring (Chiang, 2007; Huang and Chiang, 2007), since the situation here is analogous to machine translation decoding with integrated language models: we can view the scores of unit nonlocal features as the language model cost, computed on the fly when combining sub-constituents.
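The core of cube pruning in this analogy: two sorted lists of sub-constituent hypotheses are combined lazily via a heap, and the nonlocal cost (standing in for the LM or nonlocal-feature score) is computed only when a grid cell is popped. A simplified sketch; as in real forest rescoring, the enumeration is approximate when the nonlocal cost is not monotonic in the grid:

```python
import heapq

def cube_top_k(left, right, nonlocal_cost, k):
    """Lazily enumerate the k cheapest combinations of two cost-sorted
    hypothesis lists (lower is better). nonlocal_cost(i, j) is only
    evaluated when cell (i, j) is pushed, mimicking on-the-fly LM scoring."""
    seen = {(0, 0)}

    def cell(i, j):
        return (left[i] + right[j] + nonlocal_cost(i, j), i, j)

    frontier = [cell(0, 0)]
    out = []
    while frontier and len(out) < k:
        cost, i, j = heapq.heappop(frontier)
        out.append(cost)
        # Expand the two grid neighbors of the popped cell.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(frontier, cell(ni, nj))
    return out

left, right = [1.0, 2.0, 5.0], [0.5, 1.5]
best = cube_top_k(left, right, lambda i, j: 0.1 * (i + j), 3)
assert [round(c, 6) for c in best] == [1.5, 2.6, 2.6]
```

With a constant nonlocal cost this enumeration is exact; nonlocal features break the monotonicity, which is exactly the approximation cube pruning accepts for speed.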
Introduction | Discriminative reranking has become a popular technique for many NLP problems, in particular, parsing (Collins, 2000) and machine translation (Shen et al., 2005). |
Experiments | These experiments also indicate that a very sparse prior is needed for machine translation tasks. |
Introduction | Most state-of-the-art statistical machine translation systems are based on large phrase tables extracted from parallel text using word-level alignments.
Introduction | While this approach has been very successful, poor word-level alignments are nonetheless a common source of error in machine translation systems. |
Abstract | We take a multi-pass approach to machine translation decoding when using synchronous context-free grammars as the translation model and n-gram language models: the first pass uses a bigram language model, and the resulting parse forest is used in the second pass to guide search with a trigram language model. |
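A much-simplified sketch of the two-pass idea, reduced here to n-best rescoring rather than genuine forest-guided search: a cheap bigram model prunes the candidate set, and the more expensive trigram model scores only the survivors. All toy probability tables below are invented for illustration:

```python
import math

def lm_score(tokens, probs, order):
    """Log-probability under a toy n-gram model given as a dict from
    n-gram tuples to probabilities; unseen n-grams get a small floor."""
    return sum(math.log(probs.get(tuple(tokens[i - order + 1:i + 1]), 1e-6))
               for i in range(order - 1, len(tokens)))

def two_pass(candidates, bigram, trigram, beam):
    """Pass 1: rank all candidates with the cheap bigram model, keep a beam.
    Pass 2: rescore only the survivors with the trigram model."""
    survivors = sorted(candidates, key=lambda c: -lm_score(c, bigram, 2))[:beam]
    return max(survivors, key=lambda c: lm_score(c, trigram, 3))

bigram = {("the", "cat"): 0.5, ("cat", "sat"): 0.4,
          ("the", "sat"): 0.2, ("sat", "cat"): 0.1, ("cat", "the"): 0.05}
trigram = {("the", "cat", "sat"): 0.3, ("the", "sat", "cat"): 0.05}
candidates = [["the", "cat", "sat"], ["cat", "the", "sat"], ["the", "sat", "cat"]]

assert two_pass(candidates, bigram, trigram, beam=2) == ["the", "cat", "sat"]
```

The real method keeps a parse forest rather than an n-best list from the first pass, so the second pass can still explore exponentially many derivations while being guided by the bigram scores.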
Introduction | Statistical machine translation systems based on synchronous grammars have recently shown great promise, but one stumbling block to their widespread adoption is that the decoding, or search, problem during translation is more computationally demanding than in phrase-based systems. |
Introduction | We examine the question of whether, given the reordering inherent in the machine translation problem, lower order n-grams will provide as valuable a search heuristic as they do for speech recognition. |