Index of papers in Proc. ACL 2011 that mention
  • machine translation
Schwartz, Lane and Callison-Burch, Chris and Schuler, William and Wu, Stephen
Abstract
This paper describes a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing.
Introduction
Early work in statistical machine translation viewed translation as a noisy-channel process comprising a translation model, which posited adequate translations of source-language words, and a target language model, which guided the fluency of generated target-language strings (Brown et al.,
Introduction
Drawing on earlier successes in speech recognition, research in statistical machine translation has effectively used n-gram word sequence models as language models.
Related Work
Recent work has shown that parsing-based machine translation using syntax-augmented (Zollmann and Venugopal, 2006) hierarchical translation grammars with rich nonterminal sets can demonstrate substantial gains over hierarchical grammars for certain language pairs (Baker et al., 2009).
Related Work
Speech recognition and statistical machine translation focus on the use of n-grams, which provide a simple finite-state model approximation of the target language.
Related Work
appropriate algorithmic fit for incorporating syntax into phrase-based statistical machine translation, since both process sentences in an incremental left-to-right fashion.
machine translation is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Chen, David and Dolan, William
Abstract
A lack of standard datasets and evaluation metrics has prevented the field of paraphrasing from making the kind of rapid progress enjoyed by the machine translation community over the last 15 years.
Discussions and Future Work
In addition to paraphrasing, our data collection framework could also be used to produce useful data for machine translation and computer vision.
Introduction
Machine paraphrasing has many applications for natural language processing tasks, including machine translation (MT), MT evaluation, summary evaluation, question answering, and natural language generation.
Introduction
Despite the similarities between paraphrasing and translation, several major differences have prevented researchers from simply following standards that have been established for machine translation.
Introduction
Professional translators produce large volumes of bilingual data according to a more or less consistent specification, indirectly fueling work on machine translation algorithms.
machine translation is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
DeNero, John and Macherey, Klaus
Abstract
Statistical machine translation systems combine the predictions of two directional models, typically using heuristic combination procedures like grow-diag-final.
Conclusion
We also look forward to discovering the best way to take advantage of these new alignments in downstream applications like machine translation, supervised word alignment, bilingual parsing (Burkett et al., 2010), part-of-speech tag induction (Naseem et al., 2009), or cross-lingual model projection (Smith and Eisner, 2009; Das and Petrov, 2011).
Experimental Results
Extraction-based evaluations of alignment better coincide with the role of word aligners in machine translation systems (Ayan and Dorr, 2006).
Experimental Results
Finally, we evaluated our bidirectional model in a large-scale end-to-end phrase-based machine translation system from Chinese to English, based on the alignment template approach (Och and Ney, 2004).
Introduction
Machine translation systems typically combine the predictions of two directional models, one which aligns f to e and the other e to f (Och et al., 1999).
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin
A Joint Model with Unlabeled Parallel Text
sentences in one language with their corresponding automatic translations), we use an automatic machine translation system (e.g.
Abstract
Experiments on multiple data sets show that the proposed approach (1) outperforms the monolingual baselines, significantly improving the accuracy for both languages by 3.44%-8.12%; (2) outperforms two standard approaches for leveraging unlabeled data; and (3) produces (albeit smaller) performance gains when employing pseudo-parallel data from machine translation engines.
Conclusion
Moreover, the proposed approach continues to produce (albeit smaller) performance gains when employing pseudo-parallel data from machine translation engines.
Related Work
(2008; 2010) instead automatically translate the English resources using automatic machine translation engines for subjectivity classification.
Results and Analysis
As discussed in Section 3.4, we generate pseudo-parallel data by translating the monolingual sentences in each setting using Google’s machine translation system.
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Neubig, Graham and Watanabe, Taro and Sumita, Eiichiro and Mori, Shinsuke and Kawahara, Tatsuya
Abstract
This allows for a completely probabilistic model that is able to create a phrase table that achieves competitive accuracy on phrase-based machine translation tasks directly from unaligned sentence pairs.
Conclusion
Machine translation systems using phrase tables learned directly by the proposed model were able to achieve accuracy competitive with the traditional pipeline of word alignment and heuristic phrase extraction, the first such result for an unsupervised model.
Experimental Evaluation
The data for French, German, and Spanish are from the 2010 Workshop on Statistical Machine Translation (Callison-Burch et al., 2010).
Introduction
The training of translation models for phrase-based statistical machine translation (SMT) systems (Koehn et al., 2003) takes unaligned bilingual training data as input, and outputs a scored table of phrase pairs.
Introduction
Using this model, we perform machine translation experiments over four language pairs.
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Tan, Ming and Zhou, Wenli and Zheng, Lei and Wang, Shaojun
Abstract
The large scale distributed composite language model gives drastic perplexity reduction over n-grams and achieves significantly better translation quality measured by the BLEU score and “readability” when applied to the task of re-ranking the N-best list from a state-of-the-art parsing-based machine translation system.
Experimental results
We have applied our composite 5-gram/2-SLM+2-gram/4-SLM+5-gram/PLSA language model, trained on a 1.3-billion-word corpus, to the task of re-ranking the N-best list in statistical machine translation.
Experimental results
Chiang (2007) studied the performance of machine translation on Hiero: the BLEU score is 33.31% when an n-gram model is used to re-rank the N-best list, but rises significantly to 37.09% when the n-gram model is embedded directly into Hiero’s one-pass decoder, because there is not much diversity in the N-best list.
Introduction
The Markov chain (n-gram) source models, which predict each word on the basis of the previous n-1 words, have been the workhorses of state-of-the-art speech recognizers and machine translators, helping to resolve acoustic or foreign-language ambiguities by placing higher probability on more likely underlying word strings.
Introduction
As the machine translation (MT) working groups stated on page 3 of their final report (Lavie et al., 2006), “These approaches have resulted in small improvements in MT quality, but have not fundamentally solved the problem.”
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Zollmann, Andreas and Vogel, Stephan
Conclusion and discussion
In this work we proposed methods of labeling phrase pairs to create automatically learned PSCFG rules for machine translation.
Introduction
The Probabilistic Synchronous Context Free Grammar (PSCFG) formalism suggests an intuitive approach to model the long-distance and lexically sensitive reordering phenomena that often occur across language pairs considered for statistical machine translation.
Introduction
SCFG Rules for Machine Translation
Introduction
Towards the ultimate goal of building end-to-end machine translation systems without any human annotations, we also experiment with automatically inferred word classes using distributional clustering (Kneser and Ney, 1993).
Related work
(2006) present a reordering model for machine translation, and make use of clustered phrase pairs to cope with data sparseness in the model.
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Clifton, Ann and Sarkar, Anoop
Abstract
This paper extends the training and tuning regime for phrase-based statistical machine translation to obtain fluent translations into morphologically complex languages (we build an English to Finnish translation system).
Conclusion and Future Work
In order to help with replication of the results in this paper, we have run the various morphological analysis steps and created the necessary training, tuning and test data files needed in order to train, tune and test any phrase-based machine translation system with our data.
Conclusion and Future Work
We would particularly like to thank the developers of the open-source Moses machine translation toolkit and the Omorfi morphological analyzer for Finnish which we used for our experiments.
Translation and Morphology
Languages with rich morphological systems present significant hurdles for statistical machine translation (SMT), most notably data sparsity, source-target asymmetry, and problems with automatic evaluation.
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Maletti, Andreas
Introduction
A (formal) translation model is at the core of every machine translation system.
Introduction
By contrast, in the field of syntax-based machine translation, the translation models have full access to the syntax of the sentences and can base their decisions on it.
Introduction
In this contribution, we restrict MBOT to a form that is particularly relevant in machine translation.
The model
(2008) argue that STSG have sufficient expressive power for syntax-based machine translation , but Zhang et al.
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ravi, Sujith and Knight, Kevin
Abstract
In this work, we tackle the task of machine translation (MT) without parallel training data.
Introduction
Bilingual corpora are a staple of statistical machine translation (SMT) research.
Machine Translation as a Decipherment Task
From a decipherment perspective, machine translation is a much more complex task than word substitution decipherment and poses several technical challenges: (1) scalability due to large corpora sizes and huge translation tables, (2) nondeterminism in translation mappings (a word can have multiple translations), (3) reordering of words
Word Substitution Decipherment
Before we tackle machine translation without parallel data, we first solve a simpler problem—word substitution decipherment.
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Crescenzi, Pierluigi and Gildea, Daniel and Marino, Andrea and Rossi, Gianluca and Satta, Giorgio
Concluding remarks
Grammar factorization for synchronous models is an important component of current machine translation systems (Zhang et al., 2006), and algorithms for factorization have been studied by Gildea et al.
Concluding remarks
These algorithms do not result in what we refer to as head-driven strategies, although, as machine translation systems improve, lexicalized rules may become important in this setting as well.
Introduction
Similar questions have arisen in the context of machine translation, as the SCFGs used to model translation are also instances of LCFRSs, as already mentioned.
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lo, Chi-kiu and Wu, Dekai
Abstract
As machine translation systems improve in lexical choice and fluency, the shortcomings of widespread n-gram based, fluency-oriented MT evaluation metrics such as BLEU, which fail to properly evaluate adequacy, become more apparent.
Abstract
We argue that BLEU (Papineni et al., 2002) and other automatic n-gram based MT evaluation metrics do not adequately capture the similarity in meaning between the machine translation and the reference translation—which, ultimately, is essential for MT output to be useful.
Abstract
the most essential semantic information being captured by machine translation systems?
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Nederhof, Mark-Jan and Satta, Giorgio
Discussion
Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation of speech, or alternatively as a predictive tool in applications of interactive machine translation, of the kind described by Foster et al.
Introduction
Within the area of statistical machine translation , there has been a growing interest in so-called syntax-based translation models, that is, models that define mappings between languages through hierarchical sentence structures.
Prefix probabilities
One should add that, in real-world machine translation applications, it has been observed that recognition (and computation of inside probabilities) for SCFGs can typically be carried out in low-degree polynomial time, and the worst cases mentioned above are not observed with real data.
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhao, Bing and Lee, Young-Suk and Luo, Xiaoqiang and Li, Liu
Elementary Trees to String Grammar
3 significantly enriches reordering powers for syntax-based machine translation .
Introduction
Most syntax-based machine translation models with synchronous context free grammar (SCFG) have been relying on the off-the-shelf monolingual parse structures to learn the translation equivalences for string-to-tree, tree-to-string or tree-to-tree grammars.
Introduction
However, state-of-the-art monolingual parsers are not necessarily well suited for machine translation in terms of both labels and chunks/brackets.
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: