Index of papers in Proc. ACL 2009 that mention
  • machine translation
Li, Mu and Duan, Nan and Zhang, Dongdong and Li, Chi-Ho and Zhou, Ming
Abstract
This paper presents collaborative decoding (co-decoding), a new method to improve machine translation accuracy by leveraging translation consensus between multiple machine translation decoders.
Abstract
Different from system combination and MBR decoding, which post-process the n-best lists or word lattice of machine translation decoders, in our method multiple machine translation decoders collaborate by exchanging partial translation results.
Abstract
Experimental results on data sets for NIST Chinese-to-English machine translation task show that the co-decoding method can bring significant improvements to all baseline decoders, and the outputs from co-decoding can be used to further improve the result of system combination.
Collaborative Decoding
Because usually it is not feasible to enumerate the entire hypothesis space for machine translation, we approximate $\mathcal{H}_k(f)$ with n-best hypotheses by convention.
Introduction
Recent research has shown substantial improvements can be achieved by utilizing consensus statistics obtained from outputs of multiple machine translation systems.
Introduction
Typically, the resulting systems take outputs of individual machine translation systems as
Introduction
A common property of all the work mentioned above is that the combination models work on the basis of n-best translation lists (full hypotheses) of existing machine translation systems.
machine translation is mentioned in 16 sentences in this paper.
Zhao, Hai and Song, Yan and Kit, Chunyu and Zhou, Guodong
Abstract
A simple statistical machine translation method, word-by-word decoding, where not a parallel corpus but a bilingual lexicon is necessary, is adopted for the treebank translation.
Conclusion and Future Work
A simple statistical machine translation technique, word-by-word decoding, where only a bilingual lexicon is necessary, is used to translate the source treebank.
Introduction
Machine translation has been shown one of the most expensive language processing tasks, as a great deal of time and space is required to perform this task.
Introduction
In addition, a standard statistical machine translation method based on a parallel corpus will not work effectively if it is not able to find a parallel corpus that right covers source and target treebanks.
The Related Work
In our method, a machine translation method is applied to tackle golden-standard treebank, while all the previous works focus on the unlabeled data.
The Related Work
The proposed parser using features from monolingual and mutual constraints helped its log-linear model to achieve better performance for both monolingual parsers and machine translation system.
The Related Work
The second is that a parallel corpus is required for their work and a strict statistical machine translation procedure was performed, while our approach holds a merit of simplicity as only a bilingual lexicon is required.
Treebank Translation and Dependency Transformation
A word-by-word statistical machine translation strategy is adopted to translate words attached with the respective dependency information from the source language to the target one.
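The word-by-word strategy quoted above can be illustrated with a minimal sketch. It assumes a bilingual lexicon stored as a dictionary of translation probabilities and picks the most probable target word for each source word, carrying the dependency annotation over unchanged. The function name, the lexicon entries, and the token layout are hypothetical, and the paper's actual decoder may score candidates differently.

```python
def translate_treebank_sentence(tokens, lexicon):
    """Translate each (word, head_index, relation) token independently.

    tokens:  list of (source_word, head_index, relation) tuples
    lexicon: dict mapping a source word to {target_word: probability}
    """
    translated = []
    for word, head, relation in tokens:
        candidates = lexicon.get(word)
        if candidates:
            # Choose the target word with the highest lexicon probability.
            target = max(candidates, key=candidates.get)
        else:
            # Assumption: words missing from the lexicon are copied as-is.
            target = word
        # The dependency head and relation are carried over unchanged.
        translated.append((target, head, relation))
    return translated


# Toy lexicon and sentence, invented for illustration only.
toy_lexicon = {
    "she": {"elle": 0.9, "la": 0.1},
    "reads": {"lit": 0.7, "lire": 0.3},
    "books": {"livres": 0.8, "livre": 0.2},
}
toy_sentence = [("she", 2, "nsubj"), ("reads", 0, "root"), ("books", 2, "obj")]
result = translate_treebank_sentence(toy_sentence, toy_lexicon)
```

Because every word is translated in place, the target sentence keeps the source word order and tree shape, which is why no parallel corpus is needed in this simplified setting.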
machine translation is mentioned in 8 sentences in this paper.
Wan, Xiaojun
Abstract
Machine translation services are used for eliminating the language gap between the training set and test set, and English features and Chinese features are considered as two independent views of the classification problem.
Conclusion and Future Work
2) The feature distributions of the translated text and the natural text in the same language are still different due to the inaccuracy of the machine translation service.
Introduction
First, machine translation services are used to translate English training reviews into Chinese reviews and also translate Chinese test reviews and additional unlabeled reviews into English reviews.
Related Work 2.1 Sentiment Classification
(2004) use the technique of deep language analysis for machine translation to extract sentiment units in text documents.
The Co-Training Approach
The labeled English reviews are translated into labeled Chinese reviews, and the unlabeled Chinese reviews are translated into unlabeled English reviews, by using machine translation services.
The Co-Training Approach
Fortunately, machine translation techniques have been well developed in the NLP field, though the translation performance is far from satisfactory.
The Co-Training Approach
A few commercial machine translation services can be publicly accessed, e.g.
machine translation is mentioned in 8 sentences in this paper.
DeNero, John and Chiang, David and Knight, Kevin
Abstract
The minimum Bayes risk (MBR) decoding objective improves BLEU scores for machine translation output relative to the standard Viterbi objective of maximizing model score.
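The contrast between the two objectives can be sketched on a small n-best list. The example below is only an illustration: it uses unigram F1 as a stand-in similarity instead of BLEU, and the hypotheses and probabilities are invented. It is not the forest-based procedure the paper describes.

```python
from collections import Counter


def unigram_f1(hyp, ref):
    """Toy similarity: unigram F1 between two token lists (a stand-in for BLEU)."""
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def viterbi_choice(nbest):
    """Pick the hypothesis with the highest model probability."""
    return max(nbest, key=lambda item: item[1])[0]


def mbr_choice(nbest, similarity=unigram_f1):
    """Pick the hypothesis with the highest expected similarity to all hypotheses."""
    def expected_gain(candidate):
        return sum(prob * similarity(candidate, other) for other, prob in nbest)
    return max(nbest, key=lambda item: expected_gain(item[0]))[0]


# Invented n-best list: (token list, normalized model probability).
nbest = [
    ("the cat sat down".split(), 0.40),
    ("a cat is sitting".split(), 0.35),
    ("a cat is seated".split(), 0.25),
]
viterbi_best = viterbi_choice(nbest)
mbr_best = mbr_choice(nbest)
```

Here the Viterbi objective selects the single most probable hypothesis, while the MBR objective prefers a hypothesis that agrees with most of the probability mass, so the two choices differ on this toy list.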
Abstract
We evaluate our procedure on translation forests from two large-scale, state-of-the-art hierarchical machine translation systems.
Computing Feature Expectations
Exploiting forests has proven a fruitful avenue of research in both parsing (Huang, 2008) and machine translation (Mi et al., 2008).
Consensus Decoding Algorithms
Modern statistical machine translation systems take as input some f and score each derivation e according to a linear model of features: $\sum_i \lambda_i \cdot \theta_i(f, e)$.
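A linear model of this kind scores a derivation as a weighted sum of its feature values. The sketch below shows the arithmetic on two invented derivations. The feature names, weights, and values are assumptions made for illustration and do not come from the paper.

```python
def linear_score(weights, features):
    """Return the weighted sum of feature values for one derivation."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical feature weights (the lambda values).
weights = {"lm": 0.5, "tm": 0.3, "length": -0.2}

# Hypothetical feature values (the theta values) for two derivations.
derivations = {
    "e1": {"lm": -4.0, "tm": -2.0, "length": 4.0},
    "e2": {"lm": -3.0, "tm": -2.5, "length": 5.0},
}

# The Viterbi objective keeps the derivation with the highest model score.
best = max(derivations, key=lambda e: linear_score(weights, derivations[e]))
```

With these numbers, e1 scores -3.4 and e2 scores -3.25, so e2 is selected.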
Consensus Decoding Algorithms
Most similarity measures of interest for machine translation are not linear, and so Algorithm 2 does not apply.
Experimental Results
We evaluate these consensus decoding techniques on two different full-scale state-of-the-art hierarchical machine translation systems.
Introduction
In statistical machine translation, output translations are evaluated by their similarity to human reference translations, where similarity is most often measured by BLEU (Papineni et al., 2002).
machine translation is mentioned in 7 sentences in this paper.
Galley, Michel and Manning, Christopher D.
Dependency parsing experiments
For the MT setting, texts are all lower case, and tokenization was changed to improve machine translation (e.g., most hyphenated words were split).
Dependency parsing for machine translation
Dependency models have recently gained considerable interest in many NLP applications, including machine translation (Ding and Palmer, 2005; Quirk et al., 2005; Shen et al., 2008).
Introduction
Hierarchical approaches to machine translation have proven increasingly successful in recent years (Chiang, 2005; Marcu et al., 2006; Shen et al., 2008), and often outperform phrase-based systems (Och and Ney, 2004; Koehn et al., 2003) on target-language fluency and adequacy. However, their benefits generally come with high computational costs, particularly when chart parsing, such as CKY, is integrated with language models of high orders (Wu, 1996).
Introduction
may sometimes appear too computationally expensive for high-end statistical machine translation, there are many alternative parsing algorithms that have seldom been explored in the machine translation literature.
Introduction
In this paper, we show how to exploit syntactic dependency structure for better machine translation, under the constraint that the depen-
Related work
Perhaps due to the high computational cost of synchronous CFG decoding, there have been various attempts to exploit syntactic knowledge and hierarchical structure in other machine translation experiments that do not require chart parsing.
machine translation is mentioned in 7 sentences in this paper.
Zaslavskiy, Mikhail and Dymetman, Marc and Cancedda, Nicola
Abstract
An efficient decoding algorithm is a crucial element of any statistical machine translation system.
Introduction
Phrase-based systems (Koehn et al., 2003) are probably the most widespread class of Statistical Machine Translation systems, and arguably one of the most successful.
Phrase-based Decoding as TSP
h - mt - i - s: this machine translation is strange
h - c - t - i - a: this curious translation is automatic
h - t - s - i - a: this translation strange is automatic
Phrase-based Decoding as TSP
For example, in the example of Figure 3, the cost of this - machine translation - is - strange, can only take into account the conditional probability of the word strange relative to the word is, but not relative to the words translation and is.
Phrase-based Decoding as TSP
machine translation.
machine translation is mentioned in 6 sentences in this paper.
Haffari, Gholamreza and Sarkar, Anoop
Abstract
Statistical machine translation (SMT) models require bilingual corpora for training, and these corpora are often multilingual with parallel text in multiple languages simultaneously.
Introduction
The main source of training data for statistical machine translation (SMT) models is a parallel corpus.
Introduction
In our case, the multiple tasks are individual machine translation tasks for several language pairs.
Introduction
11 Statistical Machine Translation*
Sentence Selection: Multiple Language Pairs
For the single language pair setting, (Haffari et al., 2009) presents and compares several sentence selection methods for statistical phrase-based machine translation.
machine translation is mentioned in 5 sentences in this paper.
Gao, Wei and Blitzer, John and Zhou, Ming and Wong, Kam-Fai
Experiments and Results
In order to compute cross-lingual document similarities based on machine translation
Features and Similarities
The cross-lingual similarities are valuated using different translation mechanisms, e.g., dictionary-based translation or machine translation, or even without any translation at all.
Features and Similarities
Similarity Based on Machine Translation (MT): For machine translation, the cross-lingual measure actually becomes a monolingual similarity between one document and another’s translation.
Introduction
As we will see, machine translation can provide important predictive information in our setting, but we do not wish to display machine-translated output to the user.
Introduction
We approach our problem by learning a ranking function for bilingual queries: queries that are easily translated (e.g., with machine translation) and appear in the query logs of two languages (e.g., English and Chinese).
machine translation is mentioned in 5 sentences in this paper.
Huang, Fei
Introduction
Data-driven approaches have been quite active in recent machine translation (MT) research.
Related Work
In the machine translation area, most research on confidence measure focus on the confidence of MT output: how accurate a translated sentence is.
Related Work
(Ueffing et al., 2003) presented several word-level confidence measures for machine translation based on word posterior probabilities.
Translation
We evaluate the improved alignment on several Chinese-English and Arabic-English machine translation tasks.
machine translation is mentioned in 4 sentences in this paper.
Kumar, Shankar and Macherey, Wolfgang and Dyer, Chris and Och, Franz
Abstract
Minimum Error Rate Training (MERT) and Minimum Bayes-Risk (MBR) decoding are used in most current state-of-the-art Statistical Machine Translation (SMT) systems.
Experiments
We also train two SCFG-based MT systems: a hierarchical phrase-based SMT (Chiang, 2007) system and a syntax augmented machine translation (SAMT) system using the approach described in Zollmann and Venugopal (2006).
Introduction
Statistical Machine Translation (SMT) systems have improved considerably by directly using the error criterion in both training and decoding.
Minimum Error Rate Training
In the context of statistical machine translation, the optimization procedure was first described in Och (2003) for N-best lists and later extended to phrase-lattices in Macherey et al.
machine translation is mentioned in 4 sentences in this paper.
Liu, Yang and Mi, Haitao and Feng, Yang and Liu, Qun
Background
Statistical machine translation is a decision problem where we need decide on the best of target sentence matching a source sentence.
Introduction
System combination aims to find consensus translations among different machine translation systems.
Related Work
In machine translation, confusion-network based combination techniques (e.g., (Rosti et al., 2007; He et al., 2008)) have achieved the state-of-the-art performance in MT evaluations.
Related Work
Hypergraphs have been successfully used in parsing (Klein and Manning., 2001; Huang and Chiang, 2005; Huang, 2008) and machine translation (Huang and Chiang, 2007; Mi et al., 2008; Mi and Huang, 2008).
machine translation is mentioned in 4 sentences in this paper.
Wu, Hua and Wang, Haifeng
Abstract
This paper revisits the pivot language approach for machine translation.
Introduction
Current statistical machine translation (SMT) systems rely on large parallel and monolingual training corpora to produce translations of relatively higher quality.
Introduction
In order to fill up this data gap, we make use of rule-based machine translation (RBMT) systems to translate the pivot sentences in the source-pivot or pivot-target
Translation Selection
We regard sentence-level translation selection as a machine translation (MT) evaluation problem and formalize this problem with a regression learning model.
machine translation is mentioned in 4 sentences in this paper.
Yang, Fan and Zhao, Jun and Liu, Kang
Abstract
The experimental results show that the proposed method outperforms the baseline statistical machine translation system by 30.42%.
Experiments
Then the phrase-based machine translation system MOSES2 is adopted to translate the 503 Chinese NEs in testing set into English.
Experiments
First, word order determination is difficult in statistical machine translation (SMT), while search engines are insensitive to this problem.
Introduction
The task of Named Entity (NE) translation is to translate a named entity from the source language to the target language, which plays an important role in machine translation and cross-language information retrieval (CLIR).
machine translation is mentioned in 4 sentences in this paper.
Zhao, Shiqi and Lan, Xiang and Liu, Ting and Li, Sheng
Conclusions and Future Work
(1) It is the first statistical model specially designed for paraphrase generation, which is based on the analysis of the differences between paraphrase generation and other researches, especially machine translation.
Experimental Setup
In our experiments, the development set contains 200 sentences and the test set contains 500 sentences, both of which are randomly selected from the human translations of 2008 NIST Open Machine Translation Evaluation: Chinese to English Task.
Introduction
PG shows its importance in many areas, such as question expansion in question answering (QA) (Duboue and Chu-Carroll, 2006), text polishing in natural language generation (NLG) (Iordanskaja et al., 1991), text simplification in computer-aided reading (Carroll et al., 1999), and sentence similarity computation in the automatic evaluation of machine translation (MT) (Kauchak and Barzilay, 2006) and summarization (Zhou et al., 2006).
Statistical Paraphrase Generation
• Sentence similarity computation: Given a reference sentence s’, this application aims to paraphrase s into t, so that t is more similar (closer in wording) with s’ than s. This application is important for the automatic evaluation of machine translation and summarization, since we can paraphrase the human translations/summaries to make them more similar to the system outputs, which can refine the accuracy of the evaluation (Kauchak and Barzilay, 2006).
machine translation is mentioned in 4 sentences in this paper.
He, Wei and Wang, Haifeng and Guo, Yuqing and Liu, Ting
Log-linear Models
BLEU score, a method originally proposed to automatically evaluate machine translation quality (Papineni et al., 2002), has been widely used as a metric to evaluate general-purpose sentence generation (Langkilde, 2002; White et al., 2007; Guo et al.
Log-linear Models
3 The BLEU scoring script is supplied by NIST Open Machine Translation Evaluation at ftp://jaguar.ncsl.nist.gov/mt/resources/mteval-v11b.pl
Log-linear Models
(MERT), which is popular in statistical machine translation (Och, 2003).
machine translation is mentioned in 3 sentences in this paper.
Jiang, Long and Yang, Shiquan and Zhou, Ming and Liu, Xiaohua and Zhu, Qingsheng
Abstract
Mining bilingual data (including bilingual sentences and terms) from the Web can benefit many NLP applications, such as machine translation and cross language information retrieval.
Conclusions
We also want to evaluate the usefulness of our mined data for machine translation or other applications.
Introduction
Bilingual data (including bilingual sentences and bilingual terms) are critical resources for building many applications, such as machine translation (Brown, 1993) and cross language information retrieval (Nie et al., 1999).
machine translation is mentioned in 3 sentences in this paper.
Pado, Sebastian and Galley, Michel and Jurafsky, Dan and Manning, Christopher D.
Abstract
Existing evaluation metrics for machine translation lack crucial robustness: their correlations with human quality judgments vary considerably across languages and genres.
Expt. 2: Predicting Pairwise Preferences
This experiment uses the 2006-2008 corpora of the Workshop on Statistical Machine Translation (WMT). It consists of data from EUROPARL (Koehn, 2005) and various news commentaries, with five source languages (French, German, Spanish, Czech, and Hungarian).
Introduction
Constant evaluation is vital to the progress of machine translation (MT).
machine translation is mentioned in 3 sentences in this paper.
Parton, Kristen and McKeown, Kathleen R. and Coyne, Bob and Diab, Mona T. and Grishman, Ralph and Hakkani-Tür, Dilek and Harper, Mary and Ji, Heng and Ma, Wei Yun and Meyers, Adam and Stolbach, Sara and Sun, Ang and Tur, Gokhan and Xu, Wei and Yaman, Sibel
Abstract
Cross-lingual tasks are especially difficult due to the compounding effect of errors in language processing and errors in machine translation (MT).
Introduction
• How much does machine translation (MT) degrade the performance of cross-lingual 5W systems, as compared to monolingual performance?
The Chinese-English 5W Task
In this task, both machine translation (MT) and 5W extraction must succeed in order to produce correct answers.
machine translation is mentioned in 3 sentences in this paper.
Amigó, Enrique and Giménez, Jesús and Gonzalo, Julio and Verdejo, Felisa
Introduction
Automatic evaluation methods based on similarity to human references have substantially accelerated the development cycle of many NLP tasks, such as Machine Translation, Automatic Summarization, Sentence Compression and Language Generation.
Introduction
context of Machine Translation, a considerable effort has also been made to include deeper linguistic information in automatic evaluation metrics, both syntactic and semantic (see Section 2 for details).
Previous Work on Machine Translation Meta-Evaluation
Insofar as automatic evaluation metrics for machine translation have been proposed, different meta-evaluation frameworks have been gradually introduced.
machine translation is mentioned in 3 sentences in this paper.