Index of papers in Proc. ACL 2014 that mention
  • machine translation
Xiong, Deyi and Zhang, Min
Abstract
In this paper, we propose a sense-based translation model to integrate word senses into statistical machine translation.
Abstract
Our method is significantly different from previous word sense disambiguation reformulated for machine translation in that the latter neglects word senses in nature.
Conclusion
We have presented a sense-based translation model that integrates word senses into machine translation.
Conclusion
Word senses automatically induced by the HDP-based WSI on large-scale training data are very useful for machine translation.
Experiments
This suggests that automatically induced word senses alone are indeed useful for machine translation .
Introduction
In the context of machine translation , such different meanings normally produce different target translations.
Introduction
Therefore a natural assumption is that word sense disambiguation (WSD) may contribute to statistical machine translation (SMT) by providing appropriate word senses for target translation selection with context features (Carpuat and Wu, 2005).
Introduction
for Statistical Machine Translation
Related Work
Xiong and Zhang (2013) employ a sentence-level topic model to capture coherence for document-level machine translation.
Related Work
The difference between our work and these previous studies on topic models for SMT is that we adopt topic-based WSI to obtain word senses rather than generic topics, and we integrate the induced word senses into machine translation.
WSI-Based Broad-Coverage Sense Tagger
We want to extend this hypothesis to machine translation by building a sense-based translation model upon HDP-based word sense induction: words with the same meanings tend to be translated in the same way.
machine translation is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Narayan, Shashi and Gardent, Claire
Abstract
We present a hybrid approach to sentence simplification which combines deep semantics and monolingual machine translation to derive simple sentences from complex ones.
Introduction
It is useful as a preprocessing step for a variety of NLP systems such as parsers and machine translation systems (Chandrasekar et al., 1996), sum-marisation (Knight and Marcu, 2000), sentence fusion (Filippova and Strube, 2008) and semantic
Introduction
Machine Translation systems have been adapted to translate complex sentences into simpler ones (Zhu et al., 2010; Wubben et al., 2012; Coster and Kauchak, 2011).
Introduction
First, it combines a model encoding probabilities for splitting and deletion with a monolingual machine translation module which handles reordering and substitution.
Related Work
Zhu et al. (2010) constructed a parallel corpus (PWKP) of 108,016/114,924 complex/simple sentences by aligning sentences from EWKP and SWKP and used the resulting bitext to train a simplification model inspired by syntax-based machine translation (Yamada and Knight, 2001).
Related Work
To account for deletions, reordering and substitution, Coster and Kauchak (2011) trained a phrase based machine translation system on the PWKP corpus while modifying the word alignment output by GIZA++ in Moses to allow for null phrasal alignments.
Related Work
Wubben et al. (2012) use Moses and the PWKP data to train a phrase based machine translation system augmented with a post-hoc reranking procedure designed to rank the outputs based on their dissimilarity from the source.
Simplification Framework
We also depart from Coster and Kauchak (2011), who rely on null phrasal alignments for deletion during phrase based machine translation.
Simplification Framework
Second, the simplified sentence(s) s’ is further simplified to s using a phrase based machine translation system (PBMT+LM).
Simplification Framework
where the probabilities p(s'|DC), p(s'|s) and p(s) are given by the DRS simplification model, the phrase based machine translation model and the language model, respectively.
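The three factors quoted above combine, under a standard product-of-models reading, roughly as in the sketch below; this is only one plausible rendering of the quoted fragment, not necessarily the authors' exact equation (DC is the semantic representation of the complex input, s' the intermediate split/deletion output, and s the final simplified sentence).
```latex
% A sketch only: one reading of how the quoted factors combine when
% picking the final simplification s (reached via the intermediate s').
\[
  \hat{s} \;=\; \arg\max_{s}\;
      \underbrace{p(s' \mid DC)}_{\text{DRS simplification model}}\;
      \underbrace{p(s' \mid s)}_{\text{phrase-based MT model}}\;
      \underbrace{p(s)}_{\text{language model}}
\]
```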
machine translation is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Hu, Yuening and Zhai, Ke and Eidelman, Vladimir and Boyd-Graber, Jordan
Abstract
Topic models, an unsupervised technique for inferring translation domains, improve machine translation quality.
Experiments
We evaluate our new topic model, ptLDA, and existing topic models—LDA, pLDA, and tLDA—on their ability to induce domains for machine translation and the resulting performance of the translations on standard machine translation metrics.
Introduction
In particular, we use topic models to aid statistical machine translation (Koehn, 2009, SMT).
Introduction
Modern machine translation systems use millions of examples of translations to learn translation rules.
Introduction
As we review in Section 2, topic models are a promising solution for automatically discovering domains in machine translation corpora.
Polylingual Tree-based Topic Models
We compare these models’ machine translation performance in Section 5.
Topic Models for Machine Translation
2.1 Statistical Machine Translation
Topic Models for Machine Translation
Statistical machine translation casts machine translation as a probabilistic process (Koehn, 2009).
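The "probabilistic process" referred to here is the textbook SMT formulation (Koehn, 2009); the sketch below is that generic formulation, not anything specific to this paper's topic models.
```latex
% Textbook SMT: choose the target sentence e that maximizes the posterior
% given the source f; Bayes' rule splits this into a translation model
% p(f|e) and a language model p(e).
\[
  \hat{e} \;=\; \arg\max_{e}\, p(e \mid f)
         \;=\; \arg\max_{e}\, p(f \mid e)\, p(e)
\]
```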
Topic Models for Machine Translation
(2012) ignore a wealth of information that could improve topic models and help machine translation.
machine translation is mentioned in 20 sentences in this paper.
Topics mentioned in this paper:
Koehn, Philipp and Tsoukala, Chara and Saint-Amand, Herve
Introduction
As machine translation enters the workflow of professional translators, the exact nature of this human-computer interaction is currently an open challenge.
Introduction
Instead of tasking translators to post-edit the output of machine translation systems, a more interactive approach may be more fruitful.
Introduction
The standard approach to this problem uses the search graph of the machine translation system.
Properties of Core Algorithm
We predict translations that were crafted by manual post-editing of machine translation output.
Properties of Core Algorithm
We also use the search graphs of the system that produced the original machine translation output.
Properties of Core Algorithm
In the project’s first field trial, professional translators corrected machine translations of news stories from a competitive English-Spanish machine translation system (Koehn and Haddow, 2012).
Refinements
Analysis of the data suggests that gains mainly come from large length mismatches between user translation and machine translation, even in the case of first pass searches.
Refinements
For instance, if the user prefix differs only in casing from the machine translation (say, University instead of university), then we may still want to treat that as a word match in our algorithm.
Related Work
The interactive machine translation paradigm was first explored in the TransType and TransType2 projects (Langlais et al., 2000a; Foster et al., 2002; Bender et al., 2005; Barrachina et al., 2009).
Word Completion
When the machine translation system decides for college over university, but the user types the letter u, it should change its prediction.
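A minimal sketch of the prefix-constrained word completion described in this excerpt (and of the case-insensitive matching mentioned under Refinements above); the function and variable names are illustrative, not taken from Koehn et al.'s system.
```python
# Minimal sketch of prefix-constrained word completion (illustrative only;
# names are hypothetical). Given candidate next words ranked by model score,
# prefer the highest-scoring candidate consistent with what the user typed.

def complete_word(candidates, typed_prefix, case_insensitive=True):
    """candidates: list of (word, score) pairs; higher score = better."""
    norm = (lambda w: w.lower()) if case_insensitive else (lambda w: w)
    prefix = norm(typed_prefix)
    for word, _score in sorted(candidates, key=lambda x: -x[1]):
        if norm(word).startswith(prefix):
            return word
    return None  # no consistent candidate: let the user finish the word


# The example from the excerpt: "college" scores higher, but typing "u"
# switches the completion to "university".
candidates = [("college", -1.2), ("university", -1.5)]
print(complete_word(candidates, ""))   # -> college
print(complete_word(candidates, "u"))  # -> university
```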
machine translation is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Qian, Longhua and Hui, Haotian and Hu, Ya'nan and Zhou, Guodong and Zhu, Qiaoming
Abstract
Instead of using a parallel corpus which should have entity/relation alignment information and is thus difficult to obtain, this paper employs an off-the-shelf machine translator to translate both labeled and unlabeled instances from one language into the other language, forming pseudo parallel corpora.
Abstract
Based on a small number of labeled instances and a large number of unlabeled instances in both languages, our method differs from theirs in that we adopt a bilingual active learning paradigm via machine translation and improve the performance for both languages simultaneously.
Abstract
machine translation, which make use of multilingual corpora to decrease human annotation efforts by selecting highly informative sentences for a newly added language in multilingual parallel corpora.
machine translation is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Guzmán, Francisco and Joty, Shafiq and Màrquez, Llu'is and Nakov, Preslav
Abstract
We present experiments in using discourse structure for improving machine translation evaluation.
Abstract
Then, we show that these measures can help improve a number of existing machine translation evaluation metrics both at the segment- and at the system-level.
Experimental Results
In this section, we explore how discourse information can be used to improve machine translation evaluation metrics.
Experimental Results
Overall, from the experimental results in this section, we can conclude that discourse structure is an important information source to be taken into account in the automatic evaluation of machine translation output.
Introduction
From its foundations, Statistical Machine Translation (SMT) had two defining characteristics: first, translation was modeled as a generative process at the sentence-level.
Introduction
This is demonstrated by the establishment of a recent workshop dedicated to Discourse in Machine Translation (Webber et al., 2013), collocated with the 2013 annual meeting of the Association of Computational Linguistics.
Introduction
The area of discourse analysis for SMT is still nascent and, to the best of our knowledge, no previous research has attempted to use rhetorical structure for SMT or machine translation evaluation.
Related Work
Addressing discourse-level phenomena in machine translation is relatively new as a research direction.
Related Work
The field of automatic evaluation metrics for MT is very active, and new metrics are continuously being proposed, especially in the context of the evaluation campaigns that run as part of the Workshops on Statistical Machine Translation (WMT 2008-2012), and NIST Metrics for Machine Translation Challenge (MetricsMATR), among others.
machine translation is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Yan, Rui and Gao, Mingkun and Pavlick, Ellie and Callison-Burch, Chris
Abstract
Crowdsourcing is a viable mechanism for creating training data for machine translation.
Conclusion
In addition to its benefits of cost and scalability, crowdsourcing provides access to languages that currently fall outside the scope of statistical machine translation research.
Evaluation
A state-of-the-art machine translation system (the syntax-based variant of Joshua) achieves a score of 26.91, as reported in (Zaidan and Callison-Burch, 2011).
Introduction
Statistical machine translation (SMT) systems are trained using bilingual sentence-aligned parallel corpora.
Related work
These have focused on an iterative collaboration between monolingual speakers of the two languages, facilitated with a machine translation system.
Related work
In our setup the poor translations are produced by bilingual individuals who are weak in the target language, and in their experiments the translations are the output of a machine translation system. Another significant difference is that the HCI studies assume cooperative participants.
Related work
A variety of HCI and NLP studies have confirmed the efficacy of monolingual or bilingual individuals post-editing machine translation output (Callison-Burch, 2005; Koehn, 2010; Green et al., 2013).
machine translation is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Liu, Le and Hong, Yu and Liu, Hao and Wang, Xing and Yao, Jianmin
Abstract
Data selection has been demonstrated to be an effective approach to addressing the lack of high-quality bitext for statistical machine translation in the domain of interest.
Conclusion
our methods into the domain adaptation task of statistical machine translation at the model level.
Experiments
We use the NiuTrans toolkit, which adopts GIZA++ (Och and Ney, 2003) and MERT (Och, 2003), to train and tune the machine translation system.
Experiments
This tool scores the outputs on several criteria; the case-insensitive BLEU-4 (Papineni et al., 2002) is used as the evaluation metric for the machine translation system.
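Several papers in this index report case-insensitive BLEU-4; the sketch below shows one minimal way to compute such a score with NLTK. This is for illustration only and is not the scorer the authors actually used; tokenization here is naive whitespace splitting.
```python
# Minimal sketch of case-insensitive BLEU-4 scoring with NLTK (illustrative;
# not the NiuTrans/mteval scorer used in the paper).
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def case_insensitive_bleu4(hypotheses, references):
    """hypotheses: list of str; references: list of lists of str (multi-reference)."""
    hyp_tokens = [h.lower().split() for h in hypotheses]
    ref_tokens = [[r.lower().split() for r in refs] for refs in references]
    smooth = SmoothingFunction().method1  # avoid zero scores on short segments
    return corpus_bleu(ref_tokens, hyp_tokens,
                       weights=(0.25, 0.25, 0.25, 0.25),
                       smoothing_function=smooth)

hyps = ["the cat sat on the mat"]
refs = [["The cat is sitting on the mat", "A cat sat on the mat"]]
print(f"BLEU-4: {case_insensitive_bleu4(hyps, refs):.4f}")
```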
Experiments
When the top 600k sentence pairs are picked out from the general-domain corpus to train machine translation systems, the systems perform better than the General-domain baseline trained on 16 million parallel sentence pairs.
Introduction
Statistical machine translation depends heavily on large scale parallel corpora.
Introduction
However, domain-specific machine translation has few parallel corpora for translation model training in the domain of interest.
Training Data Selection Methods
The translation model is a key component in statistical machine translation.
machine translation is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Tu, Mei and Zhou, Yu and Zong, Chengqing
Abstract
However, in most current statistical machine translation (SMT) systems, the outputs of compound-complex sentences still lack proper transitional expressions.
Conclusion
of machine translation.
Experiments
Footnotes: 5. http://www.speech.sri.com/projects/srilm/ ; 6. The China Workshop on Machine Translation
Introduction
During the last decade, great progress has been made on statistical machine translation (SMT) models.
Related Work
In (Xiong et al., 2013a), three different features were designed to capture the lexical cohesion for document-level machine translation.
Related Work
Xiong et al. (2013b) incorporated lexical-chain-based models (Morris and Hirst, 1991) into machine translation.
Related Work
Meyer and Popescu-Belis (2012) used sense-labeled discourse connectives for machine translation from English to French.
machine translation is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Gao, Jianfeng
Abstract
Neural network language models are often trained by optimizing likelihood, but we would prefer to optimize for a task-specific metric, such as BLEU in machine translation.
Abstract
Our best results improve a phrase-based statistical machine translation system trained on WMT 2012 French-English data by up to 2.0 BLEU, and the expected BLEU objective improves over a cross-entropy trained model by up to 0.6 BLEU in a single reference setup.
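The "expected BLEU objective" named in this abstract has, in its generic form, the shape sketched below (an n-best approximation of model-weighted sentence-level BLEU); the authors' exact objective may differ in its details.
```latex
% Generic expected-BLEU objective over an n-best list E(f) for source f,
% with reference translation e*; sBLEU is sentence-level BLEU and
% p_theta the (normalized) model distribution being trained.
\[
  \mathrm{xBLEU}(\theta) \;=\; \sum_{e \in E(f)} \mathrm{sBLEU}(e, e^{*})\; p_{\theta}(e \mid f)
\]
```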
Introduction
Neural network-based language and translation models have achieved impressive accuracy improvements on statistical machine translation tasks (Allauzen et al., 2011; Le et al., 2012b; Schwenk et al., 2012; Vaswani et al., 2013; Gao et al., 2014).
Introduction
In this paper we focus on recurrent neural network architectures which have recently advanced the state of the art in language modeling (Mikolov et al., 2010; Mikolov et al., 2011; Sundermeyer et al., 2013) with several subsequent applications in machine translation (Auli et al., 2013; Kalchbrenner and Blunsom, 2013; Hu et al., 2014).
Introduction
In practice, neural network models for machine translation are usually trained by maximizing the likelihood of the training data, either via a cross-entropy objective (Mikolov et al., 2010; Schwenk
machine translation is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Hermann, Karl Moritz and Blunsom, Phil
Corpora
While the corpus is aimed at machine translation tasks, we use the keywords associated with each talk to build a subsidiary corpus for multilingual document classification as follows.
Experiments
A similar idea exists in machine translation where English is frequently used to pivot between other languages (Cohn and Lapata, 2007).
Experiments
MT System We develop a machine translation baseline as follows.
Experiments
We train a machine translation tool on the parallel training data, using the development data of each language pair to optimize the translation system.
Related Work
It was demonstrated that this approach can be applied to improve tasks related to machine translation.
Related Work
(2013) also learned bilingual embeddings for machine translation.
machine translation is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Salloum, Wael and Elfardy, Heba and Alamir-Salloum, Linda and Habash, Nizar and Diab, Mona
Abstract
In this paper we study the use of sentence-level dialect identification in optimizing machine translation system selection when translating mixed dialect input.
Abstract
We test our approach on Arabic, a prototypical diglossic language; and we optimize the combination of four different machine translation systems.
Introduction
For statistical machine translation (MT), which relies on the existence of parallel data, translating from nonstandard dialects is a challenge.
Machine Translation Experiments
We use the open-source Moses toolkit (Koehn et al., 2007) to build four Arabic-English phrase-based statistical machine translation systems (SMT).
Related Work
Arabic Dialect Machine Translation.
Related Work
System Selection and Combination in Machine Translation.
machine translation is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Chen, Yanqing and Skiena, Steven
Knowledge Graph Construction
Machine Translation - We script the Google translation API to get even more semantic links.
Knowledge Graph Construction
In total, machine translation provides 53.2% of the total links and establishes connections between 3.5 million vertices.
Related Work
The ready availability of machine translation to and from English has prompted efforts to employ translation for sentiment analysis (Bautin et al., 2008).
Related Work
(2008) demonstrate that machine translation can perform quite well when extending subjectivity analysis to a multilingual environment, which makes it appealing to replicate their work on lexicon-based sentiment analysis.
Related Work
(2013) combine machine translation and word representation to generate bilingual language resources.
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
van Gompel, Maarten and van den Bosch, Antal
Data preparation
This is done using the scripts provided by the Statistical Machine Translation system Moses (Koehn et al., 2007).
Evaluation
In addition to these, the system’s output can be compared against the L2 reference translation(s) using established Machine Translation evaluation metrics.
Introduction
Whereas machine translation generally concerns the translation of whole sentences or texts from one language to the other, this study focusses on the translation of native language (henceforth L1) words and phrases, i.e.
Introduction
the role of the translation model in Statistical Machine Translation (SMT).
System
It has also been used in machine translation studies in which local source context is used to classify source phrases into target phrases, rather than looking them up in a phrase table (Stroppa et al., 2007; Haque et al., 2011).
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wang, Xiaolin and Utiyama, Masao and Finch, Andrew and Sumita, Eiichiro
Abstract
Unsupervised word segmentation (UWS) can provide domain-adaptive segmentation for statistical machine translation (SMT) without annotated data, and bilingual UWS can even optimize segmentation for alignment.
Complexity Analysis
The first bilingual corpus, OpenMT06, was used in the NIST Open Machine Translation 2006 Evaluation.
Complexity Analysis
PatentMT9 is from the shared task of NTCIR-9 patent machine translation.
Complexity Analysis
For the bilingual tasks, the publicly available Moses system (Koehn et al., 2007) with default settings was employed to perform machine translation, and BLEU (Papineni et al., 2002) was used to evaluate the quality.
Introduction
For example, in machine translation , there are various parallel corpora such as
machine translation is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Heilman, Michael and Cahill, Aoife and Madnani, Nitin and Lopez, Melissa and Mulholland, Matthew and Tetreault, Joel
Abstract
Automated methods for identifying whether sentences are grammatical have various potential applications (e.g., machine translation , automated essay scoring, computer-assisted language learning).
Introduction
Such a system could be used, for example, to check or to rank outputs from systems for text summarization, natural language generation, or machine translation .
Introduction
While some applications (e.g., grammar checking) rely on such fine-grained predictions, others might be better addressed by sentence-level grammaticality judgments (e.g., machine translation evaluation).
Introduction
grammaticality of machine translation outputs (Gamon et al., 2005; Parton et al., 2011), such as the MT Quality Estimation Shared Tasks (Bojar et al., 2013, §6), but relatively little on evaluating the grammaticality of naturally occurring text.
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Devlin, Jacob and Zbib, Rabih and Huang, Zhongqiang and Lamar, Thomas and Schwartz, Richard and Makhoul, John
Introduction
Initially, these models were primarily used to create n-gram neural network language models (NNLMs) for speech recognition and machine translation (Bengio et al., 2003; Schwenk, 2010).
Introduction
Unlike previous approaches to joint modeling (Le et al., 2012), our feature can be easily integrated into any statistical machine translation (SMT) decoder, which leads to substantially larger improvements than k-best rescoring only.
Model Variations
We have described a novel formulation for a neural network-based machine translation joint model, along with several simple variations of this model.
Model Variations
One of the biggest goals of this work is to quell any remaining doubts about the utility of neural networks in machine translation.
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Riezler, Stefan and Simianer, Patrick and Haas, Carolin
Abstract
We propose a novel learning approach for statistical machine translation (SMT) that allows the extraction of supervision signals for structured learning from an extrinsic response to a translation input.
Introduction
In this paper, we propose a novel approach for learning and evaluation in statistical machine translation (SMT) that borrows ideas from response-based learning for grounded semantic parsing.
Introduction
We suggest that in a similar way the preservation of meaning in machine translation should be defined in the context of an interaction in an extrinsic task.
Related Work
Interactive scenarios have been used for evaluation purposes of translation systems for nearly 50 years, especially using human reading comprehension testing (Pfafflin, 1965; Fuji, 1999; Jones et al., 2005), and more recently, using face-to-face conversation mediated via machine translation (Sakamoto et al., 2013).
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro
Introduction
Automatic word alignment is an important task for statistical machine translation.
Related Work
Recently, FFNNs have been applied successfully to several tasks, such as speech recognition (Dahl et al., 2012), statistical machine translation (Le et al., 2012; Vaswani et al., 2013), and other popular natural language processing tasks (Collobert and Weston, 2008; Collobert et al., 2011).
Training
5.4 Machine Translation Results
Training
Our experiments have shown that the proposed model outperforms the FFNN-based model (Yang et al., 2013) for word alignment and machine translation, and that the agreement constraint improves alignment performance.
machine translation is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zeng, Xiaodong and Chao, Lidia S. and Wong, Derek F. and Trancoso, Isabel and Tian, Liang
Abstract
This study investigates building a better Chinese word segmentation model for statistical machine translation.
Abstract
The experiments on a Chinese-to-English machine translation task reveal that the proposed model can bring positive segmentation effects to translation quality.
Introduction
Empirical work shows that word segmentation can be beneficial to Chinese-to-English statistical machine translation (SMT) (Xu et al., 2005; Chang et al., 2008; Zhao et al., 2013).
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Yıldız, Olcay Taner and Solak, Ercan and Görgün, Onur and Ehsani, Razieh
Abstract
In this paper, we report our preliminary efforts in building an English-Turkish parallel treebank corpus for statistical machine translation .
Introduction
For example, the EuroParl corpus (Koehn, 2002), one of the biggest parallel corpora in statistical machine translation, contains 22 languages (but not Turkish).
Introduction
In this study, we report our preliminary efforts in constructing an English-Turkish parallel treebank corpus for statistical machine translation .
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Xu, Jian-Ming and Ittycheriah, Abraham and Roukos, Salim
Abstract
We present an adaptive translation quality estimation (QE) method to predict the human-targeted translation error rate (HTER) for a document-specific machine translation model.
Introduction
Machine translation (MT) systems suffer from inconsistent and unstable translation quality.
Related Work
There has been a long history of study in confidence estimation of machine translation.
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Abstract
Statistical Machine Translation (SMT) usually utilizes contextual information to disambiguate translation candidates.
Experiments
We evaluate the performance of our neural network based topic similarity model on a Chinese-to-English machine translation task.
Introduction
Making translation decisions is a difficult task in many Statistical Machine Translation (SMT) systems.
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Cai, Jingsheng and Utiyama, Masao and Sumita, Eiichiro and Zhang, Yujie
Abstract
In statistical machine translation (SMT), syntax-based pre-ordering of the source language is an effective method for dealing with language pairs where there are great differences in their respective word orders.
Introduction
This is especially important for the system combination of PBSMT systems, because the diversity of outputs from machine translation systems is important for system combination (Cer et al., 2013).
Introduction
By using both our rules and Wang et al.’s rules, one can obtain diverse machine translation results because the pre-ordering results of these two rule sets are generally different.
machine translation is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: