Introduction | Section 3 introduces two probabilistic models, based on conditional and joint probability distributions respectively, for integrating translations and transliterations into a translation model.
Our Approach | Both of our models combine a character-based transliteration model with a word-based translation model . |
Our Approach | Language Model for Unknown Words: Our model generates transliterations that can be known or unknown to the language model and the translation model . |
Our Approach | We refer to the words known to the language model and to the translation model as LM-known and TM-known words respectively and to words that are unknown as LM-unknown and TM-unknown respectively. |
Previous Work | Moreover, they are working with a large bitext so they can rely on their translation model and only need to transliterate NEs and OOVs. |
Previous Work | Our translation model is based on data which is both sparse and noisy. |
Introduction | In this work, we attempt to learn statistical translation models from only monolingual data in the source and target language. |
Introduction | This work is a big step towards large-scale and large-vocabulary unsupervised training of statistical translation models . |
Introduction | In this work, we will develop, describe, and evaluate methods for large vocabulary unsupervised learning of machine translation models suitable for real-world tasks. |
Related Work | Their best performing approach uses an EM algorithm to train a generative word-based translation model.
Translation Model | In this section, we describe the statistical training criterion and the translation model that is trained using monolingual data. |
Translation Model | As training criterion for the translation model's parameters θ, Ravi and Knight (2011) suggest
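The formula itself is missing from the extract; as a point of reference, the criterion of Ravi and Knight (2011), reconstructed here from the standard decipherment setup (so treat the exact notation as an assumption), chooses the parameters that maximize the likelihood of the observed foreign text under the composed language and channel models:

θ̂ = argmax_θ ∏_f Σ_e P(e) · P_θ(f|e)

where P(e) is a fixed target-side language model, P_θ(f|e) is the translation model being learned, and the product runs over the monolingual foreign sentences f.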
Translation Model | This becomes increasingly difficult with more complex translation models . |
Abstract | Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate sentences in a new domain. |
Baselines | Log-linear translation model (TM) mixtures are of the form: |
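The mixture formula is cut off in the extract; a standard form for a log-linear TM mixture, offered here as a hedged reconstruction rather than the paper's exact equation, combines the component models with per-model weights λ_i:

p(ē|f̄) ∝ ∏_i p_i(ē|f̄)^{λ_i}

i.e., a weighted product of the component translation models, or equivalently a weighted sum of their log-probabilities.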
Ensemble Decoding | Given a number of translation models which are already trained and tuned, the ensemble decoder uses hypotheses constructed from all of the models in order to translate a sentence. |
Introduction | Common techniques for model adaptation adapt two main components of contemporary state-of-the-art SMT systems: the language model and the translation model . |
Introduction | translation model adaptation, because various measures such as perplexity of adapted language models can be easily computed on data in the target domain. |
Introduction | It is also easier to obtain monolingual data in the target domain, compared to bilingual data which is required for translation model adaptation. |
Abstract | This paper proposes a new discriminative training method in constructing phrase and lexicon translation models . |
Abstract | Parameters in the phrase and lexicon translation models are estimated by relative frequency or by maximizing joint likelihood, which may not correspond closely to the translation measure, e.g., bilingual evaluation understudy (BLEU) (Papineni et al., 2002).
Abstract | However, the number of parameters in common phrase and lexicon translation models is much larger. |
Abstract | In this paper, we propose two discriminative, feature-based models to exploit predicate-argument structures for statistical machine translation: 1) a predicate translation model and 2) an argument reordering model. |
Abstract | The predicate translation model explores lexical and semantic contexts surrounding a verbal predicate to select desirable translations for the predicate. |
Introduction | This suggests that conventional lexical and phrasal translation models adopted in those SMT systems are not sufficient to correctly translate predicates in source sentences.
Introduction | Thus we propose a discriminative, feature-based predicate translation model that captures not only lexical information (i.e., surrounding words) but also high-level semantic contexts to correctly translate predicates.
Introduction | In Sections 3 and 4, we elaborate on the proposed predicate translation model and argument reordering model respectively, including details about modeling, features and training procedure.
Predicate Translation Model | In this section, we present the features and the training process of the predicate translation model . |
Predicate Translation Model | Following the context-dependent word models in (Berger et al., 1996), we propose a discriminative predicate translation model . |
Predicate Translation Model | Given a source sentence which contains N verbal predicates, our predicate translation model M_t can be denoted as
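The equation for M_t is elided in the extract; given the stated lineage from the maximum entropy models of Berger et al. (1996), a plausible per-predicate form (an assumption for illustration, not the paper's exact formulation) is

P(t|C(v)) = exp(Σ_i λ_i f_i(t, C(v))) / Σ_{t′} exp(Σ_i λ_i f_i(t′, C(v)))

where C(v) is the lexical and semantic context of predicate v, the f_i are feature functions, and the sentence-level model would multiply such factors over the N predicates.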
Related Work | Our predicate translation model is also related to previous discriminative lexicon translation models (Berger et al., 1996; Venkatapathy and Bangalore, 2007; Mauser et al., 2009).
Related Work | This will tremendously reduce the amount of training data required, which usually is a problem in discriminative lexicon translation models (Mauser et al., 2009).
Related Work | Furthermore, the proposed translation model also differs from previous lexicon translation models in that we use both lexical and semantic features.
Abstract | We frame the MT problem as a decipherment task, treating the foreign text as a cipher for English and present novel methods for training translation models from nonparallel text. |
Introduction | From these corpora, we estimate translation model parameters: word-to-word translation tables, fertilities, distortion parameters, phrase tables, syntactic transformations, etc. |
Introduction | In this paper, we address the problem of learning a full translation model from nonparallel data, and we use the
Introduction | How can we learn a translation model from nonparallel data? |
Machine Translation as a Decipherment Task | Probabilistic decipherment: Unlike parallel training, here we have to estimate the translation model P_θ(f|e) parameters using only monolingual data.
Machine Translation as a Decipherment Task | We then estimate parameters of the translation model P_θ(f|e) during training.
Machine Translation as a Decipherment Task | EM Decipherment: We propose a new translation model for MT decipherment which can be efficiently trained using the EM algorithm. |
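To make the EM decipherment idea concrete, here is a minimal, self-contained sketch. Everything below is an illustrative assumption, not the papers' actual setup: the data is a toy word-substitution cipher, and both models are unigram, whereas real systems use n-gram language models over whole sequences to break the symmetry that a unigram setup leaves unresolved.

```python
# Minimal sketch of EM training for a channel model P_theta(f|e) from
# monolingual data only. Vocabulary, cipher corpus, and LM are toy values.
from collections import defaultdict

plain_vocab = ["the", "cat", "dog"]
cipher_corpus = ["X", "Y", "X", "Z", "X", "Y"]   # monolingual "foreign" tokens
lm = {"the": 0.5, "cat": 0.3, "dog": 0.2}        # fixed language model P(e)

# Initialize the channel model P(f|e) uniformly.
t = {e: {f: 1.0 / 3 for f in "XYZ"} for e in plain_vocab}

for _ in range(20):                              # EM iterations
    counts = defaultdict(float)
    for f in cipher_corpus:
        # E-step: posterior over hidden plaintext e for observed token f,
        # proportional to P(e) * P(f|e).
        z = sum(lm[e] * t[e][f] for e in plain_vocab)
        for e in plain_vocab:
            counts[(e, f)] += lm[e] * t[e][f] / z
    # M-step: renormalize expected counts into a new P(f|e).
    for e in plain_vocab:
        total = sum(counts[(e, f)] for f in "XYZ")
        for f in "XYZ":
            t[e][f] = counts[(e, f)] / total

print(max(t["the"], key=t["the"].get))           # most likely cipher symbol for "the"
```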
Abstract | additive neural networks, for SMT to go beyond the log-linear translation model . |
Abstract | Our model outperforms the log-linear translation models with/without embedding features on Chinese-to-English and Japanese-to-English translation tasks.
Introduction | On the one hand, features are required to be linear with respect to the objective of the translation model (Nguyen et al., 2007), but it is not guaranteed that the potential features are linear with the model.
Introduction | In the search procedure, the model score must be computed frequently for the search heuristic function, which challenges decoding efficiency for a neural network based translation model.
Introduction | The biggest contribution of this paper is that it goes beyond the log-linear model and proposes a nonlinear translation model instead of a re-ranking model (Duh and Kirchhoff, 2008; Sokolov et al., 2012).
Abstract | Following a probabilistic decipherment approach, we first introduce a new framework for decipherment training that is flexible enough to incorporate any number/type of features (besides simple bag-of-words) as side-information used for estimating translation models . |
Decipherment Model for Machine Translation | Contrary to standard machine translation training scenarios, here we have to estimate the translation model P_θ(f|e) parameters using only monolingual data.
Decipherment Model for Machine Translation | We then estimate the parameters of the translation model P_θ(f|e) during training.
Decipherment Model for Machine Translation | Translation Model: Machine translation is a much more complex task than solving other decipherment tasks such as word substitution ciphers (Ravi and Knight, 2011b; Dou and Knight, 2012).
Introduction | The parallel corpora are used to estimate translation model parameters involving word-to-word translation tables, fertilities, distortion, phrase translations, syntactic transformations, etc. |
Introduction | Learning translation models from monolingual corpora could help address the challenges faced by modern-day MT systems, especially for low resource language pairs.
Introduction | Recently, this topic has been receiving increasing attention from researchers and new methods have been proposed to train statistical machine translation models using only monolingual data in the source and target language. |
Models 2.1 Baseline Models | Our second baseline is a factored translation model (Koehn and Hoang, 2007) (called Factored), which used the word, “stem” and suffix as factors.
Models 2.1 Baseline Models | To compare against the performance of unsupervised segmentation for translation, our third baseline is a segmented translation model based on a supervised segmentation model (called Sup), using the hand-built Omorfi morphological analyzer (Pirinen and Listenmaa, 2007), which provided slightly higher BLEU scores than the word-based baseline.
Models 2.1 Baseline Models | For segmented translation models , it cannot be taken for granted that greater linguistic accuracy in segmentation yields improved translation (Chang et al., 2008). |
Translation and Morphology | Our main contributions are: 1) the introduction of the notion of segmented translation where we explicitly allow phrase pairs that can end with a dangling morpheme, which can connect with other morphemes as part of the translation process, and 2) the use of a fully segmented translation model in combination with a postprocessing morpheme prediction system, using unsupervised morphology induction. |
Translation and Morphology | Morphology can express both content and function categories, and our experiments show that it is important to use morphology both within the translation model (for morphology with content) and outside it (for morphology contributing to fluency). |
Abstract | We present an architecture that delays the computation of translation model features until decoding, allowing for the application of mixture-modeling techniques at decoding time. |
Abstract | Experimental results on two language pairs demonstrate the effectiveness of both our translation model architecture and automatic clustering, with gains of up to 1 BLEU over unadapted systems and single-domain adaptation. |
Introduction | We introduce a translation model architecture that delays the computation of features to the decoding phase. |
Related Work | (Ortiz-Martinez et al., 2010) delay the computation of translation model features for the purpose of interactive machine translation with online training. |
Related Work | (Sennrich, 2012b) perform instance weighting of translation models , based on the sufficient statistics. |
Related Work | (Razmara et al., 2012) describe an ensemble decoding framework which combines several translation models in the decoding step. |
Translation Model Architecture | This section covers the architecture of the multi-domain translation model framework. |
Translation Model Architecture | Our translation model is embedded in a log-linear model as is common for SMT, and treated as a single translation model in this log-linear combination. |
Translation Model Architecture | The architecture has two goals: move the calculation of translation model features to the decoding phase, and allow for multiple knowledge sources (e.g. bitexts or user-provided data) to contribute to their calculation.
Abstract | State-of-the-art approaches address these issues by implicitly expanding the queried questions with additional words or phrases using monolingual translation models . |
Experiments | Rows 3 and 6 are monolingual translation models that address the word mismatch problem and obtained the state-of-the-art performance in previous work.
Experiments | Row 3 is the word-based translation model (Jeon et al., 2005), and row 4 is the word-based translation language model, which linearly combines the word-based translation model and language model into a unified framework (Xue et al., 2008). |
Experiments | Row 5 is the phrase-based translation model , which translates a sequence of words as whole (Zhou et al., 2011). |
Introduction | Researchers have proposed the use of word-based translation models (Berger et al., 2000; Jeon et al., 2005; Xue et al., 2008; Lee et al., 2008; Bernhard and Gurevych, 2009) to solve the word mismatch problem. |
Introduction | As a principled approach to capture semantic word relations, word-based translation models are built by using the IBM model 1 (Brown et al., 1993) and have been shown to outperform traditional models (e.g., VSM, BM25, LM) for question retrieval.
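As an illustration of how such a word-based translation model is typically used to rank candidates in question retrieval, here is a small sketch in the spirit of this line of work; the mixture weight, the toy translation table, and the exact smoothing are assumptions for illustration, not any one paper's formulation.

```python
# Hedged sketch of translation-based retrieval scoring: P(query | candidate)
# as a product over query words of a translation/language-model mixture.

def retrieval_score(query, candidate, p_trans, beta=0.8):
    """query, candidate: lists of tokens; p_trans[t][w] ~ P(w | t)."""
    score = 1.0
    n = len(candidate)
    for w in query:
        # Translation part: sum over candidate words t of P(w|t) * P(t|candidate).
        p_tr = sum(p_trans.get(t, {}).get(w, 0.0) / n for t in candidate)
        # Mix with a maximum-likelihood candidate LM to catch exact matches.
        p_ml = candidate.count(w) / n
        score *= beta * p_tr + (1 - beta) * p_ml
    return score

p_trans = {"car": {"automobile": 0.4, "car": 0.5},
           "buy": {"purchase": 0.6, "buy": 0.3}}        # made-up probabilities
print(retrieval_score(["automobile", "purchase"], ["buy", "car"], p_trans))
```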
Introduction | Zhou et al. (2011) proposed the phrase-based translation models for question and answer retrieval.
Conclusion and Future Work | Experimental results show that our approach is promising for SMT systems to learn a better translation model . |
Experiments | Translation models are trained over the parallel data that is automatically word-aligned |
Experiments | This implementation makes the system perform much better and the translation model size is much smaller. |
Introduction | Current translation modeling approaches usually use context dependent information to disambiguate translation candidates. |
Introduction | Therefore, it is important to leverage topic information to learn smarter translation models and achieve better translation performance. |
Introduction | Attempts on topic-based translation modeling include topic-specific lexicon translation models (Zhao and Xing, 2006; Zhao and Xing, 2007), topic similarity models for synchronous rules (Xiao et al., 2012), and document-level translation with topic coherence (Xiong and Zhang, 2013). |
Related Work | Following this work, (Xiao et al., 2012) extended topic-specific lexicon translation models to hierarchical phrase-based translation models , where the topic information of synchronous rules was directly inferred with the help of document-level information. |
Related Work | They incorporated the bilingual topic information into language model adaptation and lexicon translation model adaptation, achieving significant improvements in the large-scale evaluation. |
Related Work | They estimated phrase-topic distributions in translation model adaptation and generated better translation quality. |
Topic Similarity Model with Neural Network | Therefore, it helps to train a smarter translation model with the embedded topic information. |
Topic Similarity Model with Neural Network | Standard features: Translation model , including translation probabilities and lexical weights for both directions (4 features), 5-gram language model (1 feature), word count (1 feature), phrase count (1 feature), NULL penalty (1 feature), number of hierarchical rules used (1 feature). |
Abstract | This paper proposes a forest-based tree sequence to string translation model for syntax-based statistical machine translation, which automatically learns tree sequence to string translation rules from word-aligned source-side-parsed bilingual texts. |
Abstract | The proposed model leverages on the strengths of both tree sequence-based and forest-based translation models . |
Decoding | 4) Decode the translation forest using our translation model and a dynamic search algorithm. |
Forest-based tree sequence to string model | 3.3 Forest-based tree-sequence to string translation model |
Forest-based tree sequence to string model | Given a source forest F and target translation T as well as word alignment A, our translation model is formulated as:
Introduction | Forest-based Tree Sequence to String Translation Model
Introduction | Section 2 describes related work while section 3 defines our translation model . |
Related work | Motivated by the fact that non-syntactic phrases make a nontrivial contribution to phrase-based SMT, the tree sequence-based translation model was proposed (Liu et al., 2007; Zhang et al., 2008a), which uses a tree sequence as the basic translation unit, rather than a single subtree as in the STSG.
Related work | Liu et al. (2007) propose the tree sequence concept and design a tree sequence to string translation model.
Related work | Zhang et al. (2008a) propose a tree sequence-based tree-to-tree translation model and Zhang et al.
Abstract | They can be obtained by training statistical translation models on parallel monolingual corpora, such as question-answer pairs, where answers act as the “source” language and questions as the “target” language. |
Abstract | We compare monolingual translation models built from lexical semantic resources with two other kinds of datasets: manually-tagged question reformulations and question-answer pairs. |
Introduction | Berger and Lafferty (1999) have formulated a further solution to the lexical gap problem consisting in integrating monolingual statistical translation models in the retrieval process. |
Introduction | Monolingual translation models encode statistical word associations which are trained on parallel monolingual corpora. |
Introduction | While collection-specific translation models effectively encode statistical word associations for the target document collection, they also introduce a bias in the evaluation and make it difficult to assess the quality of the translation model per se, independently from a specific task and document collection.
Abstract | In this paper, we propose a sense-based translation model to integrate word senses into statistical machine translation. |
Abstract | The proposed sense-based translation model enables the decoder to select appropriate translations for source words according to the inferred senses for these words using maximum entropy classifiers. |
Abstract | We test the effectiveness of the proposed sense-based translation model on a large-scale Chinese-to-English translation task. |
Introduction | These glosses, used as the sense predictions of their WSD system, are integrated into a word-based SMT system either to substitute for translation candidates of their translation model or to postedit the output of their SMT system. |
Introduction | In order to incorporate word senses into SMT, we propose a sense-based translation model that is built on maximum entropy classifiers. |
Introduction | We collect training instances from the sense-tagged training data to train the proposed sense-based translation model . |
A Skeleton-based Approach to MT 2.1 Skeleton Identification | To compute g(d), we use a linear combination of a skeleton translation model g_skel(d) and a full translation model g_full(d):
A Skeleton-based Approach to MT 2.1 Skeleton Identification | g(d) = g_skel(d) + g_full(d) (3) where the skeleton translation model handles the translation of the sentence skeleton, while the full translation model is the baseline model and handles the original problem of translating the whole sentence.
A Skeleton-based Approach to MT 2.1 Skeleton Identification | The skeleton translation model focuses on the translation of the sentence skeleton, i.e., the solid (red) rectangles; while the full translation model computes the model score for all those phrase-pairs, i.e., all solid and dashed rectangles. |
Abstract | The basic idea is that we translate the key elements of the input sentence using a skeleton translation model, and then cover the remain segments using a full translation model . |
Introduction | Note that the source-language structural information has been intensively investigated in recent studies of syntactic translation models . |
Introduction | We develop a skeleton-based model which divides translation into two sub-models: a skeleton translation model (i.e., translating the key elements) and a full translation model (i.e., translating the remaining source words and generating the complete translation).
Abstract | This paper presents a translation model that is based on tree sequence alignment, where a tree sequence refers to a single sequence of sub-trees that covers a phrase. |
Introduction | Tree-to-Tree Translation Model
Introduction | In this paper, we propose a tree-to-tree translation model that is based on tree sequence alignment. |
Related Work | Ding and Palmer (2005) propose a syntax-based translation model based on a probabilistic synchronous dependency insertion grammar. |
Related Work | Quirk et al. (2005) propose a dependency treelet-based translation model.
Related Work | (2007b) present a STSG-based tree-to-tree translation model . |
Tree Sequence Alignment Model | 3.2 Tree Sequence Translation Model |
Tree Sequence Alignment Model | The tree sequence-to-tree sequence translation model is formulated as:
Abstract | We use translation models and language models to exploit lexical correlations and the character of solution posts, respectively.
Our Approach | We model the lexical correlation and the character of solution posts using regularized translation models and unigram language models, respectively.
Our Approach | The usage of translation models in QA retrieval (Xue et al., 2008; Singh, 2012) and segmentation (Deepak et al., 2012) were also motivated by the correlation assumption. |
Our Approach | We use an IBM Model 1 translation model (Brown et al., 1990) in our technique; simplistically, such a model m may be thought of as a 2-d associative array where the value m[w_1][w_2] is directly related to the probability of w_1 occurring in the problem when w_2 occurs in the solution.
Our Approach | Consider a unigram language model S that models the lexical characteristics of solution posts, and a translation model T that models the lexical correlation between problems and solutions.
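A minimal sketch of the two components just described, with the 2-d associative array realized as a dict of dicts; all probabilities and the mixture weight below are made up for illustration, not the paper's actual values.

```python
# m[w1][w2]: translation evidence relating problem word w1 to solution word w2.
m = {
    "crash": {"reboot": 0.30, "restart": 0.25, "login": 0.05},
    "login": {"password": 0.40, "reset": 0.20},
}
# Unigram LM over solution-post words (illustrative, assumed normalized).
lm = {"reboot": 0.2, "restart": 0.1, "password": 0.3, "reset": 0.1, "login": 0.3}

def solution_score(problem_words, solution_words, alpha=0.7):
    """Score a candidate solution by mixing translation evidence (T) with the
    solution-post language model (S), in the spirit of the models above."""
    score = 1.0
    for w2 in solution_words:
        corr = sum(m.get(w1, {}).get(w2, 0.0) for w1 in problem_words)
        score *= alpha * corr + (1 - alpha) * lm.get(w2, 0.0)
    return score

print(solution_score(["crash"], ["reboot", "password"]))
```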
Related Work | Usage of translation models for modeling the correlation between textual problems and solutions has been explored earlier, starting from the answer retrieval work in (Xue et al., 2008), where new queries were conceptually expanded using the translation model to improve retrieval.
Related Work | Translation models were also seen to be useful in segmenting incident reports into the problem and solution parts (Deepak et al., 2012); we will use an adaptation of the generative model presented therein, for our solution extraction formulation. |
Related Work | Entity-level translation models |
Abstract | By contrast, we argue that the relevance between a sentence pair and target domain can be better evaluated by the combination of language model and translation model . |
Abstract | In this paper, we study and experiment with novel methods that apply translation models to domain-relevant data selection.
Introduction | The corpora are necessary prior knowledge for training an effective translation model.
Introduction | However, domain-specific machine translation has few parallel corpora for translation model training in the domain of interest. |
Introduction | To overcome the problem, we first propose a method combining the translation model with the language model in data selection.
Related Work | Thus, we propose novel methods which are based on translation model and language model for data selection. |
Training Data Selection Methods | We present three data selection methods for ranking and selecting domain-relevant sentence pairs from general-domain corpus, with an eye towards improving domain-specific translation model performance. |
Training Data Selection Methods | These methods are based on language model and translation model , which are trained on small in-domain parallel data. |
Training Data Selection Methods | 3.1 Data Selection with Translation Model |
Experiments | For the synthetic method, we used the ES translation model to translate the English part of the CE corpus to Spanish to construct a synthetic corpus. |
Experiments | And we also used the BTEC CE1 corpus to build an EC translation model to translate the English part of the ES corpus into Chinese.
Experiments | Then we combined these two synthetic corpora to build a Chinese-Spanish translation model . |
Introduction | It multiplies corresponding translation probabilities and lexical weights in source-pivot and pivot-target translation models to induce a new source-target phrase table.
Introduction | For example, we can obtain a source-target corpus by translating the pivot sentences in the source-pivot corpus into the target language with pivot-target translation models.
Pivot Methods for Phrase-based SMT | Following the method described in Wu and Wang (2007), we train the source-pivot and pivot-target translation models using the source-pivot and pivot-target corpora, respectively. |
Pivot Methods for Phrase-based SMT | Based on these two models, we induce a source-target translation model , in which two important elements need to be induced: phrase translation probability and lexical weight. |
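The induction of the phrase translation probability can be sketched as marginalizing over shared pivot phrases; the code below is a toy illustration of that triangulation step (made-up phrase tables, ignoring lexical weights and pruning).

```python
# Sketch of phrase-table triangulation through a pivot language.
# p_sp: P(pivot | source); p_pt: P(target | pivot). All values are made up.
p_sp = {"maison": {"house": 0.7, "home": 0.3}}                        # source -> pivot
p_pt = {"house": {"casa": 0.9}, "home": {"casa": 0.6, "hogar": 0.4}}  # pivot -> target

def triangulate(p_sp, p_pt):
    """Induce P(target | source) by summing over shared pivot phrases p:
    p(t|s) = sum_p p(t|p) * p(p|s)."""
    p_st = {}
    for s, pivots in p_sp.items():
        p_st[s] = {}
        for p, prob_ps in pivots.items():
            for t, prob_tp in p_pt.get(p, {}).items():
                p_st[s][t] = p_st[s].get(t, 0.0) + prob_tp * prob_ps
    return p_st

print(triangulate(p_sp, p_pt))   # {'maison': {'casa': 0.81, 'hogar': 0.12}}
```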
Pivot Methods for Phrase-based SMT | Then we build a source-target translation model using this corpus. |
Using RBMT Systems for Pivot Translation | The source-target pairs extracted from this synthetic multilingual corpus can be used to build a source-target translation model . |
Abstract | Current SMT systems usually decode with single translation models and cannot benefit from the strengths of other models in decoding phase. |
Abstract | We instead propose joint decoding, a method that combines multiple translation models in one decoder. |
Conclusion | We have presented a framework for including multiple translation models in one decoder. |
Introduction | In this paper, we propose a framework for combining multiple translation models directly in decoding.
Joint Decoding | Second, translation models differ in decoding algorithms. |
Joint Decoding | Despite the diversity of translation models , they all have to produce partial translations for substrings of input sentences. |
Joint Decoding | Therefore, we represent the search space of a translation model as a structure called translation hypergraph. |
Abstract | In this paper, instead of designing new features based on intuition, linguistic knowledge and domain, we learn some new and effective features using the deep auto-encoder (DAE) paradigm for the phrase-based translation model.
Experiments and Results | The baseline translation models are generated by Moses with default parameter settings. |
Input Features for DNN Feature Learning | The phrase-based translation model (Koehn et al., 2003; Och and Ney, 2004) has demonstrated superior performance and been widely used in current SMT systems, and we employ our implementation on this translation model . |
Input Features for DNN Feature Learning | This corpus is used to train the translation model in our experiments, and we will describe it in detail in section 5.1.
Introduction | al., 2010), and speech spectrograms (Deng et al., 2010), we propose new feature learning using semi-supervised DAE for phrase-based translation model . |
Related Work | (2012) improved the translation quality of an n-gram translation model by using a bilingual neural LM, where translation probabilities are estimated using a continuous representation of translation units in lieu of standard discrete representations.
Related Work | Kalchbrenner and Blunsom (2013) introduced recurrent continuous translation models that comprise a class of purely continuous sentence-level translation models.
Related Work | (2013) presented a joint language and translation model based on a recurrent neural network which predicts target words based on an unbounded history of both source and target words. |
Semi-Supervised Deep Auto-encoder Features Learning for SMT | Each translation rule in the phrase-based translation model has a set number of features that are combined in the log-linear model (Och and Ney, 2002), and our semi-supervised DAE features can also be combined in this model. |
Conclusion and Future Work | In this paper, we have presented a unified reordering framework to incorporate soft linguistic constraints (of syntactic or semantic nature) into the HPB translation model . |
Experiments | Our basic baseline system employs 19 basic features: a language model feature, 7 translation model features, word penalty, unknown word penalty, the glue rule, date, number and 6 pass-through features. |
HPB Translation Model: an Overview | Each such rule is associated with a set of translation model features {φ_i}, such as the phrase translation probability p(α|γ) and its inverse p(γ|α), the lexical translation probability p_lex(α|γ) and its inverse p_lex(γ|α), and a rule penalty that affects preference for longer or shorter derivations.
Introduction | The popular distortion or lexicalized reordering models in phrase-based SMT make good local predictions by focusing on reordering on word level, while the synchronous context free grammars in hierarchical phrase-based (HPB) translation models are capable of handling nonlocal reordering on the translation phrase level. |
Introduction | The general ideas, however, are applicable to other translation models , e.g., phrase-based model, as well. |
Introduction | Section 2 provides an overview of the HPB translation model.
Related Work | Both are close to our work; however, our model generates reordering features that are integrated into the log-linear translation model during decoding. |
Related Work | In the soft constraint or reordering model approach, Liu and Gildea (2010) modeled the reordering/deletion of source-side semantic roles in a tree-to-string translation model . |
Unified Linguistic Reordering Models | For models with syntactic reordering, we add two new features (i.e., one for the leftmost reordering model and the other for the rightmost reordering model) into the log-linear translation model in Eq. |
Unified Linguistic Reordering Models | For the semantic reordering models, we also add two new features into the log-linear translation model . |
A semantic span can include one or more eus. | 1) CSS-based translation model: following formula (1), we obtain the cohesion information by modifying the translation rules with their probabilities P(e_s|f_t) based on word alignment.
A semantic span can include one or more eus. | 3.1 CSS-based Translation Model |
A semantic span can include one or more eus. | For the existing translation models, the entire training process is conducted at the lexical or syntactic level without grammatical cohesion information.
Abstract | Our models include a CSS-based translation model , which generates new CSS-based translation rules, and a generative transfer model, which encourages producing transitional expressions during decoding. |
Conclusion | In the future, we will extend our methods to other translation models , such as the syntax-based model, to study how to further improve the performance of SMT systems. |
Experiments | The bilingual training data for the translation model and the CSS-based transfer model is the FBIS corpus, with approximately 7.1 million Chinese words and 9.2 million English words.
Experiments | For this work, we use an in-house decoder to build the SMT baseline; it combines the hierarchical phrase-based translation model (Chiang, 2005; Chiang, 2007) with the BTG (Wu, 1996) reordering model (Xiong et al., 2006; Zens and Ney, 2006; He et al., 2010). |
Experiments | To further evaluate the effectiveness of the proposed models, we also conducted an experiment on a larger set of bilingual training data from the LDC corpus for the translation model and transfer model.
Introduction | One is a new translation model that is utilized to generate new translation rules combined with the information of source functional relationships. |
Abstract | We present an adaptive translation quality estimation (QE) method to predict the human-targeted translation error rate (HTER) for a document-specific machine translation model . |
Adaptive MT Quality Estimation | Therefore it is necessary to build a QE regression model that is robust to different document-specific translation models.
Document-specific MT System | Building a general MT system using all the parallel data not only produces a huge translation model (unless very aggressive pruning is applied); the performance on the given input document is also suboptimal due to the unwanted dominance of out-of-domain data.
Document-specific MT System | Here we adopt the same strategy, building a document-specific translation model for each input document. |
Experiments | In a typical MT QE scenario, the QE model is pre-trained and applied to various MT outputs, even though the QE training data and MT outputs are generated from different translation models . |
Experiments | We train the static QE model with this training set, including the source sentences, references and MT outputs (from multiple translation models ). |
Experiments | To train the adaptive QE model for each test document, we build a translation model whose subsampling data includes source sentences from both the test document and the QE training data. |
Introduction | First, existing approaches to MT quality estimation rely on lexical and syntactical features defined over parallel sentence pairs, which includes source sentences, MT outputs and references, and translation models (Blatz et al., 2004; Ueffing and Ney, 2007; Specia et al., 2009a; Xiong et al., 2010; Soricut and Echihabi, 2010a; Bach et al., 2011). |
Static MT Quality Estimation | derived from a Maximum Entropy translation model (Ittycheriah and Roukos, 2005). |
Introduction | Early work in statistical machine translation viewed translation as a noisy channel process comprised of a translation model, which functioned to posit adequate translations of source language words, and a target language model, which guided the fluency of generated target language strings (Brown et al., 1990).
Related Work | The translation models in these techniques define phrases as contiguous word sequences (with gaps allowed in the case of hierarchical phrases) which may or may not correspond to any linguistic constituent. |
Related Work | Early work in statistical phrase-based translation considered whether restricting translation models to use only syntactically well-formed constituents might improve translation quality (Koehn et al., 2003) but found such restrictions failed to improve translation quality. |
Related Work | Significant research has examined the extent to which syntax can be usefully incorporated into statistical tree-based translation models: string-to-tree (Yamada and Knight, 2001; Gildea, 2003; Imamura et al., 2004; Galley et al., 2004; Graehl and Knight, 2004; Melamed, 2004; Galley et al., 2006; Huang et al., 2006; Shen et al., 2008), tree-to-string (Liu et al., 2006; Liu et al., 2007; Mi et al., 2008; Mi and Huang, 2008; Huang and Mi, 2010), tree-to-tree (Abeille et al., 1990; Shieber and Schabes, 1990; Poutsma, 1998; Eisner, 2003; Shieber, 2004; Cowan et al., 2006; Nesson et al., 2006; Zhang et al., 2007; DeNeefe et al., 2007; DeNeefe and Knight, 2009; Liu et al., 2009; Chiang, 2010), and treelet (Ding and Palmer, 2005; Quirk et al., 2005) techniques use syntactic information to inform the translation model . |
Abstract | The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of sub-trees. |
Abstract | This paper goes further to present a translation model based on noncontiguous tree sequence alignment, where a noncontiguous tree sequence is a sequence of sub-trees and gaps. |
Experiments | In the experiments, we train the translation model on the FBIS corpus (7.2M (Chinese) + 9.2M (English) words) and train a 4-gram language model on the Xinhua portion of the English Gigaword corpus (181M words) using the SRILM toolkit (Stolcke, 2002).
Introduction | Bod (2007) also finds that discontinuous phrasal rules yield a significant improvement in a linguistically motivated STSG-based translation model.
Introduction | We illustrate the rule extraction with an example from the tree-to-tree translation model based on tree sequence alignment (Zhang et al., 2008a), without loss of generality to most syntactic tree based models.
Introduction | To address this issue, we propose a syntactic translation model based on noncontiguous tree sequence alignment. |
NonContiguous Tree Sequence Alignment-based Model | In this section, we give a formal definition of SncTSSG and accordingly propose the alignment-based translation model.
NonContiguous Tree Sequence Alignment-based Model | 2.2 SncTSSG-based Translation Model
Eliciting Addressee’s Emotion | Following (Ritter et al., 2011), we apply the statistical machine translation model for generating a response to a given utterance. |
Eliciting Addressee’s Emotion | We use GIZA++ and SRILM for learning the translation model and 5-gram language model, respectively.
Eliciting Addressee’s Emotion | We use the emotion-tagged dialogue corpus to learn eight translation models and language models, each of which is specialized in generating the response that elicits one of the eight emotions (Plutchik, 1980). |
Experiments | Table 6: The number of utterance pairs used for training classifiers in emotion prediction and learning the translation models and language models in response generation. |
Experiments | We use the utterance pairs summarized in Table 6 to learn the translation models and language models for eliciting each emotional category. |
Experiments | However, for learning the general translation models, we currently use 4 million utterance pairs sampled from the 640 million pairs due to computational limitations.
Approach | supervised IR models, the answer ranking is implemented using discriminative learning, and finally, some of the ranking features are produced by question-to-answer translation models , which use class-conditional learning. |
Approach | the similarity between questions and answers (FG1), features that encode question-to-answer transformations using a translation model (FG2), features that measure keyword density and frequency (FG3), and features that measure the correlation between question-answer pairs and other collections (FG4).
Approach | One way to address this problem is to learn question-to-answer transformations using a translation model (Berger et al., 2000; Echihabi and Marcu, 2003; Soricut and Brill, 2006; Riezler et al., 2007). |
Experiments | Our ranking model was tuned strictly on the development set (i.e., feature selection and parameters of the translation models ). |
Experiments | to improve lexical matching and translation models . |
Experiments | This indicates that, even though translation models are the most useful, it is worth exploring approaches that combine several strategies for answer ranking. |
Related Work | In the QA literature, answer ranking for non-factoid questions has typically been performed by learning question-to-answer transformations, either using translation models (Berger et al., 2000; Soricut and Brill, 2006) or by exploiting the redundancy of the Web (Agichtein et al., 2001). |
Related Work | On the other hand, our approach allows the learning of full transformations from question structures to answer structures using translation models applied to different text representations. |
Experiments | In order to evaluate the influence of segmentation results upon the statistical ON translation system, we compare the results of two translation models . |
Experiments | For constructing a statistical ON translation model, we use GIZA++ to align the Chinese NEs and the English NEs in the training set.
Experiments | Q2: the Chinese ON and the results of the statistical translation model . |
Related Work | The first type of methods translates ONs by building a statistical translation model . |
Related Work | The statistical translation model can give an output for any input. |
The Chunking-based Segmentation for Chinese ONs | The performance of the statistical ON translation model is dependent on the precision of the Chinese ON segmentation to some extent. |
Abstract | Similarity scores are used as additional features of the translation model to improve translation performance. |
Experiments | The sense similarity scores are used as feature functions in the translation model . |
Experiments | In particular, all the allowed bilingual corpora except the UN corpus and Hong Kong Hansard corpus have been used for estimating the translation model . |
Experiments | The second one is the small data condition where only the FBIS3 corpus is used to train the translation model . |
Hierarchical phrase-based MT system | The hierarchical phrase-based translation method (Chiang, 2005; Chiang, 2007) is a formal syntax-based translation modeling method; its translation model is a weighted synchronous context free grammar (SCFG). |
Introduction | the translation probabilities in a translation model for units from parallel corpora are mainly based on the co-occurrence counts of the two units.
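Concretely, the co-occurrence-based estimate is usually a relative frequency; a minimal sketch with toy counts (the pairs and numbers are made up for illustration):

```python
# Relative-frequency estimation of translation probabilities from
# co-occurrence counts of aligned unit pairs (f, e).
from collections import Counter

pair_counts = Counter({("Haus", "house"): 8, ("Heim", "house"): 2,
                       ("Katze", "cat"): 5})

def p_f_given_e(f, e):
    """p(f|e) = count(f, e) / count(e), the standard relative-frequency estimate."""
    count_e = sum(c for (ff, ee), c in pair_counts.items() if ee == e)
    return pair_counts[(f, e)] / count_e if count_e else 0.0

print(p_f_given_e("Haus", "house"))   # 8 / (8 + 2) = 0.8
```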
A Class-based Model of Agreement | Translation model notation: e, target sequence of I words; f, source sequence of J words; a, sequence of K phrase alignments for (e, f); π, permutation of the alignments for target word order; h, sequence of M feature functions; λ, sequence of learned weights for the M features; H, a priority queue of hypotheses.
Discussion of Translation Results | This large gap between the unigram recall of the actual translation output (top) and the lexical coverage of the phrase-based model (bottom) indicates that translation performance can be improved dramatically by altering the translation model through features such as ours, without expanding the search space of the decoder. |
Experiments | We trained the translation model on 502 million words of parallel text collected from a variety of sources, including the Web. |
Inference during Translation Decoding | 3.3 Translation Model Features |
Introduction | However, using lexical coverage experiments, we show that there is ample room for translation quality improvements through better selection of forms that already exist in the translation model . |
Related Work | Factored Translation Models: Factored translation models (Koehn and Hoang, 2007) facilitate a more data-oriented approach to agreement modeling.
Related Work | Subotin (2011) recently extended factored translation models to hierarchical phrase-based translation and developed a discriminative model for predicting target-side morphology in English-Czech. |
Abstract | Two decades after their invention, the IBM word-based translation models , widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. |
Abstract | In this paper, we propose a simple extension to the IBM models: an ℓ0 prior to encourage sparsity in the word-to-word translation model.
Conclusion | We have extended the IBM models and HMM model by the addition of an ℓ0 prior to the word-to-word translation model, which compacts the word-to-word translation table, reducing overfitting and, in particular, the “garbage collection” effect.
Experiments | Table 4 shows BLEU scores for translation models learned from these alignments.
Introduction | Although state-of-the-art translation models use rules that operate on units bigger than words (like phrases or tree fragments), they nearly always use word alignments to drive extraction of those translation rules.
Introduction | It extends the IBM/HMM models by incorporating an ℓ0 prior, inspired by the principle of minimum description length (Barron et al., 1998), to encourage sparsity in the word-to-word translation model (Section 2.2).
Conclusion and Future Work | Finally, we hope to apply our method to other translation models , especially syntax-based models. |
Decoding | In the topic-specific lexicon translation model , given a source document, it first calculates the topic-specific translation probability by normalizing the entire lexicon translation table, and then adapts the lexical weights of rules correspondingly. |
Experiments | The adapted lexicon translation model is added as a new feature under the discriminative framework. |
Introduction | To exploit topic information for statistical machine translation (SMT), researchers have proposed various topic-specific lexicon translation models (Zhao and Xing, 2006; Zhao and Xing, 2007; Tam et al., 2007) to improve translation quality. |
Introduction | Topic-specific lexicon translation models focus on word-level translations. |
Related Work | combine a specific domain translation model with a general domain translation model depending on various text distances. |
Conclusion | As opposed to other factored translation model approaches that require target language factors, which are not easily obtainable for many languages, our approach only requires English syntax trees, which are acquired with widely available automatic parsers.
Factored Model | The factored statistical machine translation model uses a log-linear approach to combine the several components, including the language model, the reordering model, the translation models and the generation models.
Introduction | Our method is based on factored phrase-based statistical machine translation models . |
Introduction | Traditional statistical machine translation models deal with these problems in two ways:
Introduction | Then, contrary to the methods that added only output features or altered the generation procedure, we used this information in order to augment only the source side of a factored translation model , assuming that we do not have resources allowing factors or specialized generation in the target language (a common problem, when translating from English into under-resourced languages). |
Methods for enriching input | Considering such annotation, a factored translation model is trained to map the word-case pair to the correct inflection of the target noun. |
Conclusion | We present a head-driven hierarchical phrase-based (HD-HPB) translation model , which adopts head information (derived through unlabeled dependency analysis) in the definition of non-terminals to better differentiate among translation rules. |
Head-Driven HPB Translation Model | Like Chiang (2005) and Chiang (2007), our HD-HPB translation model adopts a synchronous context free grammar, a rewriting system which generates source and target side string pairs simultaneously using a context-free grammar. |
Head-Driven HPB Translation Model | For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007). |
Head-Driven HPB Translation Model | Merging two neighboring non-terminals into a single nonterminal, NRRs enable the translation model to explore a wider search space. |
Introduction | Chiang’s hierarchical phrase-based (HPB) translation model utilizes synchronous context free grammar (SCFG) for translation derivation (Chiang, 2005; Chiang, 2007) and has been widely adopted in statistical machine translation (SMT). |
Introduction | However, the two approaches are not mutually exclusive, as we could also include a set of syntax-driven features into our translation model . |
Alignment | The Viterbi word alignment of each training sentence pair n = 1, …, N is used for both the initialization of the translation model p(f|e) and the phrase model training.
Conclusion | We have shown that training phrase models can improve translation performance on a state-of-the-art phrase-based translation model . |
Experimental Evaluation | The scaling factors of the translation models have been optimized for BLEU on the DEV data. |
Experimental Evaluation | We will focus on the proposed leaving-one-out technique and show that it helps in finding good phrasal alignments on the training data that lead to improved translation models . |
Introduction | [Figure: two pipelines to a phrase translation table: word translation models trained by the EM algorithm yield a Viterbi word alignment, from which heuristic phrase translation counts are extracted; alternatively, phrase translation models trained by the EM algorithm yield phrase translation probabilities directly.]
Related Work | This is different from word-based translation models , where a typical assumption is that each target word corresponds to only one source word. |
Introduction | The translation quality of the SMT system is highly related to the coverage of translation models . |
Introduction | Naturally, a solution to the coverage problem is to bridge the gaps between the input sentences and the translation models, either from the input side, which aims at rewriting the input sentences into MT-favored expressions, or from the side of the translation models, which tries to enrich the translation models to cover more expressions.
Abstract | We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised. |
Discriminative Synchronous Transduction | Our log-linear translation model defines a conditional probability distribution over the target translations of a given source sentence. |
Evaluation | We also experimented with using max-translation decoding for standard MERT-trained translation models, finding that it had a small negative impact on BLEU score.
Evaluation | Firstly we show the relative scores of our model against Hiero without using reverse translation or lexical features. This allows us to directly study the differences between the two translation models without the added complication of the other features.
Evaluation | As expected, the language model makes a significant difference to BLEU, however we believe that this effect is orthogonal to the choice of base translation model , thus we would expect a similar gain when integrating a language model into the discriminative system. |
Integration of inflection models with MT systems | In this method, stemming can impact word alignment in addition to the translation models . |
MT performance results | From the results of Method 2 we can see that reducing sparsity at translation modeling is advantageous. |
MT performance results | From these results, we also see that about half of the gain from using stemming in the base MT system came from improving word alignment, and half came from using translation models operating at the less sparse stem level. |
Machine translation systems and data | The treelet translation model is estimated using a parallel corpus. |
Related work | Though our motivation is similar to that of Koehn and Hoang (2007), we chose to build an independent component for inflection prediction in isolation rather than folding morphological information into the main translation model . |
Introduction | the role of the translation model in Statistical Machine Translation (SMT). |
System | The language model is a trigram-based back-off language model with Kneser-Ney smoothing, computed using SRILM (Stolcke, 2002) and trained on the same training data as the translation model . |
System | We do so by normalising the class probability from the classifier (score_T(H)), which is our translation model, and the language model (score_lm(H)), in such a way that the highest classifier score for the alternatives under consideration is always 1.0, and the highest language model score of the sentence is always 1.0.
System | If desired, the search can be parametrised with variables λ3 and λ4, representing the weights we want to attach to the classifier-based translation model and the language model, respectively.
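A small sketch of this normalize-then-weight scheme; the function and variable names are illustrative, not the system's actual interface.

```python
# Combine a classifier-based TM score and an LM score after rescaling each so
# that the best alternative under consideration gets 1.0, then weight them.

def combined_score(hypotheses, score_t, score_lm, lam3=1.0, lam4=1.0):
    """hypotheses: list of alternatives; score_t / score_lm: raw score dicts."""
    max_t = max(score_t[h] for h in hypotheses)
    max_lm = max(score_lm[h] for h in hypotheses)
    return {h: lam3 * score_t[h] / max_t + lam4 * score_lm[h] / max_lm
            for h in hypotheses}

hyps = ["A", "B"]
print(combined_score(hyps, {"A": 0.9, "B": 0.6}, {"A": 0.02, "B": 0.05}))
# A: 1.0 + 0.4 = 1.4 ; B: 0.667 + 1.0 = 1.667
```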
Conclusion | Eventually, we would like to replace the functionality of factored translation models (Koehn and Hoang, 2007) with lattice transformation and augmentation. |
Experimental Setup | Four translation model features encode phrase translation probabilities and lexical scores in both directions. |
Introduction | Morphological complexity leads to much higher type to token ratios than English, which can create sparsity problems during translation model estimation. |
Related Work | Most techniques approach the problem by transforming the target language in some manner before training the translation model . |
Related Work | In this setting, the sparsity reduction from segmentation helps word alignment and target language modeling, but it does not result in a more expressive translation model . |
Abstract | Second, it combines a simplification model for splitting and deletion with a monolingual translation model for phrase substitution and reordering. |
Experiments | We trained our simplification and translation models on the PWKP corpus. |
Simplification Framework | Our simplification framework consists of a probabilistic model for splitting and dropping which we call the DRS simplification model (DRS-SM); a phrase-based translation model for substitution and reordering (PBMT); and a language model learned on Simple English Wikipedia (LM) for fluency and grammaticality.
Simplification Framework | where the probabilities p(s′|DC), p(s′|s) and p(s) are given by the DRS simplification model, the phrase-based machine translation model and the language model respectively.
Simplification Framework | Our phrase-based translation model is trained using the Moses toolkit with its default command line options on the PWKP corpus (except the sentences from the test set), considering the complex sentence as the source and the simpler one as the target.
Abstract | We thus propose to combine the advantages of both, and present a novel constituency-to-dependency translation model , which uses constituency forests on the source side to direct the translation, and dependency trees on the target side (as a language model) to ensure grammaticality. |
Conclusion and Future Work | In this paper, we presented a novel forest-based constituency-to-dependency translation model , which combines the advantages of both tree-to-string and string-to-tree systems, runs fast and guarantees grammaticality of the output. |
Introduction | Linguistically syntax-based statistical machine translation models have made promising progress in recent years. |
Model | Figure 1 shows a word-aligned source constituency forest F_c and target dependency tree D_e; our constituency-to-dependency translation model can be formalized as:
Related Work | (2009), we apply forests to a new constituency-tree-to-dependency-tree translation model rather than a constituency tree-to-tree model.
Introduction | A (formal) translation model is at the core of every machine translation system. |
Introduction | (1990) discuss automatically trainable translation models in their seminal paper. |
Introduction | By contrast, in the field of syntax-based machine translation, the translation models have full access to the syntax of the sentences and can base their decisions on it.
Preservation of regularity | Preservation of regularity is an important property for a number of translation model manipulations. |
Abstract | Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. |
Introduction | Phrase-based and tree-based translation models are the two main streams in state-of-the-art machine translation.
Introduction | The tree-based translation model , by using a synchronous context-free grammar formalism, can capture longer reordering between source and target language. |
Introduction | Many features are shared between phrase-based and tree-based systems including language model, word count, and translation model features. |
Experiments | The induced joint translation model can be used to recover arg max_e p(e|f), as it is equal to arg max_e p(e, f). We employ the induced probabilistic HR-SCFG G as the backbone of a log-linear, feature-based translation model, with the derivation probability p(D) under the grammar estimate being
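The estimate itself is truncated in the extract; for a probabilistic SCFG the standard assumption, offered here as a hedged reconstruction rather than the paper's exact equation, is that a derivation's probability factors over its rules:

p(D) = ∏_{r∈D} p(r)

with p(r) the estimated probability of each grammar rule used in the derivation D.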
Introduction | Nevertheless, the successful employment of SCFGs for phrase-based SMT brought translation models assuming latent syntactic structure to the spotlight. |
Introduction | Section 2 discusses the weak independence assumptions of SCFGs and introduces a joint translation model which addresses these issues and separates hierarchical translation structure from phrase-pair emission. |
Joint Translation Model | The rest of the (sometimes thousands of) rule-specific features usually added to SCFG translation models do not directly help either, leaving reordering decisions disconnected from the rest of the derivation. |
Related Work | Most of the aforementioned work does concentrate on learning hierarchical, linguistically motivated translation models . |
Abstract | We use these topic distributions to compute topic-dependent lexical weighting probabilities and directly incorporate them into our translation model as features. |
Discussion and Conclusion | We can construct a topic model once on the training data, and use it to infer topics on any test set to adapt the translation model.
Introduction | This problem has led to a substantial amount of recent work in trying to bias, or adapt, the translation model (TM) toward particular domains of interest (Axelrod et al., 2011; Foster et al., 2010; Snover et al., 2008). The intuition behind TM adaptation is to increase the likelihood of selecting relevant phrases for translation.
Introduction | We induce unsupervised domains from large corpora, and we incorporate soft, probabilistic domain membership into a translation model . |
Introduction | We accomplish this by introducing topic dependent lexical probabilities directly as features in the translation model , and interpolating them log-linearly with our other features, thus allowing us to discriminatively optimize their weights on an arbitrary objective function. |
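A hedged sketch of how such topic-dependent lexical probabilities could enter the model as features, scaled by the inferred topic distribution of the current document (the table layout and all names below are illustrative assumptions, not the paper's implementation):

```python
import math

def topic_lexical_features(phrase_pair, doc_topic_dist, lex_tables):
    """One topic-conditioned lexical-weighting feature per topic.

    lex_tables[z] is assumed to hold p(e|f, z) estimated from training
    data softly assigned to topic z; doc_topic_dist is the inferred
    topic distribution of the current document (illustrative names)."""
    src_words, tgt_words = phrase_pair
    feats = {}
    for z, p_z in enumerate(doc_topic_dist):
        lw = 0.0
        for e in tgt_words:
            # lexical weight: average translation prob over source words
            probs = [lex_tables[z].get((e, f), 1e-9) for f in src_words]
            lw += math.log(sum(probs) / len(probs))
        feats[f"lex_topic_{z}"] = p_z * lw   # scaled by topic posterior
    return feats
```

Each feature then receives its own log-linear weight, which is what allows the weights to be optimized discriminatively on an arbitrary objective.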
Abstract | We present a new translation model integrating the shallow local multi bottom-up tree transducer. |
Experiments | Both translation models were trained with approximately 1.5 million bilingual sentences after length-ratio filtering. |
Introduction | In this contribution, we report on our novel statistical machine translation system that uses an ℓMBOT-based translation model.
Introduction | The theoretical foundations of ℓMBOT and their integration into our translation model are presented in Sections 2 and 3.
Translation Model | Given a source language sentence e, our translation model aims to find the best corresponding target language translation ê, i.e.,
Experiment | The translation model was trained using sentences of 40 words or less from the training data. |
Experiment | The common SMT feature set consists of: four translation model features, phrase penalty, word penalty, and a language model feature. |
Experiment | Our distortion model was trained as follows: We used 0.2 million sentence pairs and their word alignments from the data used to build the translation model as the training data for our distortion models. |
Experiments | The larger question-thread data set is employed for feature learning, such as translation model and topic model training.
Proposed Features | Translation Model |
Proposed Features | A translation model is a mathematical model that captures language translation in a statistical way.
Proposed Features | We train a translation model (Brown et al., 1990; Och and Ney, 2003) using the Community QA data, with the question as the target language, and the corresponding best answer as the source language. |
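As an illustration of how such a model can then score a new question against a candidate answer, a sketch assuming a learned word-translation table ttable[(q, a)] approximating p(q_word | a_word), with the question as target and the answer as source (all names are hypothetical):

```python
import math

def tm_score(question, answer, ttable, smooth=1e-6):
    """Translation-based relevance score of an answer (source) for a
    question (target), IBM-Model-1 style; an illustrative sketch."""
    a_words = answer.split()
    score = 0.0
    for q in question.split():
        # p(q | answer) approximated as the average over answer words
        p = sum(ttable.get((q, a), 0.0) for a in a_words)
        p /= max(len(a_words), 1)
        score += math.log(p + smooth)   # smoothing avoids log(0)
    return score
```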
Comparative Study | (Mariño et al., 2006) implement a translation model using n-grams.
Experiments | Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model (LM), a phrase translation model and a word-based lexicon model. |
Experiments | Table 1: Translation model and LM training data statistics
Experiments | Table 1 contains the data statistics used for the translation model and LM.
Introduction | (Mariño et al., 2006) present a translation model that constitutes a language model of a sort of bilanguage composed of bilingual units.
Abstract | Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora. |
Experiments | In the end-to-end MT pipeline we use a standard set of features: relative-frequency and lexical translation model probabilities in both directions; distance-based distortion model; language model and word count. |
Introduction | Word-based translation models (Brown et al., 1993) remain central to phrase-based model training, where they are used to infer word-level alignments from sentence-aligned parallel data, from
Introduction | This paper develops a phrase-based translation model which aims to address the above shortcomings of the phrase-based translation pipeline. |
Computing Feature Expectations | The weight of h is the incremental score contributed to all translations containing the rule application, including translation model features on r and language model features that depend on both r and the English contexts of the child nodes.
Experimental Results | Hiero is a hierarchical system that expresses its translation model as a synchronous context-free grammar (Chiang, 2007). |
Experimental Results | SBMT is a string-to-tree translation system with rich target-side syntactic information encoded in the translation model . |
Experimental Results | Figure 3: N-grams with high expected count are more likely to appear in the reference translation than n-grams in the translation model's Viterbi translation, e*.
Experiments | The translation model (TM) was smoothed in both directions with KN smoothing (Chen et al., 2011). |
Experiments | In (Foster and Kuhn, 2007), two kinds of linear mixture were described: linear mixture of language models (LMs), and linear mixture of translation models (TMs). |
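For concreteness, a linear TM mixture interpolates the component phrase probabilities with weights that sum to one; a minimal sketch with illustrative names:

```python
def mixture_phrase_prob(e_phrase, f_phrase, tms, lambdas):
    """Linear mixture of translation models:
    p(e|f) = sum_i lambda_i * p_i(e|f), with sum_i lambda_i = 1.
    tms[i] is assumed to map (e, f) phrase pairs to probabilities."""
    assert abs(sum(lambdas) - 1.0) < 1e-6
    return sum(lam * tm.get((e_phrase, f_phrase), 0.0)
               for lam, tm in zip(lambdas, tms))
```

The log-linear mixture mentioned in the head of this paper multiplies weighted model scores instead of adding them, which is why it behaves differently when a component assigns zero probability.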
Introduction | The translation models of a statistical machine translation (SMT) system are trained on parallel data. |
Introduction | The 2012 JHU workshop on Domain Adaptation for MT proposed phrase sense disambiguation (PSD) for translation model adaptation.
Conclusion | In this work, we presented an approach that can expand a translation model extracted from a sentence-aligned, bilingual corpus using a large amount of unstructured, monolingual data in both source and target languages. This leads to improvements of 1.4 and 1.2 BLEU points over strong baselines on evaluation sets, and in some scenarios to gains in excess of 4 BLEU points.
Generation & Propagation | We assume that sufficient parallel resources exist to learn a basic translation model using standard techniques, and also assume the availability of larger monolingual corpora in both the source and target languages. |
Introduction | Unlike previous work (Irvine and Callison-Burch, 2013a; Razmara et al., 2013), we use higher order n-grams instead of restricting to unigrams, since our approach goes beyond OOV mitigation and can enrich the entire translation model by using evidence from monolingual text. |
Related Work | (2013) and Irvine and Callison-Burch (2013a) conduct a more extensive evaluation of their graph-based BLI techniques, where the emphasis and end-to-end BLEU evaluations concentrated on OOVs, i.e., unigrams, and not on enriching the entire translation model . |
Expected BLEU Training | We summarize the weights of the recurrent neural network language model as θ = {U, W, V} and add the model as an additional feature to the log-linear translation model using the simplified notation s_θ(w_t) = s(w_t | w_1 ... w_{t-1}, h_{t-1}):
Expected BLEU Training | The translation model is parameterized by λ and θ, which are learned as follows (Gao et al., 2014):
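A minimal numpy sketch of scoring a sentence with an Elman-style RNN LM parameterized by θ = {U, W, V}, whose log-probability can then be added as one more feature in the log-linear model (the architecture details here are assumptions for illustration, not the authors' exact network):

```python
import numpy as np

def rnn_logprob(words, vocab, U, W, V):
    """Log-probability of a sentence under a simple Elman-style RNN LM.
    Shapes assumed: W is (H, |vocab|), U is (H, H), V is (|vocab|, H)."""
    h = np.zeros(U.shape[0])
    logp = 0.0
    prev = vocab["<s>"]
    for w in words:
        x = np.zeros(len(vocab)); x[prev] = 1.0   # one-hot w_{t-1}
        h = np.tanh(W @ x + U @ h)                # hidden state h_t
        p = np.exp(V @ h); p /= p.sum()           # softmax over the vocabulary
        logp += np.log(p[vocab[w]])
        prev = vocab[w]
    return logp   # used as an additional log-linear feature score
```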
Experiments | Translation models are estimated on 102M words of parallel data for French-English, and 99M words for German-English; about 6.5M words for each language pair are newswire, the remainder are parliamentary proceedings. |
Introduction | Neural network-based language and translation models have achieved impressive accuracy improvements on statistical machine translation tasks (Allauzen et al., 2011; Le et al., 2012b; Schwenk et al., 2012; Vaswani et al., 2013; Gao et al., 2014). |
Introduction | In Section 3, we briefly describe the translation model based on phrasal ITGs and the Pitman-Yor process.
Related Work | The translation model and language model are primary components in SMT. |
Related Work | In the case of the previous work on translation modeling , mixed methods have been investigated for domain adaptation in SMT by adding domain information as additional labels to the original phrase table (Foster and Kuhn, 2007). |
Related Work | Monolingual topic information is taken as a new feature for a domain adaptive translation model and tuned on the development set (Su et al., 2012). |
Abstract | Recently, some research has studied unsupervised SMT, inducing a simple word-based translation model from monolingual corpora.
Introduction | Novel translation models, such as phrase-based models (Koehn et al., 2007), hierarchical phrase-based models (Chiang, 2007) and linguistically syntax-based models (Liu et al., 2006; Huang et al., 2006; Galley, 2006; Zhang et al., 2008; Chiang, 2010; Zhang et al., 2011; Zhai et al., 2011, 2012), have been proposed and have achieved steadily improving translation performance.
Introduction | However, all of these state-of-the-art translation models rely on the parallel corpora to induce translation rules and estimate the corresponding parameters. |
Introduction | Finally, they used the learned translation model directly to translate unseen data (Ravi and Knight, 2011; Nuhn et al., 2012) or incorporated the learned bilingual lexicon as a new in-domain translation resource into the phrase-based model which is trained with out-of-domain data to improve the domain adaptation performance in machine translation (Dou and Knight, 2012). |
Introduction | We first replace inflected forms by their stems or lemmas: building a translation system on a stemmed representation of the target side leads to a simpler translation task, and the morphological information contained in the source and target language parts of the translation model is more balanced. |
Translation pipeline | We use this representation to create the stemmed representation for training the translation model . |
Translation pipeline | In addition to the translation model , the target-side language model, as well as the reference data for parameter tuning use this representation. |
Translation pipeline | 3.2 Building a stemmed translation model |
Experiments | Accordingly, we use Chiang's hierarchical phrase-based translation model (Chiang, 2007) as a baseline, and the syntax-augmented MT model (Zollmann and Venugopal, 2006) as a 'target line', a model that would not be applicable for language pairs without linguistic resources.
Hard rule labeling from word classes | Our approach instead uses distinct grammar rules and labels to discriminate phrase size, with the advantage of enabling all translation models to estimate distinct weights for distinct size classes and avoiding the need for additional models in the log-linear framework. However, the increase in the number of labels, and thus grammar rules, decreases the reliability of estimated models for rare events due to increased data sparseness.
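The idea can be sketched as a relabeling step: each nonterminal is labeled with a coarse size class of the phrase it spans, so that the translation models see distinct events per class (the class boundaries below are illustrative, not the paper's):

```python
def size_label(phrase_len, boundaries=(1, 2, 4)):
    """Map a phrase length to a coarse size-class nonterminal label.
    Illustrative boundaries: <=1, <=2, <=4, longer."""
    for i, b in enumerate(boundaries):
        if phrase_len <= b:
            return f"X{i}"          # e.g. X0 = single word, X1 = short, ...
    return f"X{len(boundaries)}"    # longest class

# A Hiero-style rule whose gap covered a 3-word phrase would then be
# relabeled with X2 instead of the single generic label X.
```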
Introduction | Labels on these nonterminal symbols are often used to enforce syntactic constraints in the generation of bilingual sentences and imply conditional independence assumptions in the translation model . |
Related work | (2009) present a nonparametric PSCFG translation model that directly induces a grammar from parallel sentences without the use of, or constraints from, a word-alignment model, and
Abstract | We describe a translation model adaptation approach for conversational spoken language translation (CSLT), which encourages the use of contextually appropriate translation options from relevant training conversations. |
Discussion and Future Directions | We have presented a novel, incremental topic-based translation model adaptation approach that obeys the causality constraint imposed by spoken conversations. |
Incremental Topic-Based Adaptation | Our approach is based on the premise that biasing the translation model to favor phrase pairs originating in training conversations that are contextually similar to the current conversation will lead to better translation quality. |
Incremental Topic-Based Adaptation | We add this feature to the log-linear translation model with its own weight, which is tuned with MERT. |
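One plausible form of this feature, sketched under the assumption that each phrase pair remembers the training conversations it was extracted from and that conversations are represented as topic vectors (all names are illustrative, not the authors' code):

```python
import numpy as np

def topic_sim_feature(phrase_pair, current_topics, conv_topics, origin):
    """Similarity-based adaptation feature: cosine similarity between the
    evolving topic vector of the current conversation and the most similar
    training conversation the phrase pair originated from."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    sims = [cos(current_topics, conv_topics[c]) for c in origin[phrase_pair]]
    return max(sims) if sims else 0.0   # gets its own MERT-tuned weight
```

Because the topic vector of the current conversation is only updated with utterances seen so far, such a feature respects the causality constraint the paper emphasizes.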
Abstract | We take a multi-pass approach to machine translation decoding when using synchronous context-free grammars as the translation model and n-gram language models: the first pass uses a bigram language model, and the resulting parse forest is used in the second pass to guide search with a trigram language model. |
Experiments | The word-to-word translation probabilities are from the translation model of IBM Model 4, trained on a 160-million-word English-Chinese parallel corpus using GIZA++.
Introduction | This complexity arises from the interaction of the tree-based translation model with an n-gram language model. |
Improving Statistical Bilingual Word Alignment | IBM Model 1 only employs the word translation model to calculate the probabilities of alignments. |
Improving Statistical Bilingual Word Alignment | In IBM Model 2, both the word translation model and position distribution model are used. |
Improving Statistical Bilingual Word Alignment | IBM Models 3, 4 and 5 add a fertility model on top of the word translation model and the position distribution model.
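For reference, the word translation model t(f|e) shared by all of these models is trained with EM; a textbook IBM Model 1 sketch (NULL alignment omitted for brevity):

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=5):
    """EM training of the word translation model t(f|e) of IBM Model 1.
    bitext is a list of (source_words, target_words) sentence pairs."""
    t = defaultdict(lambda: 1.0)                   # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float); total = defaultdict(float)
        for e_sent, f_sent in bitext:
            for f in f_sent:                       # E-step: soft counts
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    c = t[(f, e)] / z
                    count[(f, e)] += c; total[e] += c
        for (f, e), c in count.items():            # M-step: renormalize
            t[(f, e)] = c / total[e]
    return t
```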
Introduction | The quality of the parallel data and the word alignment have significant impacts on the learned translation models and ultimately the quality of translation output. |
Sentence Alignment Confidence Measure | The source-to-target lexical translation model p(t|s) and target-to-source model p(s|t) can be obtained through IBM Model 1 or HMM training.
Sentence Alignment Confidence Measure | For the efficient computation of the denominator, we use the lexical translation model . |
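A hedged sketch of such a confidence measure, symmetrizing length-normalized Model-1-style log-probabilities from the two lexical tables (function and table names are illustrative, not the paper's exact formulation):

```python
import math

def alignment_confidence(src, tgt, p_t_given_s, p_s_given_t, eps=1e-9):
    """Symmetrized sentence-alignment confidence from the two lexical
    translation models; src and tgt are token lists."""
    def model1_logprob(xs, ys, table):
        lp = 0.0
        for y in ys:
            p = sum(table.get((y, x), 0.0) for x in xs) / max(len(xs), 1)
            lp += math.log(p + eps)
        return lp / max(len(ys), 1)                # length-normalized
    return 0.5 * (model1_logprob(src, tgt, p_t_given_s)
                  + model1_logprob(tgt, src, p_s_given_t))
```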
Experimental Results | Our translation model was trained on about 1M parallel sentence pairs (about 28M words in each language), which are sub-sampled from corpora distributed by LDC for the NIST MT evaluation using a sampling method based on the n-gram matches between training and test sets in the foreign side. |
Experimental Results | We use GIZA++ (Och and Ney, 2000), a suffix-array (Lopez, 2007), SRILM (Stolcke, 2002), and risk-based deterministic annealing (Smith and Eisner, 2006) to obtain word alignments, translation models, language models, and the optimal weights for combining these models, respectively.
Variational vs. Min-Risk Decoding | However, suppose the hypergraph were very large (thanks to a large or smoothed translation model and weak pruning). |
Abstract | R2NN is a combination of a recursive neural network and a recurrent neural network, and in turn integrates their respective capabilities: (1) new information can be used to generate the next hidden state, like recurrent neural networks, so that the language model and translation model can be integrated naturally; (2) a tree structure can be built, like recursive neural networks, so as to generate translation candidates in a bottom-up manner.
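A minimal numpy sketch of the two mechanisms being combined, a recurrent update over new input and a recursive bottom-up composition over a tree (illustrative only, not the paper's exact architecture):

```python
import numpy as np

def recurrent_step(x_t, h_prev, W, U):
    """Recurrent update: new input x_t extends the hidden state."""
    return np.tanh(W @ x_t + U @ h_prev)

def recursive_combine(h_left, h_right, Wc):
    """Recursive (tree) composition: two child representations are
    combined bottom-up into a parent representation."""
    return np.tanh(Wc @ np.concatenate([h_left, h_right]))
```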
Introduction | DNN is also introduced to Statistical Machine Translation (SMT) to learn several components or features of conventional framework, including word alignment, language modelling, translation modelling and distortion modelling. |
Introduction | (2013) propose a joint language and translation model , based on a recurrent neural network. |
Analysis | We want to further study what happens after we integrate the constraint feature (our SDB model and Marton and Resnik's XP+) into the log-linear translation model.
Experiments | All translation models were trained on the FBIS corpus. |
The Syntax-Driven Bracketing Model 3.1 The Model | We integrate a new feature into the log-linear translation model: P_SDB(b|T, ...). This feature is computed by the SDB model described in equation (3) or equation (4), which estimates the probability that a source span is to be translated as a unit within particular syntactic contexts.
Introduction | They have since been extended to translation modeling , parsing, and many other NLP tasks. |
Model Variations | 4.1 Neural Network Lexical Translation Model (NNLTM) |
Model Variations | In order to assign a probability to every source word during decoding, we also train a neural network lexical translation model (NNLTM).
Conclusion | These tree-to-tree rules are applicable for forest-to-tree translation models (Liu et al., 2009a). |
Experiments | 4.1 Translation models |
Experiments | In our translation models , we have made use of three kinds of translation rule sets which are trained separately. |
Conclusion | We have described a Lagrangian relaxation algorithm for exact decoding of syntactic translation models, and shown that it is significantly more efficient than other exact algorithms for decoding tree-to-string models.
Conclusion | Our experiments have focused on tree-to-string models, but the method should also apply to Hiero-style syntactic translation models (Chiang, 2007). |
Experiments | We use an identical model, and identical development and test data, to that used by Huang and Mi. The translation model is trained on 1.5M sentence pairs of Chinese-English data; a trigram language model is used.
Introduction | The decipherment approach for MT has recently gained popularity for training and adapting translation models using only monolingual data. |
Introduction | The general idea is to find those translation model parameters that maximize the probability of the translations of a given source text in a given language model of the target language. |
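In equation form, this is the familiar decipherment training criterion: choose the parameters θ that maximize the likelihood of the observed source text under the target language model, marginalizing over the unseen translations e:

```latex
\hat{\theta} \;=\; \arg\max_{\theta} \prod_{f} \sum_{e} p_{\mathrm{LM}}(e)\, p_{\theta}(f \mid e)
```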
Related Work | (Ravi and Knight, 2011), (Nuhn et al., 2012), and (Dou and Knight, 2012) treat natural language translation as a deciphering problem including phenomena like reordering, insertion, and deletion and are able to train translation models using only monolingual data. |
Name-aware MT | Some of these names carry special meanings that may influence translations of the neighboring words, and thus replacing them with non-terminals can lead to information loss and weaken the translation model . |
Name-aware MT | Both sentence pairs are kept in the combined data to build the translation model . |
Related Work | More importantly, in these approaches the MT model was still mostly treated as a “black-box” because neither the translation model nor the LM was updated or adapted specifically for names. |