Index of papers in Proc. ACL 2014 that mention
  • translation model
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Conclusion and Future Work
Experimental results show that our approach is promising for SMT systems to learn a better translation model.
Experiments
Translation models are trained over the parallel data that is automatically word-aligned
Experiments
This implementation makes the system perform much better and the translation model size is much smaller.
Introduction
Current translation modeling approaches usually use context dependent information to disambiguate translation candidates.
Introduction
Therefore, it is important to leverage topic information to learn smarter translation models and achieve better translation performance.
Introduction
Attempts on topic-based translation modeling include topic-specific lexicon translation models (Zhao and Xing, 2006; Zhao and Xing, 2007), topic similarity models for synchronous rules (Xiao et al., 2012), and document-level translation with topic coherence (Xiong and Zhang, 2013).
Related Work
Following this work, Xiao et al. (2012) extended topic-specific lexicon translation models to hierarchical phrase-based translation models, where the topic information of synchronous rules was directly inferred with the help of document-level information.
Related Work
They incorporated the bilingual topic information into language model adaptation and lexicon translation model adaptation, achieving significant improvements in the large-scale evaluation.
Related Work
They estimated phrase-topic distributions in translation model adaptation and generated better translation quality.
Topic Similarity Model with Neural Network
Therefore, it helps to train a smarter translation model with the embedded topic information.
Topic Similarity Model with Neural Network
Standard features: Translation model, including translation probabilities and lexical weights for both directions (4 features), 5-gram language model (1 feature), word count (1 feature), phrase count (1 feature), NULL penalty (1 feature), number of hierarchical rules used (1 feature).
translation model is mentioned in 11 sentences in this paper.
Liu, Le and Hong, Yu and Liu, Hao and Wang, Xing and Yao, Jianmin
Abstract
By contrast, we argue that the relevance between a sentence pair and target domain can be better evaluated by the combination of language model and translation model.
Abstract
In this paper, we study and experiment with novel methods that apply translation models into domain-relevant data selection.
Introduction
The corpora are necessary prior knowledge for training an effective translation model.
Introduction
However, domain-specific machine translation has few parallel corpora for translation model training in the domain of interest.
Introduction
To overcome the problem, we first propose a method combining the translation model with the language model in data selection.
Related Work
Thus, we propose novel methods which are based on translation model and language model for data selection.
Training Data Selection Methods
We present three data selection methods for ranking and selecting domain-relevant sentence pairs from general-domain corpus, with an eye towards improving domain-specific translation model performance.
Training Data Selection Methods
These methods are based on language model and translation model, which are trained on small in-domain parallel data.
Training Data Selection Methods
3.1 Data Selection with Translation Model
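As a rough illustration of translation-model-based data selection described in the excerpts above, the sketch below ranks general-domain sentence pairs by a combined in-domain language model and translation model score. The additive scoring function and the scorer interfaces are assumptions for illustration and may differ from the paper's actual selection criterion.

```python
def selection_score(src, tgt, lm_logprob, tm_logprob):
    """Score one general-domain sentence pair by combining an in-domain
    language model score with an in-domain translation model score.
    Both scorers are assumed to return (length-normalized) log-probabilities."""
    return lm_logprob(tgt) + tm_logprob(src, tgt)

def select_domain_relevant(pairs, lm_logprob, tm_logprob, k):
    """Rank all (src, tgt) pairs from the general-domain corpus and keep the
    k pairs judged most relevant to the target domain."""
    ranked = sorted(
        pairs,
        key=lambda p: selection_score(p[0], p[1], lm_logprob, tm_logprob),
        reverse=True,
    )
    return ranked[:k]
```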
translation model is mentioned in 25 sentences in this paper.
P, Deepak and Visweswariah, Karthik
Abstract
We use translation models and language models to exploit lexical correlations and solution post character respectively.
Introduction
We model the lexical correlation and solution post character using regularized translation models and unigram language models respectively.
Our Approach
The usage of translation models in QA retrieval (Xue et al., 2008; Singh, 2012) and segmentation (Deepak et al., 2012) was also motivated by the correlation assumption.
Our Approach
We use an IBM Model 1 translation model (Brown et al., 1990) in our technique; simplistically, such a model m may be thought of as a 2-d associative array where the value m[w1][w2] is directly related to the probability of w1 occurring in the problem when w2 occurs in the solution.
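The excerpt above describes the translation model as a 2-d associative array keyed on a problem-side word and a solution-side word. Below is a minimal Python sketch of that view, with invented word pairs and scores and a hypothetical lexical_score helper; it is not the authors' implementation.

```python
from collections import defaultdict

# Minimal sketch of the 2-d associative array view of an IBM Model 1 style
# lexical translation table: m[w1][w2] is related to the probability of
# problem-side word w1 given solution-side word w2.
# The word pairs and values below are invented for illustration.
m = defaultdict(dict)
m["error"]["reinstall"] = 0.25
m["error"]["restart"] = 0.30
m["crash"]["restart"] = 0.40

def lexical_score(problem_word, solution_word, floor=1e-6):
    """Look up the Model 1 score for a (problem-word, solution-word) pair,
    falling back to a small floor value for unseen pairs."""
    return m.get(problem_word, {}).get(solution_word, floor)

print(lexical_score("error", "restart"))  # 0.3
print(lexical_score("error", "reboot"))   # 1e-06 (unseen pair)
```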
Our Approach
Consider a unigram language model S that models the lexical characteristics of solution posts, and a translation model T that models the lexical correlation between problems and solutions.
Related Work
Usage of translation models for modeling the correlation between textual problems and solutions has been explored earlier, starting from the answer retrieval work in (Xue et al., 2008), where new queries were conceptually expanded using the translation model to improve retrieval.
Related Work
Translation models were also seen to be useful in segmenting incident reports into the problem and solution parts (Deepak et al., 2012); we will use an adaptation of the generative model presented therein, for our solution extraction formulation.
Related Work
Entity-level translation models
translation model is mentioned in 24 sentences in this paper.
Xiao, Tong and Zhu, Jingbo and Zhang, Chunliang
A Skeleton-based Approach to MT 2.1 Skeleton Identification
To compute g(d), we use a linear combination of a skeleton translation model g_skel(d) and a full translation model g_full(d):
A Skeleton-based Approach to MT 2.1 Skeleton Identification
g(d) = g_skel(d) + g_full(d)    (3)
where the skeleton translation model handles the translation of the sentence skeleton, while the full translation model is the baseline model and handles the original problem of translating the whole sentence.
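A minimal sketch of the linear combination in the excerpt, with placeholder scoring functions standing in for the skeleton and full translation models (the actual models are full SMT systems, not single functions):

```python
from typing import Callable

def combined_score(d,
                   g_skel: Callable[[str], float],
                   g_full: Callable[[str], float]) -> float:
    """Linear combination from the excerpt: g(d) = g_skel(d) + g_full(d).
    g_skel scores the translation of the sentence skeleton; g_full is the
    baseline model scoring the whole sentence."""
    return g_skel(d) + g_full(d)

# Toy usage with stand-in model scores (log-scores invented for illustration):
print(combined_score("derivation-d",
                     g_skel=lambda d: -2.5,
                     g_full=lambda d: -7.1))  # -9.6
```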
A Skeleton-based Approach to MT 2.1 Skeleton Identification
The skeleton translation model focuses on the translation of the sentence skeleton, i.e., the solid (red) rectangles; while the full translation model computes the model score for all those phrase-pairs, i.e., all solid and dashed rectangles.
Abstract
The basic idea is that we translate the key elements of the input sentence using a skeleton translation model, and then cover the remaining segments using a full translation model.
Introduction
Note that the source-language structural information has been intensively investigated in recent studies of syntactic translation models.
Introduction
• We develop a skeleton-based model which divides translation into two sub-models: a skeleton translation model (i.e., translating the key elements) and a full translation model (i.e., translating the remaining source words and generating the complete translation).
translation model is mentioned in 14 sentences in this paper.
Xiong, Deyi and Zhang, Min
Abstract
In this paper, we propose a sense-based translation model to integrate word senses into statistical machine translation.
Abstract
The proposed sense-based translation model enables the decoder to select appropriate translations for source words according to the inferred senses for these words using maximum entropy classifiers.
Abstract
We test the effectiveness of the proposed sense-based translation model on a large-scale Chinese-to-English translation task.
Introduction
These glosses, used as the sense predictions of their WSD system, are integrated into a word-based SMT system either to substitute for translation candidates of their translation model or to postedit the output of their SMT system.
Introduction
In order to incorporate word senses into SMT, we propose a sense-based translation model that is built on maximum entropy classifiers.
Introduction
We collect training instances from the sense-tagged training data to train the proposed sense-based translation model.
translation model is mentioned in 46 sentences in this paper.
Li, Junhui and Marton, Yuval and Resnik, Philip and Daumé III, Hal
Conclusion and Future Work
In this paper, we have presented a unified reordering framework to incorporate soft linguistic constraints (of syntactic or semantic nature) into the HPB translation model.
Experiments
Our basic baseline system employs 19 basic features: a language model feature, 7 translation model features, word penalty, unknown word penalty, the glue rule, date, number and 6 pass-through features.
HPB Translation Model: an Overview
Each such rule is associated with a set of translation model features {φ_i}, such as the phrase translation probability p(α | γ) and its inverse p(γ | α), the lexical translation probability p_lex(α | γ) and its inverse p_lex(γ | α), and a rule penalty that affects preference for longer or shorter derivations.
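To make the feature set concrete, here is a small sketch of how the four rule-level probabilities and the rule penalty listed above could be stored and combined log-linearly; the field names, weights, and helper function are illustrative assumptions, not the paper's implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class RuleFeatures:
    """Translation model features attached to one hierarchical rule,
    following the feature set listed in the excerpt."""
    p_fwd: float       # phrase translation probability p(alpha | gamma)
    p_rev: float       # its inverse, p(gamma | alpha)
    p_lex_fwd: float   # lexical translation probability p_lex(alpha | gamma)
    p_lex_rev: float   # its inverse, p_lex(gamma | alpha)
    rule_penalty: float = 1.0  # constant penalty steering derivation length

def rule_score(f: RuleFeatures, w: dict) -> float:
    """Log-linear contribution of one rule: weighted sum of log probabilities
    plus the weighted rule penalty."""
    return (w["p_fwd"] * math.log(f.p_fwd)
            + w["p_rev"] * math.log(f.p_rev)
            + w["p_lex_fwd"] * math.log(f.p_lex_fwd)
            + w["p_lex_rev"] * math.log(f.p_lex_rev)
            + w["rule_penalty"] * f.rule_penalty)

# Illustrative weights; real systems tune them (e.g., with MERT).
weights = {"p_fwd": 0.2, "p_rev": 0.2, "p_lex_fwd": 0.15,
           "p_lex_rev": 0.15, "rule_penalty": -0.3}
print(rule_score(RuleFeatures(0.4, 0.3, 0.2, 0.25), weights))
```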
Introduction
The popular distortion or lexicalized reordering models in phrase-based SMT make good local predictions by focusing on reordering on word level, while the synchronous context free grammars in hierarchical phrase-based (HPB) translation models are capable of handling nonlocal reordering on the translation phrase level.
Introduction
The general ideas, however, are applicable to other translation models, e.g., the phrase-based model, as well.
Introduction
Section 2 provides an overview of the HPB translation model.
Related Work
Both are close to our work; however, our model generates reordering features that are integrated into the log-linear translation model during decoding.
Related Work
In the soft constraint or reordering model approach, Liu and Gildea (2010) modeled the reordering/deletion of source-side semantic roles in a tree-to-string translation model.
Unified Linguistic Reordering Models
For models with syntactic reordering, we add two new features (i.e., one for the leftmost reordering model and the other for the rightmost reordering model) into the log-linear translation model in Eq.
Unified Linguistic Reordering Models
For the semantic reordering models, we also add two new features into the log-linear translation model.
translation model is mentioned in 10 sentences in this paper.
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Abstract
In this paper, instead of designing new features based on intuition, linguistic knowledge and domain, we learn some new and effective features using the deep auto-encoder (DAE) paradigm for the phrase-based translation model.
Experiments and Results
The baseline translation models are generated by Moses with default parameter settings.
Input Features for DNN Feature Learning
The phrase-based translation model (Koehn et al., 2003; Och and Ney, 2004) has demonstrated superior performance and been widely used in current SMT systems, and we employ our implementation on this translation model.
Input Features for DNN Feature Learning
2This corpus is used to train the translation model in our experiments, and we will describe it in detail in section 5.1.
Introduction
al., 2010), and speech spectrograms (Deng et al., 2010), we propose new feature learning using semi-supervised DAE for the phrase-based translation model.
Related Work
(2012) improved translation quality of n-gram translation model by using a bilingual neural LM, where translation probabilities are estimated using a continuous representation of translation units in lieu of standard discrete representations.
Related Work
Kalchbrenner and Blunsom (2013) introduced recurrent continuous translation models that comprise a class for purely continuous sentence-level translation models.
Related Work
(2013) presented a joint language and translation model based on a recurrent neural network which predicts target words based on an unbounded history of both source and target words.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
Each translation rule in the phrase-based translation model has a set number of features that are combined in the log-linear model (Och and Ney, 2002), and our semi-supervised DAE features can also be combined in this model.
translation model is mentioned in 10 sentences in this paper.
Tu, Mei and Zhou, Yu and Zong, Chengqing
A semantic span can include one or more eus.
1) CSS-based translation model: following formula (1), we obtain the cohesion information by modifying the translation rules with their probabilities P(e_s | f_t) based on word alignment
A semantic span can include one or more eus.
3.1 CSS-based Translation Model
A semantic span can include one or more eus.
For the existing translation models, the entire training process is conducted at the lexical or syntactic level without grammatically cohesive information.
Abstract
Our models include a CSS-based translation model, which generates new CSS-based translation rules, and a generative transfer model, which encourages producing transitional expressions during decoding.
Conclusion
In the future, we will extend our methods to other translation models, such as the syntax-based model, to study how to further improve the performance of SMT systems.
Experiments
The bilingual training data for the translation model and the CSS-based transfer model is the FBIS corpus with approximately 7.1 million Chinese words and 9.2 million English words.
Experiments
For this work, we use an in-house decoder to build the SMT baseline; it combines the hierarchical phrase-based translation model (Chiang, 2005; Chiang, 2007) with the BTG (Wu, 1996) reordering model (Xiong et al., 2006; Zens and Ney, 2006; He et al., 2010).
Experiments
To further evaluate the effectiveness of the proposed models, we also conducted an experiment on a larger set of bilingual training data from the LDC corpus for the translation model and transfer model.
Introduction
One is a new translation model that is utilized to generate new translation rules combined with the information of source functional relationships.
translation model is mentioned in 10 sentences in this paper.
Huang, Fei and Xu, Jian-Ming and Ittycheriah, Abraham and Roukos, Salim
Abstract
We present an adaptive translation quality estimation (QE) method to predict the human-targeted translation error rate (HTER) for a document-specific machine translation model.
Adaptive MT Quality Estimation
Therefore it is necessary to build a QE regression model that is robust to different document-specific translation models.
Document-specific MT System
Building a general MT system using all the parallel data not only produces a huge translation model (unless very aggressive pruning is applied), but also yields suboptimal performance on the given input document due to the unwanted dominance of out-of-domain data.
Document-specific MT System
Here we adopt the same strategy, building a document-specific translation model for each input document.
Experiments
In a typical MT QE scenario, the QE model is pre-trained and applied to various MT outputs, even though the QE training data and MT outputs are generated from different translation models.
Experiments
We train the static QE model with this training set, including the source sentences, references and MT outputs (from multiple translation models).
Experiments
To train the adaptive QE model for each test document, we build a translation model whose subsampling data includes source sentences from both the test document and the QE training data.
Introduction
First, existing approaches to MT quality estimation rely on lexical and syntactical features defined over parallel sentence pairs, which includes source sentences, MT outputs and references, and translation models (Blatz et al., 2004; Ueffing and Ney, 2007; Specia et al., 2009a; Xiong et al., 2010; Soricut and Echihabi, 2010a; Bach et al., 2011).
Static MT Quality Estimation
derived from a Maximum Entropy translation model (Ittycheriah and Roukos, 2005).
translation model is mentioned in 9 sentences in this paper.
Narayan, Shashi and Gardent, Claire
Abstract
Second, it combines a simplification model for splitting and deletion with a monolingual translation model for phrase substitution and reordering.
Experiments
We trained our simplification and translation models on the PWKP corpus.
Simplification Framework
Our simplification framework consists of a probabilistic model for splitting and dropping which we call the DRS simplification model (DRS-SM); a phrase-based translation model for substitution and reordering (PBMT); and a language model learned on Simple English Wikipedia (LM) for fluency and grammaticality.
Simplification Framework
where the probabilities p(s'|DC), p(s'|s) and p(s) are given by the DRS simplification model, the phrase-based machine translation model and the language model respectively.
Simplification Framework
Our phrase-based translation model is trained using the Moses toolkit with its default command-line options on the PWKP corpus (except the sentences from the test set), considering the complex sentence as the source and the simpler one as the target.
translation model is mentioned in 5 sentences in this paper.
Salameh, Mohammad and Cherry, Colin and Kondrak, Grzegorz
Conclusion
Eventually, we would like to replace the functionality of factored translation models (Koehn and Hoang, 2007) with lattice transformation and augmentation.
Experimental Setup
Four translation model features encode phrase translation probabilities and lexical scores in both directions.
Introduction
Morphological complexity leads to much higher type-to-token ratios than English, which can create sparsity problems during translation model estimation.
Related Work
Most techniques approach the problem by transforming the target language in some manner before training the translation model.
Related Work
In this setting, the sparsity reduction from segmentation helps word alignment and target language modeling, but it does not result in a more expressive translation model.
translation model is mentioned in 5 sentences in this paper.
van Gompel, Maarten and van den Bosch, Antal
Introduction
the role of the translation model in Statistical Machine Translation (SMT).
System
The language model is a trigram-based back-off language model with Kneser-Ney smoothing, computed using SRILM (Stolcke, 2002) and trained on the same training data as the translation model .
System
We do so by normalising the class probability from the classifier (score_T(H)), which is our translation model, and the language model (score_lm(H)), in such a way that the highest classifier score for the alternatives under consideration is always 1.0, and the highest language model score of the sentence is always 1.0.
System
If desired, the search can be parametrised with variables λ3 and λ4, representing the weights we want to attach to the classifier-based translation model and the language model, respectively.
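A minimal sketch of the normalisation and weighting described in the two excerpts above: each hypothesis's classifier (translation model) score and language model score are divided by the best respective score among the alternatives, then weighted by λ3 and λ4. The weighted sum used for the final combination is an assumption for illustration, not necessarily the paper's exact formula.

```python
def pick_translation(hypotheses, lambda3=1.0, lambda4=1.0):
    """hypotheses: list of (translation, score_T, score_lm) triples, where
    score_T is the classifier-based translation model score and score_lm the
    language model score. Each score is normalized so that the best value
    among the alternatives is 1.0, then the two are weighted by lambda3 and
    lambda4 and combined (a simple weighted sum is assumed here)."""
    max_t = max(score_t for _, score_t, _ in hypotheses)
    max_lm = max(score_lm for _, _, score_lm in hypotheses)
    scored = [
        (translation,
         lambda3 * (score_t / max_t) + lambda4 * (score_lm / max_lm))
        for translation, score_t, score_lm in hypotheses
    ]
    return max(scored, key=lambda pair: pair[1])

# Toy usage with invented scores:
print(pick_translation([("woning", 0.45, 0.004), ("huis", 0.60, 0.002)]))
```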
translation model is mentioned in 5 sentences in this paper.
Auli, Michael and Gao, Jianfeng
Expected BLEU Training
We summarize the weights of the recurrent neural network language model as θ = {U, W, V} and add the model as an additional feature to the log-linear translation model using the simplified notation s_θ(w_t) ≡ s(w_t | w_1 ... w_{t-1}, h_{t-1}):
Expected BLEU Training
The translation model is parameterized by λ and θ, which are learned as follows (Gao et al., 2014):
Experiments
Translation models are estimated on 102M words of parallel data for French-English, and 99M words for German-English; about 6.5M words for each language pair are newswire, the remainder are parliamentary proceedings.
Introduction
Neural network-based language and translation models have achieved impressive accuracy improvements on statistical machine translation tasks (Allauzen et al., 2011; Le et al., 2012b; Schwenk et al., 2012; Vaswani et al., 2013; Gao et al., 2014).
translation model is mentioned in 4 sentences in this paper.
Saluja, Avneesh and Hassan, Hany and Toutanova, Kristina and Quirk, Chris
Conclusion
In this work, we presented an approach that can expand a translation model extracted from a sentence-aligned, bilingual corpus using a large amount of unstructured, monolingual data in both source and target languages, which leads to improvements of 1.4 and 1.2 BLEU points over strong baselines on evaluation sets, and in some scenarios gains in excess of 4 BLEU points.
Generation & Propagation
We assume that sufficient parallel resources exist to learn a basic translation model using standard techniques, and also assume the availability of larger monolingual corpora in both the source and target languages.
Introduction
Unlike previous work (Irvine and Callison-Burch, 2013a; Razmara et al., 2013), we use higher order n-grams instead of restricting to unigrams, since our approach goes beyond OOV mitigation and can enrich the entire translation model by using evidence from monolingual text.
Related Work
(2013) and Irvine and Callison-Burch (2013a) conduct a more extensive evaluation of their graph-based BLI techniques, where the emphasis and end-to-end BLEU evaluations concentrated on OOVs, i.e., unigrams, and not on enriching the entire translation model.
translation model is mentioned in 4 sentences in this paper.
Devlin, Jacob and Zbib, Rabih and Huang, Zhongqiang and Lamar, Thomas and Schwartz, Richard and Makhoul, John
Introduction
They have since been extended to translation modeling, parsing, and many other NLP tasks.
Model Variations
4.1 Neural Network Lexical Translation Model (NNLTM)
Model Variations
In order to assign a probability to every source word during decoding, we also train a neural network lexical translation model (NNLTM).
translation model is mentioned in 3 sentences in this paper.
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Abstract
RZNN is a combination of a recursive neural network and a recurrent neural network, and in turn integrates their respective capabilities: (1) new information can be used to generate the next hidden state, like recurrent neural networks, so that the language model and the translation model can be integrated naturally; (2) a tree structure can be built, as in recursive neural networks, so as to generate the translation candidates in a bottom-up manner.
Introduction
DNN is also introduced to Statistical Machine Translation (SMT) to learn several components or features of conventional framework, including word alignment, language modelling, translation modelling and distortion modelling.
Introduction
(2013) propose a joint language and translation model, based on a recurrent neural network.
translation model is mentioned in 3 sentences in this paper.