Integration with SMT | We use the following method to integrate our transliterator into the overall SMT system:
Integration with SMT | In a tuning step, the Minimum Error Rate Training component of our SMT system iteratively adjusts the set of rule weights, including the weight associated with the transliteration feature, such that the English translations are optimized with respect to a set of known reference translations according to the BLEU translation metric.
Integration with SMT | At runtime, the transliterations then compete with the translations generated by the general SMT system.
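The tuning step described above can be illustrated with a toy weight sweep. This is only a sketch of the idea, not the exact MERT line-search algorithm (Och, 2003): it picks the transliteration-feature weight whose 1-best outputs best match the references, using invented n-best entries and exact-match counts in place of BLEU.

```python
# Toy illustration of tuning one feature weight against references.
# All hypotheses, scores, and the match criterion are invented; real MERT
# performs an exact line search over BLEU on n-best lists.

def one_best(nbest, w):
    """Pick the hypothesis maximizing base_score + w * translit_feature."""
    return max(nbest, key=lambda h: h[1] + w * h[2])[0]

def tune_weight(nbest_lists, refs, grid):
    """Return the grid weight whose 1-best outputs match the most references."""
    def n_matches(w):
        return sum(one_best(nb, w) == ref for nb, ref in zip(nbest_lists, refs))
    return max(grid, key=n_matches)

# One source sentence with a 2-entry n-best list: (hypothesis, base, translit).
nbest = [[("good", 1.0, 0.0), ("better", 0.5, 1.0)]]
best_w = tune_weight(nbest, ["better"], grid=[0.0, 0.5, 1.0])
```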
Introduction | The SMT system drops most names in this example. |
Introduction | The simplest way to integrate name handling into SMT is: (1) run a named-entity identification system on the source sentence, (2) transliterate identified entities with a special-purpose transliteration component, and (3) run the SMT system on the source sentence, as usual, but when looking up phrasal translations for the words identified in step 1, instead use the transliterations from step 2. |
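The three steps above can be sketched as follows; the named-entity list, transliteration table, and phrase table here are hypothetical stand-ins for a real NER component, transliterator, and SMT decoder.

```python
# Toy sketch of the three-step name-handling pipeline.
# All entries below are invented stand-ins for real components.

TRANSLIT = {"mHmwd": "Mahmoud"}             # step 2: special-purpose transliterator
NAMES = set(TRANSLIT)                       # step 1: identified named entities
PHRASE_TABLE = {"qAl": "said", "fy": "in"}  # step 3: ordinary phrasal lookups

def translate(tokens):
    out = []
    for t in tokens:
        if t in NAMES:
            out.append(TRANSLIT[t])             # override lookup with transliteration
        else:
            out.append(PHRASE_TABLE.get(t, t))  # normal lookup; pass unknowns through
    return " ".join(out)
```

As the next sentence cautions, always preferring the transliteration can hurt when the phrase table already translates a frequent name correctly.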
Introduction | The base SMT system may translate a commonly-occurring name just fine, due to the bitext it was trained on, while the transliteration component can easily supply a worse answer. |
Abstract | In this paper, we propose a statistical model to generate appropriate measure words for nouns in an English-to-Chinese SMT system.
Abstract | Our model works as a postprocessing procedure over the output of statistical machine translation systems, and can work with any SMT system.
Introduction | English-Chinese SMT Systems |
Introduction | However, as we will show below, existing SMT systems do not deal well with the measure word generation in general due to data sparseness and long distance dependencies between measure words and their corresponding head words. |
Introduction | Due to the limited size of bilingual corpora, many measure words, as well as the collocations between a measure word and its head word, cannot be well covered by the phrase translation table in an SMT system.
Our Method | For those having English translations, such as “米” (meter), “吨” (ton), we just use the translation produced by the SMT system itself.
Our Method | The model is applied to SMT system outputs as a postprocessing procedure. |
Our Method | Based on contextual information contained in both the input source sentence and the SMT system’s output translation, a measure word candidate set M is constructed.
Abstract | Furthermore, integrated Model-III achieves overall 3.48 BLEU points improvement and 2.62 TER points reduction in comparison with the pure SMT system.
Conclusion and Future Work | The experiments show that the proposed Model-III outperforms both the TM and the SMT systems significantly (p < 0.05) in either BLEU or TER when fuzzy match score is above 0.4. |
Conclusion and Future Work | Compared with the pure SMT system, Model-III achieves overall 3.48 BLEU points improvement and 2.62 TER points reduction on a Chinese-English TM database.
Experiments | For the phrase-based SMT system, we adopted the Moses toolkit (Koehn et al., 2007).
Experiments | We first extract 95% of the bilingual sentences as a new training corpus to train an SMT system.
Experiments | Scores marked by “*” are significantly better (p < 0.05) than both the TM and the SMT systems.
Introduction | In particular, there is no guarantee that an SMT system can produce translations in a consistent manner (Ma et al., 2011).
Introduction | Afterwards, they merge the relevant translations of matched segments into the source sentence, and then force the SMT system to only translate those unmatched segments at decoding. |
Introduction | Compared with the pure SMT system, the proposed integrated Model-III achieves 3.48 BLEU points improvement and 2.62 TER points reduction overall.
Abstract | Existing work that uses two independently trained SMT systems cannot directly optimize the paraphrase results. |
Abstract | In this paper, we propose a joint learning method of two SMT systems to optimize the process of paraphrase generation. |
Abstract | In addition, a revised BLEU score (called iBLEU) which measures the adequacy and diversity of the generated paraphrase sentence is proposed for tuning parameters in SMT systems.
Introduction | Thus researchers leverage bilingual parallel data for this task and apply two SMT systems (a dual SMT system) to translate the original sentences into another pivot language and then translate them back into the original language.
Introduction | Context features are added into the SMT system to improve translation correctness against polysemy.
Introduction | Previous work employs two separately trained SMT systems, the parameters of which are tuned for the SMT scheme and therefore cannot directly optimize for the paraphrase purposes, for example, optimizing the diversity against the input.
Paraphrasing with a Dual SMT System | Generating sentential paraphrase with the SMT system is done by first translating a source sentence into another pivot language, and then back into the source. |
Paraphrasing with a Dual SMT System | Here, we call these two procedures a dual SMT system.
Paraphrasing with a Dual SMT System | 2.1 Joint Inference of Dual SMT System |
Abstract | Then we employ a hybrid method combining RBMT and SMT systems to fill up the data gap for pivot translation, where the source-pivot and pivot-target corpora are independent. |
Experiments | 5.3 Results by Using SMT Systems |
Experiments | Table 3: CRR/ASR translation results by using SMT systems |
Experiments | 5.4 Results by Using both RBMT and SMT Systems |
Introduction | Unfortunately, large quantities of parallel data are not readily available for some language pairs, therefore limiting the potential use of current SMT systems.
Introduction | Experimental results show that (1) the performances of the three pivot methods are comparable when only SMT systems are used.
Pivot Methods for Phrase-based SMT | where L is the number of features used in SMT systems.
Pivot Methods for Phrase-based SMT | This can be achieved by translating the pivot sentences in the source-pivot corpus to target sentences with the pivot-target SMT system.
Pivot Methods for Phrase-based SMT | The other is to obtain source translations for the target sentences in the pivot-target corpus using the pivot-source SMT system.
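A third pivot strategy often discussed alongside the two synthetic-corpus directions above is phrase-table triangulation: estimate source-target phrase probabilities by marginalizing over pivot phrases, p(t|s) = sum over p of p(t|p) * p(p|s). A minimal sketch with invented phrases and probabilities:

```python
from collections import defaultdict

# Toy source->pivot and pivot->target phrase tables: {phrase: {translation: prob}}.
# All phrases and probabilities are invented for illustration.
src_piv = {"maison": {"house": 0.7, "home": 0.3}}
piv_tgt = {"house": {"Haus": 0.8, "Heim": 0.2},
           "home":  {"Heim": 0.9, "Haus": 0.1}}

def triangulate(src_piv, piv_tgt):
    """p(t|s) = sum over pivot phrases p of p(t|p) * p(p|s)."""
    src_tgt = defaultdict(lambda: defaultdict(float))
    for s, pivots in src_piv.items():
        for p, p_ps in pivots.items():
            for t, p_tp in piv_tgt.get(p, {}).items():
                src_tgt[s][t] += p_tp * p_ps
    return src_tgt

table = triangulate(src_piv, piv_tgt)
# p("Haus" | "maison") = 0.7*0.8 + 0.3*0.1 = 0.59
```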
Using RBMT Systems for Pivot Translation | Since it is easier to obtain monolingual corpora than bilingual corpora, we use RBMT systems to translate the available monolingual corpora into synthetic bilingual corpora, which are added to the training data to improve the performance of SMT systems.
Abstract | We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT. |
Abstract | As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU points on a phrase-based SMT system and 1.76 BLEU points on a parsing-based SMT system.
Experiments on Phrase-Based SMT | We use the FBIS corpus to train the Chinese-to-English SMT systems.
Experiments on Word Alignment | To train a Chinese-to-English SMT system , we need to perform both Chinese-to-English and |
Improving Phrase Table | A phrase-based SMT system automatically extracts bilingual phrase pairs from the word-aligned bilingual corpus.
Improving Phrase Table | These collocation probabilities are incorporated into the phrase-based SMT system as features. |
Improving Statistical Bilingual Word Alignment | This method ignores the correlation of the words in the same alignment unit, so an alignment may include many unrelated words, which influences the performance of SMT systems.
Introduction | 1993) is the base of most SMT systems.
Introduction | Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve phrase table for phrase-based SMT. |
Introduction | Then the phrase collocation probabilities are used as additional features in phrase-based SMT systems.
Abstract | To adapt boosting to SMT system combination, several key components of the original boosting algorithms are redesigned in this work. |
Background | where Pr(e|f) is the probability that e is the translation of the given source string f. To model the posterior probability Pr(e|f), most of the state-of-the-art SMT systems utilize the log-linear model proposed by Och and Ney (2002), as follows,
Background | In this paper, u denotes a log-linear model that has M fixed features {h1(f,e), ..., hM(f,e)}, λ = {λ1, ..., λM} denotes the M parameters of u, and u(λ) denotes an SMT system based on u with parameters λ.
Background | Suppose that there are T available SMT systems {u1(λ*1), ..., uT(λ*T)}; the task of system combination is to build a new translation system v(u1(λ*1), ..., uT(λ*T)) from {u1(λ*1), ..., uT(λ*T)}.
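The log-linear model referred to above scores a candidate translation e for a source f as the weighted feature sum over m of λ_m · h_m(f, e), and the decoder returns the argmax over candidates. A minimal sketch with invented candidates and feature values:

```python
# Sketch of log-linear scoring and argmax decoding over a fixed candidate set.
# Candidates, feature values, and weights are invented; a real system has
# language model, translation model, and penalty features among others.

def loglinear_score(features, weights):
    """Unnormalized log-linear score: sum_m lambda_m * h_m(f, e)."""
    return sum(w * h for w, h in zip(weights, features))

def best_translation(candidates, weights):
    """argmax over candidate translations e of the log-linear score.

    `candidates` maps each candidate string to its feature vector."""
    return max(candidates, key=lambda e: loglinear_score(candidates[e], weights))

cands = {"the house": [-2.0, -1.0], "a house": [-2.5, -0.5]}
weights = [1.0, 0.5]
# scores: "the house" -> -2.5, "a house" -> -2.75, so "the house" wins
```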
Introduction | With the emergence of various structurally different SMT systems, more and more studies are focused on combining multiple SMT systems for achieving higher translation accuracy rather than using a single translation system. |
Introduction | One of the key factors in SMT system combination is the diversity in the ensemble of translation outputs (Macherey and Och, 2007). |
Introduction | However, this requirement cannot be met in many cases, since we do not always have the access to multiple SMT engines due to the high cost of developing and tuning SMT systems.
Experiments | The bilingual SMT system used in our experiments is the state-of-the-art SCFG decoder CDEC (Dyer et al., 2010).
Experiments | We trained the SMT system on the English-German parallel web data provided in the COMMON CRAWL6 (Smith et al., 2013) dataset. |
Experiments | Method 1 is the baseline system, consisting of the CDEC SMT system trained on the COMMON CRAWL data as described above. |
Grounding SMT in Semantic Parsing | Given a manual German translation of the English query as source sentence, the SMT system produces an English target translation. |
Introduction | This avoids the problem of un-reachability of independently generated reference translations by the SMT system . |
Introduction | in which SMT systems can be trained and evaluated. |
Related Work | (2012) propose a setup where an SMT system feeds into cross-language information retrieval, and receives feedback from the performance of translated queries with respect to cross-language retrieval performance. |
Related Work | However, despite offering direct and reliable prediction of translation quality, the cost and lack of reusability has confined task-based evaluations involving humans to testing scenarios, but prevented a use for interactive training of SMT systems as in our work. |
Response-based Online Learning | Firstly, update rules that require computing a feature representation for the reference translation are suboptimal in SMT, because often human-generated reference translations cannot be generated by the SMT system.
Response-based Online Learning | Such “un-reachable” gold-standard translations need to be replaced by “surrogate” gold-standard translations that are close to the human-generated translations and still lie within the reach of the SMT system.
Introduction | While the research in statistical machine translation (SMT) has made significant progress, most SMT systems (Koehn et al., 2003; Chiang, 2007; Galley et al., 2006) rely on parallel corpora to extract translation entries.
Introduction | The richness and complexity of Chinese abbreviations impose challenges on SMT systems.
Introduction | In particular, many Chinese abbreviations may not appear in available parallel corpora, in which case current SMT systems treat them as unknown words and leave them untranslated.
Unsupervised Translation Induction for Chinese Abbreviations | Moreover, our approach utilizes both Chinese and English monolingual data to help MT, while most SMT systems utilize only the English monolingual data to build a language model.
Conclusion and Future Work | Experimental results show that our approach is promising for SMT systems to learn a better translation model. |
Experiments | This proves that bilingually induced topic representation with neural network helps the SMT system disambiguate translation candidates. |
Introduction | For example, translation sense disambiguation approaches (Carpuat and Wu, 2005; Carpuat and Wu, 2007) are proposed for phrase-based SMT systems.
Introduction | Meanwhile, for hierarchical phrase-based or syntax-based SMT systems, there is also much work involving rich contexts to guide rule selection (He et al., 2008; Liu et al., 2008; Marton and Resnik, 2008; Xiong et al., 2009).
Introduction | Although these methods are effective and proven successful in many SMT systems , they only leverage within- |
Topic Similarity Model with Neural Network | For the SMT system, the best translation candidate ê is given by:
Abstract | We evaluated our approach on large-scale Japanese-English and English-Japanese machine translation tasks, and show that it can significantly outperform the baseline phrase-based SMT system.
Conclusion and Future Work | We also expect to explore better ways to integrate the ranking reorder model into the SMT system instead of a simple penalty scheme.
Experiments | Lexicon features generally continue to improve the RankingSVM accuracy and reduce CLN on training data, but they do not bring further improvement for SMT systems beyond the top 100 most frequent words. |
Integration into SMT system | There are two ways to integrate the ranking reordering model into a phrase-based SMT system: the pre-reorder method, and the decoding-time constraint method.
Introduction | This is usually done in a preprocessing step, and then followed by a standard phrase-based SMT system that takes the reordered source sentence as input to finish the translation. |
Introduction | The ranking model can not only be used in a pre-reordering based SMT system , but also be integrated into a phrase-based decoder serving as additional distortion features. |
Introduction | We evaluated our approach on large-scale Japanese-English and English-Japanese machine translation tasks, and experimental results show that our approach can bring significant improvements to the baseline phrase-based SMT system in both pre-ordering and integrated decoding settings.
Features | Given a source sentence f, let {e_n} (n = 1, ..., N) be the N-best list generated by an SMT system, and let e_n^i be the i-th word in e_n.
Introduction | Automatically distinguishing incorrect parts from correct parts is therefore very desirable not only for post-editing and interactive machine translation (Ueffing and Ney, 2007) but also for SMT itself: either by rescoring hypotheses in the N-best list using the probability of correctness calculated for each hypothesis (Zens and Ney, 2006) or by generating new hypotheses using N-best lists from one SMT system or multiple systems.
Introduction | 1) Calculate features that express the correctness of words either based on the SMT model (e.g. translation/language model) or based on SMT system output (e.g.
Introduction | However, it is not adequate to just use these features as discussed in (Shi and Zhou, 2005) because the information that they carry is either from the inner components of SMT systems or from system outputs. |
Abstract | Our inflection generation models are trained independently of the SMT system . |
Abstract | We investigate different ways of combining the inflection prediction component with the SMT system by training the base MT system on fully inflected forms or on word stems. |
Abstract | We applied our inflection generation models in translating English into two morphologically complex languages, Russian and Arabic, and show that our model improves the quality of SMT over both phrasal and syntax-based SMT systems according to BLEU and human judgements.
Conclusion and future work | We have shown that an independent model of morphology generation can be successfully integrated with an SMT system, making improvements in both phrasal and syntax-based MT.
Introduction | We also demonstrate that our independently trained models are portable, showing that they can improve both syntactic and phrasal SMT systems.
Related work | In recent work, Koehn and Hoang (2007) proposed a general framework for including morphological features in a phrase-based SMT system by factoring the representation of words into a vector of morphological features and allowing a phrase-based MT system to work on any of the factored representations, which is implemented in the Moses system. |
Conclusion | We describe four versions of the model and implement an algorithm to integrate our proposed model into a syntax-based SMT system.
Decoding | Integrating the TNO Model into syntax-based SMT systems is nontrivial, especially with the MOS modeling. |
Introduction | To show the effectiveness of our model, we integrate our TNO model into a state-of-the-art syntax-based SMT system , which uses synchronous context-free grammar (SCFG) rules to jointly model reordering and lexical translation. |
Introduction | We show the efficacy of our proposal in a large-scale Chinese-to-English translation task where the introduction of our TNO model provides a significant gain over a state-of-the-art string-to-dependency SMT system (Shen et al., 2008) that we enhance with additional state-of-the-art features. |
Introduction | Even though the experimental results carried out in this paper employ SCFG-based SMT systems, we would like to point out that our model is applicable to other systems, including phrase-based SMT systems.
Model Decomposition and Variants | Each of these factors will act as an additional feature in the log-linear framework of our SMT system . |
Abstract | Evaluations of Japanese-to-English translation on the NTCIR-9 data show that our induced Japanese POS tags for dependency trees improve the performance of a forest-to-string SMT system.
Experiment | We evaluated our bilingual infinite tree model for POS induction using an in-house developed syntax-based forest-to-string SMT system . |
Experiment | Under the Moses phrase-based SMT system (Koehn et al., 2007) with the default settings, we achieved a 26.80% BLEU score. |
Introduction | If we could discriminate POS tags for two cases, we might improve the performance of a Japanese-to-English SMT system . |
Introduction | Experiments are carried out on the NTCIR-9 Japanese-to-English task using a binarized forest-to-string SMT system with dependency trees as its source side. |
Decoding with Sense-Based Translation Model | Figure 2: Architecture of SMT system with the sense-based translation model. |
Decoding with Sense-Based Translation Model | Figure 2 shows the architecture of the SMT system enhanced with the sense-based translation model. |
Experiments | Our baseline system is a state-of-the-art SMT system which adapts Bracketing Transduction Grammars (Wu, 1997) to phrasal translation and equips itself with a maximum entropy based reordering model (Xiong et al., 2006). |
Introduction | These glosses, used as the sense predictions of their WSD system, are integrated into a word-based SMT system either to substitute for translation candidates of their translation model or to post-edit the output of their SMT system.
Introduction | We integrate the proposed sense-based translation model into a state-of-the-art SMT system and conduct experiments on Chinese-to-English translation using large-scale training data.
Discussion | MERT and MBR decoding are popular techniques for incorporating the final evaluation metric into the development of SMT systems.
Introduction | These two techniques were originally developed for N-best lists of translation hypotheses and recently extended to translation lattices (Macherey et al., 2008; Tromble et al., 2008) generated by a phrase-based SMT system (Och and Ney, 2004).
Introduction | SMT systems based on synchronous context free grammars (SCFG) (Chiang, 2007; Zollmann and Venugopal, 2006; Galley et al., 2006) have recently been shown to give competitive performance relative to phrase-based SMT. |
Translation Hypergraphs | A translation lattice compactly encodes a large number of hypotheses produced by a phrase-based SMT system.
Translation Hypergraphs | The corresponding representation for an SMT system based on SCFGs (e.g. |
Introduction | Further experiments in machine translation also suggest that the obtained subtree alignment can improve the performance of both phrase and syntax based SMT systems.
Substructure Spaces for BTKs | Most linguistically motivated syntax based SMT systems require an automatic parser to perform the rule induction. |
Substructure Spaces for BTKs | We explore the effectiveness of subtree alignment for both phrase based and linguistically motivated syntax based SMT systems.
Substructure Spaces for BTKs | The findings suggest that with the modeling of non-syntactic phrases maintained, more emphasis on syntactic phrases can benefit both the phrase and syntax based SMT systems.
Conclusions and Future Work | The two models have been integrated into a phrase-based SMT system and evaluated on Chinese-to-English translation tasks using large-scale training data. |
Introduction | Unfortunately they are usually neither correctly translated nor translated at all in many SMT systems according to the error study by Wu and Fung (2009a). |
Introduction | This suggests that conventional lexical and phrasal translation models adopted in those SMT systems are not sufficient to correctly translate predicates in source sentences.
Related Work | (2011) incorporate source language semantic role labels into a tree-to-string SMT system . |
Ensemble Decoding | As in the Hiero SMT system (Chiang, 2005), the cells which span up to a certain length (i.e. |
Introduction | Common techniques for model adaptation adapt two main components of contemporary state-of-the-art SMT systems: the language model and the translation model.
Related Work 5.1 Domain Adaptation | In a similar approach, Koehn and Schroeder (2007) use a feature of the factored translation model framework in the Moses SMT system to use multiple alternative decoding paths.
Related Work 5.1 Domain Adaptation | The Moses SMT system implements (Koehn and Schroeder, |
Conclusion | In the future, we will extend our methods to other translation models, such as the syntax-based model, to study how to further improve the performance of SMT systems.
Introduction | In Section 3, we present our models and show how to integrate the models into an SMT system . |
Related Work | They added the labels assigned to connectives as an additional input to an SMT system , but their experimental results show that the improvements under the evaluation metric of BLEU were not significant. |
Related Work | To the best of our knowledge, our work is the first attempt to exploit the source functional relationship to generate the target transitional expressions for grammatical cohesion, and we have successfully incorporated the proposed models into an SMT system with significant improvement of BLEU metrics. |
Abstract | (2009) improved a syntactic SMT system by adding as many as ten thousand syntactic features, and used Margin Infused Relaxed Algorithm (MIRA) to train the feature weights. |
Abstract | Our work is based on a phrase-based SMT system . |
Abstract | In a phrase-based SMT system , the total number of parameters of phrase and lexicon translation models, which we aim to learn discriminatively, is very large (see Table 1). |
Experiments | In order to simulate pass-through behavior of out-of-vocabulary terms in SMT systems, additional features accounting for source and target term identity were added to the DK and BM models.
Introduction | This approach is advantageous if large amounts of in-domain sentence-parallel data are available to train SMT systems , but relevance rankings to train retrieval models are not. |
Related Work | In a direct translation approach (DT), a state-of-the-art SMT system is used to produce a single best translation that is used as search query in the target language. |
Related Work | For example, Google’s CLIR approach combines their state-of-the-art SMT system with their proprietary search engine (Chin et al., 2008). |
Discussion | Removing redundant words: In most cases, translating redundant words may confuse the SMT system and is unnecessary.
Introduction | The translation quality of the SMT system is highly related to the coverage of translation models. |
Introduction | This problem is more serious for online SMT systems in real-world applications. |
Introduction | the input sentences of the SMT system using automatically extracted paraphrase rules which can capture structures on sentence level in addition to paraphrases on the word or phrase level. |
Experiments | To optimize the SMT system, we tune the parameters on NIST MT06, and report results on three test sets: MT02, MT03 and MT05.
Introduction | In contrast, machine translation uses inherently multilingual data: an SMT system must translate a phrase or sentence from a source language to a different target language, so existing applications of topic models (Eidelman et al., 2012) are wilfully ignoring available information on the target side that could aid domain discovery. |
Topic Models for Machine Translation | Cross-Domain SMT An SMT system is usually trained on documents of the same genre (e.g., sports, business) from a similar style (e.g., newswire, blog posts).
Topic Models for Machine Translation | Domain Adaptation for SMT Training an SMT system using diverse data requires domain adaptation.
Conclusions | Some SMT systems never get deployed because of legitimate and incompatible concerns of the prospective users and of the training data owners. |
Conclusions | This same method can be easily extended to other resources used by SMT systems, and indeed even beyond SMT itself, whenever similar constraints on data access exist.
Experiments | We validated our simple implementation using a phrase table of 38,488,777 lines created with the Moses toolkit (Koehn et al., 2007) phrase-based SMT system, corresponding to 15,764,069 entries
Introduction | At the same time, the prospective user of the SMT system that could be derived from such TM might be subject to confidentiality constraints on the text stream needing translation, so that sending out text to translate to an SMT system deployed by the owner of the PT is not an option. |
Introduction | Although modern SMT systems have switched to a discriminative log-linear framework, which allows for additional sources as features, it is generally hard to incorporate dependencies beyond a small window of adjacent words, thus making it difficult to use linguistically-rich models. |
Introduction | We believe that the semantic and pragmatic information captured in the form of DTs (i) can help develop discourse-aware SMT systems that produce coherent translations, and (ii) can yield better MT evaluation metrics. |
Introduction | While in this work we focus on the latter, we think that the former is also within reach, and that SMT systems would benefit from preserving the coherence relations in the source language when generating target-language translations. |
Training | In the translation tasks, we used the Moses phrase-based SMT systems (Koehn et al., 2007). |
Training | In addition, for a detailed comparison, we evaluated the SMT system where the IBM Model 4 was trained from all the training data (IBM4all).
Training | Consequently, the SMT system using RNNu+c trained from a small part of the training data can achieve comparable performance to that using IBM4 trained from all the training data, which is shown in Table 3.
Machine Translation Experiments | We used MT08 and EgyDevV3 to tune SMT systems while we divided the remaining sets among classifier training data (5,562 sentences), dev (1,802 sentences) and blind test (1,804 sentences) sets to ensure each of these new sets has a variety of dialects and genres (weblog and newswire). |
Machine Translation Experiments | This MSA-pivoting system uses Salloum and Habash (2013)’s DA-MSA MT system followed by an Arabic-English SMT system which is trained on both corpora augmented with the DA-English where the DA side is preprocessed with the same DA-MSA MT system then tokenized with MADA-ARZ. |
Machine Translation Experiments | Test sets are similarly preprocessed before decoding with the SMT system.
Methods | We approach this problem by augmenting an SMT system built over target segments with features that reflect the desegmented target words. |
Methods | In this section, we describe our various strategies for desegmenting the SMT system’s output space, along with the features that we add to take advantage of this desegmented view. |
Related Work | Other approaches train an SMT system to predict lemmas instead of surface forms, and then inflect the SMT output as a postprocessing step (Minkov et al., 2007; Clifton and Sarkar, 2011; Fraser et al., 2012; El Kholy and Habash, 2012b). |
Abstract | Current SMT systems usually decode with single translation models and cannot benefit from the strengths of other models in decoding phase. |
Background | Most SMT systems approximate the summation over all possible derivations by using the 1-best derivation for efficiency.
Background | By now, most current SMT systems, adopting either max-derivation decoding or max-translation decoding, have only used single models in the decoding phase.
Introduction | The contribution of this paper is to improve the prediction of case in our SMT system by implementing and combining two alternative routes to integrate subcategorization information from the syntax-semantic interface: (i) We regard the translation as a function of the source language input, and project the syntactic functions of the English nouns to their German translations in the |
Translation pipeline | Table 2 illustrates the different steps of the inflection process: the markup (number and gender on nouns) in the stemmed output of the SMT system is part of the input to the respective feature prediction. |
Using subcategorization information | In contrast, the SMT system often produces more isomorphic translations, which is helpful for annotating source-side features on the target language. |
Experimental Evaluation | The baseline system is a standard phrase-based SMT system with eight features: phrase translation and word lexicon probabilities in both translation directions, phrase penalty, word penalty, language model score and a simple distance-based reordering model. |
Introduction | A phrase-based SMT system takes a source sentence and produces a translation by segmenting the sentence into phrases and translating those phrases separately (Koehn et al., 2003). |
Introduction | The phrase translation table, which contains the bilingual phrase pairs and the corresponding translation probabilities, is one of the main components of an SMT system.
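The segment-and-translate idea can be sketched with a greedy longest-match, monotone lookup against a toy phrase table. A real phrase-based decoder searches over segmentations and reorderings and scores with a language model; the entries below are invented:

```python
# Toy monotone phrase-based translation: greedily segment the source into the
# longest phrases found in the phrase table and translate each independently.
# Table entries are illustrative, not from a real system.

PHRASE_TABLE = {
    ("das", "haus"): "the house",
    ("das",): "the",
    ("haus",): "house",
    ("ist", "klein"): "is small",
}

def translate_monotone(tokens, table, max_len=3):
    out, i = [], 0
    while i < len(tokens):
        # Prefer the longest phrase starting at position i.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + length])
            if phrase in table:
                out.append(table[phrase])
                i += length
                break
        else:
            out.append(tokens[i])  # pass unknown words through untranslated
            i += 1
    return " ".join(out)
```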
Inferring a learning curve from mostly monolingual data | For enabling this work we trained a multitude of instances of the same phrase-based SMT system on 30 distinct combinations of language-pair and domain, each with fourteen distinct training sets of increasing size and tested these instances on multiple in-domain datasets, generating 96 learning curves.
Introduction | This prediction, or more generally the prediction of the learning curve of an SMT system as a function of available in-domain parallel data, is the objective of this paper. |
Introduction | An extensive study across six parametric function families, empirically establishing that a certain three-parameter power-law family is well suited for modeling learning curves for the Moses SMT system when the evaluation score is BLEU. |
Introduction | However, the state-of-the-art SMT systems translate sentences by using sequences of synchronous rules or phrases, instead of translating word by word. |
Topic Similarity Model | The Hellinger function is used to calculate distribution distance and is popular in topic modeling (Blei and Lafferty, 2007). By topic similarity, we aim to encourage or penalize the application of a rule for a given document according to their topic distributions, which then helps the SMT system make better translation decisions.
Topic Similarity Model | By incorporating the topic sensitivity model with the topic similarity model, we enable our SMT system to balance the selection of these two types of rules. |