Cross-lingual Features | We experimented with three different cross-lingual features that used Arabic and English Wikipedia cross-language links and a true-cased phrase table that was generated using Moses (Koehn et al., 2007). |
Cross-lingual Features | The phrase table was trained on a set of 3.69 million parallel sentences containing 123.4 million English tokens. |
Cross-lingual Features | To capture cross-lingual capitalization, we used the aforementioned true-cased phrase table at word and |
Introduction | Cross-lingual links are obtained using Wikipedia cross-language links and a large Machine Translation (MT) phrase table that is true-cased, i.e., word casing is preserved during training.
Related Work | Transliteration Mining (TM) has been used to enrich MT phrase tables or to improve cross language search (Udupa et al., 2009). |
Adaptive Online MT | Finally, large data structures such as the language model (LM) and phrase table exist in shared memory, obviating the need for remote queries. |
Analysis | In Table 6, A is the set of phrase table features that received a nonzero weight when tuned on dataset DA (and B likewise for DB).
Analysis | Phrase table features in A ∩ B are overwhelmingly short, simple, and correct phrases, suggesting L1 regularization is effective for feature selection.
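The intersection described above can be computed directly from the tuned weight vectors. A minimal sketch follows; the feature names and the `eps` zero-threshold are illustrative assumptions, not the paper's actual feature inventory:

```python
# Compare discriminative phrase-table features that receive nonzero
# weight under two different tuning sets (toy data, illustrative names).
def nonzero_features(weights, eps=1e-8):
    """Return the set of feature names whose tuned weight is nonzero."""
    return {name for name, w in weights.items() if abs(w) > eps}

# Toy weight vectors after L1-regularized tuning on datasets DA and DB.
weights_A = {"ptf:the|le": 0.41, "ptf:house|maison": 0.0, "ptf:run|courir": -0.12}
weights_B = {"ptf:the|le": 0.37, "ptf:house|maison": 0.25, "ptf:run|courir": 0.0}

A = nonzero_features(weights_A)
B = nonzero_features(weights_B)
shared = A & B  # features selected under both tuning sets
```

With L1 regularization, features whose weights are driven exactly to zero drop out of each set, so `shared` contains only rules that survive selection under both tuning conditions.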
Analysis | To understand the domain adaptation issue, we compared the nonzero weights in the discriminative phrase table (PT) for Ar-En models tuned on bitext5k and MT05/6/8.
Experiments | Discriminative phrase table (PT): indicators for each rule in the phrase table.
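A discriminative phrase table of this kind amounts to one indicator feature per rule, fired each time the rule is used in a derivation. The sketch below assumes a toy derivation representation and an illustrative feature-naming scheme:

```python
# Minimal sketch: fire one indicator feature per phrase-table rule used
# in a derivation (a derivation is a list of (source, target) rules).
def rule_indicator_features(derivation):
    """Return sparse indicator-feature counts for the rules in a derivation."""
    feats = {}
    for src, tgt in derivation:
        name = "ptf:{}|{}".format(src, tgt)  # hypothetical naming scheme
        feats[name] = feats.get(name, 0.0) + 1.0
    return feats
```

During tuning, each such indicator gets its own weight, which is what makes the table "discriminative": individual rules can be rewarded or penalized directly.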
Experiments | Moses also contains the discriminative phrase table implementation of Hasler et al. (2012b), which is identical to our implementation using Phrasal.
Experiments | Moses and Phrasal accept the same phrase table and LM formats, so we kept those data structures in common. |
Related Work | A discriminative phrase table helped them improve slightly over a dense, online MIRA baseline, but their best results required initialization with MERT-tuned weights and retuning a single, shared weight for the discriminative phrase table with MERT. |
Abstract | To address this problem, we propose an efficient phrase table combination method.
Abstract | The learned phrase tables are hierarchically combined as if they were drawn from a hierarchical Pitman-Yor process.
Abstract | Furthermore, each phrase table is trained separately on each domain, and the computational overhead is significantly reduced by training them in parallel.
Introduction | This paper proposes a new phrase table combination method. |
Introduction | (2011) to perform phrase table extraction. |
Introduction | Second, the extracted phrase tables are combined as if they were drawn from a hierarchical Pitman-Yor process: the phrase tables, represented as tables in the Chinese restaurant process (CRP), are hierarchically chained by treating each previously learned phrase table as a prior for the current one.
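The chaining can be sketched as a standard Pitman-Yor predictive probability in which each domain's table backs off to the previous domain's distribution as its base measure. This is a minimal sketch under assumed toy counts; the discount `d`, strength `theta`, CRP table counts, and uniform base measure are all illustrative, not the paper's trained values:

```python
# Hierarchical Pitman-Yor sketch: each phrase table backs off to the
# previously learned table as its prior (all parameters are toy values).
def hpy_prob(pair, counts, table_counts, d, theta, prior):
    """Pitman-Yor predictive probability of a phrase pair under `prior`."""
    total = sum(counts.values())              # customers in this restaurant
    total_tables = sum(table_counts.values()) # occupied CRP tables
    c = counts.get(pair, 0)
    t = table_counts.get(pair, 0)
    return (max(c - d * t, 0.0)
            + (theta + d * total_tables) * prior(pair)) / (theta + total)

uniform = lambda pair: 1e-4  # assumed base measure over phrase pairs
# Domain 1 is trained first; domain 2 treats domain 1 as its prior.
domain1 = lambda pair: hpy_prob(pair, {("a", "x"): 3}, {("a", "x"): 1},
                                0.5, 1.0, uniform)
domain2 = lambda pair: hpy_prob(pair, {("b", "y"): 2}, {("b", "y"): 1},
                                0.5, 1.0, domain1)
```

A pair seen only in domain 1 still receives substantial probability under domain 2 through the backoff chain, which is the effect the hierarchical combination exploits.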
Phrase Pair Extraction with Unsupervised Phrasal ITGs | It can achieve comparable translation accuracy with a much smaller phrase table than the traditional GIZA++ and heuristic phrase extraction methods. |
Phrase Pair Extraction with Unsupervised Phrasal ITGs | Compared to GIZA++ with heuristic phrase extraction, the Bayesian phrasal ITG achieves competitive accuracy with a smaller phrase table.
Related Work | In previous work on translation modeling, mixture methods have been investigated for domain adaptation in SMT by adding domain information as additional labels to the original phrase table (Foster and Kuhn, 2007).
Related Work | However, their methods usually require a number of hyperparameters, such as mini-batch size, step size, or human judgment to determine the quality of phrases, and still rely on a heuristic phrase extraction method for each phrase table update.
Conclusion | Using phrase table merging that combined AR and EG’ training data in a way that preferred adapted dialectal data yielded an extra 0.86 BLEU points. |
Proposed Methods 3.1 Egyptian to EG’ Conversion | - Only added the phrase with its translations and their probabilities from the AR phrase table.
Proposed Methods 3.1 Egyptian to EG’ Conversion | - Only added the phrase with its translations and their probabilities from the EG’ phrase table.
Proposed Methods 3.1 Egyptian to EG’ Conversion | - Added translations of the phrase from both phrase tables and left the choice to the decoder. |
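The three merging rules above can be sketched as a small table-combination routine. This is a hedged sketch, assuming a toy representation where each table maps a source phrase to a dict of translations with probabilities; the paper's actual merging operates on full Moses-style phrase tables:

```python
# Sketch of phrase-table merging that prefers adapted dialectal (EG')
# entries over AR entries (illustrative data structures, not Moses format).
def merge_tables(ar_table, eg_table, combine_shared=True):
    """Merge AR and EG' tables; on shared phrases either combine both
    sets of translations (letting the decoder choose) or keep only EG'."""
    merged = {}
    for phrase in set(ar_table) | set(eg_table):
        in_ar, in_eg = phrase in ar_table, phrase in eg_table
        if in_ar and in_eg:
            if combine_shared:
                # Add translations from both tables; EG' wins on conflicts.
                merged[phrase] = {**ar_table[phrase], **eg_table[phrase]}
            else:
                merged[phrase] = eg_table[phrase]  # prefer adapted data
        elif in_eg:
            merged[phrase] = eg_table[phrase]
        else:
            merged[phrase] = ar_table[phrase]
    return merged
```

Phrases found in only one table are copied through unchanged, matching the first two rules; `combine_shared=True` implements the third rule, where the decoder chooses among translations from both tables.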
Abstract | We propose a Name-aware Machine Translation (MT) approach which tightly integrates name processing into the MT model, by jointly annotating parallel corpora, extracting name-aware translation grammar and rules, adding a name phrase table, and using name-translation-driven decoding.
Experiments | For better comparison with NAMT, besides the original baseline, we developed another baseline system by adding the name translation table to the phrase table (NPhrase).
Name-aware MT | Finally, the extracted 9,963 unique name translation pairs were also used to create an additional name phrase table for NAMT. |
Name-aware MT | Finally, based on LMs, our decoder exploits the dynamically created phrase table from name translation, competing with originally extracted rules, to find the best translation for the input sentence. |
Related Work | Preprocessing: identify names in the source texts and propose name translations to the MT system; the name translation results can be simply but aggressively transferred from the source to the target side using word alignment, or added into the phrase table in order to
Experiments | Following (Levenberg et al., 2012; Neubig et al., 2011), we evaluate our model by using its output word alignments to construct a phrase table.
Experiments | For our models, we report the average BLEU score of the 5 independent runs as well as that of the aggregate phrase table generated by these 5 independent runs. |
Experiments | Firstly, combining the phrase tables from independent runs results in increased BLEU scores, possibly due to the representation of uncertainty in the outputs, and the representation of different modes captured by the individual models. |
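One simple way to aggregate phrase tables from independent runs is to take the union of phrase pairs and average their probabilities, with pairs absent from a run contributing zero. This is a speculative sketch of that combination, not necessarily the exact aggregation the paper uses:

```python
# Sketch: aggregate phrase tables from k independent runs by union of
# phrase pairs and averaging of probabilities (toy representation).
def aggregate(tables):
    """Average phrase-pair probabilities across a list of phrase tables."""
    pairs = set().union(*tables)
    k = len(tables)
    return {p: sum(t.get(p, 0.0) for t in tables) / k for p in pairs}
```

Pairs found by only some runs survive with downweighted probability, which is one way the combined table can represent the uncertainty and different modes of the individual models.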
Related Work | Our paper fits into the recent line of work for jointly inducing the phrase table and word alignment (DeNero and Klein, 2010; Neubig et al., 2011). |
Experiments | For the out-of-domain data, we build the phrase table and reordering table using the 2.08 million Chinese-to-English sentence pairs, and we use the SRILM toolkit (Stolcke, 2002) to train the 5-gram English language model with the target part of the parallel sentences and the Xinhua portion of the English Gigaword. |
Experiments | For the in-domain electronic data, we first consider the lexicon as a phrase table in which we assign a constant 1.0 to each of the four probabilities, and then we combine this initial phrase table with the induced phrase pairs to form the new phrase table.
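The lexicon-as-phrase-table step above can be sketched in a few lines. This is an illustrative sketch only: the four features stand for the usual phrase and lexical translation probabilities in both directions, and the conflict-resolution choice (induced entries overriding lexicon entries) is an assumption, since the excerpt does not specify it:

```python
# Sketch: treat a bilingual lexicon as an initial phrase table with all
# four translation-model scores set to 1.0, then merge in induced pairs.
def lexicon_to_table(lexicon):
    """Map each lexicon entry to the four features
    (phi(f|e), lex(f|e), phi(e|f), lex(e|f)), all set to 1.0."""
    return {pair: (1.0, 1.0, 1.0, 1.0) for pair in lexicon}

def combine(lexicon_table, induced_table):
    """Merge the initial lexicon table with induced phrase pairs;
    induced entries take precedence here (an assumption)."""
    table = dict(lexicon_table)
    table.update(induced_table)
    return table
```

The constant 1.0 scores simply guarantee that every lexicon entry is available to the decoder; entries that also appear among the induced pairs pick up corpus-estimated probabilities instead.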
Experiments | (2008) regards the in-domain lexicon with corpus translation probability as another phrase table and further uses the in-domain language model in addition to the out-of-domain language model.
Data and Gold Standard | We first used the phrase table from |
Data and Gold Standard | We then looked at the different translations that each had in the phrase table, and a French speaker selected a subset that have multiple senses.
New Sense Indicators | The ground truth labels (target translation for a given source word) for this classifier are generated from the phrase table of the old domain data. |
Introduction | Furthermore, the introduction of non-terminals makes the grammar significantly larger than a phrase table and leads to higher memory requirements (Chiang, 2007).
Introduction | On the other hand, although target words can be generated left-to-right by altering the order of tree traversal in syntax-based models, it is still difficult to reach full rule coverage compared with a phrase table.
Introduction | The phrase table limit is set to 20 for all three systems.