Advancements in Reordering Models for Statistical Machine Translation
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann

Article Structure

Abstract

In this paper, we propose a novel reordering model based on sequence labeling techniques.

Introduction

The systematic word order difference between two languages poses a challenge for current statistical machine translation (SMT) systems.

Translation System Overview

In statistical machine translation, we are given a source language sentence $f_1^J = f_1 \ldots f_J$.

Tagging-style Reordering Model

In this section, we describe the proposed novel model.

Comparative Study

The second part of this paper is a comparative study of reordering models.

Experiments

In this section, we describe the baseline setup, the CRFs training results, the RNN training results

Conclusion

In this paper, a novel tagging style reordering model has been proposed.

Topics

CRFs

Appears in 19 sentences as: CRFs (19)
In Advancements in Reordering Models for Statistical Machine Translation
  1. For this supervised learning task, we choose two approaches: conditional random fields (CRFs) (Lafferty et al., 2001; Sutton and McCallum, 2006; Lavergne et al., 2010) and the recurrent neural network (RNN) (Elman, 1990; Jordan, 1990; Lang et al., 1990).
    Page 3, “Tagging-style Reordering Model”
  2. For the first method, we adopt the linear-chain CRFs .
    Page 3, “Tagging-style Reordering Model”
  3. However, even for the simple linear-chain CRFs, the complexity of learning and inference grows quadratically with the number of output labels and with the number of structural features, i.e. features defined on adjacent pairs of labels.
    Page 3, “Tagging-style Reordering Model”
  4. In this section, we describe the baseline setup, the CRFs training results, the RNN training results
    Page 6, “Experiments”
  5. The Wapiti toolkit (Lavergne et al., 2010) is used for CRFs; the RNN is built with the RNNLIB toolkit.
    Page 6, “Experiments”
  6. 5.2 CRFs Training Results
    Page 7, “Experiments”
  7. We also applied RNN to the task as an alternative approach to CRFs .
    Page 7, “Experiments”
  8. Table 3: feature templates for CRFs training
    Page 7, “Experiments”
  9. 5.4 Comparison of CRFs and RNN errors
    Page 7, “Experiments”
  10. CRFs perform better than the RNN (token error rate 25.75% vs. 27.31%).
    Page 7, “Experiments”
  11. will show later, the models trained with CRFs and RNN both help to improve the translation quality.
    Page 8, “Experiments”
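
The snippets above describe training a linear-chain CRF tagger with the Wapiti toolkit. As a rough, hedged illustration of the same kind of setup, the sketch below trains a linear-chain CRF with the sklearn_crfsuite package as a stand-in for Wapiti; the toy data, feature set and hyperparameters are invented and far simpler than the templates of Table 3.

    # Minimal linear-chain CRF sketch (sklearn_crfsuite as a stand-in for Wapiti).
    # Each training sentence is a list of source words paired with reordering tags;
    # the tags below are placeholders, not the paper's nine-label scheme.
    import sklearn_crfsuite

    def word_features(sent, i):
        # Tiny window features; the real templates (Table 3) are much richer.
        return {"w0": sent[i],
                "w-1": sent[i - 1] if i > 0 else "<s>",
                "w+1": sent[i + 1] if i < len(sent) - 1 else "</s>"}

    train = [(["wo", "xihuan", "pingguo"], ["KEEP", "KEEP", "KEEP"]),
             (["ta", "de", "shu"], ["KEEP", "MOVE", "KEEP"])]
    X = [[word_features(s, i) for i in range(len(s))] for s, _ in train]
    y = [tags for _, tags in train]

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
    crf.fit(X, y)
    print(crf.predict(X))  # predicted label sequences for the source sentences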


LM

Appears in 14 sentences as: LM (15)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Figure 4: bilingual LM illustration.
    Page 5, “Comparative Study”
  2. 4.3 Bilingual LM
    Page 5, “Comparative Study”
  3. We build a 9-gram LM using the SRILM toolkit (Stolcke, 2002) with modified Kneser-Ney smoothing.
    Page 5, “Comparative Study”
  4. To use the bilingual LM, the search state must be augmented to keep the bilingual unit
    Page 5, “Comparative Study”
  5. In search, the bilingual LM is applied similarly to the standard target side LM.
    Page 5, “Comparative Study”
  6. 4.4 Source decoding sequence LM
    Page 5, “Comparative Study”
  7. (Feng et al., 2010) present a simpler version of the above bilingual LM, where they use only the source side to model the decoding order.
    Page 5, “Comparative Study”
  8. We also build a 9-gram LM based on the source word decoding sequences.
    Page 5, “Comparative Study”
  9. The usage of the model is the same as for the bilingual LM.
    Page 5, “Comparative Study”
  10. Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model ( LM ), a phrase translation model and a word-based lexicon model.
    Page 6, “Experiments”
  11. lowercased training data from the GALE task (Table 1, UN corpus not included); alignment trained with GIZA++; tuning corpus: NIST06; test corpora: NIST02, 03, 04, 05 and 08; 5-gram LM (1,694,412,027 running words) trained by the SRILM toolkit (Stolcke, 2002) with modified Kneser-Ney smoothing; training data: target side of the bilingual data.
    Page 6, “Experiments”
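
The snippets above describe a 9-gram LM over bilingual units whose order follows the target side, plus a source-only variant that models the source word decoding sequence. The sketch below shows one plausible way to build such a decoding-sequence corpus line from a word-aligned sentence pair; the unit format and the toy data are assumptions, not the exact construction of Figure 4.

    # Build bilingual units ordered along the target side from one aligned sentence pair.
    # `alignment` is a set of (source_index, target_index) pairs, 0-based.
    from collections import defaultdict

    def bilingual_sequence(src, tgt, alignment):
        src_for_tgt = defaultdict(list)              # target position -> aligned source words
        for s, t in sorted(alignment):
            src_for_tgt[t].append(src[s])
        units = []
        for t, e in enumerate(tgt):                  # left-to-right over the target side
            f_part = "_".join(src_for_tgt[t]) if src_for_tgt[t] else "NULL"
            units.append(f_part + "|" + e)           # several source words merge into one unit
        return units

    src = ["wo", "hen", "xihuan", "pingguo"]
    tgt = ["i", "really", "like", "apples"]
    alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
    print(" ".join(bilingual_sequence(src, tgt, alignment)))
    # One such line per sentence pair is then fed to SRILM's ngram-count to train the
    # 9-gram LM; keeping only the source part of each unit gives the source decoding
    # sequence LM of (Feng et al., 2010).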


BLEU

Appears in 8 sentences as: BLEU (10)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average.
    Page 1, “Abstract”
  2. BLEU (Papineni et al., 2001) and TER (Snover et al., 2005) are reported; all scores are calculated on lowercased output.
    Page 6, “Experiments”
  3. An Index column is added for score reference convenience (B for BLEU; T for TER).
    Page 8, “Experiments”
  4. For the proposed model, significance testing results on both BLEU and TER are reported (B2 and B3 compared to B1, T2 and T3 compared to T1).
    Page 8, “Experiments”
  5. From Table 7 we see that the proposed reordering model using CRFs improves the baseline by 0.98 BLEU and 1.21 TER on average, while the proposed reordering model using RNN improves the baseline by 1.32 BLEU and 1.53 TER on average.
    Page 8, “Experiments”
  6. To investigate why the RNN has lower performance on the tagging task but achieves better BLEU, we build a 3-gram LM on the source side of the training corpus in Table 2, and the perplexity values are listed in Table 8.
    Page 8, “Experiments”
  7. The difference is tiny: on average only 0.08 BLEU (B3 and B10) and 0.15 TER (T3 and T10).
    Page 8, “Experiments”
  8. Experimental results show that our model is stable and improves the baseline system by 0.98 BLEU and 1.21 TER (trained by CRFs) and 1.32 BLEU and 1.53 TER (trained by RNN).
    Page 9, “Conclusion”
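
As item 2 notes, all scores are computed on lowercased output. For reference, a lowercased corpus-level BLEU can be obtained, e.g., with the sacrebleu package; this is a modern stand-in for the paper's evaluation scripts, and the file names are made up.

    # Lowercased corpus BLEU with sacrebleu (illustrative; not the paper's setup).
    import sacrebleu

    hyps = [line.strip() for line in open("nist02.hyp")]   # hypothetical file names
    refs = [line.strip() for line in open("nist02.ref")]
    bleu = sacrebleu.corpus_bleu(hyps, [refs], lowercase=True)
    print(round(bleu.score, 2))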


word alignment

Appears in 7 sentences as: word aligned (1) word alignment (7) words aligned (1)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The first step is word alignment training.
    Page 2, “Tagging-style Reordering Model”
  2. We also have the word alignment within the new phrase pair, which is stored during the phrase extraction process.
    Page 4, “Tagging-style Reordering Model”
  3. monotone for the current phrase, if a word alignment point to the bottom left (point A) exists and there is no word alignment point at the bottom right position (point B).
    Page 4, “Comparative Study”
  4. swap for the current phrase, if a word alignment point to the bottom right (point B) exists and there is no word alignment point at the bottom left position (point A).
    Page 4, “Comparative Study”
  5. when one source word is aligned to multiple target words, duplicate the source word for each target word, e.g.
    Page 5, “Comparative Study”
  6. when multiple source words are aligned to one target word, put the source words together for that target word, e.g.
    Page 5, “Comparative Study”
  7. The main reason is that the “annotated” corpus is converted from the word alignment, which contains many errors.
    Page 7, “Experiments”
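
Items 3 and 4 give the corner rule that assigns a monotone or swap orientation to a phrase from the alignment points at its bottom-left and bottom-right positions. A compact sketch of that check is given below for a phrase spanning source positions [s1, s2] and starting at target position t1 (0-based indices); the label 'other' is a stand-in for the remaining orientation class.

    # Orientation of a phrase from the word alignment, following the corner rule above.
    # `alignment` is a set of (source_index, target_index) points.
    def orientation(alignment, s1, s2, t1):
        bottom_left = (s1 - 1, t1 - 1) in alignment    # point A
        bottom_right = (s2 + 1, t1 - 1) in alignment   # point B
        if bottom_left and not bottom_right:
            return "monotone"
        if bottom_right and not bottom_left:
            return "swap"
        return "other"

    alignment = {(0, 0), (1, 1), (2, 2)}
    # Phrase covering source position 1 and starting at target position 1:
    print(orientation(alignment, 1, 1, 1))             # 'monotone'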


TER

Appears in 7 sentences as: TER (9)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average.
    Page 1, “Abstract”
  2. BLEU (Papineni et al., 2001) and TER (Snover et al., 2005) are reported; all scores are calculated on lowercased output.
    Page 6, “Experiments”
  3. An Index column is added for score reference convenience (B for BLEU; T for TER).
    Page 8, “Experiments”
  4. For the proposed model, significance testing results on both BLEU and TER are reported (B2 and B3 compared to B1, T2 and T3 compared to T1).
    Page 8, “Experiments”
  5. From Table 7 we see that the proposed reordering model using CRFs improves the baseline by 0.98 BLEU and 1.21 TER on average, while the proposed reordering model using RNN improves the baseline by 1.32 BLEU and 1.53 TER on average.
    Page 8, “Experiments”
  6. The difference is tiny: on average only 0.08 BLEU (B3 and B10) and 0.15 TER (T3 and T10).
    Page 8, “Experiments”
  7. Experimental results show that our model is stable and improves the baseline system by 0.98 BLEU and 1.21 TER (trained by CRFs) and 1.32 BLEU and 1.53 TER (trained by RNN).
    Page 9, “Conclusion”


sequence labeling

Appears in 7 sentences as: sequence labeling (7)
In Advancements in Reordering Models for Statistical Machine Translation
  1. In this paper, we propose a novel reordering model based on sequence labeling techniques.
    Page 1, “Abstract”
  2. Our model converts the reordering problem into a sequence labeling problem, i.e.
    Page 1, “Abstract”
  3. Our model converts the decoding order problem into a sequence labeling problem, i.e.
    Page 2, “Introduction”
  4. Now Figure 1(d) converts the reordering problem into a sequence labeling or tagging problem.
    Page 3, “Tagging-style Reordering Model”
  5. By our method, the reordering problem is converted into a sequence labeling problem so that the whole source sentence is taken into consideration for reordering decision.
    Page 9, “Conclusion”
  6. We choose CRFs and RNN to accomplish the sequence labeling task.
    Page 9, “Conclusion”
  7. The main contributions of the paper are: propose the tagging-style reordering model and improve the translation quality; compare two sequence labeling techniques CRFs and RNN; compare our method with seven other reordering models.
    Page 9, “Conclusion”
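
The snippets above state the central idea: the reordering decision is encoded as per-word tags on the source sentence, derived from the word alignment. The sketch below illustrates the flavor of that conversion with a deliberately simplified three-label scheme; the paper's actual transformation (Figure 1) uses nine labels, so the labels here are only illustrative.

    # Toy conversion of a word alignment into source-side reordering tags.
    # Each source word is tagged by how its target position relates to that of the
    # previous aligned source word (simplified; not the paper's nine-label scheme).
    def source_tags(alignment, src_len):
        tgt_pos = {}
        for s, t in sorted(alignment):
            tgt_pos.setdefault(s, t)            # first aligned target position per source word
        tags, prev = [], -1
        for s in range(src_len):
            if s not in tgt_pos:
                tags.append("UNALIGNED")        # the paper adds a dedicated unaligned-word tag
            elif tgt_pos[s] >= prev:
                tags.append("MONO")             # translated in source order
                prev = tgt_pos[s]
            else:
                tags.append("REORDER")          # translated before the previous source word
                prev = tgt_pos[s]
        return tags

    print(source_tags({(0, 1), (1, 0), (2, 2)}, 4))   # ['MONO', 'REORDER', 'MONO', 'UNALIGNED']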


sentence pairs

Appears in 7 sentences as: sentence pair (3) sentence pairs (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The transformation in Figure 1 is conducted for all the sentence pairs in the bilingual training corpus.
    Page 3, “Tagging-style Reordering Model”
  2. During the search, a sentence pair $(f_1^J, e_1^I)$ will be formally split into a segmentation $S_1^K$ which consists of K phrase pairs.
    Page 4, “Tagging-style Reordering Model”
  3. The interpretation is that given the sentence pair $(f_1^7, e_1^7)$ and its alignment, the correct translation order is $e_1$-$f_2$, $e_2$-$f_3$, $e_3$-$f_1$, $e_4$-$f_4$, $e_5$-$f_4$, $e_6$-$f_6$-$f_7$, $e_7$-$f_5$. Notice the bilingual units have been ordered according to the target side, as the decoder writes the translation in a left-to-right way.
    Page 5, “Comparative Study”
  4. After the operation in Figure 4 was done for all bilingual sentence pairs, we get a decoding sequence corpus.
    Page 5, “Comparative Study”
  5. Firstly, we delete the sentence pairs if the source sentence length is one.
    Page 6, “Experiments”
  6. Secondly, we delete the sentence pairs if the source sentence contains more than three contiguous unaligned words.
    Page 6, “Experiments”
  7. When this happens, the sentence pair is usually of low quality and hence not suitable for learning.
    Page 6, “Experiments”
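
Items 5 to 7 describe two filtering rules applied to the word-aligned corpus before training the tagging model. A direct sketch of those rules, assuming the alignment is available as a set of (source, target) index pairs:

    # Filtering rules for the "annotated" corpus, as described in the snippets above.
    def keep_sentence_pair(src, alignment, max_unaligned_run=3):
        if len(src) == 1:                              # rule 1: source sentence length is one
            return False
        aligned = {s for s, _ in alignment}
        run = 0
        for i in range(len(src)):                      # rule 2: more than three contiguous
            run = run + 1 if i not in aligned else 0   #         unaligned source words
            if run > max_unaligned_run:
                return False
        return True

    src = ["w1", "w2", "w3", "w4", "w5", "w6"]
    print(keep_sentence_pair(src, {(0, 0), (5, 2)}))   # False: four contiguous unaligned words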


phrase pair

Appears in 7 sentences as: phrase pair (5) phrase pairs (2)
In Advancements in Reordering Models for Statistical Machine Translation
  1. During the search, a sentence pair $(f_1^J, e_1^I)$ will be formally split into a segmentation $S_1^K$ which consists of K phrase pairs.
    Page 4, “Tagging-style Reordering Model”
  2. Suppose the search state is now extended with a new phrase pair $(\tilde{f}_k, \tilde{e}_k)$: $\tilde{f}_k := f_{b_k} \ldots f_{j_k}$.
    Page 4, “Tagging-style Reordering Model”
  3. We also have the word alignment within the new phrase pair, which is stored during the phrase extraction process.
    Page 4, “Tagging-style Reordering Model”
  4. We count how often each extracted phrase pair is found with each of the three reordering types.
    Page 4, “Comparative Study”
  5. The bilingual sequence of phrase pairs will be extracted using the same strategy in Figure 4.
    Page 5, “Comparative Study”
  6. Suppose the search state is now extended with a new phrase pair $(\tilde{f}, \tilde{e})$.
    Page 5, “Comparative Study”
  7. $F$ is the bilingual sequence for the new phrase pair $(\tilde{f}, \tilde{e})$ and $F_i$ is the $i$th unit within $F$. $\tilde{F}$ is the bilingual sequence history for the current state.
    Page 5, “Comparative Study”


error rate

Appears in 7 sentences as: Error Rate (1) error rate (9)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The model scaling factors $\lambda_1^M$ are trained with Minimum Error Rate Training (MERT).
    Page 2, “Translation System Overview”
  2. Several experiments have been done to find suitable hyperparameters p1 and p2; we choose the model with the lowest error rate on the validation corpus for the translation experiments.
    Page 7, “Experiments”
  3. The error rates of the chosen model on the test corpus (the test corpus in Table 2) are 25.75% for the token error rate and 69.39% for the sequence error rate.
    Page 7, “Experiments”
  4. The RNN has a token error rate of 27.31% and a sentence error rate of 77.00% over the test corpus in Table 2.
    Page 7, “Experiments”
  5. CRFs perform better than the RNN (token error rate 25.75% vs. 27.31%).
    Page 7, “Experiments”
  6. Both error rate values are much higher than what we usually see in a part-of-speech tagging task.
    Page 7, “Experiments”
  7. The CRFs achieve a lower error rate on the tagging task, but the RNN-trained model is better for the translation task.
    Page 9, “Conclusion”
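
The token and sequence error rates quoted above (e.g. 25.75% / 69.39% for CRFs) can be computed directly from predicted and reference tag sequences; a minimal sketch on toy data:

    # Token and sequence (sentence) error rates over a tagged test corpus.
    def error_rates(predicted, reference):
        tok_err = tok_total = seq_err = 0
        for pred, ref in zip(predicted, reference):
            tok_total += len(ref)
            tok_err += sum(p != r for p, r in zip(pred, ref))
            seq_err += int(pred != ref)
        return 100.0 * tok_err / tok_total, 100.0 * seq_err / len(reference)

    pred = [["M", "M", "R"], ["M", "R"]]
    gold = [["M", "M", "M"], ["M", "R"]]
    print(error_rates(pred, gold))   # (20.0, 50.0): 1 of 5 tokens and 1 of 2 sequences wrong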


machine translation

Appears in 5 sentences as: Machine Translation (1) machine translation (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The systematic word order difference between two languages poses a challenge for current statistical machine translation (SMT) systems.
    Page 1, “Introduction”
  2. for Statistical Machine Translation
    Page 1, “Introduction”
  3. The remainder of this paper is organized as follows: Section 2 introduces the foundation of this research, the principle of statistical machine translation.
    Page 2, “Introduction”
  4. In statistical machine translation, we are given a source language sentence $f_1^J = f_1 \ldots f_J$.
    Page 2, “Translation System Overview”
  5. In this paper, the phrase-based machine translation system
    Page 2, “Translation System Overview”


baseline system

Appears in 5 sentences as: baseline system (5)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average.
    Page 1, “Abstract”
  2. The reordering model for the baseline system is the distance-based jump model which uses linear distance.
    Page 6, “Experiments”
  3. The results show that our proposed idea improves the baseline system and that the RNN-trained model performs better than the CRFs-trained model, in terms of both the automatic measures and the significance test.
    Page 8, “Experiments”
  4. Experimental results show that our model is stable and improves the baseline system by 0.98 BLEU and 1.21 TER (trained by CRFs) and 1.32 BLEU and 1.53 TER (trained by RNN).
    Page 9, “Conclusion”
  5. We also show that the proposed model is able to improve a very strong baseline system.
    Page 9, “Conclusion”


translation model

Appears in 5 sentences as: translation model (5)
In Advancements in Reordering Models for Statistical Machine Translation
  1. (Mariño et al., 2006) present a translation model that constitutes a language model of a sort of bilanguage composed of bilingual units.
    Page 1, “Introduction”
  2. (Mariño et al., 2006) implement a translation model using n-grams.
    Page 5, “Comparative Study”
  3. Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model (LM), a phrase translation model and a word-based lexicon model.
    Page 6, “Experiments”
  4. Table 1: translation model and LM training data statistics
    Page 6, “Experiments”
  5. Table 1 contains the data statistics used for translation model and LM.
    Page 6, “Experiments”


proposed model

Appears in 5 sentences as: proposed model (5)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Section 3 describes the proposed model.
    Page 2, “Introduction”
  2. For the proposed model, significance testing results on both BLEU and TER are reported (B2 and B3 compared to B1, T2 and T3 compared to T1).
    Page 8, “Experiments”
  3. Our proposed model ranks in second position.
    Page 8, “Experiments”
  4. By adding an unaligned word tag, the unaligned word phenomenon is automatically implanted in the proposed model.
    Page 9, “Conclusion”
  5. We also show that the proposed model is able to improve a very strong baseline system.
    Page 9, “Conclusion”


model training

Appears in 5 sentences as: model trained (1) model training (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Once the model training is finished, we run inference on the development and test corpora, which means that we get the labels of the source sentences that need to be translated.
    Page 4, “Tagging-style Reordering Model”
  2. The source-side data statistics for the reordering model training are given in Table 2 (the target side has only nine labels).
    Page 7, “Experiments”
  3. Table 2: tagging-style model training data statistics
    Page 7, “Experiments”
  4. will show later, the models trained with CRFs and RNN both help to improve the translation quality.
    Page 8, “Experiments”
  5. In other words, there is a mismatch between the data used for reordering model training and the actual MT data.
    Page 8, “Experiments”


phrase-based

Appears in 4 sentences as: phrase-based (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Within the phrase-based SMT framework there are mainly three stages where improved reordering could be integrated: in the preprocessing, the source sentence is reordered by heuristics, so that the word order of the source and target sentences is similar.
    Page 1, “Introduction”
  2. In this way, syntax information can be incorporated into phrase-based SMT systems.
    Page 1, “Introduction”
  3. In this paper, the phrase-based machine translation system
    Page 2, “Translation System Overview”
  4. Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model (LM), a phrase translation model and a word-based lexicon model.
    Page 6, “Experiments”


lexicalized

Appears in 4 sentences as: lexicalized (5)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The classifier can be trained with maximum likelihood, as in the Moses lexicalized reordering model (Koehn et al., 2007) and the hierarchical lexicalized reordering model (Galley and Manning, 2008), or be trained under a maximum entropy framework (Zens and Ney, 2006).
    Page 1, “Introduction”
  2. 4.1 Moses lexicalized reordering model
    Page 4, “Comparative Study”
  3. Figure 2: lexicalized reordering model illustration.
    Page 4, “Comparative Study”
  4. Our implementation is the same as the default behavior of the Moses lexicalized reordering model.
    Page 4, “Comparative Study”
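
The Moses-style lexicalized reordering model illustrated above is estimated by counting, for each extracted phrase pair, how often it occurs with each orientation type and turning the counts into smoothed relative frequencies. A simplified sketch of that estimation; the orientation inventory and smoothing value are assumptions:

    # Relative-frequency estimation of orientation probabilities per phrase pair,
    # in the spirit of the Moses lexicalized reordering model.
    from collections import defaultdict

    ORIENTATIONS = ("monotone", "swap", "discontinuous")

    def estimate(events, smooth=0.5):
        counts = defaultdict(lambda: dict.fromkeys(ORIENTATIONS, 0.0))
        for phrase_pair, orientation in events:        # one event per extracted phrase pair
            counts[phrase_pair][orientation] += 1.0
        probs = {}
        for pp, c in counts.items():
            total = sum(c.values()) + smooth * len(ORIENTATIONS)
            probs[pp] = {o: (c[o] + smooth) / total for o in ORIENTATIONS}
        return probs

    events = [(("ta", "he"), "monotone"), (("ta", "he"), "swap"), (("ta", "he"), "monotone")]
    print(estimate(events)[("ta", "he")])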


soft constraints

Appears in 4 sentences as: soft constraints (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. (Feng et al., 2012) present a method that utilizes predicate-argument structures from semantic role labeling results as soft constraints.
    Page 1, “Introduction”
  2. Similar to the previous model, the SRL information is used as soft constraints.
    Page 6, “Comparative Study”
  3. In Section 3.6 of (Zhang, 2013), instead of making hard reordering decisions, the author uses the rules as soft constraints in the decoder.
    Page 6, “Comparative Study”
  4. The model is utilized as soft constraints in the decoder.
    Page 9, “Conclusion”


statistical machine translation

Appears in 4 sentences as: Statistical Machine Translation (1) statistical machine translation (3)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The systematic word order difference between two languages poses a challenge for current statistical machine translation (SMT) systems.
    Page 1, “Introduction”
  2. for Statistical Machine Translation
    Page 1, “Introduction”
  3. The remainder of this paper is organized as follows: Section 2 introduces the foundation of this research, the principle of statistical machine translation.
    Page 2, “Introduction”
  4. In statistical machine translation, we are given a source language sentence $f_1^J = f_1 \ldots f_J$.
    Page 2, “Translation System Overview”


log-linear

Appears in 4 sentences as: log-linear (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. We model $Pr(e_1^I | f_1^J)$ directly using a log-linear combination of several models (Och and Ney, 2002).
    Page 2, “Translation System Overview”
  2. The number of source words that have inconsistent labels is the penalty and is then added into the log-linear framework as a new feature.
    Page 4, “Tagging-style Reordering Model”
  3. The orientation probability is modeled in a log-linear framework using a set of N feature functions $h_n(f_1^J, e_1^I, i, j)$, $n = 1, \ldots, N$.
    Page 4, “Comparative Study”
  4. Finally, in the log-linear framework (Equation 2) a new jump model is added which uses the reordered source sentence to calculate the cost.
    Page 6, “Comparative Study”
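
The log-linear combination referred to above means the decoder ranks translation hypotheses by a weighted sum of model scores, $\sum_m \lambda_m h_m(e, f)$, with the scaling factors $\lambda_m$ tuned by MERT; each new reordering model is simply added as one more feature. A toy illustration of that scoring (weights and scores are invented):

    # Toy log-linear scoring: rank hypotheses by the weighted sum of model scores.
    def score(features, lambdas):
        return sum(lambdas[name] * value for name, value in features.items())

    lambdas = {"lm": 0.5, "phrase_tm": 0.3, "reorder_penalty": 0.2}      # illustrative weights
    hypotheses = {
        "he likes apples":  {"lm": -4.1, "phrase_tm": -2.0, "reorder_penalty": -1.0},
        "apples likes he":  {"lm": -7.3, "phrase_tm": -2.0, "reorder_penalty": -4.0},
    }
    best = max(hypotheses, key=lambda e: score(hypotheses[e], lambdas))
    print(best)   # 'he likes apples'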


reranking

Appears in 3 sentences as: reranking (4)
In Advancements in Reordering Models for Statistical Machine Translation
  1. In the reranking framework: in principle, all
    Page 1, “Introduction”
  2. the models in the previous category can be used in the reranking framework, because in reranking we have all the information (source and target words/phrases, alignment) about the translation process.
    Page 2, “Introduction”
  3. One disadvantage of carrying out reordering in reranking is that the representativeness of the N-best list is often questionable.
    Page 2, “Introduction”


semantic role

Appears in 3 sentences as: semantic role (2) semantic roles (1)
In Advancements in Reordering Models for Statistical Machine Translation
  1. (Feng et al., 2012) present a method that utilizes predicate-argument structures from semantic role labeling results as soft constraints.
    Page 1, “Introduction”
  2. (Feng et al., 2012) propose two structure features from semantic role labeling (SRL) results.
    Page 6, “Comparative Study”
  3. During the decoding process, the first feature reports how many event layers one search state violates, and the second feature reports how many semantic roles it violates.
    Page 6, “Comparative Study”


Feature Templates

Appears in 3 sentences as: feature template (1) Feature Templates (1) feature templates (1)
In Advancements in Reordering Models for Statistical Machine Translation
  1. Table 3 shows the feature templates we set initially, which generate 722,999,637 features.
    Page 7, “Experiments”
  2. Feature Templates
    Page 7, “Experiments”
  3. Table 3: feature templates for CRFs training
    Page 7, “Experiments”
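
Feature templates of this kind expand, at every source position, into concrete features over a window of neighboring words, which is how the initial template set can grow to hundreds of millions of features. The sketch below shows such an expansion; the window size and template names are invented, not the actual Table 3.

    # Expand simple window-based feature templates at one source position,
    # in the style of CRF toolkits such as Wapiti (templates here are invented).
    def expand_templates(sent, i, window=2):
        def word(j):
            return sent[j] if 0 <= j < len(sent) else "<pad>"
        feats = []
        for off in range(-window, window + 1):                    # unigram word templates
            feats.append("U:w[%+d]=%s" % (off, word(i + off)))
        for off in range(-window, window):                        # bigram word templates
            feats.append("B:w[%+d..%+d]=%s_%s" % (off, off + 1, word(i + off), word(i + off + 1)))
        return feats

    print(expand_templates(["wo", "xihuan", "pingguo"], 1))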


dependency tree

Appears in 3 sentences as: dependency tree (3) dependency trees (1)
In Advancements in Reordering Models for Statistical Machine Translation
  1. (Cherry, 2008) uses information from dependency trees to make the decoding process preserve syntactic cohesion.
    Page 1, “Introduction”
  2. This structure is represented by a source sentence dependency tree.
    Page 5, “Comparative Study”
  3. The algorithm is as follows: given the source sentence and its dependency tree, during the translation process, once a hypothesis is extended, check if the source dependency tree contains a subtree T such that:
    Page 5, “Comparative Study”
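
Item 3 sketches the cohesion check of (Cherry, 2008): whenever a hypothesis is extended, the decoder verifies that the source dependency tree contains no subtree that is being torn apart by the translation order. The sketch below tests a simplified version of that condition (whether a subtree's source positions are only partially covered while the decoder has already moved outside them); the exact condition on the subtree T follows the paper, so this is an assumption.

    # Rough syntactic-cohesion check on a source dependency tree (simplified condition).
    # `children` maps a head position to its dependents; `covered` is the set of
    # source positions already translated by the current hypothesis.
    def subtree_span(children, head):
        span = {head}
        for c in children.get(head, []):
            span |= subtree_span(children, c)
        return span

    def violates_cohesion(children, roots, covered):
        for head in roots:
            span = subtree_span(children, head)
            inside = span & covered
            if inside and inside != span and (covered - span):
                return True                      # subtree split: partly covered, decoder moved on
        return False

    children = {1: [0, 2]}                       # head word 1 dominates words 0 and 2
    print(violates_cohesion(children, [1], covered={0, 3}))   # True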


maximum entropy

Appears in 3 sentences as: Maximum entropy (1) maximum entropy (2)
In Advancements in Reordering Models for Statistical Machine Translation
  1. The classifier can be trained with maximum likelihood, as in the Moses lexicalized reordering model (Koehn et al., 2007) and the hierarchical lexicalized reordering model (Galley and Manning, 2008), or be trained under a maximum entropy framework (Zens and Ney, 2006).
    Page 1, “Introduction”
  2. 4.2 Maximum entropy reordering model
    Page 4, “Comparative Study”
  3. (Zens and Ney, 2006) proposed a maximum entropy classifier to predict the orientation of the next phrase given the current phrase.
    Page 4, “Comparative Study”
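
(Zens and Ney, 2006) predict the orientation of the next phrase with a maximum entropy classifier over feature functions of the source and target context. As a rough stand-in, the sketch below uses a multinomial logistic regression (which is a maximum entropy classifier); the features and training events are invented.

    # Maximum entropy orientation classifier sketch (logistic regression stand-in).
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented events: context features at a phrase boundary -> orientation of the next phrase.
    X = [{"last_src": "de", "next_src": "wenti"},
         {"last_src": "zai", "next_src": "beijing"}]
    y = ["swap", "monotone"]

    maxent = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    maxent.fit(X, y)
    print(maxent.predict([{"last_src": "de", "next_src": "wenti"}]))   # orientation prediction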


translation task

Appears in 3 sentences as: translation task (2) translation tasks (1)
In Advancements in Reordering Models for Statistical Machine Translation
  1. (Wang et al., 2007) present a pre-reordering method for the Chinese-English translation task.
    Page 6, “Comparative Study”
  2. The CRFs achieve a lower error rate on the tagging task, but the RNN-trained model is better for the translation task.
    Page 9, “Conclusion”
  3. However, the tree-based jump model relies on manually designed reordering rules, which do not exist for many language pairs, while our model can be easily adapted to other translation tasks.
    Page 9, “Conclusion”
