Index of papers in Proc. ACL 2011 that mention
  • phrase-based
Schwartz, Lane and Callison-Burch, Chris and Schuler, William and Wu, Stephen
Abstract
This paper describes a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing.
Abstract
This requirement makes it difficult to incorporate them into phrase-based translation, which generates partial hypothesized translations from left-to-right.
Abstract
Incremental syntactic language models score sentences in a similar left-to-right fashion, and are therefore a good mechanism for incorporating syntax into phrase-based translation.
Introduction
Modern phrase-based translation using large scale n-gram language models generally performs well in terms of lexical choice, but still often produces ungrammatical output.
Introduction
Bottom-up and top-down parsers typically require a completed string as input; this requirement makes it difficult to incorporate these parsers into phrase-based translation, which generates hypothesized translations incrementally, from left-to-right. As a workaround, parsers can rerank the translated output of translation systems (Och et al., 2004).
Introduction
We observe that incremental parsers, used as structured language models, provide an appropriate algorithmic match to incremental phrase-based decoding.
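To make the claimed algorithmic match concrete, here is a minimal sketch, in Python, of how a language model that scores a prefix one word at a time could be consulted while a decoder extends hypotheses left-to-right; the class and function names are hypothetical and not taken from the paper.

    # Hypothetical sketch: querying an incremental (left-to-right) language model
    # while a phrase-based decoder appends target phrases to a partial hypothesis.
    # Names and scores are illustrative only.

    class IncrementalLM:
        """Scores a sentence prefix one word at a time, carrying state forward."""

        def start_state(self):
            # State before any target words have been generated.
            return ()

        def score_word(self, state, word):
            # Return (new_state, log_prob) for appending `word` to the prefix.
            # A real incremental syntactic LM would update its parser state here;
            # this stub only records the word history and returns a dummy score.
            return state + (word,), -1.0

    def extend_hypothesis(lm, state, score, target_phrase):
        """One decoder step: append a target phrase, accumulating LM scores word by word."""
        for word in target_phrase:
            state, log_prob = lm.score_word(state, word)
            score += log_prob
        return state, score

    lm = IncrementalLM()
    state, score = lm.start_state(), 0.0
    for phrase in [["the", "president"], ["met"], ["the", "delegation"]]:
        state, score = extend_hypothesis(lm, state, score, phrase)
    print(score)  # accumulated left-to-right LM score for the partial translation

Because the language-model state travels with each partial hypothesis, no completed sentence is ever required, which is the property the excerpt above highlights.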
Related Work
Neither phrase-based (Koehn et al., 2003) nor hierarchical phrase-based translation (Chiang, 2005) takes explicit advantage of the syntactic structure of either source or target language.
Related Work
Early work in statistical phrase-based translation considered whether restricting translation models to use only syntactically well-formed constituents might improve translation quality (Koehn et al., 2003), but found that such restrictions failed to help.
phrase-based is mentioned in 32 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hao and Fang, Licheng and Xu, Peng and Wu, Xiaoyun
Abstract
Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in the NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system).
Experiments
We compare three systems: a phrase-based system (Och and Ney, 2004), a hierarchical phrase-based system (Chiang, 2005), and our forest-to-string system with different binarization schemes.
Experiments
In the phrase-based decoder, jump width is set to 8.
Experiments
Besides standard features (Och and Ney, 2004), the phrase-based decoder also uses a Maximum Entropy phrasal reordering model (Zens and Ney, 2006).
phrase-based is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Subotin, Michael
Hierarchical phrase-based translation
We take as our starting point David Chiang’s Hiero system, which generalizes phrase-based translation to substrings with gaps (Chiang, 2007).
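As a purely illustrative sketch of what "substrings with gaps" means in practice, the snippet below represents synchronous rules whose nonterminal slots (X1, X2, ...) are coindexed between the source and target sides; the rule format, example rules, and weights are invented for this example and are not Hiero's actual grammar format.

    # Hypothetical representation of Hiero-style rules: phrases that may contain
    # coindexed gaps.  Rules, words, and weights are made up for illustration.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class SyncRule:
        source: List[str]   # terminals mixed with slot markers such as "X1"
        target: List[str]   # the same slots, possibly reordered, on the target side
        weight: float

    rules = [
        SyncRule(["ne", "X1", "pas"], ["not", "X1"], 0.7),  # phrase with a gap
        SyncRule(["maison"], ["house"], 0.9),               # ordinary gap-free phrase pair
    ]

    for r in rules:
        print(" ".join(r.source), "->", " ".join(r.target), r.weight)

An ordinary phrase pair is simply the gap-free special case, which is the sense in which the model generalizes phrase-based translation.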
Hierarchical phrase-based translation
As shown by Chiang (2007), a weighted grammar of this form can be collected and scored by simple extensions of standard methods for phrase-based translation and efficiently combined with a language model in a CKY decoder to achieve large improvements over a state-of-the-art phrase-based system.
Hierarchical phrase-based translation
Although a variety of scores interpolated into the decision rule for phrase-based systems have been investigated over the years, only a handful have been discovered to be consistently useful.
Introduction
Translation into languages with rich morphology presents special challenges for phrase-based methods.
Introduction
Thus, Birch et al (2008) find that translation quality achieved by a popular phrase-based system correlates significantly with a measure of target-side, but not source-side morphological complexity.
Introduction
Recently, several studies (Bojar, 2007; Avramidis and Koehn, 2009; Ramanathan et al., 2009; Yeniterzi and Oflazer, 2010) proposed modeling target-side morphology in a phrase-based factored models framework (Koehn and Hoang, 2007).
Modeling unobserved target inflections
As a consequence of translating into a morphologically rich language, some inflected forms of target words are unobserved in training data and cannot be generated by the decoder under standard phrase-based approaches.
phrase-based is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Clifton, Ann and Sarkar, Anoop
Abstract
This paper extends the training and tuning regime for phrase-based statistical machine translation to obtain fluent translations into morphologically complex languages (we build an English to Finnish translation system).
Conclusion and Future Work
We also demonstrate that for Finnish (and possibly other agglutinative languages), phrase-based MT benefits from allowing the translation model access to morphological segmentation, yielding productive morphological phrases.
Conclusion and Future Work
In order to help with replication of the results in this paper, we have run the various morphological analysis steps and created the necessary training, tuning and test data files needed in order to train, tune and test any phrase-based machine translation system with our data.
Experimental Results
In all the experiments conducted in this paper, we used the Moses phrase-based translation system (Koehn et al., 2007), 2008 version.
Models 2.1 Baseline Models
We then trained the Moses phrase-based system (Koehn et al., 2007) on the segmented and marked text.
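As a rough, invented illustration of what training on "segmented and marked" text involves (the segmentations and the "+" boundary marker below are made up for this example, not the paper's actual scheme), target-side words are split into morphs and the morph boundaries are flagged before phrase extraction:

    # Hypothetical illustration: splitting target words into morphs and marking
    # morph boundaries before handing the text to a phrase-based trainer.
    def mark_segmentation(morphs):
        """Join a word's morphs, flagging every non-final morph with '+'."""
        return " ".join(m + "+" if i < len(morphs) - 1 else m
                        for i, m in enumerate(morphs))

    # e.g. a Finnish-like word split into stem + plural + case morphs
    print(mark_segmentation(["talo", "i", "ssa"]))   # -> talo+ i+ ssa

Segmenting in this way lets the phrase table pair English words with productive sub-word units rather than only with whole inflected forms.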
Translation and Morphology
In this work, we propose to address the problem of morphological complexity in an English-to-Finnish MT task within a phrase-based translation framework.
phrase-based is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Neubig, Graham and Watanabe, Taro and Sumita, Eiichiro and Mori, Shinsuke and Kawahara, Tatsuya
A Probabilistic Model for Phrase Table Extraction
If θ takes the form of a scored phrase table, we can use traditional methods for phrase-based SMT to find P(e|f, θ) and concentrate on creating a model for P(θ | ⟨E, F⟩). We decompose this posterior probability using Bayes' law into the corpus likelihood and parameter prior probabilities.
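Spelled out, the decomposition the sentence refers to is the standard Bayes factorization of the phrase-table posterior, up to a normalizing constant (notation follows the sentence above):

    P(\theta \mid \langle E, F \rangle) \propto P(\langle E, F \rangle \mid \theta)\, P(\theta)

where the first factor is the corpus likelihood and the second the parameter prior.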
Abstract
This allows for a completely probabilistic model that is able to create a phrase table that achieves competitive accuracy on phrase-based machine translation tasks directly from unaligned sentence pairs.
Hierarchical ITG Model
As we confirm in the experiments in Section 7, using only minimal phrases leads to inferior translation results for phrase-based SMT.
Introduction
The training of translation models for phrase-based statistical machine translation (SMT) systems (Koehn et al., 2003) takes unaligned bilingual training data as input, and outputs a scored table of phrase pairs.
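As a rough illustration of the input and output shapes this sentence describes (the sentence pairs, phrases, and scores below are invented for the example, not taken from the paper), training consumes unaligned sentence pairs and emits a table mapping phrase pairs to feature scores:

    # Hypothetical sketch of the training pipeline's input and output shapes.
    # All sentence pairs, phrase pairs, and scores are made up for illustration.

    # Input: unaligned sentence pairs; an unsupervised trainer would estimate
    # the scored table below from pairs like these.
    unaligned_bitext = [
        ("kore wa pen desu", "this is a pen"),
        ("kore wa hon desu", "this is a book"),
    ]

    # Output: a scored phrase table mapping (source phrase, target phrase)
    # to feature scores.
    phrase_table = {
        ("kore wa", "this is"): {"p(e|f)": 0.9, "p(f|e)": 0.8},
        ("pen", "pen"):         {"p(e|f)": 0.7, "p(f|e)": 0.7},
    }

    def lookup(table, src, tgt):
        """Return the feature scores a phrase-based decoder would consult."""
        return table.get((src, tgt))

    print(lookup(phrase_table, "kore wa", "this is"))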
phrase-based is mentioned in 4 sentences in this paper.
Topics mentioned in this paper: