Abstractive Caption Generation | Phrase-based Model: The model outlined in equation (8) will generate captions with function words.
Abstractive Caption Generation | Search: To generate a caption it is necessary to find the sequence of words that maximizes P(w1, w2, ..., wn) for the word-based model (equation (8)) and P(p1, p2, ..., pm) for the phrase-based model (equation (15)).
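The search problem described above can be sketched with a simple beam search under a bigram factorization. This is a toy illustration of maximizing the sequence probability, not the paper's actual decoder; the function name, vocabulary handling, and `<s>` start symbol are assumptions.

```python
def best_sequence(vocab, bigram_prob, length, beam_size=5):
    # Beam search for the word sequence maximizing the product of
    # bigram probabilities; a toy stand-in for maximizing P(w1,...,wn).
    beam = sorted(((bigram_prob.get(("<s>", w), 0.0), [w]) for w in vocab),
                  reverse=True)[:beam_size]
    for _ in range(length - 1):
        expanded = [(prob * bigram_prob.get((seq[-1], w), 0.0), seq + [w])
                    for prob, seq in beam for w in vocab]
        beam = sorted(expanded, reverse=True)[:beam_size]
    return beam[0][1]  # highest-scoring sequence of the requested length
```

A wider beam trades speed for a lower risk of pruning the true maximizer, which is why beam size is typically tuned on development data.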
Experimental Setup | Documents and captions were parsed with the Stanford parser (Klein and Manning, 2003) in order to obtain dependencies for the phrase-based abstractive model. |
Experimental Setup | We tuned the caption length parameter on the development set using a range of [5, 14] tokens for the word-based model and [2, 5] phrases for the phrase-based model. |
Experimental Setup | For the phrase-based model, we also experimented with reducing the search scope, either by considering only the n most similar sentences to the keywords (range [2,10]), or simply the single most similar sentence and its neighbors (range [2, 5]). |
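Scope reduction of the kind described above can be sketched with a simple word-overlap similarity; the paper's actual similarity function is not specified here, so this ranking criterion is an assumption.

```python
def top_n_sentences(sentences, keywords, n):
    # Rank document sentences by word overlap with the keywords and
    # keep only the n most similar, pruning the search space before
    # generation. Overlap is a stand-in for the real similarity measure.
    kw = set(w.lower() for w in keywords)
    scored = sorted(sentences,
                    key=lambda s: len(kw & set(s.lower().split())),
                    reverse=True)
    return scored[:n]
```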
Results | different from phrase-based abstractive system. |
Results | Table 4: Captions written by humans (G) and generated by extractive (KL), word-based abstractive (AW), and phrase-based abstractive (AP) systems.
Results | It is significantly worse than the phrase-based abstractive system (α < 0.01), the extractive system (α < 0.01), and the gold standard (α < 0.01).
Abstract | We make use of collocation probabilities, estimated from monolingual corpora, in two ways: to improve word alignment for various kinds of SMT systems, and to improve the phrase table for phrase-based SMT.
Abstract | Compared to the baseline systems, we achieve absolute improvements of 2.40 BLEU points on a phrase-based SMT system and 1.76 BLEU points on a parsing-based SMT system.
Experiments on Phrase-Based SMT | Moses (Koehn et al., 2007) is used as the baseline phrase-based SMT system. |
Experiments on Phrase-Based SMT | 6.2 Effect of improved word alignment on phrase-based SMT |
Improving Phrase Table | A phrase-based SMT system automatically extracts bilingual phrase pairs from the word-aligned bilingual corpus.
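The extraction step mentioned above can be sketched as the standard consistency check over alignment links. This is a minimal version of Och/Koehn-style phrase extraction, without the usual handling of unaligned boundary words.

```python
def extract_phrase_pairs(src_len, alignment, max_len=3):
    # Enumerate source spans and keep the span pairs consistent with
    # the word alignment: no link may connect a word inside the pair
    # to a word outside it. alignment is a list of (src_idx, tgt_idx).
    pairs = set()
    for i1 in range(src_len):
        for i2 in range(i1, min(src_len, i1 + max_len)):
            tgt = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt:
                continue
            j1, j2 = min(tgt), max(tgt)
            # consistency: every link touching [j1, j2] must originate
            # inside the source span [i1, i2]
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.add(((i1, i2), (j1, j2)))
    return pairs  # set of ((src_start, src_end), (tgt_start, tgt_end))
```

For example, with the alignment {0-0, 1-2, 2-1}, the span (0, 1) is rejected because target word 1 links back to source word 2, outside the span.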
Improving Phrase Table | These collocation probabilities are incorporated into the phrase-based SMT system as features. |
Introduction | In phrase-based SMT (Koehn et al., 2003), the phrase boundary is usually determined based on the bidirectional word alignments. |
Introduction | Then the collocation information is employed to improve Bilingual Word Alignment (BWA) for various kinds of SMT systems and to improve the phrase table for phrase-based SMT.
Introduction | Then the phrase collocation probabilities are used as additional features in phrase-based SMT systems. |
A Phrase-Based Error Model | The goal of the phrase-based error model is to transform a correctly spelled query C into a misspelled query Q. |
A Phrase-Based Error Model | If we assume a uniform probability over segmentations, then the phrase-based probability can be defined as: |
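Under the uniform segmentation assumption, the transformation probability can be computed by searching over monotone phrase segmentations. The sketch below uses the common max approximation over segmentations; the phrase table, function names, and use of `lru_cache` memoization are illustrative assumptions, not the paper's implementation.

```python
from functools import lru_cache

def phrase_error_prob(c_words, q_words, phrase_prob):
    # Probability of transforming correct query C into misspelled Q,
    # maximizing over monotone phrase segmentations. phrase_prob maps
    # (correct_phrase_tuple, misspelled_phrase_tuple) -> probability.
    @lru_cache(maxsize=None)
    def best(i, j):
        # best probability of covering C[i:] and Q[j:]
        if i == len(c_words) and j == len(q_words):
            return 1.0
        score = 0.0
        for i2 in range(i + 1, len(c_words) + 1):
            for j2 in range(j + 1, len(q_words) + 1):
                p = phrase_prob.get(
                    (tuple(c_words[i:i2]), tuple(q_words[j:j2])), 0.0)
                if p > 0.0:
                    score = max(score, p * best(i2, j2))
        return score
    return best(0, 0)
```

Because multi-term phrases can score higher than the product of their single-term parts, the model rewards corrections that respect inter-term context.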
Abstract | Then, a phrase-based error model that accounts for the transformation probability between multi-term phrases is trained and integrated into a query speller system. |
Abstract | Results show that the system using the phrase-based error model significantly outperforms its baseline systems.
Introduction | Among these models, the most effective one is a phrase-based error model that captures the probability of transforming one multi-term phrase into another multi-term phrase. |
Introduction | Compared to traditional error models that account for transformation probabilities between single characters (Kernighan et al., 1990) or sub-word strings (Brill and Moore, 2000), the phrase-based model is more powerful in that it captures some contextual information by retaining inter-term dependencies.
Introduction | In particular, the speller system incorporating a phrase-based error model significantly outperforms its baseline systems. |
Related Work | To this end, inspired by the phrase-based statistical machine translation (SMT) systems (Koehn et al., 2003; Och and Ney, 2004), we propose a phrase-based error model where we assume that query spelling correction is performed at the phrase level. |
Related Work | In what follows, before presenting the phrase-based error model, we will first describe the clickthrough data and the query speller system we used in this study. |
The Baseline Speller System | Figure 2: Example demonstrating the generative procedure behind the phrase-based error model. |
Abstract | We evaluate our method on Chinese-to-English Machine Translation (MT) tasks in three baseline systems, including a phrase-based system, a hierarchical phrase-based system and a syntax-based system. |
Background | The first SMT system is a phrase-based system with two reordering models including the maximum entropy-based lexicalized reordering model proposed by Xiong et al. |
Background | The second SMT system is an in-house reimplementation of the Hiero system which is based on the hierarchical phrase-based model proposed by Chiang (2005).
Background | After 5, 7 and 8 iterations, relatively stable improvements are achieved by the phrase-based system, the Hiero system and the syntax-based system, respectively.
Introduction | Many SMT frameworks have been developed, including phrase-based SMT (Koehn et al., 2003), hierarchical phrase-based SMT (Chiang, 2005), syntax-based SMT (Eisner, 2003; Ding and Palmer, 2005; Liu et al., 2006; Galley et al., 2006; Cowan et al., 2006), etc. |
Introduction | Our experiments are conducted on Chinese-to-English translation in three state-of-the-art SMT systems, including a phrase-based system, a hierarchical phrase-based system and a syntax-based system.
Abstract | We present a novel scheme to apply factored phrase-based SMT to a language pair with very disparate morphological structures. |
Conclusions | We have presented a novel way to incorporate source syntactic structure in English-to-Turkish phrase-based machine translation by parsing the source sentences and then encoding many local and nonlocal source syntactic structures as additional complex tag factors. |
Experimental Setup and Results | We evaluated the impact of the transformations in factored phrase-based SMT with an English-Turkish data set which consists of 52712 parallel sentences. |
Experimental Setup and Results | As a baseline system, we built a standard phrase-based system, using the surface forms of the words without any transformations, and with a 3-gram LM in the decoder.
Experimental Setup and Results | Factored phrase-based SMT allows the use of multiple language models for the target side, for different factors during decoding. |
Introduction | Once these were identified as separate tokens, they were then used as “words” in a standard phrase-based framework (Koehn et al., 2003). |
Introduction | This facilitates the use of factored phrase-based translation that was not previously applicable due to the morphological complexity on the target side and mismatch between source and target morphologies. |
Introduction | We assume that the reader is familiar with the basics of phrase-based statistical machine translation (Koehn et al., 2003) and factored statistical machine translation (Koehn and Hoang, 2007). |
Related Work | Koehn (2005) applied standard phrase-based SMT to Finnish using the Europarl corpus and reported that translation to Finnish had the worst BLEU scores. |
Related Work | Yang and Kirchhoff (2006) have used phrase-based backoff models to translate unknown words by morphologically decomposing the unknown source words. |
Related Work | They used both CCG supertags and LTAG supertags in Arabic-to-English phrase-based translation and have reported about 6% relative improvement in BLEU scores.
Abstract | Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. |
Alignment | We apply our normal phrase-based decoder on the source side of the training data and constrain the translations to the corresponding target sentences from the training data. |
Conclusion | We have shown that training phrase models can improve translation performance on a state-of-the-art phrase-based translation model. |
Experimental Evaluation | The baseline system is a standard phrase-based SMT system with eight features: phrase translation and word lexicon probabilities in both translation directions, phrase penalty, word penalty, language model score and a simple distance-based reordering model. |
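Such feature sets are typically combined in a log-linear model. The sketch below shows the combination step only; the feature names, the `_prob` naming convention, and the weights are illustrative assumptions (real systems tune the weights, e.g. with MERT).

```python
import math

def loglinear_score(features, weights):
    # Weighted combination of translation features: probability-valued
    # features enter in log space, count-style features (penalties,
    # distortion) enter directly. Names and weights are illustrative.
    return sum(
        w * (math.log(features[name]) if name.endswith("_prob")
             else features[name])
        for name, w in weights.items()
    )
```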
Introduction | A phrase-based SMT system takes a source sentence and produces a translation by segmenting the sentence into phrases and translating those phrases separately (Koehn et al., 2003). |
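The segment-and-translate idea can be illustrated with a greedy monotone toy decoder; real phrase-based decoders additionally search over segmentations and reorderings and score hypotheses with a language model, none of which is modeled here.

```python
def translate_monotone(sentence, phrase_table):
    # Greedily segment the source into the longest phrases found in
    # the table and translate each phrase independently, left to right.
    words, out, i = sentence.split(), [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # longest match first
            phrase = " ".join(words[i:j])
            if phrase in phrase_table:
                out.append(phrase_table[phrase])
                i = j
                break
        else:
            out.append(words[i])  # pass unknown words through unchanged
            i += 1
    return " ".join(out)
```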
Introduction | We use a modified version of a phrase-based decoder to perform the forced alignment. |
Related Work | For the hierarchical phrase-based approach, Blunsom et al. (2008) present a discriminative rule model and show the difference between using only the Viterbi alignment in training and using the full sum over all possible derivations.
Related Work | We also include these word lexica, as they are standard components of the phrase-based system. |
Related Work | They report improvements over a phrase-based model that uses an inverse phrase model and a language model. |
Abstract | We obtain final BLEU scores of 19.35 (conditional probability model) and 19.00 (joint probability model) as compared to 14.30 for a baseline phrase-based system and 16.25 for a system which transliterates OOV words in the baseline system. |
Evaluation | Table 4: Comparing Model-1 and Model-2 with Phrase-based Systems |
Evaluation | We also used two methods to incorporate transliterations in the phrase-based system: |
Evaluation | Post-process P191: All the OOV words in the phrase-based output are replaced with their top-candidate transliteration as given by our transliteration system.
Introduction | Section 4 discusses the training data, parameter optimization and the initial set of experiments that compare our two models with a baseline Hindi-Urdu phrase-based system and with two transliteration-aided phrase-based systems in terms of BLEU scores.
Abstract | The model operates over a phrase-based representation of the source document which we obtain by merging information from PCFG parse trees and dependency graphs. |
Experimental Setup | Training We obtained phrase-based salience scores using a supervised machine learning algorithm. |
Experimental Setup | The SVM was trained with the same features used to obtain phrase-based salience scores, but with sentence-level labels (labels (1) and (2) positive, (3) negative). |
Experimental Setup | Figure 3: ROUGE-1 and ROUGE-L results for the phrase-based ILP model and two baselines, with error bars showing 95% confidence intervals.
Results | F-score is higher for the phrase-based system but not significantly. |
Results | The highlights created by the sentence ILP were considered significantly more verbose (α < 0.05) than those created by the phrase-based system and the CNN abstractors.
Results | Table 5 shows the output of the phrase-based system for the documents in Table 1. |
Abstract | Significant improvements are obtained over a state-of-the-art hierarchical phrase-based machine translation system.
Bag-of-Words Vector Space Model | In the hierarchical phrase-based translation method, the translation rules are extracted by abstracting some words from an initial phrase pair (Chiang, 2005). |
Experiments | For the baseline, we train the translation model following (Chiang, 2005; Chiang, 2007), and our decoder is Joshua, an open-source hierarchical phrase-based machine translation system written in Java.
Hierarchical phrase-based MT system | The hierarchical phrase-based translation method (Chiang, 2005; Chiang, 2007) is a formal syntax-based translation modeling method; its translation model is a weighted synchronous context free grammar (SCFG). |
Hierarchical phrase-based MT system | Empirically, this method has yielded better performance on language pairs such as Chinese-English than the phrase-based method because it permits phrases with gaps; it generalizes the normal phrase-based models in a way that allows long-distance reordering (Chiang, 2005; Chiang, 2007). |
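The reordering power of gapped rules can be shown with a toy synchronous rule application; the rule format, gap-variable naming, and function below are illustrative assumptions, not the Hiero implementation.

```python
def apply_rule(rule, fillers):
    # Apply a synchronous rule whose gap variables (X1, X2, ...) are
    # filled with already-translated sub-phrases. Swapping X1 and X2
    # between the two sides of the rule is what licenses long-distance
    # reordering in hierarchical phrase-based translation.
    src, tgt = rule
    for var, (f_src, f_tgt) in fillers.items():
        src = src.replace(var, f_src)
        tgt = tgt.replace(var, f_tgt)
    return src, tgt
```

For instance, a rule like ("X1 de X2", "X2 of X1") moves the second source phrase to the front of the target, regardless of how long each filler phrase is.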
Introduction | We chose a hierarchical phrase-based SMT system as our baseline; thus, the units involved in computation of sense similarities are hierarchical rules. |
Abstract | The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus.
Conclusion | We have presented pseudo-word as a novel machine translational unit for phrase-based machine translation. |
Conclusion | Experimental results of Chinese-to-English translation task show that, in phrase-based machine translation model, pseudo-word performs significantly better than word in both spoken language translation domain and news domain. |
Experiments and Results | The pipeline uses GIZA++ model 4 (Brown et al., 1993; Och and Ney, 2003) for pseudo-word alignment, Moses (Koehn et al., 2007) as the phrase-based decoder, and the SRI Language Modeling Toolkit to train the language model with modified Kneser-Ney smoothing (Kneser and Ney, 1995; Chen and Goodman, 1998).
Experiments and Results | We use GIZA++ model 4 for word alignment and Moses for phrase-based decoding.
Introduction | The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus generated by word-based models (Brown et al., 1993), proceeds with the induction of a phrase table (Koehn et al., 2003) or a synchronous grammar (Chiang, 2007), and concludes with a model-weight tuning step.
Introduction | By incorporating the syntactic annotations of parse trees from both or either side(s) of the bitext, they are believed to handle reordering better than their phrase-based counterparts.
Introduction | In contrast to conventional tree-to-tree approaches (Ding and Palmer, 2005; Quirk et al., 2005; Xiong et al., 2007; Zhang et al., 2007; Liu et al., 2009), which only make use of a single type of trees, our model is able to combine two types of trees, outperforming both phrase-based and tree-to-string systems. |
Introduction | Current tree-to-tree models (Xiong et al., 2007; Zhang et al., 2007; Liu et al., 2009) still have not outperformed the phrase-based system Moses (Koehn et al., 2007) significantly, even with the help of forests.
Related Work | This model shows a significant improvement over the state-of-the-art hierarchical phrase-based system (Chiang, 2005). |