Abstract | In our method, a target-side tree fragment that corresponds to a source-side tree fragment is identified via word alignment and mapping rules that are automatically learned. |
Bilingual subtree constraints | Then we perform word alignment using a word-level aligner (Liang et al., 2006; DeNero and Klein, 2007). |
Bilingual subtree constraints | Figure 8 shows an example of a processed sentence pair that has tree structures on both sides and word alignment links. |
Bilingual subtree constraints | Then through word alignment links, we obtain the corresponding words of the words of 3758. |
Experiments | Word alignments were generated by the Berkeley Aligner (Liang et al., 2006; DeNero and Klein, 2007) trained on a bilingual corpus of approximately 0.8M sentence pairs.
Introduction | Basically, a (candidate) dependency subtree in a source-language sentence is mapped to a subtree in the corresponding target-language sentence by using word alignment and mapping rules that are automatically learned. |
Motivation | Suppose that we have an input sentence pair as shown in Figure 1, where the source sentence is in English, the target is in Chinese, the dashed undirected links are word alignment links, and the directed links between words indicate that they have a (candidate) dependency relation.
Motivation | We obtain their corresponding words “肉 (meat)”, “用 (use)”, and “叉子 (fork)” in Chinese via the word alignment links.
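As a rough illustration of this link-following step, the mapping from a set of source words to their aligned target words can be sketched as below (the function name, the toy gloss, and the indices are illustrative, not taken from the paper):

```python
# Hypothetical sketch: collect the target words linked to a given set of
# source positions via (source_index, target_index) alignment links.
def project_words(src_indices, links, tgt_words):
    tgt_indices = sorted({t for s, t in links if s in src_indices})
    return [tgt_words[t] for t in tgt_indices]

# Toy example: "I eat meat with a fork" against a pinyin gloss of the Chinese side.
tgt = ["wo", "yong", "chazi", "chi", "rou"]       # I / use / fork / eat / meat
links = [(0, 0), (1, 3), (2, 4), (3, 1), (5, 2)]  # (source, target) index pairs

print(project_words({2, 3, 5}, links, tgt))       # ['yong', 'chazi', 'rou']
```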
Abstract | The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus.
Abstract | But words appear to be too fine-grained in some cases, such as non-compositional phrasal equivalences, where no clear word alignments exist.
Introduction | The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus generated by word-based models (Brown et al., 1993), proceeds with the induction of a phrase table (Koehn et al., 2003) or a synchronous grammar (Chiang, 2007), and finishes with a model-weight tuning step.
Introduction | But this approach is deficient in that words are too fine-grained in some cases, such as non-compositional phrasal equivalences, where clear word alignments do not exist.
Introduction | No clear word alignments |
Searching for Pseudo-words | Then we apply word alignment techniques to build pseudo-word alignments. |
Abstract | We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT. |
Abstract | The experimental results show that our method improves the performance of both word alignment and translation quality significantly. |
Collocation Model | This method adapts the bilingual word alignment algorithm to the monolingual scenario in order to extract collocations from monolingual corpora alone.
Collocation Model | 2.1 Monolingual word alignment |
Collocation Model | Then the monolingual word alignment algorithm is employed to align the potentially collocated words in the monolingual sentences. |
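The monolingual variant can be pictured as aligning a sentence to itself while forbidding self-links; a minimal sketch, assuming a hypothetical association score between word pairs (all names here are illustrative, not the paper's):

```python
# Sketch: align each position to its highest-scoring potential collocate in
# the same sentence, never to itself. `score` is a hypothetical association
# measure (e.g. a log-likelihood ratio) supplied by the caller.
def monolingual_align(words, score):
    pairs = []
    for i, w in enumerate(words):
        candidates = [j for j in range(len(words)) if j != i]
        if candidates:
            pairs.append((i, max(candidates, key=lambda j: score(w, words[j]))))
    return pairs

# Toy score that only knows "strong tea" is a collocation.
score = lambda a, b: 1.0 if {a, b} == {"strong", "tea"} else 0.0
print(monolingual_align(["strong", "tea", "now"], score))
```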
Introduction | Statistical bilingual word alignment (Brown et al. |
Introduction | Although many methods have been proposed to improve the quality of word alignments (Wu, 1997; Och and Ney, 2000; Marcu and Wong, 2002; Cherry and Lin, 2003; Liu et al., 2005; Huang, 2009), the correlation of the words in multi-word alignments is not fully considered.
Introduction | In phrase-based SMT (Koehn et al., 2003), the phrase boundary is usually determined based on the bidirectional word alignments.
Abstract | We present a simple yet powerful hierarchical search algorithm for automatic word alignment.
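The consistency criterion behind determining phrase boundaries from a word alignment can be sketched as follows (a simplified version that, unlike the full extraction heuristic, does not extend phrases over unaligned boundary words; names are illustrative):

```python
# Sketch: enumerate phrase pairs consistent with a word alignment, i.e.
# bi-spans that no alignment link crosses.
def extract_phrases(n_src, links, max_len=4):
    phrases = []
    for i1 in range(n_src):
        for i2 in range(i1, min(i1 + max_len, n_src)):
            ts = [t for s, t in links if i1 <= s <= i2]
            if not ts:
                continue
            j1, j2 = min(ts), max(ts)
            # reject if a link from outside the source span enters [j1, j2]
            if any(j1 <= t <= j2 and not (i1 <= s <= i2) for s, t in links):
                continue
            phrases.append(((i1, i2), (j1, j2)))
    return phrases

# Alignment 0-0, 1-2, 2-1 (words 1 and 2 are swapped across languages).
print(extract_phrases(3, [(0, 0), (1, 2), (2, 1)]))
```

Note the swapped links: the source span [0, 1] is rejected because link 2-1 enters its target span from outside, while [1, 2] is kept as a consistent block.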
Abstract | We report results on Arabic-English word alignment and translation tasks. |
Introduction | Automatic word alignment is generally accepted as a first step in training any statistical machine translation system. |
Introduction | has motivated much recent work in discriminative modeling for word alignment (Moore, 2005; Ittycheriah and Roukos, 2005; Liu et al., 2005; Taskar et al., 2005; Blunsom and Cohn, 2006; Lacoste-Julien et al., 2006; Moore et al., 2006).
Introduction | We borrow ideas from both k-best parsing (Klein and Manning, 2001; Huang and Chiang, 2005; Huang, 2008) and forest-based and hierarchical phrase-based translation (Huang and Chiang, 2007; Chiang, 2007), and apply them to word alignment.
Word Alignment as a Hypergraph | Word alignments are built bottom-up on the parse tree. |
Word Alignment as a Hypergraph | Initial partial alignments are enumerated and scored at preterminal nodes, each spanning a single column of the word alignment matrix. |
Word Alignment as a Hypergraph | Initial alignments We can construct a word alignment hierarchically, bottom-up, by making use of the structure inherent in syntactic parse trees. |
A Phrase-Based Error Model | Let J be the length of Q, L be the length of C, and A = a1, ..., aJ be a hidden variable representing the word alignment.
A Phrase-Based Error Model | When scoring a given candidate pair, we further restrict our attention to those S, T, M triples that are consistent with the word alignment, which we denote as B(C, Q, A*).
A Phrase-Based Error Model | Once the word alignment is fixed, the final permutation is uniquely determined, so we can safely discard that factor. |
Abstract | It is suggested that the subtree alignment benefits both phrase and syntax based systems by relaxing the constraint of the word alignment.
Introduction | However, most syntax-based systems construct syntactic translation rules based on word alignment, which not only suffers from pipeline errors but also fails to effectively utilize syntactic structural features.
Substructure Spaces for BTKs | 4.1 Lexical and Word Alignment Features |
Substructure Spaces for BTKs | Internal Word Alignment Features: The word alignment links largely account for the co-occurrence of the aligned terms.
Substructure Spaces for BTKs | We define the internal word alignment features as follows: |
Basics of ITG | From the viewpoint of word alignment, the terminal unary rules provide the links of word pairs, whereas the binary rules represent the reordering factor.
Basics of ITG | First of all, it imposes a 1-to-1 constraint in word alignment.
Basics of ITG | Secondly, the simple ITG leads to redundancy if word alignment is the sole purpose of applying ITG. |
Basics of ITG Parsing | Based on the rules in normal form, ITG word alignment is done in a similar way to chart parsing (Wu, 1997). |
Conclusion and Future Work | This paper reviews word alignment through ITG parsing, and clarifies the problem of ITG pruning. |
Evaluation | Table 3 lists the word alignment time cost and SMT performance of different pruning methods. |
Introduction | For this reason ITG has gained more and more attention recently in the word alignment community (Zhang and Gildea, 2005; Cherry and Lin, 2006; Haghighi et al., 2009). |
The DPDI Framework | Discriminative approaches to word alignment use manually annotated alignment for sentence pairs. |
The DPDI Framework | However, in reality there are often cases where a foreign word aligns to more than one English word.
Alignment | Phrase extraction is performed for each training sentence pair separately, using the same word alignment as for the initialization.
Experimental Evaluation | For the heuristic phrase model, we first use GIZA++ (Och and Ney, 2003) to compute the word alignment on TRAIN. |
Experimental Evaluation | Next we obtain a phrase table by extraction of phrases from the word alignment . |
Introduction | Figure: two training pipelines, one in which word translation models trained by the EM algorithm yield a Viterbi word alignment and heuristic phrase counts, and one in which phrase translation models trained by the EM algorithm yield a phrase alignment and phrase translation probabilities; each produces a phrase translation table.
Phrase Model Training | The simplest of our generative phrase models estimates phrase translation probabilities by their relative frequencies in the Viterbi alignment of the data, similar to the heuristic model but with counts from the phrase-aligned data produced in training rather than computed on the basis of a word alignment.
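Relative-frequency estimation of this kind can be sketched in a few lines (function and variable names are illustrative, not the paper's):

```python
from collections import Counter

# Sketch: estimate p(target | source) for phrase pairs by relative frequency,
# i.e. count(f, e) / count(f) over the extracted (or phrase-aligned) pairs.
def phrase_probs(phrase_pairs):
    pair_counts = Counter(phrase_pairs)
    src_counts = Counter(f for f, _ in phrase_pairs)
    return {pair: n / src_counts[pair[0]] for pair, n in pair_counts.items()}

probs = phrase_probs([("a", "x"), ("a", "x"), ("a", "y"), ("b", "z")])
print(probs[("a", "x")])  # 2 of the 3 occurrences of "a" pair with "x"
```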
Related Work | Their results show that it cannot reach performance competitive with extracting a phrase table from word alignments by heuristics (Och et al., 1999).
Related Work | In addition, we do not restrict the training to phrases consistent with the word alignment, as was done in (DeNero et al., 2006).
Related Work | This allows us to recover from flawed word alignments . |
Introduction | For dependency projection, the relationship between words in the parsed sentences can be simply projected across the word alignment to words in the unparsed sentences, according to the DCA assumption (Hwa et al., 2005). |
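Under the DCA assumption the core projection step is simple; a minimal sketch assuming 1-to-1 alignment links (the representation, with heads[d] giving the head of word d and -1 marking the root, is an illustrative choice):

```python
# Sketch: project source dependencies onto the target side through a
# 1-to-1 word alignment. src_heads[d] is the head of word d (-1 for root).
def project_dependencies(src_heads, links):
    a = dict(links)  # source index -> target index
    return {(a[h], a[d])
            for d, h in enumerate(src_heads)
            if h >= 0 and d in a and h in a}

# Chain 0 <- 1 <- 2 on the source side, with a reordering alignment.
print(project_dependencies([-1, 0, 1], [(0, 2), (1, 0), (2, 1)]))
```

Unaligned words simply produce no projected edge, which is one source of the incomplete projected structures discussed below.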
Introduction | Such a projection procedure suffers greatly from word alignment errors and syntactic non-isomorphism between languages, which usually lead to conflicting relationship projections and incomplete projected dependency structures.
Introduction | Because of the free translation, the syntactic non-isomorphism between languages, and word alignment errors, it is often infeasible to project the dependency structure completely from one language to another.
Projected Classification Instance | In order to alleviate the effect of word alignment errors, we base the projection on the alignment matrix, a compact representation of multiple GIZA++ (Och and Ney, 2000) results, rather than on the single word alignment used in previous dependency projection work.
Projected Classification Instance | Figure 2: The word alignment matrix between a Chinese sentence and its English translation. |
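One simple way to realize such a matrix is to average several 1-best alignments into soft link scores; the sketch below uses uniform averaging, which is an assumption rather than necessarily the paper's exact construction:

```python
# Sketch: combine several 1-best alignments (each a set of (s, t) links)
# into a soft matrix whose cells approximate link probabilities.
def alignment_matrix(alignments, n_src, n_tgt):
    m = [[0.0] * n_tgt for _ in range(n_src)]
    for links in alignments:
        for s, t in links:
            m[s][t] += 1.0 / len(alignments)
    return m

# Two runs agree on link 0-0 but disagree on where word 1 aligns.
print(alignment_matrix([{(0, 0), (1, 1)}, {(0, 0), (1, 0)}], 2, 2))
```

A projection can then consult all cells above some threshold instead of committing to a single, possibly erroneous, link.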
Related Works | Because of the free translation, the word alignment errors, and the heterogeneity between the two languages, it is difficult and less effective to project the dependency tree completely to the target-language sentence.
Abstract | These translation tasks are characterized by the relative ability to commit to parallel parse trees and the availability of word alignments, yet the unavailability of large-scale data, calling for a Bayesian tree-to-tree formalism.
Conclusion | The future for this work would involve natural extensions such as mixing over the space of word alignments; this would allow application to MT-like tasks where flexible word reordering is allowed, such as abstractive sentence compression and paraphrasing.
Introduction | One approach is to use word alignments (where these can be reliably estimated, as in our testbed application) to align subtrees and extract rules (Och and Ney, 2004; Galley et al., 2004) but this leaves open the question of finding the right level of generality of the rules — how deep the rules should be and how much lexicalization they should involve — necessitating resorting to heuristics such as minimality of rules, and leading to |
Introduction | possibility of searching over the infinite space of grammars (and, in machine translation, possible word alignments), thus sidestepping the narrowness problem outlined above as well.
Introduction | This task is characterized by the availability of word alignments, providing a clean testbed for investigating the effects of grammar extraction.
The STSG Model | In particular, we visit every tree pair and each of its source nodes i, and update its alignment by selecting between and within two choices: (a) unaligned, (b) aligned with some target node j or e. The number of possibilities j in (b) is significantly limited, firstly by the word alignment (for instance, a source node dominating a deleted subspan cannot be aligned with a target node), and secondly by the current alignment of other nearby aligned source nodes. |