Abstract | Unsupervised word alignment is most often modeled as a Markov process that generates a sentence f conditioned on its translation e. |
Experimental Results | Extraction-based evaluations of alignment better coincide with the role of word aligners in machine translation systems (Ayan and Dorr, 2006). |
Introduction | Word alignment is the task of identifying corresponding words in sentence pairs. |
Introduction | The standard approach to word alignment employs directional Markov models that align the words of a sentence f to those of its translation e, such as IBM Model 4 (Brown et al., 1993) or the HMM-based alignment model (Vogel et al., 1996). |
Model Definition | Our bidirectional model G = (V, D) is a globally normalized, undirected graphical model of the word alignment for a fixed sentence pair (e, f). Each vertex in the vertex set V corresponds to a model variable V_i, and each undirected edge in the edge set D corresponds to a pair of variables (V_i, V_j). Each vertex has an associated potential function ω_i(v_i) that assigns a real-valued potential to each possible value v_i of V_i. Likewise, each edge has an associated potential function μ_ij(v_i, v_j) that scores pairs of values. |
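As a concrete illustration of the globally normalized model above, the unnormalized score of a full assignment is simply the product of its vertex and edge potentials. The sketch below uses plain dictionary lookups for the ω and μ tables; all names and data structures are illustrative assumptions, not the paper's implementation:

```python
def unnormalized_score(assignment, vertex_pot, edge_pot, edges):
    """Unnormalized score of a full assignment under an undirected model:
    the product of vertex potentials omega_i(v_i) and edge potentials
    mu_ij(v_i, v_j).  vertex_pot[i][v] and edge_pot[(i, j)][(vi, vj)]
    are illustrative stand-ins for the model's potential tables."""
    score = 1.0
    for i, v in assignment.items():
        score *= vertex_pot[i][v]          # omega_i(v_i)
    for (i, j) in edges:
        score *= edge_pot[(i, j)][(assignment[i], assignment[j])]  # mu_ij
    return score
```

Dividing this quantity by the partition function (the sum of such scores over all assignments) would give the globally normalized probability.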
Model Definition | The highest probability word alignment vector under the model for a given sentence pair (e, f) can be computed exactly using the standard Viterbi algorithm for HMMs in O(|e|^2 · |f|) time. |
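The Viterbi computation referenced above can be sketched as follows, assuming toy transition and emission probability tables rather than a trained alignment model; the function and table names are illustrative assumptions:

```python
import math

def viterbi_alignment(e_len, f_words, trans, emit):
    """Viterbi decoding for an HMM-style alignment model: choose, for each
    target word f_j, a source position a_j in [0, e_len).  trans[(k, i)] is
    the probability of jumping from source position k to i, and
    emit[(i, f)] the probability of generating word f from position i.
    Both tables are illustrative toys.  The nested loops over source
    positions for each target word give O(|e|^2 * |f|) time."""
    J = len(f_words)
    # delta[i] = best log-probability of a prefix ending with a_j = i
    delta = [math.log(emit[(i, f_words[0])]) for i in range(e_len)]
    back = []
    for j in range(1, J):
        new, ptr = [], []
        for i in range(e_len):
            best_k = max(range(e_len),
                         key=lambda k: delta[k] + math.log(trans[(k, i)]))
            ptr.append(best_k)
            new.append(delta[best_k] + math.log(trans[(best_k, i)])
                       + math.log(emit[(i, f_words[j])]))
        delta, back = new, back + [ptr]
    # follow back-pointers from the best final position
    a = [max(range(e_len), key=lambda i: delta[i])]
    for ptr in reversed(back):
        a.append(ptr[a[-1]])
    return list(reversed(a))
```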
Model Definition | An alignment vector a can be converted trivially into a set of word alignment links A. |
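The vector-to-links conversion can be written in one line, assuming a[j] holds the source index aligned to target position j (None for unaligned); this encoding is an assumption for illustration, not necessarily the paper's exact convention:

```python
def links_from_vector(a):
    """Convert an alignment vector a (a[j] = source index aligned to
    target position j, or None for an unaligned target word) into a set
    of (i, j) word alignment links.  The None-for-unaligned convention
    is an illustrative assumption."""
    return {(a[j], j) for j in range(len(a)) if a[j] is not None}
```

For example, `links_from_vector([0, None, 2, 2])` yields the link set `{(0, 0), (2, 2), (2, 3)}`.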
Related Work | In addition, supervised word alignment models often use the output of directional unsupervised aligners as features or pruning signals. |
Related Work | A parallel idea that closely relates to our bidirectional model is posterior regularization, which has also been applied to the word alignment problem (Graca et al., 2008). |
Related Work | Another similar line of work applies belief propagation to factor graphs that enforce a one-to-one word alignment (Cromieres and Kurohashi, 2009). |
Experimental Evaluation | We compare the accuracy of our proposed method of joint phrase alignment and extraction using the FLAT, HIER and HLEN models, with a baseline of using word alignments from GIZA++ and heuristic phrase extraction. |
Flat ITG Model | It should be noted that while Model 1 probabilities are used, they are only soft constraints, compared with the hard constraint of choosing a single word alignment used in most previous phrase extraction approaches. |
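To make the soft-constraint idea concrete, IBM Model 1 assigns each target word a probability distribution over source words rather than a single hard link. A minimal sketch, assuming a toy lexical translation table t (an illustrative stand-in for trained parameters):

```python
def model1_posteriors(src, tgt, t):
    """Per-link posteriors under IBM Model 1: each target word f aligns
    to source word e with probability proportional to t[(f, e)].  These
    soft probabilities can feed a model as features instead of
    committing to one hard alignment.  t is an illustrative toy table."""
    post = {}
    for j, f in enumerate(tgt):
        z = sum(t[(f, e)] for e in src)        # normalizer over source words
        for i, e in enumerate(src):
            post[(i, j)] = t[(f, e)] / z
    return post
```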
Hierarchical ITG Model | Because of this, previous research has combined FLAT with heuristic phrase extraction, which exhaustively combines all adjacent phrases permitted by the word alignments (Och et al., 1999). |
Hierarchical ITG Model | Figure 1: A word alignment (a), and its derivations according to FLAT (b) and HIER (c). |
Introduction | However, as DeNero and Klein (2010) note, this two step approach results in word alignments that are not optimal for the final task of generating |
Introduction | As a solution to this, they proposed a supervised discriminative model that performs joint word alignment and phrase extraction, and found that joint estimation of word alignments and extraction sets improves both word alignment accuracy and translation results. |
Phrase Extraction | Figure 3: The phrase, block, and word alignments used in heuristic phrase extraction. |
Phrase Extraction | The traditional method for heuristic phrase extraction from word alignments exhaustively enumerates all phrases up to a certain length consistent with the alignment (Och et al., 1999). |
Phrase Extraction | We will call this heuristic extraction from word alignments HEUR-W. |
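Heuristic extraction consistent with a word alignment can be sketched as below. This is a simplified rendering of the idea behind HEUR-W (every link whose source side falls in the span must have its target side in the span, and vice versa), not the exact Och et al. (1999) implementation:

```python
def extract_phrases(links, e_len, f_len, max_len=4):
    """Enumerate phrase pairs (i1, i2, j1, j2) consistent with a set of
    word alignment links (i, j): a rectangle is consistent iff no link
    has exactly one endpoint inside it, and it covers at least one link.
    A simplified sketch, not the Moses/GIZA++ extractor."""
    phrases = set()
    for i1 in range(e_len):
        for i2 in range(i1, min(e_len, i1 + max_len)):
            for j1 in range(f_len):
                for j2 in range(j1, min(f_len, j1 + max_len)):
                    inside = [(i, j) for (i, j) in links
                              if i1 <= i <= i2 and j1 <= j <= j2]
                    # inconsistent if a link is in the row span XOR the
                    # column span of the candidate rectangle
                    violated = any((i1 <= i <= i2) != (j1 <= j <= j2)
                                   for (i, j) in links)
                    if inside and not violated:
                        phrases.add((i1, i2, j1, j2))
    return phrases
```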
Abstract | Finally, we integrate the transliteration module into the GIZA++ word aligner and evaluate it on two word alignment tasks, achieving improvements in both precision and recall measured against gold standard word alignments. |
Experiments | We evaluate our transliteration mining algorithm on three tasks: transliteration mining from Wikipedia InterLanguage Links, transliteration mining from parallel corpora, and word alignment using a word aligner with a transliteration component. |
Experiments | In the word alignment experiment, we integrate a transliteration module, trained on the transliteration pairs extracted by our method, into a word aligner and show a significant improvement. |
Experiments | We use the English/Hindi corpus from the shared task on word alignment, organized as part of the ACL 2005 Workshop on Building and Using Parallel Texts (WA05) (Martin et al., 2005). |
Introduction | Finally we integrate a transliteration module into the GIZA++ word aligner and show that it improves word alignment quality. |
Introduction | We evaluate our word alignment system on two language pairs using gold standard word alignments and achieve improvements of 10% and 13.5% in precision and 3.5% and 13.5% in recall. |
Introduction | Section 4 describes the evaluation of our mining method through both gold standard evaluation and through using it to improve word alignment quality. |
Backgrounds | 1These numbers are language/corpus-dependent and are not necessarily to be taken as a general reflection of the overall quality of the word alignments for arbitrary language pairs. |
Composed Rule Extraction | Input: HPSG forest F_s, target sentence T, word alignment A = {(i, j)}, target function word set {fw} appearing in T, and target chunk set {C} |
Introduction | However, forest-based translation systems, and, in general, most linguistically syntax-based SMT systems (Galley et al., 2004; Galley et al., 2006; Liu et al., 2006; Zhang et al., 2007; Mi et al., 2008; Liu et al., 2009; Chiang, 2010), are built upon word-aligned parallel sentences and thus share a critical dependence on word alignments. |
Introduction | For example, even a single spurious word alignment can invalidate a large number of otherwise extractable rules, and unaligned words can result in an exponentially large set of extractable rules for the interpretation of these unaligned words (Galley et al., 2006). |
Introduction | What makes word alignment so fragile? |
Related Research | By dealing with the ambiguous word alignment instead of unaligned target words, syntax-based realignment models were proposed by (May |
Related Research | Specifically, we observed that most incorrect or ambiguous word alignments are caused by function words rather than content words. |
Approach Overview | To establish a soft correspondence between the two languages, we use a second similarity function, which leverages standard unsupervised word alignment statistics (§3.3).3 |
Graph Construction | 3The word alignment methods do not use POS information. |
Graph Construction | To define a similarity function between the English and the foreign vertices, we rely on high-confidence word alignments. |
Graph Construction | Since our graph is built from a parallel corpus, we can use standard word alignment techniques to align the English sentences “De |
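One simple way to turn word alignment links into a similarity function between foreign and English word types is to use relative alignment frequencies. A hedged sketch, where the input format (a flat list of aligned word-type pairs) is an assumption for illustration, not the paper's data structure:

```python
from collections import Counter

def alignment_similarities(aligned_pairs):
    """Similarity between foreign and English word types from word-aligned
    data: the fraction of a foreign type's alignment links that go to each
    English type.  `aligned_pairs` is an illustrative list of
    (foreign_word, english_word) links extracted from an aligned corpus."""
    totals, joint = Counter(), Counter()
    for f, e in aligned_pairs:
        totals[f] += 1
        joint[(f, e)] += 1
    return {(f, e): joint[(f, e)] / totals[f] for (f, e) in joint}
```

Thresholding these ratios would keep only high-confidence edges, in the spirit of the construction described above.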
Corpora and baselines | All conditions use word alignments produced by sequential iterations of IBM model 1, HMM, and IBM model 4 in GIZA++, followed by “diag-and” symmetrization (Koehn et al., 2003). |
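A simplified sketch of intersection-plus-growing symmetrization in the spirit of the "diag-and" heuristic of Koehn et al. (2003): start from links both directional aligners agree on, then grow outward through neighboring union links. The real Moses/GIZA++ heuristic adds further constraints (e.g., only linking words that are still unaligned), so this is an approximation, not the exact algorithm:

```python
def grow_diag(e2f, f2e):
    """Symmetrize two directional alignments (sets of (i, j) links):
    begin with their intersection, then repeatedly add union links that
    neighbor (including diagonally) an already-accepted link.
    A simplified sketch of 'diag-and'-style growing."""
    union = e2f | f2e
    aligned = set(e2f & f2e)
    added = True
    while added:
        added = False
        for (i, j) in sorted(union - aligned):
            if any((i + di, j + dj) in aligned
                   for di in (-1, 0, 1) for dj in (-1, 0, 1)):
                aligned.add((i, j))
                added = True
    return aligned
```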
Features | We add inflection features for all words aligned to at least one English verb, adjective, noun, pronoun, or determiner, excepting definite and indefinite articles. |
Features | These features would be more properly defined based on the identity of the target word aligned to these quantifiers, but little ambiguity seems to arise from this substitution in practice. |
Features | These dependencies are inferred from source-side annotation via word alignments, as depicted in Figure 1, without any use of target-side dependency parses. |
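Projecting source-side dependencies to the target through word alignments can be sketched as follows: follow a target word's alignment to the source, take the source head, and map it back. This assumes one-to-one links and illustrative data structures; it is a generic annotation-projection sketch, not the paper's exact rule:

```python
def project_dependencies(src_heads, links):
    """Infer a target-side head for each target word via the alignment:
    target j -> aligned source i -> source head h -> target word aligned
    to h.  src_heads maps a source index to its head index (None for the
    root); links is a set of one-to-one (source, target) pairs.  Words
    whose head has no aligned target are left out."""
    a = {j: i for (i, j) in links}      # target -> source
    inv = {i: j for (i, j) in links}    # source -> target
    heads = {}
    for j, i in a.items():
        h = src_heads.get(i)
        if h is not None and h in inv:
            heads[j] = inv[h]
    return heads
```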