BLEU and PORT | We use word alignment to compute the two permutations (LRscore also uses word alignment). |
BLEU and PORT | The word alignment between the source input and reference is computed beforehand using GIZA++ (Och and Ney, 2003) with the default settings and then refined with the grow-diag-final-and heuristic; the word alignment between the source input and the translation is generated by the decoder using the word alignments inside each phrase pair. |
BLEU and PORT | These encode one-to-one relations but not one-to-many, many-to-one, many-to-many or null relations, all of which can occur in word alignments. |
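Several of the extracts above and below mention symmetrizing two directional GIZA++ alignments with the grow-diag-final(-and) heuristic. The following is a simplified illustrative sketch of that procedure, not the exact Moses/GIZA++ implementation; function and variable names are our own:

```python
# Simplified sketch of grow-diag-final-and symmetrization. Inputs are the
# link sets from the two directional GIZA++ runs, as (src_idx, tgt_idx) pairs.

NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]  # incl. diagonal neighbors

def grow_diag_final_and(src_to_tgt, tgt_to_src):
    union = src_to_tgt | tgt_to_src
    alignment = set(src_to_tgt & tgt_to_src)      # start from the intersection

    def aligned_src(i): return any(a[0] == i for a in alignment)
    def aligned_tgt(j): return any(a[1] == j for a in alignment)

    # grow: repeatedly add union links adjacent to an existing link
    # when at least one of their endpoints is still unaligned
    added = True
    while added:
        added = False
        for (i, j) in sorted(alignment):
            for (di, dj) in NEIGHBORS:
                cand = (i + di, j + dj)
                if cand in union and cand not in alignment:
                    if not aligned_src(cand[0]) or not aligned_tgt(cand[1]):
                        alignment.add(cand)
                        added = True
    # final-and: add remaining union links whose endpoints are BOTH unaligned
    for (i, j) in sorted(union - alignment):
        if not aligned_src(i) and not aligned_tgt(j):
            alignment.add((i, j))
    return alignment
```

Starting from the high-precision intersection and growing toward the union is what lets the symmetrized alignment contain the one-to-many and many-to-one links that neither directional model can produce on its own.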
Experiments | In order to compute the v part of PORT, we require source-target word alignments for the references and MT outputs. |
Experiments | Also, v depends on source-target word alignments for reference and test sets. |
Experiments | 3.2.5 Robustness to word alignment errors |
Abstract | Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. |
Abstract | We explain how to implement this extension efficiently for large-scale data (also released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 BLEU). |
Experiments | We measured the accuracy of word alignments generated by GIZA++ with and without the ℓ0-norm,
Introduction | Automatic word alignment is a vital component of nearly all current statistical translation pipelines. |
Introduction | Although state-of-the-art translation models use rules that operate on units bigger than words (like phrases or tree fragments), they nearly always use word alignments to drive extraction of those translation rules. |
Introduction | The dominant approach to word alignment has been the IBM models (Brown et al., 1993) together with the HMM model (Vogel et al., 1996). |
Method | We start with a brief review of the IBM and HMM word alignment models, then describe how to extend them with a smoothed ℓ0 prior and how to efficiently train them. |
Method | In word alignment, one well-known manifestation of overfitting is that rare words can act as “garbage collectors”
Method | Previously (Vaswani et al., 2010), we used ALGENCAN, a nonlinear optimization toolkit, but this solution does not scale well to the number of parameters involved in word alignment models. |
Experiments | The alignment was obtained using GIZA++ (Och and Ney, 2003) and then symmetrized using the grow-diag-final heuristic. |
Extraction of Paraphrase Rules | 3.3 Word Alignments Filtering |
Extraction of Paraphrase Rules | We can construct a word alignment between S0 and S1 through T0. |
Extraction of Paraphrase Rules | On the initial corpus of (S0, T0), we conduct word alignment with Giza++ (Och and Ney, 2000) in both directions and then apply the grow-diag-final heuristic (Koehn et al., 2005) for symmetrization. |
Abstract | The ranking model is automatically derived from word-aligned parallel data using a syntactic parser for the source language, based on both lexical and syntactic features. |
Experiments | We use Giza++ (Och and Ney, 2003) to generate the word alignment for the parallel corpus. |
Experiments | By manual analysis, we find that the gap is due to both errors in the ranking reordering model and errors from the word aligner and parser. |
Experiments | The reason is that our annotators tend to align function words which might be left unaligned by the automatic word aligner. |
Introduction | The ranking model is automatically derived from the word-aligned parallel data, viewing the source tree nodes to be reordered as list items to be ranked. |
Ranking Model Training | As pointed out by (Li et al., 2007), in practice, nodes often have overlapping target spans due to erroneous word alignment or different syntactic structures between source and target sentences. |
Word Reordering as Syntax Tree Node Ranking | The constituent tree is shown above the source sentence; arrows below the source sentence show head-dependent arcs of the dependency tree; word alignment links are the arrowless lines between the source and target sentences. |
Data and task | Word alignment features |
Data and task | We exploit a feature set based on HMM word alignments in both directions (Och and Ney, 2000). |
Data and task | The first oracle ORACLE1 has access to the gold-standard English entities and gold-standard word alignments between English and foreign words. |
Conclusion and Future Work | In the future, we will work on leveraging parallel sentences and word alignments for other tasks in sentiment analysis, such as building multilingual sentiment lexicons. |
Cross-Lingual Mixture Model for Sentiment Classification | We estimate word projection probability using word alignment probability generated by the Berkeley aligner (Liang et al., 2006). |
Cross-Lingual Mixture Model for Sentiment Classification | The word alignment probabilities serve two purposes. |
Cross-Lingual Mixture Model for Sentiment Classification | Figure 2 gives an example of word alignment probability. |
Cross-lingual Annotation Projection for Relation Extraction | However, these automatic annotations can be unreliable because of source-text misclassification and word alignment errors, which can cause a critical drop in annotation projection quality. |
Graph Construction | Das and Petrov (Das and Petrov, 2011) proposed a graph-based bilingual projection of part-of-speech tagging by considering the tagged words in the source language as labeled examples and connecting them to the unlabeled words in the target language, while referring to the word alignments. |
Graph Construction | If the context vertices U S for the source language sentences are defined, then the units of context in the target language can also be created based on the word alignments. |
Implementation | We used the GIZA++ software (Och and Ney, 2003) to obtain the word alignments for each bi-sentence in the parallel corpus. |
Experiments | For Moses HPB, we use “grow-diag-final-and” to obtain symmetric word alignments, 10 for the maximum phrase length, and the recommended default values for all other parameters. |
Experiments | We obtain the word alignments by running |
Head-Driven HPB Translation Model | Given the word alignment in Figure 1, Table 1 demonstrates the difference between hierarchical rules in Chiang (2007) and HD-HRs defined here. |
Introduction | Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence. |
Empirical Evaluation | We use GIZA++ (Och and Ney, 2003) to produce word alignments in Europarl: we ran it in both directions and kept the intersection of the induced word alignments. |
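The intersection heuristic mentioned here is the simplest symmetrization: keep only the links proposed by both directional runs. A minimal sketch, assuming Moses-style "i-j" alignment lines (the parsing helper is our own):

```python
# Sketch: intersecting the two directional alignments. Each alignment is a
# set of (src, tgt) index pairs parsed from a Moses-style line like "0-0 1-2".

def parse_links(line):
    """Parse a whitespace-separated list of 'i-j' link tokens."""
    return {tuple(map(int, tok.split("-"))) for tok in line.split()}

fwd = parse_links("0-0 1-2 2-1 3-3")   # source-to-target run
rev = parse_links("0-0 1-2 3-3 4-3")   # target-to-source run
intersection = fwd & rev
# only links proposed by both directions survive, yielding high-precision,
# lower-recall, one-to-one style alignments
```

Intersection trades recall for precision, which suits tasks (like the argument alignment described below) where spurious links are more harmful than missing ones.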
Empirical Evaluation | We mark arguments in two languages as aligned if there is any word alignment between the corresponding sets and if they are arguments of aligned predicates. |
Multilingual Extension | In doing so, as in much of previous work on unsupervised induction of linguistic structures, we rely on automatically produced word alignments . |
Multilingual Extension | In Section 6, we describe how we use word alignment to decide if two arguments are aligned; for now, we assume that (noisy) argument alignments are given. |
Experiments | We ran GIZA++ on these corpora in both directions and then applied the “grow-diag-final” refinement rule to obtain word alignments. |
Integrating the Two Models into SMT | We maintain word alignments for each phrase pair in the phrase table. |
Integrating the Two Models into SMT | Whenever a hypothesis covers a new verbal predicate v, we find the target translation e for v through word alignments and then calculate its translation probability pt(e|C) according to Eq.
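The lookup of a predicate's target translation through phrase-internal word alignments can be sketched as follows; this is a hypothetical illustration (the function name, data layout, and example phrase pair are ours, not the paper's):

```python
# Hypothetical sketch: look up the target-side translation of a source word
# through the word alignment stored with a phrase pair. `alignment` maps
# source positions (within the phrase) to sets of target positions.

def project_source_word(src_pos, alignment, tgt_words):
    """Return the target words aligned to source position src_pos, in order."""
    tgt_positions = sorted(alignment.get(src_pos, ()))
    return [tgt_words[j] for j in tgt_positions]

# e.g. a phrase pair with the verb at source position 0 aligned to target 1
alignment = {0: {1}, 1: {0}}
print(project_source_word(0, alignment, ["has", "increased"]))  # ['increased']
```

Because the decoder already stores these per-phrase alignments (as the preceding extract notes), the projection costs only a dictionary lookup at decoding time.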
Related Work | Therefore they either postpone the integration of target side PASs until the whole decoding procedure is completed (Wu and Fung, 2009b), or directly project semantic roles from the source side to the target side through word alignments during decoding (Liu and Gildea, 2010). |
Word Sense Disambiguation | Then, word alignment was performed on the parallel corpora with the GIZA++ software (Och and Ney, 2003). |
Word Sense Disambiguation | For each English morphological root e, the English sentences containing its occurrences were extracted from the word-aligned output of GIZA++, as well as the corresponding translations of these occurrences. |
Word Sense Disambiguation | To minimize noise from word alignment errors, translations containing no Chinese character were deleted, and we further removed a translation when it appeared only once, or when its frequency was less than 10 and also less than 1% of the frequency of e. |
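The filtering rules in the line above are concrete enough to sketch directly; the function name and exact Chinese-character test are our own (we assume the CJK Unified Ideographs range suffices for "contains a Chinese character"):

```python
import re

# Sketch of the translation-filtering heuristics described above: drop a
# candidate translation when it contains no Chinese character, occurs only
# once, or occurs fewer than 10 times AND less than 1% as often as the
# English root e it translates.

def keep_translation(translation, trans_freq, root_freq):
    has_chinese = re.search(r"[\u4e00-\u9fff]", translation) is not None
    if not has_chinese:
        return False          # no Chinese character: alignment noise
    if trans_freq <= 1:
        return False          # appears only once
    if trans_freq < 10 and trans_freq < 0.01 * root_freq:
        return False          # rare both absolutely and relative to the root
    return True
```

Note that the frequency test is conjunctive: a translation seen 5 times survives if the root itself is rare enough that 5 exceeds 1% of its count.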