Index of papers in Proc. ACL 2008 that mention
  • word alignment
Deng, Yonggang and Xu, Jia and Gao, Yuqing
A Generic Phrase Training Procedure
We first train word alignment models and then use them to evaluate the goodness of a phrase and a phrase pair.
A Generic Phrase Training Procedure
Beginning with a flat lexicon, we train the IBM Model-1 word alignment model with 10 iterations for each translation direction.
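The Model 1 training procedure the excerpt describes (EM from a flat, i.e. uniform, lexicon) can be sketched in a few lines. This is a minimal illustration only; the toy corpus, function names, and data layout below are assumptions, not taken from the paper.

```python
# Minimal sketch of IBM Model 1 EM training from a flat (uniform) lexicon.
# Corpus and variable names are illustrative assumptions.
from collections import defaultdict

def train_model1(corpus, iterations=10):
    """corpus: list of (source_tokens, target_tokens) sentence pairs."""
    # Flat initialization: every t(f|e) starts uniform over the source vocab.
    src_vocab = {w for s, _ in corpus for w in s}
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # marginals c(e)
        for src, tgt in corpus:      # E-step: collect expected counts
            for f in tgt:
                z = sum(t[(f, e)] for e in src)   # normalizer over source words
                for e in src:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        for (f, e), c in count.items():           # M-step: renormalize
            t[(f, e)] = c / total[e]
    return t

corpus = [(["the", "house"], ["la", "maison"]),
          (["the", "book"], ["le", "livre"]),
          (["a", "book"], ["un", "livre"])]
t = train_model1(corpus)
```

On this toy corpus, the repeated co-occurrence of "book" with "livre" pulls probability mass toward t(livre|book) across iterations, which is the effect the flat-start EM training is meant to achieve.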
A Generic Phrase Training Procedure
We then train HMM word alignment models (Vogel et al., 1996) in two directions simultaneously by merging statistics collected in the
Abstract
Experimental results demonstrate consistent and significant improvement over the widely used method that is based on the word alignment matrix only.
Introduction
The most widely used approach derives phrase pairs from the word alignment matrix (Och and Ney, 2003; Koehn et al., 2003).
Introduction
Other methods do not depend on word alignments only, such as directly modeling phrase alignment in a joint generative way (Marcu and Wong, 2002), pursuing information extraction perspective (Venugopal et al., 2003), or augmenting with model-based phrase pair posterior (Deng and Byrne, 2005).
Introduction
On the other hand, there are valid translation pairs in the training corpus that are not learned due to word alignment errors as shown in Deng and Byrne (2005).
word alignment is mentioned in 44 sentences in this paper.
Ganchev, Kuzman and Graça, João V. and Taskar, Ben
Abstract
Automatic word alignment is a key step in training statistical machine translation systems.
Abstract
Despite much recent work on word alignment methods, alignment accuracy increases often produce little or no improvements in machine translation quality.
Introduction
The word alignment problem has received much recent attention, but improvements in standard measures of word alignment performance often do not result in better translations.
Introduction
In this work, we show that by changing the way the word alignment models are trained and
Introduction
We present extensive experimental results evaluating a new training scheme for unsupervised word alignment models: an extension of the Expectation Maximization algorithm that allows effective injection of additional information about the desired alignments into the unsupervised training process.
Statistical word alignment
Statistical word alignment (Brown et al., 1994) is the task of identifying which words are translations of each other in a bilingual sentence corpus.
Statistical word alignment
Figure 2 shows two examples of word alignment of a sentence pair.
Statistical word alignment
Due to the ambiguity of the word alignment task, it is common to distinguish two kinds of alignments (Och and Ney, 2003).
word alignment is mentioned in 22 sentences in this paper.
Toutanova, Kristina and Suzuki, Hisami and Ruopp, Achim
Inflection prediction models
ture of English and word alignment information.
Integration of inflection models with MT systems
Stemming the target sentences is expected to be helpful for word alignment, especially when the stemming operation is defined so that the word alignment becomes more one-to-one (Goldwater and McClosky, 2005).
Integration of inflection models with MT systems
However, for some language pairs, stemming one language can make word alignment worse, if it leads to more violations in the assumptions of current word alignment models, rather than making the source look more like the target.
Integration of inflection models with MT systems
Note that it may be better to use the word alignment maintained as part of the translation hypotheses during search, but our solution is more suitable to situations where these can not be easily obtained.
Introduction
Evidence for this difficulty is the fact that there has been very little work investigating the use of such independent sub-components, though we started to see some successful cases in the literature, for example in word alignment (Fraser and Marcu, 2007), target language capitalization (Wang et al., 2006) and case marker generation (Toutanova and Suzuki, 2007).
MT performance results
Finally, we can see that using stemming at the word alignment stage further improved both the oracle and the achieved results.
MT performance results
pressed in English at the word level, these results are consistent with previous results using stemming to improve word alignment.
Machine translation systems and data
It uses the same lexicalized-HMM model for word alignment as the treelet system, and uses the standard extraction heuristics to extract phrase pairs using forward and backward alignments.
word alignment is mentioned in 13 sentences in this paper.
Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel
Abstract
Incorporating a sparse prior using Variational Bayes biases the models toward generalizable, parsimonious parameter sets, leading to significant improvements in word alignment.
Abstract
This preference for sparse solutions together with effective pruning methods forms a phrase alignment regimen that produces better end-to-end translations than standard word alignment approaches.
Bootstrapping Phrasal ITG from Word-based ITG
The scope of iterative phrasal ITG training, therefore, is limited to determining the boundaries of the phrases anchored on the given one-to-one word alignments.
Bootstrapping Phrasal ITG from Word-based ITG
Second, we do not need to worry about non-ITG word alignments, such as the (2, 4, 1, 3) permutation patterns.
Bootstrapping Phrasal ITG from Word-based ITG
Figure 3 (a) shows all possible non-compositional phrases given the Viterbi word alignment of the example sentence pair.
Experiments
7.1 Word Alignment Evaluation
Experiments
The output of the word alignment systems (GIZA++ or ITG) was fed to a standard phrase extraction procedure that extracted all phrases of length up to 7 and estimated the conditional probabilities of source given target and target given source using relative frequencies.
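The extraction-and-scoring step this excerpt describes can be sketched compactly: enumerate phrase pairs consistent with the word alignment up to a length limit, then score by relative frequency. This is a simplified illustration under assumed data structures; real extractors handle unaligned boundary words and other edge cases more carefully.

```python
# Sketch of consistent phrase-pair extraction plus relative-frequency
# scoring. Data structures are illustrative assumptions.
from collections import Counter

def extract_phrases(src, tgt, alignment, max_len=7):
    """alignment: set of (i, j) links, source index i -> target index j."""
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target span covered by links out of src[i1..i2].
            tgt_idx = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt_idx:
                continue
            j1, j2 = min(tgt_idx), max(tgt_idx)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency: no link into the target span may leave the box.
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.append((" ".join(src[i1:i2 + 1]),
                              " ".join(tgt[j1:j2 + 1])))
    return pairs

def relative_freq(all_pairs):
    # p(tgt | src) estimated by relative frequency over extracted pairs.
    joint = Counter(all_pairs)
    src_totals = Counter(s for s, _ in all_pairs)
    return {(s, t): c / src_totals[s] for (s, t), c in joint.items()}

pairs = extract_phrases(["das", "Haus"], ["the", "house"], {(0, 0), (1, 1)})
probs = relative_freq(pairs)
```

Running the target-given-source direction twice with the roles swapped gives the source-given-target probabilities the excerpt also mentions.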
Introduction
As these word-level alignment models restrict the word alignment complexity by requiring each target word to align to zero or one source words, results are improved by aligning both source-to-target and target-to-source,
Introduction
Finally, the set of phrases consistent with the word alignments are extracted from every sentence pair; these form the basis of the decoding process.
Introduction
Furthermore, it would obviate the need for heuristic combination of word alignments.
Phrasal Inversion Transduction Grammar
First we train a lower level word alignment model, then we place hard constraints on the phrasal alignment space using confident word links from this simpler model.
word alignment is mentioned in 13 sentences in this paper.
Zhao, Shiqi and Wang, Haifeng and Liu, Ting and Li, Sheng
Experiments
It is not surprising, since Bannard and Callison-Burch (2005) have pointed out that word alignment error is the major factor that influences the performance of the methods learning paraphrases from bilingual corpora.
Experiments
The LW based features validate the quality of word alignment and assign low scores to those aligned EC pattern pairs with incorrect alignment.
Introduction
parsing and English-foreign language word alignment, (2) aligned patterns induction, which produces English patterns along with the aligned pivot patterns in the foreign language, (3) paraphrase patterns extraction, in which paraphrase patterns are extracted based on a log-linear model.
Proposed Method
We conduct word alignment with Giza++ (Och and Ney, 2000) in both directions and then apply the grow-diag heuristic (Koehn et al., 2005) for symmetrization.
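The bidirectional alignment plus symmetrization step this excerpt describes can be sketched as follows: start from the intersection of the two directional alignments and grow toward their union along neighbouring links. This is a simplified illustration of the grow-diag idea, not the exact heuristic shipped with Giza++/Moses.

```python
# Sketch of grow-diag-style symmetrization of two directional
# word alignments. Simplified; the production heuristic differs in details.
def grow_diag(src_to_tgt, tgt_to_src):
    """Each argument is a set of (i, j) links: source index i, target index j."""
    union = src_to_tgt | tgt_to_src
    alignment = src_to_tgt & tgt_to_src           # start from the intersection
    neighbours = [(-1, 0), (0, -1), (1, 0), (0, 1),
                  (-1, -1), (-1, 1), (1, -1), (1, 1)]
    added = True
    while added:                                  # grow until a fixpoint
        added = False
        for i, j in sorted(union - alignment):
            # Add a union link adjacent to a current link, provided it
            # touches a source or target word that is still unaligned.
            near = any((i + di, j + dj) in alignment for di, dj in neighbours)
            src_free = all(ii != i for ii, _ in alignment)
            tgt_free = all(jj != j for _, jj in alignment)
            if near and (src_free or tgt_free):
                alignment.add((i, j))
                added = True
    return alignment

fwd = {(0, 0), (1, 1), (2, 2)}   # source-to-target direction
bwd = {(0, 0), (1, 1), (2, 1)}   # target-to-source direction
sym = grow_diag(fwd, bwd)
```

The result always sits between the intersection (high precision) and the union (high recall) of the two directional alignments, which is the point of symmetrization.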
Proposed Method
where a denotes the word alignment between 6 and 6. n is the number of words in 6.
word alignment is mentioned in 5 sentences in this paper.
Cherry, Colin
Cohesive Decoding
showed that a soft cohesion constraint is superior to a hard constraint for word alignment.
Cohesive Phrasal Output
Previous approaches to measuring the cohesion of a sentence pair have worked with a word alignment (Fox, 2002; Lin and Cherry, 2003).
Experiments
Word alignments are provided by GIZA++ (Och and Ney, 2003) with grow-diag-final combination, with infrastructure for alignment combination and phrase extraction provided by the shared task.
Introduction
Fox (2002) showed that cohesion is held in the vast majority of cases for English-French, while Cherry and Lin (2006) have shown it to be a strong feature for word alignment .
word alignment is mentioned in 4 sentences in this paper.
Zhang, Min and Jiang, Hongfei and Aw, Aiti and Li, Haizhou and Tan, Chew Lim and Li, Sheng
Conclusions and Future Work
In addition, word alignment is a hard constraint in our rule extraction.
Conclusions and Future Work
We will study direct structure alignments to reduce the impact of word alignment errors.
Experiments
We used GIZA++ (Och and Ney, 2004) and the “grow-diag-final” heuristic to generate m-to-n word alignments.
Experiments
(2006) reports that discontinuities are very useful for translational equivalence analysis using binary-branching structures under word alignment and parse tree constraints, while they are almost of no use under word alignment constraints only.
word alignment is mentioned in 4 sentences in this paper.
Chan, Yee Seng and Ng, Hwee Tou
Automatic Evaluation Metrics
Given a pair of strings to compare (a system translation and a reference translation), METEOR (Banerjee and Lavie, 2005) first creates a word alignment between the two strings.
Automatic Evaluation Metrics
These word alignments are created incrementally through a series of stages, where each stage only adds alignments between unigrams which have not been matched in previous stages.
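The staged matching this excerpt describes can be sketched as a loop over matchers, where each stage only considers unigrams left unmatched by earlier stages. The stages below (exact match, then case-insensitive match) are illustrative stand-ins; METEOR's actual stages also include stemming and synonymy.

```python
# Sketch of METEOR-style incremental unigram matching in stages.
# Stage functions are illustrative assumptions, not METEOR's real modules.
def staged_alignment(system, reference):
    stages = [lambda a, b: a == b,                  # stage 1: exact match
              lambda a, b: a.lower() == b.lower()]  # stage 2: case-insensitive
    alignment = {}                                   # system index -> ref index
    used_ref = set()
    for match in stages:
        for i, s in enumerate(system):
            if i in alignment:
                continue                             # matched in an earlier stage
            for j, r in enumerate(reference):
                if j not in used_ref and match(s, r):
                    alignment[i] = j
                    used_ref.add(j)
                    break
    return alignment

sys_toks = ["The", "cat", "sat"]
ref_toks = ["the", "cat", "sleeps"]
a = staged_alignment(sys_toks, ref_toks)
```

Here "cat" is linked in the exact stage, "The"/"the" only in the later case-insensitive stage, and "sat" stays unaligned, mirroring how each stage adds alignments only between previously unmatched unigrams.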
Introduction
Although a maximum weight bipartite graph was also used in the recent work of Taskar et al. (2005), their focus was on learning supervised models for single word alignment between sentences from a source and target language.
word alignment is mentioned in 3 sentences in this paper.
Zhang, Dongdong and Li, Mu and Duan, Nan and Li, Chi-Ho and Zhou, Ming
Model Training and Application 3.1 Training
For the bilingual corpus, we also perform word alignment to get correspondences between source and target words.
Model Training and Application 3.1 Training
According to word alignment results, we classify
Model Training and Application 3.1 Training
We ran GIZA++ (Och and Ney, 2000) on the training corpus in both directions with IBM model 4, and then applied the refinement rule described in (Koehn et al., 2003) to obtain a many-to-many word alignment for each sentence pair.
word alignment is mentioned in 3 sentences in this paper.