Index of papers in Proc. ACL 2012 that mention
  • word alignment
Chen, Boxing and Kuhn, Roland and Larkin, Samuel
BLEU and PORT
We use word alignment to compute the two permutations (LRscore also uses word alignment).
BLEU and PORT
The word alignment between the source input and reference is computed using GIZA++ (Och and Ney, 2003) beforehand with the default settings, then is refined with the heuristic grow-diag-final-and; the word alignment between the source input and the translation is generated by the decoder with the help of word alignment inside each phrase pair.
BLEU and PORT
These encode one-to-one relations but not one-to-many, many-to-one, many-to-many or null relations, all of which can occur in word alignments.
Experiments
In order to compute the v part of PORT, we require source-target word alignments for the references and MT outputs.
Experiments
Also, v depends on source-target word alignments for reference and test sets.
Experiments
3.2.5 Robustness to word alignment errors
word alignment is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Vaswani, Ashish and Huang, Liang and Chiang, David
Abstract
Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems.
Abstract
We explain how to implement this extension efficiently for large-scale data (also released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 BLEU).
Experiments
We measured the accuracy of word alignments generated by GIZA++ with and without the ℓ0-norm,
Introduction
Automatic word alignment is a vital component of nearly all current statistical translation pipelines.
Introduction
Although state-of-the-art translation models use rules that operate on units bigger than words (like phrases or tree fragments), they nearly always use word alignments to drive extraction of those translation rules.
Introduction
The dominant approach to word alignment has been the IBM models (Brown et al., 1993) together with the HMM model (Vogel et al., 1996).
Method
We start with a brief review of the IBM and HMM word alignment models, then describe how to extend them with a smoothed ℓ0 prior and how to efficiently train them.
Method
In word alignment, one well-known manifestation of overfitting is that rare words can act as “garbage collectors”
Method
Previously (Vaswani et al., 2010), we used ALGENCAN, a nonlinear optimization toolkit, but this solution does not scale well to the number of parameters involved in word alignment models.
word alignment is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
He, Wei and Wu, Hua and Wang, Haifeng and Liu, Ting
Experiments
The alignment was obtained using GIZA++ (Och and Ney, 2003) and then we symmetrized the word alignment using the grow-diag-final heuristic.
Extraction of Paraphrase Rules
3.3 Word Alignments Filtering
Extraction of Paraphrase Rules
We can construct word alignment between S0 and S1 through T0.
Extraction of Paraphrase Rules
On the initial corpus of (S0, T0), we conduct word alignment with Giza++ (Och and Ney, 2000) in both directions and then apply the grow-diag-final heuristic (Koehn et al., 2005) for symmetrization.
word alignment is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Yang, Nan and Li, Mu and Zhang, Dongdong and Yu, Nenghai
Abstract
The ranking model is automatically derived from word aligned parallel data with a syntactic parser for source language based on both lexical and syntactical features.
Experiments
We use Giza++ (Och and Ney, 2003) to generate the word alignment for the parallel corpus.
Experiments
By manual analysis, we find that the gap is due to both errors of the ranking reorder model and errors from word alignment and parser.
Experiments
The reason is that our annotators tend to align function words which might be left unaligned by the automatic word aligner.
Introduction
The ranking model is automatically derived from the word aligned parallel data, viewing the source tree nodes to be reordered as list items to be ranked.
Ranking Model Training
As pointed out by (Li et al., 2007), in practice, nodes often have overlapping target spans due to erroneous word alignment or different syntactic structures between source and target sentences.
Word Reordering as Syntax Tree Node Ranking
The constituent tree is shown above the source sentence; arrows below the source sentence show head-dependent arcs of the dependency tree; word alignment links are lines without arrows between the source and target sentences.
word alignment is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Kim, Sungchul and Toutanova, Kristina and Yu, Hwanjo
Data and task
Word alignment features
Data and task
We exploit a feature set based on HMM word alignments in both directions (Och and Ney, 2000).
Data and task
The first oracle ORACLE1 has access to the gold-standard English entities and gold-standard word alignments among English and foreign words.
word alignment is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Meng, Xinfan and Wei, Furu and Liu, Xiaohua and Zhou, Ming and Xu, Ge and Wang, Houfeng
Conclusion and Future Work
In the future, we will work on leveraging parallel sentences and word alignments for other tasks in sentiment analysis, such as building multilingual sentiment lexicons.
Cross-Lingual Mixture Model for Sentiment Classification
We estimate word projection probability using word alignment probability generated by the Berkeley aligner (Liang et al., 2006).
Cross-Lingual Mixture Model for Sentiment Classification
The word alignment probabilities serve two purposes.
Cross-Lingual Mixture Model for Sentiment Classification
Figure 2 gives an example of word alignment probability.
word alignment is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Kim, Seokhwan and Lee, Gary Geunbae
Cross-lingual Annotation Projection for Relation Extraction
However, these automatic annotations can be unreliable because of source text misclassification and word alignment errors; thus, it can cause a critical falling-off in the annotation projection quality.
Graph Construction
Das and Petrov (Das and Petrov, 2011) proposed a graph-based bilingual projection of part-of-speech tagging by considering the tagged words in the source language as labeled examples and connecting them to the unlabeled words in the target language, while referring to the word alignments.
Graph Construction
If the context vertices U_S for the source language sentences are defined, then the units of context in the target language can also be created based on the word alignments.
Implementation
We used the GIZA++ software (Och and Ney, 2003) to obtain the word alignments for each bi-sentence in the parallel corpus.
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef
Experiments
For Moses HPB, we use “grow-diag-final-and” to obtain symmetric word alignments, 10 for the maximum phrase length, and the recommended default values for all other parameters.
Experiments
We obtain the word alignments by running
Head-Driven HPB Translation Model
Given the word alignment in Figure 1, Table 1 demonstrates the difference between hierarchical rules in Chiang (2007) and HD-HRs defined here.
Introduction
Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence.
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Titov, Ivan and Klementiev, Alexandre
Empirical Evaluation
We use GIZA++ (Och and Ney, 2003) to produce word alignments in Europarl: we ran it in both directions and kept the intersection of the induced word alignments.
Empirical Evaluation
We mark arguments in two languages as aligned if there is any word alignment between the corresponding sets and if they are arguments of aligned predicates.
Multilingual Extension
In doing so, as in much of previous work on unsupervised induction of linguistic structures, we rely on automatically produced word alignments .
Multilingual Extension
In Section 6, we describe how we use word alignment to decide if two arguments are aligned; for now, we assume that (noisy) argument alignments are given.
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Li, Haizhou
Experiments
We ran GIZA++ on these corpora in both directions and then applied the “grow-diag-final” refinement rule to obtain word alignments.
Integrating the Two Models into SMT
We maintain word alignments for each phrase pair in the phrase table.
Integrating the Two Models into SMT
Whenever a hypothesis covers a new verbal predicate v, we find the target translation e for v through word alignments and then calculate its translation probability pt(e|C according to Eq.
Related Work
Therefore they either postpone the integration of target side PASs until the whole decoding procedure is completed (Wu and Fung, 2009b), or directly project semantic roles from the source side to the target side through word alignments during decoding (Liu and Gildea, 2010).
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhong, Zhi and Ng, Hwee Tou
Word Sense Disambiguation
Then, word alignment was performed on the parallel corpora with the GIZA++ software (Och and Ney, 2003).
Word Sense Disambiguation
For each English morphological root e, the English sentences containing its occurrences were extracted from the word-aligned output of GIZA++, as well as the corresponding translations of these occurrences.
Word Sense Disambiguation
To minimize noisy word alignment results, translations with no Chinese character were deleted, and we further removed a translation when it only appears once, or its frequency is less than 10 and also less than 1% of the frequency of e.
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: