Abstract | The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus.
Abstract | But words can be too fine-grained in some cases, such as non-compositional phrasal equivalences, where no clear word alignments exist.
Introduction | The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus generated by word-based models (Brown et al., 1993), proceeds with the induction of a phrase table (Koehn et al., 2003) or a synchronous grammar (Chiang, 2007), and ends with a model-weight tuning step.
Introduction | But this pipeline has a deficiency: words are too fine-grained in some cases, such as non-compositional phrasal equivalences, where clear word alignments do not exist.
Searching for Pseudo-words | Then we apply word alignment techniques to build pseudo-word alignments. |
Abstract | Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. |
Abstract | We explain how to implement this extension efficiently for large-scale data (also released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 BLEU).
Experiments | We measured the accuracy of word alignments generated by GIZA++ with and without the ℓ0-norm,
Introduction | Automatic word alignment is a vital component of nearly all current statistical translation pipelines.
Introduction | Although state-of-the-art translation models use rules that operate on units bigger than words (like phrases or tree fragments), they nearly always use word alignments to drive extraction of those translation rules.
Introduction | The dominant approach to word alignment has been the IBM models (Brown et al., 1993) together with the HMM model (Vogel et al., 1996). |
Method | We start with a brief review of the IBM and HMM word alignment models, then describe how to extend them with a smoothed ℓ0 prior and how to train them efficiently.
Method | In word alignment, one well-known manifestation of overfitting is that rare words can act as “garbage collectors”.
Method | Previously (Vaswani et al., 2010), we used ALGENCAN, a nonlinear optimization toolkit, but this solution does not scale well to the number of parameters involved in word alignment models. |
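A general-purpose nonlinear solver is not strictly necessary here: each translation distribution lives on a probability simplex, so one scalable alternative is projected gradient ascent with the standard sort-based Euclidean projection onto the simplex. The sketch below is our own illustration under that assumption (function names, step size, and the toy smoothed-ℓ0 penalty are ours, not the authors' implementation):

```python
import math

def project_simplex(v):
    """Euclidean projection of a vector v onto the probability simplex
    (sort-based algorithm, O(n log n))."""
    u = sorted(v, reverse=True)
    s, theta = 0.0, 0.0
    for j, x in enumerate(u, start=1):
        s += x
        if x - (s - 1.0) / j > 0:
            theta = (s - 1.0) / j
    return [max(x - theta, 0.0) for x in v]

def smoothed_l0(theta, beta=0.05):
    """Smoothed l0 penalty: sum_i (1 - exp(-theta_i / beta)) approaches the
    number of nonzero components as beta -> 0 (theta assumed nonnegative)."""
    return sum(1.0 - math.exp(-t / beta) for t in theta)

def projected_gradient_step(theta, grad, step=0.1):
    """One ascent step on the (penalized) objective, then a projection back
    onto the simplex so theta remains a probability distribution."""
    return project_simplex([t + step * g for t, g in zip(theta, grad)])
```

Each EM-style update then alternates expected-count collection with a few such projected steps, keeping every parameter vector a valid distribution without an external constrained-optimization toolkit.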
BLEU and PORT | We use word alignment to compute the two permutations (LRscore also uses word alignment).
BLEU and PORT | The word alignment between the source input and reference is computed using GIZA++ (Och and Ney, 2003) beforehand with the default settings, then is refined with the heuristic grow-diag-final-and; the word alignment between the source input and the translation is generated by the decoder with the help of word alignment inside each phrase pair. |
BLEU and PORT | These encode one-to-one relations but not one-to-many, many-to-one, many-to-many or null relations, all of which can occur in word alignments.
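Reducing a many-to-many alignment to the one-to-one permutation such a metric needs therefore requires a heuristic. The sketch below is one plausible choice (our own illustration, not the paper's exact procedure): map each source position to the earliest target position it links to, let unaligned words inherit the previous position, then rank the positions:

```python
def alignment_to_permutation(links, src_len):
    """Reduce a (possibly many-to-many) alignment to a permutation.
    links: set of (source_pos, target_pos) pairs.
    Each source position gets the earliest target position it links to;
    unaligned positions inherit the previous value (one heuristic of several)."""
    pos, prev = [], -1
    for i in range(src_len):
        tgt = [j for (s, j) in links if s == i]
        if tgt:
            prev = min(tgt)
        pos.append(prev)
    # rank the positions (ties broken left to right) to obtain a permutation
    order = sorted(range(src_len), key=lambda i: (pos[i], i))
    perm = [0] * src_len
    for rank, i in enumerate(order):
        perm[i] = rank + 1
    return perm
```

Different tie-breaking and null-handling choices yield different permutations, which is exactly why such reductions lose information relative to the full alignment.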
Experiments | In order to compute the v part of PORT, we require source-target word alignments for the references and MT outputs. |
Experiments | Also, v depends on source-target word alignments for reference and test sets. |
Experiments | 3.2.5 Robustness to word alignment errors |
Abstract | Finally, we integrate the transliteration module into the GIZA++ word aligner and evaluate it on two word alignment tasks, achieving improvements in both precision and recall measured against gold-standard word alignments.
Experiments | We evaluate our transliteration mining algorithm on three tasks: transliteration mining from Wikipedia InterLanguage Links, transliteration mining from parallel corpora, and word alignment using a word aligner with a transliteration component. |
Experiments | In the word alignment experiment, we integrate a transliteration module, trained on the transliteration pairs extracted by our method, into a word aligner and show a significant improvement.
Experiments | We use the English/Hindi corpus from the shared task on word alignment, organized as part of the ACL 2005 Workshop on Building and Using Parallel Texts (WA05) (Martin et al., 2005).
Introduction | Finally we integrate a transliteration module into the GIZA++ word aligner and show that it improves word alignment quality. |
Introduction | We evaluate our word alignment system on two language pairs using gold standard word alignments and achieve improvements of 10% and 13.5% in precision and 3.5% and 13.5% in recall. |
Introduction | Section 4 describes the evaluation of our mining method through both gold standard evaluation and through using it to improve word alignment quality. |
Experimental Evaluation | We compare the accuracy of our proposed method of joint phrase alignment and extraction using the FLAT, HIER and HLEN models, with a baseline of using word alignments from GIZA++ and heuristic phrase extraction. |
Flat ITG Model | It should be noted that while Model 1 probabilities are used, they are only soft constraints, compared with the hard constraint of choosing a single word alignment used in most previous phrase extraction approaches. |
Hierarchical ITG Model | Because of this, previous research has combined FLAT with heuristic phrase extraction, which exhaustively combines all adjacent phrases permitted by the word alignments (Och et al., 1999). |
Hierarchical ITG Model | Figure 1: A word alignment (a), and its derivations according to FLAT (b) and HIER (c).
Introduction | However, as DeNero and Klein (2010) note, this two-step approach results in word alignments that are not optimal for the final task of generating translations.
Introduction | As a solution to this, they proposed a supervised discriminative model that performs joint word alignment and phrase extraction, and found that joint estimation of word alignments and extraction sets improves both word alignment accuracy and translation results. |
Phrase Extraction | Figure 3: The phrase, block, and word alignments used in heuristic phrase extraction. |
Phrase Extraction | The traditional method for heuristic phrase extraction from word alignments exhaustively enumerates all phrases up to a certain length consistent with the alignment (Och et al., 1999). |
Phrase Extraction | We will call this heuristic extraction from word alignments HEUR-W. |
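The consistency check at the heart of HEUR-W is compact: a phrase pair is extracted only if no alignment link connects a word inside the pair to a word outside it. A minimal sketch (our own rendering, ignoring the usual extension over unaligned boundary words) might look like:

```python
def extract_phrases(links, src_len, max_len=7):
    """Enumerate phrase pairs consistent with a word alignment.
    links: set of (source_pos, target_pos) pairs.
    A pair of spans is consistent iff no link crosses its boundary."""
    phrases = set()
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            # target positions linked to the source span [i1, i2]
            tgt = [j for (i, j) in links if i1 <= i <= i2]
            if not tgt:
                continue
            j1, j2 = min(tgt), max(tgt)
            if j2 - j1 + 1 > max_len:
                continue
            # reject if any link enters [j1, j2] from outside [i1, i2]
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in links):
                continue
            phrases.add((i1, i2, j1, j2))
    return phrases
```

On a monotone two-word alignment this yields the two single-word pairs plus the combined pair, matching the exhaustive enumeration the text describes.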
Abstract | Unsupervised word alignment is most often modeled as a Markov process that generates a sentence f conditioned on its translation e.
Experimental Results | Extraction-based evaluations of alignment better coincide with the role of word aligners in machine translation systems (Ayan and Dorr, 2006). |
Introduction | Word alignment is the task of identifying corresponding words in sentence pairs. |
Introduction | The standard approach to word alignment employs directional Markov models that align the words of a sentence f to those of its translation e, such as IBM Model 4 (Brown et al., 1993) or the HMM-based alignment model (Vogel et al., 1996).
Model Definition | Our bidirectional model G = (V, D) is a globally normalized, undirected graphical model of the word alignment for a fixed sentence pair (e, f). Each vertex in the vertex set V corresponds to a model variable V_i, and each undirected edge in the edge set D corresponds to a pair of variables (V_i, V_j). Each vertex has an associated potential function ω_i that assigns a real-valued potential to each possible value v_i of V_i. Likewise, each edge has an associated potential function ω_ij(v_i, v_j) that scores pairs of values.
Model Definition | The highest-probability word alignment vector under the model for a given sentence pair (e, f) can be computed exactly using the standard Viterbi algorithm for HMMs in O(|e|^2 · |f|) time.
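That decoding step can be sketched directly: the states are positions of e, the observations are the words of f, and each of the |f| columns takes O(|e|^2) transitions. The lexical and jump tables below are invented toy values, not any paper's parameters:

```python
import math

def viterbi_hmm(f, e, t, d):
    """Viterbi decoding for HMM word alignment in O(|e|^2 * |f|) time.
    t: lexical table {(f_word, e_word): prob};
    d: jump table {signed distance: prob}. NULL alignments omitted."""
    I, J = len(e), len(f)
    NEG = float("-inf")
    delta = [[NEG] * I for _ in range(J)]
    back = [[0] * I for _ in range(J)]
    for i in range(I):
        delta[0][i] = math.log(t.get((f[0], e[i]), 1e-9))
    for j in range(1, J):
        for i in range(I):
            best, arg = NEG, 0
            for ip in range(I):  # best predecessor state
                s = delta[j - 1][ip] + math.log(d.get(i - ip, 1e-9))
                if s > best:
                    best, arg = s, ip
            delta[j][i] = best + math.log(t.get((f[j], e[i]), 1e-9))
            back[j][i] = arg
    i = max(range(I), key=lambda k: delta[J - 1][k])
    a = [i]
    for j in range(J - 1, 0, -1):  # backtrace
        i = back[j][i]
        a.append(i)
    return a[::-1]
```

With a lexical table that prefers the diagonal, the decoder recovers the monotone alignment, as expected.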
Model Definition | An alignment vector a can be converted trivially into a set of word alignment links A: |
Related Work | In addition, supervised word alignment models often use the output of directional unsupervised aligners as features or pruning signals. |
Related Work | A parallel idea that closely relates to our bidirectional model is posterior regularization, which has also been applied to the word alignment problem (Graca et al., 2008). |
Related Work | Another similar line of work applies belief propagation to factor graphs that enforce a one-to-one word alignment (Cromieres and Kurohashi, 2009). |
Abstract | Experiments on Chinese-English translation demonstrated the effectiveness of our approach in enhancing the quality of overall translation, name translation and word alignment over a high-quality MT baseline.
Experiments | Therefore, it is important to use name-replaced corpora for rule extraction to take full advantage of the improved word alignment.
Experiments | 5.4 Word Alignment |
Experiments | It is also important to investigate the impact of our NAMT approach on improving word alignment . |
Introduction | names in parallel corpora, updating word segmentation, word alignment and grammar extraction (Section 3.1). |
Name-aware MT | We pair two entities from two languages if they have the same entity type and are mapped together by word alignment.
Name-aware MT | First, we replace tagged name pairs with their entity types, and then use Giza++ and symmetrization heuristics to regenerate word alignment.
Name-aware MT | Since the name tags appear very frequently, the existence of such tags yields improvement in word alignment quality. |
Abstract | It is suggested that the subtree alignment benefits both phrase and syntax based systems by relaxing the constraint of the word alignment.
Introduction | However, most syntax-based systems construct the syntactic translation rules based on word alignment, which not only suffers from pipeline errors, but also fails to effectively utilize the syntactic structural features.
Substructure Spaces for BTKs | 4.1 Lexical and Word Alignment Features |
Substructure Spaces for BTKs | Internal Word Alignment Features: The word alignment links largely account for the co-occurrence of the aligned terms.
Substructure Spaces for BTKs | We define the internal word alignment features as follows: |
A Phrase-Based Error Model | Let J be the length of Q, L be the length of C, and A = a_1, ..., a_J be a hidden variable representing the word alignment.
A Phrase-Based Error Model | When scoring a given candidate pair, we further restrict our attention to those S, T, M triples that are consistent with the word alignment, which we denote as B(C, Q, A*).
A Phrase-Based Error Model | Once the word alignment is fixed, the final permutation is uniquely determined, so we can safely discard that factor. |
Abstract | We present a simple yet powerful hierarchical search algorithm for automatic word alignment . |
Abstract | We report results on Arabic-English word alignment and translation tasks. |
Introduction | Automatic word alignment is generally accepted as a first step in training any statistical machine translation system. |
Introduction | has motivated much recent work in discriminative modeling for word alignment (Moore, 2005; Ittycheriah and Roukos, 2005; Liu et al., 2005; Taskar et al., 2005; Blunsom and Cohn, 2006; Lacoste-Julien et al., 2006; Moore et al., 2006).
Introduction | We borrow ideas from both k-best parsing (Klein and Manning, 2001; Huang and Chiang, 2005; Huang, 2008) and forest-based and hierarchical phrase-based translation (Huang and Chiang, 2007; Chiang, 2007), and apply them to word alignment.
Word Alignment as a Hypergraph | Word alignments are built bottom-up on the parse tree. |
Word Alignment as a Hypergraph | Initial partial alignments are enumerated and scored at preterminal nodes, each spanning a single column of the word alignment matrix. |
Word Alignment as a Hypergraph | Initial alignments We can construct a word alignment hierarchically, bottom-up, by making use of the structure inherent in syntactic parse trees. |
Abstract | We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT. |
Abstract | The experimental results show that our method improves the performance of both word alignment and translation quality significantly. |
Collocation Model | This method adapts the bilingual word alignment algorithm to the monolingual scenario to extract collocations only from monolingual corpora.
Collocation Model | 2.1 Monolingual word alignment |
Collocation Model | Then the monolingual word alignment algorithm is employed to align the potentially collocated words in the monolingual sentences. |
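The key twist in monolingual word alignment is that a sentence plays both the source and target roles, so a word must be forbidden from aligning to itself. A toy greedy version of that constraint is sketched below (our own illustration; the actual method re-estimates an IBM-style model monolingually rather than using a fixed table):

```python
def monolingual_align(sentence, t):
    """Greedy monolingual 'alignment': each word links to its most likely
    collocate in the same sentence, with self-alignment (i == j) forbidden.
    t: collocation table {(word, collocate): prob}; assumes len(sentence) >= 2."""
    a = []
    for j, w in enumerate(sentence):
        best = max((i for i in range(len(sentence)) if i != j),
                   key=lambda i: t.get((w, sentence[i]), 0.0))
        a.append(best)
    return a
```

Words that align to each other in both directions (here "strong" and "tea") are then taken as collocation candidates.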
Introduction | Statistical bilingual word alignment (Brown et al. |
Introduction | Although many methods were proposed to improve the quality of word alignments (Wu, 1997; Och and Ney, 2000; Marcu and Wong, 2002; Cherry and Lin, 2003; Liu et al., 2005; Huang, 2009), the correlation of the words in multi-word alignments is not fully considered. |
Introduction | In phrase-based SMT (Koehn et al., 2003), the phrase boundary is usually determined based on the bidirectional word alignments.
Abstract | In our method, a target-side tree fragment that corresponds to a source-side tree fragment is identified via word alignment and mapping rules that are automatically learned. |
Bilingual subtree constraints | Then we perform word alignment using a word-level aligner (Liang et al., 2006; DeNero and Klein, 2007). |
Bilingual subtree constraints | Figure 8 shows an example of a processed sentence pair that has tree structures on both sides and word alignment links. |
Bilingual subtree constraints | Then through word alignment links, we obtain the corresponding words of the words of 3758. |
Experiments | Word alignments were generated from the Berkeley Aligner (Liang et al., 2006; DeNero and Klein, 2007) trained on a bilingual corpus having approximately 0.8M sentence pairs. |
Introduction | Basically, a (candidate) dependency subtree in a source-language sentence is mapped to a subtree in the corresponding target-language sentence by using word alignment and mapping rules that are automatically learned. |
Motivation | Suppose that we have an input sentence pair as shown in Figure 1, where the source sentence is in English, the target is in Chinese, the dashed undirected links are word alignment links, and the directed links between words indicate that they have a (candidate) dependency relation.
Motivation | We obtain their corresponding Chinese words (glossed as “meat”, “use”, and “fork”) via the word alignment links.
Abstract | In contrast, alignment-based methods use a word alignment model to fulfill this task, which avoids parsing errors because no parsing is required.
Introduction | A word can find its corresponding modifiers by using a word alignment |
Introduction | Furthermore, this paper naturally addresses another question: is it useful for opinion target extraction when we combine syntactic patterns and a word alignment model into a unified model?
Introduction | Then, these partial alignment links can be regarded as the constraints for a standard unsupervised word alignment model.
Opinion Target Extraction Methodology | In the first component, we respectively use syntactic patterns and unsupervised word alignment model (WAM) to capture opinion relations. |
Opinion Target Extraction Methodology | In addition, we employ a partially supervised word alignment model (PSWAM) to incorporate syntactic information into WAM. |
Opinion Target Extraction Methodology | 3.1.2 Unsupervised Word Alignment Model |
Conclusion | We have shown that the word alignment models IBM-3 and IBM-4 can be turned into nondeficient variants.
Introduction | While most people think of the translation and word alignment models IBM-3 and IBM-4 as inherently deficient models (i.e. |
Introduction | The source code of this project is available in our word alignment software RegAligner, version 1.2 and later.
Introduction | Related Work Today’s most widely used models for word alignment are still the models IBM 1-5 of Brown et al. |
The models IBM-3, IBM-4 and IBM-5 | The probability p(f_1^J | e_1^I) of getting the foreign sentence as a translation of the English one is modeled by introducing the word alignment a as a hidden variable:
The models IBM-3, IBM-4 and IBM-5 | For each i = 1, ..., I, decide on the number Φ_i of foreign words aligned to e_i.
The models IBM-3, IBM-4 and IBM-5 | For each i = 1, 2, ..., I and k = 1, ..., Φ_i, decide on (a) the identity f_{i,k} of the next foreign word aligned to e_i.
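This fertility-based generative story can be rendered as a toy sampler (our own hedged sketch: deterministic fertilities, no NULL word, and the distortion/placement step is omitted; all tables are invented):

```python
import random

def sample_ibm3(e, fertility, ttable, seed=0):
    """Toy rendering of the IBM-3 generative story.
    fertility: {e_word: number of foreign words aligned to it};
    ttable: {e_word: list of candidate foreign words}."""
    rng = random.Random(seed)
    f = []
    for word in e:
        # step 1: the number Phi_i of foreign words aligned to e_i
        for _ in range(fertility[word]):
            # step 2: the identity of each aligned foreign word
            f.append(rng.choice(ttable[word]))
    return f  # step 3 (placement / distortion) is omitted in this sketch
```

The omitted placement step is precisely where IBM-3 and IBM-4 spend probability mass on invalid positions, which is the deficiency the surrounding text discusses.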
Training the New Variants | For the task of word alignment, we infer the parameters of the models using maximum likelihood.
Training the New Variants | This task is also needed for the actual task of word alignment (annotating a given sentence pair with an alignment).
Conclusions and Future Work | Although its greater sensitivity to word alignment errors enables SncTSSG to capture additional noncontiguous language phenomena, it also induces many redundant noncontiguous rules.
Experiments | We use the m-to-n word alignments output by GIZA++ to extract the tree sequence pairs.
Experiments | The STSSG or any contiguous translational equivalence based model is unable to attain the corresponding target output for this idiom word via the noncontiguous word alignment and considers it an out-of-vocabulary (OOV) item.
Experiments | On the contrary, the SncTSSG based model can capture the noncontiguous tree sequence pair consistent with the word alignment and further provide a reasonable target translation. |
Introduction | (2006) statistically report that discontinuities are very useful for translational equivalence analysis using binary branching structures under word alignment and parse tree constraints. |
Tree Sequence Pair Extraction | Data structure: p[j1, j2] to store tree sequence pairs covering source span [j1, j2] 1: foreach source span [j1, j2], do 2: find a target span [i1, i2] with minimal length covering all the target words aligned to [j1, j2] 3: if all the target words in [i1, i2] are aligned with source words only in [j1, j2], then 4: Pair each source tree sequence covering [j1, j2] with those in target covering [i1, i2] as a contiguous tree sequence pair
Tree Sequence Pair Extraction | 7: create sub-span set s([i1,i2]) to cover all the target words aligned to [j1,j2] |
Tree Sequence Pair Extraction | 13: find a source span [j1, j2] with minimal length covering all the source words aligned to [i1,i2]
Abstract | However, most previous approaches to bilingual tagging assume word alignments are given as fixed input, which can cause cascading errors. |
Abstract | We observe that NER label information can be used to correct alignment mistakes, and present a graphical model that performs bilingual NER tagging jointly with word alignment, by combining two monolingual tagging models with two unidirectional alignment models.
Abstract | Experiments on the OntoNotes dataset demonstrate that our method yields significant improvements in both NER and word alignment over state-of-the-art monolingual baselines. |
Bilingual NER by Agreement | We also assume that a set of word alignments (A = {(i, j) : e_i ↔ f_j}) is given by a word aligner and remains fixed in our model.
Bilingual NER by Agreement | The assumption in the hard agreement model can also be violated if there are word alignment errors. |
Introduction | In this work, we first develop a bilingual NER model (denoted as BI-NER) by embedding two monolingual CRF-based NER models into a larger undirected graphical model, and introduce additional edge factors based on word alignment (WA). |
Introduction | Our method does not require any manual annotation of word alignments or named entities over the bilingual training data. |
Introduction | The aforementioned BI-NER model assumes fixed alignment input given by an underlying word aligner . |
Joint Alignment and NER Decoding | To capture this intuition, we extend the BI-NER model to jointly perform word alignment and NER decoding, and call the resulting model BI-NER-WA. |
Abstract | This study proposes a word alignment model based on a recurrent neural network (RNN), in which an unlimited alignment history is represented by recurrently connected hidden layers. |
Abstract | The RNN-based model outperforms the feed-forward neural network-based model (Yang et al., 2013) as well as the IBM Model 4 under Japanese-English and French-English word alignment tasks, and achieves comparable translation performance to those baselines for Japanese-English and Chinese-English translation tasks. |
Introduction | Automatic word alignment is an important task for statistical machine translation. |
Introduction | We assume that this property would fit with a word alignment task, and we propose an RNN-based word alignment model. |
Introduction | (2013) trained their model from word alignments produced by traditional unsupervised probabilistic models. |
Related Work | Various word alignment models have been proposed. |
Related Work | As an instance of discriminative models, we describe an FFNN-based word alignment model (Yang et al., 2013), which is our baseline. |
Training | GEN is a subset of all possible word alignments Φ, which is generated by beam search.
Training | We evaluated the alignment performance of the proposed models with two tasks: Japanese-English word alignment with the Basic Travel Expression Corpus (BTEC) (Takezawa et al., 2002) and French-English word alignment with the Hansard dataset (Hansards) from the 2003 NAACL shared task (Mihalcea and Pedersen, 2003).
A Generic Phrase Training Procedure | We first train word alignment models and will use them to evaluate the goodness of a phrase and a phrase pair. |
A Generic Phrase Training Procedure | Beginning with a flat lexicon, we train the IBM Model-1 word alignment model with 10 iterations for each translation direction.
A Generic Phrase Training Procedure | We then train HMM word alignment models (Vogel et al., 1996) in two directions simultaneously by merging statistics collected in the |
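The Model 1 stage of such a pipeline fits only a lexical table t(f|e) by EM and is small enough to sketch in full (variable names are ours; a real implementation would also add a NULL source word):

```python
import collections

def train_model1(bitext, iterations=10):
    """EM training of IBM Model 1 lexical probabilities t(f|e).
    bitext: list of (foreign_words, english_words) sentence pairs."""
    pairs = {(f, e) for fs, es in bitext for f in fs for e in es}
    f_vocab = {f for f, _ in pairs}
    t = {pair: 1.0 / len(f_vocab) for pair in pairs}  # flat lexicon
    for _ in range(iterations):
        count = collections.defaultdict(float)
        total = collections.defaultdict(float)
        for fs, es in bitext:
            for f in fs:
                # E step: distribute each f over candidate English words
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M step: renormalize expected counts
        t = {(f, e): count[(f, e)] / total[e] for (f, e) in pairs}
    return t
```

Even on two sentence pairs, co-occurrence statistics pull the table toward the correct lexical links, which then seed the HMM stage described above.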
Abstract | Experimental results demonstrate consistent and significant improvement over the widely used method that is based on word alignment matrix only. |
Introduction | The most widely used approach derives phrase pairs from word alignment matrix (Och and Ney, 2003; Koehn et al., 2003). |
Introduction | Other methods do not depend on word alignments only, such as directly modeling phrase alignment in a joint generative way (Marcu and Wong, 2002), pursuing information extraction perspective (Venugopal et al., 2003), or augmenting with model-based phrase pair posterior (Deng and Byrne, 2005). |
Introduction | On the other hand, there are valid translation pairs in the training corpus that are not learned due to word alignment errors as shown in Deng and Byrne (2005). |
Abstract | Automatic word alignment is a key step in training statistical machine translation systems. |
Abstract | Despite much recent work on word alignment methods, alignment accuracy increases often produce little or no improvements in machine translation quality. |
Introduction | The word alignment problem has received much recent attention, but improvements in standard measures of word alignment performance often do not result in better translations. |
Introduction | In this work, we show that by changing the way the word alignment models are trained and |
Introduction | We present extensive experimental results evaluating a new training scheme for unsupervised word alignment models: an extension of the Expectation Maximization algorithm that allows effective injection of additional information about the desired alignments into the unsupervised training process. |
Statistical word alignment | Statistical word alignment (Brown et al., 1994) is the task of identifying which words are translations of each other in a bilingual sentence corpus.
Statistical word alignment | Figure 2 shows two examples of word alignment of a sentence pair. |
Statistical word alignment | Due to the ambiguity of the word alignment task, it is common to distinguish two kinds of alignments (Och and Ney, 2003). |
Inflection prediction models | ture of English and word alignment information. |
Integration of inflection models with MT systems | Stemming the target sentences is expected to be helpful for word alignment, especially when the stemming operation is defined so that the word alignment becomes more one-to-one (Goldwater and McClosky, 2005). |
Integration of inflection models with MT systems | However, for some language pairs, stemming one language can make word alignment worse, if it leads to more violations in the assumptions of current word alignment models, rather than making the source look more like the target. |
Integration of inflection models with MT systems | Note that it may be better to use the word alignment maintained as part of the translation hypotheses during search, but our solution is more suitable to situations where these cannot be easily obtained.
Introduction | Evidence for this difficulty is the fact that there has been very little work investigating the use of such independent sub-components, though we started to see some successful cases in the literature, for example in word alignment (Fraser and Marcu, 2007), target language capitalization (Wang et al., 2006) and case marker generation (Toutanova and Suzuki, 2007). |
MT performance results | Finally, we can see that using stemming at the word alignment stage further improved both the oracle and the achieved results. |
MT performance results | pressed in English at the word level, these results are consistent with previous results using stemming to improve word alignment . |
Machine translation systems and data | It uses the same lexicalized-HMM model for word alignment as the treelet system, and uses the standard extraction heuristics to extract phrase pairs using forward and backward alignments. |
Data and Tools | 3.2 Word Alignments |
Data and Tools | In our approach, word alignments for the parallel text are required. |
Data and Tools | We perform word alignments with the open source GIZA++ toolkit.
Experiments | By using IGT data, not only can we obtain more accurate word alignments, but we can also extract useful cross-lingual information for the resource-poor language.
Our Approach | In our scenario, we have a set of aligned parallel data P = {(x_i^s, x_i^t, a_i)}, where a_i is the word alignment for the pair of source-target sentences (x_i^s, x_i^t), and a set of unlabeled sentences of the target language U = {x_i^t}. We also have a trained English parsing model p_λE. Then the K in equation (7) can be divided into two cases, according to whether x_i belongs to the parallel data set P or the unlabeled data set U.
Our Approach | We define the transferring distribution by defining the transferring weight utilizing the English parsing model p_λE via parallel data with word alignments:
Our Approach | By reducing unaligned edges to their delexicalized forms, we can still use those delexicalized features, such as part-of-speech tags, for the unaligned edges, and can address the problem that automatically generated word alignments include errors.
Abstract | Incorporating a sparse prior using Variational Bayes biases the models toward generalizable, parsimonious parameter sets, leading to significant improvements in word alignment.
Abstract | This preference for sparse solutions together with effective pruning methods forms a phrase alignment regimen that produces better end-to-end translations than standard word alignment approaches. |
Bootstrapping Phrasal ITG from Word-based ITG | The scope of iterative phrasal ITG training, therefore, is limited to determining the boundaries of the phrases anchored on the given one-to-one word alignments.
Bootstrapping Phrasal ITG from Word-based ITG | Second, we do not need to worry about non-ITG word alignments, such as the (2, 4, 1, 3) permutation patterns.
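The (2, 4, 1, 3) pattern, together with its mirror (3, 1, 4, 2), is exactly what a binary ITG cannot derive: a permutation is ITG-alignable iff it reduces to a single block by repeatedly merging adjacent blocks whose value ranges are contiguous. A small shift-reduce check captures this (our own sketch of the standard reduction):

```python
def is_itg(perm):
    """Return True iff the permutation (values 1..n) is derivable by a
    binary ITG: it must reduce to one block by merging adjacent blocks
    with contiguous value ranges. (2,4,1,3) and (3,1,4,2) never reduce."""
    stack = []  # each entry is a (lo, hi) value range of a merged block
    for x in perm:
        stack.append((x, x))
        while len(stack) >= 2:
            (a1, b1), (a2, b2) = stack[-2], stack[-1]
            if b1 + 1 == a2 or b2 + 1 == a1:  # straight or inverted merge
                stack[-2:] = [(min(a1, a2), max(b1, b2))]
            else:
                break
    return len(stack) == 1
```

This is why the hard one-to-one anchoring in the text sidesteps the non-ITG cases entirely: any alignment built by composing such reductions is ITG-representable by construction.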
Bootstrapping Phrasal ITG from Word-based ITG | Figure 3 (a) shows all possible non-compositional phrases given the Viterbi word alignment of the example sentence pair. |
Experiments | 7.1 Word Alignment Evaluation |
Experiments | The output of the word alignment systems (GIZA++ or ITG) were fed to a standard phrase extraction procedure that extracted all phrases of length up to 7 and estimated the conditional probabilities of source given target and target given source using relative frequencies. |
Introduction | As these word-level alignment models restrict the word alignment complexity by requiring each target word to align to zero or one source words, results are improved by aligning both source-to-target as well as target-to-source, |
Introduction | Finally, the set of phrases consistent with the word alignments are extracted from every sentence pair; these form the basis of the decoding process. |
Introduction | Furthermore it would obviate the need for heuristic combination of word alignments . |
Phrasal Inversion Transduction Grammar | First we train a lower level word alignment model, then we place hard constraints on the phrasal alignment space using confident word links from this simpler model. |
Abstract | Bidirectional models of word alignment are an appealing alternative to post-hoc combinations of directional word aligners.
Background | The focus of this work is on the word alignment decoding problem. |
Background | Before turning to the model of interest, we first introduce directional word alignment . |
Background | 2.1 Word Alignment |
Introduction | Word alignment is a critical first step for building statistical machine translation systems. |
Introduction | In order to ensure accurate word alignments, most systems employ a post-hoc symmetrization step to combine directional word aligners, such as IBM Model 4 (Brown et al., 1993) or hidden Markov model (HMM) based aligners (Vogel et al., 1996).
Introduction | We begin in Section 2 by formally describing the directional word alignment problem. |
Related Work | Cromieres and Kurohashi (2009) use belief propagation on a factor graph to train and decode a one-to-one word alignment problem. |
Related Work | Graca et al. (2008) use posterior regularization to constrain the posterior probability of the word alignment problem to be symmetric and bijective.
Abstract | In this paper, we explore a novel bilingual word alignment approach based on DNN (Deep Neural Network), which has been proven to be very effective in various machine learning tasks (Collobert et al., 2011). |
Abstract | We describe in detail how we adapt and extend the CD-DNN-HMM (Dahl et al., 2012) method introduced in speech recognition to the HMM-based word alignment model, in which bilingual word embedding is discriminatively learnt to capture lexical translation information, and surrounding words are leveraged to model context information in bilingual sentences.
Abstract | Experiments on a large scale English-Chinese word alignment task show that the proposed method outperforms the HMM and IBM model 4 baselines by 2 points in F-score. |
DNN for word alignment | Our DNN word alignment model extends classic HMM word alignment model (Vogel et al., 1996). |
DNN for word alignment | Given a sentence pair (e, f), HMM word alignment takes the following form: |
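Concretely, the HMM factors the joint probability of f and the alignment a into a product of jump and lexical terms, p(f, a | e) = ∏_j p_d(a_j − a_{j−1}) · t(f_j | e_{a_j}); the DNN variant replaces the count-based tables with network scores but keeps this factorization. A direct transcription of the classic form (toy tables, no NULL state, a_0 fixed to 0) is:

```python
def hmm_joint_prob(f, e, a, t, d):
    """p(f, a | e) = prod_j d(a_j - a_{j-1}) * t(f_j | e_{a_j}),
    with the previous state a_0 conventionally set to 0 here.
    t: lexical table {(f_word, e_word): prob}; d: jump table {distance: prob}."""
    p, prev = 1.0, 0
    for j, i in enumerate(a):
        p *= d.get(i - prev, 0.0) * t.get((f[j], e[i]), 0.0)
        prev = i
    return p
```

Because the score of each link depends only on the previous link, exact inference over a stays tractable even when the individual terms come from a neural network.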
Introduction | Inspired by successful previous works, we propose a new DNN-based word alignment method, which exploits contextual and semantic similarities between words. |
Introduction | Figure 1: Two examples of word alignment |
Introduction | In the rest of this paper, related work on DNN and word alignment is first reviewed in Section 2, followed by a brief introduction of DNN in Section 3.
Related Work | Among the related work on word alignment, the most popular methods are based on generative models such as the IBM Models (Brown et al., 1993) and the HMM (Vogel et al., 1996).
Related Work | Discriminative approaches have also been proposed, using hand-crafted features to improve word alignment.
Abstract | We show that the recovered empty categories not only improve the word alignment quality, but also lead to significant improvements in a large-scale state-of-the-art syntactic MT system. |
Experimental Results | Then we run GIZA++ (Och and Ney, 2000) to generate the word alignment for each direction and apply grow-diag-final (Koehn et al., 2003), the same as in the baseline.
Integrating Empty Categories in Machine Translation | With the preprocessed MT training corpus, an unsupervised word aligner, such as GIZA++, can be used to generate automatic word alignment, as the first step of a system training pipeline.
Integrating Empty Categories in Machine Translation | The effect of inserting ECs is twofold: first, it can impact the automatic word alignment, since it now allows the target-side words, especially the function words, to align to the inserted ECs and fix some errors in the original word alignment; second, new phrases and rules can be extracted from the preprocessed training data.
Integrating Empty Categories in Machine Translation | A few examples of the extracted Hiero rules and tree-to-string rules are also listed, which we would not have been able to extract from the original incorrect word alignment when the *pro* was missing. |
Introduction | In addition, the pro-drop problem can also degrade the word alignment quality in the training data. |
Introduction | A sentence pair observed in the real data is shown in Figure 1 along with the word alignment obtained from an automatic word aligner, where the English subject pronoun
Introduction | Figure 1: Example of incorrect word alignment due to missing pronouns on the Chinese side. |
Conclusion and Future work | The approach also exploits methods from statistical MT (word alignment) and therefore integrates techniques from statistical syntactic parsing, MT, and compositional semantics to produce an effective semantic parser.
Experimental Evaluation | We also evaluated the impact of the word alignment component by replacing Giza++ by gold-standard word alignments manually annotated for the CLANG corpus. |
Experimental Evaluation | The results consistently showed that compared to using gold-standard word alignment, Giza++ produced lower semantic parsing accuracy when given very little training data, but similar or better results when given sufficient training data (> 160 examples).
Experimental Evaluation | This suggests that, given sufficient data, Giza++ can produce effective word alignments, and that imperfect word alignments do not seriously impair our semantic parsers since the disambiguation model evaluates multiple possible interpretations of ambiguous words. |
Introduction | The learning system first employs a word alignment method from statistical machine translation (GIZA++ (Och and Ney, 2003)) to acquire a semantic lexicon that maps words to logical predicates. |
Learning Semantic Knowledge | We use an approach based on Wong and Mooney (2006), which constructs word alignments between NL sentences and their MRs. |
Learning Semantic Knowledge | Normally, word alignment is used in statistical machine translation to match words in one NL to words in another; here it is used to align words with predicates based on a “parallel corpus” of NL sentences and MRs. We assume that each word alignment defines a possible mapping from words to predicates for building a SAPT and a semantic derivation that compose the correct MR. A semantic lexicon and composition rules are then extracted directly from the
Learning Semantic Knowledge | Generation of word alignments for each training example proceeds as follows. |
Learning a Disambiguation Model | Here, unique word alignments are not required, and alternative interpretations compete for the best semantic parse. |
Abstract | In this paper we present a confidence measure for word alignment based on the posterior probability of alignment links. |
Abstract | Based on these measures, we improve the alignment quality by selecting high confidence sentence alignments and alignment links from multiple word alignments of the same sentence pair. |
Abstract | Additionally, we remove low confidence alignment links from the word alignment of a bilingual training corpus, which increases the alignment F-score, improves Chinese-English and Arabic-English translation quality and significantly reduces the phrase translation table size. |
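The link-removal step described in this abstract amounts to thresholding alignment links on their posterior confidence. A minimal sketch of that idea (function name, threshold value, and toy posteriors are all illustrative, not from the paper):

```python
def filter_links(links, posterior, threshold=0.4):
    """Keep only alignment links whose posterior link probability
    meets the confidence threshold (threshold value is illustrative)."""
    return {link for link in links if posterior.get(link, 0.0) >= threshold}

# Toy example: links are (source index, target index) pairs.
links = {(0, 0), (1, 2), (2, 1)}
posterior = {(0, 0): 0.95, (1, 2): 0.15, (2, 1): 0.80}
print(sorted(filter_links(links, posterior)))  # the low-confidence link (1, 2) is dropped
```

Removing such low-confidence links shrinks the set of extractable phrase pairs, which is consistent with the reported reduction in phrase-table size.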
Introduction | Many MT systems, such as statistical phrase-based and syntax-based systems, learn phrase translation pairs or translation rules from large amounts of bilingual data with word alignment.
Introduction | The quality of the parallel data and the word alignment have significant impacts on the learned translation models and ultimately the quality of translation output. |
Introduction | Given the huge amount of bilingual training data, word alignments are automatically generated using various algorithms (Brown et al., 1994; Vogel et al., 1996)
Abstract | Previous work has shown that a reordering model can be learned from high quality manual word alignments to improve machine translation performance. |
Abstract | In this paper, we focus on further improving the performance of the reordering model (and thereby machine translation) by using a larger corpus of sentence aligned data for which manual word alignments are not available but automatic machine generated alignments are available. |
Abstract | To mitigate the effect of noisy machine alignments, we propose a novel approach that improves reorderings produced given noisy alignments and also improves word alignments using information from the reordering model. |
Introduction | These methods use a small corpus of manual word alignments (where the words in the source sentence are manually aligned to the words in the target sentence) to learn a model to preorder the source sentence to match target order. |
Introduction | In this paper, we build upon the approach in (Visweswariah et al., 2011) which uses manual word alignments for learning a reordering model. |
Introduction | Specifically, we show that we can significantly improve reordering performance by using a large number of sentence pairs for which manual word alignments are not available. |
Parallel Segment Retrieval | Finally, a represents the word alignment between the words in the left and the right segments. |
Parallel Segment Retrieval | Then, we would use a word alignment model (Brown et al., 1993; Vogel et al., 1996), with source s = sup, .
Parallel Segment Retrieval | Finally, from the probability of the word alignments, we can determine whether the segments are parallel.
Basics of ITG | From the viewpoint of word alignment , the terminal unary rules provide the links of word pairs, whereas the binary rules represent the reordering factor. |
Basics of ITG | First of all, it imposes a 1-to-1 constraint on word alignment.
Basics of ITG | Secondly, the simple ITG leads to redundancy if word alignment is the sole purpose of applying ITG. |
Basics of ITG Parsing | Based on the rules in normal form, ITG word alignment is done in a similar way to chart parsing (Wu, 1997). |
Conclusion and Future Work | This paper reviews word alignment through ITG parsing, and clarifies the problem of ITG pruning. |
Evaluation | Table 3 lists the word alignment time cost and SMT performance of different pruning methods. |
Introduction | For this reason ITG has gained more and more attention recently in the word alignment community (Zhang and Gildea, 2005; Cherry and Lin, 2006; Haghighi et al., 2009). |
The DPDI Framework | Discriminative approaches to word alignment use manually annotated alignment for sentence pairs. |
The DPDI Framework | However, in reality there are often cases where a foreign word aligns to more than one English word.
Alignment | A phrase extraction is performed for each training sentence pair separately using the same word alignment as for the initialization. |
Experimental Evaluation | For the heuristic phrase model, we first use GIZA++ (Och and Ney, 2003) to compute the word alignment on TRAIN. |
Experimental Evaluation | Next we obtain a phrase table by extracting phrases from the word alignment.
Introduction | Figure: two training pipelines compared — Viterbi word alignment with word translation models trained by the EM algorithm, followed by heuristic phrase extraction; versus phrase alignment with phrase translation models trained by the EM algorithm, yielding phrase translation counts and probabilities. Both pipelines produce a phrase translation table.
Phrase Model Training | The simplest of our generative phrase models estimates phrase translation probabilities by their relative frequencies in the Viterbi alignment of the data, similar to the heuristic model but with counts from the phrase-aligned data produced in training rather than computed on the basis of a word alignment . |
Related Work | Their results show that it cannot reach performance competitive with extracting a phrase table from the word alignment by heuristics (Och et al., 1999).
Related Work | In addition, we do not restrict the training to phrases consistent with the word alignment , as was done in (DeNero et al., 2006). |
Related Work | This allows us to recover from flawed word alignments.
Experiment | GIZA++ and grow-diag-final-and heuristics were used to obtain word alignments.
Experiment | In order to reduce word alignment errors, we removed articles {a, an, the} in English and particles {ga, wo, wa} in Japanese before performing word alignment, because these function words do not correspond to any words in the other language.
Experiment | After word alignment, we restored the removed words and shifted the word alignment positions to the original word positions. |
Proposed Method | The lines represent word alignments.
Proposed Method | The English side arrows point to the nearest word aligned on the right. |
Proposed Method | The training data is built from a parallel corpus and word alignments between corresponding source words and target words. |
Introduction | The algorithm uses learned word alignments to aggressively generalize the seeds, producing a large set of possible lexical equivalences. |
Learning | We call this procedure InduceLex(x, x′, y, A), which takes a paraphrase pair (x, x′), a derivation y of x, and a word alignment A, and returns a new set of lexical entries.
Learning | A word alignment A between x and x′ is a subset of [n] × [n′]. A phrase alignment is a pair of index sets (I, I′), where I ⊆ [n] and I′ ⊆ [n′]. A phrase alignment (I, I′) is consistent with a word alignment A if for all (i, i′) ∈ A, i ∈ I if and only if i′ ∈ I′.
Learning | In other words, a phrase alignment is consistent with a word alignment if the words in the phrases are aligned only with each other, and not with any outside words. |
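This consistency condition (phrase words aligned only with each other, never with outside words) translates directly into code. A minimal sketch, with illustrative names and toy data:

```python
def consistent(I, I_prime, A):
    """A phrase alignment (I, I') is consistent with word alignment A
    iff for every link (i, i') in A, i is in I exactly when i' is in I'."""
    return all((i in I) == (ip in I_prime) for (i, ip) in A)

# Word alignment links as (source index, target index) pairs.
A = {(0, 0), (1, 2)}
print(consistent({0, 1}, {0, 2}, A))  # True: both links stay inside the phrase pair
print(consistent({0}, {0, 2}, A))     # False: link (1, 2) crosses the phrase boundary
```

The same predicate underlies standard phrase extraction in PB-SMT: only phrase pairs passing this check are added to the phrase table.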
Backgrounds | 1These numbers are language/corpus-dependent and are not necessarily to be taken as a general reflection of the overall quality of the word alignments for arbitrary language pairs. |
Composed Rule Extraction | Input: HPSG forest Fs, target sentence T, word alignment A = {(i, j)}, target function word set {fw} appearing in T, and target chunk set {C}
Introduction | However, forest-based translation systems, and, in general, most linguistically syntax-based SMT systems (Galley et al., 2004; Galley et al., 2006; Liu et al., 2006; Zhang et al., 2007; Mi et al., 2008; Liu et al., 2009; Chiang, 2010), are built upon word-aligned parallel sentences and thus share a critical dependence on word alignments.
Introduction | For example, even a single spurious word alignment can invalidate a large number of otherwise extractable rules, and unaligned words can result in an exponentially large set of extractable rules for the interpretation of these unaligned words (Galley et al., 2006). |
Introduction | What makes word alignment so fragile? |
Related Research | By dealing with the ambiguous word alignment instead of unaligned target words, syntax-based realignment models were proposed by (May |
Related Research | Specifically, we observed that most incorrect or ambiguous word alignments are caused by function words rather than content words.
Experiments | The alignment was obtained using GIZA++ (Och and Ney, 2003) and then we symmetrized the word alignment using the grow-diag-final heuristic.
Extraction of Paraphrase Rules | 3.3 Word Alignments Filtering |
Extraction of Paraphrase Rules | We can construct word alignment between S0 and S1 through T0.
Extraction of Paraphrase Rules | On the initial corpus of (S0, T0), we conduct word alignment with Giza++ (Och and Ney, 2000) in both directions and then apply the grow-diag-final heuristic (Koehn et al., 2005) for symmetrization.
Abstract | The ranking model is automatically derived from word-aligned parallel data with a syntactic parser for the source language, based on both lexical and syntactical features.
Experiments | We use Giza++ (Och and Ney, 2003) to generate the word alignment for the parallel corpus. |
Experiments | By manual analysis, we find that the gap is due to both errors of the ranking reorder model and errors from word alignment and parser. |
Experiments | The reason is that our annotators tend to align function words which might be left unaligned by the automatic word aligner.
Introduction | The ranking model is automatically derived from the word aligned parallel data, viewing the source tree nodes to be reordered as list items to be ranked. |
Ranking Model Training | As pointed out by (Li et al., 2007), in practice, nodes often have overlapping target spans due to erroneous word alignment or different syntactic structures between source and target sentences. |
Word Reordering as Syntax Tree Node Ranking | Constituent tree is shown above the source sentence; arrows below the source sentences show head-dependent arcs for dependency tree; word alignment links are lines without arrow between the source and target sentences. |
Introduction | DNN is also introduced to Statistical Machine Translation (SMT) to learn several components or features of the conventional framework, including word alignment, language modelling, translation modelling and distortion modelling.
Introduction | (2013) adapt and extend the CD-DNN-HMM (Dahl et al., 2012) method to HMM-based word alignment model. |
Phrase Pair Embedding | where f_{a_i} is the corresponding target word aligned to e_i, and similarly for e_{a_j}.
Phrase Pair Embedding | The recurrent neural network is trained with a word-aligned bilingual corpus, similar to (Auli et al., 2013).
Related Work | (2013) adapt and extend CD-DNN-HMM (Dahl et al., 2012) to word alignment . |
Related Work | Word embeddings capturing lexical translation information and surrounding words modeling context information are leveraged to improve the word alignment performance. |
Related Work | Unfortunately, the better word alignment result generated by this model cannot bring significant performance improvement on an end-to-end SMT evaluation task.
Abstract | We rederive all the steps of KN smoothing to operate on count distributions instead of integral counts, and apply it to two tasks where KN smoothing was not applicable before: one in language model adaptation, and the other in word alignment . |
Introduction | One is language model domain adaptation, and the other is word alignment using the IBM models (Brown et al., 1993). |
Word Alignment | In this section, we show how to apply expected KN to the IBM word alignment models (Brown et al., 1993). |
Word Alignment | Of course, expected KN can be applied to other instances of EM besides word alignment . |
Word Alignment | The IBM models and related models define probability distributions p(a, f | e, θ), which model how likely a French sentence f is to be generated from an English sentence e with word alignment a.
Comparative Study | monotone for current phrase, if a word alignment to the bottom left (point A) exists and there is no word alignment point at the bottom right position (point B).
Comparative Study | swap for current phrase, if a word alignment to the bottom right (point B) exists and there is no word alignment point at the bottom left position (point A).
Comparative Study | When one source word is aligned to multiple target words, we duplicate the source word for each target word, e.g.
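The monotone/swap rules above can be sketched as a small classifier over alignment points. The exact matrix coordinates of point A (bottom left) and point B (bottom right) relative to the phrase are an assumption here, following the common lexicalized-reordering convention:

```python
def orientation(points, s_start, s_end, t_start):
    """Classify phrase orientation from word alignment points, following
    the monotone/swap rules above. Coordinates of point A (bottom left)
    and point B (bottom right) are assumptions for illustration."""
    point_a = (s_start - 1, t_start - 1) in points  # bottom left of the phrase box
    point_b = (s_end + 1, t_start - 1) in points    # bottom right of the phrase box
    if point_a and not point_b:
        return "monotone"
    if point_b and not point_a:
        return "swap"
    return "other"

# Diagonal alignment: the phrase starting at source 1 / target 1 is monotone.
print(orientation({(0, 0), (1, 1), (2, 2)}, 1, 1, 1))  # "monotone"
```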
Experiments | The main reason is that the “annotated” corpus is converted from word alignments, which contain many errors.
Tagging-style Reordering Model | The first step is word alignment training. |
Tagging-style Reordering Model | We also have the word alignment within the new phrase pair, which is stored during the phrase extraction process. |
Abstract | The notion of fertility in word alignment (the number of words emitted by a single state) is useful but difficult to model. |
Evaluation | We explore the impact of this improved MAP inference procedure on a task in German-English word alignment . |
Evaluation | For training data we use the news commentary data from the WMT 2012 translation task.1 120 of the training sentences were manually annotated with word alignments . |
HMM alignment | …, f_J and word alignment vectors a = a_1, …
HMM alignment | For the standard HMM, there is a dynamic programming algorithm to compute the posterior probability over word alignments Pr(a | e, f). These are the sufficient statistics gathered in the E step of EM.
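The dynamic programming algorithm referred to here is forward-backward. A toy sketch of computing link posteriors Pr(a_j = i | e, f) for an HMM aligner; the uniform start distribution and the absence of a NULL state and jump binning are simplifying assumptions, not features of any particular paper's model:

```python
import numpy as np

def link_posteriors(trans, emit):
    """Forward-backward posteriors Pr(a_j = i | e, f) for a toy HMM aligner.
    trans[k, i] = p(a_j = i | a_{j-1} = k); emit[i, j] = p(f_j | e_i).
    A real aligner would add a NULL state and relative jump parameters."""
    I, J = emit.shape
    fwd = np.zeros((J, I))
    bwd = np.zeros((J, I))
    fwd[0] = emit[:, 0] / I                       # uniform start distribution (assumption)
    for j in range(1, J):
        fwd[j] = (fwd[j - 1] @ trans) * emit[:, j]
    bwd[J - 1] = 1.0
    for j in range(J - 2, -1, -1):
        bwd[j] = trans @ (emit[:, j + 1] * bwd[j + 1])
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)  # one distribution per target word f_j

trans = np.array([[0.7, 0.3], [0.3, 0.7]])
emit = np.array([[0.9, 0.1], [0.1, 0.9]])  # e_0 explains f_0, e_1 explains f_1
print(link_posteriors(trans, emit).round(2))  # each f_j prefers its matching e_i
```

Row j of the returned matrix is exactly the E-step statistic mentioned above (expected link counts are accumulated from these posteriors).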
Introduction | These word alignments are a crucial training component in most machine translation systems. |
Introduction | Models 2 and 3 incorporate a positional model based on the absolute position of the word; Models 4 and 5 use a relative position model instead (an English word tends to align to a French word that is near the French word aligned to the previous English word).
Conclusion and Future Work | In the future, we will work on leveraging parallel sentences and word alignments for other tasks in sentiment analysis, such as building multilingual sentiment lexicons. |
Cross-Lingual Mixture Model for Sentiment Classification | We estimate word projection probability using word alignment probability generated by the Berkeley aligner (Liang et al., 2006). |
Cross-Lingual Mixture Model for Sentiment Classification | The word alignment probabilities serve two purposes.
Cross-Lingual Mixture Model for Sentiment Classification | Figure 2 gives an example of word alignment probability. |
Data and task | Word alignment features |
Data and task | We exploit a feature set based on HMM word alignments in both directions (Och and Ney, 2000).
Data and task | The first oracle ORACLE1 has access to the gold-standard English entities and gold-standard word alignments among English and foreign words.
Experiments | We obtained word alignments of the training data by first running GIZA++ (Och and Ney, 2003) and then applying the refinement rule “grow-diag-final-and” (Koehn et al., 2003). |
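The "grow-diag" core of the grow-diag-final-and refinement rule mentioned here can be sketched as follows. This is a simplified illustration (the "final-and" pass that Koehn et al. (2003) apply afterwards is omitted); alignments are sets of (source, target) index pairs:

```python
def grow_diag(e2f, f2e):
    """Start from the intersection of the two directional alignments and
    grow into the union along (diagonal) neighbours, adding a candidate
    link only if its source or target word is still unaligned."""
    union = e2f | f2e
    aligned = set(e2f & f2e)
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    changed = True
    while changed:
        changed = False
        for (i, j) in sorted(aligned):
            for di, dj in neighbours:
                cand = (i + di, j + dj)
                src_free = all(s != cand[0] for s, _ in aligned)
                tgt_free = all(t != cand[1] for _, t in aligned)
                if cand in union and cand not in aligned and (src_free or tgt_free):
                    aligned.add(cand)
                    changed = True
    return aligned

# Two directional alignments disagree on the second source word.
print(sorted(grow_diag({(0, 0), (1, 1)}, {(0, 0), (1, 2)})))
```

Starting from the intersection {(0, 0)}, both disputed links are grown in because each covers a still-unaligned target word; the real heuristic additionally post-processes remaining unaligned words in the "final" step.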
Introduction | The solid lines denote hyperedges and the dashed lines denote word alignments . |
Model | The solid lines denote hyperedges and the dashed lines denote word alignments between the two forests. |
Rule Extraction | By constructing a theory that gives formal semantics to word alignments, Galley et al.
Rule Extraction | Their GHKM procedure draws connections among word alignments, derivations, and rules.
Rule Extraction | They first identify the tree nodes that subsume tree-string pairs consistent with word alignments and then extract rules from these nodes. |
Introduction | For dependency projection, the relationship between words in the parsed sentences can be simply projected across the word alignment to words in the unparsed sentences, according to the DCA assumption (Hwa et al., 2005). |
Introduction | Such a projection procedure suffers much from the word alignment errors and syntactic isomerism between languages, which usually lead to relationship projection conflict and incomplete projected dependency structures. |
Introduction | Because of the free translation, the syntactic isomerism between languages and word alignment errors, it would be strained to completely project the dependency structure from one language to another. |
Projected Classification Instance | In order to alleviate the effect of word alignment errors, we base the projection on the alignment matrix, a compact representation of multiple GIZA++ (Och and Ney, 2000) results, rather than a single word alignment in previous dependency projection works. |
Projected Classification Instance | Figure 2: The word alignment matrix between a Chinese sentence and its English translation. |
Related Works | Because of the free translation, the word alignment errors, and the heterogeneity between the two languages, it is difficult and less effective to project the dependency tree completely onto the target language sentence.
Abstract | These translation tasks are characterized by the relative ability to commit to parallel parse trees and availability of word alignments , yet the unavailability of large-scale data, calling for a Bayesian tree-to-tree formalism. |
Conclusion | The future for this work would involve natural extensions such as mixing over the space of word alignments; this would allow application to MT-like tasks where flexible word reordering is allowed, such as abstractive sentence compression and paraphrasing.
Introduction | One approach is to use word alignments (where these can be reliably estimated, as in our testbed application) to align subtrees and extract rules (Och and Ney, 2004; Galley et al., 2004) but this leaves open the question of finding the right level of generality of the rules — how deep the rules should be and how much lexicalization they should involve — necessitating resorting to heuristics such as minimality of rules, and leading to |
Introduction | possibility of searching over the infinite space of grammars (and, in machine translation, possible word alignments), thus sidestepping the narrowness problem outlined above as well.
Introduction | This task is characterized by the availability of word alignments , providing a clean testbed for investigating the effects of grammar extraction. |
The STSG Model | In particular, we visit every tree pair and each of its source nodes i, and update its alignment by selecting between and within two choices: (a) unaligned, (b) aligned with some target node j or e. The number of possibilities j in (b) is significantly limited, firstly by the word alignment (for instance, a source node dominating a deleted subspan cannot be aligned with a target node), and secondly by the current alignment of other nearby aligned source nodes. |
Conclusions and Future Work | In this paper the model was only used to infer word alignments ; in future work we intend to develop a decoding algorithm for directly translating with the model. |
Experiments | However in this paper we limit our focus to inducing word alignments , i.e., by using the model to infer alignments which are then used in a standard phrase-based translation pipeline. |
Experiments | We present results on translation quality and word alignment . |
Gibbs Sampling | Given the structure of our model, a word alignment uniquely specifies the translation decisions and the sequence follows the order of the target sentence left to right. |
Introduction | In this paper we propose a new model to drop the independence assumption, by instead modelling correlations between translation decisions, which we use to induce translation derivations from aligned sentences (akin to word alignment).
Model | Given a source sentence, our model infers a latent derivation which produces a target translation and meanwhile gives a word alignment between the source and the target. |
Conclusions | We presented a novel information theoretic model for bilingual word clustering which seeks a clustering with high average mutual information between clusters of adjacent words, and also high mutual information across observed word alignment links. |
Experiments | The corpus was word aligned in two directions using an unsupervised word aligner (Dyer et al., 2013), then the intersected alignment points were taken. |
Experiments | For Turkish the F1 score improves by 1.0 point over the setting with no distributional clusters, which clearly shows that the word alignment information improves the clustering quality.
Experiments | Thus we propose to further refine the quality of word alignment links as follows: let x be a word in one language and y be a word in the other language, and let there exist an alignment link between x and y.
Introduction | The second term ensures that the cluster alignments induced by a word alignment have high mutual information across languages (§2.2). |
Word Clustering | For concreteness, A(x, y) will be the number of times that x is aligned to y in a word-aligned parallel corpus.
Experiments | They employed a word alignment model to capture opinion relations among words, and then used a random walk algorithm to extract opinion targets.
Experiments | Second, our method captures semantic relations using topic modeling and captures opinion relations through word alignments, which are more precise than Hai, which merely uses co-occurrence information to indicate such relations among words.
Introduction | They have investigated a series of techniques to enhance opinion relations identification performance, such as nearest neighbor rules (Liu et al., 2005), syntactic patterns (Zhang et al., 2010; Popescu and Etzioni, 2005), word alignment models (Liu et al., 2012; Liu et al., 2013b; Liu et al., 2013a), etc. |
Related Work | (Liu et al., 2012; Liu et al., 2013a; Liu et al., 2013b) employed a word alignment model to capture opinion relations rather than syntactic parsing.
The Proposed Method | This approach casts the capture of opinion relations as a monolingual word alignment process.
The Proposed Method | After performing word alignment, we obtain a set of word pairs composed of a noun (noun phrase) and its corresponding modified word.
A semantic span can include one or more eus. | Instead, we preserve the cohesive information in the training process by converting the original source sentence into tagged-flattened CSS, and then perform word alignment and extract the translation rules from the bilingual flattened source CSS and the target string.
A semantic span can include one or more eus. | We then perform word alignment on the modified bilingual sentences, and extract the new translation rules based on the new alignment, as shown in Figure 3(b) to Figure 3(c). |
A semantic span can include one or more eus. | bound by the word alignment, the alignment complies with EUC only if there is no overlap between pSA and pSB.
Experiments | We obtain the word alignment with the grow-diag-final-and strategy with GIZA++. |
Experiments | The merits of “Flattened Rule” are twofold: 1) In training process, the new word alignment upon modified sentence pairs can align transitional expressions to flattened CSS tags; 2) In decoding process, the CSS-based rules are more discriminating than the original rules, which is more flexible than “TFS”. |
Experiments | Following (Levenberg et al., 2012; Neubig et al., 2011), we evaluate our model by using its output word alignments to construct a phrase table. |
Experiments | As a baseline, we train a phrase-based model using the Moses toolkit based on the word alignments obtained using GIZA++ in both directions and symmetrized using the grow-diag-final-and heuristic (Koehn et al., 2003).
Experiments | These are taken from the final Model 4 word alignments, using the intersection of the source-target and target-source models.
Related Work | In the context of machine translation, ITG has been explored for statistical word alignment in both unsupervised (Zhang and Gildea, 2005; Cherry and Lin, 2007; Zhang et al., 2008; Pauls et al., 2010) and supervised (Haghighi et al., 2009; Cherry and Lin, 2006) settings, and for decoding (Petrov et al., 2008). |
Related Work | Our paper fits into the recent line of work for jointly inducing the phrase table and word alignment (DeNero and Klein, 2010; Neubig et al., 2011). |
Introduction | Our syntactic constituent reordering model considers context free grammar (CFG) rules in the source language and predicts the reordering of their elements on the target side, using word alignment information. |
Introduction | We introduce novel soft reordering constraints, using syntactic constituents or semantic roles, composed over word alignment information in translation rules used during decoding time; |
Unified Linguistic Reordering Models | parse tree and its word alignment links to the target language. |
Unified Linguistic Reordering Models | Unlike the conventional phrase and lexical translation features, whose values are phrase pair-determined and thus can be calculated offline, the value of the reordering features can only be obtained during decoding time, and requires word alignment information as well. |
Unified Linguistic Reordering Models | Before we present the algorithm integrating the reordering models, we define the following functions by assuming XP_i and XP_{i+1} are the constituent pair of interest in CFG rule cfg, H is the translation hypothesis, and a is its word alignment:
Approach Overview | To establish a soft correspondence between the two languages, we use a second similarity function, which leverages standard unsupervised word alignment statistics (§3.3).
Graph Construction | The word alignment methods do not use POS information.
Graph Construction | To define a similarity function between the English and the foreign vertices, we rely on high-confidence word alignments . |
Graph Construction | Since our graph is built from a parallel corpus, we can use standard word alignment techniques to align the English sentences “De |
Clustering for Cross Lingual Sentiment Analysis | Given a parallel bilingual corpus, word clusters in S can be aligned to clusters in T. Word alignments are created using parallel corpora. |
Clustering for Cross Lingual Sentiment Analysis | Here, L_{T|S} and L_{S|T}(...) are factors based on word alignments, which can be represented as:
Conclusion and Future Work | A naive cluster linkage algorithm based on word alignments was used to perform CLSA. |
Introduction | To perform CLSA, this study leverages unlabelled parallel corpus to generate the word alignments . |
Introduction | These word alignments are then used to link cluster based features to obliterate the language gap for performing SA. |
Experiments | It is not surprising, since Bannard and Callison-Burch (2005) have pointed out that word alignment error is the major factor that influences the performance of the methods learning paraphrases from bilingual corpora. |
Experiments | The LW based features validate the quality of word alignment and assign low scores to those aligned EC pattern pairs with incorrect alignment. |
Introduction | parsing and English-foreign language word alignment, (2) aligned patterns induction, which produces English patterns along with the aligned pivot patterns in the foreign language, (3) paraphrase patterns extraction, in which paraphrase patterns are extracted based on a log-linear model.
Proposed Method | We conduct word alignment with Giza++ (Och and Ney, 2000) in both directions and then apply the grow-diag heuristic (Koehn et al., 2005) for symmetrization. |
Proposed Method | where a denotes the word alignment between e and e′, and n is the number of words in e.
Experiment | We run GIZA++ and then employ the grow-diag-final-and (gdfa) strategy to produce symmetric word alignments.
Inside Context Integration | We demand that every element and its corresponding target span must be consistent with word alignment.
Inside Context Integration | Note that we only apply the source-side PAS and word alignment for IC-PASTR extraction.
Inside Context Integration | Thus to get a high recall for PASs, we only utilize word alignment instead of capturing the relation between bilingual elements. |
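The requirement above, that a source element and its target span be consistent with the word alignment, is the standard block-consistency check from phrase extraction. A hedged sketch (not the authors' code; the function name and representation are illustrative):

```python
def consistent(A, i, j, p, q):
    """Check whether source span [i, j] and target span [p, q] are
    consistent with word alignment A (a set of (s, t) links): no link may
    cross the block boundary, and the block must contain at least one link."""
    inside = False
    for (s, t) in A:
        src_in = i <= s <= j
        tgt_in = p <= t <= q
        if src_in != tgt_in:      # a link leaves the block: inconsistent
            return False
        inside = inside or (src_in and tgt_in)
    return inside
```

Any link with exactly one endpoint inside the block rules the pair out, which is what keeps extracted units translationally self-contained.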
Maximum Entropy PAS Disambiguation (MEPD) Model | t_range(PAS) refers to the target range covering all the words that are reachable from the PAS via word alignment.
Approach | First, parser and word alignment errors cause much of the transferred information to be wrong. |
Experiments | For both corpora, we performed word alignments with the open source PostCAT (Graca et al., 2009) toolkit. |
Experiments | Preliminary experiments showed that our word alignments were not always appropriate for syntactic transfer, even when they were correct for translation. |
Introduction | Nevertheless, several challenges to accurate training and evaluation from aligned bitext remain: (1) partial word alignment due to non-literal or distant translation; (2) errors in word alignments and source language parses; (3) grammatical annotation choices that differ across languages and linguistic theories (e.g., how to analyze auxiliary verbs, conjunctions).
Related Work | (2005) found that transferring dependencies directly was not sufficient to get a parser with reasonable performance, even when both the source language parses and the word alignments are performed by hand. |
Conclusions and Future Work | In addition, word alignment is a hard constraint in our rule extraction. |
Conclusions and Future Work | We will study direct structure alignments to reduce the impact of word alignment errors. |
Experiments | We used GIZA++ (Och and Ney, 2004) and the heuristics “grow-diag-final” to generate m-to-n word alignments.
Experiments | (2006) reports that discontinuities are very useful for translational equivalence analysis using binary-branching structures under word alignment and parse tree constraints, while they are of almost no use under word alignment constraints only.
Cohesive Decoding | showed that a soft cohesion constraint is superior to a hard constraint for word alignment.
Cohesive Phrasal Output | Previous approaches to measuring the cohesion of a sentence pair have worked with a word alignment (Fox, 2002; Lin and Cherry, 2003). |
Experiments | Word alignments are provided by GIZA++ (Och and Ney, 2003) with grow-diag-final combination, with infrastructure for alignment combination and phrase extraction provided by the shared task. |
Introduction | Fox (2002) showed that cohesion is held in the vast majority of cases for English-French, while Cherry and Lin (2006) have shown it to be a strong feature for word alignment.
Experiments | We also extract the bidirectional word alignments between Chinese and English using GIZA++ (Och and Ney, 2003). |
Experiments | While ptLDA-align performs better than baseline SMT and LDA, it is worse than ptLDA-dict, possibly because of errors in the word alignments, making the tree priors less effective.
Introduction | Topic models bridge the chasm between languages using document connections (Mimno et al., 2009), dictionaries (Boyd-Graber and Resnik, 2010), and word alignments (Zhao and Xing, 2006). |
Polylingual Tree-based Topic Models | In addition, we extract the word alignments from aligned sentences in a parallel corpus. |
Decoding with the NNJM | For aligned target words, the normal affiliation heuristic can be used, since the word alignment is available within the rule.
Model Variations | We treat NULL as a normal target word, and if a source word aligns to multiple target words, it is treated as a single concatenated token. |
Model Variations | For word alignment, we align all of the training data with both GIZA++ (Och and Ney, 2003) and NILE (Riesa et al., 2011), and concatenate the corpora together for rule extraction.
Neural Network Joint Model (NNJM) | This notion of affiliation is derived from the word alignment, but unlike word alignment, each target word must be affiliated with exactly one non-NULL source word.
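The affiliation notion described above (exactly one non-NULL source word per target word) can be sketched as follows. The tie-breaking choices here, taking the middle link of an aligned word and letting an unaligned word copy its nearest originally-aligned neighbour with a rightward preference, are assumptions for illustration and may differ from the original system's exact rules.

```python
def affiliations(tgt_len, links):
    """Return one source index per target position (its 'affiliation').
    links: set of (src_idx, tgt_idx) word-alignment links."""
    aff = [None] * tgt_len
    by_tgt = {}
    for s, t in links:
        by_tgt.setdefault(t, []).append(s)
    for t, srcs in by_tgt.items():
        aff[t] = sorted(srcs)[len(srcs) // 2]   # middle source word of the links
    orig = aff[:]                # unaligned words copy from originally aligned ones
    for t in range(tgt_len):
        if aff[t] is None:
            for d in range(1, tgt_len):         # nearest neighbour, right first
                if t + d < tgt_len and orig[t + d] is not None:
                    aff[t] = orig[t + d]
                    break
                if t - d >= 0 and orig[t - d] is not None:
                    aff[t] = orig[t - d]
                    break
    return aff
```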
Experiments and evaluation | We use the hierarchical translation system that comes with the Moses SMT package and GIZA++ to compute the word alignment, using the “grow-diag-final-and” heuristics.
Translation pipeline | We consider gender as part of the stem, whereas the value for number is derived from the source-side: if marked for number, singular/plural nouns are distinguished during word alignment and then translated accordingly. |
Using subcategorization information | Figure 1: Deriving features from dependency-parsed English data via the word alignment.
Using subcategorization information | to the SMT output via word alignment . |
Empirical Evaluation | We use GIZA++ (Och and Ney, 2003) to produce word alignments in Europarl: we ran it in both directions and kept the intersection of the induced word alignments . |
Empirical Evaluation | We mark arguments in two languages as aligned if there is any word alignment between the corresponding sets and if they are arguments of aligned predicates. |
Multilingual Extension | In doing so, as in much of previous work on unsupervised induction of linguistic structures, we rely on automatically produced word alignments . |
Multilingual Extension | In Section 6, we describe how we use word alignment to decide if two arguments are aligned; for now, we assume that (noisy) argument alignments are given. |
Experiments | For Moses HPB, we use “grow-diag-final-and” to obtain symmetric word alignments, 10 for the maximum phrase length, and the recommended default values for all other parameters.
Experiments | We obtain the word alignments by running |
Head-Driven HPB Translation Model | Given the word alignment in Figure 1, Table 1 demonstrates the difference between hierarchical rules in Chiang (2007) and HD-HRs defined here. |
Introduction | Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence. |
Translation Model Architecture | The lexical weights lex(e|f) and lex(f|e) are calculated as follows, using a set of word alignments a between e and f:
Translation Model Architecture | w(s, t) and w(t, s) are not identical since the lexical probabilities are based on the unsymmetrized word alignment frequencies (in the Moses implementation which we re-implement).
Translation Model Architecture | In the unweighted variant, the resulting features are equivalent to training on the concatenation of all training data, excepting differences in word alignment, pruning and rounding.
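The lexical weight referred to in this passage is commonly computed as in Koehn et al. (2003): for each target word, average the lexical translation probabilities over its aligned source words, pairing unaligned words with NULL. A minimal sketch with made-up probabilities (the dictionary `w` and all names are illustrative):

```python
def lex_weight(e_words, f_words, links, w):
    """lex(e|f, a): product over target words of the average w(e_i|f_j)
    over the source words aligned to e_i (NULL if e_i is unaligned).
    `w` maps (e_word, f_word) pairs to illustrative probabilities."""
    prob = 1.0
    for i, e in enumerate(e_words):
        aligned = [f_words[j] for (ti, j) in links if ti == i]
        if not aligned:
            aligned = ["NULL"]            # unaligned words pair with NULL
        prob *= sum(w.get((e, f), 0.0) for f in aligned) / len(aligned)
    return prob
```

Because the links come from an unsymmetrized alignment, computing this in both directions generally gives two different scores, which is exactly the asymmetry noted above.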
Experiments | We ran GIZA++ on these corpora in both directions and then applied the “grow-diag-final” refinement rule to obtain word alignments.
Integrating the Two Models into SMT | We maintain word alignments for each phrase pair in the phrase table. |
Integrating the Two Models into SMT | Whenever a hypothesis covers a new verbal predicate v, we find the target translation e for v through word alignments and then calculate its translation probability pt(e|C) according to Eq.
Related Work | Therefore they either postpone the integration of target side PASs until the whole decoding procedure is completed (Wu and Fung, 2009b), or directly project semantic roles from the source side to the target side through word alignments during decoding (Liu and Gildea, 2010). |
Cross-lingual Annotation Projection for Relation Extraction | However, these automatic annotations can be unreliable because of source text misclassification and word alignment errors; thus, it can cause a critical falling-off in the annotation projection quality. |
Graph Construction | Das and Petrov (Das and Petrov, 2011) proposed a graph-based bilingual projection of part-of-speech tagging by considering the tagged words in the source language as labeled examples and connecting them to the unlabeled words in the target language, while referring to the word alignments . |
Graph Construction | If the context vertices U S for the source language sentences are defined, then the units of context in the target language can also be created based on the word alignments . |
Implementation | We used the GIZA++ software (Och and Ney, 2003) to obtain the word alignments for each bi-sentence in the parallel corpus.
Experiments | To obtain word-level alignments, we ran GIZA++ (Och and Ney, 2000) on the remaining corpus in both directions, and applied the “grow-diag-final” refinement rule (Koehn et al., 2005) to produce the final many-to-many word alignments.
Introduction | According to the word alignments, we define bracketable and unbracketable instances.
The Acquisition of Bracketing Instances | Let c and e be the source sentence and the target sentence, W be the word alignment between them, and T be the parse tree of c. We define a binary bracketing instance as a tuple ⟨b, τ(c_i..j), τ(c_j+1..k), τ(c_i..k)⟩ where b ∈ {bracketable, unbracketable}, c_i..j and c_j+1..k are two neighboring source phrases, and τ(T, s) (τ(s) for short) is a subtree function which returns the minimal subtree covering the source sequence s from the source parse tree T. Note that τ(c_i..k) includes both τ(c_i..j) and τ(c_j+1..k).
The Acquisition of Bracketing Instances | 1: Input: sentence pair (c, e), the parse tree T of c and the word alignment W between c and e 2: B := ∅ 3: for each (i, j, k) ∈ c do 4: if there exist a target phrase e_m..n aligned to c_i..j and e_p..q aligned to c_j+1..k then
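The acquisition step above can be approximated in runnable form. Note this sketch replaces the paper's subtree function τ with a pure alignment-consistency test, so it is a simplification under that assumption, not the authors' algorithm; all function names are made up for the example.

```python
def target_span(W, i, j):
    """Smallest target span covering all links of source span [i, j];
    None if the source span is unaligned."""
    ts = [t for (s, t) in W if i <= s <= j]
    return (min(ts), max(ts)) if ts else None

def consistent_span(W, i, j):
    """Target span of [i, j] if no alignment link crosses it, else None."""
    span = target_span(W, i, j)
    if span is None:
        return None
    p, q = span
    # a link entering the target span from outside the source span breaks it
    if any(p <= t <= q and not i <= s <= j for (s, t) in W):
        return None
    return span

def bracketing_instance(W, i, j, k):
    """Label neighbouring source phrases c[i..j] and c[j+1..k]: bracketable
    iff both halves and their union all have consistent target spans."""
    if consistent_span(W, i, j) and consistent_span(W, j + 1, k):
        return "bracketable" if consistent_span(W, i, k) else "unbracketable"
    return None
```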
Introduction | We first identify anchors as regions in the source sentences around which ambiguous reordering patterns frequently occur and chunks as regions that are consistent with word alignment which may span multiple translation units at decoding time. |
Maximal Orientation Span | We also attach the word indices as the superscript of the source words and project the indices to the aligned target words, such that “have5” suggests that the word “have” is aligned to the 5-th source word, i.e.
Maximal Orientation Span | Note that to facilitate the projection, the rules must come with internal word alignment in practice. |
Maximal Orientation Span | the source sentence, the complete translation and the word alignment.
Corpora and baselines | All conditions use word alignments produced by sequential iterations of IBM model 1, HMM, and IBM model 4 in GIZA++, followed by “diag-and” symmetrization (Koehn et al., 2003). |
Features | We add inflection features for all words aligned to at least one English verb, adjective, noun, pronoun, or determiner, excepting definite and indefinite articles. |
Features | These features would be more properly defined based on the identity of the target word aligned to these quantifiers, but little ambiguity seems to arise from this substitution in practice. |
Features | These dependencies are inferred from source-side annotation via word alignments, as depicted in figure 1, without any use of target-side dependency parses.
Experiments | using GIZA++ in both directions, and the diag-grow-final heuristic is used to refine symmetric word alignment.
Related Work | They proposed a bilingual topical admixture approach for word alignment and assumed that each word-pair follows a topic- |
Related Work | They reported extensive empirical analysis and improved word alignment accuracy as well as translation quality. |
Automatic Evaluation Metrics | Given a pair of strings to compare (a system translation and a reference translation), METEOR (Banerjee and Lavie, 2005) first creates a word alignment between the two strings. |
Automatic Evaluation Metrics | These word alignments are created incrementally through a series of stages, where each stage only adds alignments between unigrams which have not been matched in previous stages. |
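The staged unigram matching described above can be sketched as a greedy loop over matchers; real METEOR additionally chooses among alternative matches to minimise alignment crossings, which this sketch omits, and the stage functions here are illustrative stand-ins for METEOR's exact/stem/synonym modules.

```python
def meteor_align(sys_toks, ref_toks, stages):
    """Staged unigram alignment in the spirit of METEOR: each stage is a
    predicate (sys_tok, ref_tok) -> bool, and later stages may only align
    unigrams left unmatched by earlier stages."""
    links, used_sys, used_ref = [], set(), set()
    for match in stages:
        for i, s in enumerate(sys_toks):
            if i in used_sys:
                continue                      # already matched in a prior stage
            for j, r in enumerate(ref_toks):
                if j not in used_ref and match(s, r):
                    links.append((i, j))
                    used_sys.add(i)
                    used_ref.add(j)
                    break
    return links
```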
Introduction | Although a maximum weight bipartite graph was also used in the recent work of (Taskar et al., 2005), their focus was on learning supervised models for single word alignment between sentences from a source and target language. |
Decoding with Sense-Based Translation Model | During decoding, we keep word alignments for each translation rule. |
Decoding with Sense-Based Translation Model | Whenever a new source word c is translated, we find its translation e via the kept word alignments.
Experiments | We ran Giza++ on the training data in two directions and applied the “grow-diag-final” refinement rule (Koehn et al., 2003) to obtain word alignments . |
Model Training and Application 3.1 Training | For the bilingual corpus, we also perform word alignment to get correspondences between source and target words. |
Model Training and Application 3.1 Training | According to word alignment results, we classify |
Model Training and Application 3.1 Training | We ran GIZA++ (Och and Ney, 2000) on the training corpus in both directions with IBM model 4, and then applied the refinement rule described in (Koehn et al., 2003) to obtain a many-to-many word alignment for each sentence pair. |
Introduction | XMEANT is obtained by (1) using simple lexical translation probabilities, instead of the monolingual context vector model used in MEANT for computing the semantic role filler similarities, and (2) incorporating bracketing ITG constraints for word alignment within the semantic role fillers.
Introduction | than that of the reference translation, and on the other hand, the BITG constrains the word alignment more accurately than the heuristic bag-of-words aggregation used in MEANT.
Results | It is also consistent with results observed while estimating word alignment probabilities, where BITG constraints outperformed alignments from GIZA++ (Saers and Wu, 2009). |
Word Sense Disambiguation | Then, word alignment was performed on the parallel corpora with the GIZA++ software (Och and Ney, 2003).
Word Sense Disambiguation | For each English morphological root e, the English sentences containing its occurrences were extracted from the word-aligned output of GIZA++, as well as the corresponding translations of these occurrences.
Word Sense Disambiguation | To minimize noisy word alignment results, translations with no Chinese character were deleted, and we further removed a translation when it only appears once, or its frequency is less than 10 and also less than 1% of the frequency of e.
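The frequency filter just described can be written directly. A small sketch (function and variable names are illustrative; the Chinese-character check is left out):

```python
def filter_translations(counts, e_freq):
    """Keep a translation only if it occurs more than once and is not
    both below 10 occurrences and below 1% of the source word's frequency,
    mirroring the filtering rule described above."""
    return {t: c for t, c in counts.items()
            if c > 1 and not (c < 10 and c < 0.01 * e_freq)}
```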
Experiment | GIZA++ (Och and Ney, 2003) and the heuristics “grow-diag-final-and” are used to generate m-to-n word alignments . |
Experiment | This is mainly because tree sequence rules are only sensitive to word alignment while tree rules, even extracted from a forest (like in FT2S), are also limited by syntax according to grammar parsing rules. |
Forest-based tree sequence to string model | Given a source forest F and target translation T S as well as word alignment A, our translation model is formulated as: |
The effect of the Italian connectives on the LIS translation | Word Alignment.
The effect of the Italian connectives on the LIS translation | For this purpose we used the Berkeley Word Aligner (BWA) 1 (Denero, 2007), a general tool for aligning sentences in bilingual corpora. |
The effect of the Italian connectives on the LIS translation | Bold dashed lines show word alignment . |
For each t ∈ V[S_k]: | If a target word in t is a gap word, we suppose there is a word alignment between the target gap word and the source-side null.
Phrase Pair Refinement and Parameterization | According to our analysis, we find that the biggest problem is that in the target-side of the phrase pair, there are two or more identical words aligned to the same source- |
Probabilistic Bilingual Lexicon Acquisition | We employ the same algorithm used in (Munteanu and Marcu, 2006), which first uses GIZA++ (with the grow-diag-final-and heuristic) to obtain the word alignment between source and target words, and then calculates the association strength between the aligned words.
Experiments & Results 4.1 Experimental Setup | Word alignment is done using GIZA++ (Och and Ney, 2003). |
Experiments & Results 4.1 Experimental Setup | The resulting word alignments are used to extract the translations for each oov. |
Experiments & Results 4.1 Experimental Setup | The correctness of this gold standard is limited to the size of the parallel data used as well as the quality of the word alignment software toolkit, and is not 100% precise. |
Conclusions | This may suggest that adding shallow semantic information is more effective than introducing complex structured constraints, at least for the specific word alignment model we experimented with in this work. |
Introduction | This may suggest that compared to introducing complex structured constraints, incorporating shallow semantic information is both more effective and computationally inexpensive in improving the performance, at least for the specific word alignment model tested in this work. |
Learning QA Matching Models | As a result, an “ideal” word alignment structure should not link words in this clause to those in the question. |
Experimental Setup | Additionally, we compute a word alignment score to investigate the extent to which the input text is used to construct correct analyses. |
Results | The word alignment results from Table 2 indicate that the learners are mapping the correct words to actions for documents that are successfully completed. |
Results | For example, the models that perform best in the Windows domain achieve nearly perfect word alignment scores. |
Pivot Methods for Phrase-based SMT | (2003), there are two important elements in the lexical weight: word alignment information a in a phrase pair (s, t) and lexical translation probability w(s|t).
Pivot Methods for Phrase-based SMT | Let a1 and a2 represent the word alignment information inside the phrase pairs (s, p) and (p, t)
Pivot Methods for Phrase-based SMT | Based on the induced word alignment information, we estimate the co-occurring frequencies of word pairs directly from the induced phrase
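The induction step in this passage, composing source-pivot links with pivot-target links and then counting co-occurrences of the induced word pairs, might look as follows; this is a hedged sketch of the general pivot composition, with all names and the data layout chosen for the example.

```python
from collections import Counter

def induce_alignment(a1, a2):
    """Compose source-pivot links a1 with pivot-target links a2:
    (s, t) is induced iff some pivot position links to s in a1 and t in a2."""
    return {(s, t) for (s, p1) in a1 for (p2, t) in a2 if p1 == p2}

def cooccurrence_counts(phrase_pairs):
    """Count induced word pairs over phrase pairs given as
    (src_words, tgt_words, induced_links) triples."""
    counts = Counter()
    for src, tgt, links in phrase_pairs:
        for s, t in links:
            counts[(src[s], tgt[t])] += 1
    return counts
```

The counts can then feed a relative-frequency estimate of the word translation probabilities used in the induced lexical weight.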
Experiments | The system configurations are as follows: GIZA++ (Och and Ney, 2003) is used to obtain the bidirectional word alignments.
Problem Formulation | is the final translation; [tm_s, tm_t, tm_f, s_a, tm_a] are the associated information of the best TM sentence pair; tm_s and tm_t denote the corresponding TM sentence pair; tm_f denotes its associated fuzzy match score (from 0.0 to 1.0); s_a is the editing operations between tm_s and s; and tm_a denotes the word alignment between tm_s and tm_t.
Problem Formulation | ), we can find its corresponding TM source phrase tm_s_a(k) and all possible TM target phrases (each of them is denoted by tm_t_a(k)) with the help of the corresponding editing operations s_a and word alignment tm_a.
Experiments | Many of the remaining errors are due to the garbage collection phenomenon familiar from word alignment models (Moore, 2004; Liang et al., 2006). |
Generative Model | The alignment aspect of our model is similar to the HMM model for word alignment (Ney and Vogel, 1996). |
Generative Model | (2008) perform joint segmentation and word alignment for machine translation, but the nature of that task is different from ours. |
Experiment | Word alignments of each paraphrase pair are trained by GIZA++. |
Paraphrasing for Web Search | Word alignments within each paraphrase pair are generated using GIZA++ (Och and Ney, 2000). |
Paraphrasing for Web Search | In order to enable our paraphrasing model to learn the preferences on different paraphrasing strategies according to the characteristics of web queries, we design search-oriented features based on word alignments within Q and Q’, which can be described as follows: