Index of papers in Proc. ACL 2013 that mention
  • word alignment
Yang, Nan and Liu, Shujie and Li, Mu and Zhou, Ming and Yu, Nenghai
Abstract
In this paper, we explore a novel bilingual word alignment approach based on DNN (Deep Neural Network), which has been proven to be very effective in various machine learning tasks (Collobert et al., 2011).
Abstract
We describe in detail how we adapt and extend the CD-DNN-HMM (Dahl et al., 2012) method introduced in speech recognition to the HMM-based word alignment model, in which bilingual word embedding is discriminatively learnt to capture lexical translation information, and surrounding words are leveraged to model context information in bilingual sentences.
Abstract
Experiments on a large scale English-Chinese word alignment task show that the proposed method outperforms the HMM and IBM model 4 baselines by 2 points in F-score.
DNN for word alignment
Our DNN word alignment model extends the classic HMM word alignment model (Vogel et al., 1996).
DNN for word alignment
Given a sentence pair (e, f), HMM word alignment takes the following form:
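(The equation itself was lost in extraction; for reference, the standard HMM decomposition of Vogel et al. (1996) that this sentence introduces is)

$$p(f_1^J \mid e_1^I) = \sum_{a_1^J} \prod_{j=1}^{J} p(a_j \mid a_{j-1}, I)\; p(f_j \mid e_{a_j})$$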
Introduction
Inspired by successful previous work, we propose a new DNN-based word alignment method, which exploits contextual and semantic similarities between words.
Introduction
Figure 1: Two examples of word alignment
Introduction
In the rest of this paper, related work on DNNs and word alignment is first reviewed in Section 2, followed by a brief introduction of DNN in Section 3.
Related Work
For related work on word alignment, the most popular methods are based on generative models such as the IBM Models (Brown et al., 1993) and the HMM (Vogel et al., 1996).
Related Work
Discriminative approaches have also been proposed that use hand-crafted features to improve word alignment.
word alignment is mentioned in 28 sentences in this paper.
Topics mentioned in this paper:
Xiang, Bing and Luo, Xiaoqiang and Zhou, Bowen
Abstract
We show that the recovered empty categories not only improve the word alignment quality, but also lead to significant improvements in a large-scale state-of-the-art syntactic MT system.
Experimental Results
Then we run GIZA++ (Och and Ney, 2000) to generate the word alignment for each direction and apply grow-diagonal-final (Koehn et al., 2003), same as in the baseline.
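Since grow-diagonal-final recurs throughout this index, a minimal Python sketch of the heuristic (Koehn et al., 2003) may help; it assumes 0-indexed (i, j) alignment points from the two directional GIZA++ runs, and the function name is illustrative:

    def grow_diag_final(e2f, f2e):
        """Symmetrize two directional word alignments; a sketch of the
        grow-diag-final heuristic (Koehn et al., 2003)."""
        alignment = set(e2f) & set(f2e)      # high-precision intersection
        union = set(e2f) | set(f2e)
        neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                     (-1, -1), (-1, 1), (1, -1), (1, 1)]
        # grow-diag: add neighboring union points that cover a source
        # or target word not yet aligned
        added = True
        while added:
            added = False
            aligned_i = {i for i, _ in alignment}
            aligned_j = {j for _, j in alignment}
            for i, j in list(alignment):
                for di, dj in neighbors:
                    p = (i + di, j + dj)
                    if p in union and p not in alignment and \
                            (p[0] not in aligned_i or p[1] not in aligned_j):
                        alignment.add(p)
                        aligned_i.add(p[0])
                        aligned_j.add(p[1])
                        added = True
        # final: add leftover directional points covering unaligned words
        # (the "-and" variant requires both words to be unaligned)
        for direction in (e2f, f2e):
            for i, j in direction:
                aligned_i = {i2 for i2, _ in alignment}
                aligned_j = {j2 for _, j2 in alignment}
                if i not in aligned_i or j not in aligned_j:
                    alignment.add((i, j))
        return alignment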
Integrating Empty Categories in Machine Translation
With the preprocessed MT training corpus, an unsupervised word aligner, such as GIZA++, can be used to generate automatic word alignment, as the first step of a system training pipeline.
Integrating Empty Categories in Machine Translation
The effect of inserting ECs is twofold: first, it can impact the automatic word alignment since now it allows the target-side words, especially the function words, to align to the inserted ECs and fix some errors in the original word alignment; second, new phrases and rules can be extracted from the preprocessed training data.
Integrating Empty Categories in Machine Translation
A few examples of the extracted Hiero rules and tree-to-string rules are also listed, which we would not have been able to extract from the original incorrect word alignment when the *pro* was missing.
Introduction
In addition, the pro-drop problem can also degrade the word alignment quality in the training data.
Introduction
A sentence pair observed in the real data is shown in Figure 1 along with the word alignment obtained from an automatic word aligner, where the English subject pronoun
Introduction
Figure 1: Example of incorrect word alignment due to missing pronouns on the Chinese side.
word alignment is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Wang, Mengqiu and Che, Wanxiang and Manning, Christopher D.
Abstract
However, most previous approaches to bilingual tagging assume word alignments are given as fixed input, which can cause cascading errors.
Abstract
We observe that NER label information can be used to correct alignment mistakes, and present a graphical model that performs bilingual NER tagging jointly with word alignment, by combining two monolingual tagging models with two unidirectional alignment models.
Abstract
Experiments on the OntoNotes dataset demonstrate that our method yields significant improvements in both NER and word alignment over state-of-the-art monolingual baselines.
Bilingual NER by Agreement
We also assume that a set of word alignments (A = {(i, j) : e_i ↔ f_j}) is given by a word aligner and remains fixed in our model.
Bilingual NER by Agreement
The assumption in the hard agreement model can also be violated if there are word alignment errors.
Introduction
In this work, we first develop a bilingual NER model (denoted as BI-NER) by embedding two monolingual CRF-based NER models into a larger undirected graphical model, and introduce additional edge factors based on word alignment (WA).
Introduction
Our method does not require any manual annotation of word alignments or named entities over the bilingual training data.
Introduction
The aforementioned BI-NER model assumes fixed alignment input given by an underlying word aligner.
Joint Alignment and NER Decoding
To capture this intuition, we extend the BI-NER model to jointly perform word alignment and NER decoding, and call the resulting model BI-NER-WA.
word alignment is mentioned in 20 sentences in this paper.
Topics mentioned in this paper:
Visweswariah, Karthik and Khapra, Mitesh M. and Ramanathan, Ananthakrishnan
Abstract
Previous work has shown that a reordering model can be learned from high quality manual word alignments to improve machine translation performance.
Abstract
In this paper, we focus on further improving the performance of the reordering model (and thereby machine translation) by using a larger corpus of sentence-aligned data for which manual word alignments are not available but automatic machine-generated alignments are available.
Abstract
To mitigate the effect of noisy machine alignments, we propose a novel approach that improves reorderings produced given noisy alignments and also improves word alignments using information from the reordering model.
Introduction
These methods use a small corpus of manual word alignments (where the words in the source sentence are manually aligned to the words in the target sentence) to learn a model to preorder the source sentence to match target order.
Introduction
In this paper, we build upon the approach in (Visweswariah et al., 2011) which uses manual word alignments for learning a reordering model.
Introduction
Specifically, we show that we can significantly improve reordering performance by using a large number of sentence pairs for which manual word alignments are not available.
word alignment is mentioned in 44 sentences in this paper.
Topics mentioned in this paper:
Schoenemann, Thomas
Conclusion
We have shown that the word alignment models IBM-3 and IBM-4 can be turned into nondeficient models.
Introduction
While most people think of the translation and word alignment models IBM-3 and IBM-4 as inherently deficient models (i.e.
Introduction
The source code of this project is available in our word alignment software RegAligner, version 1.2 and later.
Introduction
Today's most widely used models for word alignment are still the models IBM 1-5 of Brown et al.
The models IBM-3, IBM-4 and IBM-5
The probability p(f_1^J | e_1^I) of getting the foreign sentence as a translation of the English one is modeled by introducing the word alignment a as a hidden variable:
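(The elided equation is the usual marginalization over alignments)

$$p(f_1^J \mid e_1^I) = \sum_{a_1^J} p(f_1^J, a_1^J \mid e_1^I)$$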
The models IBM-3, IBM-4 and IBM-5
For each i = 1, …, I, decide on the number Φ_i of foreign words aligned to e_i.
The models IBM-3, IBM-4 and IBM-5
For each i = 1, 2, …, I, and k = 1, …, Φ_i, decide on (a) the identity f_{i,k} of the next foreign word aligned to e_i.
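For orientation, this generative story corresponds, up to notational details, to the standard IBM-3 decomposition of Brown et al. (1993), with I English words, J foreign words, and fertilities Φ_i:

$$p(f_1^J, a_1^J \mid e_1^I) = \binom{J - \Phi_0}{\Phi_0}\, p_0^{J - 2\Phi_0}\, p_1^{\Phi_0} \prod_{i=1}^{I} \Phi_i!\; n(\Phi_i \mid e_i) \prod_{j=1}^{J} t(f_j \mid e_{a_j}) \prod_{j :\, a_j \neq 0} d(j \mid a_j, I, J)$$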
Training the New Variants
For the task of word alignment, we infer the parameters of the models using the maximum likelihood principle.
Training the New Variants
This task is also needed for the actual task of word alignment (annotating a given sentence pair with an alignment).
word alignment is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Li, Haibo and Zheng, Jing and Ji, Heng and Li, Qi and Wang, Wen
Abstract
Experiments on Chinese-English translation demonstrated the effectiveness of our approach on enhancing the quality of overall translation, name translation and word alignment over a high-quality MT baseline.
Experiments
Therefore, it is important to use name-replaced corpora for rule extraction to fully take advantage of improved word alignment .
Experiments
5.4 Word Alignment
Experiments
It is also important to investigate the impact of our NAMT approach on improving word alignment .
Introduction
names in parallel corpora, updating word segmentation, word alignment and grammar extraction (Section 3.1).
Name-aware MT
We pair two entities from two languages, if they have the same entity type and are mapped together by word alignment .
Name-aware MT
First, we replace tagged name pairs with their entity types, and then use Giza++ and symmetrization heuristics to regenerate word alignment .
Name-aware MT
Since the name tags appear very frequently, the existence of such tags yields improvement in word alignment quality.
word alignment is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Liu, Kang and Xu, Liheng and Zhao, Jun
Abstract
In contrast, alignment-based methods used a word alignment model to fulfill this task, which can avoid parsing errors since no parsing is used.
Introduction
A word can find its corresponding modifiers by using a word alignment model.
Introduction
Furthermore, this paper naturally addresses another question: is it useful for opinion target extraction when we combine syntactic patterns and a word alignment model into a unified model?
Introduction
Then, these partial alignment links can be regarded as constraints for a standard unsupervised word alignment model.
Opinion Target Extraction Methodology
In the first component, we respectively use syntactic patterns and an unsupervised word alignment model (WAM) to capture opinion relations.
Opinion Target Extraction Methodology
In addition, we employ a partially supervised word alignment model (PSWAM) to incorporate syntactic information into WAM.
Opinion Target Extraction Methodology
3.1.2 Unsupervised Word Alignment Model
word alignment is mentioned in 22 sentences in this paper.
Topics mentioned in this paper:
Ling, Wang and Xiang, Guang and Dyer, Chris and Black, Alan and Trancoso, Isabel
Parallel Segment Retrieval
Finally, a represents the word alignment between the words in the left and the right segments.
Parallel Segment Retrieval
Then, we would use a word alignment model (Brown et al., 1993; Vogel et al., 1996), with source s = s_up, …
Parallel Segment Retrieval
Finally, from the probability of the word alignments, we can determine whether the segments are parallel.
word alignment is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Fader, Anthony and Zettlemoyer, Luke and Etzioni, Oren
Introduction
The algorithm uses learned word alignments to aggressively generalize the seeds, producing a large set of possible lexical equivalences.
Learning
We call this procedure InduceLex(x, x′, y, A), which takes a paraphrase pair (x, x′), a derivation y of x, and a word alignment A, and returns a new set of lexical entries.
Learning
A word alignment A between x and x′ is a subset of [n] × [n′]. A phrase alignment is a pair of index sets (I, I′), where I ⊆ [n] and I′ ⊆ [n′]. A phrase alignment (I, I′) is consistent with a word alignment A if, for all (i, i′) ∈ A, i ∈ I if and only if i′ ∈ I′.
Learning
In other words, a phrase alignment is consistent with a word alignment if the words in the phrases are aligned only with each other, and not with any outside words.
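Read literally, this definition yields a one-line check; a minimal Python sketch (function and argument names are illustrative, not from the paper):

    def is_consistent(I, I_prime, A):
        """A phrase alignment (I, I_prime) is consistent with word
        alignment A iff every link (i, i') in A lies either entirely
        inside or entirely outside the phrase pair."""
        return all((i in I) == (ip in I_prime) for i, ip in A)

    # e.g. is_consistent({1, 2}, {0, 1}, {(1, 0), (2, 1)}) -> True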
word alignment is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Goto, Isao and Utiyama, Masao and Sumita, Eiichiro and Tamura, Akihiro and Kurohashi, Sadao
Experiment
GIZA++ and grow-diag-final-and heuristics were used to obtain word alignments.
Experiment
In order to reduce word alignment errors, we removed articles {a, an, the} in English and particles {ga, wo, wa} in Japanese before performing word alignment, because these function words do not correspond to any words in the other language.
Experiment
After word alignment, we restored the removed words and shifted the word alignment positions to the original word positions.
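The restore-and-shift step described here amounts to index bookkeeping; a minimal sketch, assuming the aligner returns 0-indexed (source, target) links over the reduced sentences (all names are illustrative):

    def align_without_function_words(src, tgt, stop_src, stop_tgt, aligner):
        # indices of the words we keep
        keep_s = [i for i, w in enumerate(src) if w not in stop_src]
        keep_t = [j for j, w in enumerate(tgt) if w not in stop_tgt]
        # align the reduced sentences
        reduced = aligner([src[i] for i in keep_s],
                          [tgt[j] for j in keep_t])
        # shift each link back to the original word positions
        return {(keep_s[i], keep_t[j]) for i, j in reduced}

    # e.g. stop_src={'a', 'an', 'the'}, stop_tgt={'ga', 'wo', 'wa'}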
Proposed Method
The lines represent word alignments.
Proposed Method
The English side arrows point to the nearest word aligned on the right.
Proposed Method
The training data is built from a parallel corpus and word alignments between corresponding source words and target words.
word alignment is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Quirk, Chris
Abstract
The notion of fertility in word alignment (the number of words emitted by a single state) is useful but difficult to model.
Evaluation
We explore the impact of this improved MAP inference procedure on a task in German-English word alignment .
Evaluation
For training data we use the news commentary data from the WMT 2012 translation task. 120 of the training sentences were manually annotated with word alignments.
HMM alignment
…, f_J, and word alignment vectors a = a_1, …, a_J.
HMM alignment
For the standard HMM, there is a dynamic programming algorithm to compute the posterior probability over word alignments Pr(a|e, f). These are the sufficient statistics gathered in the E step of EM.
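(For reference, with forward scores α_j(i) and backward scores β_j(i) from the usual HMM recursions, the posterior link probabilities referred to here take the standard form)

$$\Pr(a_j = i \mid e, f) = \frac{\alpha_j(i)\,\beta_j(i)}{\sum_{i'=1}^{I} \alpha_j(i')\,\beta_j(i')}$$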
Introduction
These word alignments are a crucial training component in most machine translation systems.
Introduction
Models 2 and 3 incorporate a positional model based on the absolute position of the word; Models 4 and 5 use a relative position model instead (an English word tends to align to a French word that is near the French word aligned to the previous English word).
word alignment is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann
Comparative Study
Monotone for the current phrase, if a word alignment point to the bottom left (point A) exists and there is no word alignment point at the bottom right position (point B).
Comparative Study
Swap for the current phrase, if a word alignment point to the bottom right (point B) exists and there is no word alignment point at the bottom left position (point A).
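Taken together, the two corner rules above reduce to a few lines; a minimal sketch, assuming an alignment matrix with 0-indexed (source, target) points, a phrase covering source positions s_start..s_end, and a target span starting at t_start (the coordinate convention is an assumption, not the paper's exact layout):

    def orientation(points, s_start, s_end, t_start):
        """Classify the current phrase from its corner alignment points:
        monotone if only the bottom-left corner (point A) is aligned,
        swap if only the bottom-right corner (point B) is aligned."""
        point_a = (s_start - 1, t_start - 1) in points  # bottom left
        point_b = (s_end + 1, t_start - 1) in points    # bottom right
        if point_a and not point_b:
            return "monotone"
        if point_b and not point_a:
            return "swap"
        return "other"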
Comparative Study
When one source word is aligned to multiple target words, duplicate the source word for each target word, e.g.
Experiments
The main reason is that the “annotated” corpus is converted from word alignments, which contain many errors.
Tagging-style Reordering Model
The first step is word alignment training.
Tagging-style Reordering Model
We also have the word alignment within the new phrase pair, which is stored during the phrase extraction process.
word alignment is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Feng, Yang and Cohn, Trevor
Conclusions and Future Work
In this paper the model was only used to infer word alignments; in future work we intend to develop a decoding algorithm for directly translating with the model.
Experiments
However, in this paper we limit our focus to inducing word alignments, i.e., by using the model to infer alignments which are then used in a standard phrase-based translation pipeline.
Experiments
We present results on translation quality and word alignment .
Gibbs Sampling
Given the structure of our model, a word alignment uniquely specifies the translation decisions and the sequence follows the order of the target sentence left to right.
Introduction
In this paper we propose a new model to drop the independence assumption, by instead modelling correlations between translation decisions, which we use to induce translation derivations from aligned sentences (akin to word alignment).
Model
Given a source sentence, our model infers a latent derivation which produces a target translation and meanwhile gives a word alignment between the source and the target.
word alignment is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Faruqui, Manaal and Dyer, Chris
Conclusions
We presented a novel information theoretic model for bilingual word clustering which seeks a clustering with high average mutual information between clusters of adjacent words, and also high mutual information across observed word alignment links.
Experiments
The corpus was word aligned in two directions using an unsupervised word aligner (Dyer et al., 2013), then the intersected alignment points were taken.
Experiments
For Turkish, the F1 score improves by 1.0 point over the setting with no distributional clusters, which clearly shows that the word alignment information improves the clustering quality.
Experiments
Thus we propose to further refine the quality of word alignment links as follows: let x be a word in language X and y be a word in language Y, and let there exist an alignment link between x and y.
Introduction
The second term ensures that the cluster alignments induced by a word alignment have high mutual information across languages (§2.2).
Word Clustering
For concreteness, A(x, y) will be the number of times that x is aligned to y in a word-aligned parallel corpus.
word alignment is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Popat, Kashyap and A.R, Balamurali and Bhattacharyya, Pushpak and Haffari, Gholamreza
Clustering for Cross Lingual Sentiment Analysis
Given a parallel bilingual corpus, word clusters in S can be aligned to clusters in T. Word alignments are created using parallel corpora.
Clustering for Cross Lingual Sentiment Analysis
Here, L_{T|S}(…) and L_{S|T}(…) are factors based on word alignments, which can be represented as:
Conclusion and Future Work
A naive cluster linkage algorithm based on word alignments was used to perform CLSA.
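The snippets here do not spell out the linkage rule; a minimal sketch of one naive scheme, linking each source cluster to the target cluster receiving the most alignment mass (the names and the majority-vote rule are assumptions, not the paper's exact algorithm):

    from collections import Counter, defaultdict

    def link_clusters(align_counts, src_cluster, tgt_cluster):
        # align_counts: {(src_word, tgt_word): alignment count}
        # src_cluster / tgt_cluster: {word: cluster id}
        votes = defaultdict(Counter)
        for (s, t), n in align_counts.items():
            votes[src_cluster[s]][tgt_cluster[t]] += n
        # link each source cluster to its most aligned target cluster
        return {cs: ct.most_common(1)[0][0] for cs, ct in votes.items()}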
Introduction
To perform CLSA, this study leverages an unlabelled parallel corpus to generate the word alignments.
Introduction
These word alignments are then used to link cluster based features to obliterate the language gap for performing SA.
word alignment is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Zhai, Feifei and Zhang, Jiajun and Zhou, Yu and Zong, Chengqing
Experiment
We run GIZA++ and then employ the grow-diag-final-and (gdfa) strategy to produce symmetric word alignments.
Inside Context Integration
We require that every element and its corresponding target span be consistent with the word alignment.
Inside Context Integration
Note that we only apply the source-side PAS and word alignment for IC-PASTR extraction.
Inside Context Integration
Thus to get a high recall for PASs, we only utilize word alignment instead of capturing the relation between bilingual elements.
Maximum Entropy PAS Disambiguation (MEPD) Model
t_range(PAS) refers to the target range covering all the words that are reachable from the PAS via word alignment .
word alignment is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Cohn, Trevor and Haffari, Gholamreza
Experiments
Following (Levenberg et al., 2012; Neubig et al., 2011), we evaluate our model by using its output word alignments to construct a phrase table.
Experiments
As a baseline, we train a phrase-based model using the Moses toolkit based on the word alignments obtained using GIZA++ in both directions and symmetrized using the grow-diag-final-and heuristic (Koehn et al., 2003).
Experiments
These are taken from the final Model 4 word alignments, using the intersection of the source-target and target-source models.
Related Work
In the context of machine translation, ITG has been explored for statistical word alignment in both unsupervised (Zhang and Gildea, 2005; Cherry and Lin, 2007; Zhang et al., 2008; Pauls et al., 2010) and supervised (Haghighi et al., 2009; Cherry and Lin, 2006) settings, and for decoding (Petrov et al., 2008).
Related Work
Our paper fits into the recent line of work for jointly inducing the phrase table and word alignment (DeNero and Klein, 2010; Neubig et al., 2011).
word alignment is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Sennrich, Rico and Schwenk, Holger and Aransa, Walid
Translation Model Architecture
The lexical weights lex(ē|f̄) and lex(f̄|ē) are calculated as follows, using a set of word alignments a between ē and f̄.
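(The extracted equation is missing; in the standard Moses formulation that this sentence references (Koehn et al., 2003), the lexical weight is)

$$\operatorname{lex}(\bar{e} \mid \bar{f}, a) = \prod_{i=1}^{|\bar{e}|} \frac{1}{|\{ j : (i, j) \in a \}|} \sum_{(i, j) \in a} w(e_i \mid f_j)$$

with unaligned words treated as linked to NULL; lex(f̄|ē, a) is computed symmetrically.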
Translation Model Architecture
σ(s, t) and σ(t, s) are not identical since the lexical probabilities are based on the unsymmetrized word alignment frequencies (in the Moses implementation which we re-implement).
Translation Model Architecture
In the unweighted variant, the resulting features are equivalent to training on the concatenation of all training data, excepting differences in word alignment, pruning and rounding.
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Setiawan, Hendra and Zhou, Bowen and Xiang, Bing and Shen, Libin
Introduction
We first identify anchors as regions in the source sentences around which ambiguous reordering patterns frequently occur and chunks as regions that are consistent with word alignment which may span multiple translation units at decoding time.
Maximal Orientation Span
We also attach the word indices as the superscript of the source words and project the indices onto the aligned target words, such that “have5” indicates that the word “have” is aligned to the 5th source word, i.e.
Maximal Orientation Span
Note that to facilitate the projection, the rules must come with internal word alignment in practice.
Maximal Orientation Span
the source sentence, the complete translation and the word alignment.
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Weller, Marion and Fraser, Alexander and Schulte im Walde, Sabine
Experiments and evaluation
We use the hierarchical translation system that comes with the Moses SMT-package and GIZA++ to compute the word alignment , using the “grow-diag-final-and” heuristics.
Translation pipeline
We consider gender as part of the stem, whereas the value for number is derived from the source-side: if marked for number, singular/plural nouns are distinguished during word alignment and then translated accordingly.
Using subcategorization information
Figure 1: Deriving features from dependency-parsed English data via the word alignment .
Using subcategorization information
to the SMT output via word alignment .
word alignment is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Jiajun and Zong, Chengqing
For each t ∈ V[S_k]:
If a target word in t is a gap word, we suppose there is a word alignment between the target gap word and the source-side null.
Phrase Pair Refinement and Parameterization
According to our analysis, we find that the biggest problem is that in the target-side of the phrase pair, there are two or more identical words aligned to the same source-side word.
Probabilistic Bilingual Lexicon Acquisition
We employ the same algorithm used in (Munteanu and Marcu, 2006), which first uses GIZA++ (with the grow-diag-final-and heuristic) to obtain the word alignment between source and target words, and then calculates the association strength between the aligned words.
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lugaresi, Camillo and Di Eugenio, Barbara
The effect of the Italian connectives on the LIS translation
Word Alignment.
The effect of the Italian connectives on the LIS translation
For this purpose we used the Berkeley Word Aligner (BWA) (DeNero, 2007), a general tool for aligning sentences in bilingual corpora.
The effect of the Italian connectives on the LIS translation
Bold dashed lines show word alignment .
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej
Conclusions
This may suggest that adding shallow semantic information is more effective than introducing complex structured constraints, at least for the specific word alignment model we experimented with in this work.
Introduction
This may suggest that compared to introducing complex structured constraints, incorporating shallow semantic information is both more effective and computationally inexpensive in improving the performance, at least for the specific word alignment model tested in this work.
Learning QA Matching Models
As a result, an “ideal” word alignment structure should not link words in this clause to those in the question.
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Kun and Zong, Chengqing and Su, Keh-Yih
Experiments
The system configurations are as follows: GIZA++ (Och and Ney, 2003) is used to obtain the bidirectional word alignments.
Problem Formulation
is the final translation; [tm_s, tm_t, tm_f, s_a, tm_a] are the associated information of the best TM sentence pair; tm_s and tm_t denote the corresponding TM sentence pair; tm_f denotes its associated fuzzy match score (from 0.0 to 1.0); s_a is the editing operations between tm_s and s; and tm_a denotes the word alignment between tm_s and tm_t.
Problem Formulation
…we can find its corresponding TM source phrase tm_sa(k) and all possible TM target phrases (each denoted by tm_ta(k)) with the help of the corresponding editing operations s_a and word alignment tm_a.
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Chenguang and Duan, Nan and Zhou, Ming and Zhang, Ming
Experiment
Word alignments of each paraphrase pair are trained by GIZA++.
Paraphrasing for Web Search
Word alignments within each paraphrase pair are generated using GIZA++ (Och and Ney, 2000).
Paraphrasing for Web Search
In order to enable our paraphrasing model to learn the preferences on different paraphrasing strategies according to the characteristics of web queries, we design search-oriented features based on word alignments within Q and Q’, which can be described as follows:
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Razmara, Majid and Siahbani, Maryam and Haffari, Reza and Sarkar, Anoop
Experiments & Results 4.1 Experimental Setup
Word alignment is done using GIZA++ (Och and Ney, 2003).
Experiments & Results 4.1 Experimental Setup
The resulting word alignments are used to extract the translations for each OOV.
Experiments & Results 4.1 Experimental Setup
The correctness of this gold standard is limited by the size of the parallel data used as well as by the quality of the word alignment toolkit, and is not 100% precise.
word alignment is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: