Index of papers in Proc. ACL that mention
  • alignment model
Ganchev, Kuzman and Graça, João V. and Taskar, Ben
Abstract
In this work we analyze a recently proposed agreement-constrained EM algorithm for unsupervised alignment models.
Abstract
We propose and extensively evaluate a simple method for using alignment models to produce alignments better-suited for phrase-based MT systems, and show significant gains (as measured by BLEU score) in end-to-end translation systems for six language pairs used in recent MT competitions.
Adding agreement constraints
They suggest how this framework can be used to encourage two word alignment models to agree during training.
Adding agreement constraints
Most MT systems train an alignment model in each direction and then heuristically combine their predictions.
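The heuristic combination referred to here is typically a symmetrization of the two directional link sets. A minimal sketch, not taken from the paper: a simplified grow-diag-style heuristic over hypothetical (i, j) link pairs, in Python.

    # Minimal symmetrization sketch: combine e->f and f->e alignments by
    # starting from their reliable intersection and growing with union
    # links adjacent to already-accepted links (simplified grow-diag).

    def symmetrize(src2tgt, tgt2src):
        union = src2tgt | tgt2src
        alignment = src2tgt & tgt2src      # high-precision core
        added = True
        while added:
            added = False
            for (i, j) in sorted(union - alignment):
                if any((i + di, j + dj) in alignment
                       for di in (-1, 0, 1) for dj in (-1, 0, 1)):
                    alignment.add((i, j))  # neighbors an accepted link
                    added = True
        return alignment

    e2f = {(0, 0), (1, 1), (2, 2)}
    f2e = {(0, 0), (1, 1), (2, 3)}
    print(sorted(symmetrize(e2f, f2e)))    # [(0, 0), (1, 1), (2, 2), (2, 3)]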
Introduction
In this work, we show that by changing the way the word alignment models are trained and
Introduction
We present extensive experimental results evaluating a new training scheme for unsupervised word alignment models: an extension of the Expectation Maximization algorithm that allows effective injection of additional information about the desired alignments into the unsupervised training process.
Phrase-based machine translation
We then train the competing alignment models and compute competing alignments using different decoding schemes.
Statistical word alignment
2.1 Baseline word alignment models
Statistical word alignment
Figure 1 illustrates the mapping between the usual HMM notation and the HMM alignment model.
Statistical word alignment
All word alignment models we consider are normally trained using the Expectation Maximization
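As a concrete illustration of this EM training, here is a minimal sketch of IBM Model 1, the simplest of these models; the toy corpus and uniform initialization are illustrative assumptions, not the paper's setup.

    from collections import defaultdict

    def train_model1(corpus, iterations=10):
        """EM training of IBM Model 1 lexical probabilities t(f|e).
        corpus: list of (source_words, target_words) sentence pairs."""
        tgt_vocab = {f for _, fs in corpus for f in fs}
        # uniform initialization over the target vocabulary
        t = defaultdict(lambda: 1.0 / len(tgt_vocab))
        for _ in range(iterations):
            count = defaultdict(float)   # expected counts c(f, e)
            total = defaultdict(float)   # expected counts c(e)
            for es, fs in corpus:        # E-step
                for f in fs:
                    z = sum(t[(f, e)] for e in es)
                    for e in es:
                        p = t[(f, e)] / z   # posterior P(a_j = i | f, e)
                        count[(f, e)] += p
                        total[e] += p
            for (f, e) in count:         # M-step: normalize expected counts
                t[(f, e)] = count[(f, e)] / total[e]
        return t

    corpus = [(["the", "house"], ["la", "maison"]),
              (["the", "book"], ["le", "livre"]),
              (["the"], ["la"])]
    t = train_model1(corpus)
    print(round(t[("maison", "house")], 3))  # high: "la" is explained by "the"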
alignment model is mentioned in 17 sentences in this paper.
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro
Abstract
This study proposes a word alignment model based on a recurrent neural network (RNN), in which an unlimited alignment history is represented by recurrently connected hidden layers.
Abstract
Our alignment model is directional, similar to the generative IBM models (Brown et al., 1993).
Introduction
the HMM alignment model and achieved state-of-the-art performance.
Introduction
We assume that this property would fit with a word alignment task, and we propose an RNN-based word alignment model.
Introduction
The NN-based alignment models are supervised models.
Related Work
Various word alignment models have been proposed.
Related Work
2.1 Generative Alignment Model
Related Work
2.2 FFNN-based Alignment Model
alignment model is mentioned in 25 sentences in this paper.
DeNero, John and Macherey, Klaus
Experimental Results
We trained the model on a portion of FBIS data that has been used previously for alignment model evaluation (Ayan and Dorr, 2006; Haghighi et al., 2009; DeNero and Klein, 2010).
Introduction
This result is achieved by embedding two directional HMM-based alignment models into a larger bidirectional graphical model.
Introduction
Moreover, the bidirectional model enforces a one-to-one phrase alignment structure, similar to the output of phrase alignment models (Marcu and Wong, 2002; DeNero et al., 2008), unsupervised inversion transduction grammar (ITG) models (Blunsom et al., 2009), and supervised ITG models (Haghighi et al., 2009; DeNero and Klein, 2010).
Model Definition
Our model contains two directional hidden Markov alignment models, which we review in Section 2.1, along with additional structure that we introduce in Section 2.2.
Model Definition
2.1 HMM-Based Alignment Model
Model Definition
This section describes the classic hidden Markov model (HMM) based alignment model (Vogel et al., 1996).
Related Work
In addition, supervised word alignment models often use the output of directional unsupervised aligners as features or pruning signals.
Related Work
This approach to jointly learning two directional alignment models yields state-of-the-art unsupervised performance.
alignment model is mentioned in 15 sentences in this paper.
Visweswariah, Karthik and Khapra, Mitesh M. and Ramanathan, Ananthakrishnan
Generating reference reordering from parallel sentences
Complementing this model, we build an alignment model (P(a | w_s, w_t, π_s, π_t)) that scores alignments a given the source and target sentences and their predicted reorderings according to the source and target reordering models.
Generating reference reordering from parallel sentences
The model (P(π_s | w_s, w_t, a)) helps to produce better reference reorderings for training our final reordering model given fixed machine alignments, and the alignment model (P(a | w_s, w_t, π_s, π_t)) helps improve the machine alignments by taking into account information from the reordering models.
Generating reference reordering from parallel sentences
Step 2: Feed predictions of the reordering models to the alignment model
alignment model is mentioned in 14 sentences in this paper.
Deng, Yonggang and Xu, Jia and Gao, Yuqing
A Generic Phrase Training Procedure
We first train word alignment models and will use them to evaluate the goodness of a phrase and a phrase pair.
A Generic Phrase Training Procedure
Beginning with a flat lexicon, we train the IBM Model-1 word alignment model with 10 iterations for each translation direction.
A Generic Phrase Training Procedure
We then train HMM word alignment models (Vogel et al., 1996) in two directions simultaneously by merging statistics collected in the
Features
All these features are data-driven and defined based on models, such as a statistical word alignment model or a language model.
Features
In a statistical generative word alignment model (Brown et al., 1993), it is assumed that (i) a random variable a specifies how each target word f_j is generated by (therefore aligned to) a source word e_{a_j}; and (ii) the likelihood function f(f, a | e) specifies a generative procedure from the source sentence to the target sentence.
Features
This distribution is applicable to all word alignment models that follow assumptions (i) and (ii).
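For Model 1 in particular, assumptions (i) and (ii) yield the familiar closed-form link posterior (a standard result, stated here for reference rather than quoted from the paper), with t(f_j | e_i) the lexical translation table and index 0 the NULL word:

    \[
    P(a_j = i \mid \mathbf{f}, \mathbf{e}) \;=\; \frac{t(f_j \mid e_i)}{\sum_{i'=0}^{I} t(f_j \mid e_{i'})}
    \]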
Introduction
We employ features based on word alignment models and alignment matrix.
alignment model is mentioned in 20 sentences in this paper.
Liu, Kang and Xu, Liheng and Zhao, Jun
Abstract
In contrast, alignment-based methods used a word alignment model to fulfill this task, which could avoid parsing errors since no parsing is involved.
Abstract
We further combine syntactic patterns with the alignment model by using a partially supervised framework and investigate whether this combination is useful or not.
Introduction
Nevertheless, we notice that the alignment model is a statistical model which needs sufficient data to estimate parameters.
Introduction
To answer these questions, in this paper we adopt a unified framework to extract opinion targets from reviews, in the key component of which we vary the method between syntactic patterns and the alignment model.
Introduction
Furthermore, this paper naturally addresses another question: is it useful for opinion target extraction when we combine syntactic patterns and the word alignment model into a unified model?
Opinion Target Extraction Methodology
In the first component, we respectively use syntactic patterns and unsupervised word alignment model (WAM) to capture opinion relations.
Opinion Target Extraction Methodology
In addition, we employ a partially supervised word alignment model (PSWAM) to incorporate syntactic information into WAM.
Opinion Target Extraction Methodology
3.1.2 Unsupervised Word Alignment Model
Related Work
(Liu et al., 2013) extend Liu’s method, which is similar to our method and also used a partially supervised alignment model to extract opinion targets from reviews.
alignment model is mentioned in 32 sentences in this paper.
Wang, Mengqiu and Che, Wanxiang and Manning, Christopher D.
Abstract
We observe that NER label information can be used to correct alignment mistakes, and present a graphical model that performs bilingual NER tagging jointly with word alignment, by combining two monolingual tagging models with two unidirectional alignment models.
Experimental Setup
directional HMM models as our baseline and monolingual alignment models.
Introduction
To capture this source of information, we present a novel extension that combines the BI-NER model with two unidirectional HMM-based alignment models, and perform joint decoding of NER and word alignments.
Introduction
The new model (denoted as BI-NER-WA) factors over five components: one NER model and one word alignment model for each language, plus a joint NER-alignment model which not only enforces NER label agreements but also facilitates message passing among the other four components.
Joint Alignment and NER Decoding
Most commonly used alignment models, such as the IBM models and HMM-based aligner, are unsupervised learners, and can only capture simple distortion features and lexical translational features due to the high complexity of the structure prediction space.
Joint Alignment and NER Decoding
We name the Chinese-to-English aligner model m(B^e) and the reverse directional model n(B^f); B^e is a matrix that holds the output of the Chinese-to-English aligner.
Joint Alignment and NER Decoding
In our experiments, we used two HMM-based alignment models.
alignment model is mentioned in 10 sentences in this paper.
Yang, Nan and Liu, Shujie and Li, Mu and Zhou, Ming and Yu, Nenghai
Abstract
We describe in detail how we adapt and extend the CD-DNN-HMM (Dahl et al., 2012) method introduced in speech recognition to the HMM-based word alignment model, in which bilingual word embedding is discriminatively learnt to capture lexical translation information, and surrounding words are leveraged to model context information in bilingual sentences.
Conclusion
Secondly, we want to explore the possibility of unsupervised training of our neural word alignment model, without reliance on the alignment results of other models.
DNN for word alignment
Our DNN word alignment model extends the classic HMM word alignment model (Vogel et al., 1996).
DNN for word alignment
In the classic HMM word alignment model, context is not considered in the lexical translation probability.
DNN for word alignment
Vocabulary V of our alignment model consists of a source vocabulary V_e and a target vocabulary V_f.
Experiments and Results
In future we would like to explore whether our method can improve other word alignment models.
Experiments and Results
embeddings trained by our word alignment model.
Training
As we do not have a large manually word-aligned corpus, we use traditional word alignment models such as HMM and IBM Model 4 to generate word alignments on a large parallel corpus.
Training
Tunable parameters in the neural network alignment model include: word embeddings in the lookup table LT, parameters W_l, b_l for the linear transformations in the hidden layers of the neural network, and distortion parameters s_d of the jump distance.
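A toy numpy sketch of this parameterization may make the excerpt concrete; the dimensions, the random initialization, and the jump-window size are assumptions for illustration, not the paper's settings.

    import numpy as np

    rng = np.random.default_rng(0)
    V, d, h = 100, 8, 16              # toy vocab size, embedding dim, hidden dim

    LT = rng.normal(size=(V, d))      # lookup table: one embedding per word
    W1 = rng.normal(size=(h, 2 * d))  # linear transformation of a hidden layer
    b1 = np.zeros(h)
    w2 = rng.normal(size=h)           # output layer
    s_d = rng.normal(size=11)         # distortion scores for jumps -5..+5

    def lexical_score(src_id, tgt_id):
        """Embed the word pair and apply one tanh hidden layer."""
        x = np.concatenate([LT[src_id], LT[tgt_id]])
        return float(w2 @ np.tanh(W1 @ x + b1))

    def jump_score(prev_i, i):
        """Distortion score for the jump distance, clipped to [-5, 5]."""
        return float(s_d[np.clip(i - prev_i, -5, 5) + 5])

    # score of aligning target word 7 to source word 3, having come from source 2
    print(lexical_score(3, 7) + jump_score(2, 3))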
alignment model is mentioned in 9 sentences in this paper.
Jiang, Long and Yang, Shiquan and Zhou, Ming and Liu, Xiaohua and Zhu, Qingsheng
Abstract
Specifically, given a web page, the method contains four steps: 1) preprocessing: parse the web page into a DOM tree and segment the inner text of each node into snippets; 2) seed mining: identify potential translation pairs (seeds) using a word-based alignment model which takes both translation and transliteration into consideration; 3) pattern learning: learn generalized patterns with the identified seeds; 4) pattern based mining: extract all bilingual data in the page using the learned patterns.
Adaptive Pattern-based Bilingual Data Mining
In this step, every adjacent snippet pair in different languages will be checked by an alignment model to see if it is a potential translation pair.
Adaptive Pattern-based Bilingual Data Mining
The alignment model combines a translation and a transliteration model to compute the likelihood of a bilingual snippet pair being a translation pair.
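A rough sketch of such a combined scorer; the toy dictionary, the max-combination, and the edit-distance transliteration proxy are illustrative assumptions rather than the paper's actual models.

    from difflib import SequenceMatcher

    # Hypothetical toy translation dictionary with probabilities.
    TRANS = {("hello", "nihao"): 0.8, ("world", "shijie"): 0.7}

    def translit_score(src, tgt):
        """Crude transliteration proxy: surface similarity of romanized forms."""
        return SequenceMatcher(None, src, tgt).ratio()

    def link_score(s, t):
        """Each link is explained by translation OR transliteration,
        so take the better of the two scores."""
        return max(TRANS.get((s, t), 0.0), translit_score(s, t))

    def pair_likelihood(src_snippet, tgt_snippet):
        """Average best-link score per source word, as a rough likelihood
        that the snippet pair is a translation pair."""
        return sum(max(link_score(s, t) for t in tgt_snippet)
                   for s in src_snippet) / len(src_snippet)

    print(pair_likelihood(["hello", "world"], ["nihao", "shijie"]))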
Experimental Results
In Table 3, “Without pattern” means that we simply treat those seed pairs found by the alignment model as final bilingual data.
Introduction
2) Seed mining: identify potential translation pairs (seeds) using an alignment model which takes both translation and transliteration into consideration;
Overview of the Proposed Approach
The seed mining module receives the inner text of each selected tree node and uses a word-based alignment model to identify potential translation pairs.
Overview of the Proposed Approach
The alignment model can handle both translation and transliteration in a unified framework.
alignment model is mentioned in 7 sentences in this paper.
Liu, Shujie and Li, Chi-Ho and Zhou, Ming
Abstract
On top of the pruning framework, we also propose a discriminative ITG alignment model using hierarchical phrase pairs, which improves both F-score and Bleu score over the baseline alignment system of GIZA++.
Evaluation
The IBM Model 1 and HMM alignment models are re-implemented, as they are required by the three ITG pruning methods.
Introduction
(2009) do pruning based on the probabilities of links from a simpler alignment model (viz.
The DPDI Framework
The simpler alignment model we used is HMM.
The DPDI Framework
The four links are produced by some simpler alignment model like HMM.
The DPDI Framework
#links_incon / (f_len + e_len), where #links_incon is the number of links which are inconsistent with the phrase pair according to some simpler alignment model (e.g.
alignment model is mentioned in 7 sentences in this paper.
Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel
Conclusion
However, our best system does not apply VB to a single probability model, as we found an appreciable benefit from bootstrapping each model from simpler models, much as the IBM word alignment models are usually trained in succession.
Introduction
As these word-level alignment models restrict the word alignment complexity by requiring each target word to align to zero or one source words, results are improved by aligning both source-to-target as well as target-to-source,
Introduction
Ideally, such a procedure would remedy the deficiencies of word-level alignment models, including the strong restrictions on the form of the alignment, and the strong independence assumption between words.
Phrasal Inversion Transduction Grammar
Our second approach was to constrain the search space using simpler alignment models, which has the further benefit of significantly speeding up training.
Phrasal Inversion Transduction Grammar
First we train a lower level word alignment model, then we place hard constraints on the phrasal alignment space using confident word links from this simpler model.
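One common way to impose such hard constraints, sketched under the assumption that the simpler model exposes link posteriors: keep only confident links and prune any phrasal span that one of them crosses.

    def confident_links(posteriors, threshold=0.9):
        """Keep word links whose posterior under the simpler model is high."""
        return {(i, j) for (i, j), p in posteriors.items() if p >= threshold}

    def consistent(span, links):
        """A phrase pair (i1, i2, j1, j2) is kept only if every confident
        link lies fully inside it or fully outside it."""
        i1, i2, j1, j2 = span
        return all((i1 <= i <= i2) == (j1 <= j <= j2) for (i, j) in links)

    posteriors = {(0, 0): 0.95, (1, 2): 0.97, (2, 1): 0.40}
    links = confident_links(posteriors)
    print(consistent((0, 1, 0, 2), links))   # True: both confident links inside
    print(consistent((0, 0, 0, 2), links))   # False: link (1, 2) crosses the span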
Variational Bayes for ITG
The standard training method for alignment models is the EM algorithm (Brown et al., 1993), which iteratively updates parameters to maximize the likelihood of the data.
alignment model is mentioned in 6 sentences in this paper.
Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut
Experiments
4.3 Integration into Word Alignment Model
Experiments
4.3.1 Modified EM Training of the Word Alignment Models
Experiments
The normal translation probability p_ta(f | e) of the word alignment models is computed with relative frequency estimates.
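Relative-frequency estimation here amounts to normalizing link counts per source word; a minimal sketch with made-up counts.

    from collections import Counter, defaultdict

    # Toy (e, f) word pairs harvested from Viterbi alignments.
    links = [("house", "maison"), ("house", "maison"),
             ("house", "domicile"), ("book", "livre")]

    counts = Counter(links)
    totals = defaultdict(int)
    for (e, f), c in counts.items():
        totals[e] += c

    def p_ta(f, e):
        """Relative-frequency estimate of the translation probability."""
        return counts[(e, f)] / totals[e]

    print(p_ta("maison", "house"))   # 2/3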
alignment model is mentioned in 5 sentences in this paper.
Chang, Yin-Wen and Rush, Alexander M. and DeNero, John and Collins, Michael
Background
A first-order HMM alignment model (Vogel et al., 1996) is an HMM of length I + 1 where the hidden state at position i ∈ [I]_0 is the aligned index j ∈ [J]_0, and the transition score takes into account the previously aligned index j′ ∈ [J]_0. Formally, define the set of possible HMM alignments as X ⊂ {0,1}^(([I]_0 × [J]_0) ∪ ([I] × [J]_0 × [J]_0)) with
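A compact Viterbi decoder for a first-order HMM aligner of this kind; the random log-score tables are stand-ins for real emission and transition models, and index 0 plays the role of NULL.

    import numpy as np

    def viterbi_align(emit, trans):
        """emit[i, j]: log score of target word j aligning to source index i
        (i = 0 is NULL); trans[i_prev, i]: log transition score.
        Returns the best source index for each target position."""
        I, J = emit.shape
        delta = np.full((J, I), -np.inf)
        back = np.zeros((J, I), dtype=int)
        delta[0] = emit[:, 0]                       # flat start distribution
        for j in range(1, J):
            scores = delta[j - 1][:, None] + trans  # rows: prev i, cols: i
            back[j] = scores.argmax(axis=0)
            delta[j] = scores.max(axis=0) + emit[:, j]
        path = [int(delta[J - 1].argmax())]
        for j in range(J - 1, 0, -1):               # follow backpointers
            path.append(int(back[j][path[-1]]))
        return path[::-1]

    rng = np.random.default_rng(1)
    I, J = 4, 3                                     # 3 source words + NULL
    print(viterbi_align(rng.normal(size=(I, J)), rng.normal(size=(I, I))))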
Bidirectional Alignment
The directional bias of the e→f and f→e alignment models may cause them to produce differing alignments.
Bidirectional Alignment
In this work, we instead consider a bidirectional alignment model that jointly considers both directional models.
Conclusion
We have introduced a novel Lagrangian relaxation algorithm for a bidirectional alignment model that uses incremental constraint addition and coarse-to-fine pruning to find exact solutions.
Experiments
Our experimental results compare the accuracy and optimality of our decoding algorithm to directional alignment models and previous work on this bidirectional model.
alignment model is mentioned in 5 sentences in this paper.
Neubig, Graham and Watanabe, Taro and Mori, Shinsuke and Kawahara, Tatsuya
Alignment Methods
These may be words in word-based alignment models or single characters in character-based alignment models. We define our alignment as a, where each element is a span a_k = ⟨s, t, u, v⟩ indicating that the target string e_s, …
Alignment Methods
The most well-known and widely-used models for bitext alignment are for one-to-many alignment, including the IBM models (Brown et al., 1993) and HMM alignment model (Vogel et al., 1996).
Introduction
One barrier to applying many-to-many alignment models to character strings is training cost.
Introduction
Secondly, we describe a method to seed the search process using counts of all substring pairs in the corpus to bias the phrase alignment model.
Related Work on Data Sparsity in SMT
Sparsity causes trouble for alignment models, both in the form of incorrectly aligned uncommon words, and in the form of garbage collection, where uncommon words in one language are incorrectly aligned to large segments of the sentence in the other language (Och and Ney, 2003).
alignment model is mentioned in 5 sentences in this paper.
Ling, Wang and Xiang, Guang and Dyer, Chris and Black, Alan and Trancoso, Isabel
Parallel Data Extraction
The number of parallel messages is estimated by running our alignment model and checking if τ > φ, where φ was set empirically initially and optimized after obtaining annotated data, which will be detailed in 5.1.
Parallel Data Extraction
Finally, we run our alignment model described in section 3, and obtain the parallel segments and their scores, which measure how likely those segments are parallel.
Parallel Segment Retrieval
Then, we would use a word alignment model (Brown et al., 1993; Vogel et al., 1996), with source s = s_u, …
Parallel Segment Retrieval
Firstly, word alignment models generally attribute higher probabilities to smaller segments, since these are the result of a smaller product chain of probabilities.
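The bias is easy to reproduce: a product of per-word probabilities shrinks geometrically with segment length, so raw scores favor short segments, which is why per-word (length-normalized) log-probabilities are often compared instead. A tiny illustration with made-up probabilities.

    import math

    # Assumed per-word link probabilities for two candidate segment pairs.
    short_pair = [0.3]                    # 1 word, weak link
    long_pair = [0.6, 0.6, 0.6, 0.6]      # 4 words, strong links

    def per_word_logprob(ps):
        return sum(map(math.log, ps)) / len(ps)

    # Raw products prefer the short, weaker pair ...
    print(math.prod(short_pair), math.prod(long_pair))     # 0.3 vs 0.1296
    # ... while length-normalized scores prefer the longer, stronger one.
    print(per_word_logprob(short_pair), per_word_logprob(long_pair))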
alignment model is mentioned in 4 sentences in this paper.
Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej
Conclusions
This may suggest that adding shallow semantic information is more effective than introducing complex structured constraints, at least for the specific word alignment model we experimented with in this work.
Introduction
Compared to the previous work, our latent alignment model improves the result on a benchmark dataset by a wide margin — the mean average precision (MAP) and mean reciprocal rank (MRR) scores are increased by 25.6% and 18.8%, respectively.
Introduction
Second, while the latent alignment model performs better than unstructured models, the difference diminishes after adding the enhanced lexical semantics information.
Introduction
This may suggest that compared to introducing complex structured constraints, incorporating shallow semantic information is both more effective and computationally inexpensive in improving the performance, at least for the specific word alignment model tested in this work.
alignment model is mentioned in 4 sentences in this paper.
Neubig, Graham and Watanabe, Taro and Sumita, Eiichiro and Mori, Shinsuke and Kawahara, Tatsuya
Experimental Evaluation
This is the first reported result in which an unsupervised phrase alignment model has built a phrase table directly from model probabilities and achieved results that compare to heuristic phrase extraction.
Hierarchical ITG Model
Previous research has used a variety of sampling methods to learn Bayesian phrase based alignment models (DeNero et al., 2008; Blunsom et al., 2009; Blunsom and Cohn, 2010).
Introduction
The model is similar to previously proposed phrase alignment models based on inversion transduction grammars (ITGs) (Cherry and Lin, 2007; Zhang et al., 2008; Blunsom et al., 2009), with one important change: ITG symbols and phrase pairs are generated in the opposite order.
alignment model is mentioned in 3 sentences in this paper.
Sun, Jun and Zhang, Min and Tan, Chew Lim
Substructure Spaces for BTKs
5 Alignment Model
Substructure Spaces for BTKs
Given feature spaces defined in the last two sections, we propose a 2-phase subtree alignment model as follows:
Substructure Spaces for BTKs
In order to evaluate the effectiveness of the alignment model and its capability in the applications requiring syntactic translational equivalences, we employ two corpora to carry out the subtree alignment evaluation.
alignment model is mentioned in 3 sentences in this paper.
Riesa, Jason and Marcu, Daniel
Experiments
We align the same core subset with our trained hypergraph alignment model, and extract a second set of translation rules.
Introduction
Generative alignment models like IBM Model-4 (Brown et al., 1993) have been in wide use for over 15 years, and while not perfect (see Figure 1), they are completely unsupervised, requiring no annotated training data to learn alignments that have powered many current state-of-the-art translation systems.
Introduction
We present in this paper a discriminative alignment model trained on relatively little data, with a simple, yet powerful hierarchical search procedure.
alignment model is mentioned in 3 sentences in this paper.
Jiampojamarn, Sittichai and Kondrak, Grzegorz
Background
The following constraints on links are assumed by some or all alignment models:
Background
We refer to an alignment model that assumes all three constraints as a pure one-to-one (1-1) model.
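The three constraints themselves are not spelled out in these excerpts; a checker for one plausible trio (one-to-one links, monotonicity, full coverage — an assumption, not necessarily the paper's exact list) looks like this.

    def is_pure_one_to_one(links, n_src, n_tgt):
        """Check a link set against three constraints often assumed by
        1-1 alignment models (assumed trio: 1-1, monotone, full coverage)."""
        src = [i for i, _ in links]
        tgt = [j for _, j in links]
        one_to_one = len(set(src)) == len(src) and len(set(tgt)) == len(tgt)
        ordered = sorted(links)
        monotone = all(a[1] < b[1] for a, b in zip(ordered, ordered[1:]))
        coverage = (set(src) == set(range(n_src))
                    and set(tgt) == set(range(n_tgt)))
        return one_to_one and monotone and coverage

    print(is_pure_one_to_one({(0, 0), (1, 1), (2, 2)}, 3, 3))  # True
    print(is_pure_one_to_one({(0, 1), (1, 0)}, 2, 2))          # False: crossing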
Extrinsic evaluation
The TiMBL L2P generation method (Table 2) is applicable only to the 1-1 alignment models.
alignment model is mentioned in 3 sentences in this paper.
Liu, Kang and Xu, Liheng and Zhao, Jun
Experiments
They employed a word alignment model to capture opinion relations among words, and then used a random walking algorithm to extract opinion targets.
Introduction
They have investigated a series of techniques to enhance opinion relations identification performance, such as nearest neighbor rules (Liu et al., 2005), syntactic patterns (Zhang et al., 2010; Popescu and Etzioni, 2005), word alignment models (Liu et al., 2012; Liu et al., 2013b; Liu et al., 2013a), etc.
Related Work
(Liu et al., 2012; Liu et al., 2013a; Liu et al., 2013b) employed a word alignment model to capture opinion relations rather than syntactic parsing.
alignment model is mentioned in 3 sentences in this paper.
Hall, David and Klein, Dan
Experiments
This task highlights the evolution and alignment models.
Introduction
Finally, an alignment model maps the flat word lists to cognate groups.
Introduction
Inference requires a combination of message-passing in the evolutionary model and iterative bipartite graph matching in the alignment model .
alignment model is mentioned in 3 sentences in this paper.
Yao, Xuchen and Van Durme, Benjamin
Graph Features
Thus, assuming there is an alignment model that is able to tell how likely a relation maps to the original question, we add extra alignment-based features for the incoming and outgoing relation of each node.
Graph Features
We describe such an alignment model in § 5.
Relation Mapping
Since the relations on one side of these pairs are not natural sentences, we ran the simplest IBM alignment model, Model 1 (Brown et al., 1993), to estimate the translation probability with GIZA++ (Och and Ney, 2003).
alignment model is mentioned in 3 sentences in this paper.