A Constrained Viterbi Relaxation for Bidirectional Word Alignment

Bidirectional models of word alignment are an appealing alternative to post-hoc combinations of directional word aligners.

Word alignment is a critical first step for building statistical machine translation systems.

The focus of this work is on the word alignment decoding problem.

The directional bias of the e→f and f→e alignment models may cause them to produce differing alignments.

Enforcing full agreement can be too strict an alignment criterion.

In general, it can be shown that Lagrangian relaxation is only guaranteed to solve a linear programming relaxation of the underlying combinatorial problem.

Reintroducing constraints can lead to an exponential blowup in the search space of the Viterbi algorithm.

The most common techniques for bidirectional alignment are post-hoc combinations.

Our experimental results compare the accuracy and optimality of our decoding algorithm to directional alignment models and previous work on this bidirectional model.

We have introduced a novel Lagrangian relaxation algorithm for a bidirectional alignment model that uses incremental constraint addition and coarse-to-fine pruning to find exact solutions.

Appears in 15 sentences as: word aligners (2) Word Alignment (1) Word alignment (1) word alignment (12) word alignments (1)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- Bidirectional models of word alignment are an appealing alternative to post-hoc combinations of directional word aligners.Page 1, “Abstract”
- Word alignment is a critical first step for building statistical machine translation systems.Page 1, “Introduction”
- In order to ensure accurate word alignments, most systems employ a post-hoc symmetrization step to combine directional word aligners, such as IBM Model 4 (Brown et al., 1993) or hidden Markov model (HMM) based aligners (Vogel et al., 1996).Page 1, “Introduction”
- We begin in Section 2 by formally describing the directional word alignment problem.Page 1, “Introduction”
- The focus of this work is on the word alignment decoding problem.Page 2, “Background”
- Before turning to the model of interest, we first introduce directional word alignment.Page 2, “Background”
- 2.1 Word AlignmentPage 2, “Background”
- In the e→f word alignment problem, each word in e is aligned to a word in f or to the null word ε.Page 2, “Background”
- We refer to a single word alignment as a link.Page 2, “Background”
- Cromieres and Kurohashi (2009) use belief propagation on a factor graph to train and decode a one-to-one word alignment problem.Page 7, “Related Work”
- (2008) use posterior regularization to constrain the posterior probability of the word alignment problem to be symmetric and bijective.Page 7, “Related Work”
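
The e→f formulation above can be made concrete with a tiny sketch (our own illustration, not the paper's code): an alignment is a vector giving, for each e-position, the f-position it links to, with a reserved index for the null word ε.

```python
# Toy illustration (not from the paper): an e->f alignment maps each
# position i in e to a position j in f, or to the null word (index 0 here).
def links(alignment):
    """Return the set of non-null links (i, j) for an e->f alignment.

    `alignment[i]` is the f-index aligned to e-position i+1, with 0 as null.
    Word positions are 1-based to leave index 0 free for the null word.
    """
    return {(i, j) for i, j in enumerate(alignment, start=1) if j != 0}

# e has 3 words; word 1 links to f-word 2, word 2 is null-aligned, word 3 to f-word 1.
print(sorted(links([2, 0, 1])))  # [(1, 2), (3, 1)]
```

Each such link (i, j) is what the excerpts above call "a single word alignment".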


Appears in 14 sentences as: Viterbi (14)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- The relaxation can be solved with a modified version of the Viterbi algorithm.Page 1, “Abstract”
- Our decoder uses a simple variant of the Viterbi algorithm for solving a relaxed version of this model.Page 1, “Introduction”
- It is based on a novel relaxation for the model of DeNero and Macherey (2011), solvable with a variant of the Viterbi algorithm.Page 1, “Introduction”
- Note that for both of these models we can solve the optimization problem exactly using the standard Viterbi algorithm for HMM decoding.Page 3, “Background”
- We then use the standard Viterbi update for computing the α variables, adding in the score of the y(i, j) necessary to satisfy the constraints.Page 4, “Bidirectional Alignment”
- The key algorithmic idea is to extend the Viterbi algorithm in order to consider possible adjacent links in the reverse direction.Page 4, “Adjacent Agreement”
- Despite the new constraints, we can still compute L(λ) in O(IJ(I + J)) time using a variant of the Viterbi algorithm.Page 5, “Adjacent Agreement”
- Figure 5: Modified Viterbi algorithm for computing the adjacent agreement L(λ).Page 5, “Adjacent Agreement”
- In order to compute the arg max of this optimization problem, we need to satisfy the constraints within the Viterbi algorithm.Page 6, “Adding Back Constraints”
- We augment the Viterbi chart with a count vector d ∈ D, where d(i, j) is a count for the (i, j)’th constraint, i.e.Page 6, “Adding Back Constraints”
- Figure 6 shows this constrained Viterbi relaxation approach.Page 6, “Adding Back Constraints”
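
For context, the *standard* Viterbi recursion that these excerpts extend can be sketched as follows. This is a generic first-order decoder over hypothetical log-score tables `emis` and `trans`, not the paper's constrained variant with count vectors:

```python
# A minimal sketch of standard Viterbi decoding for a first-order HMM
# aligner (scores are additive log-scores; the tables are hypothetical).
def viterbi_align(emis, trans):
    """Best state sequence for a first-order HMM alignment.

    emis:  I x (J+1) emission scores (state j for e-position i; j=0 is null).
    trans: (J+1) x (J+1) transition scores from previous state jp to state j.
    Returns (best score, best alignment as a list of f-indices).
    """
    I, S = len(emis), len(emis[0])
    chart = [emis[0][:]]               # chart[i][j]: best score ending in state j
    back = []                          # backpointers for path recovery
    for i in range(1, I):
        row, brow = [], []
        for j in range(S):
            jp = max(range(S), key=lambda k: chart[i - 1][k] + trans[k][j])
            row.append(chart[i - 1][jp] + trans[jp][j] + emis[i][j])
            brow.append(jp)
        chart.append(row)
        back.append(brow)
    # Recover the argmax path by following backpointers from the best end state.
    j = max(range(S), key=lambda k: chart[-1][k])
    best, path = chart[-1][j], [j]
    for brow in reversed(back):
        j = brow[j]
        path.append(j)
    return best, path[::-1]
```

This runs in O(I·J²) for I e-positions and J+1 states; the paper's modified versions add penalty terms and constraint counts on top of the same chart structure.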


Appears in 11 sentences as: sentence pair (2) sentence pairs (9)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- The algorithm finds provably exact solutions on 86% of sentence pairs and shows improvements over directional models.Page 1, “Abstract”
- Empirically, it is able to find exact solutions on 86% of sentence pairs and is significantly faster than general-purpose solvers.Page 1, “Introduction”
- Following past work, the first 150 sentence pairs of the training section are used for evaluation.Page 7, “Experiments”
- With incremental constraints and pruning, we are able to solve over 86% of sentence pairs including many longer and more difficult pairs.Page 8, “Experiments”
- Table 3: The average number of constraints added for sentence pairs where Lagrangian relaxation is not able to find an exact solution.Page 9, “Experiments”
- Table 3 gives the average number of constraints added for sentence pairs for which Lagrangian relaxation alone does not produce a certificate.Page 9, “Experiments”
- Figure 7(a) shows the average over all sentence pairs of the best dual and best primal scores.Page 9, “Experiments”
- On average, the pruning reduces the search space of each sentence pair to 20% of the initial search space after 200 iterations.Page 9, “Experiments”
- optimal score, averaged over all sentence pairs.Page 9, “Conclusion”
- The plot shows an average over all sentence pairs.Page 9, “Conclusion”
- We can construct a sentence pair in which I = J = N and ε-alignments have infinite cost.Page 9, “Conclusion”


Appears in 5 sentences as: alignment model (3) alignment models (2)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- A first-order HMM alignment model (Vogel et al., 1996) is an HMM of length I + 1 where the hidden state at position i ∈ [I]₀ is the aligned index j ∈ [J]₀, and the transition score takes into account the previously aligned index j′ ∈ [J]₀. Formally, define the set of possible HMM alignments as X ⊂ {0,1}^(([I]₀×[J]₀) ∪ ([I]×[J]₀×[J]₀)) withPage 2, “Background”
- The directional bias of the e→f and f→e alignment models may cause them to produce differing alignments.Page 3, “Bidirectional Alignment”
- In this work, we instead consider a bidirectional alignment model that jointly considers both directional models.Page 3, “Bidirectional Alignment”
- Our experimental results compare the accuracy and optimality of our decoding algorithm to directional alignment models and previous work on this bidirectional model.Page 7, “Experiments”
- We have introduced a novel Lagrangian relaxation algorithm for a bidirectional alignment model that uses incremental constraint addition and coarse-to-fine pruning to find exact solutions.Page 9, “Conclusion”
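
The Lagrangian-relaxation idea behind joint bidirectional decoding can be sketched in miniature. This is heavily simplified: the two "directional models" here are independent per-link scores rather than the paper's HMMs, and the function names and step size are our own assumptions:

```python
# Toy dual decomposition for agreement: two directional models must agree on
# the link matrix. Each model is just a bag of per-link scores (se, sf),
# far simpler than the paper's HMM decoders.
def agree(se, sf, iters=50, step=0.5):
    """Solve max over x = y of se.x + sf.y by subgradient on the dual."""
    I, J = len(se), len(se[0])
    lam = [[0.0] * J for _ in range(I)]          # one multiplier per link
    for _ in range(iters):
        # Each subproblem decodes independently against penalized scores.
        x = [[1 if se[i][j] + lam[i][j] > 0 else 0 for j in range(J)] for i in range(I)]
        y = [[1 if sf[i][j] - lam[i][j] > 0 else 0 for j in range(J)] for i in range(I)]
        if x == y:                                # agreement: certificate of optimality
            return x
        for i in range(I):                        # subgradient step on the disagreement
            for j in range(J):
                lam[i][j] -= step * (x[i][j] - y[i][j])
    return None                                   # no certificate within the limit
```

When the loop returns without a certificate, the paper's approach is to reintroduce the violated agreement constraints into the (Viterbi) search itself rather than give up, which this toy version does not attempt.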


Appears in 3 sentences as: linear programming (3)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- In general, it can be shown that Lagrangian relaxation is only guaranteed to solve a linear programming relaxation of the underlying combinatorial problem.Page 5, “Adding Back Constraints”
- General linear programming approaches have also been applied to word alignment problems.Page 7, “Related Work”
- (2006) formulate the word alignment problem as a quadratic assignment problem and solve it using an integer linear programming solver.Page 7, “Related Work”


Appears in 3 sentences as: objective function (3)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- Given a sentence e of length |e| = I and a sentence f of length |f| = J, our goal is to find the best bidirectional alignment between the two sentences under a given objective function .Page 2, “Background”
- The HMM objective function f : X → ℝ can be written as a linear function of x.Page 2, “Background”
- Similarly define the objective functionPage 2, “Background”
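
The linearity noted above is easy to state concretely: with x a 0/1 indicator vector over emission links and transition triples, f(x) is a dot product with a score vector θ. A minimal sketch (the dict-based encoding is our own illustration):

```python
def hmm_objective(theta, x):
    """f(x) = sum_k theta[k] * x[k]: the HMM objective is linear in x.

    theta maps each indicator (a link (i, j) or a transition (i, j, j')) to
    its score; x maps the same keys to 0/1. Only active indicators contribute.
    """
    return sum(theta[k] * v for k, v in x.items())

theta = {(1, 2): 1.5, (2, 0): -0.5, (2, 0, 2): 0.25}
x = {(1, 2): 1, (2, 0): 1, (2, 0, 2): 0}
print(hmm_objective(theta, x))  # 1.0
```

Linearity in x is exactly what lets Lagrangian penalty terms λ·x be folded into the scores and absorbed by the Viterbi updates.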


Appears in 3 sentences as: phrase pair (3)

In *A Constrained Viterbi Relaxation for Bidirectional Word Alignment*

- alignment phrase pairPage 8, “Experiments”
- Table 2: Alignment accuracy and phrase pair extraction accuracy for directional and bidirectional models.Page 8, “Experiments”
- AER is alignment error rate and F1 is the phrase pair extraction F1 score.Page 8, “Experiments”
