Index of papers in Proc. ACL 2013 that mention
  • Viterbi
Gormley, Matthew R. and Eisner, Jason
A Branch-and-Bound Algorithm
The mixed integer quadratic program with nonlinear constraints, given in the previous section, maximizes the nonconvex Viterbi EM objective and is NP-hard to solve (Cohen and Smith, 2010).
A Branch-and-Bound Algorithm
The standard approach to optimizing this program is local search by the hard (Viterbi) EM algorithm.
Abstract
Although these techniques do not yet provide a practical solution to our instance of this NP-hard problem, they sometimes find better solutions than Viterbi EM with random restarts, in the same time.
Introduction
Many parameter estimation techniques have been attempted, including expectation-maximization (EM) (Klein and Manning, 2004; Spitkovsky et al., 2010a), contrastive estimation (Smith and Eisner, 2006; Smith, 2006), Viterbi EM (Spitkovsky et al., 2010b), and variational EM (Naseem et al., 2010; Cohen et al., 2009; Cohen and Smith, 2009).
Introduction
Note that this objective is that of hard (Viterbi) EM; we do not marginalize over the parses as in classical EM.
Introduction
Our results demonstrate that our method can obtain higher likelihoods than Viterbi EM with random restarts.
Projections
These correspond to the Viterbi E step and M step, respectively.
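To make the E/M alternation concrete, here is a minimal Python sketch of hard (Viterbi) EM. The decoder `best_parse` and the step extractor `events` are hypothetical stand-ins, not functions from the paper; a real M step would also smooth the counts.

```python
# A minimal sketch of hard (Viterbi) EM, assuming a hypothetical decoder
# best_parse(sentence, theta) and an extractor events(parse) that yields
# the (context, decision) steps used to generate a parse. Neither is from
# the paper; this only illustrates the E/M alternation described above.
from collections import Counter
import math

def viterbi_em(corpus, theta, best_parse, events, n_iters=20):
    for _ in range(n_iters):
        # Viterbi E step: commit to the single best parse of each
        # sentence under the current log-parameters theta.
        parses = [best_parse(sent, theta) for sent in corpus]

        # M step: count-and-normalize within each conditioning context
        # to re-estimate the multinomial parameters from hard counts.
        counts = Counter()
        for parse in parses:
            counts.update(events(parse))          # (context, decision) pairs
        totals = Counter()
        for (ctx, _), c in counts.items():
            totals[ctx] += c
        theta = {(ctx, d): math.log(c / totals[ctx])
                 for (ctx, d), c in counts.items()}
    return theta
```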
Relaxations
Since we optimize the Viterbi EM objective, we directly constrain the counts in the single corpus parse rather than expected counts from a distribution over parses.
The Constrained Optimization Task
We begin by describing how, for our typical model, the Viterbi EM objective can be formulated as a mixed integer quadratic programming (MIQP) problem with nonlinear constraints (Figure 2).
The Constrained Optimization Task
Concretely, (1) specifies the Viterbi EM objective: the total log-probability of the best parse trees under the parameters θ, given by a sum of log-probabilities θ_m of the individual steps needed to generate the tree, as encoded by the features f_m.
The Constrained Optimization Task
Figure 2: Viterbi EM as a mathematical program
Viterbi is mentioned in 24 sentences in this paper.
Ling, Wang and Xiang, Guang and Dyer, Chris and Black, Alan and Trancoso, Isabel
Parallel Segment Retrieval
to calculate the Viterbi alignment a.
Parallel Segment Retrieval
The main bottleneck of the naive algorithm is finding new Viterbi Model 1 word alignments every time we change the spans.
Parallel Segment Retrieval
an iterative approach to compute the Viterbi word alignments for IBM Model 1 using dynamic programming.
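Because IBM Model 1 has no distortion or fertility terms, its Viterbi alignment decomposes per target word, which is what makes recomputing it cheap as spans change. A minimal sketch, assuming a hypothetical translation table `t` mapping (target word, source word) pairs to probabilities:

```python
# Viterbi alignment under IBM Model 1: each target word independently
# picks the source word (or NULL) that maximizes t(f | e). The table
# `t` is a hypothetical dict {(f, e): probability}.
def model1_viterbi_alignment(src, tgt, t, null="<NULL>"):
    ext = [null] + src                    # position 0 is the NULL word
    alignment = []
    for f in tgt:
        best_i = max(range(len(ext)),
                     key=lambda i: t.get((f, ext[i]), 0.0))
        alignment.append(best_i)          # 0 means aligned to NULL
    return alignment
```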
Viterbi is mentioned in 8 sentences in this paper.
Quirk, Chris
Evaluation
The next line shows the fertility HMM with approximate posterior computation from Gibbs sampling but with final alignment selected by the Viterbi algorithm.
Evaluation
The prior work compared Viterbi with a form of local search (sampling repeatedly and keeping the max), finding little difference between the two (Zhao and Gildea, 2010).
Evaluation
Here, however, the difference between a dual decomposition and Viterbi is significant: their results were likely due to search error.
Fertility distribution parameters
Algorithm        AER (G→E)   AER (E→G)
HMM              24.0        21.8
FHMM Viterbi     19.7        19.6
FHMM Dual-dec    18.0        17.4
HMM alignment
argmax_a ( f(a) + Σ_{i,j} u^(k−1)(i,j) y_a(i,j) ), using the standard Viterbi algorithm.
HMM alignment
The algorithm described in Figure 2 finds this maximal assignment in O(IJ log J) time, generally faster than the O(I²J) time used by Viterbi.
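For reference, a minimal sketch of the standard Viterbi decoder for an HMM alignment model, with source positions as states and target positions as time steps, giving the O(I²J) cost noted above. The tables `trans`, `emit`, and `init` are hypothetical log-probability tables, and a dual penalty u^(k−1)(i,j) from the argmax shown earlier could simply be added to `emit[i][j]`:

```python
# Standard Viterbi for an HMM alignment model: I source positions as
# states, J target positions as time steps. trans[p][i], emit[i][j],
# and init[i] are hypothetical log-probability tables. The double loop
# over states gives the O(I^2 J) cost mentioned above.
def viterbi_alignment(I, J, trans, emit, init):
    NEG = float("-inf")
    delta = [[NEG] * I for _ in range(J)]   # best score ending in i at j
    back = [[0] * I for _ in range(J)]      # backpointers
    for i in range(I):
        delta[0][i] = init[i] + emit[i][0]
    for j in range(1, J):
        for i in range(I):
            best_p, best_s = max(
                (delta[j - 1][p] + trans[p][i], p) for p in range(I)
            )
            delta[j][i] = best_p + emit[i][j]
            back[j][i] = best_s
    # Trace back the best alignment a_1 .. a_J.
    a = [max(range(I), key=lambda i: delta[J - 1][i])]
    for j in range(J - 1, 0, -1):
        a.append(back[j][a[-1]])
    return list(reversed(a))
```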
Introduction
Therefore, the standard forward-backward and Viterbi algorithms do not apply.
Viterbi is mentioned in 7 sentences in this paper.
Schoenemann, Thomas
Conclusion
In the E-step we use hillclimbing to get a likely alignment (ideally the Viterbi alignment).
Introduction
Iteration 11 shows that merely 5.14% of the (HMM) probability mass is covered by the Viterbi alignment and its neighbors.
Introduction
Having introduced the model variants, we proceed to derive a hillclimbing method to compute a likely alignment (ideally the Viterbi alignment) and its neighbors.
Introduction
As an aside, the transfer iteration from HMM to IBM3 (iteration 11) reveals that only 5.14% of the probability mass is preserved when using the Viterbi alignment and its neighbors instead of all alignments.
Training the New Variants
As in the training of the deficient IBM-3 and IBM-4 models, we approximate the expectations in the E-step by a set of likely alignments, ideally centered around the Viterbi alignment; but computing it is NP-hard already for the regular deficient variants (Udupa and Maji, 2006).
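A minimal sketch of such alignment hillclimbing, assuming a black-box `score(a)` returning the model log-probability of alignment a (one source position, or 0 for NULL, per target word). The move/swap neighborhood is the classic one for IBM-3/4-style models, sketched here rather than the paper's exact procedure:

```python
# Hillclimbing over alignments with the classic move/swap neighborhood.
# `score(a)` is a hypothetical black-box model log-probability; the loop
# greedily accepts improvements until no neighbor scores higher.
def hillclimb(a, I, score):
    best, best_score = list(a), score(a)
    improved = True
    while improved:
        improved = False
        # Move operations: re-link one target word to another source word.
        for j in range(len(best)):
            for i in range(I + 1):            # 0 .. I, where 0 is NULL
                if i == best[j]:
                    continue
                cand = best[:j] + [i] + best[j + 1:]
                s = score(cand)
                if s > best_score:
                    best, best_score, improved = cand, s, True
        # Swap operations: exchange the links of two target words.
        for j in range(len(best)):
            for k in range(j + 1, len(best)):
                cand = list(best)
                cand[j], cand[k] = cand[k], cand[j]
                s = score(cand)
                if s > best_score:
                    best, best_score, improved = cand, s, True
    return best, best_score
```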
Viterbi is mentioned in 7 sentences in this paper.
Wang, Mengqiu and Che, Wanxiang and Manning, Christopher D.
Joint Alignment and NER Decoding
It is also worth noting that since we decode the alignment models with Viterbi inference, additional constraints such as the neighborhood constraint proposed by DeNero and Macherey (2011) can be easily integrated into our model.
Related Work
Instead of enforcing agreement in the alignment space based on best sequences found by Viterbi, we could opt to encourage agreement between posterior probability distributions, which is related to the posterior regularization work by Graca et al.
Related Work
We also differ in the implementation details, where in their case belief propagation is used in both training and Viterbi inference.
Viterbi is mentioned in 3 sentences in this paper.
Yu, Haonan and Siskind, Jeffrey Mark
The Sentence Tracker
This can be determined with the Viterbi (1967) algorithm and is known as detection-based tracking (Viterbi, 1971).
The Sentence Tracker
max_{q^1, …, q^T} (…), which can also be found by the Viterbi algorithm.
The Sentence Tracker
Eq. 3, which can also be determined using the Viterbi algorithm.
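A minimal sketch of detection-based tracking as Viterbi over a lattice of per-frame detections. The per-detection score `g` and the adjacent-frame coherence score `h` are hypothetical stand-ins for the paper's scoring terms:

```python
# Detection-based tracking: each frame t offers candidate detections;
# Viterbi picks one per frame, maximizing the sum of per-detection
# scores g(b) and adjacent-frame coherence scores h(b_prev, b).
# g and h are hypothetical scoring functions.
def track(detections, g, h):
    T = len(detections)
    delta = [[g(b) for b in detections[0]]]   # best score ending at each box
    back = [[-1] * len(detections[0])]        # backpointers
    for t in range(1, T):
        row, ptrs = [], []
        for b in detections[t]:
            s, p = max(
                (delta[t - 1][k] + h(prev, b), k)
                for k, prev in enumerate(detections[t - 1])
            )
            row.append(s + g(b))
            ptrs.append(p)
        delta.append(row)
        back.append(ptrs)
    # Recover the best sequence of detections, one per frame.
    k = max(range(len(delta[-1])), key=lambda i: delta[-1][i])
    path = [k]
    for t in range(T - 1, 0, -1):
        k = back[t][k]
        path.append(k)
    path.reverse()
    return [detections[t][k] for t, k in enumerate(path)]
```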
Viterbi is mentioned in 3 sentences in this paper.