Index of papers in Proc. ACL that mention
  • beam search
Ma, Ji and Zhu, Jingbo and Xiao, Tong and Yang, Nan
Abstract
In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron.
Abstract
The proposed solution can also be applied to combine beam search and structured perceptron with other systems that exhibit spurious ambiguity.
Easy-first with beam search
3 Easy-first with beam search
Easy-first with beam search
In this section, we introduce easy-first with beam search in our own notation, which will be used throughout the rest of this paper.
Easy-first with beam search
Following (Huang et al., 2012), in order to formalize beam search, we also use the argtop^s_{y∈Y} w · φ(y) operation, which returns the top s action sequences in Y according to w · φ(y).
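To make the argtop-s operation concrete, here is a minimal Python sketch (not the authors' code): given candidate action sequences with sparse feature vectors φ(y) and a weight vector w, it returns the top s sequences ranked by w · φ(y). The dict-based feature representation is an assumption for illustration.

```python
# Minimal sketch of the argtop-s operation used to formalize beam search:
# return the s action sequences y with the highest linear score w . phi(y).
import heapq

def score(w, phi_y):
    """Linear model score w . phi(y) for one action sequence."""
    return sum(w.get(f, 0.0) * v for f, v in phi_y.items())

def argtop_s(w, candidates, s):
    """candidates: iterable of (y, phi_y) pairs; returns the top-s sequences."""
    return [y for y, _ in heapq.nlargest(s, candidates, key=lambda c: score(w, c[1]))]

w = {"shift": 1.0, "attach": 0.5}
cands = [("y1", {"shift": 2}), ("y2", {"attach": 1}), ("y3", {"shift": 1, "attach": 2})]
print(argtop_s(w, cands, 2))  # ['y1', 'y3']
```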
Easy-first dependency parsing
Algorithm 1: Easy-first with beam search
Introduction
To enlarge the search space, a natural extension to greedy search is beam search.
Introduction
Recent work also shows that beam search together with perceptron-based global learning (Collins, 2002) enables the use of nonlocal features that help improve parsing performance without overfitting (Zhang and Nivre, 2012).
Introduction
Due to these advantages, beam search and global learning have been applied to many NLP tasks (Collins and
beam search is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Choi, Jinho D. and McCallum, Andrew
Abstract
We present a novel approach, called selectional branching, which uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy transition-based dependency parsing approach.
Abstract
Selectional branching is guaranteed to perform fewer transitions than beam search yet performs just as accurately.
Abstract
With the standard setup, our parser shows an unlabeled attachment score of 92.96% and a parsing speed of 9 milliseconds per sentence, which is faster and more accurate than the current state-of-the-art transition-based parser that uses beam search.
Introduction
Greedy transition-based dependency parsing has been widely deployed because of its speed (Cer et al., 2010); however, state-of-the-art accuracies have been achieved by globally optimized parsers using beam search (Zhang and Clark, 2008; Huang and Sagae, 2010; Zhang and Nivre, 2011; Bohnet and Nivre, 2012).
Introduction
Coupled with dynamic programming, transition-based dependency parsing with beam search can be done very efficiently and gives significant improvement to parsing accuracy.
Introduction
One downside of beam search is that it always uses a fixed beam size, even when a smaller beam is sufficient for good results.
Selectional branching
For transition-based parsing, state-of-the-art accuracies have been achieved by parsers optimized on multiple transition sequences using beam search,
Selectional branching
Compared to beam search, which always performs b · t transitions, selectional branching is guaranteed to perform fewer transitions given the same beam size because d ≤ b and e > 0 except for d = 1, in which case no branching happens.
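As a rough illustration of the claim, the toy arithmetic below contrasts the fixed b · t transition count of beam search with a simplified d · t bound for selectional branching; the exact count derived in the paper is more refined, so treat this purely as a sketch of the d ≤ b argument, with all numbers assumed.

```python
# Toy comparison (illustrative only; not the paper's exact derivation).
b, t = 16, 40                 # assumed beam size and transitions per sequence
d = 4                         # sequences actually branched to, with d <= b
beam_transitions = b * t      # beam search always performs b * t transitions
branch_upper_bound = d * t    # simplified bound for selectional branching
print(beam_transitions, branch_upper_bound)   # 640 160
```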
Selectional branching
higher parsing accuracy than the current state-of-the-art transition-based parser using beam search, and performs about 3 times faster.
beam search is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Björkelund, Anders and Kuhn, Jonas
Abstract
We investigate different ways of learning structured perceptron models for coreference resolution when using nonlocal features and beam search .
Introducing Nonlocal Features
In order to keep some options around during search, we extend the best-first decoder with beam search.
Introducing Nonlocal Features
Beam search works incrementally by keeping an agenda of state items.
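A minimal, generic sketch of this agenda-based scheme, with expand and score as assumed problem-specific callbacks:

```python
# Minimal sketch of incremental beam search over an agenda of state items.
def beam_search(initial_state, expand, score, beam_size, n_steps):
    agenda = [initial_state]
    for _ in range(n_steps):
        # expand every item on the agenda, then keep only the best beam_size
        successors = [s for state in agenda for s in expand(state)]
        agenda = sorted(successors, key=score, reverse=True)[:beam_size]
    return max(agenda, key=score)
```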
Introducing Nonlocal Features
The beam search decoder can be plugged into the training algorithm, replacing the calls to arg max.
Introduction
We show that for the task of coreference resolution the straightforward combination of beam search and early update (Collins and Roark, 2004) falls short of more limited feature sets that allow for exact search.
Related Work
(2004) also apply beam search at test time, but use a static assignment of antecedents and learn a log-linear model using batch learning.
beam search is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Wang, Lu and Raghavan, Hema and Castelli, Vittorio and Florian, Radu and Cardie, Claire
Abstract
An innovative beam search decoder is proposed to efficiently find highly probable compressions.
Results
Furthermore, our HEAD-driven beam search method with MULTI-scorer beats all systems on DUC 2006 and all systems on DUC 2007 except the best system in terms of R-2 (p < 0.01).
Results
Four native speakers who are undergraduate students in computer science (none are authors) performed the task. We compare our system based on HEAD-driven beam search with MULTI-scorer to the best systems in DUC 2006 achieving top ROUGE scores (Jagarlamudi et al., 2006) (Best DUC system (ROUGE)) and top linguistic quality scores (Lacatusu et al., 2006) (Best DUC system (LQ)).
Sentence Compression
As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation.
Sentence Compression
A Beam Search Decoder.
Sentence Compression
The beam search decoder (see Algorithm 1) takes as input the sentence's parse tree T = t0t1 …
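A hedged sketch of such a decoder, simplified to iterate over the leaves rather than the full tree structure; lm_score stands in for the paper's trained language model and is an assumption.

```python
# Hedged sketch of a beam-search compression decoder: at each leaf we branch
# on keep/drop and retain the highest-scoring partial compressions.
def compress(leaves, lm_score, beam_size=10):
    beam = [([], 0.0)]                               # (kept words, score)
    for word in leaves:
        expanded = []
        for kept, _ in beam:
            for choice in (kept + [word], kept):     # keep or drop this leaf
                expanded.append((choice, lm_score(choice)))
        beam = sorted(expanded, key=lambda c: c[1], reverse=True)[:beam_size]
    return beam[0][0]                                # best compression found
```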
beam search is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Huang, Zhiheng and Chang, Yi and Long, Bo and Crespo, Jean-Francois and Dong, Anlei and Keerthi, Sathiya and Wu, Su-Lin
Background
The lower bound can be initialized to the best sequence score using a beam search (with beam size being 1).
Background
1: lb ← best score from beam search
2: init lattice
3: for i = 0; ; i++ do
     if i % 2 == 0 then y ← forward() else y ← backward() end if
     if y consists of active labels only then return y
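A minimal Python sketch of the lower-bound initialization in line 1, using a width-1 (greedy) beam pass over a label lattice; the emit and trans score matrices are assumed inputs, not the paper's actual data structures.

```python
# Width-1 beam (greedy) pass over a lattice: the score of any one complete
# label sequence is a valid lower bound for A*-style search.
def greedy_lower_bound(emit, trans):
    # emit[i][y]: score of label y at position i; trans[a][b]: transition score
    prev = max(range(len(emit[0])), key=lambda y: emit[0][y])
    lb = emit[0][prev]
    for i in range(1, len(emit)):
        y = max(range(len(emit[i])), key=lambda k: trans[prev][k] + emit[i][k])
        lb += trans[prev][y] + emit[i][y]
        prev = y
    return lb
```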
Experiments
The beam search decoding is also shown as a baseline.
Experiments
The decoding speed of iterative Viterbi A* can even be comparable to that of beam search.
Proposed Algorithms
The search is similar to the beam search with beam size being 1.
Proposed Algorithms
We initialize the lower bound lb with the k-th best score from beam search (with beam size being k) at line 1.
Proposed Algorithms
Note that the beam search is performed on the original lattice which consists of L active labels per position.
Related work
Apart from the exact decoding, approximate decoding algorithms such as beam search are also related to our work.
beam search is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Hayashi, Katsuhiko and Watanabe, Taro and Asahara, Masayuki and Matsumoto, Yuji
Introduction
To improve parsing flexibility in deterministic parsing, our top-down parser uses a beam search algorithm with dynamic programming (Huang and Sagae, 2010).
Time Complexity
As the n² term for prediction is the most dominant factor, the time complexity of the algorithm is O(n²) and that of the algorithm with beam search is O(n² × b).
Weighted Parsing Model
Algorithm 1 Top-down Parsing with Beam Search. 1: input W = n0, ..., nn
Weighted Parsing Model
As mentioned in Section 1, we apply beam search and Huang and Sagae (2010)’s DP techniques to our top-down parser.
Weighted Parsing Model
Algorithm 1 shows our beam search algorithm, in which the topmost b states are preserved in a buffer buf[i] at each step.
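A minimal sketch of the per-step pruning that buf[i] implements, assuming states expose a score attribute:

```python
def prune_step(states, b):
    # keep the b highest-scoring states for one step of the parse
    return sorted(states, key=lambda s: s.score, reverse=True)[:b]

# buf[i] = prune_step(states_at_step_i, b)
```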
beam search is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Minkov, Einat and Zettlemoyer, Luke
Corporate Acquisitions
Experiments: We applied beam search, where corp tuples are extracted first, and acquisition tuples are constructed using the top scoring corp entities.
Structured Learning
4.2), where beam search is used to find the top scoring candidates efficiently (Sec.
Structured Learning
Algorithm 1: The beam search procedure
Structured Learning
4.3 Beam Search
beam search is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Wu, Yuanbin and Ng, Hwee Tou
Experiments
We compare our ILP approach with two other systems: the beam search decoder of Dahlmeier and Ng (2012a), which achieves the best published performance to date on the HOO 2011 data set, and UI Run1 (Rozovskaya et al., 2011), which achieves the best performance among all participating systems in the HOO 2011 shared task.
Experiments
              P     R     F      P     R     F
UI Run1      40.86 11.21 17.59  54.61 14.57 23.00
Beam search  30.28 19.17 23.48  33.59 20.53 25.48
ILP          20.54 27.93 23.67  21.99 29.04 25.03
Experiments
The results of the beam search decoder and UI Run1 are taken from Table 2 of Dahlmeier and Ng (2012a).
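As a sanity check on the table above, F can be recomputed from the reported precision and recall with F = 2PR / (P + R):

```python
def f1(p, r):
    # harmonic mean of precision and recall, in percentage points
    return 2 * p * r / (p + r)

print(round(f1(30.28, 19.17), 2))   # 23.48, matches the Beam search row
print(round(f1(20.54, 27.93), 2))   # 23.67, matches the ILP row
```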
beam search is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Wuebker, Joern and Ney, Hermann and Zens, Richard
Abstract
In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of language model computations and hypothesis expansions.
Conclusions
This work introduces two extensions to the well-known beam search algorithm for phrase-based machine translation.
Experimental Evaluation
In this section we evaluate the proposed extensions to the original beam search algorithm in terms of scalability and their usefulness for different application constraints.
Introduction
The idea of LM lookahead is to incorporate the LM probabilities into the pruning process of the beam search as early as possible.
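A hedged sketch of LM lookahead: the estimated LM cost of the next word is folded into the hypothesis score before pruning, rather than after the full expansion. The hypothesis shape and the lm_prob callback are assumptions for illustration, not the paper's implementation.

```python
import math
from collections import namedtuple

Hyp = namedtuple("Hyp", ["score", "context"])   # assumed hypothesis shape

def pruning_score(hyp, next_word, lm_prob, weight=1.0):
    # fold the LM probability of the next word into the score used for
    # pruning, instead of waiting for the full hypothesis expansion
    return hyp.score + weight * math.log(lm_prob(hyp.context, next_word))

def prune(hyps, beam_size, key):
    # standard histogram pruning: keep the beam_size best hypotheses
    return sorted(hyps, key=key, reverse=True)[:beam_size]
```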
Search Algorithm Extensions
A beam search strategy is used to find the best hypothesis.
Search Algorithm Extensions
In addition to the source sentence f1^J, the beam search algorithm takes a matrix E(·, ·) as input, where for each contiguous phrase f̃ = f_j …
beam search is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Zhao, Hai and Song, Yan and Kit, Chunyu and Zhou, Guodong
Dependency Parsing: Baseline
4.2 Parsing Using a Beam Search Algorithm
Dependency Parsing: Baseline
We use a beam search algorithm to find the best parsing action sequence.
Evaluation Results
A beam search algorithm with width 5 is used for parsing; otherwise, a simple shift-reduce decoding is used.
Evaluation Results
+d 0.861 0.870 (+d: using three Markovian features preact and beam search decoding)
Treebank Translation and Dependency Transformation
A beam search algorithm is used for this process to find the best path among all the translation options. As the training stage, especially the most time-consuming alignment substage, is skipped, the translation only involves a decoding procedure, which takes about 4.5 hours for about one million words of the PTB on a 2.8 GHz PC.
beam search is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Kushman, Nate and Artzi, Yoav and Zettlemoyer, Luke and Barzilay, Regina
Experimental Setup
Parameters and Solver In our experiments we set k in our beam search algorithm (Section 5) to 200, and l to 20.
Inference
Therefore, we approximate this computation using beam search.
Inference
During learning we compute the second term in the gradient (Equation 2) using our beam search approximation.
Learning
Section 5 describes how we approximate the two terms of the gradient using beam search .
Mapping Word Problems to Equations
We use a beam search inference procedure to approximately compute Equation 1, as described in Section 5.
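A minimal sketch of this kind of beam approximation: the expectation over all derivations is replaced by a softmax-weighted average over the k-best list returned by beam search. The (derivation, score) representation is assumed for illustration.

```python
import math

def approx_expectation(beam, feature):
    """beam: list of (derivation, score) pairs from beam search; returns the
    softmax-weighted average of feature(d), standing in for the full sum."""
    m = max(s for _, s in beam)                      # shift for stability
    z = sum(math.exp(s - m) for _, s in beam)
    return sum(math.exp(s - m) * feature(d) for d, s in beam) / z
```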
beam search is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Socher, Richard and Bauer, John and Manning, Christopher D. and Andrew Y., Ng
Introduction
The subsequent section will then describe a bottom-up beam search and its approximation for finding the optimal tree.
Introduction
One could use a bottom-up beam search, keeping a k-best list at every cell of the chart, possibly for each syntactic category.
Introduction
This beam search inference procedure is still considerably slower than using only the simplified base PCFG, especially since it has a small state space (see next section for
beam search is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Li, Qi and Ji, Heng and Huang, Liang
Introduction
Therefore we employ beam search in decoding, and train the model using the early-update perceptron variant tailored for beam search (Collins and Roark, 2004; Huang et al., 2012).
Joint Framework for Event Extraction
3.1 Structured perceptron with beam search
Joint Framework for Event Extraction
Figure 2 describes the skeleton of the perceptron training algorithm with beam search.
Joint Framework for Event Extraction
In each step of the beam search, if the prefix of the oracle assignment y falls out of the beam, then the top result in the beam is returned for early update.
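A minimal sketch of the early-update rule just described; beam_of, oracle_prefix, and update are assumed callbacks into the decoder and perceptron.

```python
def train_one_example(n_steps, beam_of, oracle_prefix, update):
    # assumes n_steps >= 1; beam_of(k) returns the top-b prefixes of length k
    for k in range(1, n_steps + 1):
        beam = beam_of(k)
        gold = oracle_prefix(k)
        if gold not in beam:          # oracle prefix fell out of the beam:
            update(gold, beam[0])     # early update and stop this example
            return
    if beam[0] != gold:               # standard perceptron update at the end
        update(gold, beam[0])
```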
Related Work
In comparison, our approach is a unified framework based on beam search , which allows us to exploit arbitrary global features efficiently.
beam search is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Stern, Asher and Stern, Roni and Dagan, Ido and Felner, Ariel
Background
Beam search (Furcy and Koenig, 2005; Zhou and Hansen, 2005) limits the number of states stored in OPEN, while Greedy search limits OPEN to contain only the single best state generated in the current iteration.
Background
In the earlier version of BIUTEE (Stern and Dagan, 2011), a version of beam search was incorporated, named hereafter BIUTEE-orig.
Evaluation
For beam search we used k = 150, since higher val-
Evaluation
WA*             7.85 / 8.53    9797 / 9979     1.05 / 10.28
Greedy          0.20 / 0.10     468 / 158      1.10 / 10.55
Pure heuristic  0.09 / 0.10     123 / 167      1.35 / 12.51
Beam search    20.53 / 9.48   43925 / 18992    1.08 / 10.52
BIUTEE-orig     7.86 / 14.61  14749 / 22795    1.03 / 10.28
LLGS            2.76 / 1.72    1722 / 842      0.95 / 10.14
Evaluation
Algorithm        RTE-5 accuracy %   RTE-6 F1 %
Weighted A*      59.50              48.20
Dovetailing WA*  60.83              49.01
Greedy           60.50              48.56
Pure heuristic   60.83              45.70
Beam search      61.33              48.58
BIUTEE-orig      60.67              49.25
LLGS             64.00              49.09
beam search is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Kaji, Nobuhiro and Fujiwara, Yasuhiro and Yoshinaga, Naoki and Kitsuregawa, Masaru
Introduction
Approximate algorithms, such as beam search or island-driven search, have been proposed for speeding up decoding.
Introduction
In our experiments, we use the path located by the left-to-right deterministic decoding (i.e., beam search with a beam width of 1).
Introduction
with beam search, which is the approximate algorithm widely adopted for sequence labeling in NLP.
beam search is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Hatori, Jun and Matsuzaki, Takuya and Miyao, Yusuke and Tsujii, Jun'ichi
Model
In beam search, we use the step index that is associated with each state: the parser states in process are aligned according to the index, and the beam search pruning is applied to those states with the same index.
Model
Consequently, for the beam search to function effectively, all states with the same index must be comparable, and all terminal states should have the same step index.
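A minimal sketch of pruning only among comparable states, bucketing by step index before applying the beam; the step_index and score attributes are assumptions for illustration.

```python
from collections import defaultdict

def prune_by_step(states, beam_size):
    # bucket states by step index: only states with the same index compete
    buckets = defaultdict(list)
    for s in states:
        buckets[s.step_index].append(s)
    pruned = []
    for bucket in buckets.values():
        bucket.sort(key=lambda s: s.score, reverse=True)
        pruned.extend(bucket[:beam_size])
    return pruned
```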
Related Works
However, it requires computationally expensive multiple beams to compare words of different lengths using beam search.
Related Works
Because we found that even an incremental approach with beam search is intractable if we perform word-based decoding, we take a character-based approach to produce our joint model.
beam search is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Özbal, Gözde and Pighin, Daniele and Strapparava, Carlo
Architecture of BRAINSUP
A beam search in the space of all possible lexicalizations of a syntactic pattern promotes the words with the highest likelihood of satisfying the user specification.
Architecture of BRAINSUP
CompatiblePatterns(·) finds the most frequent syntactic patterns in P that are compatible with the user specification, as explained in Section 3.1; FillInPattern(·) carries out the beam search, and returns the best solutions generated for each pattern p given U.
Architecture of BRAINSUP
With the compatible patterns selected, we can initiate a beam search in the space of all possible lexicalizations of the patterns, i.e., the space of all sentences that can be generated by respecting the syntactic constraints encoded by each pattern.
beam search is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Green, Spence and DeNero, John
Inference during Translation Decoding
This decoding problem is NP-hard, so a beam search is often used (Fig.
Inference during Translation Decoding
The beam search relies on three operations, two of which affect the agreement model:
Inference during Translation Decoding
Figure 4: Breadth-first beam search algorithm of Och and Ney (2004).
beam search is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Chen, Wenliang and Zhang, Min and Li, Haizhou
Abstract
In this paper, we present an approach to enriching high-order feature representations for graph-based dependency parsing models using a dependency language model and beam search.
Abstract
Finally, the features are efficiently integrated into the parsing model during decoding using beam search .
Decoding
We use beam search to choose the one having the overall best score as the final parse, where K spans are built at each step (Zhang and Clark, 2008).
Experiments
Since the setting of K (for beam search) affects our parsers, we studied its influence on the development set.
beam search is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Nishikawa, Hitoshi and Hasegawa, Takaaki and Matsuo, Yoshihiro and Kikui, Genichiro
Conclusion
The preferred sequence is determined by using dynamic programming and beam search .
Introduction
Our algorithm efficiently searches for the best sequence of sentences by using dynamic programming and beam search .
Optimizing Sentence Sequence
To alleviate this, we find an approximate solution by adopting the dynamic programming technique of the Held and Karp Algorithm (Held and Karp, 1962) and beam search .
Optimizing Sentence Sequence
Therefore, we adopt the Held and Karp Algorithm and beam search to find approximate solutions.
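A hedged sketch of the combination: Held-and-Karp-style dynamic programming over (visited set, last sentence) states, with beam pruning keeping only the best partial sequences at each size. The pairwise coherence score is an assumed stand-in for the paper's model, and backpointers are omitted for brevity.

```python
def order_sentences(n, coherence, beam_size=100):
    # state: (frozenset of placed sentences, last sentence) -> best score
    states = {(frozenset([i]), i): 0.0 for i in range(n)}
    for _ in range(n - 1):
        nxt = {}
        for (visited, last), sc in states.items():
            for j in range(n):
                if j not in visited:
                    key = (visited | {j}, j)
                    val = sc + coherence(last, j)
                    if val > nxt.get(key, float("-inf")):
                        nxt[key] = val
        # beam pruning: keep only the beam_size best partial sequences
        states = dict(sorted(nxt.items(), key=lambda kv: kv[1], reverse=True)[:beam_size])
    # best final state; recovering the full order needs backpointers (omitted)
    return max(states.items(), key=lambda kv: kv[1])[0]
```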
beam search is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Blunsom, Phil and Cohn, Trevor and Osborne, Miles
Discriminative Synchronous Transduction
Most prior work in SMT, both generative and discriminative, has approximated the sum over derivations by choosing a single ‘best’ derivation using a Viterbi or beam search algorithm.
Discriminative Synchronous Transduction
Here we approximate the sum over derivations directly using a beam search in which we produce a beam of high probability translation sub-strings for each cell in the parse chart.
Discriminative Synchronous Transduction
When the beam search is complete we have a list of translations in the top beam cell spanning the entire source sentence along with their approximated inside derivation scores.
beam search is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Raghavan, Preethi and Fosler-Lussier, Eric and Elhadad, Noémie and Lai, Albert M.
Problem Description
During composition we retain intermediate paths like M3, utilizing the ability to do lazy composition (Mohri and Pereira, 1998), in order to facilitate beam search through the multi-alignment.
Problem Description
However, performing a beam search over the composed WFST in equation 2 allows us to accommodate such constraints across multiple sequences.
Problem Description
The accuracy of the WFST-based representation and beam search across all sequences, using the coreference and temporal relation scores to obtain the combined aligned sequence, is 78.9%.
beam search is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro
RNN-based Alignment Model
Thus, the Viterbi alignment is computed approximately using heuristic beam search.
Training
To reduce computation, we employ NCE, which uses randomly sampled sentences from all target language sentences in Q as e′, and calculates the expected values by a beam search with beam width W to truncate alignments with low scores.
Training
GEN is a subset of all possible word alignments Φ, which is generated by beam search.
beam search is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: