Index of papers in Proc. ACL 2013 that mention
  • beam size
Choi, Jinho D. and McCallum, Andrew
Experiments
For selectional branching, the margin threshold m and the beam size b need to be tuned (Section 3.3).
Experiments
For this development set, the beam sizes of 64 and 80 gave the exact same result, so we kept the one with the larger beam size (b = 80).
Experiments
Figure 3: Parsing accuracies with respect to margins and beam sizes on the English development set.
Introduction
Thus, it is preferred if the beam size is not fixed but proportional to the number of low confidence predictions that a greedy parser makes, in which case, fewer transition sequences need to be explored to produce the same or similar parse output.
Selectional branching
Thus, it is preferred if the beam size is not fixed but proportional to the number of low confidence predictions made for the one-best sequence.
Selectional branching
The selectional branching method presented here performs at most d · t − e transitions, where t is the maximum number of transitions performed to generate a transition sequence, d = min(b, |A| + 1), b is the beam size, |A| is the number of low confidence predictions made for the one-best sequence, and e = d(d − 1)/2.
Selectional branching
Compared to beam search that always performs b · t transitions, selectional branching is guaranteed to perform fewer transitions given the same beam size because d ≤ b and e > 0 except for d = 1, in which case, no branching happens.
beam size is mentioned in 21 sentences in this paper.
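The transition counts quoted above for Choi and McCallum can be compared numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, `low_conf` stands for |A|, and it assumes e = d(d − 1)/2, which is our reading of the garbled term in the extracted sentence. The value t = 100 is made up for illustration; b = 80 is the beam size reported in the Experiments excerpt.

```python
def beam_search_transitions(b: int, t: int) -> int:
    """Transitions performed by a fixed-width beam search: b * t."""
    return b * t


def selectional_branching_bound(b: int, t: int, low_conf: int) -> int:
    """Upper bound d * t - e on transitions for selectional branching.

    low_conf plays the role of |A|, the number of low confidence
    predictions made for the one-best sequence.
    Assumption (not confirmed by the index text): e = d * (d - 1) / 2.
    """
    d = min(b, low_conf + 1)   # number of sequences actually explored
    e = d * (d - 1) // 2       # transitions saved by branching late
    return d * t - e


if __name__ == "__main__":
    b, t = 80, 100  # t is an illustrative value, not taken from the paper
    for low_conf in (0, 3, 200):
        bound = selectional_branching_bound(b, t, low_conf)
        print(low_conf, bound, beam_search_transitions(b, t))
```

With no low confidence predictions (d = 1) the bound reduces to t, the cost of a single greedy pass; once |A| + 1 reaches b, the bound stays below b · t by e.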
Li, Qi and Ji, Heng and Huang, Liang
Experiments
Figure 6 shows the training curves of the averaged perceptron with respect to the performance on the development set when the beam size is 4.
Experiments
4.4 Impact of beam size
Experiments
The beam size is an important hyperparameter in both training and testing.
Joint Framework for Event Extraction
In Section 4.5 we will show that the standard perceptron introduces many invalid updates especially with smaller beam sizes, also observed by Huang et al.
Joint Framework for Event Extraction
Then the K -best partial configurations are selected to the beam, assuming the beam size is K.
Joint Framework for Event Extraction
K: Beam size.
beam size is mentioned in 15 sentences in this paper.
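The K-best selection step quoted above for Li, Ji and Huang can be illustrated with a short sketch. It shows only the pruning step of a generic beam search; the function name, the tuple representation of a partial configuration, and the toy scores are hypothetical, and the paper's perceptron-based scoring is not reproduced here.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

Config = Tuple[str, ...]  # hypothetical stand-in for a partial configuration


def prune_to_beam(
    candidates: Iterable[Config],
    score: Callable[[Config], float],
    k: int,
) -> List[Config]:
    """Keep the k highest-scoring candidates, best first."""
    return heapq.nlargest(k, candidates, key=score)


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    scores = {("Attack",): 0.9, ("Die",): 0.5, ("None",): 0.2}
    beam = prune_to_beam(scores, scores.get, k=2)
    print(beam)
```

A smaller K discards more candidates at each step, which is one way to see why the choice of beam size matters in both training and testing.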
Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting
Experiments
Figure 6: Accuracies against the training epoch for joint segmentation and tagging as well as joint phrase-structure parsing using beam sizes 1, 4, 16 and 64, respectively.
Experiments
Figure 6 shows the accuracies of our model using different beam sizes with respect to the training epoch.
Experiments
The performance of our model increases as the beam size increases.
Introduction
With linear-time complexity, our parser is highly efficient, processing over 30 sentences per second with a beam size of 16.
beam size is mentioned in 6 sentences in this paper.
Wang, Lu and Raghavan, Hema and Castelli, Vittorio and Florian, Radu and Cardie, Claire
Experimental Setup
Beam size is fixed at 2000 (footnote 4). Sentence compressions are evaluated by a 5-gram language model trained on Gigaword (Graff, 2003) by SRILM (Stolcke, 2002).
Results
Footnote 4: We looked at various beam sizes on the heldout data, and observed that the performance peaks around this value.
Sentence Compression
postorder) as a sequence of nodes in T, the set L of possible node labels, a scoring function S for evaluating each sentence compression hypothesis, and a beam size N. Specifically, O is a permutation on the set {0, 1, .
Sentence Compression
Om, L ={RET, REM, PAR}, hypothesis scorer S, beam size N
beam size is mentioned in 4 sentences in this paper.