Index of papers in Proc. ACL 2013 that mention beam size

Seen in 46 sentences in 4 papers.

Choi, Jinho D. and McCallum, Andrew

Experiments	For selectional branching, the margin threshold m and the beam size b need to be tuned (Section 3.3).
Experiments	For this development set, the beam size of 64 and 80 gave the exact same result, so we kept the one with a larger beam size (b = 80).
Experiments	Figure 3: Parsing accuracies with respect to margins and beam sizes on the English development set.
Introduction	Thus, it is preferred if the beam size is not fixed but proportional to the number of low confidence predictions that a greedy parser makes, in which case, fewer transition sequences need to be explored to produce the same or similar parse output.
Selectional branching	Thus, it is preferred if the beam size is not fixed but proportional to the number of low confidence predictions made for the one-best sequence.
Selectional branching	The selectional branching method presented here performs at most d · t − e transitions, where t is the maximum number of transitions performed to generate a transition sequence, d = min(b, |A|+1), b is the beam size, |A| is the number of low confidence predictions made for the one-best sequence, and e = @.
Selectional branching	Compared to beam search that always performs b · t transitions, selectional branching is guaranteed to perform fewer transitions given the same beam size because d ≤ b and e > 0 except for d = 1, in which case, no branching happens.
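
The transition counts quoted above can be compared with a small sketch. It is illustrative only and not taken from the paper: the function names are hypothetical, and because the definition of e is garbled in the quoted sentence, the code assumes e = d(d − 1)/2, a choice that is merely consistent with the statement that e > 0 except for d = 1.

```python
def beam_search_transitions(b: int, t: int) -> int:
    """Transitions performed by plain beam search: b * t."""
    return b * t


def selectional_branching_bound(b: int, t: int, num_low_conf: int) -> int:
    """Upper bound d * t - e on transitions for selectional branching.

    b            -- beam size
    t            -- maximum number of transitions per sequence
    num_low_conf -- low confidence predictions on the one-best sequence (|A|)
    """
    d = min(b, num_low_conf + 1)
    # ASSUMPTION: e is unreadable in the source ("e = @"); d(d - 1)/2 is used
    # here only because it is 0 for d = 1 and positive otherwise.
    e = d * (d - 1) // 2
    return d * t - e


if __name__ == "__main__":
    b, t = 80, 100
    for low_conf in (0, 3, 200):
        bound = selectional_branching_bound(b, t, low_conf)
        print(low_conf, bound, beam_search_transitions(b, t))
```

With no low confidence predictions, d = 1 and the bound collapses to t, the cost of a single greedy pass.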

beam size is mentioned in 21 sentences in this paper.

Li, Qi and Ji, Heng and Huang, Liang

Experiments	Figure 6 shows the training curves of the averaged perceptron with respect to the performance on the development set when the beam size is 4.
Experiments	4.4 Impact of beam size
Experiments	The beam size is an important hyper parameter in both training and test.
Joint Framework for Event Extraction	In Section 4.5 we will show that the standard perceptron introduces many invalid updates especially with smaller beam sizes, also observed by Huang et al.
Joint Framework for Event Extraction	Then the K-best partial configurations are selected to the beam, assuming the beam size is K.
Joint Framework for Event Extraction	K: Beam size.
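
The K-best selection step mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `select_beam` and the representation of a partial configuration as a (score, label) pair are assumptions made for the example.

```python
import heapq
from typing import List, Tuple

# Hypothetical representation: (model score, partial configuration).
Candidate = Tuple[float, str]


def select_beam(candidates: List[Candidate], k: int) -> List[Candidate]:
    """Keep the k highest-scoring partial configurations, best first."""
    return heapq.nlargest(k, candidates, key=lambda c: c[0])


if __name__ == "__main__":
    expanded = [(0.2, "a"), (0.9, "b"), (0.5, "c"), (0.1, "d")]
    print(select_beam(expanded, 2))
```

A smaller K prunes more aggressively, so the correct configuration is more likely to fall out of the beam during search.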

beam size is mentioned in 15 sentences in this paper.

Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting

Experiments	Figure 6: Accuracies against the training epoch for joint segmentation and tagging as well as joint phrase-structure parsing using beam sizes 1, 4, 16 and 64, respectively.
Experiments	Figure 6 shows the accuracies of our model using different beam sizes with respect to the training epoch.
Experiments	The performance of our model increases as the beam size increases.
Introduction	With linear-time complexity, our parser is highly efficient, processing over 30 sentences per second with a beam size of 16.

beam size is mentioned in 6 sentences in this paper.

Wang, Lu and Raghavan, Hema and Castelli, Vittorio and Florian, Radu and Cardie, Claire

Experimental Setup	Beam size is fixed at 2000.[4] Sentence compressions are evaluated by a 5-gram language model trained on Gigaword (Graff, 2003) by SRILM (Stolcke, 2002).
Results	[4] We looked at various beam sizes on the heldout data, and observed that the performance peaks around this value.
Sentence Compression	postorder) as a sequence of nodes in T, the set L of possible node labels, a scoring function S for evaluating each sentence compression hypothesis, and a beam size N. Specifically, O is a permutation on the set {0, 1, .
Sentence Compression	Om, L = {RET, REM, PAR}, hypothesis scorer S, beam size N

beam size is mentioned in 4 sentences in this paper.
