Introduction | In the past decade, sequence labeling algorithms such as HMMs, CRFs, and Collins’ perceptrons have been extensively studied in the field of NLP (Rabiner, 1989; Lafferty et al., 2001; Collins, 2002). |
Introduction | Among them, we focus on the perceptron algorithm (Collins, 2002). |
Introduction | In the perceptron, the score function f(x, y) is given as f(x, y) = w · φ(x, y), where w is the weight vector and φ(x, y) is the feature vector representation of the pair (x, y). By making the first-order Markov assumption, we have
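Introduction | To make the decomposition concrete, here is a minimal Python sketch of scoring under the first-order Markov assumption; the feature templates (emission and label-bigram indicators) and function names are illustrative assumptions, not the paper's actual feature set.

```python
def local_features(x, i, prev_label, label):
    """Hypothetical local feature map phi(x, i, y_{i-1}, y_i).

    Under the first-order Markov assumption the global feature vector
    phi(x, y) is the sum of these local vectors over positions i.
    """
    return {
        f"emit:{x[i]}:{label}": 1.0,         # observation/label feature
        f"trans:{prev_label}:{label}": 1.0,  # label-bigram feature
    }

def score(w, x, y):
    """f(x, y) = w . phi(x, y), computed by summing local scores."""
    total = 0.0
    prev = "<START>"
    for i, label in enumerate(y):
        for feat, val in local_features(x, i, prev, label).items():
            total += w.get(feat, 0.0) * val
        prev = label
    return total

# Usage: score({"emit:dog:NN": 1.2, "trans:DT:NN": 0.8}, ["the", "dog"], ["DT", "NN"])
```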
Discriminative training | We incorporate all our new features into a linear model and learn weights for each using the online averaged perceptron algorithm (Collins, 2002) with a few modifications for structured outputs inspired by Chiang et al. |
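Discriminative training | For reference, a minimal sketch of the online averaged perceptron for structured outputs; the `decode` and `features` callables are assumptions standing in for the authors' decoder and feature extraction, and the naive averaging shown here is usually replaced by a lazier scheme in practice.

```python
from collections import defaultdict

def averaged_perceptron(train_data, decode, features, epochs=10):
    """Online averaged perceptron for structured outputs (Collins, 2002).

    train_data: list of (x, gold_y) pairs
    decode(x, w): highest-scoring output under the current weights w
    features(x, y): sparse feature dict {name: value} for the pair (x, y)
    """
    w = defaultdict(float)       # current weight vector
    w_sum = defaultdict(float)   # running sum of weight vectors
    t = 0
    for _ in range(epochs):
        for x, gold_y in train_data:
            pred_y = decode(x, w)
            if pred_y != gold_y:
                # standard structured perceptron update
                for f, v in features(x, gold_y).items():
                    w[f] += v
                for f, v in features(x, pred_y).items():
                    w[f] -= v
            # accumulate for averaging (naive; real implementations
            # typically use lazy/delayed averaging for speed)
            for f, v in w.items():
                w_sum[f] += v
            t += 1
    return {f: v / t for f, v in w_sum.items()}
```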
Experiments | We use 1,000 sentence pairs and gold alignments from LDC2006E86 to train model parameters: 800 sentences for training, 100 for testing, and 100 as a second held-out development set to decide when to stop perceptron training. |
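Experiments | A small sketch of how a held-out development set can drive the decision of when to stop perceptron training; the `update` and `evaluate` callables and the patience heuristic are illustrative assumptions, not the authors' exact stopping rule.

```python
def train_with_early_stopping(train, dev, update, evaluate,
                              max_epochs=20, patience=3):
    """Stop perceptron training when dev-set accuracy stops improving.

    update(w, x, gold_y): performs one online perceptron update, returns w
    evaluate(w, data): accuracy (e.g. alignment F-score) under weights w
    """
    w = {}
    best_w, best_score, bad_epochs = dict(w), float("-inf"), 0
    for _ in range(max_epochs):
        for x, gold_y in train:
            w = update(w, x, gold_y)
        dev_score = evaluate(w, dev)
        if dev_score > best_score:
            best_w, best_score, bad_epochs = dict(w), dev_score, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # dev score has stopped improving
    return best_w
```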
Experiments | Figure 8: Learning curves for 10 random restarts over time for parallel averaged perceptron training. |
Experiments | Perceptron training here is quite stable, converging to the same general neighborhood each time. |
Introduction | We train the parameters of the model using the averaged perceptron (Collins, 2002), modified for structured outputs, though the model could easily fit into a max-margin or related framework.
Conclusion | It is observed that active learning of parsing with the averaged perceptron, which is one of the large-margin classifiers, also works well for Japanese dependency analysis.
Experimental Evaluation and Discussion | 6.2 Averaged Perceptron |
Experimental Evaluation and Discussion | We used the averaged perceptron (AP) (Freund and Schapire, 1999) with polynomial kernels. |
Experimental Evaluation and Discussion | We found the best value for the number of epochs T of the averaged perceptron using the development set.
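Experimental Evaluation and Discussion | For illustration, a sketch of a polynomial kernel and the decision value of a kernelized perceptron; the degree, bias term, and representation of the averaged support coefficients are assumptions, not the paper's exact setup.

```python
def poly_kernel(x, z, degree=2, c=1.0):
    """Polynomial kernel K(x, z) = (x . z + c) ** degree."""
    dot = sum(xi * zi for xi, zi in zip(x, z))
    return (dot + c) ** degree

def kernel_perceptron_score(x, support, kernel=poly_kernel):
    """Decision value of a kernelized perceptron.

    support: list of (alpha_i, x_i) pairs collected during training,
    where alpha_i is the (signed, averaged) coefficient of example x_i.
    """
    return sum(alpha * kernel(x_i, x) for alpha, x_i in support)
```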
Parsing experiments | 7.2 Averaged perceptron training |
Parsing experiments | We chose the averaged structured perceptron (Freund and Schapire, 1999; Collins, 2002) as it combines highly competitive performance with fast training times, typically converging in 5-10 iterations.
Parsing experiments | Pass = % of dependencies surviving the beam in training data, Orac = maximum achievable UAS on validation data, Acc1/Acc2 = UAS of Models 1/2 on validation data, and Time1/Time2 = minutes per perceptron training iteration for Models 1/2, averaged over all 10 iterations.
Results | In our first experiment, we trained supertagger models using Generalised Iterative Scaling (GIS) (Darroch and Ratcliff, 1972), the limited memory BFGS method (BFGS) (Nocedal and Wright, 1999), the averaged perceptron (Collins, 2002), and the margin infused relaxed algorithm (MIRA) (Crammer and Singer, 2003). |
Results | GIS: 96.34, 96.43, 96.53, 96.62, 85.3; Perceptron: 95.82, 95.99, 96.30, -, 85.2; MIRA: 96.23, 96.29, 96.46, 96.63, 85.4
Results | For all four algorithms the training time is proportional to the amount of data, but the GIS and BFGS models trained on CCGbank alone took 4,500 and 4,200 seconds to train, while the equivalent perceptron and MIRA models took only 90 and 95 seconds.