Index of papers in Proc. ACL 2014 that mention
  • perceptron
Björkelund, Anders and Kuhn, Jonas
Abstract
We investigate different ways of learning structured perceptron models for coreference resolution when using nonlocal features and beam search.
Conclusion
We evaluated standard perceptron learning techniques for this setting both using early updates and LaSO.
Conclusion
In the special case where only local features are used, this method coincides with standard structured perceptron learning that uses exact search.
Experimental Setup
Unless otherwise stated we use 25 iterations of perceptron training and a beam size of 20.
Introduction
This paper studies and extends previous work using the structured perceptron (Collins, 2002) for complex NLP tasks.
Related Work
Perceptrons for coreference.
Related Work
The perceptron has previously been used to train coreference resolvers either by casting the problem as a binary classification problem that considers pairs of mentions in isolation (Bengtson and Roth, 2008; Stoyanov et al., 2009; Chang et al., 2012, inter alia) or in the structured manner, where a clustering for an entire document is predicted in one go (Fernandes et al., 2012).
Related Work
Stoyanov and Eisner (2012) train an Easy-First coreference system with the perceptron to learn a sequence of join operations between arbitrary mentions in a document and access nonlocal features through previous merge operations in later stages.
Representation and Learning
We find the weight vector w by online learning using a variant of the structured perceptron (Collins, 2002).
Representation and Learning
The structured perceptron iterates over training instances (x_i, y_i), where x_i are inputs and y_i are outputs.
Representation and Learning
If this parameter is set to 1, the update reduces to the standard structured perceptron update.
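The excerpts above describe the core training loop of the structured perceptron (Collins, 2002). For reference, a minimal sketch in Python follows; the decode and features helpers and the sparse-dict weight representation are generic illustrations, not code from the paper.

from collections import defaultdict

def structured_perceptron(train_data, decode, features, iterations=25):
    # Weight vector w, stored as a sparse feature -> weight map.
    # Assumed helpers (not from the paper): decode(x, w) returns the
    # highest-scoring output for x under w (exact search); features(x, y)
    # returns a sparse feature dict for the pair (x, y).
    w = defaultdict(float)
    for _ in range(iterations):
        for x_i, y_i in train_data:         # iterate over training instances (x_i, y_i)
            y_hat = decode(x_i, w)          # best output under the current weights
            if y_hat != y_i:                # mistake-driven update toward the gold output
                for f, v in features(x_i, y_i).items():
                    w[f] += v
                for f, v in features(x_i, y_hat).items():
                    w[f] -= v
    return w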
perceptron is mentioned in 11 sentences in this paper.
Duan, Manjuan and White, Michael
Abstract
Using parse accuracy in a simple reranking strategy for self-monitoring, we find that with a state-of-the-art averaged perceptron realization ranking model, BLEU scores cannot be improved with any of the well-known Treebank parsers we tested, since these parsers too often make errors that human readers would be unlikely to make.
Background
Using the averaged perceptron algorithm (Collins, 2002), White & Rajkumar (2009) trained a structured prediction ranking model to combine these existing syntactic models with several n-gram language models.
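Averaging is what separates the averaged perceptron from the vanilla update: the model returned at the end is the average of the weight vector over all time steps. A minimal sketch, reusing the assumed decode and features helpers from the sketch above (illustrative only, not the realization ranker's code):

from collections import defaultdict

def averaged_perceptron(train_data, decode, features, iterations=10):
    # Sketch of the averaged perceptron (Collins, 2002): accumulate the weight
    # vector after every example and return its average, which typically
    # generalizes better than the final weights.
    w = defaultdict(float)       # current weights
    total = defaultdict(float)   # running sum of the weights over all time steps
    steps = 0
    for _ in range(iterations):
        for x_i, y_i in train_data:
            y_hat = decode(x_i, w)
            if y_hat != y_i:
                for f, v in features(x_i, y_i).items():
                    w[f] += v
                for f, v in features(x_i, y_hat).items():
                    w[f] -= v
            for f, v in w.items():   # accumulate (a lazy per-feature trick avoids this full pass)
                total[f] += v
            steps += 1
    return {f: v / steps for f, v in total.items()}

In practice the accumulation is done lazily per feature, so each update costs time proportional to the number of active features rather than the size of the full weight vector.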
Introduction
Rajkumar & White (2011; 2012) have recently shown that some rather egregious surface realization errors—in the sense that the reader would likely end up with the wrong interpretation—can be avoided by making use of features inspired by psycholinguistics research together with an otherwise state-of-the-art averaged perceptron realization ranking model (White and Rajkumar, 2009), as reviewed in the next section.
Introduction
With this simple reranking strategy and each of three different Treebank parsers, we find that it is possible to improve BLEU scores on Penn Treebank development data with White & Rajkumar’s (2011; 2012) baseline generative model, but not with their averaged perceptron model.
Introduction
Therefore, to develop a more nuanced self-monitoring reranker that is more robust to such parsing mistakes, we trained an SVM using dependency precision and recall features for all three parses, their n-best parsing results, and per-label precision and recall for each type of dependency, together with the realizer’s normalized perceptron model score as a feature.
Simple Reranking
The first one is the baseline generative model (hereafter, generative model) used in training the averaged perceptron model.
Simple Reranking
The second one is the averaged perceptron model (hereafter, perceptron model), which uses all the features reviewed in Section 2.
Simple Reranking
Table 2: Devset BLEU scores for simple ranking on top of n-best perceptron model realizations
perceptron is mentioned in 29 sentences in this paper.
Cao, Yuan and Khudanpur, Sanjeev
Experiments
For comparison, we also investigated training the reranker with Perceptron and MIRA.
Experiments
The f-scores of the held-out and evaluation sets given by T-MIRA as well as the Perceptron and
Experiments
When very few labeled data are available for training (compared with the number of features), T-MIRA performs much better than the vector-based models MIRA and Perceptron.
Introduction
Many learning algorithms applied to NLP problems, such as the Perceptron (Collins, 2002),
Tensor Model Construction
As a way out, we first run a simple vector-model based learning algorithm (say the Perceptron) on the training data and estimate a weight vector, which serves as a “surrogate”
perceptron is mentioned in 9 sentences in this paper.
Li, Qi and Ji, Heng
Abstract
We present an incremental joint framework to simultaneously extract entity mentions and relations using structured perceptron with efficient beam-search.
Algorithm 3.1 The Model
To estimate the feature weights, we use structured perceptron (Collins, 2002), an extension of the standard perceptron for structured prediction, as the learning framework.
Algorithm 3.1 The Model
Huang et al. (2012) proved the convergence of the structured perceptron when inexact search is applied with violation-fixing update methods such as early-update (Collins and Roark, 2004).
Algorithm 3.1 The Model
Figure 4 shows the pseudocode for structured perceptron training with early-update.
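Figure 4 itself is not reproduced in this index; as a rough illustration of what beam-search training with early update (Collins and Roark, 2004) looks like, here is a hedged sketch. The extend, score, gold_prefix, n_steps, and features helpers are assumptions for illustration, not the authors' code.

from collections import defaultdict

def perceptron_update(w, gold_feats, pred_feats):
    # Standard perceptron update: reward gold features, penalize predicted features.
    for f, v in gold_feats.items():
        w[f] += v
    for f, v in pred_feats.items():
        w[f] -= v

def train_early_update(train_data, extend, score, gold_prefix, n_steps, features,
                       iterations=10, beam_size=8):
    # Assumed helpers: extend(partial, x) yields one-step extensions of a partial
    # output, score(partial, w) is its model score, gold_prefix(y, t) is the gold
    # partial output after t steps, n_steps(x) is the decoding length for x, and
    # features(x, partial) returns a sparse feature dict.
    w = defaultdict(float)
    for _ in range(iterations):
        for x, y in train_data:
            beam = [()]                              # start from the empty partial output
            for t in range(1, n_steps(x) + 1):
                candidates = [c for p in beam for c in extend(p, x)]
                beam = sorted(candidates, key=lambda c: score(c, w), reverse=True)[:beam_size]
                gold = gold_prefix(y, t)
                if gold not in beam:                 # early update: the gold prefix fell off the beam
                    perceptron_update(w, features(x, gold), features(x, beam[0]))
                    break
            else:
                if beam[0] != gold:                  # final update if the completed prediction is wrong
                    perceptron_update(w, features(x, gold), features(x, beam[0]))
    return w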
Conclusions and Future Work
For the first time, we addressed this challenging task by an incremental beam-search algorithm in conjunction with the structured perceptron.
Introduction
Following the above intuitions, we introduce a joint framework based on structured perceptron (Collins, 2002; Collins and Roark, 2004) with beam-search to extract entity mentions and relations simultaneously.
Introduction
Our previous work (Li et al., 2013) used a perceptron model with token-based tagging to jointly extract event triggers and arguments.
Related Work
Our previous work (Li et al., 2013) used a structured perceptron with a token-based decoder to jointly predict event triggers and arguments based on the assumption that entity mentions and other argument candidates are given as part of the input.
perceptron is mentioned in 9 sentences in this paper.
Xu, Wenduan and Clark, Stephen and Zhang, Yue
Introduction
The discriminative model is global and trained with the structured perceptron.
Introduction
We also show how perceptron learning with beam-search (Collins and Roark, 2004) can be extended to handle the additional ambiguity, by adapting the “violation-fixing” perceptron of Huang et al. (2012).
The Dependency Model
We also show, in Section 3.3, how perceptron training with early-update (Collins and Roark, 2004) can be used in this setting.
The Dependency Model
We use the averaged perceptron (Collins, 2002) to train a global linear model and score each action.
The Dependency Model
Since there are potentially many gold items, and one gold item is required for the perceptron update, a decision needs to be made about which one to use.
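One common way to resolve this (an assumption here, not necessarily the choice made in the paper) is to update toward the highest-scoring gold item currently in the beam:

def pick_gold_item(beam, is_gold, score, w):
    # Hedged sketch: among the gold items still in the beam, pick the one the
    # current model scores highest and use it as the positive side of the
    # perceptron update. is_gold(item) and score(item, w) are assumed helpers.
    gold_items = [item for item in beam if is_gold(item)]
    return max(gold_items, key=lambda item: score(item, w)) if gold_items else None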
perceptron is mentioned in 6 sentences in this paper.
Anzaroot, Sam and Passos, Alexandre and Belanger, David and McCallum, Andrew
Citation Extraction Data
We then use the development set to learn the penalties for the soft constraints, using the perceptron algorithm described in section 3.1.
Soft Constraints in Dual Decomposition
All we need to employ the structured perceptron algorithm (Collins, 2002) or the structured SVM algorithm (Tsochantaridis et al., 2004) is a black-box procedure for performing MAP inference in the structured linear model given an arbitrary cost vector.
Soft Constraints in Dual Decomposition
This can be ensured by a simple modification of the perceptron and of the subgradient descent optimization of the structured SVM objective: truncate c coordinate-wise to be nonnegative at every learning iteration.
Soft Constraints in Dual Decomposition
Intuitively, the perceptron update increases the penalty for a constraint if it is satisfied in the ground truth and not in an inferred prediction, and decreases the penalty if the constraint is satisfied in the prediction and not the ground truth.
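A minimal sketch of that penalty update, with the coordinate-wise truncation to nonnegative values mentioned above; the map_infer and satisfied helpers and the unit step size are assumptions for illustration, not the paper's implementation.

def update_penalties(c, x, y_gold, map_infer, satisfied, step=1.0):
    # Perceptron update on soft-constraint penalties c (one entry per constraint).
    # Assumed helpers: map_infer(x, c) runs MAP inference under the current
    # penalties; satisfied(k, x, y) says whether constraint k holds in output y.
    y_pred = map_infer(x, c)
    for k in range(len(c)):
        in_gold = satisfied(k, x, y_gold)
        in_pred = satisfied(k, x, y_pred)
        if in_gold and not in_pred:
            c[k] += step        # satisfied in gold but violated in the prediction: raise penalty
        elif in_pred and not in_gold:
            c[k] -= step        # satisfied in the prediction but not in gold: lower penalty
        c[k] = max(0.0, c[k])   # coordinate-wise truncation keeps every penalty nonnegative
    return c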
perceptron is mentioned in 4 sentences in this paper.
Srivastava, Shashank and Hovy, Eduard
Introduction
In this case, learning can follow the online structured perceptron learning procedure by Collins (2002), where weight updates for the k’th training example (x^(k), y^(k)) are given as:
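The standard Collins (2002) perceptron update this refers to can be written as follows (a reconstruction for reference; the paper's exact notation may differ, with Phi the feature map and y-hat the model's current best tagging):

\mathbf{w} \leftarrow \mathbf{w}
  + \Phi\bigl(x^{(k)}, y^{(k)}\bigr)
  - \Phi\bigl(x^{(k)}, \hat{y}\bigr),
\qquad
\hat{y} = \arg\max_{y} \; \mathbf{w} \cdot \Phi\bigl(x^{(k)}, y\bigr)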
Introduction
While the Viterbi algorithm can be used for tagging optimal state-sequences given the weights, the structured perceptron can learn optimal model weights given gold-standard sequence labels.
Introduction
In the M-step, we take the decoded state-sequences in the E-step as observed, and run perceptron learning to update feature weights w_i.
perceptron is mentioned in 4 sentences in this paper.