Learning with Annotation Noise
Beigman, Eyal and Beigman Klebanov, Beata

Article Structure

Abstract

It is usually assumed that the kind of noise existing in annotated data is random classification noise.

Introduction

It is assumed, often tacitly, that the kind of noise existing in human-annotated datasets used in computational linguistics is random classification noise (Kearns, 1993; Angluin and Laird, 1988), resulting from annotator attention slips randomly distributed across instances.
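In this noise model, each label is flipped independently with some fixed probability, regardless of how easy or hard the instance is. A minimal sketch of the model (the flip rate `eta`, the seed, and the toy labels are illustrative assumptions, not values from the paper):

```python
import random

def apply_random_classification_noise(labels, eta, seed=0):
    """Flip each binary label in {-1, +1} independently with probability eta.

    Models annotator attention slips distributed uniformly at random across
    instances (Angluin and Laird, 1988): every instance, easy or hard, is
    equally likely to be mislabeled.
    """
    rng = random.Random(seed)
    return [-y if rng.random() < eta else y for y in labels]

# On average, 5% of the labels are flipped, independent of instance difficulty.
noisy = apply_random_classification_noise([+1, -1, +1, +1, -1], eta=0.05)
```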

Topics

perceptron

Appears in 24 sentences as: Perceptron (4) perceptron (21) perceptrons (2)
In Learning with Annotation Noise
  1. We show that these are relatively unproblematic for an algorithm operating under the 0-1 loss model, whereas for the commonly used voted perceptron algorithm, hard training cases could result in incorrect prediction on the uncontroversial cases at test time.
    Page 1, “Abstract”
  2. For example, the perceptron family of algorithms handles random classification noise well (Cohen, 1997).
    Page 2, “Introduction”
  3. We show in section 3.4 that the widely used Freund and Schapire (1999) voted perceptron algorithm could face a constant hard case bias when confronted with annotation noise in training data, irrespective of the size of the dataset.
    Page 2, “Introduction”
  4. 3 Voted Perceptron
    Page 3, “Introduction”
  5. Freund and Schapire (1999) describe the voted perceptron.
    Page 3, “Introduction”
  6. In this section, we show that the voted perceptron can be vulnerable to annotation noise.
    Page 3, “Introduction”
  7. Algorithm 1 Voted Perceptron Training — Input: a labeled training set (x_1, y_1), …
    Page 3, “Introduction”
  8. …, (x_N, y_N); Output: a list of perceptrons w_1, …
    Page 3, “Introduction”
  9. Input: a list of perceptrons w_1, …
    Page 3, “Introduction”
  10. The voted perceptron algorithm is a refinement of the perceptron algorithm (Rosenblatt, 1962; Minsky and Papert, 1969).
    Page 3, “Introduction”
  11. Perceptron is a dynamic algorithm; starting with an initial hyperplane w_0, it passes repeatedly through the labeled sample (see the sketch after this list).
    Page 3, “Introduction”
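Entries 7–11 outline the Freund and Schapire (1999) procedure: training stores every intermediate hyperplane together with a survival count, and prediction takes a count-weighted vote. A minimal sketch consistent with those fragments (the NumPy representation, the `epochs` parameter, and the tie-breaking at zero are assumptions of this sketch, not details from the paper):

```python
import numpy as np

def train_voted_perceptron(X, y, epochs=10):
    """Voted perceptron training in the style of Freund and Schapire (1999).

    X: (N, d) array of instances; y: length-N array of labels in {-1, +1}.
    Returns a list of (weight_vector, survival_count) pairs: whenever a
    mistake forces an update, the current perceptron is stored along with
    the number of examples it classified correctly before the mistake.
    """
    N, d = X.shape
    w = np.zeros(d)              # initial hyperplane w_0
    c = 0                        # survival count of the current perceptron
    perceptrons = []
    for _ in range(epochs):      # pass repeatedly through the labeled sample
        for i in range(N):
            if y[i] * (w @ X[i]) <= 0:        # mistake: store, then update
                perceptrons.append((w.copy(), c))
                w = w + y[i] * X[i]
                c = 1
            else:
                c += 1
    perceptrons.append((w, c))
    return perceptrons

def predict_voted(perceptrons, x):
    """Predict by a survival-count-weighted vote over the stored perceptrons."""
    vote = sum(c * np.sign(w @ x) for w, c in perceptrons)
    return 1 if vote >= 0 else -1
```

Because long-surviving hyperplanes dominate the vote, mislabeled hard training cases can change which hyperplanes survive and for how long, which is consistent with the paper's point that noise on hard cases can affect predictions on uncontroversial test cases.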

machine learner

Appears in 7 sentences as: machine learner (3) machine learner’s (1) machine learning (3)
In Learning with Annotation Noise
  1. For example, Osborne (2002) evaluates noise tolerance of shallow parsers, with random classification noise taken to be “crudely approximating annotation errors.” It has been shown, both theoretically and empirically, that this type of noise is tolerated well by the commonly used machine learning algorithms (Cohen, 1997; Blum et al., 1996; Osborne, 2002; Reidsma and Carletta, 2008).
    Page 1, “Introduction”
  2. When training data comes from one annotator and test data from another, the first annotator’s biases are sometimes systematic enough for a machine learner to pick them up, with detrimental results for the algorithm’s performance on the test data.
    Page 1, “Introduction”
  3. 1The different biases might not amount to much in the small doubly annotated subset, resulting in acceptable inter-annotator agreement; yet when enacted throughout a large number of instances they can be detrimental from a machine learner’s perspective.
    Page 1, “Introduction”
  4. First, we show that a machine learner operating under a 0-1 loss minimization principle could sustain a hard case bias in the worst case.
    Page 2, “Introduction”
  5. Finally, we discuss the implications of our findings for the practice of annotation studies and for data utilization in machine learning .
    Page 2, “Introduction”
  6. Subsequently, a machine learner can be told to ignore those cases during training, reducing the risk of hard case bias.
    Page 6, “Introduction”
  7. Reidsma and Carletta (2008) recently showed by simulation that different types of annotator behavior have different impact on the outcomes of machine learning from the annotated data.
    Page 7, “Introduction”
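The contrast these entries draw, between noise spread randomly across instances and annotator behavior concentrated on particular (hard) cases, can be made concrete with a toy simulation. This sketch is illustrative only; the 2-D data, the 20% "hard" band around the boundary, and the equal flip budgets are assumptions of the sketch, not the authors' or Reidsma and Carletta's (2008) actual experimental design:

```python
import numpy as np

def make_noisy_datasets(n=2000, eta=0.1, seed=0):
    """Build one clean dataset and two noisy variants with equal flip budgets:
    flips spread uniformly at random versus flips confined to hard cases
    (instances close to the true decision boundary)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = np.where(X[:, 0] >= 0, 1, -1)            # true concept: sign of x_0
    margin = np.abs(X[:, 0])                     # distance from the boundary
    hard = margin < np.quantile(margin, 0.2)     # the 20% hardest instances

    k = int(eta * n)                             # identical flip budget
    random_idx = rng.choice(n, size=k, replace=False)
    hard_idx = rng.choice(np.flatnonzero(hard), size=k, replace=False)

    y_random, y_hard = y.copy(), y.copy()
    y_random[random_idx] *= -1                   # attention slips anywhere
    y_hard[hard_idx] *= -1                       # disagreement on hard cases only
    return X, y, y_random, y_hard, ~hard         # ~hard flags the easy cases
```

Training the voted perceptron sketch above on `y_random` versus `y_hard` and scoring only on the easy region (`~hard`) illustrates the distinction at issue: the same number of label flips tends to be benign when distributed randomly but can induce a hard case bias when concentrated near the boundary.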
