Index of papers in Proc. ACL 2008 that mention
  • unlabeled data
Wang, Qin Iris and Schuurmans, Dale and Lin, Dekang
Introduction
Following the common theme of “more data is better data”, we also use both a limited labeled corpus and a plentiful unlabeled data resource.
Introduction
McClosky et al. (2006a) successfully applied self-training to parsing by exploiting available unlabeled data, and obtained remarkable results when the same technique was applied to parser adaptation (McClosky et al., 2006b).
Introduction
However, the standard objective of an S3VM is non-convex on the unlabeled data, thus requiring sophisticated global optimization heuristics to obtain reasonable solutions.
Semi-supervised Convex Training for Structured SVM
The proposed objective for semi-supervised structured large margin training is a combination of two convex terms: the supervised structured large margin loss on labeled data and the cheap least squares loss on unlabeled data.
Semi-supervised Structured Large Margin Objective
The objective of a standard semi-supervised structured SVM is a combination of structured large margin losses on both labeled and unlabeled data.
Semi-supervised Structured Large Margin Objective
We introduce an efficient approximation—least squares loss—for the structured large margin loss on unlabeled data below.
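The excerpts above describe replacing the non-convex unlabeled-data term of an S3VM with a least squares loss so the combined objective stays convex. A minimal sketch of that idea, under simplifying assumptions (a binary hinge loss standing in for the structured large-margin loss, and fixed pseudo-targets for the unlabeled examples; `combined_loss` and its parameters are hypothetical names, not the paper's formulation):

```python
import numpy as np

def combined_loss(w, X_lab, y_lab, X_unlab, y_pseudo, lam=0.5):
    """Convex semi-supervised objective (illustrative sketch only).

    Sum of a hinge loss on labeled data and a least-squares loss on
    unlabeled data against fixed pseudo-targets; both terms are convex
    in w, so the sum is convex and avoids the S3VM's non-convexity.
    """
    w = np.asarray(w, dtype=float)
    # Hinge loss on labeled examples (binary simplification of the
    # structured large-margin loss): max(0, 1 - y <w, x>) is convex in w.
    hinge = np.maximum(0.0, 1.0 - y_lab * (X_lab @ w)).sum()
    # Least squares loss on unlabeled examples: cheap to evaluate and
    # convex in w because the pseudo-targets are held fixed.
    lsq = ((X_unlab @ w - y_pseudo) ** 2).sum()
    return hinge + lam * lsq
```

Because both terms are convex, the midpoint of any two weight vectors scores no worse than the average of their losses, which is what licenses ordinary convex optimization instead of the global heuristics needed for a standard S3VM.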
unlabeled data is mentioned in 18 sentences in this paper.
Koo, Terry and Carreras, Xavier and Collins, Michael
Introduction
In general, semi-supervised learning can be motivated by two concerns: first, given a fixed amount of supervised data, we might wish to leverage additional unlabeled data to facilitate the utilization of the supervised corpus, increasing the performance of the model in absolute terms.
Introduction
Second, given a fixed target performance level, we might wish to use unlabeled data to reduce the amount of annotated data necessary to reach this target.
Related Work
Crucially, however, these methods do not exploit unlabeled data when learning their representations.
unlabeled data is mentioned in 3 sentences in this paper.
Kulkarni, Anagha and Callan, Jamie
Data
The set of 3,123 words that were not annotated was the unlabeled data for the EM algorithm.
Experiments and Results
Our post-experimental analysis reveals that the parameter update process using the unlabeled data has the effect of overly separating the two overlapping distributions.
Finding the Homographs in a Lexicon
In Model II, the semi-supervised setup, the training data is used to initialize the Expectation-Maximization (EM) algorithm (Dempster et al., 1977) and the unlabeled data, described in Section 3.1, updates the initial estimates.
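The setup described here, labeled data initializing EM and unlabeled data refining the estimates, can be sketched as follows. This is an illustrative sketch assuming a two-component 1-D Gaussian mixture; the function name and model are assumptions, not the authors' actual configuration:

```python
import numpy as np

def em_semisupervised(labeled, labels, unlabeled, n_iter=20):
    """Illustrative semi-supervised EM for a 2-component 1-D Gaussian mixture.

    Parameters are initialized from labeled points, then EM iterations on
    the unlabeled points update those initial estimates.
    """
    labeled = np.asarray(labeled, dtype=float)
    labels = np.asarray(labels)
    unlabeled = np.asarray(unlabeled, dtype=float)
    # Initialization from the labeled data (one mean/variance per class).
    mu = np.array([labeled[labels == k].mean() for k in (0, 1)])
    var = np.array([labeled[labels == k].var() + 1e-6 for k in (0, 1)])
    pi = np.array([np.mean(labels == 0), np.mean(labels == 1)])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each
        # unlabeled point, via Gaussian densities and mixing weights.
        dens = (np.exp(-(unlabeled[:, None] - mu) ** 2 / (2 * var))
                / np.sqrt(2 * np.pi * var))
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate means, variances, and mixing weights from
        # the unlabeled responsibilities, updating the initial estimates.
        nk = resp.sum(axis=0)
        mu = (resp * unlabeled[:, None]).sum(axis=0) / nk
        var = (resp * (unlabeled[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / nk.sum()
    return mu, var, pi
```

Note that because the M-step here uses only unlabeled responsibilities, a large unlabeled set can pull the parameters far from the labeled-data initialization, which is consistent with the over-separation effect the authors report.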
unlabeled data is mentioned in 3 sentences in this paper.