Index of papers in Proc. ACL 2011 that mention
  • labeled data
Bollegala, Danushka and Weir, David and Carroll, John
Abstract
We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains.
Experiments
Figure 3: Effect of source domain labeled data.
Experiments
To investigate the impact of the quantity of source domain labeled data on our method, we vary the amount of data from zero to 800 reviews, with equal amounts of positive and negative labeled data.
Experiments
Note that source domain labeled data is used both to create the sentiment sensitive thesaurus and to train the sentiment classifier.
Introduction
Supervised learning algorithms that require labeled data have been successfully used to build sentiment classifiers for a specific domain (Pang et al., 2002).
Introduction
positive or negative sentiment) given a small set of labeled data for the source domain, and unlabeled data for both source and target domains.
Introduction
In particular, no labeled data is provided for the target domain.
labeled data is mentioned in 16 sentences in this paper.
Lu, Bin and Tan, Chenhao and Cardie, Claire and Tsou, Benjamin K.
A Joint Model with Unlabeled Parallel Text
where v ∈ {1, 2} denotes L1 or L2; the first term on the right-hand side is the likelihood of labeled data for both D1 and D2; and the second term is the likelihood of the unlabeled parallel data U.
A Joint Model with Unlabeled Parallel Text
By further considering the weight to ascribe to the unlabeled data vs. the labeled data (and the weight for the L2-norm regularization), we get the following regularized joint log likelihood to be maximized:
A Joint Model with Unlabeled Parallel Text
where the first term on the right-hand side is the log likelihood of the labeled data from both D1 and D2; the second is the log likelihood of the unlabeled parallel data U, multiplied by λ1 ≥ 0, a constant that controls the contribution of the unlabeled data; and λ2 ≥ 0 is a regularization constant that penalizes model complexity or large feature weights.
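The regularized objective described above can be sketched numerically. The following is a minimal illustration, not the paper's implementation: it substitutes a simple logistic model for the maximum-entropy classifiers, and the function names (`p_pos`, `regularized_joint_ll`) are hypothetical. The unlabeled term marginalizes a shared latent label over each parallel pair.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_pos(w, x):
    """P(y = positive | x) under a logistic stand-in for a max-ent classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def regularized_joint_ll(w1, w2, labeled1, labeled2, parallel, lam1, lam2):
    """Regularized joint log likelihood: labeled terms for D1 and D2, plus a
    lam1-weighted term for the unlabeled parallel data U (shared latent label
    summed out), minus a lam2-weighted L2 penalty on both weight vectors."""
    ll = 0.0
    for w, labeled in ((w1, labeled1), (w2, labeled2)):
        for x, y in labeled:
            p = p_pos(w, x)
            ll += math.log(p if y == 1 else 1.0 - p)
    for x1, x2 in parallel:
        p1, p2 = p_pos(w1, x1), p_pos(w2, x2)
        # P(pair) = sum over the latent label: both positive or both negative
        ll += lam1 * math.log(p1 * p2 + (1.0 - p1) * (1.0 - p2))
    ll -= lam2 * (sum(v * v for v in w1) + sum(v * v for v in w2))
    return ll
```

Raising lam2 only lowers the objective for nonzero weights, which is the intended effect of the penalty.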
Abstract
We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data.
Introduction
Given the labeled data in each language, we propose an approach that exploits an unlabeled parallel corpus with the following
Introduction
The proposed maximum entropy-based EM approach jointly learns two monolingual sentiment classifiers by treating the sentiment labels in the unlabeled parallel text as unobserved latent variables, and maximizes the regularized joint likelihood of the language-specific labeled data together with the inferred sentiment labels of the parallel text.
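The E-step of such an EM procedure can be sketched as follows (an illustrative reading, assuming each parallel pair shares one latent binary sentiment label; the function name is hypothetical):

```python
def posterior_pos(p1_pos, p2_pos):
    """E-step for one parallel sentence pair: posterior probability that the
    shared latent sentiment label is positive, given the positive-class
    probability each monolingual classifier assigns to its side of the pair."""
    joint_pos = p1_pos * p2_pos
    joint_neg = (1.0 - p1_pos) * (1.0 - p2_pos)
    return joint_pos / (joint_pos + joint_neg)
```

In the M-step, such posteriors would act as soft labels when re-estimating each monolingual classifier alongside the language-specific labeled data.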
labeled data is mentioned in 37 sentences in this paper.
Titov, Ivan
Empirical Evaluation
For every pair, the semi-supervised methods use labeled data from the source domain and unlabeled data from both domains.
Empirical Evaluation
We compare them with two supervised methods: a supervised model (Base) which is trained on the source domain data only, and another supervised model (In-domain) which is learned on the labeled data from the target domain.
Introduction
In addition to the labeled data from the source domain, they also exploit small amounts of labeled data and/or unlabeled data from the target domain to estimate a more predictive model for the target domain.
Introduction
We use generative latent variable models (LVMs) learned on all the available data: unlabeled data for both domains and labeled data for the source domain.
Related Work
Second, their expectation constraints are estimated from labeled data, whereas we are trying to match expectations computed on unlabeled data for two domains.
Related Work
This approach bears some similarity to the adaptation methods standard for the setting where labelled data is available for both domains (Chelba and Acero, 2004; Daume and Marcu, 2006).
The Latent Variable Model
Among the versions which do not exploit labeled data from the target domain.
The Latent Variable Model
The parameters θ of this model can be estimated by maximizing the joint likelihood L(θ) of the labeled data {x(l), y(l)}, l ∈ SL, for the source domain.
The Latent Variable Model
However, given that, first, the amount of unlabeled data |SU ∪ TU| normally vastly exceeds the amount of labeled data |SL| and, second, the number of features for each example |x(l)| is usually large, the label y will have only a minor effect on the mapping from the initial features x to the latent representation z.
labeled data is mentioned in 9 sentences in this paper.
Kobdani, Hamidreza and Schuetze, Hinrich and Schiehlen, Michael and Kamp, Hans
Abstract
We show that this unsupervised system has better CoRe performance than other learning approaches that do not use manually labeled data.
Introduction
Until recently, most approaches tried to solve the problem by binary classification, where the probability of a pair of markables being coreferent is estimated from labeled data.
Introduction
Self-training approaches usually include the use of some manually labeled data.
Introduction
In contrast, our self-trained system is not trained on any manually labeled data and is therefore a completely unsupervised system.
Related Work
not with approaches that make some limited use of labeled data.
Results and Discussion
Thus, this comparison of ACE-2/Ontonotes results is evidence that, in a realistic scenario, using association information in an unsupervised self-trained system is almost as good as a system trained on manually labeled data.
System Architecture
Automatically Labeled Data
labeled data is mentioned in 8 sentences in this paper.
Cai, Peng and Gao, Wei and Zhou, Aoying and Wong, Kam-Fai
Evaluation
All ranking models above were trained only on source-domain training data; the labeled data of the target domain was used only for testing.
Instance Weighting Scheme Review
(Jiang and Zhai, 2007) used a small amount of labeled data from the target domain to weight source instances.
Introduction
To alleviate the lack of training data in the target domain, many researchers have proposed transferring ranking knowledge from a source domain with plenty of labeled data to a target domain where little or no labeled data is available, which is known as ranking model adaptation (Chen et al., 2008a; Chen et al., 2010; Chen et al., 2008b; Geng et al., 2009; Gao et al., 2009).
Related Work
In (Geng et al., 2009; Chen et al., 2008b), the parameters of the ranking model trained on the source domain were adjusted with a small set of labeled data in the target domain.
Related Work
(Chen et al., 2008a) weighted source instances by using a small amount of labeled data in the target domain.
labeled data is mentioned in 5 sentences in this paper.
Jiang, Qixia and Sun, Maosong
Introduction
This is implemented by maximizing the empirical accuracy on the prior knowledge (labeled data) and the entropy of hash functions (estimated over labeled and unlabeled data).
Semi-Supervised SimHash
Let XL = {(x1, c1), ..., (xu, cu)} be the labeled data, c ∈ {1, ..., C}, x ∈ R^M, and XU = {xu+1, ..., xN} the unlabeled data.
Semi-Supervised SimHash
Given the labeled data XL, we construct two sets, attraction set Θa and repulsion set Θr.
Semi-Supervised SimHash
Furthermore, we also hope to maximize the empirical accuracy on the labeled data Θa and Θr.
The direction is determined by concatenating w L times.
This is implemented by maximizing the empirical accuracy on labeled data together with the entropy of hash functions.
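A minimal sketch of that objective, assuming a single hash bit h(x) = sign(w·x) and treating the attraction/repulsion sets as lists of vector pairs; the function names are illustrative, not the paper's:

```python
import math

def hash_bit(w, x):
    """One SimHash-style bit: the sign of the projection of x onto w."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

def empirical_accuracy(w, attraction, repulsion):
    """Fraction of constraints satisfied: attraction pairs should receive the
    same bit, repulsion pairs different bits (one reading of the S3H terms)."""
    hits = sum(hash_bit(w, a) == hash_bit(w, b) for a, b in attraction)
    hits += sum(hash_bit(w, a) != hash_bit(w, b) for a, b in repulsion)
    return hits / (len(attraction) + len(repulsion))

def bit_entropy(w, data):
    """Entropy of the bit over the data; maximal when the hyperplane
    splits the data evenly, which is what the entropy term encourages."""
    p = sum(hash_bit(w, x) for x in data) / len(data)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

A learner would then search for w maximizing a weighted sum of these two quantities.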
labeled data is mentioned in 5 sentences in this paper.
Khapra, Mitesh M. and Joshi, Salil and Chatterjee, Arindam and Bhattacharyya, Pushpak
Bilingual Bootstrapping
Algorithm 1 Bilingual Bootstrapping: LD1 := Seed Labeled Data from L1; LD2 := Seed Labeled Data from L2; UD1 := Unlabeled Data from L1; UD2 := Unlabeled Data from L2
Bilingual Bootstrapping
These projected models are then applied to the untagged data of L1 and L2 and the instances which get labeled with a high confidence are added to the labeled data of the respective languages.
Bilingual Bootstrapping
Algorithm 2 Monolingual Bootstrapping: LD1 := Seed Labeled Data from L1; LD2 := Seed Labeled Data from L2; UD1 := Unlabeled Data from L1; UD2 := Unlabeled Data from L2
Experimental Setup
In each iteration, only those words for which P(assigned_sense|word) > 0.6 get moved to the labeled data.
Experimental Setup
Hence, we used a fixed threshold of 0.6 so that in each iteration only those words get moved to the labeled data for which the assigned sense is clearly a majority sense (P > 0.6).
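The promotion step described in this setup can be sketched as follows, assuming a classifier that returns the top sense and its probability for a word; the interface and names are illustrative, not the paper's:

```python
def promote_confident(labeled, unlabeled, classifier, threshold=0.6):
    """Move instances whose top-sense probability exceeds the threshold from
    the unlabeled pool into the labeled data; the rest stay unlabeled.
    `classifier(word)` is assumed to return (sense, probability)."""
    still_unlabeled = []
    for word in unlabeled:
        sense, prob = classifier(word)
        if prob > threshold:  # strict inequality: a clear majority sense
            labeled.append((word, sense))
        else:
            still_unlabeled.append(word)
    return labeled, still_unlabeled
```

Iterating this until no word clears the threshold yields the bootstrapping loop.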
labeled data is mentioned in 5 sentences in this paper.
Chambers, Nathanael and Jurafsky, Dan
Abstract
Standard algorithms for template-based information extraction (IE) require predefined template schemas, and often labeled data, to learn to extract their slot fillers (e.g., an embassy is the Target of a Bombing template).
Previous Work
Weakly supervised approaches remove some of the need for fully labeled data.
Previous Work
Shinyama and Sekine (2006) describe an approach to template learning without labeled data .
Standard Evaluation
Our precision is as good as (and our F1 score near) two algorithms that require knowledge of the templates and/or labeled data.
labeled data is mentioned in 4 sentences in this paper.
Krishnamurthy, Jayant and Mitchell, Tom
Abstract
ConceptResolver performs both word sense induction and synonym resolution on relations extracted from text using an ontology and a small amount of labeled data.
ConceptResolver
λ is re-selected on each iteration because the initial labeled data set is extremely small, so the initial validation set is not necessarily representative of the actual data.
ConceptResolver
1. Initialize labeled data L with 10 positive and 50 negative examples (pairs of senses)
Prior Work
These approaches use large amounts of labeled data, which can be difficult to create.
labeled data is mentioned in 4 sentences in this paper.