Index of papers in Proc. ACL 2014 that mention
  • labeled data
Huang, Hongzhao and Cao, Yunbo and Huang, Xiaojiang and Ji, Heng and Lin, Chin-Yew
Abstract
In addition, it is challenging to generate sufficient high-quality labeled data for supervised models at low cost.
Abstract
Compared to the state-of-the-art supervised model trained from 100% labeled data, our proposed approach achieves comparable performance with 31% labeled data and obtains a 5% absolute F1 gain with 50% labeled data.
Conclusions
By studying three novel fine-grained relations, detecting semantically-related information with semantic meta paths, and exploiting the data manifolds in both unlabeled and labeled data for collective inference, our work can dramatically save annotation cost and achieve better performance, thus shedding light on the challenging wikification task for tweets.
Experiments
In comparison with the supervised baseline proposed by Meij et al. (2012), our model SSRega1, relying on local compatibility, already achieves comparable performance with 50% of the labeled data.
Experiments
We can easily see that our proposed approach using 50% labeled data achieves performance similar to that of the state-of-the-art supervised model with 100% labeled data.
Experiments
6.4 Effect of Labeled Data Size
Introduction
Sufficient labeled data is crucial for supervised models.
Introduction
In order to address these unique challenges for wikification for the short tweets, we employ graph-based semi-supervised learning algorithms (Zhu et al., 2003; Smola and Kondor, 2003; Blum et al., 2004; Zhou et al., 2004; Talukdar and Crammer, 2009) for collective inference by exploiting the manifold (cluster) structure in both unlabeled and labeled data.
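Graph-based semi-supervised algorithms of the kind cited in this excerpt propagate labels from labeled to unlabeled nodes over an affinity graph. A minimal sketch in the spirit of Zhou et al. (2004), purely illustrative and not the authors' implementation (the function name and interface are assumptions):

```python
import numpy as np

def label_propagation(W, y_labeled, labeled_idx, n_classes, alpha=0.99, n_iter=100):
    """Graph-based semi-supervised label propagation. W is a symmetric
    affinity matrix over all nodes; labeled_idx marks the nodes whose
    labels y_labeled are known; alpha trades propagation against clamping."""
    n = W.shape[0]
    # Symmetrically normalize the affinity matrix: S = D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot seed matrix: labeled nodes keep their class, others start at zero
    Y = np.zeros((n, n_classes))
    Y[labeled_idx, y_labeled] = 1.0
    F = Y.copy()
    for _ in range(n_iter):
        # Propagate scores along graph edges, then pull back toward the seeds
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F.argmax(axis=1)
```

The closed-form fixed point of this iteration is the solution of Zhou et al.'s "learning with local and global consistency"; the iterative form above is the usual cheap approximation for large graphs.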
labeled data is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Nguyen, Minh Luan and Tsang, Ivor W. and Chai, Kian Ming A. and Chieu, Hai Leong
Abstract
We address two challenges: negative transfer when knowledge in source domains is used without considering the differences in relation distributions; and lack of adequate labeled samples for rarer relations in the new domain, due to a small labeled data set and imbalanced relation distributions.
Introduction
However, most supervised learning algorithms require adequate labeled data for every relation type to be extracted.
Introduction
Instead, it can be more cost-effective to adapt an existing relation extraction system to the new domain using a small set of labeled data.
Introduction
Together with imbalanced relation distributions inherent in the domain, this can cause some rarer relations to constitute only a very small proportion of the labeled data set.
Problem Statement
The target domain has a few labeled data D_t = {(x_i, y_i)}.
Problem Statement
For the sth source domain, we have an adequate labeled data set D_s.
Related Work
However, purely supervised relation extraction methods assume the availability of sufficient labeled data, which may be costly to obtain for new domains.
Related Work
We address this by augmenting a small labeled data set with other information in the domain adaptation setting.
Related Work
To create labeled data, the texts are dependency-parsed, and the domain-independent patterns on the parses form the basis for extractions.
Robust Domain Adaptation
By augmenting with unlabeled data D_u, we aim to alleviate the effect of imbalanced relation distribution, which causes a lack of labeled samples for rarer classes in a small set of labeled data.
labeled data is mentioned in 22 sentences in this paper.
Topics mentioned in this paper:
Bollegala, Danushka and Weir, David and Carroll, John
Datasets
(2006), we use sections 2–21 of the Wall Street Journal (WSJ) as the source domain labeled data.
Distribution Prediction
Our distribution prediction learning method is unsupervised in the sense that it does not require manually labeled data for a particular task from any of the domains.
Domain Adaptation
The main reason that a model trained only on the source domain labeled data performs poorly in the target domain is the feature mismatch: few features in target domain test instances appear in source domain training instances.
Experiments and Results
For each domain, the accuracy obtained by a classifier trained using labeled data from that
Experiments and Results
This upper baseline represents the classification accuracy we could hope to obtain if we were to have labeled data for the target domain.
Introduction
Our proposed cross-domain word distribution prediction method is unsupervised in the sense that it does not require any labeled data in either of the two steps.
Unlike our distribution prediction method, which is unsupervised, SST requires labeled data for the source domain to learn a feature mapping between a source and a target domain in the form of a thesaurus.
Related Work
(2006) append the source domain labeled data with predicted pivots (i.e.
Related Work
The unsupervised DA setting that we consider does not assume the availability of labeled data for the target domain.
Related Work
However, if a small amount of labeled data is available for the target domain, it can be used to further improve the performance of DA tasks (Xiao et al., 2013; Daumé III, 2007).
labeled data is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Li, Zhenghua and Zhang, Min and Chen, Wenliang
Abstract
With a conditional random field based probabilistic dependency parser, our training objective is to maximize mixed likelihood of labeled data and auto-parsed unlabeled data with ambiguous labelings.
Ambiguity-aware Ensemble Training
not sufficiently covered in manually labeled data.
Ambiguity-aware Ensemble Training
Since D' contains many more instances than D (1.7M vs. 40K for English, and 4M vs. 16K for Chinese), it is likely that the unlabeled data may overwhelm the labeled data during SGD training.
Ambiguity-aware Ensemble Training
1: Input: Labeled data D = {(x_i, d_i)}_{i=1}^{N} and unlabeled data D' = {(u_j, v_j)}_{j=1}^{M}; Parameters: I, N1, M1, b
2: Output: w
3: Initialization: w^(0) = 0, k = 0
4: for i = 1 to I do {iterations}
5:     Randomly select N1 instances from D and M1 instances from D' to compose a new dataset D_i, and shuffle it.
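The sampling step in the pseudocode above, drawing N1 labeled and M1 unlabeled instances per iteration so the much larger unlabeled corpus cannot overwhelm the labeled data, can be sketched as follows; the function name and interface are assumptions for illustration, not the authors' code:

```python
import random

def make_mixed_batches(D, D_prime, n1, m1, iterations, seed=0):
    """Corpus-weighting trick: per iteration, sample n1 labeled and m1
    unlabeled instances, merge them into one dataset, and shuffle it,
    so the ratio of labeled to unlabeled examples stays fixed regardless
    of the raw corpus sizes."""
    rng = random.Random(seed)
    for _ in range(iterations):
        batch = rng.sample(D, n1) + rng.sample(D_prime, m1)
        rng.shuffle(batch)
        yield batch
```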
Conclusions
The training objective is to maximize the mixed likelihood of both the labeled data and the auto-parsed unlabeled data with ambiguous labelings.
Introduction
Such sentences can provide more discriminative instances for training which may be unavailable in labeled data .
Introduction
Evaluation on labeled data shows the oracle accuracy of the parse forest is much higher than that of 1-best outputs of single parsers (see Table 3).
Introduction
Finally, using a conditional random field (CRF) based probabilistic parser, we train a better model by maximizing mixed likelihood of labeled data and auto-parsed unlabeled data with ambiguous labelings.
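The mixed-likelihood objective described in these excerpts can be sketched as follows; the notation here is assumed for illustration, not copied from the paper:

```latex
\mathcal{L}(\mathbf{w}) \;=\;
  \sum_{(\mathbf{x}_i,\, d_i) \in D} \log p(d_i \mid \mathbf{x}_i; \mathbf{w})
  \;+\;
  \sum_{(\mathbf{u}_j,\, V_j) \in D'} \log \sum_{d \in V_j} p(d \mid \mathbf{u}_j; \mathbf{w})
```

where $D$ is the labeled treebank, and each auto-parsed unlabeled sentence $\mathbf{u}_j$ contributes the marginal probability of its ambiguous labeling set (parse forest) $V_j$ rather than of a single tree.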
labeled data is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Pershina, Maria and Min, Bonan and Xu, Wei and Grishman, Ralph
Abstract
However, in some cases a small amount of human labeled data is available.
Available at http://nlp.stanford.edu/software/mimlre.shtml.
Thus, our approach outperforms the state-of-the-art model for relation extraction using much less labeled data than was used by Zhang et al. (2012) to outperform
Introduction
In this paper, we present the first effective approach, Guided DS (distant supervision), to incorporate labeled data into distant supervision for extracting relations from sentences.
Introduction
(2012), we generalize the labeled data through feature selection and model this additional information directly in the latent variable approaches.
Introduction
While prior work employed tens of thousands of human labeled examples (Zhang et al., 2012) and only got a 6.5% increase in F-score over a logistic regression baseline, our approach uses much less labeled data (about 1/8) but achieves a much larger performance improvement over stronger baselines.
The Challenge
Simply taking the union of the hand-labeled data and the corpus labeled by distant supervision is not effective since hand-labeled data will be swamped by a larger amount of distantly labeled data.
The Challenge
An effective approach must recognize that the hand-labeled data is more reliable than the automatically labeled data and so must take precedence in cases of conflict.
The Challenge
Instead we propose to perform feature selection to generalize human labeled data into training guidelines, and integrate them into the latent variable model.
Training
Upsampling the labeled data did not improve the performance either.
labeled data is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Wang, Chang and Fan, James
Background
Recently, “distant supervision” has emerged to be a popular choice for training relation extractors without using manually labeled data (Mintz et al., 2009; Jiang, 2009; Chan and Roth, 2010; Wang et al., 2011; Riedel et al., 2010; Ji et al., 2011; Hoffmann et al., 2011; Surdeanu et al., 2012; Takamatsu et al., 2012; Min et al., 2013).
Experiments
(1) Manifold Unlabeled: We combined the labeled data and unlabeled set 1 in training.
Experiments
(2) Manifold Predicted Labels: We combined labeled data and unlabeled set 2 in training.
Experiments
labeled data and the data from unlabeled set 2 was used as labeled data (With Weights).
Identifying Key Medical Relations
Our current strategy is to integrate all associated types, and rely on the relation detector trained with the labeled data to decide how to weight different types based upon the context.
Introduction
When we build a naive model to detect relations, the model tends to overfit for the labeled data.
Relation Extraction with Manifold Models
Integration of the unlabeled data can help solve overfitting problems when the labeled data is not sufficient.
labeled data is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Yang, Bishan and Cardie, Claire
Abstract
The context-aware constraints provide additional power to the CRF model and can guide semi-supervised learning when labeled data is limited.
Approach
PR makes the assumption that the labeled data we have is not enough for learning good model parameters, but we have a set of constraints on the posterior distribution of the labels.
Experiments
For the MD dataset, we also used the dvd domain as additional labeled data for developing the constraints.
Experiments
We found that the PR model is able to correct many CRF errors caused by the lack of labeled data .
Experiments
However, with limited labeled data, the CRF learner can only associate very weak sentiment signals to these features.
Related Work
Compared to the existing work on semi-supervised learning for sentence-level sentiment classification (Täckström and McDonald, 2011a; Täckström and McDonald, 2011b; Qu et al., 2012), our work does not rely on a large amount of coarse-grained (document-level) labeled data; instead, distant supervision mainly comes from linguistically-motivated constraints.
labeled data is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Martineau, Justin and Chen, Lu and Cheng, Doreen and Sheth, Amit
Experiments
Due to these reasons, there is a lack of sufficient and high quality labeled data for emotion research.
Experiments
Since in real world applications people are primarily concerned with how well the algorithm will work for new TV shows or movies that may not be included in the training data, we defined a test fold for each TV show or movie in our labeled data set.
Experiments
Each test fold corresponded to a training fold containing all the labeled data from all the other TV shows and movies.
Introduction
An active learner uses a small set of labeled data to iteratively select the most informative instances from a large pool of unlabeled data for human annotators to label (Settles, 2010).
Related Work
In Active Learning (Settles, 2010) a small set of labeled data is used to find documents that should be annotated from a large pool of unlabeled documents.
labeled data is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ma, Ji and Zhang, Yue and Zhu, Jingbo
Experiments
The data set consists of labelled data for both the source (Wall Street Journal portion of the Penn Treebank) and target (web) domains.
Experiments
Participants are not allowed to use web-domain labelled data for training.
Experiments
In addition to labelled data , a large amount of unlabelled data on the web domain is also provided.
Introduction
The problem we face here can be considered as a special case of domain adaptation, where we have access to labelled data on the source domain (PTB) and unlabelled data on the target domain (web data).
labeled data is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Qian, Longhua and Hui, Haotian and Hu, Ya'nan and Zhou, Guodong and Zhu, Qiaoming
Abstract
Usually the extraction performance depends heavily on the quality and quantity of the labeled data; however, the manual annotation of a large-scale corpus is labor-intensive and time-consuming.
Abstract
During iterations a batch of unlabeled instances are chosen in terms of their informativeness to the current classifier, labeled by an oracle and in turn added into the labeled data to retrain the classifier.
Abstract
Input:
- L, labeled data set
- U, unlabeled data set
- n, batch size
Output:
- SVM, classifier
Repeat: 1.
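The batch-mode active learning loop outlined in this pseudocode can be illustrated generically; all function names below are hypothetical placeholders for the trainer, informativeness measure, and oracle, not the authors' code:

```python
def active_learning_loop(L, U, n, train, informativeness, query_oracle, rounds):
    """Pool-based active learning: train a classifier on the labeled set L,
    pick the n most informative instances from the unlabeled pool U, have
    the oracle label them, move them into L, and retrain."""
    clf = train(L)
    for _ in range(rounds):
        if not U:
            break
        # Rank the unlabeled pool by informativeness to the current classifier
        ranked = sorted(U, key=lambda x: informativeness(clf, x), reverse=True)
        batch = ranked[:n]
        for x in batch:
            U.remove(x)
            L.append((x, query_oracle(x)))  # oracle supplies the gold label
        clf = train(L)  # retrain on the augmented labeled set
    return clf
```

With an SVM as `train` and distance to the separating hyperplane as the (negated) `informativeness`, this reduces to the uncertainty-sampling setup the excerpt describes.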
labeled data is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zeng, Xiaodong and Chao, Lidia S. and Wong, Derek F. and Trancoso, Isabel and Tian, Liang
Experiments
0 Self-training Segmenters (STS): two variant models were defined by the approach reported in (Subramanya et al., 2010) that uses the supervised CRFs model’s decodings, incorporating empirical and constraint information, for unlabeled examples as additional labeled data to retrain a CRFs model.
Introduction
They leverage such mappings to either constitute a Chinese word dictionary for maximum-matching segmentation (Xu et al., 2004), or form labeled data for training a sequence labeling model (Paul et al., 2011).
Methodology
Our learning problem belongs to semi-supervised learning (SSL), as the training is done on treebank labeled data (X_L, Y_L) = {(x_1, y_1), ..., (x_l, y_l)} and bilingual unlabeled data X_U = {x_1, ..., x_u}, where x_i = {x_1, ..., x_m} is an input word sequence and y_i = {y_1, ..., y_m}, y ∈ T, is its corresponding label sequence.
labeled data is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: