Index of papers in Proc. ACL 2012 that mention
  • unlabeled data
Meng, Xinfan and Wei, Furu and Liu, Xiaohua and Zhou, Ming and Xu, Ge and Wang, Houfeng
Conclusion and Future Work
First, the proposed model can learn previously unseen sentiment words from large unlabeled data, which are not covered by the limited vocabulary in machine translation of the labeled data.
Cross-Lingual Mixture Model for Sentiment Classification
where λ_s and λ_t are weighting factors to control the influence of the unlabeled data.
Cross-Lingual Mixture Model for Sentiment Classification
We set λ_{d_i} to λ_s (λ_t) when d_i belongs to the unlabeled data, and to 1 otherwise.
Cross-Lingual Mixture Model for Sentiment Classification
When d_i belongs to the unlabeled data, P(c_j|d_i) is computed according to Equation 5 or 6.
Experiment
Larger weights indicate larger influence from the unlabeled data.
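The excerpts above describe weighting factors (λ) that scale how much the unlabeled documents contribute relative to the labeled ones. A minimal sketch of that idea, using a λ-weighted semi-supervised Naive Bayes trained with EM (this is an illustrative toy, not the paper's actual CLMM implementation; all names and data here are invented):

```python
import math
from collections import Counter

def train_weighted_nb(labeled, unlabeled, lam=0.5, iters=5):
    """Semi-supervised multinomial Naive Bayes via EM, where each
    unlabeled document's soft counts are scaled by lam (labeled
    documents get weight 1), echoing the lambda weighting factors
    in the excerpts above."""
    classes = sorted({y for _, y in labeled})
    vocab = {w for d, _ in labeled for w in d} | {w for d in unlabeled for w in d}

    def m_step(weighted_docs):
        # Accumulate (possibly fractional, lam-scaled) class and word counts.
        prior = {c: 1e-12 for c in classes}
        word = {c: Counter() for c in classes}
        for doc, dist, w in weighted_docs:
            for c in classes:
                prior[c] += w * dist[c]
                for t in doc:
                    word[c][t] += w * dist[c]
        return prior, word

    def posterior(doc, prior, word):
        # P(c | doc) with add-one smoothing over the vocabulary.
        logp = {}
        for c in classes:
            denom = sum(word[c].values()) + len(vocab)
            s = math.log(prior[c])
            for t in doc:
                s += math.log((word[c].get(t, 0.0) + 1.0) / denom)
            logp[c] = s
        m = max(logp.values())
        z = sum(math.exp(v - m) for v in logp.values())
        return {c: math.exp(logp[c] - m) / z for c in classes}

    hard = [(d, {c: float(c == y) for c in classes}, 1.0) for d, y in labeled]
    prior, word = m_step(hard)
    for _ in range(iters):
        # E-step on unlabeled data, then M-step with lam-scaled soft counts.
        soft = [(d, posterior(d, prior, word), lam) for d in unlabeled]
        prior, word = m_step(hard + soft)

    def predict(doc):
        p = posterior(doc, prior, word)
        return max(p, key=p.get)
    return predict
```

With a small λ the unlabeled pool can still teach the model words absent from the labeled data (e.g. a word like "nice" seen only in unlabeled documents) without letting noisy soft labels dominate the labeled counts.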
Experiment
Second, the two classifiers make predictions on the Chinese unlabeled data and their English translations,
Experiment
Figure 3: Accuracy with different sizes of unlabeled data for NTCIR-EN+NTCIR-CH
Related Work
After that, a co-training approach (Blum and Mitchell, 1998) is adopted to leverage the Chinese unlabeled data and their English translations to improve the SVM classifier for Chinese sentiment classification.
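Co-training, as referenced in this excerpt, trains one classifier per view of the data (here, a Chinese document and its English translation) and lets each classifier label unlabeled examples to grow the other's training set. A generic sketch of the loop (the `fit`/`predict_proba` interface and the toy data are assumptions for illustration, not the paper's SVM setup):

```python
def co_train(train_a, train_b, unlabeled_pairs, fit, k=2, rounds=3):
    """Generic co-training over two views.
    train_a/train_b: lists of (example, label) for each view;
    unlabeled_pairs: list of (view_a_example, view_b_example);
    fit: builds a model exposing predict_proba(x) -> (label, confidence)."""
    pool = list(unlabeled_pairs)
    for _ in range(rounds):
        if not pool:
            break
        model_a, model_b = fit(train_a), fit(train_b)
        # Each view scores the pool; keep the k most confident picks.
        scored_a = sorted(((model_a.predict_proba(xa), i)
                           for i, (xa, _) in enumerate(pool)),
                          key=lambda t: -t[0][1])[:k]
        scored_b = sorted(((model_b.predict_proba(xb), i)
                           for i, (_, xb) in enumerate(pool)),
                          key=lambda t: -t[0][1])[:k]
        chosen = set()
        for (label, _), i in scored_a:   # view A teaches view B
            train_b.append((pool[i][1], label)); chosen.add(i)
        for (label, _), i in scored_b:   # view B teaches view A
            train_a.append((pool[i][0], label)); chosen.add(i)
        pool = [p for i, p in enumerate(pool) if i not in chosen]
    return fit(train_a), fit(train_b)
```

The key assumption from Blum and Mitchell (1998) is that the two views are each sufficient for classification and conditionally independent given the label, so confident predictions in one view act as informative labels for the other.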
unlabeled data is mentioned in 10 sentences in this paper.
Sun, Weiwei and Uszkoreit, Hans
Abstract
Paradigmatic lexical relations are explicitly captured by word clustering on large-scale unlabeled data and are used to design new features to enhance a discriminative tagger.
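A common way to turn a hierarchical clustering into tagger features, as this excerpt describes, is to use bit-string prefixes of each word's path in the Brown merge tree: shorter prefixes name coarser clusters, longer prefixes finer ones. A small sketch of that feature extraction (the prefix lengths, feature names, and example paths below are illustrative assumptions, not the authors' exact templates):

```python
def cluster_features(word, brown_paths, prefix_lengths=(4, 6, 10, 20)):
    """Turn a word's Brown-cluster bit-string path into tagger features.
    Prefixes of the path identify clusters at coarser-to-finer
    granularity, so one clustering yields several features per word."""
    path = brown_paths.get(word)
    if path is None:
        return ["cluster=OOV"]  # word unseen in the unlabeled data
    return [f"cluster{k}={path[:k]}" for k in prefix_lengths]

# Illustrative bit-string paths; real ones come from running Brown
# clustering on large unlabeled text (the paper uses Chinese Gigaword).
paths = {"cat": "00110101", "dog": "00110100", "run": "0111"}
```

Because "cat" and "dog" share the prefix "001101", they fire the same coarse cluster features, which is exactly how paradigmatic relatedness learned from unlabeled data reaches a discriminative tagger.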
Capturing Paradigmatic Relations via Word Clustering
Brown Clustering Our first choice is the bottom-up agglomerative word clustering algorithm of Brown et al. (1992), which derives a hierarchical clustering of words from unlabeled data.
Capturing Paradigmatic Relations via Word Clustering
The large-scale unlabeled data we use in our experiments comes from the Chinese Gigaword (LDC2005T14).
Capturing Paradigmatic Relations via Word Clustering
Furthermore, as shown by Sun and Xu (2011), appropriate string knowledge acquired from large-scale unlabeled data can significantly enhance a supervised model, especially for the prediction of out-of-vocabulary (OOV) words.
Introduction
First, we employ unsupervised word clustering to explore paradigmatic relations that are encoded in large-scale unlabeled data.
unlabeled data is mentioned in 8 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek
Experiments
At each random selection, the rest of the utterances are used as unlabeled data to boost the performance of MCM.
Experiments
Being Bayesian, our model can incorporate unlabeled data at training time.
Experiments
Here, we evaluate the performance gain on domain, act and slot predictions as more unlabeled data is introduced at learning time.
unlabeled data is mentioned in 6 sentences in this paper.
Titov, Ivan and Klementiev, Alexandre
Inference
An inference algorithm for an unsupervised model should be efficient enough to handle vast amounts of unlabeled data, since such data can easily be obtained and is likely to improve results.
Introduction
This suggests that the models scale to much larger corpora, which is an important property for a successful unsupervised learning method, as unlabeled data is abundant.
Related Work
Most SRL research has focused on the supervised setting; however, the lack of annotated resources for most languages and the insufficient coverage provided by the existing resources motivate the need for using unlabeled data or other forms of weak supervision.
Related Work
This includes methods based on graph alignment between labeled and unlabeled data (Furstenau and Lapata, 2009), on using unlabeled data to improve lexical generalization (Deschacht and Moens, 2009), and on projecting annotation across languages (Pado and Lapata, 2009; van der Plas et al., 2011).
unlabeled data is mentioned in 4 sentences in this paper.