Conclusion and Future Work | First, the proposed model can learn previously unseen sentiment words from large unlabeled data, which are not covered by the limited vocabulary in the machine translation of the labeled data. |
Cross-Lingual Mixture Model for Sentiment Classification | where λ_s(d_i) and λ_t(d_i) are weighting factors to control the influence of the unlabeled data. |
Cross-Lingual Mixture Model for Sentiment Classification | We set λ_s(d_i) (λ_t(d_i)) to λ_s (λ_t) when d_i belongs to unlabeled data, 1 otherwise. |
Cross-Lingual Mixture Model for Sentiment Classification | When d_i belongs to unlabeled data, P(c_j|d_i) is computed according to Equation 5 or 6. |
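The weighting scheme in the lines above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy per-document log-probabilities, and the flat list representation are all hypothetical, not the paper's implementation; only the idea that unlabeled documents contribute with weight λ while labeled documents contribute with weight 1 comes from the text.

```python
# Hypothetical sketch: each document d_i contributes its log P(c_j | d_i)
# to the objective with weight lambda (lam) if it is unlabeled, 1 otherwise.

def doc_weight(is_unlabeled: bool, lam: float) -> float:
    """Return lam for unlabeled documents, 1.0 for labeled ones."""
    return lam if is_unlabeled else 1.0

def weighted_log_likelihood(docs, lam=0.5):
    """Sum per-document log-probabilities, scaled by the weight above.

    `docs` is a list of (log_prob, is_unlabeled) pairs; in the paper the
    unlabeled-document probability would come from Equation 5 or 6.
    """
    return sum(doc_weight(u, lam) * lp for lp, u in docs)

docs = [(-1.2, False), (-0.7, True), (-2.0, True)]
total = weighted_log_likelihood(docs, lam=0.5)
# -1.2 + 0.5 * (-0.7) + 0.5 * (-2.0) = -2.55
```

A larger λ lets the unlabeled data pull the model parameters more strongly, which matches the observation elsewhere in these excerpts that larger weights indicate larger influence from the unlabeled data.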
Experiment | Larger weights indicate larger influence from the unlabeled data. |
Experiment | Second, the two classifiers make predictions on the Chinese unlabeled data and its English translation, |
Experiment | Figure 3: Accuracy with different sizes of unlabeled data for NTCIR-EN+NTCIR-CH |
Related Work | After that, the co-training approach (Blum and Mitchell, 1998) is adopted to leverage the Chinese unlabeled data and its English translation to improve the SVM classifier for Chinese sentiment classification. |
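The co-training loop described above can be sketched as follows. This is a toy illustration, not the cited method: the "classifiers" are keyword counters rather than the SVMs the work uses, and the names `make_classifier`, `co_train`, and the threshold scheme are assumptions for exposition. The core idea it shows is the one from Blum and Mitchell (1998): two classifiers trained on different views of the same data (here, a Chinese text and its English translation) label unlabeled examples for each other.

```python
# Toy co-training sketch: illustrative keyword counters stand in for
# the view-specific classifiers (e.g., SVMs) used in practice.

def make_classifier(pos_words):
    """Build a trivial classifier: positive if any cue word occurs."""
    def predict(text):
        score = sum(w in text for w in pos_words)
        return ("pos" if score > 0 else "neg", score)
    return predict

def co_train(clf_cn, clf_en, unlabeled_pairs, threshold=1):
    """Each pair is (chinese_text, english_translation).

    A classifier's confident prediction on its own view becomes a
    pseudo-label for the same example in the other view.
    """
    pseudo_cn, pseudo_en = [], []
    for cn, en in unlabeled_pairs:
        label_en, conf_en = clf_en(en)
        if conf_en >= threshold:
            pseudo_cn.append((cn, label_en))  # English view labels Chinese data
        label_cn, conf_cn = clf_cn(cn)
        if conf_cn >= threshold:
            pseudo_en.append((en, label_cn))  # Chinese view labels English data
    return pseudo_cn, pseudo_en

clf_en = make_classifier({"good", "great"})  # toy English sentiment cues
clf_cn = make_classifier({"好"})             # toy Chinese sentiment cue
pseudo_cn, pseudo_en = co_train(clf_cn, clf_en, [("很好", "very good")])
# pseudo_cn now holds ("很好", "pos"), labeled via the English view
```

In a full co-training run, the pseudo-labeled examples would be added to each classifier's training set and the loop repeated, which is how the unlabeled data improves the final Chinese sentiment classifier.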
Abstract | Paradigmatic lexical relations are explicitly captured by word clustering on large-scale unlabeled data and are used to design new features to enhance a discriminative tagger. |
Capturing Paradigmatic Relations via Word Clustering | Brown Clustering Our first choice is the bottom-up agglomerative word clustering algorithm of Brown et al. (1992), which derives a hierarchical clustering of words from unlabeled data. |
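The bottom-up agglomerative procedure can be sketched as follows. This is a heavily simplified illustration, not Brown et al. (1992): real Brown clustering merges the pair of classes that minimizes the loss in average mutual information of class bigrams, whereas this sketch merges by raw context-vector overlap. All function names here are assumptions; only the start-with-singletons, repeatedly-merge structure reflects the algorithm.

```python
# Simplified agglomerative word clustering: start with one cluster per
# word, then repeatedly merge the two clusters whose aggregate context
# distributions overlap most. (Brown clustering instead merges by a
# mutual-information criterion over class bigrams.)
from collections import Counter

def context_counts(corpus, window=1):
    """Map each word to a Counter of its neighbors within `window`."""
    ctx = {}
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    ctx.setdefault(w, Counter())[sent[j]] += 1
    return ctx

def similarity(c1, c2):
    """Unnormalized dot product of two context Counters."""
    return sum(c1[w] * c2[w] for w in set(c1) & set(c2))

def agglomerate(corpus, n_clusters):
    ctx = context_counts(corpus)
    clusters = [({w}, c) for w, c in ctx.items()]
    while len(clusters) > n_clusters:
        # Find and merge the most similar pair of clusters.
        i, j = max(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: similarity(clusters[ab[0]][1], clusters[ab[1]][1]),
        )
        words = clusters[i][0] | clusters[j][0]
        counts = clusters[i][1] + clusters[j][1]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((words, counts))
    return [sorted(words) for words, _ in clusters]

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"],
          ["a", "cat", "ran"], ["a", "dog", "ran"]]
# "cat" and "dog" share contexts (the/a ... sat/ran), so they merge first.
```

Because distributionally similar words end up in the same cluster, cluster identities can serve as the coarse paradigmatic features for a discriminative tagger that these excerpts describe.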
Capturing Paradigmatic Relations via Word Clustering | The large-scale unlabeled data we use in our experiments comes from the Chinese Gigaword (LDC2005T14). |
Capturing Paradigmatic Relations via Word Clustering | Furthermore, as shown in Sun and Xu (2011), appropriate string knowledge acquired from large-scale unlabeled data can significantly enhance a supervised model, especially for the prediction of out-of-vocabulary (OOV) words. |
Introduction | First, we employ unsupervised word clustering to explore paradigmatic relations that are encoded in large-scale unlabeled data. |
Experiments | At each random selection, the rest of the utterances are used as unlabeled data to boost the performance of MCM. |
Experiments | Being Bayesian, our model can incorporate unlabeled data at training time. |
Experiments | Here, we evaluate the performance gain on domain, act and slot predictions as more unlabeled data is introduced at learning time. |
Inference | An inference algorithm for an unsupervised model should be efficient enough to handle vast amounts of unlabeled data, as it can easily be obtained and is likely to improve results. |
Introduction | This suggests that the models scale to much larger corpora, which is an important property for a successful unsupervised learning method, as unlabeled data is abundant. |
Related Work | Most of the SRL research has focused on the supervised setting; however, the lack of annotated resources for most languages and the insufficient coverage provided by the existing resources motivate the need for using unlabeled data or other forms of weak supervision. |
Related Work | This includes methods based on graph alignment between labeled and unlabeled data (Fürstenau and Lapata, 2009), using unlabeled data to improve lexical generalization (Deschacht and Moens, 2009), and projection of annotation across languages (Padó and Lapata, 2009; van der Plas et al., 2011). |