Introduction | Since unlabeled data are abundant and easy to collect, a successful semi-supervised sentiment classification system would greatly reduce the labor and time required for annotation. |
Introduction | Therefore, given the two different views mentioned above, one promising application is to adopt them in co-training, which has been proven to be an effective semi-supervised learning strategy for incorporating unlabeled data to further improve classification performance (Zhu, 2005). |
Introduction | Finally, a co-training algorithm is proposed to incorporate unlabeled data for semi-supervised sentiment classification. |
Related Work | Semi-supervised methods combine unlabeled data with labeled training data (often small in quantity) to improve the models. |
Unsupervised Mining of Personal and Impersonal Views | Semi-supervised learning is a strategy which combines unlabeled data with labeled training data to improve the models. |
Unsupervised Mining of Personal and Impersonal Views | The co-training algorithm is a specific semi-supervised learning approach which starts with a set of labeled data and increases the amount of labeled data using the unlabeled data by bootstrapping (Blum and Mitchell, 1998). |
Unsupervised Mining of Personal and Impersonal Views | The unlabeled data U contains a personal sentence set and an impersonal sentence set. |
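Unsupervised Mining of Personal and Impersonal Views | To make the bootstrapping loop concrete, here is a minimal co-training sketch in the spirit of Blum and Mitchell (1998). The two feature views stand in for the personal and impersonal sentence sets; the classifier choice, the growth size g, and the confidence-based selection rule are illustrative assumptions, not the paper's exact settings. |

```python
# Minimal co-training sketch (Blum and Mitchell, 1998).
# X1_*/X2_* are the two feature views (e.g., personal vs. impersonal
# sentences) of the labeled (l) and unlabeled (u) data; g is how many
# examples each view promotes per round. All settings are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(X1_l, X2_l, y_l, X1_u, X2_u, rounds=10, g=5):
    y_l = list(y_l)
    pool = list(range(len(X1_u)))          # indices of still-unlabeled items
    c1, c2 = LogisticRegression(), LogisticRegression()
    for _ in range(rounds):
        c1.fit(X1_l, y_l)
        c2.fit(X2_l, y_l)
        # Each view's classifier labels the g unlabeled examples it is
        # most confident about; those join the shared labeled set.
        for clf, X_view in ((c1, X1_u), (c2, X2_u)):
            if not pool:
                return c1, c2
            probs = clf.predict_proba(X_view[pool])
            picks = np.argsort(-probs.max(axis=1))[:g]
            for p in picks:
                idx = pool[p]
                X1_l = np.vstack([X1_l, X1_u[idx]])
                X2_l = np.vstack([X2_l, X2_u[idx]])
                y_l.append(clf.classes_[probs[p].argmax()])
            pool = [v for i, v in enumerate(pool) if i not in set(picks)]
    return c1, c2
```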
Abstract | We report on analyses that reveal quantitative insights about the use of unlabeled data and the complexity of inter-language correspondence modeling. |
Cross-Language Structural Correspondence Learning | Note that the support of w_s and w_t can be determined from the unlabeled data D_u. |
Cross-Language Structural Correspondence Learning | Input: labeled source data D_S; unlabeled data D_u = D_{S,u} ∪ D_{T,u} |
Experiments | Due to the use of task-specific unlabeled data, relevant characteristics are captured by the pivot classifiers. |
Experiments | Unlabeled Data. The first row of Figure 2 shows the performance of CL-SCL as a function of the ratio of labeled and unlabeled documents. |
Related Work | In the basic domain adaptation setting we are given labeled data from the source domain and unlabeled data from the target domain, and the goal is to train a classifier for the target domain. |
Related Work | SCL then models the correlation between the pivots and all other features by training linear classifiers on the unlabeled data from both domains. |
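Related Work | The pivot-modeling step just described can be sketched as below; the masking scheme, classifier choice, and SVD dimensionality are placeholder assumptions rather than the exact SCL configuration. |

```python
# Sketch of SCL's pivot-modeling step: for each pivot feature, train a
# linear classifier on unlabeled data pooled from BOTH domains to predict
# the pivot's presence from all remaining features, then SVD the stacked
# weight vectors to get a low-dimensional cross-domain projection.
import numpy as np
from sklearn.linear_model import SGDClassifier

def scl_projection(X_u, pivot_cols, dim=50):
    """X_u: (n_docs, n_feats) binary term matrix from both domains."""
    X_masked = X_u.copy()
    X_masked[:, pivot_cols] = 0            # pivots must not predict themselves
    W = []
    for p in pivot_cols:
        y = X_u[:, p]                      # 1 if the pivot occurs, else 0
        clf = SGDClassifier(loss="modified_huber", max_iter=20)
        clf.fit(X_masked, y)
        W.append(clf.coef_.ravel())
    # Left singular vectors of the weight matrix give the projection theta;
    # downstream, each feature vector x is typically augmented with x @ theta.T.
    U, _, _ = np.linalg.svd(np.array(W).T, full_matrices=False)
    return U[:, :dim].T                    # theta: (dim, n_feats)
```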
Related Work | Ando and Zhang (2005b) present a semi-supervised learning method based on this paradigm, which generates related tasks from unlabeled data. |
Introduction | By using unlabeled data to reduce the sparsity of the labeled training data, semi-supervised approaches improve generalization accuracy. |
Unlabeled Data | Unlabeled data is used for inducing the word representations. |
Unlabeled Data | (2009), we found that all word representations performed better on the supervised task when they were induced on the clean unlabeled data, both embeddings and Brown clusters. |
Unlabeled Data | Note that cleaning is applied only to the unlabeled data, not to the labeled data used in the supervised tasks. |
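Unlabeled Data | As an illustration of inducing representations on cleaned unlabeled text only, the sketch below uses gensim's Word2Vec as a stand-in for the embedding and Brown-cluster inducers referred to here; the corpus file name and the cleaning heuristic are assumptions. |

```python
# Illustrative only: induce word representations on cleaned unlabeled text.
# Word2Vec stands in for whichever inducer the text refers to; the file
# name and the cleaning rule below are hypothetical.
from gensim.models import Word2Vec

def clean(line):
    # Placeholder cleaning: lowercase and drop very short lines.
    toks = line.lower().split()
    return toks if len(toks) >= 5 else None

with open("unlabeled_corpus.txt", encoding="utf-8") as f:
    sentences = [t for t in (clean(l) for l in f) if t]

# Representations are induced on the cleaned *unlabeled* data only;
# the labeled task data is left untouched, as the text notes.
model = Word2Vec(sentences, vector_size=50, window=5, min_count=5, workers=4)
model.wv.save("word_vectors.kv")
```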
A ← METRICLEARNER(X, Ŝ, Ŷ) | In this case, we treat the set of test instances (without their gold labels) as the unlabeled data. |
Introduction | IDML-IT (Dhillon et al., 2010) is another such method which exploits labeled as well as unlabeled data during metric learning. |
Metric Learning | The ITML metric learning algorithm, which we reviewed in Section 2.2, is supervised in nature, and hence it does not exploit widely available unlabeled data. |
Metric Learning | In this section, we review Inference Driven Metric Learning (IDML; Algorithm 1) (Dhillon et al., 2010), a recently proposed framework that combines an existing supervised metric learning algorithm (such as ITML) with transductive graph-based label inference to learn a new distance metric from labeled and unlabeled data combined. |
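Metric Learning | A schematic of that loop follows: alternate a supervised metric learner with graph-based inference, promoting confidently inferred labels into the supervised pool. The metric_learner callable (standing in for ITML), the entropy threshold beta, and the use of scikit-learn's LabelSpreading for the inference step are assumptions, not the authors' exact components. |

```python
# Schematic IDML-style loop: alternate supervised metric learning with
# graph-based label inference, promoting confident inferences to labels.
import numpy as np
from scipy.stats import entropy
from sklearn.semi_supervised import LabelSpreading

def idml(X, y, labeled_mask, metric_learner, beta=0.1, max_iter=5):
    """y uses -1 for unlabeled points; metric_learner is a hypothetical
    callable (standing in for ITML) returning a linear transform L."""
    y_hat = y.copy()
    known = labeled_mask.copy()
    for _ in range(max_iter):
        L = metric_learner(X[known], y_hat[known])   # learn the metric
        Z = X @ L.T                                  # re-embed all instances
        # Transductive graph-based inference over the re-embedded points.
        prop = LabelSpreading(kernel="knn", n_neighbors=10).fit(Z, y_hat)
        dist = prop.label_distributions_
        # Promote low-entropy (confident) predictions to "labeled" status.
        confident = (entropy(dist.T) < beta) & ~known
        if not confident.any():
            break
        y_hat[confident] = prop.classes_[dist[confident].argmax(axis=1)]
        known |= confident
    return y_hat
```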