Abstract | We propose a semi-supervised AL approach for sequence labeling where only highly uncertain subsequences are presented to human annotators, while all others in the selected sequences are automatically labeled. |
Active Learning for Sequence Labeling | This section first describes a common approach to AL for sequential data and then presents our approach to semi-supervised AL. |
Active Learning for Sequence Labeling | 3.2 Semi-Supervised Active Learning |
Active Learning for Sequence Labeling | We call this semi-supervised Active Learning (SeSAL) for sequence labeling. |
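The SeSAL selection step described above can be sketched as follows: within a selected sequence, only tokens whose top marginal probability falls below a confidence threshold are routed to the human annotator, while the rest keep the model's label. The threshold, label set, and marginal values below are illustrative assumptions, not values from the paper:

```python
def split_sequence(tokens, marginals, threshold=0.9):
    """Return (auto_labeled, to_annotate) for one selected sequence."""
    auto_labeled, to_annotate = [], []
    for tok, dist in zip(tokens, marginals):
        label, conf = max(dist.items(), key=lambda kv: kv[1])
        if conf >= threshold:
            auto_labeled.append((tok, label))  # machine-labeled token
        else:
            to_annotate.append(tok)            # routed to the human annotator
    return auto_labeled, to_annotate

# Toy marginals for one selected sentence (illustrative values)
tokens = ["IL-2", "gene", "expression", "in", "T", "cells"]
marginals = [
    {"B-DNA": 0.55, "O": 0.45},   # uncertain -> ask the annotator
    {"I-DNA": 0.97, "O": 0.03},
    {"O": 0.99, "I-DNA": 0.01},
    {"O": 0.999, "B-DNA": 0.001},
    {"B-CELL": 0.60, "O": 0.40},  # uncertain -> ask the annotator
    {"I-CELL": 0.95, "O": 0.05},
]
auto, manual = split_sequence(tokens, marginals)
print(manual)  # ['IL-2', 'T']
```

In a full SeSAL loop these marginals would come from the CRF; here they are hard-coded to keep the sketch self-contained.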
Introduction | Accordingly, our approach is a combination of AL and self-training, which we will refer to as semi-supervised Active Learning (SeSAL) for sequence labeling. |
Introduction | After a brief overview of the formal underpinnings of Conditional Random Fields, our base classifier for sequence labeling tasks (Section 2), a fully supervised approach to AL for sequence labeling is introduced and complemented by our semi-supervised approach in Section 3. |
Introduction | Our experiments are laid out in Section 5 where we compare fully and semi-supervised AL for NER on two corpora, the newspaper selection of MUC7 and PENNBIOIE, a biological abstracts corpus. |
Related Work | Self-training (Yarowsky, 1995) is a form of semi-supervised learning. |
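As an illustration of the general self-training recipe (not Yarowsky's specific algorithm), the toy loop below trains a classifier on a labeled seed, absorbs only confident predictions from the unlabeled pool, and repeats. The nearest-centroid "classifier", margin threshold, and 1-D data are all invented for the example:

```python
def centroids(labeled):
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in by_class.items()}

def predict(cents, x):
    # Confidence = margin between the two nearest class centroids.
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
    return ranked[0], margin

def self_train(labeled, unlabeled, min_margin=2.0, rounds=5):
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled)
        keep = []
        for x in pool:
            y, margin = predict(cents, x)
            if margin >= min_margin:       # absorb only confident predictions
                keep.append((x, y))
        if not keep:
            break
        labeled += keep
        kept = {x for x, _ in keep}
        pool = [x for x in pool if x not in kept]
    return centroids(labeled)

cents = self_train([(0.0, "neg"), (10.0, "pos")], [1.0, 2.0, 8.5, 9.5, 5.1])
print(predict(cents, 3.0)[0])  # 'neg'
```

Note that the ambiguous point 5.1 is never absorbed: its margin stays below the threshold in every round, which is exactly the behavior that distinguishes self-training from simply labeling everything.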
Related Work | A combination of active and semi-supervised learning was first proposed by McCallum and Nigam (1998) for text classification. |
Related Work | Similarly, co-testing (Muslea et al., 2002), a multi-view AL algorithm, selects examples for the multi-view, semi-supervised Co-EM algorithm. |
Abstract | In this paper, we exploit semi-supervised learning with the co-training algorithm for automatic detection of coarse level representation of prosodic events such as pitch accents, intonational phrase boundaries, and break indices. |
Co-training strategy for prosodic event detection | Co-training (Blum and Mitchell, 1998) is a semi-supervised multi-view algorithm that uses the initial training set to learn a (weak) classifier in each view. |
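A toy round of that scheme might look as follows. The keyword-overlap "classifiers" and the two feature views (acoustic-like and lexical-like) are stand-ins invented for the sketch, not the classifiers or features used in this work:

```python
def train(view_docs):
    # "Model" = the set of words seen with each class in this view.
    model = {}
    for words, y in view_docs:
        model.setdefault(y, set()).update(words)
    return model

def classify(model, words):
    scores = {y: len(set(words) & vocab) for y, vocab in model.items()}
    best = max(scores, key=scores.get)
    margin = scores[best] - min(scores.values())
    return best, margin

def cotrain_round(l1, l2, unlabeled, min_margin=1):
    m1, m2 = train(l1), train(l2)
    for v1, v2 in unlabeled:
        y1, g1 = classify(m1, v1)
        y2, g2 = classify(m2, v2)
        if g1 >= min_margin:   # view 1 confidently teaches view 2
            l2.append((v2, y1))
        if g2 >= min_margin:   # view 2 confidently teaches view 1
            l1.append((v1, y2))
    return l1, l2

# Toy "acoustic" and "lexical" views (features invented for the example)
l1 = [(["high_f0", "long_dur"], "accent"), (["low_f0", "short_dur"], "none")]
l2 = [(["NOUN", "content"], "accent"), (["DET", "function"], "none")]
unlabeled = [
    (["high_f0", "long_dur"], ["ADJ", "content"]),
    (["low_f0"], ["DET", "function"]),
]
l1, l2 = cotrain_round(l1, l2, unlabeled)
print(len(l1), len(l2))  # 4 4
```

The key design point is that each view passes its *own* confident label to the *other* view's training set, so the two classifiers can teach each other about examples that only one view can resolve.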
Conclusions | In addition, we plan to compare this to other semi-supervised learning techniques such as active learning. |
Experiments and results | Although the test condition is different, our result is significantly better than those of other semi-supervised approaches in previous work and comparable with supervised approaches. |
Introduction | Limited research has been conducted using unsupervised and semi-supervised methods. |
Introduction | In this paper, we exploit semi-supervised learning with the co-training algorithm. |
Introduction | Our experiments on the Boston Radio News corpus show that the use of unlabeled data can lead to significant improvement of prosodic event detection compared to using the original small training set, and that the semi-supervised learning result is comparable with supervised learning with a similar amount of training data. |
Previous work | Limited research has been done on prosodic event detection using unsupervised or semi-supervised methods. |
Previous work | She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples. |
Previous work | In this paper, we apply the co-training algorithm to automatic prosodic event detection and propose methods to better select samples to improve semi-supervised learning performance for this task. |
Experiments | Genuinely unlabeled posts for Political and Lotus were used for semi-supervised learning experiments in section 6.3; they were not used in section 6.2 on the effect of lexical prior knowledge. |
Experiments | Robustness to Vocabulary Size. High dimensionality and noise can have a profound impact on the comparative performance of clustering and semi-supervised learning algorithms. |
Experiments | The natural question is whether the presence of lexical constraints leads to better semi-supervised models. |
Introduction | However, the treatment of such dictionaries as forms of prior knowledge that can be incorporated in machine learning models is a relatively less explored topic; even less so in conjunction with semi-supervised models that attempt to utilize unlabeled data. |
Related Work | In this regard, our model brings two interrelated but distinct themes from machine learning to bear on this problem: semi-supervised learning and learning from labeled features. |
Related Work | (Goldberg and Zhu, 2006) adapt semi-supervised graph-based methods for sentiment analysis but do not incorporate lexical prior knowledge in the form of labeled features. |
Related Work | We also note the very recent work of (Sindhwani and Melville, 2008) which proposes a dual-supervision model for semi-supervised sentiment analysis. |
Semi-Supervised Learning With Lexical Knowledge | Therefore, semi-supervised learning with lexical knowledge can be described as |
Semi-Supervised Learning With Lexical Knowledge | Thus the algorithm for semi-supervised learning with lexical knowledge based on our matrix factorization framework, referred to as SSMFLK, consists of an iterative procedure that applies the above three rules until convergence. |
Abstract | To address this problem, we propose a semi-supervised approach to sentiment classification where we first mine the unambiguous reviews using spectral techniques and then exploit them to classify the ambiguous reviews via a novel combination of active learning, transductive learning, and ensemble learning. |
Conclusions | We have proposed a novel semi-supervised approach to polarity classification. |
Conclusions | Since the semi-supervised learner is discriminative, our approach can adopt a richer representation that makes use of more sophisticated features such as bigrams or manually labeled sentiment-oriented words. |
Evaluation | Semi-supervised spectral clustering. |
Evaluation | We implemented Kamvar et al.’s (2003) semi-supervised spectral clustering algorithm, which incorporates labeled data into the clustering framework in the form of must-link and cannot-link constraints. |
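The constraint step of that algorithm can be sketched as below: labeled data enter the clustering only through the affinity matrix, with must-link pairs forced to maximal affinity and cannot-link pairs to zero. The subsequent spectral decomposition is omitted, and the matrix values are illustrative:

```python
def apply_constraints(affinity, must_link, cannot_link):
    A = [row[:] for row in affinity]   # work on a copy
    for i, j in must_link:
        A[i][j] = A[j][i] = 1.0        # same known label: maximal affinity
    for i, j in cannot_link:
        A[i][j] = A[j][i] = 0.0        # different known labels: no affinity
    return A

# Toy 3-point affinity matrix (symmetric, illustrative values)
affinity = [
    [1.0, 0.6, 0.4],
    [0.6, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]
A = apply_constraints(affinity, must_link=[(0, 1)], cannot_link=[(1, 2)])
print(A[0][1], A[1][2])  # 1.0 0.0
```

Clustering the eigenvectors of the constrained matrix then tends to keep must-link points together and cannot-link points apart.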
Evaluation | As we can see, accuracy ranges from 57.3% to 68.7% and ARI ranges from 0.02 to 0.14, which are significantly better than those of semi-supervised spectral learning. |
Introduction | In light of the difficulties posed by ambiguous reviews, we differentiate between ambiguous and unambiguous reviews in our classification process by addressing the task of semi-supervised polarity classification via a “mine the easy, classify the hard” approach. |
Our Approach | Recall that the goal of this step is not only to identify the unambiguous reviews, but also to annotate them as POSITIVE or NEGATIVE, so that they can serve as seeds for semi-supervised learning in a later step. |
Discussion | Therefore, it is also interesting to combine them in a discriminative way as pursued in POS tagging using CRF+HMM (Suzuki et al., 2007), not to mention the simple semi-supervised approach in Section 5.2. |
Experiments | Table 4: Semi-supervised and supervised results. |
Experiments | Semi-supervised results used only 10K sentences (1/5) of supervised segmentations. |
Experiments | Furthermore, NPYLM is easily amenable to semi-supervised or even supervised learning. |
Introduction | In Section 5 we describe experiments on the standard datasets in Chinese and Japanese in addition to English phonetic transcripts, and semi-supervised experiments are also explored. |
Abstract | We present a graph-based semi-supervised learning method for ranking candidate sentences in the question-answering (QA) task. |
Abstract | We implement a semi-supervised learning (SSL) approach to demonstrate that utilization of more unlabeled data points can improve the answer-ranking task of QA. |
Conclusions and Discussions | (2) We will use other distance measures to better explain entailment between q/a pairs and compare with other semi-supervised and transductive approaches. |
Graph Based Semi-Supervised Learning for Entailment Ranking | We formulate semi-supervised entailment rank scores as follows. |
Introduction | Recent research indicates that using labeled and unlabeled data in a semi-supervised learning (SSL) setting, with an emphasis on graph-based methods, can improve the performance of information extraction from data for tasks such as question classification (Tri et al., 2006), web classification (Liu et al., 2006), relation extraction (Chen et al., 2006), passage retrieval (Otterbacher et al., 2009), and various natural language processing tasks such as part-of-speech tagging, named-entity recognition (Suzuki and Isozaki, 2008), and word-sense disambiguation. |
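A minimal sketch of the graph-based idea, in generic label-propagation style rather than any one cited paper's exact formulation: labels diffuse from a few labeled nodes to unlabeled ones over a weighted affinity graph, with labeled nodes clamped to their known class. The q/a graph and edge weights below are invented for illustration:

```python
def propagate(W, labels, iters=50):
    """W: weighted adjacency dicts; labels: node -> class for labeled nodes."""
    classes = set(labels.values())
    scores = {n: {c: 0.0 for c in classes} for n in W}
    for n, c in labels.items():
        scores[n][c] = 1.0
    for _ in range(iters):
        for n in W:
            if n in labels:
                continue                       # labeled nodes stay clamped
            total = sum(W[n].values())
            for c in classes:
                scores[n][c] = sum(W[n][m] * scores[m][c] for m in W[n]) / total
    return {n: max(scores[n], key=scores[n].get) for n in W}

# Invented toy graph: a question node linked to candidate answer sentences
W = {
    "q":  {"a1": 1.0, "a2": 0.2},
    "a1": {"q": 1.0, "a3": 1.0},
    "a2": {"q": 0.2, "a4": 1.0},
    "a3": {"a1": 1.0},
    "a4": {"a2": 1.0},
}
pred = propagate(W, {"a3": "relevant", "a4": "irrelevant"})
print(pred["a1"], pred["a2"])  # relevant irrelevant
```

The raw class scores (before the argmax) can also serve directly as ranking scores for the candidate sentences.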
Abstract | In this paper, we present a method combining active learning and semi-supervised learning to handle the case where only positive examples, i.e., examples that have the expected word sense in web search results, are given. |
Introduction | McCallum and Nigam (1998) combined active learning and semi-supervised learning techniques |
Introduction | Figure 1: A combination of active learning and semi-supervised learning starting with positive and unlabeled examples |
Introduction | Next we use Nigam's semi-supervised learning method using EM and a naive Bayes classifier (Nigam et al. |
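The EM-with-naive-Bayes recipe can be sketched as follows; this is a generic reconstruction (binary bag-of-words, Laplace smoothing, invented toy data), not the exact model of the cited work. Fit naive Bayes on the labeled documents, then alternate between soft-labeling the unlabeled documents (E-step) and refitting on labeled plus soft-labeled data (M-step):

```python
import math
from collections import Counter

def fit_nb(docs, vocab):
    """docs: list of (words, weights) where weights maps class -> soft count."""
    classes = {c for _, w in docs for c in w}
    prior, cond = {}, {}
    for c in classes:
        prior[c] = sum(w.get(c, 0.0) for _, w in docs) / len(docs)
        counts, total = Counter(), 0.0
        for words, w in docs:
            for word in words:
                counts[word] += w.get(c, 0.0)
                total += w.get(c, 0.0)
        cond[c] = {v: (counts[v] + 1.0) / (total + len(vocab)) for v in vocab}
    return prior, cond

def posterior(model, words):
    prior, cond = model
    logp = {c: math.log(prior[c]) +
               sum(math.log(cond[c][w]) for w in words if w in cond[c])
            for c in prior}
    m = max(logp.values())
    z = sum(math.exp(v - m) for v in logp.values())
    return {c: math.exp(v - m) / z for c, v in logp.items()}

def em_nb(labeled, unlabeled, vocab, iters=5):
    hard = [(words, {c: 1.0}) for words, c in labeled]
    model = fit_nb(hard, vocab)
    for _ in range(iters):
        soft = [(words, posterior(model, words)) for words in unlabeled]  # E-step
        model = fit_nb(hard + soft, vocab)                                # M-step
    return model

# Invented toy data: one labeled document per class, three unlabeled ones
labeled = [(["cheap", "deal"], "spam"), (["meeting", "agenda"], "ham")]
unlabeled = [["cheap", "offer"], ["agenda", "notes"], ["cheap", "deal", "offer"]]
vocab = {w for ws, _ in labeled for w in ws} | {w for ws in unlabeled for w in ws}
model = em_nb(labeled, unlabeled, vocab)
p = posterior(model, ["offer"])
print(max(p, key=p.get))  # spam
```

Note how "offer" never occurs in a labeled document, yet EM associates it with the spam class because it co-occurs with "cheap" in the unlabeled pool; this is the effect that lets unlabeled data improve the classifier.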
Experiments | (2006): the semi-supervised Alternating Structural Optimization (ASO) technique and the Structural Correspondence Learning (SCL) technique for domain adaptation. |
Related Work | Ando and Zhang develop a semi-supervised chunker that outperforms purely supervised approaches on the CoNLL 2000 dataset (Ando and Zhang, 2005). |
Related Work | Recent projects in semi-supervised (Toutanova and Johnson, 2007) and unsupervised (Biemann et al., 2007; Smith and Eisner, 2005) tagging also show significant progress. |
Related Work | HMMs have been used many times for POS tagging and chunking, in supervised, semi-supervised, and unsupervised settings (Banko and Moore, 2004; Goldwater and Griffiths, 2007; Johnson, 2007; Zhou, 2004). |
Abstract | In this paper, we propose a novel method for semi-supervised learning of non-projective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g. |
Experimental Comparison with Unsupervised Learning | In this paper, we developed a novel method for the semi-supervised learning of a non-projective CRF dependency parser that directly uses linguistic prior knowledge as a training signal. |
Related Work | Conventional semi-supervised learning requires parsed sentences. |
Related work | (2006) explored semi-supervised learning for relation extraction using label propagation, which makes use of unlabeled data. |
Task definition | Another existing solution to weakly-supervised learning problems is semi-supervised learning, e.g. |
Task definition | However, because our proposed transfer learning method can be combined with semi-supervised learning, here we do not include semi-supervised learning as a baseline. |
Abstract | The evaluation set is derived from WordNet in a semi-supervised way. |
Conclusion and Future Work | We proposed a semi-supervised way to extract non-compositional MWEs from WordNet. |
Introduction and related work | Thirdly, we propose a semi-supervised approach for extracting non-compositional MWEs from WordNet, to decrease annotation cost. |
Discussion and Related Work | In earlier work on semi-supervised learning, e. g., (Blum and Mitchell 1998), the classifiers learned from unlabeled data were used directly. |
Discussion and Related Work | There are two main scenarios that motivate semi-supervised learning. |
Introduction | In this paper, we present a semi-supervised learning algorithm that goes a step further. |
Introduction | Even with semi-supervised approaches, which use a large unlabeled corpus, manual construction of a small set of seeds known as true instances of the target entity or relation is susceptible to arbitrary human decisions. |
Introduction | The combination of these patterns yields a clustering method that achieves high precision for different Information Extraction applications, especially for bootstrapping a high-recall semi-supervised relation extraction system. |
Related Work | (Rosenfeld and Feldman, 2006) showed that the clusters discovered by URI are useful for seeding a semi-supervised relation extraction system. |