Index of papers in Proc. ACL 2013 that mention
  • semi-supervised
Lucas, Michael and Downey, Doug
Abstract
Semi-supervised learning (SSL) methods augment standard machine learning (ML) techniques to leverage unlabeled data.
Abstract
In this paper, we show that improving marginal word frequency estimates using unlabeled data can enable semi-supervised text classification that scales to massive unlabeled data sets.
Introduction
Semi-supervised Learning (SSL) is a Machine Learning (ML) approach that utilizes large amounts of unlabeled data, combined with a smaller amount of labeled data, to learn a target function (Zhu, 2006; Chapelle et al., 2006).
Introduction
Typically, for each target concept to be learned, a semi-supervised classifier is trained using iterative techniques that execute multiple passes over the unlabeled data (e.g., Expectation-Maximization (Nigam et al., 2000) or Label Propagation (Zhu and Ghahramani, 2002)).
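The iterative recipe cited here is easy to make concrete. As a minimal sketch, assuming a two-class multinomial Naive Bayes model in the style of Nigam et al. (2000), and with all names illustrative rather than taken from the paper, EM over labeled and unlabeled count data might look like:

    import numpy as np

    def em_mnb(Xl, yl, Xu, n_iter=10, alpha=1.0):
        """EM for semi-supervised multinomial Naive Bayes (a sketch, not the
        paper's code). Xl: labeled count matrix (n_l x T); yl: labels in {0, 1};
        Xu: unlabeled count matrix (n_u x T)."""
        K = 2
        Y = np.eye(K)[yl]                       # fixed responsibilities, labeled docs
        R = np.full((Xu.shape[0], K), 1.0 / K)  # soft labels for unlabeled docs
        for _ in range(n_iter):
            # M-step: re-estimate class priors and word probabilities from soft counts
            prior = (Y.sum(0) + R.sum(0)) / (len(Y) + len(R))
            wc = Y.T @ Xl + R.T @ Xu            # expected per-class word counts
            theta = (wc + alpha) / (wc.sum(1, keepdims=True) + alpha * Xl.shape[1])
            # E-step: recompute class posteriors for the unlabeled docs
            logp = np.log(prior) + Xu @ np.log(theta).T
            logp -= logp.max(1, keepdims=True)
            R = np.exp(logp)
            R /= R.sum(1, keepdims=True)
        return prior, theta

Each E-step is one of the "multiple passes over the unlabeled data" the sentence refers to; a new document x is scored with np.log(prior) + x @ np.log(theta).T.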
Problem Definition
We consider a semi-supervised classification task, in which the goal is to produce a mapping from an instance space X consisting of T-tuples of nonnegative integer-valued features w = (w_1, ..., w_T).
Problem Definition
Our semi-supervised technique utilizes statistics computed over the labeled corpus, denoted as follows.
Problem Definition
In addition to Multinomial Naive Bayes (discussed in Section 3), we evaluate against a variety of supervised and semi-supervised techniques from previous work, which provide a representation of the state of the art.
semi-supervised is mentioned in 15 sentences in this paper.
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Abstract
This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
Abstract
Empirical results on the Chinese Treebank (CTB-7) and Microsoft Research (MSR) corpora reveal that the proposed model can yield better results than the supervised baselines and other competitive semi-supervised CRFs in this task.
Introduction
Therefore, semi-supervised joint S&T appears to be a natural solution for easily incorporating accessible unlabeled data to improve the joint S&T model.
Introduction
This study focuses on using a graph-based label propagation method to build a semi-supervised joint S&T model.
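For context, graph-based label propagation (Zhu and Ghahramani, 2002) admits a very small sketch; the names below are illustrative, and the graph construction (what the nodes and similarities are) is exactly the part the paper's joint S&T model works out:

    import numpy as np

    def label_propagation(W, y_seed, n_iter=50):
        """Iterative label propagation over a similarity graph (a sketch).
        W: (n x n) nonnegative similarity matrix; y_seed: (n x K) one-hot rows
        for seeded (labeled) nodes, zero rows elsewhere."""
        P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
        seeded = y_seed.sum(axis=1) > 0
        F = y_seed.astype(float).copy()
        for _ in range(n_iter):
            F = P @ F                    # each node absorbs its neighbors' labels
            F[seeded] = y_seed[seeded]   # clamp the labeled seeds
        return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)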
Introduction
labeled and unlabeled data to achieve semi-supervised learning.
Related Work
There are few explorations of semi-supervised approaches for CWS or POS tagging in previous works.
Related Work
Xu et al. (2008) described a Bayesian semi-supervised CWS model by treating the segmentation as a hidden variable in machine translation.
Related Work
Wang et al. (2011) proposed a semi-supervised pipeline S&T model by incorporating n-gram and lexicon features derived from unlabeled data.
semi-supervised is mentioned in 31 sentences in this paper.
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Abstract
This paper presents a semi-supervised Chinese word segmentation (CWS) approach that co-regularizes character-based and word-based models.
Abstract
The evaluation on the Chinese Treebank reveals that our model yields better results than the state-of-the-art semi-supervised models reported in the literature.
Experiment
The line of “ours” reports the performance of our semi-supervised model with the tuned parameters.
Experiment
It can be observed that our semi-supervised model is able to benefit from unlabeled data and greatly improves the results over the supervised baseline.
Experiment
We also compare our model with two state-of-the-art semi-supervised methods of Wang ’11 (Wang et al., 2011) and Sun ’11 (Sun and Xu, 2011).
Introduction
This naturally provides motivation for using easily accessible raw texts to enhance supervised CWS models, in semi-supervised approaches.
Introduction
In the past years, however, few semi-supervised CWS models have been proposed.
Introduction
Xu et al. (2008) described a Bayesian semi-supervised model by treating the segmentation as a hidden variable in machine translation.
Semi-supervised Learning via Co-regularizing Both Models
As mentioned earlier, the primary challenge of semi-supervised CWS centers on how to exploit the unlabeled data.
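The co-regularization idea, two models trained to agree on the unlabeled data, can be approximated by a classic co-training loop (Blum and Mitchell, 1998). The sketch below is a simplified stand-in for the paper's objective, with the two feature "views" playing the role of the character-based and word-based models; all names are illustrative:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def co_train(X1, X2, y, U1, U2, rounds=5, k=10):
        """Co-training sketch: X1/X2 are two views of the labeled data,
        U1/U2 the same unlabeled pool in each view."""
        for _ in range(rounds):
            m1 = LogisticRegression(max_iter=1000).fit(X1, y)
            m2 = LogisticRegression(max_iter=1000).fit(X2, y)
            if len(U1) == 0:
                break
            # each round, promote the unlabeled examples some view is surest about
            p1, p2 = m1.predict_proba(U1), m2.predict_proba(U2)
            conf = np.maximum(p1.max(1), p2.max(1))
            pick = np.argsort(-conf)[:k]
            yhat = np.where(p1.max(1) >= p2.max(1), p1.argmax(1), p2.argmax(1))[pick]
            X1, X2 = np.vstack([X1, U1[pick]]), np.vstack([X2, U2[pick]])
            y = np.concatenate([y, yhat])
            keep = np.setdiff1d(np.arange(len(U1)), pick)
            U1, U2 = U1[keep], U2[keep]
        return m1, m2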
semi-supervised is mentioned in 13 sentences in this paper.
Zhu, Muhua and Zhang, Yue and Chen, Wenliang and Zhang, Min and Zhu, Jingbo
Experiments
Table 6: Experimental results on the English and Chinese development sets with different types of semi-supervised features added incrementally to the extended parser.
Experiments
Based on the extended parser, we experimented with different types of semi-supervised features, adding them incrementally.
Experiments
By comparing the results in Table 5 and the results in Table 6 we can see that the semi-supervised features achieve an overall improvement of 1.0% on the English data and an improvement on the Chinese data.
Introduction
In addition to the above contributions, we apply a variety of semi-supervised learning techniques to our transition-based parser.
Introduction
Experimental results show that semi-supervised methods give a further improvement of 0.9% in F-score on the English data and 2.4% on the Chinese data.
Semi-supervised Parsing with Large Data
4.4 Semi-supervised Features
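One widely used instantiation of "semi-supervised features" for parsing is word-cluster features induced from raw text (e.g., the Brown-cluster features of Koo et al., 2008). The toy sketch below substitutes a crude co-occurrence clustering for Brown clustering and uses illustrative names only; the parser would then fire features on cluster ids just as it fires them on word forms:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import CountVectorizer

    def cluster_feature_map(raw_sentences, n_clusters=32):
        """Toy stand-in for semi-supervised cluster features: cluster words by
        their occurrence profile over a raw corpus, return word -> cluster id."""
        vec = CountVectorizer()
        X = vec.fit_transform(raw_sentences)        # sentence x word counts
        word_vectors = np.asarray(X.T.todense(), dtype=float)
        ids = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(word_vectors)
        return {w: int(ids[i]) for w, i in vec.vocabulary_.items()}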
semi-supervised is mentioned in 27 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Abstract
To deal with these issues, we describe an efficient semi-supervised learning (SSL) approach which has two components: (i) Markov Topic Regression is a new probabilistic model to cluster words into semantic tags (concepts).
Abstract
Our new SSL approach improves semantic tagging performance by 3% absolute over the baseline models, and also compares favorably on semi-supervised syntactic tagging.
Introduction
To deal with these issues, we present a new semi-supervised learning (SSL) approach, which mainly has two components.
Related Work and Motivation
(I) Semi-Supervised Tagging.
Related Work and Motivation
Wang et al. (2009), Li et al. (2009), Li (2010), and Liu et al. (2011) investigate web query tagging using semi-supervised sequence models.
Semi-Supervised Semantic Labeling
4.2 Retrospective Semi-Supervised CRF
Semi-Supervised Semantic Labeling
Algorithm 2 (Retrospective Semi-Supervised CRF). Input: labeled data L^l and unlabeled data L^u.
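The generic skeleton behind self-training algorithms of this kind can be sketched without the paper's exact procedure; train and tag below are placeholder callables (e.g., wrappers around a CRF toolkit), and nothing here is specific to the retrospective variant:

    def self_train(train, tag, X_l, y_l, X_u, rounds=3):
        """Generic self-training sketch: fit on labeled data, label the pool,
        retrain on the union, and re-label the pool each round.
        train(X, y) -> model; tag(model, x) -> predicted label sequence."""
        model = train(X_l, y_l)
        for _ in range(rounds):
            y_u = [tag(model, x) for x in X_u]    # current model labels the pool
            model = train(X_l + X_u, y_l + y_u)   # retrain on labeled + self-labeled
        return model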
semi-supervised is mentioned in 7 sentences in this paper.
Garrette, Dan and Mielens, Jason and Baldridge, Jason
Abstract
While a variety of semi-supervised methods exist for training from incomplete data, there are open questions regarding what types of training data should be used and how much is necessary.
Abstract
Our results show that annotation of word types is the most important, provided a sufficiently capable semi-supervised learning infrastructure is in place to project type information onto a raw corpus.
Conclusions and Future Work
Most importantly, it is clear that type annotations are the most useful input one can obtain from a linguist—provided a semi-supervised algorithm for projecting that information reliably onto raw tokens is available.
Data
While we do not explore a rule-writing approach to POS-tagging, we do consider the impact of rule-based morphological analyzers as a component in our semi-supervised POS-tagging system.
Experiments
In addition to annotations, semi-supervised tagger training requires a corpus of raw text.
Introduction
The overwhelming takeaway from our results is that type supervision—when backed by an effective semi-supervised learning approach—is the most important source of linguistic information.
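A minimal picture of "projecting type information onto a raw corpus": a type-level tag dictionary restricts each raw token's candidate labels, and a sequence model is then trained under those constraints. The sketch below (all names hypothetical) shows only the projection step, not the full pipeline the paper describes:

    def project_type_annotations(raw_tokens, tag_dict, full_tagset):
        """Restrict each token's label space to its tag-dictionary entry;
        unknown words fall back to the full tagset."""
        return [sorted(tag_dict.get(tok.lower(), full_tagset)) for tok in raw_tokens]

    tag_dict = {"the": {"DET"}, "dog": {"NOUN"}, "barks": {"VERB", "NOUN"}}
    print(project_type_annotations(["The", "dog", "barks"],
                                   tag_dict, {"DET", "NOUN", "VERB"}))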
semi-supervised is mentioned in 6 sentences in this paper.
Scheible, Christian and Schütze, Hinrich
Abstract
Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer.
Conclusion
Since a large labeled sentiment relevance resource does not yet exist, we investigated semi-supervised approaches to S-relevance classification that do not require S-relevance-labeled data.
Distant Supervision
Since a large labeled resource for sentiment relevance classification is not yet available, we investigate semi-supervised methods for creating sentiment relevance classifiers.
Introduction
For this reason, we investigate two semi-supervised approaches to S-relevance classification that do not require S-relevance-labeled data.
Transfer Learning
To address the problem that we do not have enough labeled SR data, we now investigate a second semi-supervised method for SR classification, transfer learning (TL).
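The distant-supervision component described above follows a generic pattern: use structured domain information to assign noisy labels to unlabeled sentences, then train an ordinary classifier on them. In the sketch below the cue lists are hypothetical stand-ins for the paper's structured information, so this illustrates the pattern rather than the authors' system:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def distant_supervision(unlabeled_sents, relevant_cues, irrelevant_cues):
        """Noisily label sentences by cue matches, then train a standard
        classifier on the resulting (noisy) training set."""
        texts, labels = [], []
        for s in unlabeled_sents:
            low = s.lower()
            rel = any(c in low for c in relevant_cues)
            irr = any(c in low for c in irrelevant_cues)
            if rel != irr:                 # keep only unambiguous cue matches
                texts.append(s)
                labels.append(int(rel))
        clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        return clf.fit(texts, labels)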
semi-supervised is mentioned in 5 sentences in this paper.
Jiang, Wenbin and Sun, Meng and Lü, Yajuan and Yang, Yating and Liu, Qun
Experiments
resorting to complicated features, system combination, and other semi-supervised techniques.
Related Work
Lots of efforts have been devoted to semi-supervised methods in sequence labeling and word segmentation (Xu et al., 2008; Suzuki and Isozaki, 2008; Haffari and Sarkar, 2008; Tomanek and Hahn, 2009; Wang et al., 2011).
Related Work
A semi-supervised method tries to find an optimal hyperplane over both annotated and raw data, resulting in a model with better coverage and higher accuracy.
Related Work
It is fundamentally different from semi-supervised and unsupervised methods in that we aim to exploit a different kind of knowledge: the natural annotations implied by the structural information in web text.
semi-supervised is mentioned in 4 sentences in this paper.
Lan, Man and Xu, Yu and Niu, Zhengyu
Multitask Learning for Discourse Relation Prediction
ASO has been shown to be useful in a semi-supervised learning configuration for several NLP applications, such as text chunking (Ando and Zhang, 2005b) and text classification (Ando and Zhang, 2005a).
Related Work
2.1.3 Semi-supervised approaches
Related Work
Hernault et al. (2010) presented a semi-supervised method based on the analysis of co-occurring features in labeled and unlabeled data.
Related Work
Very recently, Hernault et al. (2011) introduced a semi-supervised approach using a structure learning method for discourse relation classification, which is quite relevant to our work.
semi-supervised is mentioned in 4 sentences in this paper.
Zhang, Longkai and Li, Li and He, Zhengyan and Wang, Houfeng and Sun, Ni
Experiment
Another baseline is Li and Sun (2009), who also use punctuation in their semi-supervised framework.
INTRODUCTION
We build a semi-supervised learning (SSL) framework which can iteratively incorporate newly labeled instances from unlabeled micro-blog data during the training process.
Related Work
Meanwhile, semi-supervised methods have been applied to NLP applications.
Related Work
Similar semi-supervised applications include Shen et al.
semi-supervised is mentioned in 4 sentences in this paper.
Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru
Introduction
Quite a lot of learning techniques, e.g., semi-supervised learning, self-training, and active learning, have been proposed.
Introduction
Blum et al. (2001) proposed a semi-supervised learning approach called the Graph Mincut algorithm, which uses a small number of positive and negative examples and assigns values to unlabeled examples in a way that optimizes consistency in a nearest-neighbor sense.
Introduction
Like much previous work on semi-supervised ML, we apply SVM to the positive and unlabeled data, and add the classification results to the training data.
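That sentence is essentially the classic positive/unlabeled (PU) self-training recipe and can be sketched directly; LinearSVC and the promotion rule below are illustrative choices, not necessarily the paper's:

    import numpy as np
    from sklearn.svm import LinearSVC

    def pu_svm(X_pos, X_unl, rounds=3):
        """PU sketch: treat the unlabeled pool as provisional negatives, train an
        SVM, promote predicted positives into the positive set, and retrain."""
        pos, unl = X_pos.copy(), X_unl.copy()
        for _ in range(rounds):
            X = np.vstack([pos, unl])
            y = np.concatenate([np.ones(len(pos)), np.zeros(len(unl))])
            clf = LinearSVC().fit(X, y)
            pred = clf.predict(unl)
            if not pred.any() or pred.all():
                break
            pos = np.vstack([pos, unl[pred == 1]])   # add classified positives
            unl = unl[pred == 0]
        return clf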
semi-supervised is mentioned in 3 sentences in this paper.
Labutov, Igor and Lipson, Hod
Introduction
Traditionally, we would learn the embeddings for the target task jointly with whatever unlabeled data we may have, in an instance of semi-supervised learning, and/or we may leverage labels from multiple other related tasks in a multitask approach.
Related Work
Embeddings are learned in a semi-supervised fashion, and the components of the embedding are given an explicit probabilistic interpretation.
Related Work
In the machine learning literature, joint semi-supervised embedding takes form in methods such as the Laplacian SVM (LapSVM) (Belkin et al., 2006) and Label Propagation (Zhu and Ghahramani, 2002), to which our approach is related.
semi-supervised is mentioned in 3 sentences in this paper.
Mukherjee, Arjun and Liu, Bing
Conclusion
A novel technique was also proposed to rank n-gram phrases, where relevance-based ranking was used in conjunction with a semi-supervised generative model.
Introduction
We employ a semi-supervised generative model called JTE-P to jointly model AD-expressions, pair interactions, and discussion topics simultaneously in a single framework.
Model
JTE-P is a semi-supervised generative model motivated by the joint occurrence of expression types (agreement and disagreement), topics in discussion posts, and user pairwise interactions.
semi-supervised is mentioned in 3 sentences in this paper.