Index of papers in Proc. ACL that mention
  • semi-supervised
Huang, Hongzhao and Cao, Yunbo and Huang, Xiaojiang and Ji, Heng and Lin, Chin-Yew
Abstract
To tackle these challenges, we propose a novel semi-supervised graph regularization model to incorporate both local and global evidence from multiple tweets through three fine-grained relations.
Introduction
Therefore, it is challenging to create sufficient high quality labeled tweets for supervised models and worth considering semi-supervised learning with the exploration of unlabeled data.
Introduction
However, when selecting semi-supervised learning frameworks, we noticed another unique challenge that tweets pose to wikification due to their informal writing style, shortness and noisiness.
Introduction
Therefore, a collective inference model over multiple tweets in the semi-supervised setting is desirable.
Principles and Approach Overview
The label assignment is obtained by our semi-supervised graph regularization framework based on a relational graph, which is constructed from local compatibility, coreference, and semantic relatedness relations.
Relational Graph Construction
They control the contributions of these three relations in our semi-supervised graph regularization model.
Relational Graph Construction
(ii) It is more appropriate for our graph-based semi-supervised model since it is difficult to assign labels to a pair of mention and concept in the referent graph.
Semi-supervised Graph Regularization
We propose a novel semi-supervised graph regularization framework based on the graph-based semi-supervised learning algorithm (Zhu et al., 2003):
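The framework above builds on the harmonic-function / label-propagation algorithm of Zhu et al. (2003). Below is a minimal sketch of that base algorithm only, assuming a generic affinity matrix W; the paper's relation-specific edges (local compatibility, coreference, semantic relatedness) and its regularization terms are not reproduced.

```python
import numpy as np

def label_propagation(W, y_l, labeled_idx, n_classes, n_iter=100):
    """W: (n, n) symmetric affinity matrix; y_l: labels of the nodes in labeled_idx."""
    P = W / W.sum(axis=1, keepdims=True)      # row-normalize into a transition matrix
    F = np.zeros((W.shape[0], n_classes))
    F[labeled_idx, y_l] = 1.0                 # clamp labeled nodes to their labels
    Y_l = F[labeled_idx].copy()
    for _ in range(n_iter):
        F = P @ F                             # propagate label mass along graph edges
        F[labeled_idx] = Y_l                  # re-clamp the labeled nodes each step
    return F.argmax(axis=1)                   # hard label assignment per node
```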
semi-supervised is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Titov, Ivan and Kozhevnikov, Mikhail
Empirical Evaluation
In this section, we consider the semi-supervised setup, and present an evaluation of our approach on the problem of aligning weather forecast reports to the formal representation of weather.
Empirical Evaluation
Only then, in the semi-supervised learning scenarios, we added unlabeled data and ran 5 additional iterations of EM.
Empirical Evaluation
We compare our approach (Semi-superv, non-contr) with two baselines: the basic supervised training on 100 labeled forecasts (Supervised BL) and with the semi-supervised training which disregards the non-contradiction relations (Semi-superv BL).
Inference with NonContradictory Documents
However, in a semi-supervised or unsupervised case, variational techniques, such as the EM algorithm (Dempster et al., 1977), are often used to estimate the model.
Introduction
Such annotated resources are scarce and expensive to create, motivating the need for unsupervised or semi-supervised techniques (Poon and Domingos, 2009).
Introduction
This compares favorably with 69.1% shown by a semi-supervised learning approach, though, as expected, does not reach the score of the model which, in training, observed semantics states for all the 750 documents (77.7% F1).
Summary and Future Work
Our approach resulted in an improvement over the scores of both the supervised baseline and of the traditional semi-supervised learning.
semi-supervised is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Li, Zhenghua and Zhang, Min and Chen, Wenliang
Abstract
This paper proposes a simple yet effective framework for semi-supervised dependency parsing at entire tree level, referred to as ambiguity-aware ensemble training.
Abstract
Experimental results on benchmark data show that our method significantly outperforms the baseline supervised parser and other entire-tree based semi-supervised methods, such as self-training, co-training and tri-training.
Ambiguity-aware Ensemble Training
In standard entire-tree based semi-supervised methods such as self/co/tri-training, automatically parsed unlabeled sentences are used as additional training data, and noisy 1-best parse trees are considered as gold-standard.
Ambiguity-aware Ensemble Training
We apply L2-norm regularized SGD training to iteratively learn feature weights w for our CRF-based baseline and semi-supervised parsers.
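For illustration, a minimal sketch of an L2-norm regularized SGD update of the kind referred to above; grad_fn is a hypothetical stand-in for the CRF gradient computation (empirical minus expected feature counts), which is not shown.

```python
import numpy as np

def sgd_l2(w, batches, grad_fn, lr=0.1, l2=1e-4, epochs=10):
    """grad_fn(w, batch) must return the gradient of the negative log-likelihood."""
    for _ in range(epochs):
        for batch in batches:
            g = grad_fn(w, batch)
            w = w - lr * (g + l2 * w)   # L2 penalty shrinks every weight toward zero
    return w
```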
Experiments and Analysis
For the semi-supervised parsers trained with Algorithm 1, we use N1 = 20K and M1 = 50K for English, and N1 = 15K and M1 = 50K for Chinese, based on a few preliminary experiments.
Experiments and Analysis
For semi-supervised cases, one iteration takes about 2 hours on an IBM server having 2.0 GHz Intel Xeon CPUs and 72G memory.
Introduction
In contrast, semi-supervised approaches, which can make use of large-scale unlabeled data, have attracted more and more interest.
Introduction
To solve above issues, this paper proposes a more general and effective framework for semi-supervised dependency parsing, referred to as ambiguity-aware ensemble training.
Introduction
We propose a generalized ambiguity-aware ensemble training framework for semi-supervised dependency parsing, which can
semi-supervised is mentioned in 23 sentences in this paper.
Topics mentioned in this paper:
Li, Shoushan and Huang, Chu-Ren and Zhou, Guodong and Lee, Sophia Yat Mei
Abstract
In this paper, we adopt two views, personal and impersonal views, and systematically employ them in both supervised and semi-supervised sentiment classification.
Abstract
On this basis, an ensemble method and a co-training algorithm are explored to employ the two views in supervised and semi-supervised sentiment classification respectively.
Introduction
Since the unlabeled data is ample and easy to collect, a successful semi-supervised sentiment classification system would significantly minimize the involvement of labor and time.
Introduction
Therefore, given the two different views mentioned above, one promising application is to adopt them in co-training algorithms, which has been proven to be an effective semi-supervised learning strategy of incorporating unlabeled data to further improve the classification performance (Zhu, 2005).
Introduction
In this paper, we systematically employ personal/impersonal views in supervised and semi-supervised sentiment classification.
Related Work
Generally, document-level sentiment classification methods can be categorized into three types: unsupervised, supervised, and semi-supervised.
Related Work
Semi-supervised methods combine unlabeled data with labeled training data (often small-scaled) to improve the models.
Related Work
Compared to the supervised and unsupervised methods, semi-supervised methods for sentiment classification are relatively new and have much less related studies.
semi-supervised is mentioned in 30 sentences in this paper.
Topics mentioned in this paper:
Titov, Ivan
Abstract
We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain.
Constraints on Inter-Domain Variability
As we discussed in the introduction, our goal is to provide a method for domain adaptation based on semi-supervised learning of models with distributed representations.
Constraints on Inter-Domain Variability
In this section, we first discuss the shortcomings of domain adaptation with the above-described semi-supervised approach and motivate constraints on inter-domain variability of
Empirical Evaluation
For every pair, the semi-supervised methods use labeled data from the source domain and unlabeled data from both domains.
Empirical Evaluation
All the methods, supervised and semi-supervised, are based on the model described in Section 2.
Empirical Evaluation
This does not seem to have an adverse effect on the accuracy but makes learning very efficient: the average training time for the semi-supervised methods was about 20 minutes on a standard PC.
Introduction
The danger of this semi-supervised approach in the domain-adaptation setting is that some of the latent variables will correspond to clusters of features specific only to the source domain, and consequently, the classifier relying on this latent variable will be badly affected when tested on the target domain.
Related Work
Various semi-supervised techniques for domain-adaptation have also been considered, one example being self-training (McClosky et al., 2006).
Related Work
Semi-supervised learning with distributed representations and its application to domain adaptation has previously been considered in (Huang and Yates, 2009), but no attempt has been made to address problems specific to the domain-adaptation setting.
semi-supervised is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Abstract
Using the unsupervised pre-trained deep belief net (DBN) to initialize DAE’s parameters and using the input original phrase features as a teacher for semi-supervised fine-tuning, we learn new semi-supervised DAE features, which are more effective and stable than the unsupervised DBN features.
Abstract
On two Chinese-English tasks, our semi-supervised DAE features obtain statistically significant improvements of 1.34/2.45 (IWSLT) and 0.82/1.52 (NIST) BLEU points over the unsupervised DBN features and the baseline features, respectively.
Introduction
al., 2010), and speech spectrograms (Deng et al., 2010), we propose new feature learning using semi-supervised DAE for phrase-based translation model.
Introduction
By using the input data as the teacher, the “semi-supervised” fine-tuning process of DAE addresses the problem of “back-propagation without a teacher” (Rumelhart et al., 1986), which makes the DAE learn more powerful and abstract features (Hinton and Salakhutdinov, 2006).
Introduction
For our semi-supervised DAE feature learning task, we use the unsupervised pre-trained DBN to initialize DAE’s parameters and use the input original phrase features as the “teacher” for semi-supervised back-propagation.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
Each translation rule in the phrase-based translation model has a set number of features that are combined in the log-linear model (Och and Ney, 2002), and our semi-supervised DAE features can also be combined in this model.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
Figure 2: After the unsupervised pre-training, the DBNs are “unrolled” to create a semi-supervised DAE, which is then fine-tuned using back-propagation of error derivatives.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
To learn a semi-supervised DAE, we first “unroll” the above n-layer DBN by using its weight matrices to create a deep, 2n-1 layer network whose lower layers use the matrices to “encode” the input and whose upper layers use the matrices in reverse order to “decode” the input (Hinton and Salakhutdinov, 2006; Salakhutdinov and Hinton, 2009; Deng et al., 2010), as shown in Figure 2.
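A rough numpy sketch of this unrolling step, assuming the pre-trained DBN is given simply as a list of weight matrices; biases and the subsequent back-propagation fine-tuning are omitted, and the function names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def unrolled_dae(x, dbn_weights):
    """dbn_weights: pre-trained DBN weight matrices, bottom layer first."""
    h = x
    for W in dbn_weights:            # lower half: encode with the DBN matrices
        h = sigmoid(h @ W)
    code = h                         # the learned feature representation
    for W in reversed(dbn_weights):  # upper half: decode with the same matrices, transposed
        h = sigmoid(h @ W.T)
    return code, h                   # h approximates a reconstruction of x
```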
semi-supervised is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Abstract
This paper presents a semi-supervised Chinese word segmentation (CWS) approach that co-regularizes character-based and word-based models.
Abstract
The evaluation on the Chinese tree bank reveals that our model results in better gains over the state-of-the-art semi-supervised models reported in the literature.
Experiment
The line of “ours” reports the performance of our semi-supervised model with the tuned parameters.
Experiment
It can be observed that our semi-supervised model is able to benefit from unlabeled data and greatly improves the results over the supervised baseline.
Experiment
We also compare our model with two state-of-the-art semi-supervised methods of Wang ’11 (Wang et al., 2011) and Sun ’11 (Sun and Xu, 2011).
Introduction
This naturally provides motivation for using easily accessible raw texts to enhance supervised CWS models, in semi-supervised approaches.
Introduction
In the past years, however, few semi-supervised CWS models have been proposed.
Introduction
(2008) described a Bayesian semi-supervised model by considering the segmentation as the hidden variable in machine translation.
Semi-supervised Learning via Co-regularizing Both Models
As mentioned earlier, the primary challenge of semi-supervised CWS concentrates on the unlabeled data.
semi-supervised is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Abstract
This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
Abstract
Empirical results on Chinese tree bank (CTB-7) and Microsoft Research corpora (MSR) reveal that the proposed model can yield better results than the supervised baselines and other competitive semi-supervised CRFs in this task.
Introduction
Therefore, semi-supervised joint S&T appears to be a natural solution for easily incorporating accessible unlabeled data to improve the joint S&T model.
Introduction
This study focuses on using a graph-based label propagation method to build a semi-supervised joint S&T model.
Introduction
labeled and unlabeled data to achieve the semi-supervised learning.
Related Work
There are few explorations of semi-supervised approaches for CWS or POS tagging in previous works.
Related Work
(2008) described a Bayesian semi-supervised CWS model by considering the segmentation as the hidden variable in machine translation.
Related Work
(2011) proposed a semi-supervised pipeline S&T model by incorporating n-gram and lexicon features derived from unlabeled data.
semi-supervised is mentioned in 31 sentences in this paper.
Topics mentioned in this paper:
Tomanek, Katrin and Hahn, Udo
Abstract
We propose a semi-supervised AL approach for sequence labeling where only highly uncertain subsequences are presented to human annotators, while all others in the selected sequences are automatically labeled.
Active Learning for Sequence Labeling
This section, first, describes a common approach to AL for sequential data, and then presents our approach to semi-supervised AL.
Active Learning for Sequence Labeling
3.2 Semi-Supervised Active Learning
Active Learning for Sequence Labeling
We call this semi-supervised Active Learning (SeSAL) for sequence labeling.
Introduction
Accordingly, our approach is a combination of AL and self-training to which we will refer as semi-supervised Active Learning (SeSAL) for sequence labeling.
Introduction
After a brief overview of the formal underpinnings of Conditional Random Fields, our base classifier for sequence labeling tasks (Section 2), a fully supervised approach to AL for sequence labeling is introduced and complemented by our semi-supervised approach in Section 3.
Introduction
Our experiments are laid out in Section 5 where we compare fully and semi-supervised AL for NER on two corpora, the newspaper selection of MUC7 and PENNBIOIE, a biological abstracts corpus.
Related Work
Self-training (Yarowsky, 1995) is a form of semi-supervised learning.
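For reference, a generic self-training loop of this kind, sketched with a scikit-learn-style classifier; the confidence threshold and stopping rule are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def self_train(clf, X_l, y_l, X_u, threshold=0.9, max_rounds=10):
    """clf: any classifier exposing fit / predict_proba (scikit-learn convention)."""
    for _ in range(max_rounds):
        clf.fit(X_l, y_l)
        if len(X_u) == 0:
            break
        proba = clf.predict_proba(X_u)
        confident = proba.max(axis=1) >= threshold   # keep only confident predictions
        if not confident.any():
            break
        X_l = np.vstack([X_l, X_u[confident]])       # treat them as extra training data
        y_l = np.concatenate([y_l, proba[confident].argmax(axis=1)])
        X_u = X_u[~confident]
    return clf
```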
Related Work
A combination of active and semi-supervised learning has first been proposed by McCallum and Nigam (1998) for text classification.
Related Work
Similarly, co-testing (Muslea et al., 2002), a multi-view AL algorithm, selects examples for the multi-view, semi-supervised Co-EM algorithm.
semi-supervised is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Turian, Joseph and Ratinov, Lev-Arie and Bengio, Yoshua
Introduction
By using unlabelled data to reduce data sparsity in the labeled training data, semi-supervised approaches improve generalization accuracy.
Introduction
Semi-supervised models such as Ando and Zhang (2005), Suzuki and Isozaki (2008), and Suzuki et al.
Introduction
It can be tricky and time-consuming to adapt an existing supervised NLP system to use these semi-supervised techniques.
Supervised evaluation tasks
This technique for turning a supervised approach into a semi-supervised one is general and task-agnostic.
Supervised evaluation tasks
We apply clustering and distributed representations to NER and chunking, which allows us to compare our semi-supervised models to those of Ando and Zhang (2005) and Suzuki and Isozaki (2008).
Unlabeled Data
Ando and Zhang (2005) present a semi-supervised learning algorithm called alternating structure optimization (ASO).
Unlabeled Data
Suzuki and Isozaki (2008) present a semi-supervised extension of CRFs.
Unlabeled Data
(2009), they extend their semi-supervised approach to more general conditional models.)
semi-supervised is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Wang, Zhiguo and Xue, Nianwen
Abstract
Third, to enhance the power of parsing models, we enlarge the feature set with nonlocal features and semi-supervised word cluster features.
Experiment
In this subsection, we examined the usefulness of the new nonlocal features and the semi-supervised word cluster features described in Subsection 3.3.
Experiment
We built three new parsing systems based on the StateAlign system: Nonlocal system extends the feature set of StateAlign system with nonlocal features, Cluster system extends the feature set with semi-supervised word cluster features, and Nonlocal & Cluster system extends the feature set with both groups of features.
Experiment
Compared with the StateAlign system which takes only the baseline features, the nonlocal features improved parsing F1 by 0.8%, while the semi-supervised word cluster features result in an improvement of 2.3% in parsing F1 and a 1.1% improvement on POS tagging accuracy.
Introduction
Third, we take into account two groups of complex structural features that have not been previously used in transition-based parsing: nonlocal features (Charniak and Johnson, 2005) and semi-supervised word cluster features (Koo et al., 2008).
Introduction
After integrating semi-supervised word cluster features, the parsing accuracy is further improved to 86.3% when trained on CTB 5.1 and 87.1% when trained on CTB 6.0, and this is the best reported performance for Chinese.
Joint POS Tagging and Parsing with Nonlocal Features
To further improve the performance of our transition-based constituent parser, we consider two groups of complex structural features: nonlocal features (Charniak and Johnson, 2005; Collins and Koo, 2005) and semi-supervised word cluster features (Koo et al., 2008).
Joint POS Tagging and Parsing with Nonlocal Features
Semi-supervised word cluster features have been successfully applied to many NLP tasks (Miller et al., 2004; Koo et al., 2008; Zhu et al., 2013).
Joint POS Tagging and Parsing with Nonlocal Features
Using these two types of clusters, we construct semi-supervised word cluster features by mimicking the template structure of the original baseline features in Table 1.
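As an illustration of such cluster-feature templates, a small hypothetical sketch in which each lexical template is mirrored by cluster-prefix templates; the function name, the brown_clusters mapping, and the prefix lengths are assumptions for illustration, not the paper's exact configuration.

```python
def cluster_features(word, position, brown_clusters, prefix_lengths=(4, 6)):
    """brown_clusters: dict mapping a word to its cluster bit-string, e.g. '0110101110'."""
    feats = [f"w[{position}]={word}"]                     # original lexical template
    bits = brown_clusters.get(word)
    if bits is not None:
        for k in prefix_lengths:                          # mirror the template at cluster level
            feats.append(f"c{k}[{position}]={bits[:k]}")
    return feats

# cluster_features("bank", 0, {"bank": "0110101110"})
# -> ['w[0]=bank', 'c4[0]=0110', 'c6[0]=011010']
```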
semi-supervised is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Yang, Bishan and Cardie, Claire
Abstract
The context-aware constraints provide additional power to the CRF model and can guide semi-supervised learning when labeled data is limited.
Abstract
Experiments on standard product review datasets show that our method outperforms the state-of-the-art methods in both the supervised and semi-supervised settings.
Approach
Global Sentiment Previous studies have demonstrated the value of document-level sentiment in guiding the semi-supervised learning of sentence-level sentiment (Tackstrom and McDonald, 2011b; Qu et al., 2012).
Experiments
We evaluated our method in two settings: supervised and semi-supervised.
Experiments
In the semi-supervised setting, our unlabeled data consists of
Experiments
Table 3: Accuracy results (%) for semi-supervised sentiment classification (three-way) on the MD dataset
Introduction
Semi-supervised techniques have been proposed for sentence-level sentiment classification (Tackstrom and McDonald, 2011a; Qu et al., 2012).
Introduction
Experimental results show that our model outperforms state-of-the-art methods in both the supervised and semi-supervised settings.
Related Work
Our approach is also semi-supervised.
Related Work
Compared to the existing work on semi-supervised learning for sentence-level sentiment classification (Tackstrom and McDonald, 2011a; Tackstrom and McDonald, 2011b; Qu et al., 2012), our work does not rely on a large amount of coarse-grained (document-level) labeled data; instead, distant supervision mainly comes from linguistically-motivated constraints.
semi-supervised is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Zhang, Jiajun and Liu, Shujie and Li, Mu and Zhou, Ming and Zong, Chengqing
Bilingually-constrained Recursive Auto-encoders
And the semi-supervised phrase embedding (Socher et al., 2011; Socher et al., 2013a; Li et al., 2013) further indicates that phrase embedding can be tuned with respect to the label.
Bilingually-constrained Recursive Auto-encoders
We will first briefly present the unsupervised phrase embedding, and then describe the semi-supervised framework.
Bilingually-constrained Recursive Auto-encoders
3.2 Semi-supervised Phrase Embedding
Related Work
This kind of semi-supervised phrase embedding is in fact performing phrase clustering with respect to the phrase label.
Related Work
Obviously, this kind methods of semi-supervised phrase embedding do not fully address the semantic meaning of the phrases.
semi-supervised is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Krishnamurthy, Jayant and Mitchell, Tom
Abstract
Synonym detection exploits redundant information to train several domain-specific synonym classifiers in a semi-supervised fashion.
Background: Never-Ending Language Learner
NELL is an information extraction system that has been running 24x7 for over a year, using coupled semi-supervised learning to populate an ontology from unstructured text found on the web.
ConceptResolver
After mapping each noun phrase to one or more senses (each with a distinct category type), ConceptResolver performs semi-supervised clustering to find synonymous senses.
ConceptResolver
For each category, ConceptResolver trains a semi-supervised synonym classifier then uses its predictions to cluster word senses.
Introduction
Train a semi-supervised classifier to predict synonymy.
Introduction
used to train a semi-supervised classifier.
Prior Work
However, our evaluation shows that ConceptResolver has higher synonym resolution precision than Resolver, which we attribute to our semi-supervised approach and the known relation schema.
Prior Work
ConceptResolver’s approach lies between these two extremes: we label a small number of synonyms (10 pairs), then use semi-supervised training to learn a similarity function.
Prior Work
ConceptResolver uses a novel algorithm for semi-supervised clustering which is conceptually similar to other work in the area.
semi-supervised is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
LIU, Xiaohua and ZHANG, Shaodian and WEI, Furu and ZHOU, Ming
Abstract
We propose to combine a K-Nearest Neighbors (KNN) classifier with a linear Conditional Random Fields (CRF) model under a semi-supervised learning framework to tackle these challenges.
Abstract
The semi-supervised learning plus the gazetteers alleviate the lack of training data.
Abstract
Extensive experiments show the advantages of our method over the baselines as well as the effectiveness of KNN and semi-supervised learning.
Introduction
(2010), which introduces a high-level rule language, called NERL, to build the general and domain specific NER systems; and 2) semi-supervised learning, which aims to use the abundant unlabeled data to compensate for the lack of annotated data.
Introduction
Indeed, it is the combination of KNN and CRF under a semi-supervised learning framework that differentiates ours from the existing.
Introduction
It is also demonstrated that integrating KNN classified results into the CRF model and semi-supervised learning considerably boost the performance.
Related Work
Related work can be roughly divided into three categories: NER on tweets, NER on non-tweets (e.g., news, biological medicine, and clinical notes), and semi-supervised learning for NER.
Related Work
To achieve this, a KNN classifier with a CRF model is combined to leverage cross tweets information, and the semi-supervised learning is adopted to leverage unlabeled tweets.
Related Work
2.3 Semi-supervised Learning for NER
semi-supervised is mentioned in 24 sentences in this paper.
Topics mentioned in this paper:
Lucas, Michael and Downey, Doug
Abstract
Semi-supervised learning (SSL) methods augment standard machine learning (ML) techniques to leverage unlabeled data.
Abstract
In this paper, we show that improving marginal word frequency estimates using unlabeled data can enable semi-supervised text classification that scales to massive unlabeled data sets.
Introduction
Semi-supervised Learning (SSL) is a Machine Learning (ML) approach that utilizes large amounts of unlabeled data, combined with a smaller amount of labeled data, to learn a target function (Zhu, 2006; Chapelle et al., 2006).
Introduction
Typically, for each target concept to be learned, a semi-supervised classifier is trained using iterative techniques that execute multiple passes over the unlabeled data (e.g., Expectation-Maximization (Nigam et al., 2000) or Label Propagation (Zhu and Ghahramani, 2002)).
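A hedged sketch of such an iterative EM-style scheme, simplified to hard assignments over a Multinomial Naive Bayes model; the cited methods use soft posteriors and additional weighting, so this is an approximation, not the referenced algorithm.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def em_naive_bayes(X_l, y_l, X_u, n_iter=5):
    """Dense count matrices assumed; use scipy.sparse.vstack for sparse features."""
    clf = MultinomialNB().fit(X_l, y_l)          # initialize from the labeled documents
    for _ in range(n_iter):
        y_u = clf.predict(X_u)                   # E-step: (hard) labels for unlabeled docs
        clf = MultinomialNB().fit(np.vstack([X_l, X_u]),
                                  np.concatenate([y_l, y_u]))   # M-step: refit on everything
    return clf
```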
Problem Definition
We consider a semi-supervised classification task, in which the goal is to produce a mapping from an instance space X consisting of T-tuples of nonnegative integer-valued features w = (w1, ..., wT).
Problem Definition
Our semi-supervised technique utilizes statistics computed over the labeled corpus, denoted as follows.
Problem Definition
In addition to Multinomial Naive Bayes (discussed in Section 3), we evaluate against a variety of supervised and semi-supervised techniques from previous work, which provide a representation of the state of the art.
semi-supervised is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Wang, Qin Iris and Schuurmans, Dale and Lin, Dekang
Abstract
We present a novel semi-supervised training algorithm for learning dependency parsers.
Abstract
By combining a supervised large margin loss with an unsupervised least squares loss, a discriminative, convex, semi-supervised learning algorithm can be obtained that is applicable to large-scale problems.
Introduction
heavy dependence on annotated corpora—many researchers have investigated semi-supervised learning techniques that can take both labeled and unlabeled training data as input.
Introduction
Unfortunately, although significant recent progress has been made in the area of semi-supervised learning, the performance of semi-supervised learning algorithms still falls far short of expectations, particularly in challenging real-world tasks such as natural language parsing or machine translation.
Introduction
A large number of distinct approaches to semi-supervised training algorithms have been investigated in the literature (Bennett and Demiriz, 1998; Zhu et al., 2003; Altun et al., 2005; Mann and McCallum, 2007).
semi-supervised is mentioned in 39 sentences in this paper.
Topics mentioned in this paper:
Zhu, Muhua and Zhang, Yue and Chen, Wenliang and Zhang, Min and Zhu, Jingbo
Experiments
Table 6: Experimental results on the English and Chinese development sets with different types of semi-supervised features added incrementally to the extended parser.
Experiments
Based on the extended parser, we experimented with different types of semi-supervised features by adding the features incrementally.
Experiments
By comparing the results in Table 5 and the results in Table 6 we can see that the semi-supervised features achieve an overall improvement of 1.0% on the English data and an im-
Introduction
In addition to the above contributions, we apply a variety of semi-supervised learning techniques to our transition-based parser.
Introduction
Experimental results show that semi-supervised methods give a further improvement of 0.9% in F-score on the English data and 2.4% on the Chinese data.
Semi-supervised Parsing with Large Data
4.4 Semi-supervised Features
semi-supervised is mentioned in 27 sentences in this paper.
Topics mentioned in this paper:
Li, Tao and Zhang, Yi and Sindhwani, Vikas
Experiments
Genuinely unlabeled posts for Political and Lotus were used for semi-supervised learning experiments in section 6.3; they were not used in section 6.2 on the effect of lexical prior knowledge.
Experiments
Robustness to Vocabulary Size High dimensionality and noise can have profound impact on the comparative performance of clustering and semi-supervised learning algorithms.
Experiments
The natural question is whether the presence of lexical constraints leads to better semi-supervised models.
Introduction
However, the treatment of such dictionaries as forms of prior knowledge that can be incorporated in machine learning models is a relatively less explored topic; even lesser so in conjunction with semi-supervised models that attempt to utilize un-
Related Work
In this regard, our model brings two interrelated but distinct themes from machine learning to bear on this problem: semi-supervised learning and learning from labeled features.
Related Work
(Goldberg and Zhu, 2006) adapt semi-supervised graph-based methods for sentiment analysis but do not incorporate lexical prior knowledge in the form of labeled features.
Related Work
We also note the very recent work of (Sindhwani and Melville, 2008) which proposes a dual-supervision model for semi-supervised sentiment analysis.
Semi-Supervised Learning With Lexical Knowledge
Therefore, the semi-supervised learning with lexical knowledge can be described as
Semi-Supervised Learning With Lexical Knowledge
Thus the algorithm for semi-supervised learning with lexical knowledge based on our matrix factorization framework, referred as SSMFLK, consists of an iterative procedure using the above three rules until convergence.
semi-supervised is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Jeon, Je Hun and Liu, Yang
Abstract
In this paper, we exploit semi-supervised learning with the co-training algorithm for automatic detection of coarse level representation of prosodic events such as pitch accents, intonational phrase boundaries, and break indices.
Co-training strategy for prosodic event detection
Co-training (Blum and Mitchell, 1998) is a semi-supervised multi-view algorithm that uses the initial training set to learn a (weak) classifier in each view.
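A minimal co-training sketch in the spirit of Blum and Mitchell (1998); the pool size, confidence handling, and tie-breaking below are illustrative choices, not the paper's sample-selection method.

```python
import numpy as np

def co_train(clf_a, clf_b, Xa, Xb, y, Ua, Ub, rounds=10, k=20):
    """Xa/Xb: two feature views of the labeled data; Ua/Ub: the same unlabeled items in each view."""
    for _ in range(rounds):
        if len(Ua) == 0:
            break
        clf_a.fit(Xa, y)
        clf_b.fit(Xb, y)
        pa, pb = clf_a.predict_proba(Ua), clf_b.predict_proba(Ub)
        # each view nominates the k unlabeled items it is most confident about
        pick_a = np.argsort(-pa.max(axis=1))[:k]
        pick_b = np.argsort(-pb.max(axis=1))[:k]
        pick = np.unique(np.concatenate([pick_a, pick_b]))
        labels = np.where(np.isin(pick, pick_a),          # nominator's prediction wins
                          pa.argmax(axis=1)[pick],
                          pb.argmax(axis=1)[pick])
        Xa, Xb = np.vstack([Xa, Ua[pick]]), np.vstack([Xb, Ub[pick]])
        y = np.concatenate([y, labels])
        keep = np.setdiff1d(np.arange(len(Ua)), pick)
        Ua, Ub = Ua[keep], Ub[keep]
    return clf_a, clf_b
```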
Conclusions
In addition, we plan to compare this to other semi-supervised learning techniques such as active learning.
Experiments and results
Although the test condition is different, our result is significantly better than that of other semi-supervised approaches of previous work and comparable with supervised approaches.
Introduction
Limited research has been conducted using unsupervised and semi-supervised methods.
Introduction
In this paper, we exploit semi-supervised learning with the
Introduction
Our experiments on the Boston Radio News corpus show that the use of unlabeled data can lead to significant improvement of prosodic event detection compared to using the original small training set, and that the semi-supervised learning result is comparable with supervised learning with similar amount of training data.
Previous work
Limited research has been done in prosodic detection using unsupervised or semi-supervised methods.
Previous work
She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples.
Previous work
In this paper, we apply co-training algorithm to automatic prosodic event detection and propose methods to better select samples to improve semi-supervised learning performance for this task.
semi-supervised is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Dhillon, Paramveer S. and Talukdar, Partha Pratim and Crammer, Koby
A ← METRICLEARNER(X, ...)
3.3 Semi-Supervised Classification
A ← METRICLEARNER(X, ...)
In this section, we trained the GRF classifier (see Equation 3), a graph-based semi-supervised learning (SSL) algorithm (Zhu et al., 2003), using Gaussian kernel parameterized by A = P^T P to set edge weights.
Abstract
We initiate a study comparing effectiveness of the transformed spaces learned by recently proposed supervised, and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
Abstract
Through a variety of experiments on different real-world datasets, we find IDML—IT, a semi-supervised metric learning algorithm to be the most effective.
Conclusion
In this paper, we compared the effectiveness of the transformed spaces learned by recently proposed supervised, and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
Conclusion
Through a variety of experiments on different real-world NLP datasets, we demonstrated that supervised as well as semi-supervised classifiers trained on the space learned by IDML—IT consistently result in the lowest classification errors.
Introduction
Even though different supervised and semi-supervised metric learning algorithms have recently been proposed, effectiveness of the transformed spaces learned by them in NLP
Introduction
We find IDML-IT, a semi-supervised metric learning algorithm to be the most effective.
Metric Learning
2.3 Inference-Driven Metric Learning (IDML): Semi-Supervised
Metric Learning
Since we are focusing on the semi-supervised learning (SSL) setting with n_l labeled and n_u unlabeled instances, the idea is to automatically label the unlabeled instances using a graph based SSL algorithm, and then include instances with low assigned label entropy (i.e., high confidence label assignments) in the next round of metric learning.
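A loose sketch of the loop just described, with scikit-learn's LabelSpreading and NeighborhoodComponentsAnalysis standing in for the paper's GRF classifier and metric learner; both are substitutions for illustration, not the authors' components, and the threshold is an assumption.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading
from sklearn.neighbors import NeighborhoodComponentsAnalysis

def idml_like(X, y, entropy_thresh=0.1, rounds=3):
    """y uses -1 for unlabeled instances (scikit-learn's SSL convention)."""
    y = y.copy()
    for _ in range(rounds):
        labeled = y != -1
        nca = NeighborhoodComponentsAnalysis().fit(X[labeled], y[labeled])  # learn the metric
        Z = nca.transform(X)                                 # project into the learned space
        ssl = LabelSpreading(kernel="rbf").fit(Z, y)         # graph-based labeling step
        probs = ssl.label_distributions_
        ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        promote = (y == -1) & (ent < entropy_thresh)         # low entropy = confident labels
        y[promote] = ssl.transduction_[promote]              # feed them to the next round
    return y
```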
semi-supervised is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Hassan, Ahmed and Radev, Dragomir R.
Abstract
The method could be used both in a semi-supervised setting where a training set of labeled words is used, and in an unsupervised setting where a handful of seeds is used to define the two polarity classes.
Abstract
It outperforms the state of the art methods in the semi-supervised setting.
Conclusions
The proposed method can be used in a semi-supervised setting where a training set of labeled words is used, and in an unsupervised setting where only a handful of seeds is used to define the two polarity classes.
Experiments
This method could be used in a semi-supervised setting where a set of labeled words are used and the system learns from these labeled nodes and from other unlabeled nodes.
Introduction
Previous work on identifying the semantic orientation of words has addressed the problem as both a semi-supervised (Takamura et al., 2005) and an unsupervised (Turney and Littman, 2003) learning problem.
Introduction
In the semi-supervised setting, a training set of labeled words
Introduction
The proposed method could be used both in a semi-supervised and in an unsupervised setting.
Word Polarity
This view is closely related to the partially labeled classification with random walks approach in (Szummer and Jaakkola, 2002) and the semi-supervised learning using harmonic functions approach in (Zhu et al., 2003).
semi-supervised is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut
Abstract
We conduct experiments on data sets from the NEWS 2010 shared task on transliteration mining and achieve an F-measure of up to 92%, outperforming most of the semi-supervised systems that were submitted.
Conclusion
We evaluated it against the semi-supervised systems of NEWS10 and achieved high F-measure and performed better than most of the semi-supervised systems.
Experiments
On the WIL data sets, we compare our fully unsupervised system with the semi-supervised systems presented at the NEWS10 (Kumaran et al., 2010).
Experiments
For English/Arabic, English/Hindi and English/Tamil, our system is better than most of the semi-supervised systems presented at the NEWS 2010 shared task for transliteration mining.
Introduction
We compare our unsupervised transliteration mining method with the semi-supervised systems presented at the NEWS 2010 shared task on transliteration mining (Kumaran et al., 2010) using four language pairs.
Introduction
These systems used a manually labelled set of data for initial supervised training, which means that they are semi-supervised systems.
Introduction
We achieve an F-measure of up to 92% outperforming most of the semi-supervised systems.
Previous Research
Our unsupervised method seems robust as its performance is similar to or better than many of the semi-supervised systems on three language pairs.
semi-supervised is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Jiang, Qixia and Sun, Maosong
Abstract
This paper proposes a novel (semi-)supervised hashing method named Semi-Supervised SimHash (S3H) for high-dimensional data similarity search.
Background and Related Works
2.3 Semi-Supervised Hashing
Background and Related Works
Semi-Supervised Hashing (SSH) (Wang et al., 2010a) is recently proposed to incorporate prior knowledge for better hashing.
Introduction
Motivated by this, some supervised methods are proposed to derive effective hash functions from prior knowledge, i.e., Spectral Hashing (Weiss et al., 2009) and Semi-Supervised Hashing (SSH) (Wang et al., 2010a).
Introduction
This paper proposes a novel (semi-)supervised hashing method, Semi-Supervised SimHash (S3H), for high-dimensional data similarity search.
Introduction
In Section 3, we describe our proposed Semi-Supervised SimHash (S3H).
Semi-Supervised SimHash
In this section, we present our hashing method, named Semi-Supervised SimHash (S3H).
The direction is determined by concatenating w L times.
We have proposed a novel supervised hashing method named Semi-Supervised SimHash (S3H) for high-dimensional data similarity search.
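For context, a sketch of plain random-hyperplane SimHash, the base scheme that S3H extends; the semi-supervised part (learning the projection from labeled pairs) is not reproduced here, and the bit width is an arbitrary choice.

```python
import numpy as np

def simhash(x, n_bits=64, seed=0):
    """x: 1-D feature vector; returns an n_bits-long binary code."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, x.shape[0]))   # one random hyperplane per bit
    return (planes @ x >= 0).astype(np.uint8)            # sign of each projection

def hamming(code_a, code_b):
    return int(np.count_nonzero(code_a != code_b))       # similarity search uses this distance
```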
semi-supervised is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Dasgupta, Sajib and Ng, Vincent
Abstract
To address this problem, we propose a semi-supervised approach to sentiment classification where we first mine the unambiguous reviews using spectral techniques and then exploit them to classify the ambiguous reviews via a novel combination of active learning, transductive learning, and ensemble learning.
Conclusions
We have proposed a novel semi-supervised approach to polarity classification.
Conclusions
Since the semi-supervised learner is discriminative, our approach can adopt a richer representation that makes use of more sophisticated features such as bigrams or manually labeled sentiment-oriented words.
Evaluation
Semi-supervised spectral clustering.
Evaluation
We implemented Kamvar et al.’s (2003) semi-supervised spectral clustering algorithm, which incorporates labeled data into the clustering framework in the form of must-link and cannot-link constraints.
Evaluation
As we can see, accuracy ranges from 57.3% to 68.7% and ARI ranges from 0.02 to 0.14, which are significantly better than those of semi-supervised spectral learning.
Introduction
In light of the difficulties posed by ambiguous reviews, we differentiate between ambiguous and unambiguous reviews in our classification process by addressing the task of semi-supervised polarity classification via a “mine the easy, classify the hard” approach.
Our Approach
Recall that the goal of this step is not only to identify the unambiguous reviews, but also to annotate them as POSITIVE or NEGATIVE, so that they can serve as seeds for semi-supervised learning in a later step.
semi-supervised is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Srivastava, Shashank and Hovy, Eduard
Introduction
This is necessary for the scenario of semi-supervised learning of weights with partially annotated sentences, as described later.
Introduction
Semi-supervised learning: In the semi-supervised case, the labels y_i^(k) are known only for some of the tokens in x^(k).
Introduction
The semi-supervised approach enables incorporation of significantly more training data.
semi-supervised is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Whitney, Max and Sarkar, Anoop
Bootstrapping
Abney (2004) defines useful notation for semi-supervised learning, shown in table 1.
Existing algorithms 3.1 Yarowsky
Haffari and Sarkar (2007) suggest a bipartite graph framework for semi-supervised learning based on their analysis of Y-1/DL-1-VS and objective (2).
Existing algorithms 3.1 Yarowsky
3.7 Semi-supervised learning algorithm of Subramanya et al.
Existing algorithms 3.1 Yarowsky
(2010) give a semi-supervised algorithm for part of speech tagging.
Graph propagation
Note that (3) is independent of their specific graph structure, distributions, and semi-supervised learning algorithm.
Introduction
In this paper, we are concerned with a case of semi-supervised learning that is close to unsupervised learning, in that the labelled and unlabelled data points are from the same domain and only a small set of seed rules is used to derive the labelled points.
Introduction
In contrast, typical semi-supervised learning deals with a large number of labelled points, and a domain adaptation task with unlabelled points from the new domain.
semi-supervised is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Abstract
To deal with these issues, we describe an efficient semi-supervised learning (SSL) approach which has two components: (i) Markov Topic Regression is a new probabilistic model to cluster words into semantic tags (concepts).
Abstract
Our new SSL approach improves semantic tagging performance by 3% absolute over the baseline models, and also compares favorably on semi-supervised syntactic tagging.
Introduction
To deal with these issues, we present a new semi-supervised learning (SSL) approach, which mainly has two components.
Related Work and Motivation
(I) Semi-Supervised Tagging.
Related Work and Motivation
(Wang et al., 2009; Li et al., 2009; Li, 2010; Liu et al., 2011) investigate web query tagging using semi-supervised sequence models.
Semi-Supervised Semantic Labeling
4.2 Retrospective Semi-Supervised CRF
Semi-Supervised Semantic Labeling
Algorithm 2 Retrospective Semi-Supervised CRF. Input: Labeled L_l and unlabeled L_u data.
semi-supervised is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Li, Chi-Ho and Li, Mu and Zhou, Ming
Conclusion and Future Work
The features and weights are tuned with an iterative semi-supervised method.
Experiments and Results
To perform consensus-based re-ranking, we first use the baseline decoder to get the n-best list for each sentence of development and test data, then we create graph using the n-best lists and training data as we described in section 5.1, and perform semi-supervised training as mentioned in section 4.3.
Features and Training
Algorithm 1 Semi-Supervised Learning
Features and Training
Algorithm 1 outlines our semi-supervised method for such alternative training.
Graph-based Translation Consensus
Before elaborating how the graph model of consensus is constructed for both a decoder and N-best output re-ranking in section 5, we will describe how the consensus features and their feature weights can be trained in a semi-supervised way, in section 4.
Introduction
Alexandrescu and Kirchhoff (2009) proposed a graph-based semi-supervised model to re-rank n-best translation output.
semi-supervised is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Sun, Ang and Grishman, Ralph and Sekine, Satoshi
Abstract
We present a simple semi-supervised relation extraction system with large-scale word clustering.
Abstract
When training on different sizes of data, our semi-supervised approach consistently outperformed a state-of-the-art supervised baseline system.
Cluster Feature Selection
The cluster based semi-supervised system works by adding an additional layer of lexical features that incorporate word clusters as shown in column 4 of Table 4.
Conclusion and Future Work
We have described a semi-supervised relation extraction system with large-scale word clustering.
Experiments
For the semi-supervised system, 70 percent of the rest of the documents were randomly selected as training data and 30 percent as development data.
Experiments
For the semi-supervised system, each test fold was the same one used in the baseline and the other 4 folds were further split into a training set and a development set in a ratio of 7:3 for selecting clusters.
semi-supervised is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Garrette, Dan and Mielens, Jason and Baldridge, Jason
Abstract
While a variety of semi-supervised methods exist for training from incomplete data, there are open questions regarding what types of training data should be used and how much is necessary.
Abstract
Our results show that annotation of word types is the most important, provided a sufficiently capable semi-supervised learning infrastructure is in place to project type information onto a raw corpus.
Conclusions and Future Work
Most importantly, it is clear that type annotations are the most useful input one can obtain from a linguist—provided a semi-supervised algorithm for projecting that information reliably onto raw tokens is available.
Data
While we do not explore a rule-writing approach to POS-tagging, we do consider the impact of rule-based morphological analyzers as a component in our semi-supervised POS-tagging system.
Experiments
In addition to annotations, semi-supervised tagger training requires a corpus of raw text.
Introduction
The overwhelming take away from our results is that type supervision—when backed by an effective semi-supervised learning approach—is the most important source of linguistic information.
semi-supervised is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Koo, Terry and Carreras, Xavier and Collins, Michael
Abstract
We present a simple and effective semi-supervised method for training dependency parsers.
Conclusions
In this paper, we have presented a simple but effective semi-supervised learning approach and demonstrated that it achieves substantial improvement over a competitive baseline in two broad-coverage depen-
Introduction
In this paper, we introduce lexical intermediaries via a simple two-stage semi-supervised approach.
Introduction
In general, semi-supervised learning can be motivated by two concerns: first, given a fixed amount of supervised data, we might wish to leverage additional unlabeled data to facilitate the utilization of the supervised corpus, increasing the performance of the model in absolute terms.
Introduction
We show that our semi-supervised approach yields improvements for fixed datasets by performing parsing experiments on the Penn Treebank (Marcus et al., 1993) and Prague Dependency Treebank (Hajic, 1998; Hajic et al., 2001) (see Sections 4.1 and 4.3).
Related Work
Semi-supervised phrase structure parsing has been previously explored by McClosky et al.
semi-supervised is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Croce, Danilo and Giannone, Cristina and Annesi, Paolo and Basili, Roberto
Conclusions
In this paper, a distributional approach for acquiring a semi-supervised model of argument classification (AC) preferences has been proposed.
Conclusions
Moreover, dimensionality reduction methods alternative to LSA, as currently studied on semi-supervised spectral learning (Johnson and Zhang, 2008), will be experimented.
Introduction
Finally, the application of semi-supervised learning is attempted to increase the lexical expressiveness of the model, e.g.
Introduction
A semi-supervised statistical model exploiting useful lexical information from unlabeled corpora is proposed.
Related Work
Accordingly a semi-supervised approach for reducing the costs of the manual annotation effort is proposed.
Related Work
It embodies the idea that a multitask learning architecture coupled with semi-supervised learning can be effectively applied even to complex linguistic tasks such as SRL.
semi-supervised is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Mochihashi, Daichi and Yamada, Takeshi and Ueda, Naonori
Discussion
Therefore, it is also interesting to combine them in a discriminative way as pursued in POS tagging using CRF+HMM (Suzuki et al., 2007), let alone a simple semi-supervised approach in Section 5.2.
Experiments
Table 4: Semi-supervised and supervised results.
Experiments
Semi-supervised results used only 10K sentences (1/5) of supervised segmentations.
Experiments
Furthermore, NPYLM is easily amenable to semi-supervised or even supervised learning.
Introduction
In Section 5 we describe experiments on the standard datasets in Chinese and Japanese in addition to English phonetic transcripts, and semi-supervised experiments are also explored.
semi-supervised is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Scheible, Christian and Schütze, Hinrich
Abstract
Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer.
Conclusion
Since a large labeled sentiment relevance resource does not yet exist, we investigated semi-supervised approaches to S-relevance classification that do not require S-relevance-labeled data.
Distant Supervision
Since a large labeled resource for sentiment relevance classification is not yet available, we investigate semi-supervised methods for creating sentiment relevance classifiers.
Introduction
For this reason, we investigate two semi-supervised approaches to S-relevance classification that do not require S-relevance-labeled data.
Transfer Learning
To address the problem that we do not have enough labeled SR data we now investigate a second semi-supervised method for SR classification, transfer learning (TL).
semi-supervised is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Imamura, Makoto and Takayama, Yasuhiro and Kaji, Nobuhiro and Toyoda, Masashi and Kitsuregawa, Masaru
Abstract
In this paper, we present a combination of active learning and semi-supervised learning method to treat the case when positive examples, which have an expected word sense in web search result, are only given.
Introduction
McCallum and Nigam (1998) combined active learning and semi-supervised learning technique
Introduction
Figure 1: A combination of active learning and semi-supervised learning starting with positive and unlabeled examples
Introduction
Next we use Nigam’s semi-supervised learning method using EM and a naive Bayes classifier (Nigam et.
semi-supervised is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Zhang, Zhe and Singh, Munindar P.
Abstract
We propose a semi-supervised framework for generating a domain-specific sentiment lexicon and inferring sentiments at the segment level.
Introduction
To address the above shortcomings of lexicon and granularity, we propose a semi-supervised framework named ReNew.
Related Work
Rao and Ravichandran (2009) formalize the problem of sentiment detection as a semi-supervised label propagation problem in a graph.
Related Work
Esuli and Sebastiani (2006) use a set of classifiers in a semi-supervised fashion to iteratively expand a manu-
Related Work
(2011) introduce a semi-supervised approach that uses recursive autoencoders to learn the hierarchical structure and sentiment distribution of a sentence.
semi-supervised is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Celikyilmaz, Asli and Thint, Marcus and Huang, Zhiheng
Abstract
We present a graph-based semi-supervised learning for the question-answering (QA) task for ranking candidate sentences.
Abstract
We implement a semi-supervised learning (SSL) approach to demonstrate that utilization of more unlabeled data points can improve the answer-ranking task of QA.
Conclusions and Discussions
(2) We will use other distance measures to better explain entailment between q/a pairs and compare with other semi-supervised and transductive approaches.
Graph Based Semi-Supervised Learning for Entailment Ranking
We formulate semi-supervised entailment rank scores as follows.
Introduction
Recent research indicates that using labeled and unlabeled data in semi-supervised learning (SSL) environment, with an emphasis on graph-based methods, can improve the performance of information extraction from data for tasks such as question classification (Tri et al., 2006), web classification (Liu et al., 2006), relation extraction (Chen et al., 2006), passage-retrieval (Otterbacher et al., 2009), various natural language processing tasks such as part-of-speech tagging, and named-entity recognition (Suzuki and Isozaki, 2008), word-sense disam-
semi-supervised is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Hingmire, Swapnil and Chakraborti, Sutanu
Related Work
Several researchers have proposed semi-supervised text classification algorithms with the aim of reducing the time, effort and cost involved in labeling documents.
Related Work
Semi-supervised text classification algorithms proposed in (Nigam et al., 2000), (Joachims, 1999), (Zhu and Ghahramani, 2002) and (Blum and Mitchell, 1998) are a few examples of this type.
Related Work
The third type of semi-supervised text classification algorithms is based on active learning.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Abstract
A semi-supervised training approach is proposed to train the parameters, and the phrase pair embedding is explored to model translation confidence directly.
Conclusion and Future Work
We apply our model to SMT decoding, and propose a three-step semi-supervised training method.
Introduction
We propose a three-step semi-supervised training approach to optimizing the parameters of R2NN, which includes recursive auto-encoding for unsupervised pre-training, supervised local training based on the derivation trees of forced decoding, and supervised global training using early update strategy.
Introduction
Our R2NN framework is introduced in detail in Section 3, followed by our three-step semi-supervised training approach in Section 4.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Longkai and Li, Li and He, Zhengyan and Wang, Houfeng and Sun, Ni
Experiment
Another baseline is Li and Sun (2009), which also uses punctuation in their semi-supervised framework.
INTRODUCTION
We build a semi-supervised learning (SSL) framework which can iteratively incorporate newly labeled instances from unlabeled micro-blog data during the training process.
Related Work
Meanwhile semi-supervised methods have been applied into NLP applications.
Related Work
Similar semi-supervised applications include Shen et al.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Silberer, Carina and Lapata, Mirella
Autoencoders for Grounded Semantics
Alternatively, a semi-supervised criterion can be used (Ranzato and Szummer, 2008; Socher et al., 2011) through combination of the unsupervised training criterion (global reconstruction) with a supervised criterion (prediction of some target given the latent representation).
Autoencoders for Grounded Semantics
Stacked Bimodal Autoencoder We finally build a stacked bimodal autoencoder (SAE) with all pre-trained layers and fine-tune them with respect to a semi-supervised criterion.
Autoencoders for Grounded Semantics
Furthermore, the semi-supervised setting affords flexibility, allowing to adapt the architecture to specific tasks.
Related Work
Secondly, our problem setting is different from the former studies, which usually deal with classification tasks and fine-tune the deep neural networks using training data with explicit class labels; in contrast we fine-tune our autoencoders using a semi-supervised criterion.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Kulkarni, Anagha and Callan, Jamie
Experiments and Results
The results for the semi-supervised models are inconclusive.
Finding the Homographs in a Lexicon
We experiment with three model setups: Supervised, semi-supervised, and unsupervised.
Finding the Homographs in a Lexicon
In Model II, the semi-supervised setup, the training data is used to initialize the Expectation-Maximization (EM) algorithm (Dempster et al., 1977) and the unlabeled data, described in Section 3.1, updates the initial estimates.
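A minimal sketch of this initialize-then-refine pattern with a Gaussian mixture, where class means estimated from the labeled data seed EM and the unlabeled data then updates the estimates (illustrative only; the authors' features and model are not reproduced, and the component-to-class correspondence can drift during EM):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def labeled_init_em(X_lab, y_lab, X_unlab):
    """Initialize mixture means from labeled data, then run EM on unlabeled data."""
    classes = np.unique(y_lab)
    init_means = np.array([X_lab[y_lab == c].mean(axis=0) for c in classes])
    gm = GaussianMixture(n_components=len(classes), means_init=init_means, max_iter=100)
    gm.fit(X_unlab)              # EM updates the labeled-data initialization
    return gm, classes           # component i started out aligned with classes[i]
```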
Finding the Homographs in a Lexicon
The unsupervised setup, Model III, is similar to the semi-supervised setup except that the EM algorithm is initialized using an informed guess by the authors.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Lan, Man and Xu, Yu and Niu, Zhengyu
Multitask Learning for Discourse Relation Prediction
ASO has been shown to be useful in a semi-supervised learning configuration for several NLP applications, such as text chunking (Ando and Zhang, 2005b) and text classification (Ando and Zhang, 2005a).
Related Work
2.1.3 Semi-supervised approaches
Related Work
(Hernault et al., 2010) presented a semi-supervised method based on the analysis of co-occurring features in labeled and unlabeled data.
Related Work
Very recently, (Hernault et al., 2011) introduced a semi-supervised work using structure learning method for discourse relation classification, which is quite relevant to our work.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Jiang, Wenbin and Sun, Meng and Lü, Yajuan and Yang, Yating and Liu, Qun
Experiments
resorting to complicated features, system combination and other semi-supervised technologies.
Related Work
Lots of efforts have been devoted to semi-supervised methods in sequence labeling and word segmentation (Xu et al., 2008; Suzuki and Isozaki, 2008; Haffari and Sarkar, 2008; Tomanek and Hahn, 2009; Wang et al., 2011).
Related Work
A semi-supervised method tries to find an optimal hyperplane over both annotated data and raw data, thus resulting in a model with better coverage and higher accuracy.
Related Work
It is fundamentally different from semi-supervised and unsupervised methods in that we aimed to excavate a totally different kind of knowledge, the natural annotations implied by the structural information in web text.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Yates, Alexander
Experiments
(2006): the semi-supervised Alternating Structural Optimization (ASO) technique and the Structural Correspondence Learning (SCL) technique for domain adaptation.
Related Work
Ando and Zhang develop a semi-supervised chunker that outperforms purely supervised approaches on the CoNLL 2000 dataset (Ando and Zhang, 2005).
Related Work
Recent projects in semi-supervised (Toutanova and Johnson, 2007) and unsupervised (Biemann et al., 2007; Smith and Eisner, 2005) tagging also show significant progress.
Related Work
HMMs have been used many times for POS tagging and chunking, in supervised, semi-supervised, and unsupervised settings (Banko and Moore, 2004; Goldwater and Griffiths, 2007; Johnson, 2007; Zhou, 2004).
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Li, Fangtao and Pan, Sinno Jialin and Jin, Ou and Yang, Qiang and Zhu, Xiaoyan
Introduction
(2009) proposed a rule-based semi-supervised learning method for lexicon extraction.
Introduction
Semi-Supervised Method (Semi) we implement the double propagation model proposed in (Qiu et al., 2009).
Introduction
The relational bootstrapping method performs better than the unsupervised method, TrAdaBoost and the cross-domain CRF algorithm, and achieves comparable results with the semi-supervised method.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Chen, Wenliang and Zhang, Min and Li, Haizhou
Experiments
Suzuki2009 (Suzuki et al., 2009) reported the best result by combining a Semi-supervised Structured Conditional Model (Suzuki and Isozaki, 2008) with the method of (Koo et al., 2008).
Experiments
G denotes the supervised graph-based parsers, S denotes the graph-based parsers with semi-supervised methods, D denotes our new parsers
Related work
(2009) presented a semi-supervised learning approach.
Related work
They extended a Semi-supervised Structured Conditional Model (SS-SCM) (Suzuki and Isozaki, 2008) to the dependency parsing problem and combined their method with the approach of Koo et al.
semi-supervised is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Druck, Gregory and Mann, Gideon and McCallum, Andrew
Abstract
In this paper, we propose a novel method for semi-supervised learning of non-projective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g.
Experimental Comparison with Unsupervised Learning
In this paper, we developed a novel method for the semi-supervised learning of a non-projective CRF dependency parser that directly uses linguistic prior knowledge as a training signal.
Related Work
Conventional semi-supervised learning requires parsed sentences.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Labutov, Igor and Lipson, Hod
Introduction
Traditionally, we would learn the embeddings for the target task jointly with whatever unlabeled data we may have, in an instance of semi-supervised learning, and/or we may leverage labels from multiple other related tasks in a multitask approach.
Related Work
Embeddings are learned in a semi-supervised fashion, and the components of the embedding are given an explicit probabilistic interpretation.
Related Work
In machine learning literature, joint semi-supervised embedding takes form in methods such as the LaplacianSVM (LapSVM) (Belkin et al., 2006) and Label Propagation (Zhu and Ghahramani, 2002), to which our approach is related.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Yogatama, Dani and Smith, Noah A.
Experiments
Note that we ran Brown clustering only on the training documents; running it on a larger collection of (unlabeled) documents relevant to the prediction task (i.e., semi-supervised learning) is worth exploring in future work.
Structured Regularizers for Text
This contrasts with typical semi-supervised learning methods for text categorization that combine unlabeled and labeled data within a generative model, such as multinomial naïve Bayes, via expectation-maximization (Nigam et al., 2000) or semi-supervised frequency estimation (Su et al., 2011).
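For reference, a compact sketch of the Nigam et al. (2000)-style combination the excerpt contrasts with: a multinomial naive Bayes model refit on labeled documents plus posterior-weighted copies of the unlabeled documents (a sketch only, assuming dense count matrices; the `nb_em` helper name and iteration count are illustrative):

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def nb_em(X_lab, y_lab, X_unlab, n_iter=10):
    """Fit on labeled counts, then alternate an E-step (posteriors for unlabeled
    documents) and an M-step (refit on labeled data plus weighted unlabeled copies)."""
    clf = MultinomialNB().fit(X_lab, y_lab)
    for _ in range(n_iter):
        classes = clf.classes_
        post = clf.predict_proba(X_unlab)                      # E-step
        X_all = np.vstack([X_lab] + [X_unlab] * len(classes))
        y_all = np.concatenate([y_lab] + [np.full(X_unlab.shape[0], c) for c in classes])
        w_all = np.concatenate([np.ones(X_lab.shape[0])] +
                               [post[:, i] for i in range(len(classes))])
        clf = MultinomialNB().fit(X_all, y_all, sample_weight=w_all)   # M-step
    return clf
```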
Structured Regularizers for Text
We leave comparison with other semi-supervised methods for future work.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Jiang, Jing
Related work
(2006) explored semi-supervised learning for relation extraction using label propagation, which makes use of unlabeled data.
Task definition
Another existing solution to weakly-supervised learning problems is semi-supervised learning, e.g.
Task definition
However, because our proposed transfer learning method can be combined with semi-supervised learning, here we do not include semi-supervised learning as a baseline.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Xu, Liheng and Liu, Kang and Lai, Siwei and Zhao, Jun
Experiments
Afterwards, word-syntactic pattern co-occurrence statistics are used as features for a semi-supervised classifier TSVM (Joachims, 1999) to further refine the results.
Introduction
At the same time, a semi-supervised convolutional neural model (Collobert et al., 2011) is employed to encode contextual semantic clue.
Related Work
A recent research (Xu et al., 2013) extracted infrequent product features by a semi-supervised classifier, which used word-syntactic pattern co-occurrence statistics as features for the classifier.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Korkontzelos, Ioannis and Manandhar, Suresh
Abstract
The evaluation set is derived from WordNet in a semi-supervised way.
Conclusion and Future Work
We proposed a semi-supervised way to extract non-compositional MWEs from WordNet.
Introduction and related work
Thirdly, we propose a semi-supervised approach for extracting non-compositional MWEs from WordNet, to decrease annotation cost.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lin, Dekang and Wu, Xiaoyun
Discussion and Related Work
In earlier work on semi-supervised learning, e.g., (Blum and Mitchell, 1998), the classifiers learned from unlabeled data were used directly.
Discussion and Related Work
There are two main scenarios that motivate semi-supervised learning.
Introduction
In this paper, we present a semi-supervised learning algorithm that goes a step further.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Yan, Yulan and Okazaki, Naoaki and Matsuo, Yutaka and Yang, Zhenglu and Ishizuka, Mitsuru
Introduction
Even with semi-supervised approaches, which use a large unlabeled corpus, manual construction of a small set of seeds known as true instances of the target entity or relation is susceptible to arbitrary human decisions.
Introduction
The combination of these patterns produces a clustering method to achieve high precision for different Information Extraction applications, especially for bootstrapping a high-recall semi-supervised relation extraction system.
Related Work
(Rosenfeld and Feldman, 2006) showed that the clusters discovered by URI are useful for seeding a semi-supervised relation extraction system.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Saluja, Avneesh and Hassan, Hany and Toutanova, Kristina and Quirk, Chris
Abstract
In this work, we present a semi-supervised graph-based approach for generating new translation rules that leverages bilingual and monolingual data.
Evaluation
In this set of experiments, we examined if the improvements in §3.2 can be explained primarily through the extraction of language model characteristics during the semi-supervised learning phase, or through orthogonal pieces of evidence.
Introduction
Our work introduces a new take on the problem using graph-based semi-supervised learning to acquire translation rules and probabilities by leveraging both monolingual and parallel data resources.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Qian, Longhua and Hui, Haotian and Hu, Ya'nan and Zhou, Guodong and Zhu, Qiaoming
Abstract
In the literature, the mainstream research on relation extraction adopts statistical machine learning methods, which can be grouped into supervised learning (Zelenko et al., 2003; Culotta and Soresen, 2004; Zhou et al., 2005; Zhang et al., 2006; Qian et al., 2008; Chan and Roth, 2011), semi-supervised learning (Zhang et al., 2004; Chen et al., 2006; Zhou et al., 2008; Qian et al., 2010) and unsupervised learning (Hasegawa et al., 2004; Zhang et al., 2005) in terms of the amount of labeled training data they need.
Abstract
Therefore, Kim and Lee (2012) further employ a graph-based semi-supervised learning method, namely Label Propagation (LP), to indirectly propagate labels from the source language to the target language in an iterative fashion.
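As an aside, a minimal sketch of the clamped label-propagation iteration in the style of Zhu and Ghahramani (2002), over a generic similarity graph (the cross-lingual graph construction used by Kim and Lee (2012) is not reproduced here; `W`, `Y`, the labeled mask, and the iteration count are assumptions):

```python
import numpy as np

def label_propagation(W, Y, labeled, n_iter=100):
    """W: (n, n) nonnegative similarity graph; Y: (n, k) label matrix with one-hot
    rows for labeled nodes and zero rows for unlabeled ones; labeled: boolean mask."""
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-normalize the graph
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = P @ F                  # each node averages its neighbours' label scores
        F[labeled] = Y[labeled]    # clamp the labeled (source-side) nodes
    return F.argmax(axis=1)
```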
Abstract
For future work, on one hand, we plan to combine uncertainty sampling with diversity and informativeness measures; on the other hand, we intend to combine BAL with semi-supervised learning to further reduce human annotation efforts.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hoffmann, Raphael and Zhang, Congle and Weld, Daniel S.
Extraction with Lexicons
Then Section 4.2 presents our semi-supervised algorithm for learning semantic lexicons from these lists.
Extraction with Lexicons
4.2 Semi-Supervised Learning of Lexicons
Introduction
When learning an extractor for relation R, LUCHS extracts seed phrases from R’s training data and uses a semi-supervised learning algorithm to create several relation-specific lexicons at different points on a precision-recall spectrum.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Yates, Alexander
Introduction
Fürstenau and Lapata (2009b; 2009a) use semi-supervised techniques to automatically annotate data for previously unseen predicates with semantic role information.
Introduction
(2008) use deep learning techniques based on semi-supervised embeddings to improve an SRL system, though their tests are on in-domain data.
Introduction
Unsupervised SRL systems (Swier and Stevenson, 2004; Grenager and Manning, 2006; Abend et al., 2009) can naturally be ported to new domains with little trouble, but their accuracy thus far falls short of state-of-the-art supervised and semi-supervised systems.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ravi, Sujith and Baldridge, Jason and Knight, Kevin
Data
We use the standard splits of the data used in semi-supervised tagging experiments (e.g., Banko and Moore (2004)): sections 0-18 for training, 19-21 for development, and 22-24 for test.
Experiments
The HMM when using full supervision obtains 87.6% accuracy (Baldridge, 2008), so the accuracy of 63.8% achieved by EMGI+IPGI nearly halves the gap between the supervised model and the 45.6% obtained by the basic EM semi-supervised model.
Introduction
This provides a much more challenging starting point for the semi-supervised methods typically applied to the task.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kushman, Nate and Artzi, Yoav and Zettlemoyer, Luke and Barzilay, Regina
Experimental Setup
Forms of Supervision We consider both semi-supervised and supervised learning.
Experimental Setup
In the semi-supervised scenario, we assume access to the numerical answers of all problems in the training corpus and to a small number of problems paired with full equation systems.
Learning
Also, using different types of validation functions on different subsets of the data enables semi-supervised learning.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Chen, Harr and Benson, Edward and Naseem, Tahira and Barzilay, Regina
Related Work
(2007) propose an objective function for semi-supervised extraction that balances likelihood of labeled instances and constraint violation on unlabeled instances.
Results
Comparison against Supervised CRF Our final set of experiments compares a semi-supervised version of our model against a conditional random field (CRF) model.
Results
For finance, it takes at least 10 annotated documents (corresponding to roughly 130 annotated relation instances) for the CRF to match the semi-supervised model’s performance.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kobdani, Hamidreza and Schuetze, Hinrich and Schiehlen, Michael and Kamp, Hans
Related Work
There are three main approaches to CoRe: supervised, semi-supervised (or weakly supervised) and unsupervised.
Related Work
We use the term semi-supervised for approaches that use some amount of human-labeled coreference pairs.
Related Work
(2002) used co-training for coreference resolution, a semi-supervised method.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin
Related Work
Semi-supervised Learning.
Related Work
Another line of related work is semi-supervised learning, which combines labeled and unlabeled data to improve the performance of the task of interest (Zhu and Goldberg, 2009).
Related Work
Among the popular semi-supervised methods (e.g., EM on Naïve Bayes (Nigam et al., 2000), co-training (Blum and Mitchell, 1998), transductive SVMs (Joachims, 1999b), and co-regularization (Sindhwani et al., 2005; Amini et al., 2010)), our approach employs the EM algorithm, extending it to the bilingual case based on maximum entropy.
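Of the methods listed in the excerpt above, co-training is perhaps the simplest to illustrate. A minimal two-view sketch in the spirit of Blum and Mitchell (1998) (the paper's own bilingual maximum-entropy EM is not shown; the classifiers, `rounds`, and `per_round` values are illustrative assumptions, and all inputs are assumed to be NumPy arrays):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(X1_lab, X2_lab, y_lab, X1_unlab, X2_unlab, rounds=5, per_round=10):
    """Each view's classifier labels the unlabeled examples it is most confident
    about and hands them to both views' training sets."""
    idx = np.arange(len(X1_unlab))                  # indices still unlabeled
    for _ in range(rounds):
        c1 = LogisticRegression(max_iter=1000).fit(X1_lab, y_lab)
        c2 = LogisticRegression(max_iter=1000).fit(X2_lab, y_lab)
        for clf, X_view in ((c1, X1_unlab), (c2, X2_unlab)):
            if len(idx) == 0:
                break
            proba = clf.predict_proba(X_view[idx])
            top = np.argsort(proba.max(axis=1))[-per_round:]   # most confident picks
            picked = idx[top]
            labels = clf.classes_[proba[top].argmax(axis=1)]
            X1_lab = np.vstack([X1_lab, X1_unlab[picked]])
            X2_lab = np.vstack([X2_lab, X2_unlab[picked]])
            y_lab = np.concatenate([y_lab, labels])
            idx = np.setdiff1d(idx, picked)
    return c1, c2
```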
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhou, Guangyou and Zhao, Jun and Liu, Kang and Cai, Li
Experiments
Type D, C and S denote discriminative, combined and semi-supervised systems, respectively.
Experiments
We also compare our method with the semi-supervised approaches; those approaches achieved very high accuracies by incorporating large amounts of unlabeled data directly into the systems for joint learning and decoding, while in our method we only explore the N-gram features to further improve supervised dependency parsing performance.
Introduction
(2008) proposed a semi-supervised dependency parsing approach by introducing lexical intermediaries at a coarser level than words themselves via a clustering method.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Mukherjee, Arjun and Liu, Bing
Introduction
Semi-Supervised Modeling
Introduction
With seeds, our models are thus semi-supervised and need a different formulation.
Related Work
In (Lu and Zhai, 2008), a semi-supervised model was proposed.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Mukherjee, Arjun and Liu, Bing
Conclusion
A novel technique was also proposed to rank n-gram phrases where relevance based ranking was used in conjunction with a semi-supervised generative model.
Introduction
We employ a semi-supervised generative model called JTE-P to jointly model AD-expressions, pair interactions, and discussion topics simultaneously in a single framework.
Model
JTE-P is a semi-supervised generative model motivated by the joint occurrence of expression types (agreement and disagreement), topics in discussion posts, and user pairwise interactions.
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru
Introduction
Quite a lot of learning techniques, e.g., semi-supervised learning, self-training, and active learning, have been proposed.
Introduction
proposed a semi-supervised learning approach called the Graph Mincut algorithm which uses a small number of positive and negative examples and assigns values to unlabeled examples in a way that optimizes consistency in a nearest-neighbor sense (Blum et al., 2001).
Introduction
Like much previous work on semi-supervised ML, we apply SVM to the positive and unlabeled data, and add the classification results to the training data.
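One hedged reading of that procedure, as a sketch: treat the unlabeled pool as provisional negatives, train an SVM, and move the instances it classifies as positive into the positive set before retraining (the `pu_self_train` name, the LinearSVC choice, and the round count are illustrative, not the authors' exact setup; the unlabeled pool is assumed non-empty):

```python
import numpy as np
from sklearn.svm import LinearSVC

def pu_self_train(X_pos, X_unlab, rounds=3):
    """Positive-and-unlabeled self-training: unlabeled data starts as negatives."""
    X_pos, X_unlab = np.asarray(X_pos), np.asarray(X_unlab)
    clf = None
    for _ in range(rounds):
        X = np.vstack([X_pos, X_unlab])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_unlab))])
        clf = LinearSVC().fit(X, y)
        pred = clf.predict(X_unlab)
        if not (pred == 1).any():
            break
        # Add the instances classified as positive to the positive training data.
        X_pos = np.vstack([X_pos, X_unlab[pred == 1]])
        X_unlab = X_unlab[pred == 0]
        if len(X_unlab) == 0:
            break
    return clf
```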
semi-supervised is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: