Index of papers in Proc. ACL 2009 that mention
  • unlabeled data
Celikyilmaz, Asli and Thint, Marcus and Huang, Zhiheng
Abstract
We implement a semi-supervised learning (SSL) approach to demonstrate that utilization of more unlabeled data points can improve the answer-ranking task of QA.
Abstract
We create a graph for labeled and unlabeled data using match-scores of textual entailment features as similarity weights between data points.
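A minimal sketch of how such a similarity graph could be assembled; the k-nearest-neighbour cutoff, the `match_score` callable, and the cosine stand-in are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def build_similarity_graph(feature_vectors, match_score, k=10):
    """Build a k-nearest-neighbour similarity graph over labeled and unlabeled
    points, using a match-score function as the edge weights.

    feature_vectors : list of per-example textual-entailment feature vectors
    match_score     : callable(f_i, f_j) -> float, higher = more similar
                      (illustrative stand-in for the paper's entailment scores)
    """
    n = len(feature_vectors)
    W = np.zeros((n, n))
    for i in range(n):
        scores = [(match_score(feature_vectors[i], feature_vectors[j]), j)
                  for j in range(n) if j != i]
        for s, j in sorted(scores, reverse=True)[:k]:
            W[i, j] = W[j, i] = s   # symmetric weighted edge
    return W

def cosine_match(a, b):
    """One possible match score: cosine similarity of dense feature vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```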
Experiments
We show that as we increase the number of unlabeled data points, with our graph summarization it is feasible to extract information that can improve the performance of QA models.
Experiments
In the first part, we randomly selected subsets of the labeled training dataset, X_l ⊂ X_L, with different sample sizes {1% × n_L, 5% × n_L, 10% × n_L, 25% × n_L, 50% × n_L, 100% × n_L}, where n_L represents the sample size of X_L. At each random selection, the rest of the labeled dataset is hypothetically used as unlabeled data to verify the performance of our SSL using different sizes of labeled data.
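A sketch of the subsampling protocol described above, working over indices so the held-out labeled points can be treated as unlabeled; the fractions mirror the listed sample sizes, and the function name is illustrative.

```python
import random

def labeled_subsets(n_L, fractions=(0.01, 0.05, 0.10, 0.25, 0.50, 1.00), seed=0):
    """Yield (fraction, labeled_indices, held_out_indices) splits of a labeled
    set of size n_L; the held-out labeled points play the role of unlabeled
    data when evaluating the SSL model."""
    rng = random.Random(seed)
    indices = list(range(n_L))
    for frac in fractions:
        sample = set(rng.sample(indices, max(1, int(frac * n_L))))
        rest = [i for i in indices if i not in sample]
        yield frac, sorted(sample), rest
```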
Experiments
Table 3: The effect of the number of unlabeled data points on MRR for the hybrid graph-summarization SSL.
Graph Summarization
Using the graph-based SSL method on the new representative dataset, X' = X ∪ X_Te, which is comprised of the summarized dataset X = {x_i} as labeled data points, and the testing dataset X_Te as unlabeled data points.
Graph Summarization
Since we do not know the estimated local density constraints of the unlabeled data points, we use constants to construct the local density constraint column vector for the X' dataset as follows:
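The excerpt above cuts off before the formula. As background, a generic iterative label-propagation pass over X' = (summarized labeled points) ∪ (test points) could look like the sketch below; this is a standard formulation, not the paper's exact graph-summarization objective or its density-constraint construction.

```python
import numpy as np

def propagate_labels(W, y_labeled, n_labeled, iterations=100):
    """Simple iterative label propagation on a weighted graph W.

    W         : (n, n) symmetric similarity matrix over X' = labeled ∪ test points
    y_labeled : (n_labeled,) labels in {0, 1} for the summarized labeled points
    The first n_labeled rows/columns of W correspond to the labeled points;
    the remaining rows/columns are the unlabeled test points.
    """
    n = W.shape[0]
    D_inv = 1.0 / np.maximum(W.sum(axis=1), 1e-12)
    P = W * D_inv[:, None]            # row-normalised transition matrix
    f = np.zeros(n)
    f[:n_labeled] = y_labeled
    for _ in range(iterations):
        f = P @ f                     # diffuse label scores along graph edges
        f[:n_labeled] = y_labeled     # clamp the labeled points
    return f[n_labeled:]              # scores for the unlabeled test points
```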
Introduction
Recent research indicates that using labeled and unlabeled data in a semi-supervised learning (SSL) environment, with an emphasis on graph-based methods, can improve the performance of information extraction from data for tasks such as question classification (Tri et al., 2006), web classification (Liu et al., 2006), relation extraction (Chen et al., 2006), passage-retrieval (Otterbacher et al., 2009), various natural language processing tasks such as part-of-speech tagging and named-entity recognition (Suzuki and Isozaki, 2008), and word-sense disambiguation.
Introduction
We consider situations where there are much more unlabeled data, X_U, than labeled data, X_L, i.e., n_L << n_U.
unlabeled data is mentioned in 14 sentences in this paper.
Jeon, Je Hun and Liu, Yang
Abstract
We propose a confidence-based method to assign labels to unlabeled data and demonstrate improved results using this method compared to the widely used agreement-based method.
Abstract
In our experiments on the Boston University radio news corpus, using only a small amount of the labeled data as the initial training set, our proposed labeling method combined with most-confident sample selection can effectively use unlabeled data to improve performance and finally reach performance closer to that of the supervised method using all the training data.
Co-training strategy for prosodic event detection
Given a set L of labeled data and a set U of unlabeled data, the algorithm first creates a smaller pool U' containing u unlabeled data.
Co-training strategy for prosodic event detection
There are two issues: (1) the accurate self-labeling method for unlabeled data and (2) effective heuristics to select informative samples.
Co-training strategy for prosodic event detection
Given a set L of labeled training data and a set U of unlabeled data
Randomly select U' from U, |U'| = u
while iteration < k do
    Use L to train classifiers h1 and h2
    Apply h1 and h2 to assign labels for all examples in U'
    Select n self-labeled samples and add to L
    Remove these n samples from U
    Recreate U' by choosing u instances randomly from U
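A Python rendering of the reconstructed co-training pseudocode above; the classifier-training and sample-selection callables are placeholders for the two feature views and the confidence- or agreement-based selection discussed in the paper.

```python
import random

def co_train(L, U, train1, train2, select_n, u=200, k=10, seed=0):
    """Co-training sketch following the pseudocode above.

    L, U      : lists of labeled / unlabeled examples
    train1/2  : callables fitting a classifier on L over one feature view each,
                returning an object with .predict(example) -> label
    select_n  : callable choosing n self-labeled (example, label) pairs to add
                (e.g. most-confident or agreement-based selection)
    """
    rng = random.Random(seed)
    U = list(U)
    U_pool = rng.sample(U, min(u, len(U)))
    for _ in range(k):
        h1, h2 = train1(L), train2(L)
        self_labeled = [(x, h1.predict(x), h2.predict(x)) for x in U_pool]
        chosen = select_n(self_labeled)                 # [(example, label), ...]
        L.extend(chosen)                                # grow the labeled set
        chosen_ids = {id(x) for x, _ in chosen}
        U = [x for x in U if id(x) not in chosen_ids]   # remove from U
        U_pool = rng.sample(U, min(u, len(U)))          # refill the pool
    return L
```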
Conclusions
We introduced a confidence-based method to assign possible labels to unlabeled data and evaluated the performance combined with informative sample selection methods.
Conclusions
This suggests that the use of unlabeled data can lead to significant improvement for prosodic event detection.
Experiments and results
Our goal is to determine whether the co-training algorithm described above could successfully use the unlabeled data for prosodic event detection.
Introduction
We propose a confidence-based method to assign labels to unlabeled data in training iterations and evaluate its performance combined with different informative sample selection methods.
Introduction
Our experiments on the Boston Radio News corpus show that the use of unlabeled data can lead to significant improvement of prosodic event detection compared to using the original small training set, and that the semi-supervised learning result is comparable with supervised learning with a similar amount of training data.
unlabeled data is mentioned in 10 sentences in this paper.
Haffari, Gholamreza and Sarkar, Anoop
AL-SMT: Multilingual Setting
(Ueffing et al., 2007; Haffari et al., 2009) show that treating U+ as a source for a new feature function in a log-linear model for SMT (Och and Ney, 2004) allows us to maximally take advantage of unlabeled data by finding a weight for this feature using minimum error-rate training (MERT) (Och, 2003).
Introduction
In self-training each MT system is retrained using human labeled data plus its own noisy translation output on the unlabeled data.
Related Work
(Reichart et al., 2008) introduce multitask active learning, where unlabeled data require annotations for multiple tasks; e.g., they consider named entities and parse trees, and show that multiple tasks help selection compared to individual tasks.
Sentence Selection: Multiple Language Pairs
Using this method, we rank the entries in the unlabeled data U for each translation task defined by the language pair (F_d, E). This results in several ranking lists, each of which represents the importance of entries with respect to a particular translation task.
Sentence Selection: Single Language Pair
The more frequent a phrase (not a phrase pair) is in the unlabeled data, the more important it is to know its translation, since it is more likely to see it in the test data (especially when the test data is in-domain with respect to the unlabeled data).
Sentence Selection: Single Language Pair
In the labeled data L, phrases are the ones which are extracted by the SMT models; but what are the candidate phrases in the unlabeled data U?
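One simple reading of the selection criterion above: treat short n-grams as candidate phrases in U and rank unlabeled sentences by the corpus frequency of the phrases they contain. The n-gram notion of a candidate phrase is an assumption for illustration, not necessarily the paper's answer to the question above.

```python
from collections import Counter

def candidate_phrases(sentence, max_len=3):
    """All n-grams up to max_len words: one simple notion of candidate phrase."""
    words = sentence.split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(words) - n + 1)]

def rank_unlabeled_sentences(unlabeled_sentences, max_len=3):
    """Rank unlabeled sentences by the total corpus frequency of the candidate
    phrases they contain: frequent, untranslated phrases score highest."""
    phrase_counts = Counter()
    for s in unlabeled_sentences:
        phrase_counts.update(candidate_phrases(s, max_len))
    def score(s):
        return sum(phrase_counts[p] for p in candidate_phrases(s, max_len))
    return sorted(unlabeled_sentences, key=score, reverse=True)
```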
unlabeled data is mentioned in 9 sentences in this paper.
Dasgupta, Sajib and Ng, Vincent
Evaluation
We employ as our second baseline a transductive SVM trained using 100 points randomly sampled from the training folds as labeled data and the remaining 1900 points as unlabeled data.
Evaluation
This could be attributed to (1) the unlabeled data, which may have provided the transductive learner with useful information that is not accessible to the other learners, and (2) the ensemble, which is more tolerant of noise from the imperfect seeds.
Evaluation
Specifically, we used the 500 seeds to guide the selection of active learning points, but trained a transductive SVM using only the active learning points as labeled data (and the rest as unlabeled data).
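To illustrate the labeled/unlabeled split in this baseline: scikit-learn ships no transductive SVM, so the sketch below uses a self-training SVM as a rough stand-in; the function name and parameter choices are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

def semi_supervised_baseline(X, y, labeled_idx):
    """Mimic the 100-labeled / 1900-unlabeled split: reveal labels only at
    labeled_idx and let a self-training SVM label everything else.

    X : (n, d) feature matrix for all points; y : (n,) true labels,
    consulted only at the labeled indices.
    """
    y_train = np.full(len(y), -1)            # -1 marks unlabeled points
    y_train[labeled_idx] = y[labeled_idx]
    model = SelfTrainingClassifier(SVC(probability=True, kernel="linear"),
                                   threshold=0.8)
    model.fit(X, y_train)
    return model.predict(X)                  # inferred labels for every point
```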
Our Approach
Given that we now have a labeled set (composed of 100 manually labeled points selected by active learning and 500 unambiguous points) as well as a larger set of points that are yet to be labeled (i.e., the remaining unlabeled points in the training folds and those in the test fold), we aim to train a better classifier by using a weakly supervised learner to learn from both the labeled and unlabeled data.
Our Approach
points in L_j, where i ≠ j) as unlabeled data.
Our Approach
Since the points in the test fold are included in the unlabeled data, they are all classified in this step.
unlabeled data is mentioned in 7 sentences in this paper.
Lin, Dekang and Wu, Xiaoyun
Discussion and Related Work
In earlier work on semi-supervised learning, e.g., (Blum and Mitchell, 1998), the classifiers learned from unlabeled data were used directly.
Discussion and Related Work
Recent research shows that it is better to use whatever is learned from the unlabeled data as features in a discriminative classifier.
Discussion and Related Work
Wong and Ng (2007) and Suzuki and Isozaki (2008) are similar in that they run a baseline discriminative classifier on unlabeled data to generate pseudo examples, which are then used to train a different type of classifier for the same problem.
Introduction
two-stage strategy: first create word clusters with unlabeled data and then use the clusters as features in supervised training.
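A toy version of the two-stage strategy above, with k-means over word co-occurrence vectors standing in for the paper's large-scale distributed clustering; all names and sizes are illustrative assumptions.

```python
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def word_clusters(sentences, n_clusters=50, vocab_size=5000):
    """Stage 1: cluster words by their sentence co-occurrence vectors.
    sentences is a list of token lists; k-means is only a stand-in for the
    paper's clustering of a much larger unlabeled corpus."""
    counts = Counter(w for s in sentences for w in s)
    vocab = [w for w, _ in counts.most_common(vocab_size)]
    index = {w: i for i, w in enumerate(vocab)}
    cooc = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        ids = [index[w] for w in s if w in index]
        for i in ids:
            for j in ids:
                if i != j:
                    cooc[i, j] += 1
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(cooc)
    return {w: int(c) for w, c in zip(vocab, labels)}

def cluster_features(sentence, clusters):
    """Stage 2: add each token's cluster id as a feature for the supervised
    learner (unknown words fall into a catch-all cluster)."""
    return [f"cluster={clusters.get(w, -1)}" for w in sentence]
```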
unlabeled data is mentioned in 7 sentences in this paper.
Niu, Zheng-Yu and Wang, Haifeng and Wu, Hua
Abstract
Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing, and the use of unlabeled data by self-training further increases the parsing f-score to 85.2%, resulting in a 6% error reduction over the previous best result.
Experiments of Parsing
4.3 Using Unlabeled Data for Parsing
Experiments of Parsing
Recent studies on parsing indicate that the use of unlabeled data by self-training can help parsing on the WSJ data, even when labeled data is relatively large (McClosky et al., 2006a; Reichart and Rappoport, 2007).
Experiments of Parsing
2000) (PDC) as unlabeled data for parsing.
unlabeled data is mentioned in 6 sentences in this paper.
Huang, Fei and Yates, Alexander
Experiments
No peak in performance is reached, so further improvements are possible with more unlabeled data.
Experiments
Thus smoothing is optimizing performance for the case where unlabeled data is plentiful and labeled data is scarce, as we would hope.
Related Work
Several researchers have previously studied methods for using unlabeled data for tagging and chunking, either alone or as a supplement to labeled data.
Related Work
Unlike these systems, our efforts are aimed at using unlabeled data to find distributional representations that work well on rare terms, making the supervised systems more applicable to other domains and decreasing their sample complexity.
Smoothing Natural Language Sequences
Using Expectation-Maximization (Dempster et al., 1977), it is possible to estimate the distributions P(x_i | y_i) and P(y_i | y_{i-1}) from unlabeled data.
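For context, these are the standard Baum-Welch (EM) re-estimation formulas for an HMM's emission and transition distributions over unlabeled sequences; the notation here is generic and may differ from the paper's.

```latex
% E-step: posterior state occupancies from forward-backward on unlabeled text
\gamma_t(y) = P(y_t = y \mid x_{1:T}), \qquad
\xi_t(y', y) = P(y_{t-1} = y',\, y_t = y \mid x_{1:T})

% M-step: re-estimate the emission and transition distributions
P(x \mid y) = \frac{\sum_t \gamma_t(y)\, \mathbb{1}[x_t = x]}{\sum_t \gamma_t(y)},
\qquad
P(y \mid y') = \frac{\sum_t \xi_t(y', y)}{\sum_t \gamma_{t-1}(y')}
```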
unlabeled data is mentioned in 5 sentences in this paper.
Li, Tao and Zhang, Yi and Sindhwani, Vikas
Abstract
We propose a novel approach to learn from lexical prior knowledge in the form of domain-independent sentiment-laden terms, in conjunction with domain-dependent unlabeled data and a few labeled documents.
Conclusion
To more effectively utilize unlabeled data and induce domain-specific adaptation of our models, several extensions are possible: facilitating learning from related domains, incorporating hyperlinks between documents, incorporating synonyms or co-occurrences between words, etc.
Incorporating Lexical Knowledge
It should be noted that this list was constructed without a specific domain in mind, which is further motivation for using training examples and unlabeled data to learn domain-specific connotations.
Related Work
The goal of the former theme is to learn from few labeled examples by making use of unlabeled data, while the goal of the latter theme is to utilize weak prior knowledge about term-class affinities (e.g., the term “awful” indicates negative sentiment and therefore may be considered as a negatively labeled feature).
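One simple way to combine the two themes, shown only as an illustration (the paper's actual model is different): inject lexicon terms as pseudo-labeled one-word documents next to the few labeled documents, then run an off-the-shelf graph-based semi-supervised learner over everything.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelSpreading

def lexicon_seeded_ssl(labeled_docs, labeled_y, unlabeled_docs,
                       pos_terms, neg_terms):
    """Combine a domain-independent sentiment lexicon with a few labeled and
    many unlabeled documents. Each lexicon term becomes a tiny pseudo-labeled
    'document'; LabelSpreading then propagates labels to the unlabeled docs."""
    docs = list(labeled_docs) + list(pos_terms) + list(neg_terms) + list(unlabeled_docs)
    y = (list(labeled_y)
         + [1] * len(pos_terms) + [0] * len(neg_terms)
         + [-1] * len(unlabeled_docs))            # -1 marks unlabeled documents
    X = TfidfVectorizer().fit_transform(docs).toarray()
    model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, np.array(y))
    return model.transduction_[-len(unlabeled_docs):]   # predicted labels
```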
unlabeled data is mentioned in 4 sentences in this paper.
Persing, Isaac and Ng, Vincent
Abstract
To improve the performance of a cause identification system for the minority classes, we present a bootstrapping algorithm that automatically augments a training set by learning from a small amount of labeled data and a large amount of unlabeled data.
Evaluation
To get a sense of the accuracy of the bootstrapped documents without further manual labeling, recall that our experimental setup resembles a transductive setting where the test documents are part of the unlabeled data, and consequently, some of them may have been automatically labeled by the bootstrapping algorithm.
Our Bootstrapping Algorithm
To alleviate the data scarcity problem and improve the accuracy of the classifiers, we propose in this section a bootstrapping algorithm that automatically augments a training set by exploiting a large amount of unlabeled data.
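A sketch of a generic bootstrapping loop of this kind: repeatedly add unlabeled documents that the current classifier assigns to a minority class with high confidence. The `fit` interface and `predict_proba_one` method are assumed for illustration and are not part of the paper.

```python
def bootstrap_minority(train_docs, train_labels, unlabeled_docs,
                       fit, minority_label, threshold=0.9, rounds=5):
    """Grow the training set with confidently self-labeled minority-class docs.

    fit : callable(docs, labels) -> model with .predict_proba_one(doc) -> dict
          mapping label -> probability (an assumed interface, not a real API).
    """
    unlabeled = list(unlabeled_docs)
    for _ in range(rounds):
        model = fit(train_docs, train_labels)
        added, remaining = [], []
        for doc in unlabeled:
            probs = model.predict_proba_one(doc)
            if probs.get(minority_label, 0.0) >= threshold:
                added.append(doc)
            else:
                remaining.append(doc)
        if not added:
            break                         # nothing confident left to add
        train_docs.extend(added)
        train_labels.extend([minority_label] * len(added))
        unlabeled = remaining
    return train_docs, train_labels
```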
Related Work
Minority classes can be expanded without the availability of unlabeled data as well.
unlabeled data is mentioned in 4 sentences in this paper.
Zhao, Hai and Song, Yan and Kit, Chunyu and Zhou, Guodong
The Related Work
Typical domain adaptation tasks often assume that annotated data in the new domain are absent or insufficient and that large-scale unlabeled data are available.
The Related Work
Where unlabeled data are concerned, semi-supervised or unsupervised methods are naturally adopted.
The Related Work
The first usually focuses on exploiting automatically generated labeled data from the unlabeled data (Steedman et al., 2003; McClosky et al., 2006; Reichart and Rappoport, 2007; Sagae and Tsujii, 2007; Chen et al., 2008); the second focuses on combining supervised and unsupervised methods, where only unlabeled data are considered (Smith and Eisner, 2006; Wang and Schuurmans, 2008; Koo et al., 2008).
unlabeled data is mentioned in 4 sentences in this paper.