Index of papers in Proc. ACL 2009 that mention
  • text classification
Persing, Isaac and Ng, Vincent
Baseline Approaches
The SVM learning algorithm as implemented in the LIBSVM software package (Chang and Lin, 2001) is used for classifier training, owing to its robust performance on many text classification tasks.
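As a rough illustration of this setup, the hedged sketch below trains a linear SVM over bag-of-words features; it uses scikit-learn's SVC (which wraps libsvm) rather than the LIBSVM package itself, and the documents and labels are hypothetical stand-ins for the labeled ASRS reports.

```python
# Hedged sketch only: approximates SVM training on text, not the authors' exact setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC  # scikit-learn's SVC wraps libsvm internally

# Hypothetical toy documents and labels standing in for labeled reports.
train_docs = ["engine failure during climb", "runway incursion at the taxiway"]
train_labels = ["mechanical", "procedural"]

clf = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
clf.fit(train_docs, train_labels)
print(clf.predict(["hydraulic failure on approach"]))
```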
Dataset
Unlike the newswire articles at which many topic-based text classification tasks are targeted, the ASRS reports are informally written using various domain-specific abbreviations and acronyms, tend to contain poor grammar, and have capitalization information removed, as illustrated in the following sentence taken from one of the reports.
Introduction
Automatic text classification is one of the most important applications in natural language processing (NLP).
Introduction
The difficulty of a text classification task depends on various factors, but typically, the task can be difficult if (1) the amount of labeled data available for learning the task is small; (2) it involves multiple classes; (3) it involves multi-label categorization, where more than one label can be assigned to each document; (4) the class distributions are skewed, with some categories significantly outnumbering the others; and (5) the documents belong to the same domain (e.g., movie review classification).
Introduction
In this paper, we introduce a new text classification problem involving the Aviation Safety Reporting System (ASRS) that can be viewed as a difficult task along each of the five dimensions discussed above.
Related Work
Since we recast cause identification as a text classification task and proposed a bootstrapping approach that aims to improve minority class prediction, the work most related to ours involves one or both of these topics.
Related Work
(2007) address the problem of class skewness in text classification.
Related Work
Similar bootstrapping methods are applicable outside text classification as well.
text classification is mentioned in 12 sentences in this paper.
Wan, Xiaojun
Related Work 2.1 Sentiment Classification
2.2 Cross-Domain Text Classification
Related Work 2.1 Sentiment Classification
Cross-domain text classification can be considered a more general task than cross-lingual sentiment classification.
Related Work 2.1 Sentiment Classification
In the problem of cross-domain text classification, the labeled and unlabeled data come from different domains, and their underlying distributions are often different from each other, which violates the basic assumption of traditional classification learning.
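To make the setting concrete, the hedged sketch below trains a classifier on labeled documents from one domain and applies it to documents from another, so the training and test distributions differ; all documents, domain names, and labels here are hypothetical illustrations, not material from the paper.

```python
# Hedged illustration of the cross-domain setting: train on a source domain,
# predict on a target domain with a different word distribution.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical source-domain (e.g., product review) training data.
source_docs = ["the battery lasts long", "screen cracked after a week"]
source_labels = ["pos", "neg"]

# Hypothetical target-domain (e.g., movie review) test data.
target_docs = ["the plot was gripping", "dull characters and a weak ending"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(source_docs, source_labels)

# Accuracy typically degrades here because the i.i.d. assumption no longer holds.
print(model.predict(target_docs))
```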
The Co-Training Approach
Typical text classifiers include Support Vector Machine (SVM), Naïve Bayes (NB), Maximum Entropy (ME), K-Nearest Neighbor (KNN), etc.
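One way to instantiate those classifier families is shown in the hedged scikit-learn sketch below (Maximum Entropy is approximated by logistic regression); the toy documents and labels are hypothetical and not taken from the paper's data.

```python
# Hedged sketch: standard text-classifier families over bag-of-words features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression   # ~ Maximum Entropy (ME)
from sklearn.naive_bayes import MultinomialNB         # Naive Bayes (NB)
from sklearn.neighbors import KNeighborsClassifier    # K-Nearest Neighbor (KNN)
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC                     # Support Vector Machine (SVM)

# Hypothetical toy sentiment data, purely for illustration.
docs = ["great plot and acting", "terrible pacing and script",
        "wonderful cinematography", "boring and predictable"]
labels = ["pos", "neg", "pos", "neg"]

classifiers = {
    "SVM": LinearSVC(),
    "NB": MultinomialNB(),
    "ME": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=3),
}

for name, base in classifiers.items():
    model = make_pipeline(CountVectorizer(), base)
    model.fit(docs, labels)
    print(name, model.predict(["awful acting"]))
```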
text classification is mentioned in 7 sentences in this paper.
Dasgupta, Sajib and Ng, Vincent
Conclusions
This makes it applicable to other text classification tasks.
Introduction
Unlike topic-based text classification, where a high accuracy can be achieved even for datasets with a large number of classes (e.g., 20 Newsgroups), polarity classification appears to be a more difficult task.
Introduction
One reason topic-based text classification is easier than polarity classification is that topic clusters are typically well-separated from each other, because word usage differs considerably between two topically different documents.
text classification is mentioned in 3 sentences in this paper.