Index of papers in Proc. ACL that mention

text classification

Seen in text as:

text classification (113)
text classifiers (4)
Text classification (4)
Text Classification (4)
text classifier (3)

Seen in 127 sentences in 16 papers.

1. Semi-Supervised Cause Identification from Aviation Safety Reports

Persing, Isaac and Ng, Vincent

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Baseline Approaches	The SVM learning algorithm as implemented in the LIB SVM software package (Chang and Lin, 2001) is used for classifier training, owing to its robust performance on many text classification tasks.
Dataset	Unlike newswire articles, at which many topic-based text classification tasks are targeted, the ASRS reports are informally written using various domain-specific abbreviations and acronyms, tend to contain poor grammar, and have capitalization information removed, as illustrated in the following sentence taken from one of the reports.
Introduction	Automatic text classification is one of the most important applications in natural language processing (NLP).
Introduction	The difficulty of a text classification task depends on various factors, but typically, the task can be difficult if (1) the amount of labeled data available for learning the task is small; (2) it involves multiple classes; (3) it involves multi-label categorization, where more than one label can be assigned to each document; (4) the class distributions are skewed, with some categories significantly outnumbering the others; and (5) the documents belong to the same domain (e. g., movie review classification).
Introduction	In this paper, we introduce a new text classification problem involving the Aviation Safety Reporting System (ASRS) that can be viewed as a difficult task along each of the five dimensions discussed above.
Related Work	Since we recast cause identification as a text classification task and proposed a bootstrapping approach that targets at improving minority class prediction, the work most related to ours involves one or both of these topics.
Related Work	(2007) address the problem of class skewness in text classification .
Related Work	Similar bootstrapping methods are applicable outside text classification as well.

text classification is mentioned in 12 sentences in this paper.

Topics mentioned in this paper:

2. Cross-Language Text Classification Using Structural Correspondence Learning

Prettenhofer, Peter and Stein, Benno

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	We present a new approach to cross-language text classification that builds on structural correspondence learning, a recently proposed theory for domain adaptation.
Introduction	This paper deals with cross-language text classification problems.
Introduction	Stated precisely: We are given a text classification task 7 in a target language ’2' for which no labeled documents are available.
Introduction	Such type of cross-language text classification problems are addressed by constructing a classifier f5 with training documents written in S and by applying f3 to unlabeled documents written in ’2'.
Related Work	Cross-Language Text Classification Bel et al.
Related Work	(2003) belong to the first who explicitly considered the problem of cross-language text classification .

text classification is mentioned in 25 sentences in this paper.

Topics mentioned in this paper:

3. Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

Ogura, Yukari and Kobayashi, Ichiro

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	In this paper, we propose a method to raise the accuracy of text classification based on latent topics, reconsidering the techniques necessary for good classification — for example, to decide important sentences in a document, the sentences with important words are usually regarded as important sentences.
Introduction	Text classification is an essential issue in the field of natural language processing and many techniques using latent topics have so far been proposed and used under many purposes.
Introduction	In this paper, we aim to raise the accuracy of text classification using latent information by reconsidering elemental techniques necessary for good classification in the following three points: 1) important words extraction
Introduction	— to decide important words in documents is a crucial issue for text classification , tfidf is often used to decide them.
Related studies	Many studies have proposed to improve the accuracy of text classification .
Related studies	ment for text classification , there are many studies which use the PageRank algorithm.
Related studies	(2005) have introduced association rule mining to decide important words for text classification .

text classification is mentioned in 12 sentences in this paper.

Topics mentioned in this paper:

4. Sprinkling Topics for Weakly Supervised Text Classification

Hingmire, Swapnil and Chakraborti, Sutanu

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Supervised text classification algorithms require a large number of documents labeled by humans, that involve a labor-intensive and time consuming process.
Abstract	We evaluate this approach to improve performance of text classification on three real world datasets.
Introduction	In supervised text classification learning algorithms, the learner (a program) takes human labeled documents as input and learns a decision function that can classify a previously unseen document to one of the predefined classes.
Introduction	In this paper, we propose a text classification algorithm based on Latent Dirichlet Allocation (LDA) (Blei et al., 2003) which does not need labeled documents.
Introduction	(Blei et al., 2003) used LDA topics as features in text classification , but they use labeled documents while learning a classifier.
Related Work	Several researchers have proposed semi-supervised text classification algorithms with the aim of reducing the time, effort and cost involved in labeling documents.
Related Work	Semi-supervised text classification algorithms proposed in (Nigam et al., 2000), (J oachims, 1999), (Zhu and Ghahra—mani, 2002) and (Blum and Mitchell, 1998) are a few examples of this type.
Related Work	Also a human annotator may discard or mislabel a polysemous word, which may affect the performance of a text classifier .

text classification is mentioned in 19 sentences in this paper.

Topics mentioned in this paper:

5. Extracting Social Power Relationships from Natural Language

Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	This paper explores a text classification problem we will call lect modeling, an example of what has been termed computational sociolinguistics.
Abstract	Our results validate the treatment of lect modeling as a text classification problem — albeit a hard one — and constitute a case for future research in computational sociolinguistics.
Abstract	Given, then, that there are distinct differences among what we term UpSpeak and DownSpeak, we treat Social Power Modeling as an instance of text classification (or categorization): we seek to assign a class (UpSpeak or DownSpeak) to a text sample.

text classification is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

6. Co-Training for Cross-Lingual Sentiment Classification

Wan, Xiaojun

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Related Work 2.1 Sentiment Classification	2.2 Cross-Domain Text Classification
Related Work 2.1 Sentiment Classification	Cross-domain text classification can be considered as a more general task than cross-lingual sentiment classification.
Related Work 2.1 Sentiment Classification	In the problem of cross-domain text classification , the labeled and unlabeled data come from different domains, and their underlying distributions are often different from each other, which violates the basic assumption of traditional classification learning.
The Co-Training Approach	Typical text classifiers include Support Vector Machine (SVM), Na'1've Bayes (NB), Maximum Entropy (ME), K-Nearest Neighbor (KNN), etc.

text classification is mentioned in 7 sentences in this paper.

Topics mentioned in this paper:

7. Scaling Semi-supervised Naive Bayes with Feature Marginals

Lucas, Michael and Downey, Doug

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	SSL techniques are often effective in text classification , where labeled data is scarce but large unlabeled corpora are readily available.
Abstract	In this paper, we show that improving marginal word frequency estimates using unlabeled data can enable semi-supervised text classification that scales to massive unlabeled data sets.
Introduction	This is problematic for text classification over large unlabeled corpora like the Web: new target concepts (new tasks and new topics of interest) arise frequently, and performing even a single pass over a large corpus for each new target concept is intractable.
Introduction	In this paper, we present a new SSL text classification approach that scales to large corpora.
Problem Definition	In the text classification setting , each feature value wd represents count of observations of word 21) in document d. MNB makes the simplifying assumption that word occurrences are conditionally independent of each other given the class (+ or —) of the example.
Problem Definition	We evaluate on two text classification tasks: topic classification, and sentiment detection.
Problem Definition	Our experiments demonstrate that MNB-FM outperforms previous approaches across multiple text classification techniques including topic classification and sentiment analysis.

text classification is mentioned in 7 sentences in this paper.

Topics mentioned in this paper:

8. Text Classification from Positive and Unlabeled Data using Misclassified Data Correction

Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	This paper addresses the problem of dealing with a collection of labeled training documents, especially annotating negative training documents and presents a method of text classification from positive and unlabeled data.
Conclusion	The research described in this paper involved text classification using positive and unlabeled data.
Experiments	The remaining data consisting 607,259 from 20 Nov 1996 to 19 Aug 1997 is used as a test data for text classification .
Experiments	3.2 Text classification
Framework of the System	Thus, if some training document reduces the overall performance of text classification because of an outlier, we can assume that the document is a SV.
Introduction	Text classification using machine learning (ML) techniques with a small number of labeled data has become more important with the rapid increase in volume of online documents.

text classification is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

SVM (23)
unlabeled data (10)
F-score (6)

9. Topic Modeling Based Classification of Clinical Reports

Sarioglu, Efsun and Yadav, Kabir and Choi, Hyeong-Ah

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	In addition to regular text classification , we utilized topic modeling of the entire dataset in various ways.
Abstract	Our proposed topic based classifier system is shown to be competitive with existing text classification techniques and provides a more efficient and interpretable representation.
Background	2.3 Text Classification
Background	Text classification is a supervised learning algorithm where documents’ categories are learned from pre-labeled set of documents.
Experiments	SVM was chosen as the classification algorithm as it was shown that it performs well in text classification tasks (J oachims, 1998; Yang and Liu, 1999) and it is robust to overfitting (Sebastiani, 2002).
Related Work	For text classification , topic modeling techniques have been utilized in various ways.

text classification is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

topic model (23)
SVM (14)
LDA (6)

10. The Impact of Topic Bias on Quality Flaw Prediction in Wikipedia

Ferschke, Oliver and Gurevych, Iryna and Rittberger, Marc

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusions	We showed that text classification based on Wikipedia cleanup templates is prone to a topic bias which causes skewed classifiers and overly optimistic cross-validated evaluation results.
Conclusions	This bias is known from other text classification applications, such as authorship attribution, genre detection and native language detection.
Introduction	However, quality flaw detection based on cleanup template recognition suffers from a topic bias that is well known from other text classification applications such as authorship attribution or genre identification.
Related Work	Topic bias is a known problem in text classification .

text classification is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

11. Text-Driven Toponym Resolution using Indirect Supervision

Speriosu, Michael and Baldridge, Jason

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	We exploit document-level geotags to indirectly generate training instances for text classifiers for toponym resolution, and show that textual cues can be straightforwardly integrated with other commonly used ones.
Introduction	Essentially, we learn a text classifier per toponym.
Introduction	Our results show these text classifiers are far more accurate than algorithms based on spatial proximity or metadata.
Toponym Resolvers	It learns text classifiers based on local context window features trained on instances automatically extracted from GEOWIKI.

text classification is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

12. Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification

Dasgupta, Sajib and Ng, Vincent

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusions	This makes it applicable to other text classification tasks.
Introduction	Unlike topic-based text classification , where a high accuracy can be achieved even for datasets with a large number of classes (e.g., 20 Newsgroups), polarity classification appears to be a more difficult task.
Introduction	One reason topic-based text classification is easier than polarity classification is that topic clusters are typically well-separated from each other, resulting from the fact that word usage differs considerably between two topically-different documents.

text classification is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

13. A New Dataset and Method for Automatically Grading ESOL Texts

Yannakoudakis, Helen and Briscoe, Ted and Medlock, Ben

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	Implicitly or explicitly, previous work has mostly treated automated assessment as a supervised text classification task, where training texts are labelled with a grade and unlabelled test texts are fitted to the same grade point scale via a regression step applied to the classifier output (see Section 6 for more details).
Introduction	Discriminative classification techniques often outperform non-discriminative ones in the context of text classification (J oachims, 1998).
Previous work	This system shows that treating AA as a text classification problem is viable, but the feature types are all fairly shallow, and the approach doesn’t make efficient use of the training data as a separate classifier is trained for each grade point.

text classification is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

14. A Novel Classifier Based on Quantum Computation

Liu, Ding and Yang, Xiaofang and Jiang, Minghu

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Basic principle of quantum classifier	Specifically, in our experiment, we assigned the term frequency, a feature frequently used in text classification to rn , and treated the phase 0" as a constant, since we found the phase makes little contribution to the classification.
Discussion	We present here our model of text classification and compare it with SVM and KNN on two datasets.
Discussion	Moreover, the QC performs well in text classification compared with SVM and KNN and outperforms them on small-scale training sets.

text classification is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

15. Bridging Languages through Etymology: The case of cross language text categorization

Nastase, Vivi and Strapparava, Carlo

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	The motivation for this work was to test the hypothesis that information about word etymology is useful for computational approaches to language, in particular for text classification .
Conclusion	Cross-language text classification can be used to build comparable corpora in different languages, using a single language starting point, preferably one with more resources, that can thus spill over to other languages.
Cross Language Text Categorization	Text categorization (also text classification ), “the task of automatically sorting a set of documents into categories (or classes or topics) from a predefined set” (Sebastiani, 2005), allows for the quick selection of documents from the same domain, or the same topic.

text classification is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

16. Applying Grammar Induction to Text Mining

Salway, Andrew and Touileb, Samia

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Closing Remarks	The H-groups shown in Table 1 provide richer semantic descriptions of the domain than keywords do, and we noted potential applications for high-level summarization of a whole corpus, the creation of information extraction templates and finer- grained text classification and retrieval.
Implementation	For broad topics it is desirable to perform f1ner- grained text classification and retrieval.
Implementation	The alternation in V-groups contained by H-groups may reflect different beliefs and opinions which could be used for text classification and opinion mining.

text classification is mentioned in 3 sentences in this paper.

Topics mentioned in this paper: