Index of papers in Proc. ACL 2013 that mention
  • text classification
Ogura, Yukari and Kobayashi, Ichiro
Abstract
In this paper, we propose a method to raise the accuracy of text classification based on latent topics, reconsidering the techniques necessary for good classification — for example, to decide important sentences in a document, the sentences with important words are usually regarded as important sentences.
Introduction
Text classification is an essential issue in the field of natural language processing, and many techniques using latent topics have so far been proposed and used for many purposes.
Introduction
In this paper, we aim to raise the accuracy of text classification using latent information by reconsidering elemental techniques necessary for good classification in the following three points: 1) important words extraction
Introduction
Deciding important words in documents is a crucial issue for text classification; tfidf is often used to decide them.
Related studies
Many studies have proposed to improve the accuracy of text classification.
Related studies
ment for text classification, there are many studies which use the PageRank algorithm.
Related studies
(2005) have introduced association rule mining to decide important words for text classification.
text classification is mentioned in 12 sentences in this paper.
Lucas, Michael and Downey, Doug
Abstract
SSL techniques are often effective in text classification, where labeled data is scarce but large unlabeled corpora are readily available.
Abstract
In this paper, we show that improving marginal word frequency estimates using unlabeled data can enable semi-supervised text classification that scales to massive unlabeled data sets.
Introduction
This is problematic for text classification over large unlabeled corpora like the Web: new target concepts (new tasks and new topics of interest) arise frequently, and performing even a single pass over a large corpus for each new target concept is intractable.
Introduction
In this paper, we present a new SSL text classification approach that scales to large corpora.
Problem Definition
In the text classification setting, each feature value w_d represents the count of observations of word w in document d. MNB makes the simplifying assumption that word occurrences are conditionally independent of each other given the class (+ or -) of the example.
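The conditional independence assumption in the excerpt above can be illustrated with a minimal sketch of plain Multinomial Naive Bayes. This is not the MNB-FM method of the paper; the function names `train_mnb` and `predict_mnb`, the Laplace smoothing parameter `alpha`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_mnb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word probabilities per class.

    docs is a list of token lists; alpha is an assumed Laplace smoothing constant.
    """
    vocab = {w for doc in docs for w in doc}
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(labels)):
        class_docs = [doc for doc, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for doc in class_docs for w in doc)
        total = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((counts[w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood


def predict_mnb(doc, log_prior, log_likelihood):
    """Return the class with the highest log posterior score for a token list."""
    word_counts = Counter(doc)  # count of each word w in document d
    scores = {}
    for c in log_prior:
        # Conditional independence given the class: per-word log
        # probabilities simply add up, weighted by the word counts.
        scores[c] = log_prior[c] + sum(
            n * log_likelihood[c][w]
            for w, n in word_counts.items()
            if w in log_likelihood[c]  # skip words unseen in training
        )
    return max(scores, key=scores.get)


# Hypothetical toy data with the two classes "+" and "-".
docs = [
    ["good", "great", "fun"],
    ["great", "good"],
    ["bad", "boring"],
    ["boring", "bad", "bad"],
]
labels = ["+", "+", "-", "-"]
log_prior, log_likelihood = train_mnb(docs, labels)
print(predict_mnb(["good", "fun"], log_prior, log_likelihood))
```

Working in log space keeps the product of many small word probabilities from underflowing, which is why the scores are sums rather than products.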
Problem Definition
We evaluate on two text classification tasks: topic classification, and sentiment detection.
Problem Definition
Our experiments demonstrate that MNB-FM outperforms previous approaches across multiple text classification techniques including topic classification and sentiment analysis.
text classification is mentioned in 7 sentences in this paper.
Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru
Abstract
This paper addresses the problem of dealing with a collection of labeled training documents, especially annotating negative training documents and presents a method of text classification from positive and unlabeled data.
Conclusion
The research described in this paper involved text classification using positive and unlabeled data.
Experiments
The remaining data, consisting of 607,259 from 20 Nov 1996 to 19 Aug 1997, is used as test data for text classification.
Experiments
3.2 Text classification
Framework of the System
Thus, if some training document reduces the overall performance of text classification because of an outlier, we can assume that the document is a SV.
Introduction
Text classification using machine learning (ML) techniques with a small number of labeled data has become more important with the rapid increase in volume of online documents.
text classification is mentioned in 6 sentences in this paper.
Sarioglu, Efsun and Yadav, Kabir and Choi, Hyeong-Ah
Abstract
In addition to regular text classification, we utilized topic modeling of the entire dataset in various ways.
Abstract
Our proposed topic based classifier system is shown to be competitive with existing text classification techniques and provides a more efficient and interpretable representation.
Background
2.3 Text Classification
Background
Text classification is a supervised learning algorithm where documents' categories are learned from a pre-labeled set of documents.
Experiments
SVM was chosen as the classification algorithm as it was shown that it performs well in text classification tasks (Joachims, 1998; Yang and Liu, 1999) and it is robust to overfitting (Sebastiani, 2002).
Related Work
For text classification, topic modeling techniques have been utilized in various ways.
text classification is mentioned in 6 sentences in this paper.
Ferschke, Oliver and Gurevych, Iryna and Rittberger, Marc
Conclusions
We showed that text classification based on Wikipedia cleanup templates is prone to a topic bias which causes skewed classifiers and overly optimistic cross-validated evaluation results.
Conclusions
This bias is known from other text classification applications, such as authorship attribution, genre detection and native language detection.
Introduction
However, quality flaw detection based on cleanup template recognition suffers from a topic bias that is well known from other text classification applications such as authorship attribution or genre identification.
Related Work
Topic bias is a known problem in text classification.
text classification is mentioned in 4 sentences in this paper.
Speriosu, Michael and Baldridge, Jason
Abstract
We exploit document-level geotags to indirectly generate training instances for text classifiers for toponym resolution, and show that textual cues can be straightforwardly integrated with other commonly used ones.
Introduction
Essentially, we learn a text classifier per toponym.
Introduction
Our results show these text classifiers are far more accurate than algorithms based on spatial proximity or metadata.
Toponym Resolvers
It learns text classifiers based on local context window features trained on instances automatically extracted from GEOWIKI.
text classification is mentioned in 4 sentences in this paper.
Liu, Ding and Yang, Xiaofang and Jiang, Minghu
Basic principle of quantum classifier
Specifically, in our experiment, we assigned the term frequency, a feature frequently used in text classification, to r_n, and treated the phase θ_n as a constant, since we found the phase makes little contribution to the classification.
Discussion
We present here our model of text classification and compare it with SVM and KNN on two datasets.
Discussion
Moreover, the QC performs well in text classification compared with SVM and KNN and outperforms them on small-scale training sets.
text classification is mentioned in 3 sentences in this paper.
Nastase, Vivi and Strapparava, Carlo
Conclusion
The motivation for this work was to test the hypothesis that information about word etymology is useful for computational approaches to language, in particular for text classification.
Conclusion
Cross-language text classification can be used to build comparable corpora in different languages, using a single language starting point, preferably one with more resources, that can thus spill over to other languages.
Cross Language Text Categorization
Text categorization (also text classification), “the task of automatically sorting a set of documents into categories (or classes or topics) from a predefined set” (Sebastiani, 2005), allows for the quick selection of documents from the same domain, or the same topic.
text classification is mentioned in 3 sentences in this paper.