Fine-Grained Genre Classification Using Structural Learning Algorithms
Zhili Wu, Katja Markert and Serge Sharoff

Article Structure

Abstract

Prior work on machine learning for genre classification has used a flat list of labels as the classification categories.

Introduction

Automatic genre identification (AGI) can be traced to the mid-1990s (Karlgren and Cutting, 1994; Kessler et al., 1997), but this line of research has become much more active in recent years, partly because of the explosive growth of the Web, and partly because of the importance of making genre distinctions in NLP applications.

Structural SVMs

Discriminative methods are often used for classification, with SVMs being a well-performing method in many tasks (Boser et al., 1992; Joachims, 1999).

Genre Distance Measures

The structural SVM (Section 2) requires a distance measure h between two genres.

Experiments

4.1 Datasets

Discussion

Given that structural learning can help in topical classification tasks (Tsochantaridis et al., 2005; Dekel et al., 2004), the lack of success on genres is surprising.

Conclusions

In this paper, we have evaluated structural learning approaches to genre classification using several …

Topics

SVM

Appears in 13 sentences as: SVM (14)
In Fine-Grained Genre Classification Using Structural Learning Algorithms
  1. To strengthen the constraints, the zero value on the right-hand side of the inequality for the flat SVM can be replaced by a positive value, corresponding to a distance measure h(y_i, m) between two genre classes, leading to the constraint sketched after this list:
    Page 3, “Structural SVMs”
  2. The structural SVM (Section 2) requires a distance measure h between two genres.
    Page 3, “Genre Distance Measures”
  3. As a baseline we use the accuracy achieved by a standard "flat" SVM.
    Page 5, “Experiments”
  4. A standard flat SVM achieves an accuracy of 64.4% whereas the best structural SVM based on Lin’s information content distance measure (IC-lin-word-bnc) achieves 68.8% accuracy, significantly better at the 1% level.
    Page 5, “Experiments”
  5. Table 1 summarizes the best performing measures that all outperform the flat SVM at the 1% level.
    Page 5, “Experiments”
  6. Method                        Accuracy (%)
     Karlgren and Cutting, 1994    65 (training)
     Flat SVM                      64.40
     SSVM(IC-lin-word-bnc)         68.80
     SSVM(IC-lin-word-br)          68.60
     SSVM(IC-lin-gram-br)          67.80
    Page 6, “Experiments”
  7. In our experiment, the flat SVM achieves an accuracy of 52.40%, and the structural SVM using path length measure achieves 55.40%, a difference significant at the 5% level.
    Page 6, “Experiments”
  8. Standard accuracy for the best performing structural methods on HGC is the same as for the flat SVM (69.1%), with marginally better structural accuracy (for example, 71.39% vs. 71.04%, using a path-length-based structural accuracy).
    Page 6, “Experiments”
  9. The accuracy of the SSVM is also only comparable to that of the flat SVM (73.6%).
    Page 6, “Experiments”
  10. As expected, the structural methods on either skewed or flattened hierarchies are not significantly better than the flat SVM.
    Page 7, “Discussion”
  11. For the flattened hierarchy of 15 leaf genres the maximal accuracy is 54.2% vs. 52.4% for the flat SVM (Figure 3), a nonsignificant improvement.
    Page 7, “Discussion”
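
Entry 1 ends where the paper displays its constraint, which the excerpt does not capture. A plausible reconstruction in standard margin-rescaling notation for structural SVMs (Tsochantaridis et al., 2005), with x_i a training document of true genre y_i and xi_i a slack variable, is the following sketch, not the paper's verbatim equation:

    % Flat multiclass SVM constraint: <w_{y_i}, x_i> - <w_m, x_i> >= 0.
    % Margin rescaling replaces the zero with the genre distance h(y_i, m):
    \langle w_{y_i}, x_i \rangle - \langle w_m, x_i \rangle
        \;\ge\; h(y_i, m) - \xi_i
        \qquad \text{for all } m \neq y_i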

weight vector

Appears in 5 sentences as: weight vector (5) weight vectors (2)
In Fine-Grained Genre Classification Using Structural Learning Algorithms
  1. Let x be a document and w_m a weight vector associated with the genre class m in a corpus with k genres at the most fine-grained level.
    Page 2, “Structural SVMs”
  2. The predicted class is the one achieving the maximum inner product between x and the weight vector for that class (the rule is sketched after this list).
    Page 2, “Structural SVMs”
  3. Accurate prediction requires that when a document vector is multiplied with the weight vector associated with its own class, the resulting inner product should be larger than its inner products with a weight vector for any other genre class m. This helps us to define criteria for weight vectors.
    Page 3, “Structural SVMs”
  4. For its weight vector w_{y_i}, the inner product …
    Page 3, “Structural SVMs”
  5. Adding up all loss terms over all training documents, and further introducing a term to penalize large values in the weight vectors, we arrive at the objective function sketched after this list (C is a user-specified nonnegative parameter).
    Page 3, “Structural SVMs”
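
Entries 2 and 5 each break off where the paper displays an equation. In standard multiclass SVM notation (a hedged reconstruction, not the paper's verbatim formulas), the prediction rule picks the genre whose weight vector yields the largest inner product with the document,

    \hat{y} \;=\; \arg\max_{m \in \{1, \dots, k\}} \langle w_m, x \rangle ,

and the objective adds up the slack (loss) terms over all training documents and penalizes large weight vectors, with C the user-specified nonnegative trade-off parameter:

    \min_{w,\, \xi \ge 0} \;\; \frac{1}{2} \sum_{m=1}^{k} \lVert w_m \rVert^2 \;+\; C \sum_{i} \xi_i
    \quad \text{s.t.} \quad
    \langle w_{y_i}, x_i \rangle - \langle w_m, x_i \rangle \;\ge\; h(y_i, m) - \xi_i
    \quad \text{for all } m \neq y_i .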

classification tasks

Appears in 3 sentences as: classification tasks (3)
In Fine-Grained Genre Classification Using Structural Learning Algorithms
  1. But many implementations are not publicly available, and their scalability to real-life text classification tasks is unknown.
    Page 2, “Structural SVMs”
  2. We used character n-grams because they are very easy to extract, language-independent (no need to rely on parsing or even stemming), and known to have the best performance in genre classification tasks (Kanaris and Stamatatos, 2009; Sharoff et al., 2010); a minimal extraction sketch follows this list.
    Page 5, “Experiments”
  3. Given that structural learning can help in topical classification tasks (Tsochantaridis et al., 2005; Dekel et al., 2004), the lack of success on genres is surprising.
    Page 6, “Discussion”
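
Entry 2 stresses that character n-grams are easy to extract. As an illustration only (the function name and the use of raw counts are assumptions, not the paper's code), a minimal character 4-gram extractor could look like this:

    from collections import Counter

    def char_ngrams(text, n=4):
        """Count character n-grams in a document.

        No parsing or stemming is required, which is what makes these
        features cheap to extract and language-independent.
        """
        # Slide a window of width n across the raw character stream.
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    # Illustrative usage:
    features = char_ngrams("Automatic genre identification")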

cosine similarities

Appears in 3 sentences as: cosine similarities (2) cosine similarity (1)
In Fine-Grained Genre Classification Using Structural Learning Algorithms
  1. An alternative to structural distance measures would be distance measures between the genres based on their pairwise cosine similarities.
    Page 9, “Discussion”
  2. To assess this, we aggregated all character 4-gram training vectors of each genre and calculated standard cosine similarities; a sketch of this computation follows this list.
    Page 9, “Discussion”
  3. Inspecting the distance matrix visually, we determined that the cosine similarity could clearly distinguish between Fiction and NonFiction texts but not between any other genres.
    Page 9, “Discussion”
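
Entry 2 describes summing the character 4-gram training vectors of each genre and computing standard pairwise cosine similarities. A minimal sketch of that computation (the function and variable names are assumptions, not the paper's code):

    import numpy as np

    def genre_cosine_matrix(genre_vectors):
        """Pairwise cosine similarities between aggregated genre vectors.

        genre_vectors maps each genre name to a 1-D numpy array, the sum
        of the character 4-gram vectors of its training documents.
        """
        names = sorted(genre_vectors)
        M = np.vstack([genre_vectors[g] for g in names])
        # Normalize rows to unit length; a single matrix product then
        # yields all pairwise cosine similarities at once.
        M = M / np.linalg.norm(M, axis=1, keepdims=True)
        return names, M @ M.T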

fine-grained

Appears in 3 sentences as: fine-grained (3)
In Fine-Grained Genre Classification Using Structural Learning Algorithms
  1. This paper explores a way of using information on the hierarchy of labels for improving fine-grained genre classification.
    Page 2, “Introduction”
  2. Let x be a document and w_m a weight vector associated with the genre class m in a corpus with k genres at the most fine-grained level.
    Page 2, “Structural SVMs”
  3. We use standard classification accuracy (Acc) on the most fine-grained level of target categories in the genre hierarchy.
    Page 5, “Experiments”

similarity measures

Appears in 3 sentences as: similarity measure (1) similarity measures (2)
In Fine-Grained Genre Classification Using Structural Learning Algorithms
  1. We can derive such distance measures from the genre hierarchy in a way similar to the word similarity measures developed for lexical hierarchies such as WordNet (see Pedersen et al. (2007) for an overview).
    Page 3, “Genre Distance Measures”
  2. The Leacock & Chodorow similarity measure (Leacock and Chodorow, 1998) normalizes the path length measure (6) by the maximum number of nodes D when traversing down from the root; its standard form is given after this list.
    Page 3, “Genre Distance Measures”
  3. Several other similarity measures have been proposed based on the Resnik similarity, such as the one by Lin (1998); its standard form is given after this list.
    Page 4, “Genre Distance Measures”
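
Entries 2 and 3 refer to formulas the excerpt does not show. In their standard WordNet-style formulations (which the paper adapts to the genre hierarchy), with len(y_1, y_2) the path length between two genre nodes, D the maximum depth of the hierarchy, lcs the lowest common subsumer, and IC(y) = -log P(y) the information content:

    \mathrm{sim}_{LC}(y_1, y_2) \;=\; -\log \frac{\mathrm{len}(y_1, y_2)}{2D}

    \mathrm{sim}_{Lin}(y_1, y_2) \;=\; \frac{2 \cdot IC\big(\mathrm{lcs}(y_1, y_2)\big)}{IC(y_1) + IC(y_2)}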
