Combining EM Training and the MDL Principle for an Automatic Verb Classification Incorporating Selectional Preferences
Schulte im Walde, Sabine and Hying, Christian and Scheible, Christian and Schmid, Helmut

Article Structure

Abstract

This paper presents an innovative, complex approach to semantic verb classification that relies on selectional preferences as verb properties.

Introduction

In recent years, the computational linguistics community has developed an impressive number of semantic verb classifications, i.e., classifications that generalise over verbs according to their semantic properties.

Verb Class Model 2.1 Probabilistic Model

This paper suggests a probabilistic model of verb classes that groups verbs into clusters with similar subcategorisation frames and selectional preferences.

Experiments

The model is generally applicable to all languages for which WordNet exists, and for which the WordNet functions provided by Princeton University are available.

Related Work

Our model is an extension of and thus most closely related to the latent semantic clustering (LSC) model (Rooth et al., 1999) for verb-argument pairs (v, a), which defines their probability as follows:
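The probability definition itself did not survive extraction. As a reconstruction of the standard LSC definition from Rooth et al. (1999), with notation supplied here rather than copied from the excerpt, the joint probability of a verb v and an argument head a is a mixture over latent clusters c:

p(v, a) = \sum_{c} p(c) \, p(v \mid c) \, p(a \mid c)

Each cluster contributes its prior p(c) multiplied by class-conditional probabilities for the verb and for the argument.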

Summary and Outlook

This paper presented an innovative, complex approach to semantic verb classes that relies on selectional preferences as verb properties.

Topics

WordNet

Appears in 21 sentences as: WordNet (24) WordNets (1) WordNet’s (1)
In Combining EM Training and the MDL Principle for an Automatic Verb Classification Incorporating Selectional Preferences
  1. The selectional preferences are expressed in terms of semantic concepts from WordNet , rather than a set of individual words.
    Page 2, “Verb Class Model 2.1 Probabilistic Model”
  2. 4. selecting a WordNet concept r for each argument slot, e.g.
    Page 2, “Verb Class Model 2.1 Probabilistic Model”
  3. and Light (1999) and turn WordNet into a Hidden Markov model (HMM).
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  4. We create a new pseudo-concept for each WordNet noun and add it as a hyponym to each synset containing this word.
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  5. The probability of a path in this (a priori) WordNet HMM is the product of the probabilities of the transitions within the path.
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  6. Similarly, we create a partial WordNet HMM for each argument slot (c, f, i), which encodes the selectional preferences.
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  7. It contains only the WordNet concepts that the slot selects for, according to the MDL principle (cf.
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  8. The probability p(r|c, f, i) is the total probability of all paths from the topmost WordNet concept entity to the terminal node r. (A small sketch of the path computations in items 5 and 8 follows this list.)
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  9. Based on the above definitions, a partial “parse” for (speak subj-pp.to professor audience), referring to cluster 3 and one possible WordNet path, is shown in Figure 1. The connections within R3 (R3,entity – R3,person/group) and within R (Rperson/group – Rprofessor/audience) refer to sequential applications of rule types (5) and (7), respectively.
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  10. Our model uses WordNet 3.0 as the concept hierarchy, and comprises one (complete) a priori WordNet model for the lexical head probabilities p(a|r) and one (partial) model for each selectional probability distribution p(r|c, f, i), cf.
    Page 4, “Verb Class Model 2.1 Probabilistic Model”
  11. The selectional preference models start out with the most general WordNet concept only, i.e., the partial WordNet hierarchies underlying the probabilities p(r|c, f, i) initially only contain the concept entity.
    Page 4, “Verb Class Model 2.1 Probabilistic Model”
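Items 5 and 8 above describe two quantities over the WordNet HMM: the probability of a single path, which is the product of its transition probabilities, and the total probability p(r|c, f, i) of all paths from the top concept entity down to a terminal node r. The following Python fragment is a minimal illustrative sketch of those two definitions, not the authors' code; the hierarchy fragment and the probability values are hypothetical.

# Minimal sketch: a fragment of a WordNet-style HMM stored as
# {parent: {child: transition probability}}. All values are hypothetical.
transitions = {
    "entity": {"person/group": 0.3, "object": 0.7},
    "person/group": {"professor": 0.2, "audience": 0.1, "person": 0.7},
}

def path_prob(path, trans):
    # Item 5: the probability of one path is the product of its transition probabilities.
    prob = 1.0
    for parent, child in zip(path, path[1:]):
        prob *= trans[parent][child]
    return prob

def total_prob(node, target, trans):
    # Item 8: the total probability of all paths from `node` down to the terminal `target`.
    if node == target:
        return 1.0
    return sum(p * total_prob(child, target, trans)
               for child, p in trans.get(node, {}).items())

print(path_prob(["entity", "person/group", "professor"], transitions))  # 0.06
print(total_prob("entity", "professor", transitions))                   # 0.06 (only one path here)

In the full model these probabilities are conditioned on cluster c, frame f and argument slot i; the sketch drops that conditioning for brevity.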

fine-grained

Appears in 3 sentences as: fine-grained (3)
In Combining EM Training and the MDL Principle for an Automatic Verb Classification Incorporating Selectional Preferences
  1. A model with a large number of fine-grained concepts as selectional preferences assigns a higher likelihood to the data than a model with a small number of general concepts, because in general a larger number of parameters is better in describing training data.
    Page 4, “Verb Class Model 2.1 Probabilistic Model”
  2. Consequently, the EM algorithm a priori prefers fine-grained concepts but — due to sparse data problems — tends to overfit the training data. (A background note on the MDL trade-off follows this list.)
    Page 4, “Verb Class Model 2.1 Probabilistic Model”
  3. On the one hand, their model is asymmetric, thus not giving the same interpretation power to verbs and arguments; on the other hand, the model provides a more fine-grained clustering for nouns, in the form of an additional hierarchical structure of the noun clusters.
    Page 8, “Related Work”

im

Appears in 3 sentences as: im (3)
In Combining EM Training and the MDL Principle for an Automatic Verb Classification Incorporating Selectional Preferences
  1. Up to now, such classifications have been used in applications such as word sense disambiguation (Dorr and Jones, 1996; Kohomban and Lee, 2005), machine translation (Prescher et al., 2000; Koehn and Hoang, 2007), document classification (Klavans and Kan, 1998), and in statistical lexical acquisition in general (Rooth et al., 1999; Merlo and Stevenson, 2001; Korhonen, 2002; Schulte im Walde, 2006).
    Page 1, “Introduction”
  2. Two large-scale approaches of this kind are Schulte im Walde (2006), who used k-Means on verb subcategorisation frames and verbal arguments to cluster verbs semantically, and Joanis et al.
    Page 8, “Related Work”
  3. To the best of our knowledge, Schulte im Walde (2006) is the only hard-clustering approach that previously incorporated selectional preferences as verb features.
    Page 8, “Related Work”

parse tree

Appears in 3 sentences as: parse tree (2) parse trees (1)
In Combining EM Training and the MDL Principle for an Automatic Verb Classification Incorporating Selectional Preferences
  1. Figure 1: Example parse tree.
    Page 3, “Verb Class Model 2.1 Probabilistic Model”
  2. (b) The training tuples are processed: For each tuple, a PCFG parse forest as indicated by Figure 1 is built, and the Inside-Outside algorithm is applied to estimate the frequencies of the “parse tree rules”, given the current model probabilities. (A minimal sketch of the inside pass follows this list.)
    Page 5, “Verb Class Model 2.1 Probabilistic Model”
  3. Furthermore, we aim to use the verb class model in NLP tasks, (i) as a resource for lexical induction of verb senses, verb alternations, and collocations, and (ii) as a lexical resource for the statistical disambiguation of parse trees.
    Page 8, “Summary and Outlook”
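Item 2 of this list mentions building a PCFG parse forest for each training tuple and running the Inside-Outside algorithm to estimate rule frequencies. The Python fragment below sketches only the inside pass for a toy PCFG in Chomsky normal form; it is not the authors' implementation, and the grammar, rule probabilities, and example tuple are hypothetical.

# Minimal sketch of the inside pass (CKY-style) for a toy CNF PCFG.
from collections import defaultdict

binary = {("S", ("NP", "VP")): 1.0,        # S -> NP VP
          ("VP", ("V", "NP")): 1.0}        # VP -> V NP
unary = {("NP", "professor"): 0.5, ("NP", "audience"): 0.5,
         ("V", "addresses"): 1.0}

def inside(words):
    n = len(words)
    beta = defaultdict(float)              # beta[(i, j, A)] = P(A =>* words[i:j])
    for i, w in enumerate(words):          # terminal (unary) rules
        for (A, term), p in unary.items():
            if term == w:
                beta[(i, i + 1, A)] += p
    for span in range(2, n + 1):           # binary rules, shorter spans first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, (B, C)), p in binary.items():
                    beta[(i, j, A)] += p * beta[(i, k, B)] * beta[(k, j, C)]
    return beta

beta = inside(["professor", "addresses", "audience"])
print(beta[(0, 3, "S")])  # inside probability of the whole string: 0.25

A matching outside pass, combined with the inside scores and rule probabilities and normalised by the sentence probability, would yield the expected rule frequencies that one EM iteration re-estimates.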
