Latent Variable Models of Concept-Attribute Attachment

This paper presents a set of Bayesian methods for automatically extending the WORDNET ontology with new concepts and annotating existing concepts with generic property fields, or attributes.

We present a Bayesian approach for simultaneously extending IsA hierarchies such as those found in WORDNET (WN) (Fellbaum, 1998) with additional concepts, and annotating the resulting concept graph with attributes, i.e., generic property fields shared by instances of that concept.

Input to our ontology annotation procedure consists of sets of class instances (e.g., Pisanello, Hieronymus Bosch) associated with class labels (e.g., renaissance painters) and attributes (e.g., “birthplace”, “famous works”, “style” and “early life”).
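For concreteness, one input record might be represented as follows (a hypothetical sketch for illustration; the field names are invented, not the authors' data format):

```python
# Hypothetical representation of one labeled attribute set: a class label
# paired with its instances and extracted attributes. Field names are
# invented for illustration; the examples come from the text above.
labeled_attribute_sets = [
    {
        "label": "renaissance painters",
        "instances": ["Pisanello", "Hieronymus Bosch"],
        "attributes": ["birthplace", "famous works", "style", "early life"],
    },
]
```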

The underlying mechanism for our annotation procedure is LDA (Blei et al., 2003b), a fully Bayesian extension of probabilistic Latent Semantic Analysis (Hofmann, 1999).

We employ two data sets derived using the procedure in (Pasca and Van Durme, 2008): the full set of automatic extractions generated in § 2, and a subset consisting of all attribute sets that fall under the hierarchies rooted at the WN concepts living thing#1 (i.e., the first sense of living thing), substance#7, location#1, person#1, organization#1 and food#1, manually selected to cover a high-precision subset of labeled attribute sets.

5.1 Attribute Precision

A large body of previous work exists on extending WORDNET with additional concepts and instances (Snow et al., 2006; Suchanek et al., 2007); these methods do not address attributes directly.

This paper introduced a set of methods based on Latent Dirichlet Allocation (LDA) for jointly extending the WORDNET ontology and annotating its concepts with attributes (see Figure 4 for the end result).

Appears in 31 sentences as: LDA (34)

In *Latent Variable Models of Concept-Attribute Attachment*

- In this paper, we show that both of these goals can be realized jointly using a probabilistic topic model, namely hierarchical Latent Dirichlet Allocation (LDA) (Blei et al., 2003b). (Page 1, “Introduction”)
- There are three main advantages to using a topic model as the annotation procedure: (1) Unlike hierarchical clustering (Duda et al., 2000), the attribute distribution at a concept node is not composed of the distributions of its children; attributes found specific to the concept Painter would not need to appear in the distribution of attributes for Person, making the internal distributions at each concept more meaningful as attributes specific to that concept; (2) Since LDA is fully Bayesian, its model semantics allow additional prior information to be included, unlike standard models such as Latent Semantic Analysis (Hofmann, 1999), improving annotation precision; (3) Attributes with multiple related meanings (i.e., polysemous attributes) are modeled implicitly: if an attribute (e.g., “style”) occurs in two separate input classes (e.g., poets and car models), then that attribute might attach at two different concepts in the ontology, which is better than attaching it at their most specific common ancestor (Whole) if that ancestor is too general to be useful. (Page 1, “Introduction”)
- We evaluate three variants: (1) a fixed structure approach where each flat class is attached to WN using a simple string-matching heuristic, and concept nodes are annotated using LDA, (2) an extension of LDA allowing for sense selection in addition to annotation, and (3) an approach employing a nonparametric prior over tree structures capable of inferring arbitrary ontologies. (Page 2, “Introduction”)
- Figure 2: Graphical models for the LDA variants; shaded nodes indicate observed quantities. (Page 2, “Ontology Annotation”)
- We propose a set of Bayesian generative models based on LDA that take as input labeled attribute sets generated using an extraction procedure such as the above and organize the attributes in WN according to their level of generality. (Page 2, “Ontology Annotation”)
- Annotating WN with attributes proceeds in three steps: (1) attaching labeled attribute sets to leaf concepts in WN using string distance, (2) inferring an attribute model using one of the LDA variants discussed in § 3, and (3) generating ranked lists of attributes for each concept using the model probabilities (§ 4.3). (Page 2, “Ontology Annotation”)
- The underlying mechanism for our annotation procedure is LDA (Blei et al., 2003b), a fully Bayesian extension of probabilistic Latent Semantic Analysis (Hofmann, 1999). (Page 2, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- Given D labeled attribute sets w_d, d ∈ D, LDA infers an unstructured set of T latent annotated concepts over which attribute sets decompose as mixtures.² The latent annotated concepts represent semantically coherent groups of attributes expressed in the data, as shown in Example 1. (Page 2, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- The generative model for LDA is given by θ_d ~ Dir(α), φ_c ~ Dir(η), z_i ~ Mult(θ_d), w_i ~ Mult(φ_{z_i}). (Page 2, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- (Example 1) Given 26 labeled attribute sets falling into three broad semantic categories: philosophers, writers and actors (e.g., sets for contemporary philosophers, women writers, bollywood actors), LDA is able to infer a meaningful set of latent annotated concepts. (Page 3, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
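The generative story behind these latent annotated concepts can be sketched in a few lines (a toy illustration of the standard LDA generative process, not the authors' implementation; the concepts and probabilities below are invented):

```python
import random

random.seed(0)

# Toy sketch of LDA's generative story for one labeled attribute set
# ("document"): each attribute ("word") is drawn by first picking a latent
# annotated concept ("topic"), then an attribute from that concept.
# All distributions below are invented for illustration.
concepts = {
    "philosopher": ["influences", "doctrine", "school of thought"],
    "writer": ["famous works", "style", "genre"],
}
theta = {"philosopher": 0.7, "writer": 0.3}  # per-set mixture over concepts

def generate_attribute_set(n):
    attrs = []
    for _ in range(n):
        # z_i ~ Mult(theta): choose a latent concept
        z = random.choices(list(theta), weights=theta.values())[0]
        # w_i ~ Mult(phi_z): choose an attribute from that concept (uniform here)
        attrs.append(random.choice(concepts[z]))
    return attrs

sample = generate_attribute_set(5)
```

Inference then runs this story in reverse, recovering the mixture weights and per-concept attribute distributions from observed attribute sets.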


Appears in 6 sentences as: WORDNET (6)

In *Latent Variable Models of Concept-Attribute Attachment*

- This paper presents a set of Bayesian methods for automatically extending the WORDNET ontology with new concepts and annotating existing concepts with generic property fields, or attributes. (Page 1, “Abstract”)
- We base our approach on Latent Dirichlet Allocation and evaluate along two dimensions: (1) the precision of the ranked lists of attributes, and (2) the quality of the attribute assignments to WORDNET concepts. (Page 1, “Abstract”)
- We present a Bayesian approach for simultaneously extending IsA hierarchies such as those found in WORDNET (WN) (Fellbaum, 1998) with additional concepts, and annotating the resulting concept graph with attributes, i.e., generic property fields shared by instances of that concept. (Page 1, “Introduction”)
- We use WORDNET 3.0 as the specific test ontology for our annotation procedure, and evaluate three variants. (Page 1, “Introduction”)
- A large body of previous work exists on extending WORDNET with additional concepts and instances (Snow et al., 2006; Suchanek et al., 2007); these methods do not address attributes directly. (Page 7, “Related Work”)
- This paper introduced a set of methods based on Latent Dirichlet Allocation (LDA) for jointly extending the WORDNET ontology and annotating its concepts with attributes (see Figure 4 for the end result). (Page 7, “Conclusion”)


Appears in 5 sentences as: Gibbs sample (1) Gibbs samples (2) Gibbs sampling (2)

In *Latent Variable Models of Concept-Attribute Attachment*

- This distribution can be approximated efficiently using Gibbs sampling. (Page 3, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- An efficient Gibbs sampling procedure is given in (Blei et al., 2003a). (Page 4, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- Per-Node Distribution: In fsLDA and ssLDA, attribute rankings can be constructed directly for each WN concept c by computing the likelihood of attribute w attaching to c, L(c|w) = p(w|c), averaged over all Gibbs samples (discarding a fixed number of samples for burn-in). (Page 5, “Experimental Setup 4.1 Data Analysis”)
- Precision was manually evaluated relative to 23 concepts chosen for broad coverage.⁷ Table 1 shows precision at n and the Mean Average Precision (MAP); in all LDA-based models, the Bayes average posterior is taken over all Gibbs samples. (Page 5, “Results”)
- Inset plots show log-likelihood of each Gibbs sample, indicating convergence except in the case of nCRP. (Page 6, “Results”)
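A minimal collapsed Gibbs sampler for the flat LDA case can be sketched as follows (a generic implementation of standard collapsed Gibbs sampling, not the paper's code; the toy attribute sets are invented):

```python
import random
from collections import defaultdict

random.seed(0)

# Toy "documents": labeled attribute sets, with attributes as tokens.
docs = [["style", "famous works", "genre"],
        ["birthplace", "early life", "style"]]
T, alpha, eta = 2, 0.1, 0.1
vocab = sorted({w for d in docs for w in d})
W = len(vocab)

# Random initial concept assignment for every attribute token, plus counts.
z = [[random.randrange(T) for _ in d] for d in docs]
n_dt = [[0] * T for _ in docs]               # per-set concept counts
n_tw = [defaultdict(int) for _ in range(T)]  # per-concept attribute counts
n_t = [0] * T                                # per-concept totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        t = z[d][i]
        n_dt[d][t] += 1; n_tw[t][w] += 1; n_t[t] += 1

for _ in range(50):  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]  # remove token from counts, then resample
            n_dt[d][t] -= 1; n_tw[t][w] -= 1; n_t[t] -= 1
            weights = [(n_dt[d][k] + alpha) * (n_tw[k][w] + eta) / (n_t[k] + W * eta)
                       for k in range(T)]
            t = random.choices(range(T), weights=weights)[0]
            z[d][i] = t
            n_dt[d][t] += 1; n_tw[t][w] += 1; n_t[t] += 1
```

Averaging p(w|c) estimates over several such samples (after burn-in) gives the per-node attribute rankings described above.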


Appears in 4 sentences as: topic model (2) topic modeling (1) topic models (1)

In *Latent Variable Models of Concept-Attribute Attachment*

- In this paper, we show that both of these goals can be realized jointly using a probabilistic topic model, namely hierarchical Latent Dirichlet Allocation (LDA) (Blei et al., 2003b). (Page 1, “Introduction”)
- There are three main advantages to using a topic model as the annotation procedure: (1) Unlike hierarchical clustering (Duda et al., 2000), the attribute distribution at a concept node is not composed of the distributions of its children; attributes found specific to the concept Painter would not need to appear in the distribution of attributes for Person, making the internal distributions at each concept more meaningful as attributes specific to that concept; (2) Since LDA is fully Bayesian, its model semantics allow additional prior information to be included, unlike standard models such as Latent Semantic Analysis (Hofmann, 1999), improving annotation precision; (3) Attributes with multiple related meanings (i.e., polysemous attributes) are modeled implicitly: if an attribute (e.g., “style”) occurs in two separate input classes (e.g., poets and car models), then that attribute might attach at two different concepts in the ontology, which is better than attaching it at their most specific common ancestor (Whole) if that ancestor is too general to be useful. (Page 1, “Introduction”)
- The remainder of this paper is organized as follows: §2 describes the full ontology annotation framework, §3 introduces the LDA-based topic models, §4 gives the experimental setup, §5 gives results, §6 gives related work and §7 concludes. (Page 2, “Introduction”)
- ²In topic modeling literature, attributes are words and attribute sets are documents. (Page 2, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)


Appears in 3 sentences as: gold-standard (4)

In *Latent Variable Models of Concept-Attribute Attachment*

- where rank(c) is the rank (from 1 up to 10) of a concept c in C(w), and PathToGold is the length of the minimum path along IsA edges in the conceptual hierarchies between the concept c, on one hand, and any of the gold-standard concepts manually identified for the attribute w, on the other hand. (Page 5, “Experimental Setup 4.1 Data Analysis”)
- The length PathToGold is 0 if the returned concept is the same as the gold-standard concept. (Page 5, “Experimental Setup 4.1 Data Analysis”)
- Conversely, a gold-standard attribute receives no credit (that is, DRR is 0) if no path is found in the hierarchies between the top 10 concepts of C(w) and any of the gold-standard concepts, or if C(w) is empty. (Page 5, “Experimental Setup 4.1 Data Analysis”)
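The excerpts describe the DRR score's behavior without reproducing its formula. The sketch below implements one plausible instantiation that matches the stated boundary cases; the scoring rule 1/(rank · (PathToGold + 1)), maximized over the top 10 ranked concepts, is an assumption, not the paper's exact definition:

```python
def drr(ranked_concepts, gold_concepts, path_len, top_n=10):
    """Hypothetical DRR-style score. path_len(c, g) returns the IsA path
    length between concepts c and g, or None if no path exists. Returns 0.0
    when no ranked concept reaches any gold-standard concept (or the ranking
    is empty); returns 1.0 for an exact match at rank 1."""
    best = 0.0
    for rank, c in enumerate(ranked_concepts[:top_n], start=1):
        for g in gold_concepts:
            d = path_len(c, g)  # PathToGold; None means no path
            if d is not None:
                best = max(best, 1.0 / (rank * (d + 1)))
    return best
```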


Appears in 3 sentences as: hyperparameter (1) Hyperparameters (1) hyperparameters (1)

In *Latent Variable Models of Concept-Attribute Attachment*

- where α and η are hyperparameters smoothing the per-attribute-set distribution over concepts and the per-concept attribute distribution, respectively (see Figure 2 for the graphical model). (Page 2, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- The hyperparameter γ controls the probability of branching via the per-node Dirichlet Process, and L is the fixed tree depth. (Page 4, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
- Hyperparameters were α=0.1, η=0.1, γ=1.0. (Page 4, “Hierarchical Topic Models 3.1 Latent Dirichlet Allocation”)
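As a sketch of how the smoothing hyperparameter enters the estimates, the Dirichlet-smoothed per-concept attribute probability in collapsed LDA takes the standard form below (generic LDA math with the usual count notation, not code from the paper):

```python
def p_attr_given_concept(n_cw, n_c, W, eta=0.1):
    """Smoothed estimate p(w|c) = (n_cw + eta) / (n_c + W*eta), where n_cw is
    the count of attribute w assigned to concept c, n_c the total attribute
    count at c, and W the attribute vocabulary size. eta=0.1 matches the
    setting reported above."""
    return (n_cw + eta) / (n_c + W * eta)
```

With η=0.1, an attribute never observed at a concept still receives a small nonzero probability, and the estimates sum to one over the vocabulary.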
