Index of papers in Proc. ACL 2010 that mention
  • topic models
Johnson, Mark
Abstract
Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a low-dimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees.
Abstract
The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well.
Abstract
The first replaces the unigram component of LDA topic models with multi-word sequences or collocations generated by an AG (adaptor grammar).
Introduction
so Bayesian inference for PCFGs can be used to learn LDA topic models as well.
Introduction
However, once this link is established it suggests a variety of extensions to the LDA topic models, two of which we explore in this paper.
Introduction
The first involves extending the LDA topic model so that it generates collocations (sequences of words) rather than individual words.
LDA topic models as PCFGs
Figure 2: A tree generated by the CFG encoding an LDA topic model.
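To make the encoding concrete, here is a minimal sketch of one way an LDA topic model can be written as a PCFG; the rule schema and the stop probability s below are an assumed illustration in generic notation, not the paper's exact grammar:

  Sentence → Doc_d                                  (one rule per document d)
  Doc_d    → Topic_t Doc_d     with probability (1 - s) · θ_{d,t}
  Doc_d    → Topic_t           with probability s · θ_{d,t}
  Topic_t  → w                 with probability φ_{t,w}

Each word is generated by choosing a topic nonterminal according to the document's topic distribution θ_d and rewriting it to a word according to the topic's word distribution φ_t, which mirrors the LDA generative process; Bayesian PCFG inference over such a grammar therefore recovers the LDA parameters.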
Latent Dirichlet Allocation Models
Figure 1: A graphical model “plate” representation of an LDA topic model.
topic models is mentioned in 26 sentences in this paper.
Li, Linlin and Roth, Benjamin and Sporleder, Caroline
Abstract
We use a topic model to decompose this conditional probability into two conditional probabilities with latent variables.
Introduction
Recently, several researchers have experimented with topic models (Brody and Lapata, 2009; Boyd-Graber et al., 2007; Boyd-Graber and Blei, 2007; Cai et al., 2007) for sense disambiguation and induction.
Introduction
Topic models are generative probabilistic models of text corpora in which each document is modelled as a mixture over (latent) topics, which are in turn represented by a distribution over words.
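For orientation, the generative story behind standard LDA (Blei et al., 2003), which the excerpt above summarises, is usually written as follows (generic notation, not taken from this particular paper):

  for each topic t:                φ_t ∼ Dirichlet(β)                   (the topic's distribution over words)
  for each document d:             θ_d ∼ Dirichlet(α)                   (the document's distribution over topics)
  for each word position n in d:   z_{d,n} ∼ Multinomial(θ_d)           (draw a topic)
                                   w_{d,n} ∼ Multinomial(φ_{z_{d,n}})   (draw a word from that topic)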
Introduction
Previous approaches using topic models for sense disambiguation either embed topic features in a supervised model (Cai et al., 2007) or rely heavily on the structure of hierarchical lexicons such as WordNet (Boyd-Graber et al., 2007).
Related Work
Recently, a number of systems have been proposed that make use of topic models for sense disambiguation.
Related Work
They compute topic models from a large unlabelled corpus and include them as features in a supervised system.
Related Work
Boyd-Graber and Blei (2007) propose an unsupervised approach that integrates McCarthy et al.’s (2004) method for finding predominant word senses into a topic modelling framework.
The Sense Disambiguation Model
3.1 Topic Model
The Sense Disambiguation Model
As pointed out by Hofmann (1999), the starting point of topic models is to decompose the conditional word-document probability distribution p(w|d) into two different distributions: the word-topic distribution p(w|z), and the topic-document distribution p(z|d) (see Equation 1).
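The decomposition referred to as Equation 1 is the standard topic-model mixture; in the notation of the excerpt it reads:

  p(w|d) = Σ_z p(w|z) p(z|d)

that is, the probability of a word in a document is obtained by summing, over all latent topics z, the probability of the word given the topic times the probability of the topic given the document.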
topic models is mentioned in 15 sentences in this paper.
Ó Séaghdha, Diarmuid
Abstract
This paper describes the application of so-called topic models to selectional preference induction.
Experimental setup
In the document modelling literature, probabilistic topic models are often evaluated on the likelihood they assign to unseen documents; however, it has been shown that higher log likelihood scores do not necessarily correlate with more semantically coherent induced topics (Chang et al., 2009).
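The held-out likelihood evaluation mentioned here is most often reported as per-word perplexity; a standard formulation (generic, not taken from this paper) is

  perplexity(D_test) = exp( - Σ_d log p(w_d) / Σ_d N_d )

where the sums run over the held-out documents and N_d is the number of tokens in document d. Lower perplexity means higher held-out likelihood, which, as the excerpt notes, need not imply more coherent topics.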
Introduction
This paper takes up tools (“topic models”) that have been proven successful in modelling document-word co-occurrences and adapts them to the task of selectional preference learning.
Introduction
Section 2 surveys prior work on selectional preference modelling and on semantic applications of topic models.
Related work
2.2 Topic modelling
Related work
In the field of document modelling, a class of methods known as “topic models” has become a de facto standard for identifying semantic structure in documents.
Related work
As a result of intensive research in recent years, the behaviour of topic models is well-understood and computa-tionally efficient implementations have been developed.
Three selectional preference models
Unlike some topic models such as HDP (Teh et al., 2006), LDA is parametric: the number of topics Z must be set by the user in advance.
topic models is mentioned in 15 sentences in this paper.
Ritter, Alan and Mausam and Etzioni, Oren
Experiments
We perform three main experiments to assess the quality of the preferences obtained using topic models.
Experiments
We use this experiment to compare the various topic models as well as the best model with the known state-of-the-art approaches to selectional preferences.
Experiments
Figure 3 plots the precision-recall curve for the pseudo-disambiguation experiment comparing the three different topic models.
Introduction
In this paper we describe a novel approach to computing selectional preferences by making use of unsupervised topic models .
Introduction
Unsupervised topic models, such as latent Dirichlet allocation (LDA) (Blei et al., 2003) and its variants, are characterized by a set of hidden topics, which represent the underlying semantic structure of a document collection.
Introduction
Thus, topic models are a natural fit for modeling our relation data.
Previous Work
Topic models such as LDA (Blei et al., 2003) and its variants have recently begun to see use in many NLP applications such as summarization (Daume III and Marcu, 2006), document alignment and segmentation (Chen et al., 2009), and inferring class-attribute hierarchies (Reisinger and Pasca, 2009).
Topic Models for Selectional Prefs.
We present a series of topic models for the task of computing selectional preferences.
Topic Models for Selectional Prefs.
Readers familiar with topic modeling terminology can understand our approach as follows: we treat each relation as a document whose contents consist of a bag of words corresponding to all the noun phrases observed as arguments of the relation in our corpus.
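As an illustration of this relation-as-document construction, below is a minimal sketch using the gensim LDA implementation; the relations, argument lists, and variable names are hypothetical toy data, not the authors' corpus or pipeline:

  # Each relation becomes a "document" whose tokens are the noun-phrase
  # arguments observed with that relation (toy data for illustration only).
  from gensim import corpora, models

  relation_args = {
      "acquired": ["google", "youtube", "microsoft", "skype", "company"],
      "headquartered_in": ["seattle", "california", "london", "dublin"],
      "wrote": ["novel", "article", "report", "essay", "book"],
  }

  docs = list(relation_args.values())
  dictionary = corpora.Dictionary(docs)                # vocabulary of argument noun phrases
  corpus = [dictionary.doc2bow(doc) for doc in docs]   # bag of arguments per relation

  # Each induced topic is a distribution over argument noun phrases; each relation
  # gets a distribution over topics, i.e. its inferred selectional preference.
  lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2, passes=10)
  for topic_id, words in lda.show_topics(num_topics=2, num_words=4, formatted=False):
      print(topic_id, [w for w, _ in words])

Once relations are recast as documents over argument tokens, any off-the-shelf topic model can be applied unchanged, which is the point the excerpt makes.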
Topic Models for Selectional Prefs.
3.5 Advantages of Topic Models
topic models is mentioned in 12 sentences in this paper.
Zhang, Duo and Mei, Qiaozhu and Zhai, ChengXiang
Abstract
Probabilistic latent topic models have recently enjoyed much success in extracting and analyzing latent topics in text in an unsupervised way.
Abstract
One common deficiency of existing topic models, though, is that they would not work well for extracting cross-lingual latent topics simply because words in different languages generally do not co-occur with each other.
Abstract
In this paper, we propose a way to incorporate a bilingual dictionary into a probabilistic topic model so that we can apply topic models to extract shared latent topics in text data of different languages.
Introduction
As a robust unsupervised way to perform shallow latent semantic analysis of topics in text, probabilistic topic models (Hofmann, 1999a; Blei et al., 2003b) have recently attracted much attention.
Introduction
Although many topic models have been proposed and shown to be useful (see Section 2 for more detailed discussion of related work), most of them share a common deficiency: they are designed to work only for monolingual text data and would not work well for extracting cross-lingual latent topics, i.e.
Introduction
In this paper, we propose a novel topic model, called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) model, which can be used to mine shared latent topics from unaligned text data in different languages.
Related Work
Many topic models have been proposed, and the two basic models are the Probabilistic Latent Semantic Analysis (PLSA) model (Hofmann, 1999a) and the Latent Dirichlet Allocation (LDA) model (Blei et al., 2003b).
Related Work
They and their extensions have been successfully applied to many problems, including hierarchical topic extraction (Hofmann, 1999b; Blei et al., 2003a; Li and McCallum, 2006), author-topic modeling (Steyvers et al., 2004), contextual topic analysis (Mei and Zhai, 2006), dynamic and correlated topic models (Blei and Lafferty, 2005; Blei and Lafferty, 2006), and opinion analysis (Mei et al., 2007; Branavan et al., 2008).
Related Work
Some previous work on multilingual topic models assumes documents in multiple languages are aligned either at the document level, sentence level or by time stamps (Mimno et al., 2009; Zhao and Xing, 2006; Kim and Khudanpur, 2004; Ni et al., 2009; Wang et al., 2007).
topic models is mentioned in 17 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek
Abstract
We calculate scores for sentences in document clusters based on their latent characteristics using a hierarchical topic model.
Background and Motivation
One of the challenges of using a previously trained topic model is that the new document might have a totally new vocabulary or may include many other specific topics, which may or may not exist in the trained model.
Background and Motivation
A common method is to rebuild a topic model for new sets of documents (Haghighi and Vanderwende, 2009), which has proven to produce coherent summaries.
Conclusion
We demonstrated that implementation of a summary focused hierarchical topic model to discover sentence structures as well as construction of a discriminative method for inference can benefit summarization quality on manual and automatic evaluation metrics.
Experiments and Discussions
* HIERSUM: (Haghighi and Vanderwende, 2009) A generative summarization method based on topic models, which uses sentences as an additional level.
Experiments and Discussions
* HbeSum (Hybrid Flat Summarizer): To investigate the performance of the hierarchical topic model, we build another hybrid model using flat LDA (Blei et al., 2003b).
Experiments and Discussions
Compared to the HbeSum built on LDA, both HybHSum1&2 yield better performance, indicating the effectiveness of using a hierarchical topic model in the summarization task.
Introduction
We present a probabilistic topic model on sentence level building on hierarchical Latent Dirichlet Allocation (hLDA) (Blei et al., 2003a), which is a generalization of LDA (Blei et al., 2003b).
Summary-Focused Hierarchical Model
We build a summary-focused hierarchical probabilistic topic model, sumHLDA, for each document cluster at sentence level, because it enables capturing expected topic distributions in given sentences directly from the model.
Summary-Focused Hierarchical Model
Please refer to (Blei et al., 2003b) and (Blei et al., 2003a) for details and demonstrations of topic models.
topic models is mentioned in 10 sentences in this paper.
Feng, Yansong and Lapata, Mirella
Experimental Setup
The underlying topic model was trained with 1,000 topics using only content words (i.e., nouns, verbs, and adjectives) that appeared
Extractive Caption Generation
Probabilistic Similarity Recall that the backbone of our image annotation model is a topic model with images and documents represented as a probability distribution over latent topics.
Image Annotation
The basic idea underlying LDA, and topic models in general, is that each document is composed of a probability distribution over topics, where each topic represents a probability distribution over words.
Results
As can be seen, the probabilistic models (KL and JS divergence) outperform word overlap and cosine similarity (all differences are statistically significant, p < 0.01). They make use of the same topic model as the image annotation model, and are thus able to select sentences that cover common content.
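For reference, the two probabilistic similarity measures mentioned here have their standard definitions (generic, not quoted from the paper), applied to topic distributions p and q, e.g. of an image and a candidate sentence:

  KL(p || q) = Σ_i p_i log(p_i / q_i)
  JS(p, q)   = ½ KL(p || m) + ½ KL(q || m),   where m = ½ (p + q)

JS divergence is symmetric and bounded, which is one reason it is often preferred when neither distribution should be treated as the reference.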
topic models is mentioned in 4 sentences in this paper.