Index of papers in Proc. ACL 2012 that mention
  • topic model
Diao, Qiming and Jiang, Jing and Zhu, Feida and Lim, Ee-Peng
Abstract
To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic.
Introduction
To discover topics, we can certainly apply standard topic models such as LDA (Blei et al., 2003), but with standard LDA temporal information is lost during topic discovery.
Introduction
Wang et al. (2007) proposed a PLSA-based topic model that exploits this idea to find correlated bursty patterns across multiple text streams.
Introduction
In this paper, we propose a topic model designed for finding bursty topics from microblogs.
Method
At the topic discovery step, we propose a topic model that considers both users’ topical interests and the global topic trends.
Method
3.2 Our Topic Model
Method
Just like standard LDA, our topic model itself finds a set of topics represented by φ_c but does not directly generate bursty topics.
Related Work
Topic models provide a principled and elegant way to discover hidden topics from large document collections.
Related Work
Standard topic models do not consider temporal information.
Related Work
A number of temporal topic models have been proposed to consider topic changes over time.
topic model is mentioned in 13 sentences in this paper.
Xiao, Xinyan and Xiong, Deyi and Zhang, Min and Liu, Qun and Lin, Shouxun
Abstract
Previous work using topic models for statistical machine translation (SMT) explores topic information at the word level.
Background: Topic Model
A topic model is used for discovering the topics that occur in a collection of documents.
Background: Topic Model
Both Latent Dirichlet Allocation (LDA) (Blei et al., 2003) and Probabilistic Latent Semantic Analysis (PLSA) (Hofmann, 1999) are types of topic models.
Background: Topic Model
LDA is the most common topic model currently in use; we therefore exploit it for mining topics in this paper.
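As a concrete illustration of the LDA setup these papers build on, here is a minimal sketch using gensim on a toy corpus; the corpus, topic count, and hyperparameters are illustrative assumptions, not settings from any of the indexed papers.

```python
# Minimal LDA sketch with gensim (illustrative only; not any paper's setup).
from gensim import corpora, models

# Toy corpus: each document is a list of tokens.
docs = [["stock", "market", "trade"],
        ["election", "vote", "party"],
        ["market", "price", "stock"]]

dictionary = corpora.Dictionary(docs)
bow_corpus = [dictionary.doc2bow(doc) for doc in docs]

# Train LDA with K = 2 topics; the Dirichlet priors over document-topic
# and topic-word distributions follow Blei et al. (2003).
lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary,
                      passes=10, random_state=0)

# Each topic is a distribution over vocabulary terms.
for topic_id, words in lda.show_topics(num_topics=2, num_words=3,
                                       formatted=False):
    print(topic_id, [w for w, p in words])
```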
Estimation
To achieve this goal, we use both source-side and target-side monolingual topic models, and learn the correspondence between the two topic models from a word-aligned bilingual corpus.
Estimation
These two rule-topic distributions are estimated by corresponding topic models in the same way (Section 4.1).
Introduction
Topic models (Hofmann, 1999; Blei et al., 2003) are a popular technique for discovering the underlying topic structure of documents.
Introduction
Since a synchronous rule is rarely factorized into individual words, we believe that it is more reasonable to incorporate the topic model directly at the rule level rather than the word level.
Introduction
We estimate the topic distribution for a rule based on both the source and target side topic models (Section 4.1).
Topic Similarity Model
The Hellinger function is used to calculate distribution distance and is popular in topic modeling (Blei and Lafferty, 2007). By topic similarity, we aim to encourage or penalize the application of a rule for a given document according to their topic distributions, which then helps the SMT system make better translation decisions.
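For reference, the Hellinger distance between two discrete distributions p and q is H(p, q) = (1/√2)·sqrt(Σ_i (√p_i − √q_i)²). Below is a minimal sketch; the rule and document topic distributions are hypothetical placeholders standing in for those Xiao et al. estimate from their topic models.

```python
# Hellinger distance between two discrete distributions (standard formula).
import math

def hellinger(p, q):
    """H(p, q) = (1/sqrt(2)) * sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2)."""
    return math.sqrt(sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                         for pi, qi in zip(p, q))) / math.sqrt(2)

rule_topics = [0.7, 0.2, 0.1]   # hypothetical p(topic | rule)
doc_topics  = [0.6, 0.3, 0.1]   # hypothetical p(topic | document)
print(hellinger(rule_topics, doc_topics))  # 0 = identical, 1 = disjoint
```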
topic model is mentioned in 32 sentences in this paper.
Eidelman, Vladimir and Boyd-Graber, Jordan and Resnik, Philip
Abstract
We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation.
Discussion and Conclusion
We can construct a topic model once on the training data, and use it to infer topics on any test set to adapt the translation model.
Discussion and Conclusion
Multilingual topic models (Boyd-Graber and Resnik, 2010) would provide a technique to use data from multiple languages to ensure consistent topics.
Experiments
Since FBIS has document delineations, we compare local topic modeling (LTM) with modeling at the document level (GTM).
Experiments
Topic modeling was performed with Mallet (McCallum, 2002), a standard implementation of LDA, using a Chinese stoplist and setting the per-document Dirichlet parameter α = 0.01.
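As a side note on the α = 0.01 setting: a small per-document Dirichlet parameter pushes each document's topic distribution toward a few dominant topics. The numpy sketch below (not from the paper) illustrates the effect:

```python
# Illustration of how a small Dirichlet alpha yields sparse per-document
# topic distributions (not code from the paper).
import numpy as np

rng = np.random.default_rng(0)
K = 20  # number of topics, matching the LTM-20/GTM-20 setting

sparse = rng.dirichlet([0.01] * K)  # alpha = 0.01: mass on few topics
smooth = rng.dirichlet([1.0] * K)   # alpha = 1.0: much flatter

print("alpha=0.01, max topic weight:", sparse.max())  # close to 1.0
print("alpha=1.00, max topic weight:", smooth.max())  # much smaller
```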
Experiments
Although the BLEU performance of both 20-topic models (LTM-20 and GTM-20) is suboptimal, the TER improvement is better.
Introduction
Topic modeling has received some use in SMT, for instance Bilingual LSA adaptation (Tam et al., 2007), and the BiTAM model (Zhao and Xing, 2006), which uses a bilingual topic model for learning alignment.
Introduction
This topic model infers the topic distribution of a test set and biases sentence translations to appropriate topics.
Model Description
Topic Modeling for MT: We extend provenance to cover a set of automatically generated topics z_n.
Model Description
Given a parallel training corpus T composed of documents d_i, we build a source-side topic model over T, which provides a topic distribution p(z_n | d_i) for z_n ∈ {1, ..., K}.
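For intuition about p(z_n | d_i): under collapsed Gibbs sampling for LDA, the per-document topic distribution is commonly estimated from topic-assignment counts as θ_{d,k} = (n_{d,k} + α) / (n_d + Kα). A small sketch with made-up counts (the paper's actual inference details are not shown in this index):

```python
# Standard collapsed-Gibbs estimator for p(z | d); counts are hypothetical.
import numpy as np

alpha = 0.01                   # per-document Dirichlet prior
ndk = np.array([[40, 5, 5],    # n_{d,k}: topic-assignment counts
                [2, 30, 8]])   # for two documents, K = 3 topics

K = ndk.shape[1]
theta = (ndk + alpha) / (ndk.sum(axis=1, keepdims=True) + K * alpha)
print(theta)  # row d is the estimated topic distribution p(z | d)
```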
topic model is mentioned in 10 sentences in this paper.
Wang, William Yang and Mayfield, Elijah and Naidu, Suresh and Dittmar, Jeremiah
Abstract
This work extends prior work in topic modelling by incorporating metadata, and the interactions between the components in metadata, in a general way.
Conclusion and Future Work
We jointly model those observed labels as well as unsupervised topic modelling.
Prediction Experiments
To compare the two models in different settings, we first empirically set the number of topics K in our SME model to be 25, as this setting was shown to yield a promising result in a previous study (Eisenstein et al., 2011a) on sparse topic models.
Prediction Experiments
Most studies on topic modelling have not been able to report results when using different sizes of vocabulary for training.
Related Work
Related research efforts include using the LDA model for topic modeling in historical newspapers (Yang et al., 2011), a rule-based approach to extract verbs in historical Swedish texts (Pettersson and Nivre, 2011), and a system for semantic tagging of historical Dutch archives (Cybulska and Vossen, 2011).
Related Work
Despite our historical data domain, our approach is more relevant to text classification and topic modelling.
Related Work
…semantic information in multifaceted topic models for text categorization.
topic model is mentioned in 10 sentences in this paper.
Alfonseca, Enrique and Filippova, Katja and Delort, Jean-Yves and Garrido, Guillermo
Abstract
We describe the use of a hierarchical topic model for automatically identifying syntactic and lexical patterns that explicitly state ontological relations.
Experiments and results
A random sample of 3M of them is used for building the document collections on which to train the topic models, and the remaining 30M is used for testing.
Experiments and results
In both cases, a topic model has been trained to learn the probability of a relation given a pattern w: p(r|w).
Experiments and results
As can be seen, the MLE baselines (in red with syntactic patterns and green with intertext) perform consistently worse than the models learned using the topic models (in pink and blue).
Introduction
Instead, we use topic models to discriminate between the patterns that are expressing the relation and those that are ambiguous and can be applied across relations.
Unsupervised relational pattern learning
Note that we refer to patterns with the symbol w, as they are the words in our topic models.
Unsupervised relational pattern learning
Documents contain dependency patterns, which are the words in the topic model.
Unsupervised relational pattern learning
The topic model φ_G captures general patterns that appear for all relations.
topic model is mentioned in 9 sentences in this paper.
Mukherjee, Arjun and Liu, Bing
Abstract
Current methods either extract aspects without categorizing them, or extract and categorize them using unsupervised topic modeling.
Experiments
Setting the number of topics/aspects in topic models is often tricky as it is difficult to know the …
Experiments
Topic models are often evaluated quantitatively using perplexity and likelihood on held-out test data (Blei et al., 2003).
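For reference, the held-out perplexity used by Blei et al. (2003) is defined as

```latex
\mathrm{perplexity}(D_{\mathrm{test}})
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}
                        {\sum_{d=1}^{M} N_d} \right)
```

where N_d is the length of test document d; lower perplexity indicates better generalization to unseen data.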
Introduction
The second type uses statistical topic models to extract aspects and group them at the same time in an unsupervised manner.
Introduction
Our models are related to topic models in general (Blei et al., 2003) and joint models of aspects and sentiments in sentiment analysis in particular (e.g., Zhao et al., 2010).
Related Work
In recent years, topic models have been used to perform extraction and grouping at the same time.
Related Work
Aspect and sentiment extraction using topic modeling comes in two flavors: discovering aspect words sentiment-wise (i.e., discovering positive and negative aspect words and/or sentiments for each aspect without separating aspect and sentiment terms) (Lin and He, 2009; Brody and Elhadad, 2010; Jo and Oh, 2011) and separately discovering both aspects and sentiments (e.g., Mei et al., 2007; Zhao et al., 2010).
Related Work
Chang et al. (2009) stated that one reason is that the objective function of topic models does not always correlate well with human judgments.
topic model is mentioned in 8 sentences in this paper.
Yao, Limin and Riedel, Sebastian and McCallum, Andrew
Abstract
In particular, we employ a topic model to partition entity pairs associated with patterns into sense clusters using local and global features.
Conclusion
We employ a topic model to partition entity pairs of a path into different sense clusters and use hierarchical agglomerative clustering to merge senses into semantic relations.
Experiments
It does not employ global topic model features extracted from documents and sentences.
Experiments
Local: This system uses our approach (both sense clustering with topic models and hierarchical clustering), but without global features.
Our Approach
We represent each pattern as a list of entity pairs and employ a topic model to partition them into different sense clusters using local and global features.
Our Approach
We employ a topic model to discover senses for each path.
Related Work
Hachey (2009) uses topic models to perform dimensionality reduction on features when clustering entity pairs into relations.
Related Work
For example, varieties of topic models are employed for both open domain (Yao et al., 2011) and in-domain relation discovery (Chen et al., 2011; Rink and Harabagiu, 2011).
topic model is mentioned in 8 sentences in this paper.
Guo, Weiwei and Diab, Mona
Limitations of Topic Models and LSA for Modeling Sentences
Topic models (PLSA/LDA) do not explicitly model missing words.
Limitations of Topic Models and LSA for Modeling Sentences
However, empirical results show that given a small number of observed words, usually topic models can only find one topic (the most evident topic) for a sentence, e.g., the concept definitions of bank#n#1 and stock#n#1 are assigned the financial topic only without any further discernability.
Limitations of Topic Models and LSA for Modeling Sentences
The reason is that topic models try to learn a 100-dimensional latent vector (assuming dimension K = 100) from very few data points (10 observed words on average).
topic model is mentioned in 6 sentences in this paper.
Johnson, Mark and Demuth, Katherine and Frank, Michael
Introduction
(This is also appropriate, given that our models are specialisations of topic models.)
Introduction
2.1 Topic models and the unigram PCFG
Introduction
(2010) observe, this kind of grounded learning can be viewed as a specialised kind of topic inference in a topic model, where the utterance topic is constrained by the available objects (possible topics).
topic model is mentioned in 4 sentences in this paper.
Nguyen, Viet-An and Boyd-Graber, Jordan and Resnik, Philip
Modeling Multiparty Discussions
Topics—after the topic modeling literature (Blei and Lafferty, 2009)—are multinomial distributions over terms.
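To make that definition concrete: a topic is just a probability vector over the vocabulary, and generating words from it means sampling terms in proportion to their weights. A toy sketch with an assumed vocabulary and weights:

```python
# Toy illustration of a topic as a multinomial distribution over terms
# (vocabulary and weights are invented for illustration).
import numpy as np

vocab = ["bill", "vote", "debate", "senator"]
topic = [0.4, 0.3, 0.2, 0.1]  # hypothetical p(term | topic); sums to 1

rng = np.random.default_rng(0)
print(rng.choice(vocab, size=5, p=topic))  # words drawn from the topic
```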
Modeling Multiparty Discussions
However, topic models alone cannot model the dynamics of a conversation.
Modeling Multiparty Discussions
Topic models typically do not model the temporal dynamics of individual documents, and those that do (Wang et al., 2008; Gerrish and Blei, 2010) are designed for larger documents and are not applicable here because they assume that most topics appear in every time slice.
Related and Future Work
For example: models having sticky topics over n-grams (Johnson, 2010), sticky HDP-HMM (Fox et al., 2008); models that are an amalgam of sequential models and topic models (Griffiths et al., 2005; Wal…
topic model is mentioned in 4 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek
MultiLayer Context Model - MCM
In hierarchical topic models (e.g., Blei et al., 2003; Mimno et al., 2007), topics are represented as distributions over words, and each document expresses an admixture of these topics, both of which have symmetric Dirichlet (Dir) prior distributions.
MultiLayer Context Model - MCM
In the topic model literature, such constraints are sometimes used to deterministically allocate topic assignments to known labels (Labeled Topic Modeling (Ramage et al., 2009)) or in terms of pre-learnt topics encoded as prior knowledge on topic distributions in documents (Reisinger and Pasca, 2009).
MultiLayer Context Model - MCM
See Wallach (2008), Chapter 3, for an analysis of hyper-priors on topic models.
topic model is mentioned in 3 sentences in this paper.
Pantel, Patrick and Lin, Thomas and Gamon, Michael
Related Work
2.4 Topic Modeling on Query Logs
Related Work
Other projects have also demonstrated the utility of topic modeling on query logs.
Related Work
(2011) applied topic models to query logs in order to improve document ranking for search.
topic model is mentioned in 3 sentences in this paper.