Introduction | Keyphrases are clustered based on their distributional and lexical properties, and a hidden topic model is applied to the document text.
Model Description | During training, we learn a hidden topic model from the text; each topic is also associated
Model Description | — probability of selecting η instead of φ — selects between η and φ for word topics — document topic model
Model Description | The hidden topic model of the review text is used to determine the properties that a document as a whole supports. |
Related Work | Bayesian Topic Modeling One aspect of our model views properties as distributions over words in the document. |
Related Work | This approach is inspired by methods in the topic modeling literature, such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), where topics are treated as hidden variables that govern the distribution of words in a text. |
Related Work | Recent work has examined coupling topic models with explicit supervision (Blei and McAuliffe, 2007; Titov and McDonald, 2008). |
Abstract | In addition to regular text classification, we utilized topic modeling of the entire dataset in various ways. |
Abstract | Topic modeling of the corpora provides interpretable themes that exist in these reports. |
Abstract | A binary topic model was also built as an unsupervised classification approach with the assumption that each topic corresponds to a class. |
Background | 2.2 Topic Modeling |
Background | Topic modeling is an unsupervised learning technique that can automatically discover the themes in a document collection.
Background | Either sampling methods such as Gibbs Sampling (Griffiths and Steyvers, 2004) or optimization methods such as variational Bayes approximation (Asuncion et al., 2009) can be used to train a topic model based on LDA. |
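The collapsed Gibbs sampling approach of Griffiths and Steyvers (2004) can be sketched in a few dozen lines; the following is an illustrative, unoptimized implementation (the toy corpus shape and the hyperparameter values are assumptions, not taken from any of the cited papers):

```python
import random

def gibbs_lda(docs, K, V, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of word ids in [0, V)."""
    rng = random.Random(seed)
    n_tw = [[0] * V for _ in range(K)]          # topic-word counts
    n_dt = [[0] * K for _ in range(len(docs))]  # document-topic counts
    n_t = [0] * K                               # total tokens per topic
    z = []                                      # current topic assignments
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            t = rng.randrange(K)                # random initialization
            zd.append(t)
            n_tw[t][w] += 1; n_dt[d][t] += 1; n_t[t] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                     # remove the current assignment
                n_tw[t][w] -= 1; n_dt[d][t] -= 1; n_t[t] -= 1
                # full conditional p(z_i = k | z_-i, w), up to a constant
                weights = [(n_tw[k][w] + beta) / (n_t[k] + V * beta)
                           * (n_dt[d][k] + alpha) for k in range(K)]
                t = rng.choices(range(K), weights=weights)[0]
                z[d][i] = t
                n_tw[t][w] += 1; n_dt[d][t] += 1; n_t[t] += 1
    return n_tw, n_dt
```

The returned count matrices, plus the Dirichlet priors, give point estimates of the topic-word and document-topic distributions.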
Introduction | In this study, we developed several topic modeling based classification systems for clinical reports. |
Introduction | Topic modeling is an unsupervised technique that can automatically identify themes from a given set of documents and find topic distributions of each document. |
Introduction | Therefore, topic model output of patient reports could contain very useful clinical information. |
Related Work | For text classification, topic modeling techniques have been utilized in various ways. |
Related Work | In our study, we removed the most frequent and infrequent words to have a manageable vocabulary size but we did not utilize topic model output for this purpose. |
Abstract | We present a topic model based approach to answer these questions. |
Abstract | Since agreement and disagreement expressions are usually multi-word phrases, we propose to employ a ranking method to identify highly relevant phrases prior to topic modeling.
Empirical Evaluation | For quantitative evaluation, topic models are often compared using perplexity. |
Introduction | In our earlier work (Mukherjee and Liu, 2012a), we proposed three topic models to mine contention points, which also extract AD-expressions. |
Introduction | In this paper, we further improve the work by coupling an information retrieval method to rank good candidate phrases with topic modeling in order to discover more accurate AD-expressions. |
Phrase Ranking based on Relevance | Topics in most topic models like LDA are usually unigram distributions. |
Related Work | Topic models: Our work is also related to topic modeling and joint modeling of topics and other information as we jointly model several aspects of discussions/debates. |
Related Work | Topic models like pLSA (Hofmann, 1999) and LDA (Blei et al., 2003) have proved to be very successful in mining topics from large text collections. |
Related Work | There have been various extensions to multi-grain (Titov and McDonald, 2008), labeled (Ramage et al., 2009), and sequential (Du et al., 2010) topic models.
Abstract | Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually over-weighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. |
Introduction | First, we present a general framework of Bayesian logistic supervised topic models with a regularization parameter to better balance response variables and words. |
Introduction | Second, to solve the intractable posterior inference problem of the generalized Bayesian logistic supervised topic models, we present a simple Gibbs sampling algorithm by exploring the ideas of data augmentation (Tanner and Wong, 1987; van Dyk and Meng, 2001; Holmes and Held, 2006).
Introduction | More specifically, we extend Polson’s method for Bayesian logistic regression (Polson et al., 2012) to the generalized logistic supervised topic models, which are much more challenging
Logistic Supervised Topic Models | We now present the generalized Bayesian logistic supervised topic models.
Logistic Supervised Topic Models | A logistic supervised topic model consists of two parts — an LDA model (Blei et al., 2003) for describing the words W = {w_d} for d = 1, ..., D, where w_d = {w_dn} denotes the words within document d, and a logistic classifier for considering the supervising signal y = {y_d} for d = 1, ..., D.
Logistic Supervised Topic Models | Logistic classifier: To consider binary supervising information, a logistic supervised topic model (e.g., sLDA) builds a logistic classifier using the topic representations as input features |
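The classifier step above amounts to a logistic regression whose input features are the document's empirical topic proportions. A minimal sketch (the function and variable names are illustrative, not from the paper):

```python
import math

def slda_predict(eta, zbar):
    # sLDA-style binary prediction: p(y=1 | zbar) = sigmoid(eta . zbar),
    # where zbar is the document's empirical topic-proportion vector
    # and eta holds the classifier weights.
    score = sum(e * z for e, z in zip(eta, zbar))
    return 1.0 / (1.0 + math.exp(-score))
```

With zero weights the prediction is 0.5 regardless of the topic proportions; a large positive weight on a topic pushes documents dominated by that topic toward the positive class.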
Abstract | We present an algorithm for Query-Chain Summarization based on a new LDA topic model variant.
Algorithms | We developed a novel Topic Model to identify words that are associated with the current query and not shared with the previous queries.
Algorithms | Figure 3: Plate Model for Our Topic Model
Algorithms | We implemented inference over this topic model using Gibbs Sampling (we distribute the code of the sampler together with our dataset). |
Introduction | We introduce a new algorithm to address the task of Query-Chain Focused Summarization, based on a new LDA topic model variant, and present an evaluation which demonstrates it improves on these baselines. |
Previous Work | As evidenced since (Daume and Marcu, 2006), Bayesian techniques have proven effective at this task: we construct a latent topic model on the basis of the document set and the query. |
Previous Work | This topic model effectively serves as a query expansion mechanism, which helps assess the relevance of individual sentences to the original query.
Previous Work | In recent years, three major techniques have emerged to perform multi-document summarization: graph-based methods such as LexRank (Erkan and Radev, 2004) for multi-document summarization and Biased-LexRank (Otterbacher et al., 2008) for query-focused summarization; language model methods such as KLSum (Haghighi and Vanderwende, 2009); and variants of KLSum based on topic models such as BayesSum (Daume and Marcu, 2006) and TopicSum (Haghighi and Vanderwende, 2009).
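KLSum-style selection greedily adds the sentence that keeps the summary's unigram distribution closest, in KL divergence, to the source collection's. A simplified sketch of the core computation (the smoothing constant and function names are assumptions for illustration):

```python
import math

def kl(p, q, eps=1e-12):
    # KL(p || q), smoothed so zero probabilities do not produce log(0)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def pick_next(doc_dist, candidate_dists):
    # choose the candidate summary distribution closest to the corpus distribution
    return min(range(len(candidate_dists)),
               key=lambda i: kl(doc_dist, candidate_dists[i]))
```

In a full KLSum implementation, each candidate distribution would be the unigram distribution of the current summary extended by one more sentence.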
Abstract | Topic modeling is a popular method for the task. |
Abstract | However, unsupervised topic models often generate incoherent aspects. |
Abstract | Such knowledge can then be used by a topic model to discover more coherent aspects. |
Introduction | Recently, topic models have been extensively applied to aspect extraction because they can perform both subtasks at the same time while other |
Introduction | Traditional topic models such as LDA (Blei et al., 2003) and pLSA (Hofmann, 1999) are unsupervised methods for extracting latent topics in text documents. |
Introduction | However, researchers have shown that fully unsupervised models often produce incoherent topics because the objective functions of topic models do not always correlate well with human judgments (Chang et al., 2009). |
Related Work | To extract and group aspects simultaneously, topic models have been applied by researchers (Branavan et al., 2008, Brody and Elhadad, 2010, Chen et al., 2013b, Fang and Huang, 2012, He et al., 2011, Jo and Oh, 2011, Kim et al., 2013, Lazaridou et al., 2013, Li et al., 2011, Lin and He, 2009, Lu et al., 2009, Lu et al., 2012, Lu and Zhai, 2008, Mei et al., 2007, Moghaddam and Ester, 2013, Mukherjee and Liu, 2012, Sauper and Barzilay, 2013, Titov and McDonald, 2008, Wang et al., 2010, Zhao et al., 2010). |
Abstract | The current topic modeling approaches for Information Retrieval do not allow query-oriented latent topics to be modeled explicitly.
Abstract | We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those topics. |
Introduction | Based on the words used within a document, topic models learn topic level relations by assuming that the document covers a small set of concepts. |
Introduction | This is one of the reasons for the intensive use of topic models (and especially LDA) in current research in Natural Language Processing (NLP) related areas.
Introduction | From that perspective, topic models seem attractive in the sense that they can provide a descriptive and intuitive representation of concepts. |
Topic-Driven Relevance Models | Instead of viewing θ as a set of document language models that are likely to contain topical information about the query, we take a probabilistic topic modeling approach.
Topic-Driven Relevance Models | Figure 1: Semantic coherence of the topic models as a function of the number N of feedback documents.
Abstract | Previous work using topic models for statistical machine translation (SMT) explores topic information at the word level.
Background: Topic Model | A topic model is used for discovering the topics that occur in a collection of documents. |
Background: Topic Model | Both Latent Dirichlet Allocation (LDA) (Blei et al., 2003) and Probabilistic Latent Semantic Analysis (PLSA) (Hofmann, 1999) are types of topic models.
Background: Topic Model | LDA is the most common topic model currently in use, therefore we exploit it for mining topics in this paper. |
Estimation | To achieve this goal, we use both source-side and target-side monolingual topic models, and learn the correspondence between the two topic models from a word-aligned bilingual corpus.
Estimation | These two rule-topic distributions are estimated by corresponding topic models in the same way (Section 4.1). |
Introduction | Topic models (Hofmann, 1999; Blei et al., 2003) are a popular technique for discovering the underlying topic structure of documents.
Introduction | Since a synchronous rule is rarely factorized into individual words, we believe that it is more reasonable to incorporate the topic model directly at the rule level rather than the word level. |
Introduction | We estimate the topic distribution for a rule based on both the source and target side topic models (Section 4.1).
Topic Similarity Model | The Hellinger distance is used to calculate the distance between distributions and is popular in topic modeling (Blei and Lafferty, 2007).1 By topic similarity, we aim to encourage or penalize the application of a rule for a given document according to their topic distributions, which then helps the SMT system make better translation decisions.
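The Hellinger distance between two discrete distributions can be computed in one line; this is a generic stand-alone sketch, not the paper's implementation:

```python
import math

def hellinger(p, q):
    # H(p, q) = (1 / sqrt(2)) * || sqrt(p) - sqrt(q) ||_2, bounded in [0, 1]
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q))) / math.sqrt(2)
```

Identical distributions score 0 and disjoint distributions score 1, so a topic similarity feature can be defined as 1 minus this distance.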
Abstract | Topic models, an unsupervised technique for inferring translation domains, improve machine translation quality.
Abstract | We propose new polylingual tree-based topic models to extract domain knowledge that considers both source and target languages and derive three different inference schemes. |
Introduction | Probabilistic topic models (Blei and Lafferty, 2009), exemplified by latent Dirichlet allocation (Blei et al., 2003, LDA), are one of the most popular statistical frameworks for navigating large unannotated document collections. |
Introduction | Topic models discover—without any supervision—the primary themes presented in a dataset: the namesake topics. |
Introduction | Topic models have two primary applications: to aid human exploration of corpora (Chang et al., 2009) or serve as a low-dimensional representation for downstream applications. |
Abstract | To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic. |
Introduction | To discover topics, we can certainly apply standard topic models such as LDA (Blei et al., 2003), but with standard LDA temporal information is lost during topic discovery. |
Introduction | (2007) proposed a PLSA-based topic model that exploits this idea to find correlated bursty patterns across multiple text streams. |
Introduction | In this paper, we propose a topic model designed for finding bursty topics from microblogs. |
Method | At the topic discovery step, we propose a topic model that considers both users’ topical interests and the global topic trends. |
Method | 3.2 Our Topic Model |
Method | Just like standard LDA, our topic model itself finds a set of topics represented by φc but does not directly generate bursty topics.
Related Work | Topic models provide a principled and elegant way to discover hidden topics from large document collections. |
Related Work | Standard topic models do not consider temporal information. |
Related Work | A number of temporal topic models have been proposed to consider topic changes over time. |
Abstract | Based on the assumption that a corpus follows Zipf’s law, we derive tradeoff formulae of the perplexity of k-gram models and topic models with respect to the size of the reduced vocabulary.
Experiments | We used ZipfMix only for the experiments on topic models.
Experiments | Therefore, we can use TheoryAve as a heuristic function for estimating the perplexity of topic models.
Introduction | Removing low-frequency words from a corpus (often called cutoff) is a common practice to save on the computational costs involved in learning language models and topic models.
Introduction | In the case of topic models, the intuition is that low-frequency words do not make a large contribution to the statistics of the models.
Introduction | Actually, when we try to roughly analyze a corpus with topic models, a reduced corpus is enough for the purpose (Steyvers and Griffiths, 2007).
Perplexity on Reduced Corpora | 3.3 Perplexity of Topic Models
Perplexity on Reduced Corpora | In this section, we consider the perplexity of the widely used topic model, Latent Dirichlet Allocation (LDA) (Blei et al., 2003), by using the notation given in (Griffiths and Steyvers, 2004).
Preliminaries | Perplexity is a widely used evaluation measure of k-gram models and topic models.
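Perplexity is the exponentiated negative average per-token log-likelihood on held-out text; lower is better, and a model that assigns uniform probability over N outcomes scores exactly N. A minimal generic sketch:

```python
import math

def perplexity(log_probs, num_tokens):
    # perplexity = exp(-(sum of per-token log-likelihoods) / num_tokens)
    return math.exp(-sum(log_probs) / num_tokens)
```

For a topic model, each per-token log-likelihood would itself be computed by marginalizing over topics, log sum_z p(w|z) p(z|d).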
Abstract | We examine Arora et al.’s anchor words algorithm for topic modeling and develop new, regularized algorithms that not only mathematically resemble Gaussian and Dirichlet priors but also improve the interpretability of topic models.
Adding Regularization | While the distribution over topics is typically Dirichlet, Dirichlet distributions have been replaced by logistic normals in topic modeling applications (Blei and Lafferty, 2005) and for probabilistic grammars of language (Cohen and Smith, 2009). |
Anchor Words: Scalable Topic Models | In this section, we briefly review the anchor method and place it in the context of topic model inference. |
Anchor Words: Scalable Topic Models | Rethinking Data: Word Co-occurrence Inference in topic models can be viewed as a black box: given a set of documents, discover the topics that best explain the data. |
Anchor Words: Scalable Topic Models | Like other topic modeling algorithms, the output of the anchor method is the topic word distributions A with size V × K, where K is the total number of topics desired, a parameter of the algorithm.
Introduction | Topic models are of practical and theoretical interest. |
Introduction | Modern topic models are formulated as latent variable models.
Introduction | Unlike an HMM, topic models assume that each document is an admixture of these hidden components called topics.
Abstract | Code-switched documents are common in social media, providing evidence for polylingual topic models to infer aligned topics across languages. |
Code-Switching | By collecting the entire conversation into a single document we provide the topic model with additional content. |
Code-Switching | To train a polylingual topic model on social media, we make two modifications to the model of Mimno et al. |
Code-Switching | First, polylingual topic models require parallel or comparable corpora in which each document has an assigned language. |
Introduction | Topic models (Blei et al., 2003) have become standard tools for analyzing document collections, and topic analyses are quite common for social media (Paul and Dredze, 2011; Zhao et al., 2011; Hong and Davison, 2010; Ramage et al., 2010; Eisenstein et al., 2010). |
Introduction | A good candidate for multilingual topic analyses are polylingual topic models (Mimno et al., 2009), which learn topics for multiple languages, creating tuples of language specific distributions over monolingual vocabularies for each topic. |
Introduction | Polylingual topic models enable cross language analysis by grouping documents by topic regardless of language. |
Final Experiments | The following models are used as benchmarks: (i) PYTHY (Toutanova et al., 2007): Utilizes human generated summaries to train a sentence ranking system using a classifier model; (ii) HIERSUM (Haghighi and Vanderwende, 2009): Based on hierarchical topic models.
Introduction | In particular (Haghighi and Vanderwende, 2009; Celikyilmaz and Hakkani-Tur, 2010) build hierarchical topic models to identify salient sentences that contain abstract concepts rather than specific concepts. |
Multi-Document Summarization Models | Some of these works (Haghighi and Vanderwende, 2009; Celikyilmaz and Hakkani-Tur, 2010) focus on the discovery of hierarchical concepts from documents (from abstract to specific) using extensions of hierarchical topic models (Blei et al., 2004) and reflect this hierarchy on the sentences.
Multi-Document Summarization Models | We utilize the advantages of previous topic models and build an unsupervised generative model that can associate each word in each document with three random variables: a sentence S, a higher-level topic H, and a lower-level topic T, in an analogical way to PAM models (Li and McCallum, 2006), i.e., a directed acyclic graph (DAG) representing mixtures of hierarchical structure, where super-topics are multinomials over subtopics at lower levels in the DAG.
Topic Coherence for Summarization | In this section we discuss the main contribution, our two hierarchical mixture models, which improve summary generation performance through the use of tiered topic models.
Two-Tiered Topic Model - TTM | Our base model, the two-tiered topic model (TTM), is inspired by the hierarchical topic model, PAM, proposed by Li and McCallum (2006).
Two-Tiered Topic Model - TTM | Figure 1: Graphical model depiction of two-tiered topic model (TTM) described in §4.
Two-Tiered Topic Model - TTM | Our two-tiered topic model for salient sentence discovery can be generated for each word in the document (Algorithm 1) as follows: For a word w_id in document d, a random variable x_id is drawn, which determines if w_id is query related, i.e., w_id either exists in the query or is related to the query.2
Experiments | Distributions of words in each topic were estimated as the proportion of words assigned to each topic, taking into account topic model priors β_gl and β_loc.
Experiments | Before applying the topic models we removed punctuation and also removed stop words using the standard list of stop words;8 however, all the words and punctuation were used in the sentiment predictors.
Experiments | To combat this problem we first train the sentiment classifiers by assuming that pygm is equal for all the local topics, which effectively ignores the topic model.
Introduction | The model is at heart a topic model in that it assigns words to a set of induced topics, each of which may represent one particular aspect. |
Introduction | For example, other topic models can be used as a part of our model and the proposed class of models can be employed in other tasks beyond sentiment summarization, e.g., segmentation of blogs on the basis of topic labels provided by users, or topic discovery on the basis of tags given by users on social bookmarking sites.3 |
Related Work | Text excerpts are usually extracted through string matching (Hu and Liu, 2004a; Popescu and Etzioni, 2005), sentence clustering (Gamon et al., 2005), or through topic models (Mei et al., 2007; Titov and McDonald, 2008). |
Related Work | String extraction methods are limited to fine-grained aspects whereas clustering and topic model approaches must resort to ad-hoc means of labeling clusters or topics. |
Related Work | Recently, Blei and McAuliffe (2008) proposed an approach for joint sentiment and topic modeling that can be viewed as a supervised LDA (sLDA) model that tries to infer topics appropriate for use in a given classification or regression problem. |
Abstract | We build a broad-coverage sense tagger based on a nonparametric Bayesian topic model that automatically learns sense clusters for words in the source language. |
Introduction | We use a nonparametric Bayesian topic model based WSI to infer word senses for source words in our training, development and test set. |
Related Work | For ease of comparison, we roughly divide them into 4 categories: 1) WSD for SMT, 2) topic-based WSI, 3) topic model for SMT and 4) lexical selection.
Related Work | Brody and Lapata (2009)’s work is the first attempt to approach WSI via topic modeling.
Related Work | They adapt LDA to word sense induction by building one topic model per word type. |
WSI-Based Broad-Coverage Sense Tagger | Recently, we have also witnessed that WSI is cast as a topic modeling problem where the sense clusters of a word type are considered as underlying topics (Brody and Lapata, 2009; Yao and Durme, 2011; Lau et al., 2012). |
WSI-Based Broad-Coverage Sense Tagger | We follow this line to tailor a topic modeling framework to induce word senses for our large-scale training data. |
WSI-Based Broad-Coverage Sense Tagger | We can induce topics on this corpus for each pseudo document via topic modeling approaches. |
Abstract | This paper studies the employment of topic models to automatically construct semantic classes, taking as the source data a collection of raw semantic classes (RASCs), which were extracted by applying predefined patterns to web pages. |
Abstract | To adopt topic models, we treat RASCs as “documents”, items as “words”, and the final semantic classes as “topics”.
Abstract | Appropriate preprocessing and postprocessing are performed to improve result quality, to reduce computation cost, and to tackle the fixed-k constraint of a typical topic model.
Introduction | In this paper, we propose to use topic models to address the problem. |
Introduction | In some topic models , a document is modeled as a mixture of hidden topics. |
Introduction | Topic modeling provides a formal and convenient way of dealing with multi-membership, which is our primary motivation of adopting topic models here. |
Abstract | Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a low-dimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees. |
Abstract | The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer topic models as well.
Abstract | The first replaces the unigram component of LDA topic models with multi-word sequences or collocations generated by an AG. |
Introduction | so Bayesian inference for PCFGs can be used to learn LDA topic models as well. |
Introduction | However, once this link is established it suggests a variety of extensions to the LDA topic models, two of which we explore in this paper.
Introduction | The first involves extending the LDA topic model so that it generates collocations (sequences of words) rather than individual words. |
LDA topic models as PCFGs | Figure 2: A tree generated by the CFG encoding an LDA topic model.
Latent Dirichlet Allocation Models | Figure 1: A graphical model “plate” representation of an LDA topic model.
Abstract | We use a topic model to decompose this conditional probability into two conditional probabilities with latent variables. |
Introduction | Recently, several researchers have experimented with topic models (Brody and Lapata, 2009; Boyd-Graber et al., 2007; Boyd-Graber and Blei, 2007; Cai et al., 2007) for sense disambiguation and induction. |
Introduction | Topic models are generative probabilistic models of text corpora in which each document is modelled as a mixture over (latent) topics, which are in turn represented by a distribution over words. |
Introduction | Previous approaches using topic models for sense disambiguation either embed topic features in a supervised model (Cai et al., 2007) or rely heavily on the structure of hierarchical lexicons such as WordNet (Boyd-Graber et al., 2007). |
Related Work | Recently, a number of systems have been proposed that make use of topic models for sense disambiguation. |
Related Work | They compute topic models from a large unlabelled corpus and include them as features in a supervised system. |
Related Work | Boyd-Graber and Blei (2007) propose an unsupervised approach that integrates McCarthy et al.’s (2004) method for finding predominant word senses into a topic modelling framework. |
The Sense Disambiguation Model | 3.1 Topic Model |
The Sense Disambiguation Model | As pointed out by Hofmann (1999), the starting point of topic models is to decompose the conditional word-document probability distribution p(w|d) into two different distributions: the word-topic distribution p(w|z), and the topic-document distribution p(z|d) (see Equation 1). |
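The decomposition just described is a matrix product: with p(w|z) arranged as a V x Z matrix and p(z|d) as a Z x D matrix, p(w|d) is their product. A toy sketch with illustrative names:

```python
def word_doc_prob(p_w_z, p_z_d):
    # p(w|d) = sum_z p(w|z) * p(z|d): a (V x Z) matrix times a (Z x D) matrix.
    Z = len(p_z_d)
    D = len(p_z_d[0])
    return [[sum(row[z] * p_z_d[z][d] for z in range(Z)) for d in range(D)]
            for row in p_w_z]
```

Because each factor's columns sum to one, each column of the result is again a proper distribution over the vocabulary.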
Abstract | This paper describes the application of so-called topic models to selectional preference induction. |
Experimental setup | In the document modelling literature, probabilistic topic models are often evaluated on the likelihood they assign to unseen documents; however, it has been shown that higher log likelihood scores do not necessarily correlate with more semantically coherent induced topics (Chang et al., 2009).
Introduction | This paper takes up tools (“topic models”) that have been proven successful in modelling document-word co-occurrences and adapts them to the task of selectional preference learning.
Introduction | Section 2 surveys prior work on selectional preference modelling and on semantic applications of topic models.
Related work | 2.2 Topic modelling |
Related work | In the field of document modelling, a class of methods known as “topic models” have become a de facto standard for identifying semantic structure in documents. |
Related work | As a result of intensive research in recent years, the behaviour of topic models is well-understood and computationally efficient implementations have been developed.
Three selectional preference models | Unlike some topic models such as HDP (Teh et al., 2006), LDA is parametric: the number of topics Z must be set by the user in advance.
Abstract | Topic models have been used extensively as a tool for corpus exploration, and a cottage industry has developed to tweak topic models to better encode human intuitions or to better model data. |
Abstract | However, creating such extensions requires expertise in machine learning unavailable to potential end-users of topic modeling software. |
Introduction | Probabilistic topic models , as exemplified by probabilistic latent semantic indexing (Hofmann, 1999) and latent Dirichlet allocation (LDA) (Blei et al., 2003) are unsupervised statistical techniques to discover the thematic topics that permeate a large corpus of text documents. |
Introduction | Topic models have had considerable application beyond natural language processing in computer vision (Rob et al., 2005), biology (Shringarpure and Xing, 2008), and psychology (Landauer et al., 2006) in addition to their canonical application to text. |
Introduction | For text, one of the few real-world applications of topic models is corpus exploration. |
Experiments | We perform three main experiments to assess the quality of the preferences obtained using topic models.
Experiments | We use this experiment to compare the various topic models as well as the best model with the known state of the art approaches to selectional preferences. |
Experiments | Figure 3 plots the precision-recall curve for the pseudo-disambiguation experiment comparing the three different topic models.
Introduction | In this paper we describe a novel approach to computing selectional preferences by making use of unsupervised topic models . |
Introduction | Unsupervised topic models , such as latent Dirichlet allocation (LDA) (Blei et al., 2003) and its variants are characterized by a set of hidden topics, which represent the underlying semantic structure of a document collection. |
Introduction | Thus, topic models are a natural fit for modeling our relation data. |
Previous Work | Topic models such as LDA (Blei et al., 2003) and its variants have recently begun to see use in many NLP applications such as summarization (Daume III and Marcu, 2006), document alignment and segmentation (Chen et al., 2009), and inferring class-attribute hierarchies (Reisinger and Pasca, 2009). |
Topic Models for Selectional Prefs. | We present a series of topic models for the task of computing selectional preferences. |
Topic Models for Selectional Prefs. | Readers familiar with topic modeling terminology can understand our approach as follows: we treat each relation as a document whose contents consist of a bag of words corresponding to all the noun phrases observed as arguments of the relation in our corpus.
Topic Models for Selectional Prefs. | 3.5 Advantages of Topic Models |
Abstract | Probabilistic latent topic models have recently enjoyed much success in extracting and analyzing latent topics in text in an unsupervised way. |
Abstract | One common deficiency of existing topic models , though, is that they would not work well for extracting cross-lingual latent topics simply because words in different languages generally do not co-occur with each other. |
Abstract | In this paper, we propose a way to incorporate a bilingual dictionary into a probabilistic topic model so that we can apply topic models to extract shared latent topics in text data of different languages. |
Introduction | As a robust unsupervised way to perform shallow latent semantic analysis of topics in text, probabilistic topic models (Hofmann, 1999a; Blei et al., 2003b) have recently attracted much attention. |
Introduction | Although many topic models have been proposed and shown to be useful (see Section 2 for more detailed discussion of related work), most of them share a common deficiency: they are designed to work only for monolingual text data and would not work well for extracting cross-lingual latent topics, i.e. |
Introduction | In this paper, we propose a novel topic model, called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) model, which can be used to mine shared latent topics from unaligned text data in different languages.
Related Work | Many topic models have been proposed, and the two basic models are the Probabilistic Latent Semantic Analysis (PLSA) model (Hofmann, 1999a) and the Latent Dirichlet Allocation (LDA) model (Blei et al., 2003b). |
Related Work | They and their extensions have been successfully applied to many problems, including hierarchical topic extraction (Hofmann, 1999b; Blei et al., 2003a; Li and McCallum, 2006), author-topic modeling (Steyvers et al., 2004), contextual topic analysis (Mei and Zhai, 2006), dynamic and correlated topic models (Blei and Lafferty, 2005; Blei and Lafferty, 2006), and opinion analysis (Mei et al., 2007; Branavan et al., 2008). |
Related Work | Some previous work on multilingual topic models assumes that documents in multiple languages are aligned at the document level, at the sentence level, or by time stamps (Mimno et al., 2009; Zhao and Xing, 2006; Kim and Khudanpur, 2004; Ni et al., 2009; Wang et al., 2007). |
Abstract | We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation. |
Discussion and Conclusion | We can construct a topic model once on the training data, and use it to infer topics on any test set to adapt the translation model. |
Discussion and Conclusion | Multilingual topic models (Boyd-Graber and Resnik, 2010) would provide a technique to use data from multiple languages to ensure consistent topics. |
Experiments | Since FBIS has document delineations, we compare local topic modeling (LTM) with modeling at the document level (GTM). |
Experiments | Topic modeling was performed with Mallet (McCallum, 2002), a standard implementation of LDA, using a Chinese stoplist and setting the per-document Dirichlet parameter α = 0.01. |
Experiments | Although the BLEU performance of both 20-topic models, LTM-20 and GTM-20, is suboptimal, the TER improvement is better. |
Introduction | Topic modeling has received some use in SMT, for instance Bilingual LSA adaptation (Tam et al., 2007), and the BiTAM model (Zhao and Xing, 2006), which uses a bilingual topic model for learning alignment. |
Introduction | This topic model infers the topic distribution of a test set and biases sentence translations to appropriate topics. |
Model Description | Topic Modeling for MT We extend provenance to cover a set of automatically generated topics z_n. |
Model Description | Given a parallel training corpus T composed of documents d_i, we build a source-side topic model over T, which provides a topic distribution p(z_n|d_i) for z_n ∈ {1, ..., K}. |
Abstract | We calculate scores for sentences in document clusters based on their latent characteristics using a hierarchical topic model. |
Background and Motivation | One of the challenges of using a previously trained topic model is that the new document might have a totally new vocabulary or may include many other specific topics, which may or may not exist in the trained model. |
Background and Motivation | A common method is to rebuild a topic model for new sets of documents (Haghighi and Vanderwende, 2009), which has proven to produce coherent summaries. |
Conclusion | We demonstrated that implementing a summary-focused hierarchical topic model to discover sentence structures, as well as constructing a discriminative method for inference, can benefit summarization quality on manual and automatic evaluation metrics. |
Experiments and Discussions | * HIERSUM: (Haghighi and Vanderwende, 2009) A generative summarization method based on topic models, which uses sentences as an additional level. |
Experiments and Discussions | * HbeSum (Hybrid Flat Summarizer): To investigate the performance of the hierarchical topic model, we build another hybrid model using flat LDA (Blei et al., 2003b). |
Experiments and Discussions | Compared to HbeSum built on LDA, both HybHSum1&2 yield better performance, indicating the effectiveness of using a hierarchical topic model in the summarization task. |
Introduction | We present a probabilistic topic model on sentence level building on hierarchical Latent Dirichlet Allocation (hLDA) (Blei et al., 2003a), which is a generalization of LDA (Blei et al., 2003b). |
Summary-Focused Hierarchical Model | We build a summary-focused hierarchical probabilistic topic model , sumHLDA, for each document cluster at sentence level, because it enables capturing expected topic distributions in given sentences directly from the model. |
Summary-Focused Hierarchical Model | 1Please refer to (Blei et al., 2003b) and (Blei et al., 2003a) for details and demonstrations of topic models. |
Abstract | Our approach employs a monolingual LDA topic model to derive a similarity measure between the test conversation and the set of training conversations, which is used to bias translation choices towards the current context. |
Corpus Data and Baseline SMT | We use the DARPA TransTac English-Iraqi parallel two-way spoken dialogue collection to train both translation and LDA topic models. |
Corpus Data and Baseline SMT | We use the English side of these conversations for training LDA topic models . |
Incremental Topic-Based Adaptation | 4.1 Topic modeling with LDA |
Incremental Topic-Based Adaptation | The full conversation history is available for training the topic models and estimating topic distributions in the training set. |
Incremental Topic-Based Adaptation | We use Mallet (McCallum, 2002) for training topic models and inferring topic distributions. |
Introduction | We begin by building a monolingual latent Dirichlet allocation (LDA) topic model on the training conversations (each conversation corresponds to a “document” in the LDA paradigm). |
Relation to Prior Work | To avoid the need for hard decisions about domain membership, some have used topic modeling to improve SMT performance, e.g., using latent semantic analysis (Tam et al., 2007) or ‘biTAM’ (Zhao and Xing, 2006). |
Relation to Prior Work | (2012), who both describe adaptation techniques where monolingual LDA topic models are used to obtain a topic distribution over the training data, followed by dynamic adaptation of the phrase table based on the inferred topic of the test document. |
Relation to Prior Work | While our proposed approach also employs monolingual LDA topic models, it deviates from the above methods in the following important ways. |
Abstract | It can efficiently handle semantic ambiguity by extending standard topic models with two new features. |
Experiments | Since MTR provides a mixture of properties adapted from earlier models, we present performance benchmarks on tag clustering using: (i) LDA; (ii) Hidden Markov Topic Model HMTM (Gruber et al., 2005); and, (iii) w-LDA (Petterson et al., 2010) that uses word features as priors in LDA. |
Experiments | Each topic model uses Gibbs sampling for inference and parameter learning. |
Experiments | For fair comparison, each benchmark topic model is provided with prior information on word-semantic tag distributions based on the labeled training data; hence, each of the K latent topics is assigned to one of K semantic tags at the beginning of Gibbs sampling. |
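Several snippets above rely on collapsed Gibbs sampling for LDA inference (Griffiths and Steyvers, 2004). As a rough illustration of the idea, not the implementation used in any of the cited papers, a minimal collapsed Gibbs sampler fits in a few dozen lines; the corpus, `alpha`, and `beta` below are placeholder assumptions:

```python
import random

def gibbs_lda(docs, K, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of integer word ids.
    Returns token-level topic assignments z, doc-topic counts ndk,
    and topic-word counts nkw.
    """
    rng = random.Random(seed)
    ndk = [[0] * K for _ in docs]                 # doc-topic counts
    nkw = [[0] * vocab_size for _ in range(K)]    # topic-word counts
    nk = [0] * K                                  # tokens per topic
    z = []                                        # topic assignment per token
    for d, doc in enumerate(docs):                # random initialisation
        zd = []
        for w in doc:
            k = rng.randrange(K)
            zd.append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                       # withdraw current assignment
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # full conditional p(z_i = j | z_-i, w), up to a constant
                weights = [(ndk[d][j] + alpha) * (nkw[j][w] + beta)
                           / (nk[j] + vocab_size * beta) for j in range(K)]
                r = rng.random() * sum(weights)   # sample a new topic
                acc = 0.0
                for j, wt in enumerate(weights):
                    acc += wt
                    if r <= acc:
                        k = j
                        break
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return z, ndk, nkw
```

The count tables `ndk` and `nkw` can then be smoothed and normalised to estimate the document-topic and topic-word distributions.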
Introduction | Our first contribution is a new probabilistic topic model, Markov Topic Regression (MTR), which uses rich features to capture the degree of association between words and semantic tags. |
Related Work and Motivation | Standard topic models, such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), use a bag-of-words approach, which disregards word order and clusters together words that appear in a similar global context. |
Related Work and Motivation | Recent topic models consider word sequence information in documents (Griffiths et al., 2005; Moon et al., 2010). |
Related Work and Motivation | Thus, we build a semantically rich topic model, MTR, using word context features as side information. |
Abstract | This work extends prior work in topic modelling by incorporating metadata, and the interactions between the components in metadata, in a general way. |
Conclusion and Future Work | We jointly model these observed labels alongside unsupervised topic modelling. |
Prediction Experiments | To compare the two models in different settings, we first empirically set the number of topics K in our SME model to be 25, as this setting was shown to yield a promising result in a previous study (Eisenstein et al., 2011a) on sparse topic models . |
Prediction Experiments | Most studies on topic modelling have not been able to report results when using different sizes of vocabulary for training. |
Related Work | Related research efforts include using the LDA model for topic modeling in historical newspapers (Yang et al., 2011), a rule-based approach to extract verbs in historical Swedish texts (Pettersson and Nivre, 2011), and a system for semantic tagging of historical Dutch archives (Cybulska and Vossen, 2011). |
Related Work | Despite our historical data domain, our approach is more relevant to text classification and topic modelling. |
Related Work | semantic information in multifaceted topic models for text categorization. |
Abstract | We describe the use of a hierarchical topic model for automatically identifying syntactic and lexical patterns that explicitly state ontological relations. |
Experiments and results | A random sample of 3M of them is used for building the document collections on which to train the topic models, and the remaining 30M is used for testing. |
Experiments and results | In both cases, a topic model has been trained to learn the probability of a relation given a pattern w: p(r|w). |
Experiments and results | As can be seen, the MLE baselines (in red with syntactic patterns and green with intertext) perform consistently worse than the models learned using the topic models (in pink and blue). |
Introduction | Instead, we use topic models to discriminate between the patterns that are expressing the relation and those that are ambiguous and can be applied across relations. |
Unsupervised relational pattern learning | Note that we refer to patterns with the symbol w, as they are the words in our topic models. |
Unsupervised relational pattern learning | Documents contain dependency patterns, which are the words in the topic model. |
Unsupervised relational pattern learning | The topic model φG captures general patterns that appear for all relations. |
Abstract | In particular, we employ a topic model to partition entity pairs associated with patterns into sense clusters using local and global features. |
Conclusion | We employ a topic model to partition entity pairs of a path into different sense clusters and use hierarchical agglomerative clustering to merge senses into semantic relations. |
Experiments | It does not employ global topic model features extracted from documents and sentences. |
Experiments | Local: This system uses our approach (both sense clustering with topic models and hierarchical clustering), but without global features. |
Our Approach | We represent each pattern as a list of entity pairs and employ a topic model to partition them into different sense clusters using local and global features. |
Our Approach | We employ a topic model to discover senses for each path. |
Related Work | Hachey (2009) uses topic models to perform dimensionality reduction on features when clustering entity pairs into relations. |
Related Work | For example, varieties of topic models are employed for both open domain (Yao et al., 2011) and in-domain relation discovery (Chen et al., 2011; Rink and Harabagiu, 2011). |
Abstract | Current methods either extract aspects without categorizing them, or extract and categorize them using unsupervised topic modeling. |
Experiments | Setting the number of topics/aspects in topic models is often tricky as it is difficult to know the |
Experiments | Topic models are often evaluated quantitatively using perplexity and likelihood on held-out test data (Blei et al., 2003). |
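Held-out perplexity, mentioned above, is the exponentiated negative average log-likelihood of held-out tokens. A minimal sketch, assuming the held-out document-topic proportions `theta` and the topic-word distributions `phi` have already been estimated:

```python
import math

def perplexity(held_out_docs, theta, phi):
    """Perplexity of held-out documents under a trained topic model.

    theta[d][k]: topic proportions of held-out document d (already inferred);
    phi[k][w]:   probability of word w under topic k.
    """
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(held_out_docs):
        for w in doc:
            # marginal word probability: mixture over topics
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)
```

Lower perplexity means the model assigns higher probability to unseen text; a uniform model over a V-word vocabulary scores exactly V.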
Introduction | The second type uses statistical topic models to extract aspects and group them at the same time in an unsupervised manner. |
Introduction | Our models are related to topic models in general (Blei et al., 2003) and joint models of aspects and sentiments in sentiment analysis in specific (e.g., Zhao et al., 2010). |
Related Work | In recent years, topic models have been used to perform extraction and grouping at the same time. |
Related Work | Aspect and sentiment extraction using topic modeling comes in two flavors: discovering aspect words sentiment-wise (i.e., discovering positive and negative aspect words and/or sentiments for each aspect without separating aspect and sentiment terms) (Lin and He, 2009; Brody and Elhadad, 2010; Jo and Oh, 2011) and separately discovering both aspects and sentiments (e.g., Mei et al., 2007; Zhao et al., 2010). |
Related Work | (2009) stated that one reason is that the objective function of topic models does not always correlate well with human judgments. |
Introduction | Topic modeling is a useful mechanism for discovering and characterizing various semantic concepts embedded in a collection of documents. |
Introduction | In this way, the topic of a sentence can be inferred with document-level information using off-the-shelf topic modeling toolkits such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003) or Hidden Topic Markov Model (HTMM) (Gruber et al., 2007). |
Introduction | Since the information within the sentence is insufficient for topic modeling, we first enrich sentence contexts via Information Retrieval (IR) methods using content words in the sentence as queries, so that topic-related monolingual documents can be collected. |
Related Work | Topic modeling was first leveraged to improve SMT performance in (Zhao and Xing, 2006; Zhao and Xing, 2007). |
Related Work | Another direction of approaches leveraged topic modeling techniques for domain adaptation. |
Related Work | Generally, most previous research has leveraged conventional topic modeling techniques such as LDA or HTMM. |
Abstract | Our unsupervised model brings together familiar components in natural language processing (like parsers and topic models) with contextual political information—temporal and dyad dependence—to infer latent event classes. |
Experiments | This is an example of individual lexical features outperforming a topic model on a predictive task, because the topic model’s dimension reduction obscures important indicators from individual words. |
Experiments | Similarly, Gerrish and Blei (2011) found that word-based regression outperformed a customized topic model when predicting Congressional bill passage, and Eisen- |
Experiments | 13 In the latter, a problem-specific topic model did best. |
Introduction | We use syntactic preprocessing and a logistic normal topic model , including latent temporal smoothing on the political context prior. |
Model | Thus the language model is very similar to a topic model’s generation of token topics and word types. |
Model | This simple logistic normal prior is, in terms of topic models , analogous to the asymmetric Dirichlet prior version of LDA in Wallach et al. |
Background and Related Work | Recent work on finding novel senses has tended to focus on comparing diachronic corpora (Sagi et al., 2009; Cook and Stevenson, 2010; Gulordava and Baroni, 2011) and has also considered topic models (Lau et al., 2012). |
Introduction | In this paper, we propose a method which uses topic models to estimate word sense distributions. |
Introduction | Topic models have been used for WSD in a number of studies (Boyd-Graber et al., 2007; Li et al., 2010; Lau et al., 2012; Preiss and Stevenson, 2013; Cai et al., 2007; Knopp et al., 2013), but our work extends significantly on this earlier work in focusing on the acquisition of prior word sense distributions (and predominant senses). |
Introduction | (2004b), the use of topic models makes this possible, using topics as a proxy for sense (Brody and Lapata, 2009; Yao and Durme, 2011; Lau et al., 2012). |
Methodology | (2006)), a nonparametric variant of a Latent Dirichlet Allocation topic model (Blei et al., 2003) where the model automatically optimises the number of topics in a fully unsupervised fashion over the training data. |
Methodology | To learn the senses of a target lemma, we train a single topic model per target lemma. |
Limitations of Topic Models and LSA for Modeling Sentences | Topic models (PLSA/LDA) do not explicitly model missing words. |
Limitations of Topic Models and LSA for Modeling Sentences | However, empirical results show that given a small number of observed words, topic models can usually find only one topic (the most evident topic) for a sentence; e.g., the concept definitions of bank#n#1 and stock#n#1 are assigned the financial topic only, without any further discernability. |
Limitations of Topic Models and LSA for Modeling Sentences | The reason is that topic models try to learn a 100-dimensional latent vector (assuming dimension K = 100) from very few data points (10 observed words on average). |
Motivation | Given the rise of unsupervised latent topic modeling with Latent Dirichlet Allocation (Blei et al., 2003) and similar latent variable approaches for discovering meaningful word co-occurrence patterns in large text corpora, we ought to be able to leverage these topic contexts instead of merely N-grams. |
Motivation | Indeed there is work in the literature that shows that various topic models, latent or otherwise, can be useful for improving lan- |
Motivation | In the information retrieval community, clustering and latent topic models have yielded improvements over traditional vector space models. |
Experimental setup | We separated this corpus into three non-overlapping sets: a training set of 500 programs for parameter estimation in topic modeling and LE, a development set of 133 programs for empirical tuning and a test set of 400 programs for performance evaluation. |
Experimental setup | When evaluating the effects of different size of the training set, the number of latent topics in topic modeling process was set to 64. |
Experimental setup | When evaluating the effects of different number of latent topics in topic modeling computation, we fixed the size of the training set to 500 news programs and changed the number of latent topics from 16 to 256. |
Introduction | To deal with these problems, some topic model techniques that provide conceptual-level matching have been introduced to the text and story segmentation task (Hearst, 1997). |
Building comparable corpora | generate the bilingual topic model θ from the |
Introduction | Preiss (2012) transformed the source-language topic model to the target language and classified the probability distributions of topics in the same language; the shortcoming is that errors in model translation seriously hamper the quality of the comparable corpora. |
Introduction | (2009) adapted a monolingual topic model to a bilingual topic model in which the documents of a concept unit in different languages were assumed to share an identical topic distribution. |
Introduction | Bilingual topic models are widely adopted to mine translation equivalents from multilingual documents (Mimno et al., 2009; Ivan et al., 2011). |
Conclusion | Using topic models to discover subtypes of businesses, a domain-specific sentiment lexicon, and a number of new techniques for increasing precision in sentiment aspect extraction yields attributes that give a rich representation of the restaurant domain. |
Evaluation | ‘Top-level’ repeatedly queries the user’s top-level category preferences, ‘Subtopic’ additionally uses our topic modeling subcategories, and ‘All’ uses these plus the aspects extracted from reviews. |
Generating Questions from Reviews | Using these topic models, we assign a business |
Generating Questions from Reviews | 2We use the Topic Modeling Toolkit implementation: http://nlp.stanford.edu/software/tmt |
Introduction | The framework makes use of techniques from topic modeling and sentiment-based aspect extraction to identify fine-grained attributes for each business. |
Introduction | (2012) in the context of learning topic models. |
Related Work | Recently a number of researchers have developed provably correct algorithms for parameter estimation in latent variable models such as hidden Markov models, topic models, directed graphical models with latent variables, and so on (Hsu et al., 2009; Bailly et al., 2010; Siddiqi et al., 2010; Parikh et al., 2011; Balle et al., 2011; Arora et al., 2013; Dhillon et al., 2012; Anandkumar et al., 2012; Arora et al., 2012; Arora et al., 2013). |
Related Work | Our work is most directly related to the algorithm for parameter estimation in topic models described by Arora et al. |
The Learning Algorithm for L-PCFGS | (2012) in the context of learning topic models. |
Background and overview of models | To demonstrate the benefit of situational information, we develop the Topic-Lexical-Distributional (TLD) model, which extends the LD model by assuming that words appear in situations analogous to documents in a topic model. |
Background and overview of models | (2012) found that topics learned from similar transcript data using a topic model were strongly correlated with immediate activities and contexts. |
Background and overview of models | training an LDA topic model (Blei et al., 2003) on a superset of the child-directed transcript data we use for lexical-phonetic learning, dividing the transcripts into small sections (the ‘documents’ in LDA) that serve as our distinct situations h. As noted above, the learned document-topic distributions θ are treated as observed variables in the TLD model to represent the situational context. |
Introduction | However, in our simulations we approximate the environmental information by running a topic model (Blei et al., 2003) over a corpus of child-directed speech to infer a topic distribution for each situation. |
Abstract | Our methods synthesize hidden Markov models (for underlying state) and topic models (to connect words to states). |
Introduction | In this paper, we retain the underlying HMM, but assume words are emitted using topic models (TM), exemplified by latent Dirichlet allocation (Blei et al., 2003, LDA). |
Latent Structure in Dialogues | In other words, instead of generating words via an LM, we generate words from a topic model (TM), where each state maps to a mixture of topics. |
Latent Structure in Dialogues | 4Note that a TM-HMMS model with state-specific topic models (instead of state-specific language models) would be subsumed by TM-HMM, since one topic could be used as the background topic in TM-HMMS. |
Discussion and Future Work | While most works on context-insensitive predicate inference rules, such as DIRT (Lin and Pantel, 2001), are based on word-level similarity measures, almost all prior models addressing context-sensitive predicate inference rules are based on topic models (except for (Pantel et al., 2007), which was outperformed by later models). |
Discussion and Future Work | In addition, (Dinu and Lapata, 2010a) adapted the predicate inference topic model from (Dinu and Lapata, 2010b) to compute lexical similarity in context. |
Introduction | This way, we calculate similarity over vectors in the original word space, while biasing them towards the given context via a topic model. |
Two-level Context-sensitive Inference | Our model follows the general DIRT scheme while extending it to handle context-sensitive scoring of rule applications, addressing the scenario dealt with by the context-sensitive topic models. |
Hierarchical Topic Models 3.1 Latent Dirichlet Allocation | 2In the topic modeling literature, attributes are words and attribute sets are documents. |
Introduction | In this paper, we show that both of these goals can be realized jointly using a probabilistic topic model , namely hierarchical Latent Dirichlet Allocation (LDA) (Blei et al., 2003b). |
Introduction | There are three main advantages to using a topic model as the annotation procedure: (1) Unlike hierarchical clustering (Duda et al., 2000), the attribute distribution at a concept node is not composed of the distributions of its children; attributes found specific to the concept Painter would not need to appear in the distribution of attributes for Person, making the internal distributions at each concept more meaningful as attributes specific to that concept; (2) Since LDA is fully Bayesian, its model semantics allow additional prior information to be included, unlike standard models such as Latent Semantic Analysis (Hofmann, 1999), improving annotation precision; (3) Attributes with multiple related meanings (i.e., polysemous attributes) are modeled implicitly: if an attribute (e.g., “style”) occurs in two separate input classes (e.g., poets and car models), then that attribute might attach at two different concepts in the ontology, which is better than attaching it at their most specific common ancestor (Whole) if that ancestor is too general to be useful. |
Introduction | The remainder of this paper is organized as follows: §2 describes the full ontology annotation framework, §3 introduces the LDA-based topic models, §4 gives the experimental setup, §5 gives results, §6 gives related work and §7 concludes. |
Experimental Setup | The underlying topic model was trained with 1,000 topics using only content words (i.e., nouns, verbs, and adjectives) that appeared |
Extractive Caption Generation | Probabilistic Similarity Recall that the backbone of our image annotation model is a topic model with images and documents represented as a probability distribution over latent topics. |
Image Annotation | The basic idea underlying LDA, and topic models in general, is that each document is composed of a probability distribution over topics, where each topic represents a probability distribution over words. |
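The generative story described above can be made concrete with a short sketch: each token's topic is drawn from the document's topic distribution, then the word from that topic's word distribution. This is an illustration of the assumption, not code from any cited system; the distributions passed in are placeholder assumptions:

```python
import random

def generate_document(doc_topic_dist, topic_word_dists, length, rng):
    """Sample one document under the LDA generative story: for each token,
    draw a topic from the document's topic distribution, then draw a word
    from that topic's word distribution."""
    words = []
    topics = list(range(len(doc_topic_dist)))
    for _ in range(length):
        z = rng.choices(topics, weights=doc_topic_dist)[0]
        vocab = list(range(len(topic_word_dists[z])))
        w = rng.choices(vocab, weights=topic_word_dists[z])[0]
        words.append(w)
    return words
```

Inference in LDA inverts this process: given only the sampled words, recover plausible topic and word distributions.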
Results | As can be seen the probabilistic models (KL and J S divergence) outperform word overlap and cosine similarity (all differences are statistically significant, p < 0.01).6 They make use of the same topic model as the image annotation model, and are thus able to select sentences that cover common content. |
Experiments | The larger question-thread dataset is employed for feature learning, such as translation model and topic model training. |
Proposed Features | Topic Model |
Proposed Features | To reduce the false negatives of word mismatch in the vector model, we also use topic models to extend matching to the semantic topic level. |
Proposed Features | A topic model such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003) considers a collection of documents with K latent topics, where K is much smaller than the number of words. |
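A simple way to realise topic-level matching, as a hedged sketch rather than the cited system's actual feature, is cosine similarity between the K-dimensional topic distributions inferred for two texts:

```python
import math

def topic_similarity(theta_a, theta_b):
    """Cosine similarity between the inferred topic distributions of two
    texts: a matching feature at the semantic topic level rather than the
    surface word level."""
    dot = sum(a * b for a, b in zip(theta_a, theta_b))
    norm_a = math.sqrt(sum(a * a for a in theta_a))
    norm_b = math.sqrt(sum(b * b for b in theta_b))
    return dot / (norm_a * norm_b)
```

Two texts with no words in common can still score highly if their topic distributions agree, which is exactly the false-negative case word matching misses.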
Abstract | We present a probabilistic topic model for jointly identifying properties and attributes of social media review snippets. |
Conclusion | We have presented a probabilistic topic model for identifying properties and attitudes of product review snippets. |
Introduction | We capture this idea using a Bayesian topic model where a set of properties and corresponding attribute tendencies are represented as hidden variables. |
Related Work | Finally, a number of approaches analyze review documents using probabilistic topic models (Lu and Zhai, 2008; Titov and McDonald, 2008; Mei et al., 2007). |
Modeling Multiparty Discussions | Topics—after the topic modeling literature (Blei and Lafferty, 2009)—are multinomial distributions over terms. |
Modeling Multiparty Discussions | However, topic models alone cannot model the dynamics of a conversation. |
Modeling Multiparty Discussions | Topic models typically do not model the temporal dynamics of individual documents, and those that do (Wang et al., 2008; Gerrish and Blei, 2010) are designed for larger documents and are not applicable here because they assume that most topics appear in every time slice. |
Related and Future Work | For example: models having sticky topics over n-grams (Johnson, 2010), sticky HDP-HMM (Fox et al., 2008); models that are an amalgam of sequential models and topic models (Griffiths et al., 2005; Wal- |
RSP: A Random Walk Model for SP | Since RSP falls into the unsupervised distributional approach, we compare it with previous similarity-based methods and unsupervised generative topic models. |
RSP: A Random Walk Model for SP | Ó Séaghdha (2010) applies topic models for SP induction with three variations: LDA, Rooth-LDA, and Dual-LDA; Ritter et al. |
RSP: A Random Walk Model for SP | We use the Matlab Topic Modeling Toolbox for the inference of latent topics. |
Related Work 2.1 WordNet-based Approach | Recently, more sophisticated methods have been introduced for SP based on topic models, where the latent variables (topics) take the place of semantic classes and distributional clusterings (Ó Séaghdha, 2010; Ritter et al., 2010). |
Introduction | (This is also appropriate, given that our models are specialisations of topic models). |
Introduction | 2.1 Topic models and the unigram PCFG |
Introduction | (2010) observe, this kind of grounded learning can be viewed as a specialised kind of topic inference in a topic model , where the utterance topic is constrained by the available objects (possible topics). |
Experiments | We also tried the standard LDA model and the author-topic model on our data set and found that our proposed topic model was better or at least comparable in terms of finding meaningful topics. |
Method | To extract keyphrases, we first identify topics from the Twitter collection using topic models (Section 3.2). |
Method | Author-topic models have been shown to be effective for topic modeling of microblogs (Weng et al., 2010; Hong and Davison, 2010). |
Method | Given the topic model φt previously learned for topic t, we can set P(w|t, R = 1) to φt,w, i.e. |
Introduction | LDA is an unsupervised probabilistic topic model and it is widely used to discover latent semantic structure of a document collection by modeling words in the documents. |
Introduction | In this approach, a topic model on a given set of unlabeled training documents is constructed using LDA, and then an annotator assigns a class label to some topics based on their most probable words. |
Introduction | These labeled topics are used to create a new topic model such that in the new model topics are better aligned to class labels. |
BBC News Database | A simple way to implement this idea is by re-ranking our k-best list according to a topic model estimated from the entire document collection. |
BBC News Database | Specifically, we use Latent Dirichlet Allocation (LDA) as our topic model (Blei et al., 2003). |
BBC News Database | An advantage of using LDA is that at test time we can perform inference without retraining the topic model . |
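Test-time inference without retraining, as described above, can be sketched as a small fixed-point iteration that estimates an unseen document's topic proportions under frozen topic-word distributions. This is an illustrative EM-style "fold-in" approximation under stated assumptions, not the variational procedure of Blei et al. (2003):

```python
def infer_theta(doc, phi, iters=100, alpha=0.1):
    """Estimate the topic proportions of an unseen document with the
    topic-word distributions phi held fixed (no retraining).

    doc: list of integer word ids; phi[k][w]: P(word w | topic k).
    alpha is a placeholder smoothing constant.
    """
    K = len(phi)
    theta = [1.0 / K] * K                     # start from a uniform guess
    for _ in range(iters):
        counts = [alpha] * K                  # smoothed expected topic counts
        for w in doc:
            norm = sum(theta[k] * phi[k][w] for k in range(K))
            for k in range(K):
                # responsibility of topic k for this token
                counts[k] += theta[k] * phi[k][w] / norm
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta
```

The resulting `theta` can then be used to re-rank candidate outputs against the document collection's topics.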
Introduction | Hao et al. (2010) use a location-based topic model to summarize travelogues, enrich them with automatically chosen images, and provide travel recommendations. |
Introduction | (2010) evaluate their geographic topic model by geolocating USA-based Twitter users based on their tweet content. |
Introduction | Their geographic topic model receives supervision from many documents/users and predicts locations for unseen documents/users. |
Experiments | Second, our method captures semantic relations using topic modeling and captures opinion relations through word alignments, which are more precise than Hai, which merely uses co-occurrence information to indicate such relations among words. |
Related Work | In terms of considering semantic relations among words, our method is related with several approaches based on topic models (Zhao et al., 2010; Moghaddam and Ester, 2011; Moghaddam and Ester, 2012a; Moghaddam and Ester, 2012b; Mukherjee and Liu, 2012). |
The Proposed Method | After topic modeling, we obtain the probability of the candidates with respect to topic z, i.e. |
Feature-based Additive Model | where we assume y_sentiment and y_domain are given for each document d. Note that we assume conditional independence between features and words given y, similar to other topic models (Blei et al., 2003). |
Related Work | (2011), which can be viewed as an combination of topic models (Blei et al., 2003) and generalized additive models (Hastie and Tibshirani, 1990). |
Related Work | Unlike other derivatives of topic models , SAGE drops the Dirichlet-multinomial assumption and adopts a Laplacian prior, triggering sparsity in topic-word distribution. |
MultiLayer Context Model - MCM | In hierarchical topic models (Blei et al., 2003; Mimno et al., 2007), etc., topics are represented as distributions over words, and each document expresses an admixture of these topics, both of which have symmetric Dirichlet (Dir) prior distributions. |
MultiLayer Context Model - MCM | In the topic model literature, such constraints are sometimes used to deterministically allocate topic assignments to known labels (Labeled Topic Modeling (Ramage et al., 2009)) or in terms of pre-learnt topics encoded as prior knowledge on topic distributions in documents (Reisinger and Pasca, 2009). |
MultiLayer Context Model - MCM | 3See (Wallach, 2008), Chapter 3, for an analysis of hyper-priors on topic models.
Related Work | 2.4 Topic Modeling on Query Logs |
Related Work | Other projects have also demonstrated the utility of topic modeling on query logs. |
Related Work | (2011) applied topic models to query logs in order to improve document ranking for search. |
Abstract | To address the term ambiguity detection problem, we employ a model that combines data from language models, ontologies, and topic modeling.
Term Ambiguity Detection (TAD) | To do so, we utilized the popular Latent Dirichlet Allocation (LDA; Blei et al., 2003) topic modeling method.
Term Ambiguity Detection (TAD) | Following standard procedure, stopwords and infrequent words were removed before topic modeling was performed. |
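Term Ambiguity Detection (TAD) | That standard preprocessing step can be sketched as follows (the stopword list, threshold, and example documents are illustrative, not the paper's actual settings):

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}  # illustrative subset

def preprocess(docs, min_count=2):
    """Drop stopwords and words occurring fewer than min_count times overall."""
    counts = Counter(w for doc in docs for w in doc)
    return [[w for w in doc
             if w not in STOPWORDS and counts[w] >= min_count]
            for doc in docs]

docs = [["the", "patient", "denies", "chest", "pain"],
        ["chest", "pain", "of", "unknown", "origin"]]
print(preprocess(docs))  # → [['chest', 'pain'], ['chest', 'pain']]
```

Pruning the vocabulary this way both speeds up LDA training and keeps the learned topics from being dominated by uninformative words.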
Experiments | We also attempted a topic modeling approach to detect target candidates.
Experiments | As shown in Table 9 (K is the number of predefined topics), PLSA is not very effective, mainly because traditional topic modeling approaches do not perform well on short texts from social media.
Target Candidate Identification | For comparison we also attempted a topic modeling approach to detect target candidates, as shown in Section 5.3.
Generative Model of Coreference | The entire corpus, including these entities, is generated according to standard topic model assumptions; we first generate a topic distribution for a document, then sample topics and words for the document (Blei et al., 2003). |
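Generative Model of Coreference | Those standard topic model assumptions can be sketched as a generative process (toy vocabulary and topic-word probabilities; this is the generic LDA story, not the paper's full model, which also generates entities):

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw topic proportions from a Dirichlet via normalized Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

def generate_document(n_words, alpha, phi, vocab, rng):
    """Standard LDA generative process: draw the document's topic
    proportions, then for each word draw a topic and then a word."""
    theta = sample_dirichlet(alpha, rng)
    topics = list(range(len(phi)))
    doc = []
    for _ in range(n_words):
        z = rng.choices(topics, weights=theta)[0]   # sample a topic
        w = rng.choices(vocab, weights=phi[z])[0]   # sample a word from it
        doc.append(w)
    return doc

rng = random.Random(0)
vocab = ["game", "team", "vote", "party"]           # toy vocabulary
phi = [[0.5, 0.5, 0.0, 0.0],                        # "sports" topic
       [0.0, 0.0, 0.5, 0.5]]                        # "politics" topic
print(generate_document(6, [0.5, 0.5], phi, vocab, rng))
```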
Inference by Block Gibbs Sampling | The ψ factors in (5) approximate the topic model’s prior distribution over z: each factor is proportional to the probability that a Gibbs sampling step for an ordinary topic model would choose this value of z.
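Inference by Block Gibbs Sampling | For reference, the conditional that a Gibbs step for an ordinary topic model draws from can be sketched as below (the counts and hyperparameters are made-up; this is the standard collapsed-LDA form, not the paper's block sampler):

```python
def gibbs_topic_probs(word, doc_topic_counts, topic_word_counts,
                      topic_totals, alpha, beta, vocab_size):
    """Conditional p(z = k | rest) for one word token under collapsed
    Gibbs sampling in a standard LDA-style topic model:
        p(z = k) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
    Counts should already exclude the token being resampled."""
    weights = [(doc_topic_counts[k] + alpha)
               * (topic_word_counts[k].get(word, 0) + beta)
               / (topic_totals[k] + vocab_size * beta)
               for k in range(len(topic_totals))]
    s = sum(weights)
    return [w / s for w in weights]

# Toy counts for a 2-topic model (hypothetical numbers).
probs = gibbs_topic_probs("entity",
                          doc_topic_counts=[3, 1],
                          topic_word_counts=[{"entity": 5}, {"entity": 1}],
                          topic_totals=[20, 20],
                          alpha=0.1, beta=0.01, vocab_size=100)
print(probs)
```

The topic that both dominates the document and has generated the word more often receives most of the conditional mass.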
Introduction | Our novel approach features: §4.1 A topical model of which entities from previ- |
Background and Related Work | (2013a) used topic models to combine type-level predicate inference rules with token-level information from their arguments in a specific context. |
Our Proposal: A Latent LC Approach | We further incorporate features based on a Latent Dirichlet Allocation (LDA) topic model (Blei et al., 2003). |
Our Proposal: A Latent LC Approach | Several recent works have underscored the usefulness of using topic models to model a predicate’s selectional preferences (Ritter et al., 2010; Dinu and Lapata, 2010; Seaghdha, 2010; Lewis and Steedman, 2013; Melamud et al., 2013a). |
Bayesian MT Decipherment via Hash Sampling | Firstly, we would like to include as many features as possible to represent the source/target words in our framework besides simple bag-of-words context similarity (for example, left-context, right-context, and other general-purpose features based on topic models, etc.).
Feature-based representation for Source and Target | Similarly, we can add other features based on topic models, orthography (Haghighi et al., 2008), temporal (Klementiev et al., 2012), etc.
Feature-based representation for Source and Target | We note that the new sampling framework is easily extensible to many additional feature types (for example, monolingual topic model features, etc.).
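Feature-based representation for Source and Target | A minimal sketch of such a feature-based word representation, concatenating bag-of-context counts with topic-model proportions (feature names and numbers are illustrative, not from the paper):

```python
def word_feature_vector(context_counts, topic_probs, context_vocab):
    """Concatenate bag-of-context counts with topic-model proportions
    into a single feature vector for a source/target word; real systems
    would append further feature types (orthographic, temporal, ...)."""
    context_part = [context_counts.get(c, 0) for c in context_vocab]
    return context_part + list(topic_probs)

vec = word_feature_vector({"left:the": 2, "right:runs": 1},
                          [0.7, 0.2, 0.1],
                          ["left:the", "left:a", "right:runs"])
print(vec)  # → [2, 0, 1, 0.7, 0.2, 0.1]
```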