BBC News Database | Specifically, we use Latent Dirichlet Allocation (LDA) as our topic model (Blei et al., 2003).
BBC News Database | LDA |
BBC News Database | Given a collection of documents and a set of latent variables (i.e., the number of topics), the LDA model estimates the probability of topics per document and the probability of words per topic. |
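The estimation described above can be illustrated with a minimal sketch. It assumes collapsed Gibbs sampling, which is one common inference method and not necessarily the one used in the quoted paper; the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy corpus are all hypothetical.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative sketch only).

    docs is a list of token lists. Returns (theta, phi, vocab), where
    theta[d][k] estimates P(topic k | document d) and
    phi[k][v] estimates P(word vocab[v] | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed point estimates of the two distributions.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + V * beta)
         for v in range(V)]
        for k in range(num_topics)
    ]
    return theta, phi, vocab
```

Calling `fit_lda(docs, num_topics=2)` on a small tokenized corpus returns one topic distribution per document and one word distribution per topic; each row sums to one.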
Related Work | More sophisticated graphical models (Blei and Jordan, 2003) have also been employed, including Gaussian Mixture Models (GMM) and Latent Dirichlet Allocation (LDA).
Experiments | It is difficult to compare our model to other unsupervised systems such as MG-LDA or LDA.
Related Work | Recently, Blei and McAuliffe (2008) proposed an approach for joint sentiment and topic modeling that can be viewed as a supervised LDA (sLDA) model that tries to infer topics appropriate for use in a given classification or regression problem. |
The Model | 2.1 Multi-Grain LDA |
The Model | The Multi-Grain Latent Dirichlet Allocation model (MG-LDA) is an extension of Latent Dirichlet Allocation (LDA) (Blei et al., 2003).
The Model | As demonstrated in Titov and McDonald (2008), the topics produced by LDA do not correspond to ratable aspects of entities.
Model Description | Our analysis of the document text is based on probabilistic topic models such as LDA (Blei et al., 2003). |
Model Description | In the LDA framework, each word is generated from a language model that is indexed by the word’s topic assignment. |
Model Description | Thus, rather than identifying a single topic for a document, LDA identifies a distribution over topics. |
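The generative story in the lines above can be written compactly as follows. The notation is the conventional one and is an assumption here, not taken from the quoted paper: $\theta_d$ is the topic distribution of document $d$, $z_{d,n}$ is the topic assignment of its $n$-th word, $\phi_k$ is the language model (word distribution) of topic $k$, and $\alpha$ is a Dirichlet prior.

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha)
  && \text{topic distribution of document } d \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d)
  && \text{topic assignment of the } n\text{-th word} \\
w_{d,n} \mid z_{d,n} &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
  && \text{word drawn from the language model of that topic}
\end{align*}
```

Because $\theta_d$ is a full distribution, a document is described by a mixture of topics, not by a single topic label.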
Related Work | This approach is inspired by methods in the topic modeling literature, such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), where topics are treated as hidden variables that govern the distribution of words in a text.
Linguistic Mapping | In this work we closely follow the Author-Topic (AT) model (Steyvers et al., 2004), which is a generalization of Latent Dirichlet Allocation (LDA) (Blei et al., 2003).
Linguistic Mapping | LDA is a technique that was developed to model the distribution of topics discussed in a large corpus of documents. |
Linguistic Mapping | The AT model generalizes LDA by making the mixture of topics depend not on the document itself but on the authors who wrote it.
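The difference from LDA can be sketched as a generative procedure: in the standard Author-Topic formulation, each word first picks one of the document's authors uniformly at random, then a topic from that author's topic mixture, then a word from that topic. The code below is an illustrative sketch under that assumption; the function `generate_document` and the dictionary-based parameter layout are hypothetical and not taken from the quoted paper.

```python
import random


def generate_document(authors, author_topics, topic_words, length, seed=0):
    """Sample words from an Author-Topic style generative process.

    authors:       list of author names for this document
    author_topics: author -> {topic: probability}
    topic_words:   topic  -> {word: probability}
    """
    rng = random.Random(seed)
    words = []
    for _ in range(length):
        # Each word is attributed to one author, chosen uniformly.
        author = rng.choice(authors)
        # The topic mixture belongs to the author, not to the document.
        topics, topic_probs = zip(*author_topics[author].items())
        topic = rng.choices(topics, weights=topic_probs)[0]
        # The word is drawn from the chosen topic's word distribution.
        vocab, word_probs = zip(*topic_words[topic].items())
        words.append(rng.choices(vocab, weights=word_probs)[0])
    return words
```

With a single author per document and a distinct "author" for every document, this procedure reduces to the LDA generative process, which is the sense in which the AT model generalizes LDA.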