BBC News Database | Specifically, we use Latent Dirichlet Allocation ( LDA ) as our topic model (Blei et al., 2003). |
BBC News Database | LDA |
BBC News Database | Given a collection of documents and a set of latent variables (i.e., the number of topics), the LDA model estimates the probability of topics per document and the probability of words per topic. |
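A minimal sketch of the two quantities this extract describes, p(topic | document) and p(word | topic), recovered from a single made-up topic-assignment state with Dirichlet smoothing. The counts, vocabulary, and hyperparameters are all illustrative, not from any of the cited papers:

```python
# Sketch: reading off LDA's two outputs from topic-assignment counts.
# All data below is invented for illustration.
from collections import Counter

K = 2                       # number of topics (the latent-variable count fixed in advance)
alpha, beta = 0.1, 0.01     # Dirichlet smoothing hyperparameters
vocab = ["goal", "match", "bank", "loan"]
# (doc_id, word, topic) assignments, e.g. from one Gibbs-sampling state
assignments = [(0, "goal", 0), (0, "match", 0), (0, "loan", 1),
               (1, "bank", 1), (1, "loan", 1), (1, "match", 0)]

def doc_topic(d):
    """p(topic | document d), smoothed by alpha."""
    c = Counter(z for (doc, _, z) in assignments if doc == d)
    n = sum(c.values())
    return [(c[k] + alpha) / (n + K * alpha) for k in range(K)]

def topic_word(k):
    """p(word | topic k), smoothed by beta."""
    c = Counter(w for (_, w, z) in assignments if z == k)
    n = sum(c.values())
    return {w: (c[w] + beta) / (n + len(vocab) * beta) for w in vocab}

theta0 = doc_topic(0)   # p(topic | doc 0): mostly topic 0
phi1 = topic_word(1)    # p(word | topic 1): mass on "bank"/"loan"
```

Off-the-shelf LDA tools produce these same two distributions after inference; only the estimation of the assignments differs.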
Related Work | More sophisticated graphical models (Blei and Jordan, 2003) have also been employed including Gaussian Mixture Models (GMM) and Latent Dirichlet Allocation ( LDA ). |
Abstract | Our approach employs a monolingual LDA topic model to derive a similarity measure between the test conversation and the set of training conversations, which is used to bias translation choices towards the current context. |
Corpus Data and Baseline SMT | We use the DARPA TransTac English-Iraqi parallel two-way spoken dialogue collection to train both translation and LDA topic models. |
Corpus Data and Baseline SMT | We use the English side of these conversations for training LDA topic models. |
Incremental Topic-Based Adaptation | 4.1 Topic modeling with LDA |
Incremental Topic-Based Adaptation | We use latent Dirichlet allocation, or LDA , (Blei et al., 2003) to obtain a topic distribution over conversations. |
Incremental Topic-Based Adaptation | For each conversation d_i in the training collection (1,600 conversations), LDA infers a topic distribution Q_{d_i} = p(z_k|d_i) for all latent topics z_k ∈ {1, ..., K}, where K is the number of topics.
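The similarity step these extracts describe, comparing a test conversation's inferred topic distribution against each training conversation's, can be sketched as follows. The three-topic distributions and the choice of cosine similarity are illustrative assumptions, not the papers' exact measure:

```python
# Sketch: pick the training conversation whose topic distribution is
# most similar to the test conversation's. Toy distributions throughout.
import math

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))

# Q_{d_i} = p(z_k | d_i) for two hypothetical training conversations
train = {"conv_1": [0.7, 0.2, 0.1], "conv_2": [0.1, 0.1, 0.8]}
test_q = [0.6, 0.3, 0.1]    # inferred distribution for the test conversation

scores = {d: cosine(test_q, q) for d, q in train.items()}
best = max(scores, key=scores.get)   # most similar training conversation
```

The resulting scores are what would bias translation choices toward the current context.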
Introduction | We begin by building a monolingual latent Dirichlet allocation (LDA) topic model on the training conversations (each conversation corresponds to a “document” in the LDA paradigm). |
Relation to Prior Work | (2012), who both describe adaptation techniques where monolingual LDA topic models are used to obtain a topic distribution over the training data, followed by dynamic adaptation of the phrase table based on the inferred topic of the test document. |
Relation to Prior Work | While our proposed approach also employs monolingual LDA topic models, it deviates from the above methods in the following important ways. |
Background: Topic Model | Both Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) and Probabilistic Latent Semantic Analysis (PLSA) (Hofmann, 1999) are types of topic models. |
Background: Topic Model | LDA is the most common topic model currently in use, therefore we exploit it for mining topics in this paper. |
Background: Topic Model | Here, we first give a brief description of LDA . |
Estimation | Unlike the document-topic distribution, which can be directly learned by LDA tools, the rule-topic distribution must be estimated to meet our requirements.
Estimation | distribution of every document inferred by the LDA tool.
Estimation | The topic assignments are output by the LDA tool.
Topic Similarity Model | The k-th dimension P(z = k|d) gives the probability of topic k given document d. Unlike the rule-topic distribution, the document-topic distribution can be directly inferred by an off-the-shelf LDA tool.
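One natural way to estimate a rule-topic distribution from document-topic distributions, sketched with invented counts: weight each document's p(z|d) by how often the rule was extracted from it, then normalize. The weighting scheme is an assumption for illustration, not necessarily the paper's exact estimator:

```python
# Sketch: rule-topic distribution p(z|r) from document-topic distributions.
# Counts and distributions are made up for illustration.
doc_topic = {"d1": [0.8, 0.2], "d2": [0.3, 0.7]}   # p(z|d) from an LDA tool
rule_counts = {"d1": 3, "d2": 1}                    # occurrences of rule r per document

K = 2
weighted = [sum(rule_counts[d] * doc_topic[d][k] for d in doc_topic)
            for k in range(K)]
total = sum(weighted)
rule_topic = [w / total for w in weighted]          # p(z|r), sums to 1
```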
Abstract | The latent topic distribution estimated by Latent Dirichlet Allocation ( LDA ) is used to represent each text block. |
Abstract | We evaluate two approaches employing LDA and probabilistic latent semantic analysis (PLSA) distributions respectively. |
Introduction | To deal with this issue, Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) has been proposed. |
Introduction | LDA has been proved to be effective in many segmentation tasks (Arora and Ravindran, 2008; Hall et al., 2008; Sun et al., 2008; Riedl and Biemann, 2012; Chien and Chueh, 2012). |
Our Proposed Approach | In this paper, we propose to apply LE on the LDA topic distributions, each of which is estimated from a text block. |
Our Proposed Approach | 2.1 Latent Dirichlet Allocation Latent Dirichlet allocation ( LDA ) (Blei et al., 2003) is a generative probabilistic model of a corpus. |
Our Proposed Approach | In LDA , given a corpus D = {d1, d2, ...
Background and Model Setting | Several more recent works utilize a Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) framework. |
Background and Model Setting | We note that a similar LDA model construction was employed also in (Séaghdha, 2010), for estimating predicate-argument likelihood. |
Background and Model Setting | First, an LDA model is constructed, as follows. |
Introduction | allocation ( LDA ) model.
Introduction | Rather than computing a single context-insensitive rule score, we compute a distinct word-level similarity score for each topic in an LDA model. |
Two-level Context-sensitive Inference | Based on all pseudo-documents we learn an LDA model and obtain its associated probability distributions. |
Two-level Context-sensitive Inference | At learning time, we compute for each candidate rule a separate, topic-biased, similarity score per each of the topics in the LDA model. |
Experiments and Results | The performance of WTMF on CDR is compared with (a) an Information Retrieval model (IR) that is based on surface word matching, (b) an n-gram model (N-gram) that captures phrase overlaps by returning the number of overlapping n-grams as the similarity score of two sentences, (c) LSA that uses the svds() function in Matlab, and (d) LDA that uses Gibbs Sampling for inference (Griffiths and Steyvers, 2004).
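The n-gram baseline's similarity score described here can be sketched in a few lines; the use of bigrams and the example sentences are arbitrary choices for illustration:

```python
# Sketch: sentence similarity as the number of overlapping n-grams
# (shown for bigrams). Example sentences are invented.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_score(s1, s2, n=2):
    """Similarity = count of shared n-grams between the two sentences."""
    return len(ngrams(s1.split(), n) & ngrams(s2.split(), n))

score = overlap_score("the cat sat on the mat", "a cat sat on a mat")
# shared bigrams: ("cat","sat") and ("sat","on")
```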
Experiments and Results | WTMF is also compared with all existing reported SS results on the LI06 and MSR04 data sets, as well as LDA that is trained on the same data as WTMF.
Experiments and Results | To eliminate randomness in statistical models (WTMF and LDA ), all the reported results are averaged over 10 runs. |
Introduction | Latent variable models, such as Latent Semantic Analysis [LSA] (Landauer et al., 1998), Probabilistic Latent Semantic Analysis [PLSA] (Hofmann, 1999), Latent Dirichlet Allocation [ LDA ] (Blei et al., 2003) can solve the two issues naturally by modeling the semantics of words and sentences simultaneously in the low-dimensional latent space. |
Limitations of Topic Models and LSA for Modeling Sentences | Therefore, PLSA finds a topic distribution for each concept definition that maximizes the log likelihood of the corpus X ( LDA has a similar form): |
Abstract | Our experiments on a large Twitter dataset show that there are more meaningful and unique bursty topics in the top-ranked results returned by our model than an LDA baseline and two degenerate variations of our model. |
Experiments | of tweets (or words in the case of the LDA model) assigned to the topics and take the top-30 bursty topics from each model.
Experiments | In the case of the LDA model, only 23 bursty topics were detected.
Introduction | To discover topics, we can certainly apply standard topic models such as LDA (Blei et al., 2003), but with standard LDA temporal information is lost during topic discovery. |
Introduction | We find that compared with bursty topics discovered by standard LDA and by two degenerate variations of our model, bursty topics discovered by our model are more accurate and less redundant within the top-ranked results. |
Method | In standard LDA , a document contains a mixture of topics, represented by a topic distribution, and each word has a hidden topic label. |
Method | We also consider a standard LDA model in our experiments, where each word is associated with a hidden topic. |
Method | Just like standard LDA , our topic model itself finds a set of topics represented by φ_c but does not directly generate bursty topics.
Experiments | Latent Dirichlet Allocation ( LDA ; Blei et al., 2003) We use the method described in section 2 for inducing word representations from the topic matrix. |
Experiments | To train the 50-topic LDA model we use code released by Blei et al.
Experiments | We use the same 5,000 term vocabulary for LDA as is used for training word vector models. |
Our Model | This component does not require labeled data, and shares its foundation with probabilistic topic models such as LDA . |
Our Model | Equation 1 resembles the probabilistic model of LDA (Blei et al., 2003), which models documents as mixtures of latent topics. |
Our Model | Because of the log-linear formulation of the conditional distribution, θ is a vector in R^β and not restricted to the unit simplex as it is in LDA.
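The contrast being drawn can be sketched concretely: LDA's θ lies on the probability simplex, while a log-linear model's θ is an unconstrained real vector that only induces a distribution through normalization. Dimensions and values below are toy:

```python
# Sketch: an unconstrained log-linear parameter vector versus its
# softmax-normalized image on the unit simplex.
import math

theta_loglinear = [2.0, -1.0, 0.5]     # unconstrained real vector (toy values)

def softmax(v):
    m = max(v)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]

on_simplex = softmax(theta_loglinear)   # nonnegative entries summing to 1,
                                        # like an LDA topic proportion
```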
Related work | Latent Dirichlet Allocation ( LDA ; (Blei et al., 2003)) is a probabilistic document model that assumes each document is a mixture of latent topics. |
Related work | However, because the emphasis in LDA is on modeling topics, not word meanings, there is no guarantee that the row (word) vectors are sensible as points in a k-dimensional space. |
Related work | Indeed, we show in section 4 that using LDA in this way does not deliver robust word vectors. |
A Gibbs Sampling Algorithm | LDA (Griffiths and Steyvers, 2004). |
A Gibbs Sampling Algorithm | where C^¬n indicates that term n is excluded from the corresponding document or topic, and Λ^¬n is the discriminant function value without word n. We can see that the first term is from the LDA model for observed word counts and the second term is from
Experiments | We compare the generalized logistic supervised LDA using Gibbs sampling (denoted by gSLDA) with various competitors, including the standard sLDA using variational mean-field methods (denoted by vSLDA) (Wang et al., 2009), the MedLDA model using variational mean-field methods (denoted by vMedLDA) (Zhu et al., 2012), and the MedLDA model using collapsed Gibbs sampling algorithms (denoted by gMedLDA) (Jiang et al., 2012). |
Introduction | As widely adopted in supervised latent Dirichlet allocation (sLDA) models (Blei and McAuliffe, 2010; Wang et al., 2009), one way to improve the predictive power of LDA is to define a likelihood model for the widely available document-level response variables, in addition to the likelihood model for document words. |
Introduction | Though powerful, one issue that could limit the use of existing logistic supervised LDA models is that they treat the document-level response variable as one additional word via a normalized likelihood model. |
Introduction | For Bayesian LDA models, we can also explore the conjugacy of the Dirichlet-Multinomial prior-likelihood pairs to collapse out the Dirichlet variables (i.e., topics and mixing proportions) to do collapsed Gibbs sampling, which can have better mixing rates (Griffiths and Steyvers, 2004). |
Logistic Supervised Topic Models | A logistic supervised topic model consists of two parts: an LDA model (Blei et al., 2003) for describing the words W = {w_d}_{d=1}^{D}, where w_d = {w_dn}_{n=1}^{N_d} denotes the words within document d, and a logistic classifier for considering the supervising signal y = {y_d}_{d=1}^{D}.
Logistic Supervised Topic Models | LDA: LDA is a hierarchical Bayesian model that posits each document as an admixture of K topics, where each topic Φ_k is a multinomial distribution over a V-word vocabulary.
Logistic Supervised Topic Models | For fully-Bayesian LDA , the topics are random samples from a Dirichlet prior, Φ_k ∼ Dir(β).
Abstract | In this work, we develop a framework for allowing users to iteratively refine the topics discovered by models such as latent Dirichlet allocation ( LDA ) by adding constraints that enforce that sets of words must appear together in the same topic. |
Constraints Shape Topics | As discussed above, LDA views topics as distributions over words, and each document expresses an admixture of these topics. |
Constraints Shape Topics | For “vanilla” LDA (no constraints), these are symmetric Dirichlet distributions. |
Constraints Shape Topics | Because LDA assumes a document's tokens are interchangeable, it treats the document as a bag-of-words, ignoring potential relations between words.
Introduction | Probabilistic topic models, as exemplified by probabilistic latent semantic indexing (Hofmann, 1999) and latent Dirichlet allocation ( LDA ) (Blei et al., 2003) are unsupervised statistical techniques to discover the thematic topics that permeate a large corpus of text documents. |
Putting Knowledge in Topic Models | At a high level, topic models such as LDA take as input a number of topics K and a corpus. |
Putting Knowledge in Topic Models | In LDA both of these outputs are multinomial distributions; typically they are presented to users in summary form by listing the elements with highest probability. |
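The summary-form presentation mentioned here, listing each topic's highest-probability elements, can be sketched directly; the topic-word probabilities are invented for illustration:

```python
# Sketch: summarize each topic by its top-k highest-probability words.
# Topic-word distributions below are toy values.
def top_words(topic_word_dist, k=3):
    return [w for w, _ in sorted(topic_word_dist.items(),
                                 key=lambda kv: kv[1], reverse=True)[:k]]

topics = [
    {"game": 0.30, "team": 0.25, "season": 0.20, "bank": 0.01},
    {"bank": 0.35, "loan": 0.30, "rate": 0.15, "team": 0.02},
]
summaries = [top_words(t) for t in topics]
# topic 0 summarized as its sports words, topic 1 as its finance words
```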
Introduction | was extended from the latent Dirichlet allocation ( LDA ) model (?) |
Joint Sentiment-Topic (JST) Model | It is worth pointing out that the JST model with a single topic becomes the standard LDA model with only three sentiment topics.
Joint Sentiment-Topic (JST) Model | that the JST model with word polarity priors incorporated performs significantly better than the LDA model without incorporating such prior information.
Joint Sentiment-Topic (JST) Model | For comparison purposes, we also run the LDA model and augmented the BOW features with the
AKL: Using the Learned Knowledge | To compute this distribution, instead of considering how well z_i matches with w_i only (as in LDA ), we also consider two other factors:
Experiments | This section evaluates and compares the proposed AKL model with three baseline models: LDA , MC-LDA, and GK-LDA.
Introduction | Traditional topic models such as LDA (Blei et al., 2003) and pLSA (Hofmann, 1999) are unsupervised methods for extracting latent topics in text documents. |
Introduction | We thus propose to first use LDA to learn topics/aspects from each individual domain and then discover the shared aspects (or topics) and aspect terms among a subset of domains. |
Introduction | We propose a method to solve this problem, which also results in a new topic model, called AKL (Automated Knowledge LDA ), whose inference can exploit the automatically learned prior knowledge and handle the issues of incorrect knowledge to produce superior aspects. |
Learning Quality Knowledge | This section details Step 1 in the overall algorithm, which has three sub-steps: running LDA (or AKL) on each domain corpus, clustering the resulting topics, and mining frequent patterns from the topics in each cluster. |
Learning Quality Knowledge | Since running LDA is simple, we will not discuss it further. |
Learning Quality Knowledge | After running LDA (or AKL) on each domain corpus, a set of topics is obtained. |
Overall Algorithm | Lines 3 and 5 run LDA on each review domain corpus D_i ∈ D_L to generate a set of aspects/topics A_i (lines 2, 4, and 6-9 will be discussed below).
Overall Algorithm | Scalability: the proposed algorithm is naturally scalable as both LDA and AKL run on each domain independently. |
Abstract | In this paper, we propose a weakly supervised algorithm in which supervision comes in the form of labeling of Latent Dirichlet Allocation ( LDA ) topics. |
Background 3.1 LDA | LDA is an unsupervised probabilistic generative model for collections of discrete data such as text documents. |
Background 3.1 LDA | The generative process of LDA can be described as follows: |
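A runnable sketch of this generative process, using only the standard library, with toy corpus sizes and Dirichlet draws obtained from normalized Gamma variates (all sizes and hyperparameters are illustrative):

```python
# Sketch of LDA's generative process:
#   for each topic k: draw phi_k ~ Dirichlet(beta)   (topic-word dist.)
#   for each document: draw theta ~ Dirichlet(alpha) (doc-topic dist.)
#     for each word: draw topic z ~ theta, then word w ~ phi_z
import random

random.seed(0)
K, V, alpha, beta = 3, 8, 0.5, 0.1   # toy sizes and hyperparameters

def dirichlet(conc, dim):
    """Symmetric Dirichlet sample via normalized Gamma variates."""
    g = [random.gammavariate(conc, 1.0) for _ in range(dim)]
    s = sum(g)
    return [x / s for x in g]

def categorical(p):
    r, acc = random.random(), 0.0
    for i, pi in enumerate(p):
        acc += pi
        if r < acc:
            return i
    return len(p) - 1

phi = [dirichlet(beta, V) for _ in range(K)]   # topic-word distributions

def generate_document(n_words):
    theta = dirichlet(alpha, K)                # document-topic distribution
    doc = []
    for _ in range(n_words):
        z = categorical(theta)                 # choose a topic
        doc.append(categorical(phi[z]))        # choose a word from that topic
    return doc

doc = generate_document(20)                    # one synthetic document (word ids)
```

Posterior inference then runs this story in reverse: given only the words, recover plausible θ, φ, and topic assignments.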
Background 3.1 LDA | The key problem in LDA is posterior inference. |
Introduction | In this paper, we propose a text classification algorithm based on Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) which does not need labeled documents. |
Introduction | LDA is an unsupervised probabilistic topic model and it is widely used to discover latent semantic structure of a document collection by modeling words in the documents. |
Introduction | (Blei et al., 2003) used LDA topics as features in text classification, but they use labeled documents while learning a classifier. |
Related Work | As LDA topics are semantically more meaningful than individual words and can be acquired easily, our approach overcomes limitations of the semi-supervised methods discussed above. |
Related work | These include the Latent Dirichlet Allocation ( LDA ) model of Blei et al. |
Related work | (2007) integrate a model of random walks on the WordNet graph into an LDA topic model to build an unsupervised word sense disambiguation system. |
Related work | and Lapata (2009) adapt the basic LDA model for application to unsupervised word sense induction; in this context, the topics learned by the model are assumed to correspond to distinct senses of a particular lemma. |
Three selectional preference models | As noted above, LDA was originally introduced to model sets of documents in terms of topics, or clusters of terms, that they share in varying proportions. |
Three selectional preference models | The high-level “generative story” for the LDA selectional preference model is as follows: |
Three selectional preference models | (2009) for LDA . |
Integrating Semantic Constraint into Surprisal | The factor A(wn, h) is essentially based on a comparison between the vector representing the current word wn and the vector representing the prior history h. Varying the method for constructing word vectors (e.g., using LDA or a simpler semantic space model) and for combining them into a representation of the prior context h (e.g., using additive or multiplicative functions) produces distinct models of semantic composition.
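The two composition functions being varied can be sketched as follows. The toy vectors and the use of cosine for the word-history comparison A(wn, h) are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch: additive vs. multiplicative composition of context word
# vectors into a history vector h, compared to the current word's vector.
import math

def additive(vectors):
    return [sum(d) for d in zip(*vectors)]

def multiplicative(vectors):
    out = vectors[0][:]
    for v in vectors[1:]:
        out = [a * b for a, b in zip(out, v)]
    return out

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

history = [[0.2, 0.7, 0.1], [0.3, 0.6, 0.1]]   # toy vectors for prior words
w_n = [0.25, 0.65, 0.10]                        # toy vector for the current word
sim_add = cosine(additive(history), w_n)
sim_mul = cosine(multiplicative(history), w_n)
```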
Method | We also trained the LDA model on BLLIP, using the Gibbs sampling procedure discussed in Griffiths et al.
Models of Processing Difficulty | LDA is a probabilistic topic model offering an alternative to spatial semantic representations. |
Models of Processing Difficulty | Whereas in LSA words are represented as points in a multidimensional space, LDA represents words using topics. |
Results | SSS: Additive −.03820***, Multiplicative −.00895***; LDA: Additive −.02500
Results | Table 2: Coefficients of LME models including simple semantic space (SSS) or Latent Dirichlet Allocation ( LDA ) as factors; ***p < .001 |
Results | Besides, replicating Pynte et al.’s (2008) finding, we were also interested in assessing whether the underlying semantic representation (simple semantic space or LDA ) and composition function (additive versus multiplicative) modulate reading times differentially. |
Discussion | The first example shows both LDA and ptLDA improve the baseline. |
Experiments | Topic Models Configuration We compare our polylingual tree-based topic model (ptLDA) against tree-based topic models (tLDA), polylingual topic models (pLDA) and vanilla topic models ( LDA ).3 We also examine different inference algorithms—Gibbs sampling (gibbs), variational inference (variational) and hybrid approach (variational-hybrid)—on the effects of SMT performance. |
Experiments | We refer to the SMT model without domain adaptation as baseline.5 LDA marginally improves machine translation (less than half a BLEU point). |
Experiments | 3For Gibbs sampling, we use implementations available in Hu and Boyd-Graber (2012) for tLDA; and Mallet (McCallum, 2002) for LDA and pLDA. |
Inference | p(z_dn = k, y_dn = s | z_¬dn, y_¬dn, α, β) ∝ 1[ω_s = w_dn] · (N_k|d + α) / Σ_k′ (N_k′|d + α) · Π_(i→j)∈s (N_i→j|k + β_i→j) / Σ_j′ (N_i→j′|k + β_i→j′)
Introduction | Probabilistic topic models (Blei and Lafferty, 2009), exemplified by latent Dirichlet allocation (Blei et al., 2003, LDA ), are one of the most popular statistical frameworks for navigating large unannotated document collections. |
Polylingual Tree-based Topic Models | Generative Process As in LDA , each word token is associated with a topic. |
Polylingual Tree-based Topic Models | With these correlated in topics in hand, the generation of documents are very similar to LDA . |
Topic Models for Machine Translation | While vanilla topic models ( LDA ) can only be applied to monolingual data, there are a number of topic models for parallel corpora: Zhao and Xing (2006) assume aligned word pairs share same topics; Mimno et al. |
Abstract | Latent Dirichlet Allocation ( LDA ) models are used as “topic models” to produce a low-dimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees. |
Abstract | The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well. |
Abstract | Exploiting the close relationship between LDA and PCFGs just described, we propose two novel probabilistic models that combine insights from LDA and AG models. |
Introduction | Specifically, we show that an LDA model can be expressed as a certain kind of PCFG, |
Introduction | so Bayesian inference for PCFGs can be used to learn LDA topic models as well. |
Introduction | The importance of this observation is primarily theoretical, as current Bayesian inference algorithms for PCFGs are less efficient than those for LDA inference. |
Abstract | We present Code-Switched LDA (csLDA), which infers language specific topic distributions based on code-switched documents to facilitate multilingual corpus analysis. |
Abstract | We experiment on two code-switching corpora (English-Spanish Twitter data and English-Chinese Weibo data) and show that csLDA improves perplexity over LDA , and learns semantically coherent aligned topics as judged by human annotators.
Code-Switching | We call the resulting model Code-Switched LDA (csLDA). |
Code-Switching | 3.1 Inference Inference for csLDA follows directly from LDA . |
Code-Switching | Instead, we constructed a baseline from LDA run on the entire dataset (no |
Experiments | LDA: Our approach with LDA as the topic model. |
Experiments | The implementation of LDA is based on Blei's code of variational EM for LDA.
Experiments | Table 4 shows the average query processing time and result quality of the LDA approach, by varying the frequency threshold h. Similar results are observed for the pLSI approach.
Our Approach | And at the same time, one document could be related to multiple topics in some topic models (e.g., pLSI and LDA ). |
Our Approach | Here we use LDA as an example to |
Our Approach | According to the assumption of LDA and our concept mapping in Table 3, a RASC (“document”) is viewed as a mixture of hidden semantic classes (“topics”). |
Topic Models | LDA (Blei et al., 2003): In LDA , the topic mixture is drawn from a conjugate Dirichlet prior that remains the same for all documents (Figure 1).
Topic Models | Figure 1. Graphical model representation of LDA , from Blei et al.
Hierarchical Topic Models 3.1 Latent Dirichlet Allocation | The underlying mechanism for our annotation procedure is LDA (Blei et al., 2003b), a fully Bayesian extension of probabilistic Latent Semantic Analysis (Hofmann, 1999). |
Hierarchical Topic Models 3.1 Latent Dirichlet Allocation | Given D labeled attribute sets w_d, d ∈ D, LDA infers an unstructured set of T latent annotated concepts over which attribute sets decompose as mixtures.2 The latent annotated concepts represent semantically coherent groups of attributes expressed in the data, as shown in Example 1.
Hierarchical Topic Models 3.1 Latent Dirichlet Allocation | The generative model for LDA is given by |
Introduction | In this paper, we show that both of these goals can be realized jointly using a probabilistic topic model, namely hierarchical Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003b). |
Introduction | There are three main advantages to using a topic model as the annotation procedure: (1) Unlike hierarchical clustering (Duda et al., 2000), the attribute distribution at a concept node is not composed of the distributions of its children; attributes found specific to the concept Painter would not need to appear in the distribution of attributes for Person, making the internal distributions at each concept more meaningful as attributes specific to that concept; (2) Since LDA is fully Bayesian, its model semantics allow additional prior information to be included, unlike standard models such as Latent Semantic Analysis (Hofmann, 1999), improving annotation precision; (3) Attributes with multiple related meanings (i.e., polysemous attributes) are modeled implicitly: if an attribute (e.g., “style”) occurs in two separate input classes (e.g., poets and car models), then that attribute might attach at two different concepts in the ontology, which is better than attaching it at their most specific common ancestor (Whole) if that ancestor is too general to be useful. |
Introduction | We evaluate three variants: (1) a fixed structure approach where each flat class is attached to WN using a simple string-matching heuristic, and concept nodes are annotated using LDA; (2) an extension of LDA allowing for sense selection in addition to annotation; and (3) an approach employing a nonparametric prior over tree structures capable of inferring arbitrary ontologies.
Ontology Annotation | Figure 2: Graphical models for the LDA variants (LDA, Fixed Structure LDA, nCRP); shaded nodes indicate observed quantities.
Ontology Annotation | We propose a set of Bayesian generative models based on LDA that take as input labeled attribute sets generated using an extraction procedure such as the above and organize the attributes in WN according to their level of generality. |
Learning | For the LDA regularizer, L = R × K. For the Brown cluster regularizer, L = V - 1.
Structured Regularizers for Text | 4.3 LDA Regularizer |
Structured Regularizers for Text | We do this by inferring topics in the training corpus by estimating the latent Dirichlet allocation ( LDA ) model (Blei et al., 2003).
Structured Regularizers for Text | Note that LDA is an unsupervised method, so we can infer topical structures from any collection of documents that are considered related to the target corpus (e.g., training documents, text from the web, etc.).
Experiments | 1(f) plots the perplexity of LDA models with 20 topics learned from Reuters, 20news, Enwiki, Zipfl, and Ziprix versus the size of reduced vocabulary on a log-log graph.
Experiments | Table 2: Computational time and memory size for LDA learning on the original corpus, (1/ 10)-reduced corpus, and (1/20)-reduced corpus of Reuters. |
Experiments | Finally, let us examine the computational costs for LDA learning. |
Perplexity on Reduced Corpora | In this section, we consider the perplexity of the widely used topic model, Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003), by using the notation given in (Griffiths and Steyvers, 2004).
Perplexity on Reduced Corpora | LDA is a probabilistic language model that generates a corpus as a mixture of hidden topics, and it allows us to infer two parameters: the document-topic distribution θ that represents the mixture rate of topics in each document, and the topic-word distribution φ that represents the occurrence rate of words in each topic.
Perplexity on Reduced Corpora | The assumption placed on φ may not be reasonable in the case of θ, because we can easily think of a document with only one topic, and we usually use a small number T of topics for LDA , e.g., T = 20.
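The perplexity being measured can be sketched directly from the two inferred parameter sets, with per-word likelihood p(w|d) = Σ_k θ_dk φ_kw; all values below are toy numbers, not from the cited corpora:

```python
# Sketch: corpus perplexity from inferred theta (doc-topic) and
# phi (topic-word) parameters. Toy values throughout.
import math

theta = [[0.9, 0.1], [0.2, 0.8]]            # document-topic distributions
phi = [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]    # topic-word distributions
docs = [[0, 1, 0], [2, 2, 1]]               # word ids per document

log_lik, n_words = 0.0, 0
for d, doc in enumerate(docs):
    for w in doc:
        # marginal word probability under the mixture of topics
        p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
        log_lik += math.log(p)
        n_words += 1

perplexity = math.exp(-log_lik / n_words)   # lower is better
```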
Bilingual LDA Model | 2.1 Standard LDA |
Bilingual LDA Model | The LDA model (Blei et al., 2003) represents the latent topic distribution of a document by a Dirichlet distribution with a K-dimensional implicit random variable, which is transformed into a complete generative model when a Dirichlet prior β is exerted on the topic-word distributions (Griffiths et al., 2004) (shown in Fig.
Bilingual LDA Model | Figure 1: Standard LDA model |
Building comparable corpora | Based on the bilingual LDA model, building comparable corpora includes several steps to |
Introduction | Based on Bilingual LDA Model |
Introduction | The paper concretely includes: 1) Introduce the Bilingual LDA (Latent Dirichlet Allocation) model which builds comparable corpora and improves the efficiency of matching similar documents; 2) Design a novel method of TFIDF (Topic Frequency-Inverse Document Frequency) to enhance the distinguishing ability of topics from different documents; 3) Propose a tailored |
Experiments | Since MTR provides a mixture of properties adapted from earlier models, we present performance benchmarks on tag clustering using: (i) LDA; (ii) the Hidden Markov Topic Model, HMTM (Gruber et al., 2005); and (iii) w-LDA (Petterson et al., 2010), which uses word features as priors in LDA .
Experiments | [Figure: tag-clustering results for LDA, w-LDA, HMTM, and MTR]
Experiments | models: LDA , HMTM, w-LDA.
Markov Topic Regression - MTR | LDA assumes that the latent topics of documents are sampled independently from one of K topics. |
Related Work and Motivation | Standard topic models, such as Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003), use a bag-of-words approach, which disregards word order and clusters words together that appear in a similar global context. |
Related Work and Motivation | In LDA , common words tend to dominate all topics causing related words to end up in different topics. |
Related Work and Motivation | In (Petterson et al., 2010), the vector-based features of words are used as prior information in LDA so that the words that are synonyms end up in same topic. |
Abstract | In this paper, we propose a novel Emotion-aware LDA (EaLDA) model to build a domain-specific lexicon for predefined emotions that include anger, disgust, fear, joy, sadness, and surprise.
Algorithm | In this section, we rigorously define the emotion-aware LDA model and its learning algorithm. |
Algorithm | Like the standard LDA model, EaLDA is a generative model. |
Algorithm | The generative process of word distributions for non-emotion topics follows the standard LDA definition with a scalar hyperparameter β.
Conclusions and Future Work | In this paper, we have presented a novel emotion-aware LDA model that is able to quickly build a fine-grained domain-specific emotion lexicon for languages without many manually constructed resources. |
Conclusions and Future Work | The proposed EaLDA model extends the standard LDA model by accepting a set of domain-independent emotion words as prior knowledge, and guiding to group semantically related words into the same emotion category. |
Introduction | The proposed EaLDA model extends the standard Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) model by employing a small set of seeds to guide the model generating topics. |
Related Work | Our approach relates most closely to the method proposed by Xie and Li (2012) for the construction of lexicon annotated for polarity based on LDA model. |
Introduction | Latent Semantic Indexing (Deerwester et al., 1990) (LSI), probabilistic Latent Semantic Analysis (Hofmann, 2001) (pLSA) and Latent Dirichlet Allocation (Blei et al., 2003) ( LDA ) are the most famous approaches that tried to tackle this problem throughout the years. |
Introduction | This is one of the reasons of the intensive use of topic models (and especially LDA ) in current research in Natural Language Processing (NLP) related areas. |
Introduction | The approach by Wei and Croft (2006) was the first to leverage LDA topics to improve the estimate of document language models and achieved good empirical results. |
Topic-Driven Relevance Models | We specifically focus on Latent Dirichlet Allocation ( LDA ), since it is currently one of the most representative. |
Topic-Driven Relevance Models | In LDA , each topic multinomial distribution φ_k is generated by a conjugate Dirichlet prior with parameter β, while each document multinomial distribution θ_d is generated by a conjugate Dirichlet prior with parameter α.
Topic-Driven Relevance Models | TDRM relies on two important parameters: the number of topics K that we want to learn, and the number of feedback documents N from which LDA learns the topics. |
Introduction | Latent variable models, such as latent Dirichlet allocation ( LDA ) (Blei et al., 2003) and probabilistic latent semantic analysis (PLSA) (Hofmann, 1999), have been used in the past to facilitate social science research. |
Introduction | SAGE (Eisenstein et al., 2011a), a recently proposed sparse additive generative model of language, addresses many of the drawbacks of LDA . |
Introduction | Another advantage, from a social science perspective, is that SAGE can be derived from a standard logit random-utility model of judicial opinion writing, in contrast to LDA . |
Related Work | Related research efforts include using the LDA model for topic modeling in historical newspapers (Yang et al., 2011), a rule-based approach to extract verbs in historical Swedish texts (Pettersson and Nivre, 2011), and a system for semantic tagging of historical Dutch archives (Cybulska and Vossen, 2011).
Related Work | (2010) study the effect of the context of interaction in blogs using a standard LDA model. |
The Sparse Mixed-Effects Model | To address the over-parameterization, lack of expressiveness and robustness issues in LDA , the SAGE (Eisenstein et al., 2011a) framework draws a |
The Sparse Mixed-Effects Model | In this SME model, we still have the same Dirichlet prior α, the latent topic proportion θ, and the latent topic variable z as the original LDA model.
The Sparse Mixed-Effects Model | In contrast to traditional multinomial distribution of words in LDA models, we approximate the conditional word distribution in the document d as the |
Experiments | It is difficult to compare our model to other unsupervised systems such as MG-LDA or LDA .
Related Work | Recently, Blei and McAuliffe (2008) proposed an approach for joint sentiment and topic modeling that can be viewed as a supervised LDA (sLDA) model that tries to infer topics appropriate for use in a given classification or regression problem. |
The Model | 2.1 Multi-Grain LDA |
The Model | The Multi-Grain Latent Dirichlet Allocation model (MG-LDA) is an extension of Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003). |
The Model | As demonstrated in Titov and McDonald (2008), the topics produced by LDA do not correspond to ratable aspects of entities.
Background and Motivation | A hierarchical model is particularly more appealing for summarization than a "flat" model, e.g., LDA (Blei et al., 2003b), in that one can discover "abstract" and "specific" topics.
Experiments and Discussions | * HbeSum (Hybrid Flat Summarizer): To investigate the performance of the hierarchical topic model, we build another hybrid model using flat LDA (Blei et al., 2003b).
Experiments and Discussions | In LDA , each sentence is a superposition of all K topics with sentence-specific weights; there is no hierarchical relation between topics.
Experiments and Discussions | Instead of the new tree-based sentence scoring (§ 4), we present a similar method using topics from LDA at the sentence level.
Introduction | We present a probabilistic topic model on sentence level building on hierarchical Latent Dirichlet Allocation (hLDA) (Blei et al., 2003a), which is a generalization of LDA (Blei et al., 2003b). |
Introduction | Unsupervised topic models, such as latent Dirichlet allocation ( LDA ) (Blei et al., 2003) and its variants are characterized by a set of hidden topics, which represent the underlying semantic structure of a document collection. |
Introduction | In particular, our system, called LDA-SP, uses LinkLDA (Erosheva et al., 2004), an extension of LDA that simultaneously models two sets of distributions for each topic. |
Previous Work | Topic models such as LDA (Blei et al., 2003) and its variants have recently begun to see use in many NLP applications such as summarization (Daume III and Marcu, 2006), document alignment and segmentation (Chen et al., 2009), and inferring class-attribute hierarchies (Reisinger and Pasca, 2009). |
Previous Work | Van Durme and Gildea (2009) proposed applying LDA to general knowledge templates extracted using the KNEXT system (Schubert and Tong, 2003). |
Topic Models for Selectional Prefs. | We first describe the straightforward application of LDA to modeling our corpus of extracted relations. |
Topic Models for Selectional Prefs. | In this case two separate LDA models are used to model a1 and a2 independently. |
Topic Models for Selectional Prefs. | Formally, LDA generates each argument in the corpus of relations as follows: |
Experiments | For LDA-θ and LDA-wvec, we run Gibbs Sampling based LDA for 2000 iterations and average the model over the last 10 iterations.
Experiments | For LDA we tune the hyperparameters α (Dirichlet prior for the topic distribution of a document) and β (Dirichlet prior for the word distribution given a topic).
Introduction | WTMF is a state-of-the-art unsupervised model that was tested on two short text similarity datasets, (Li et al., 2006) and (Agirre et al., 2012), where it outperforms Latent Semantic Analysis [LSA] (Landauer et al., 1998) and Latent Dirichlet Allocation [ LDA ] (Blei et al., 2003) by a large margin.
Introduction | We employ it as a strong baseline in this task as it exploits and effectively models the missing words in a tweet, in practice adding thousands more features for the tweet; by contrast, LDA , for example, only leverages the observed words (14 features) to infer the latent vector for a tweet.
Related Work | (2010) also use hashtags to improve the latent representation of tweets in an LDA framework, Labeled-LDA (Ramage et al., 2009), treating each hashtag as a label.
Related Work | Similar to the experiments presented in this paper, the result of using Labeled-LDA alone is worse than the IR model, due to the sparseness in the induced LDA latent vector. |
Related Work | (2011) apply an LDA-based model to clustering by incorporating URL-referred documents.
Comparable Question Mining | Input: A news article. Output: A sorted list of comparable questions.
1: Identify all target named entities (NEs) in the article
2: Infer the distribution of LDA topics for the article
3: For each comparable relation R in the database, compute its relevance score as the similarity between the topic distributions of R and the article
4: Rank all the relations according to their relevance score and pick the top M as relevant
5: for each relevant relation R in the order of relevance ranking do
6:   Filter out all the target NEs that do not pass the single-entity classifier for R
7:   Generate all possible NE pairs from those that passed the single classifier
8:   Filter out all the generated NE pairs that do not pass the entity-pair classifier for R
9:   Pick the top N pairs with positive classification score to be qualified for generation
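The relevance-ranking step (steps 3-4) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the relation names, topic vectors, and the use of cosine similarity between topic distributions are all assumptions.

```python
import math

def cosine(p, q):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def rank_relations(article_topics, relation_topics, top_m):
    """Steps 3-4: score each relation by topic similarity to the article,
    rank, and keep the top M relations."""
    scored = [(cosine(article_topics, topics), rel)
              for rel, topics in relation_topics.items()]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [rel for _, rel in scored[:top_m]]

# Hypothetical LDA topic distributions over K = 3 topics.
article = [0.7, 0.2, 0.1]
relations = {"vs_phones": [0.6, 0.3, 0.1], "vs_cities": [0.1, 0.1, 0.8]}
print(rank_relations(article, relations, top_m=1))  # ['vs_phones']
```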
Evaluation | The reason for this mistake is that many named entities appear as frequent terms in LDA topics, and thus mentioning many names that belong to a single topic drives LDA to assign this topic a high probability. |
Online Question Generation | Specifically, we utilize Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) to infer latent topics in texts. |
Online Question Generation | To train an LDA model, we constructed for each comparable relation a pseudo-document consisting of all questions that contain this relation in our corpus (the supporting questions). |
Online Question Generation | An additional product of the LDA training process is a topic distribution for each relation’s pseudo-document, which we consider as the relation’s context profile. |
Related Work | Instead, we are interested in a higher level topical similarity to the input article, for which LDA topics were shown to help (Celikyilmaz et al., 2010). |
Experiments | To this end we interpret the descriptors as words in documents, and train a standard LDA model based on these documents.
Experiments | We also train a standard LDA model to obtain the theme of a sentence. |
Experiments | The LDA model assigns each word to a topic. |
Our Approach | In our experiments, we use the meta-descriptors of a document as side information and train a standard LDA model to find the theme of a document. |
Our Approach | This model is a minor variation on standard LDA and the difference is that instead of drawing an observation from a hidden topic variable, we draw multiple observations from a hidden topic variable. |
Introduction | In this paper, we retain the underlying HMM, but assume words are emitted using topic models (TM), exemplified by latent Dirichlet allocation (Blei et al., 2003, LDA ). |
Introduction | LDA assumes each word in an utterance is drawn from one of a set of latent topics, where each topic is a multinomial distribution over the vocabulary. |
Introduction | This paper is organized as follows: Section 2 introduces two task-oriented domains and corpora; Section 3 details three new unsupervised generative models which combine HMMs and LDA , along with efficient inference schemes; Section 4 evaluates our models qualitatively and quantitatively; and Section 5 concludes.
Latent Structure in Dialogues | We assume θ's and φ's are drawn from corresponding Dirichlet priors, as in LDA .
Latent Structure in Dialogues | All probabilities can be computed using collapsed Gibbs sampler for LDA (Griffiths |
Latent Structure in Dialogues | Again, we impose Dirichlet priors on the distributions over topics θ's and the distributions over words φ's, as in LDA .
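The collapsed Gibbs sampler these sentences refer to can be sketched for plain LDA as follows: θ's and φ's are integrated out, and each word's topic assignment is resampled from its full conditional given the count tables. The toy corpus, hyperparameters, and function name are illustrative assumptions.

```python
import random
random.seed(1)

def gibbs_lda(docs, K, alpha, beta, iters):
    """Collapsed Gibbs sampling for LDA; returns the doc-topic count table."""
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    wid = {w: i for i, w in enumerate(vocab)}
    # Count tables: doc-topic, topic-word, topic totals, and assignments z.
    ndk = [[0] * K for _ in docs]
    nkw = [[0] * V for _ in range(K)]
    nk = [0] * K
    z = []
    for d, doc in enumerate(docs):                 # random initialization
        zs = []
        for w in doc:
            k = random.randrange(K)
            ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
            zs.append(k)
        z.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                        # remove current assignment
                ndk[d][k] -= 1; nkw[k][wid[w]] -= 1; nk[k] -= 1
                # Full conditional p(z = j | rest), up to a constant.
                weights = [(ndk[d][j] + alpha) *
                           (nkw[j][wid[w]] + beta) / (nk[j] + V * beta)
                           for j in range(K)]
                k = random.choices(range(K), weights=weights)[0]
                ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
                z[d][i] = k
    return ndk

docs = [["gene", "dna", "gene"], ["ball", "goal", "ball"]]
print(gibbs_lda(docs, K=2, alpha=0.1, beta=0.01, iters=50))
```

The doc-topic counts returned here would normally be smoothed with α and normalized to recover θ for each document.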
Background | Several techniques can be used for this purpose such as Latent Semantic Analysis (LSA) (Deerwester et al., 1990), Probabilistic Latent Semantic Analysis (PLSA) (Hofmann, 1999), and Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003). |
Background | LDA , first defined by Blei et al. (2003), defines a topic as a distribution over a fixed vocabulary, where each document can exhibit the topics with different proportions.
Background | For each document, LDA generates the words in a two-step process: |
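The two-step per-word process referred to above can be sketched as follows: for each word position, first draw a topic z from the document's topic proportions θ, then draw the word from that topic's word distribution φz. The toy vocabulary and parameter values are illustrative assumptions.

```python
import random
random.seed(0)

def generate_document(theta, phi, vocab, n_words):
    """LDA's two-step process: for each word, (1) draw a topic z from the
    document's topic distribution theta, then (2) draw a word from that
    topic's word distribution phi[z]."""
    words = []
    for _ in range(n_words):
        z = random.choices(range(len(theta)), weights=theta)[0]   # step 1
        w = random.choices(vocab, weights=phi[z])[0]              # step 2
        words.append(w)
    return words

# Toy parameters: K = 2 topics over a 4-word vocabulary.
vocab = ["gene", "dna", "ball", "goal"]
theta = [0.9, 0.1]                        # this document is mostly topic 0
phi = [[0.5, 0.5, 0.0, 0.0],              # topic 0: biology words
       [0.0, 0.0, 0.5, 0.5]]              # topic 1: sports words
print(generate_document(theta, phi, vocab, n_words=6))
```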
Experiments | LDA was chosen to generate the topic models of clinical reports due to its being a generative probabilistic model for documents and its robustness to overfitting.
Conclusion | Compared with the macro topics of documents inferred by LDA from the bag of words of whole documents, the word senses inferred by the HDP-based WSI can be considered micro topics.
Related Work | They adapt LDA to word sense induction by building one topic model per word type. |
WSI-Based Broad-Coverage Sense Tagger | We first describe WSI, especially WSI based on the Hierarchical Dirichlet Process (HDP) (Teh et al., 2004), a nonparametric version of Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003). |
WSI-Based Broad-Coverage Sense Tagger | The conventional topic distribution θj for the j-th pseudo-document is taken as the distribution over senses for the given word type W. The LDA generative process for sense induction is as follows: 1) for each pseudo-document Dj, draw a per-document sense distribution θj from a Dirichlet distribution Dir(α); 2) for each item wj,i in the pseudo-document Dj, 2.1) draw a sense cluster sj,i ~ Multinomial(θj); and 2.2) draw a word wj,i ~ Multinomial(φsj,i), where φsj,i is the distribution of sense sj,i over words, drawn from a Dirichlet distribution Dir(β).
WSI-Based Broad-Coverage Sense Tagger | As LDA requires manually specifying the number of senses (topics), a better idea is to let the training data automatically determine the number of senses for each word type.
Background and overview of models | training an LDA topic model (Blei et al., 2003) on a superset of the child-directed transcript data we use for lexical-phonetic learning, dividing the transcripts into small sections (the 'documents' in LDA ) that serve as our distinct situations h. As noted above, the learned document-topic distributions θ are treated as observed variables in the TLD model to represent the situational context.
Background and overview of models | The topic-word distributions learned by LDA are discarded, since these are based on the (correct and unambiguous) words in the transcript, whereas the TLD model is presented with phonetically ambiguous versions of these word tokens and must learn to disambiguate them and associate them with topics. |
Conclusion | Regardless of the specific way in which infants encode semantic information, our method of adding this information by using LDA topics from transcript data was shown to be effective. |
Experiments | The input to the TLD model includes a distribution over topics for each situation, which we infer in advance from the full Brent corpus (not only the C1 subset) using LDA . |
Inference: Gibbs Sampling | The first factor, the prior probability of topic k in document h, is given by θhk, obtained from the LDA .
Topic-Lexical-Distributional Model | There are a fixed number of lower level topic-lexicons; these are matched to the number of topics in the LDA model used to infer the topic distributions (see Section 6.4). |
Baselines | Here, the topics are extracted from all the documents in the *SEM 2012 shared task using the LDA Gibbs Sampling algorithm (Griffiths, 2002). |
Baselines | In the topic-driven word-based graph model, the first layer denotes the relatedness among content words as captured in the above word-based graph model, and the second layer denotes the topic distribution, with the dashed lines between these two layers indicating the word-topic model returned by LDA .
Baselines | where Rel(wi, rm) is the weight of word wi in topic rm, calculated by the LDA Gibbs Sampling algorithm.
Introduction | In this way, the topic of a sentence can be inferred with document-level information using off-the-shelf topic modeling toolkits such as Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003) or Hidden Topic Markov Model (HTMM) (Gruber et al., 2007). |
Introduction | Although we can easily apply LDA at the |
Introduction | Additionally, our model can be discriminatively trained with a large number of training instances, without expensive sampling methods such as in LDA or HTMM; thus it is more practicable and scalable.
Related Work | Experiments show that their approach not only achieved better translation performance but also provided a faster decoding speed compared with previous lexicon-based LDA methods. |
Related Work | Generally, most previous research has leveraged conventional topic modeling techniques such as LDA or HTMM. |
Learning Templates from Raw Text | We consider two unsupervised algorithms: Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003), and agglomerative clustering based on word distance. |
Learning Templates from Raw Text | 4.1.1 LDA for Unknown Data |
Learning Templates from Raw Text | LDA is a probabilistic model that treats documents as mixtures of topics. |
Experimental Setup | To compute the LDA features, we use the online variational Bayes algorithm of Hoffman et al. (2010) as implemented in the Gensim software package (Rehurek and Sojka, 2010).
Experimental Setup | More inclusive is the feature set NO-LDA, which includes all features except the LDA features. |
Experimental Setup | Experiments with this set were performed in order to isolate the effect of the LDA features. |
Our Proposal: A Latent LC Approach | We further incorporate features based on a Latent Dirichlet Allocation ( LDA ) topic model (Blei et al., 2003). |
Our Proposal: A Latent LC Approach | We populate the pseudo-documents of an LC with its arguments according to R. We then train an LDA model with 25 topics over these documents. |
Experiment | Furthermore, the hyper-parameters for the topic probability distribution and the word probability distribution in LDA are α=0.5 and β=0.5, respectively.
Experiment | Here, when clustering the documents based on the topic probability distribution from LDA , the topic distribution over documents θ changes in every estimation.
Introduction | 3) Information used for classification — we use latent information estimated by latent Dirichlet allocation ( LDA ) (Blei et al., 2003) to classify documents, and compare the results of the cases using both surface and latent information. |
Techniques for text classification | After obtaining a collection of refined documents for classification, we adopt LDA to estimate the latent topic probability distributions over the target documents and use them for clustering.
Techniques for text classification | As for the refined document obtained in step 2, the latent topics are estimated by means of LDA . |
Semi-Supervised SimHash | Clearly, Equation (12) is analogous to Linear Discriminant Analysis ( LDA ) (Duda et al., 2000) except for the following difference: 1) measurement.
Semi-Supervised SimHash | S3H uses similarity while LDA uses distance.
Semi-Supervised SimHash | As a result, the objective function of S3H is just the reciprocal of LDA's .
Abstract | We also applied Latent Dirichlet Allocation ( LDA ; Blei et al., 2003) to learn a distribution over latent topics in the extracted data, as this is a popular exploratory data analysis method. |
Abstract | In LDA a topic is a unigram distribution over words, and each document is modeled as a distribution over topics. |
Abstract | Some of the topics that LDA finds correspond closely with specific domains, such as topics 1 (blingee . |
Attribute-based Semantic Models | (2009) present an extension of LDA (Blei et al., 2003) where words in documents and their associated attributes are treated as observed variables that are explained by a generative process. |
Attribute-based Semantic Models | Inducing these attribute-topic components from D with the extended LDA model gives two sets of parameters: word probabilities given components PW(wi|X = xc) for wi, i = 1, ..., n, and attribute probabilities given components PA(ak|X = xc) for ak, k = 1, ..., F. For example, most of the probability mass of a component x would be reserved for the words shirt, coat, dress and the attributes has_1_piece, has_seams, made_of_material and so on.
Related Work | Their model is essentially Latent Dirichlet Allocation ( LDA , Blei et al., 2003) trained on a corpus of multimodal documents (i.e., BBC news articles and their associated images). |
Results | Unseen are concepts covered by LDA but unknown to the attribute classifiers (N = 388). |
Model Description | Our analysis of the document text is based on probabilistic topic models such as LDA (Blei et al., 2003). |
Model Description | In the LDA framework, each word is generated from a language model that is indexed by the word’s topic assignment. |
Model Description | Thus, rather than identifying a single topic for a document, LDA identifies a distribution over topics. |
Related Work | This approach is inspired by methods in the topic modeling literature, such as Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003), where topics are treated as hidden variables that govern the distribution of words in a text. |
Experiments | Topic modeling was performed with Mallet (McCallum, 2002), a standard implementation of LDA , using a Chinese stoplist and setting the per-document Dirichlet parameter α = 0.01.
Model Description | , K} over each document, using Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003). |
Model Description | For this case, we also propose a local LDA model (LTM), which treats each sentence as a separate document. |
RSP: A Random Walk Model for SP | LDA-SP: Another kind of sophisticated unsupervised approach for SP is latent variable models based on Latent Dirichlet Allocation ( LDA ).
RSP: A Random Walk Model for SP | Ó Séaghdha (2010) applies topic models to SP induction with three variations: LDA , Rooth-LDA, and Dual-LDA; Ritter et al.
RSP: A Random Walk Model for SP | In this work, we compare with Ó Séaghdha's original LDA approach to SP.
Related Work | (2007), for example, use LDA to capture global context. |
Related Work | (2007) enhance the basic LDA algorithm by incorporating WordNet senses as an additional latent variable. |
The Sense Disambiguation Model | LDA is a Bayesian version of this framework with Dirichlet hyper-parameters (Blei et al., 2003). |
Image Annotation | Latent Dirichlet Allocation ( LDA , Blei et al. |
Image Annotation | The basic idea underlying LDA , and topic models in general, is that each document is composed of a probability distribution over topics, where each topic represents a probability distribution over words. |
Image Annotation | Examples include PLSA-based approaches to image annotation (e.g., Monay and Gatica-Perez 2007) and correspondence LDA (Blei and Jordan, 2003). |
Related Work 2.1 Market Prediction and Social Media | One of the basic and most widely used models is Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003). |
Related Work 2.1 Market Prediction and Social Media | LDA can learn a predefined number of topics and has been widely applied in its extended forms in sentiment analysis and many other tasks (Mei et al., 2007; Branavan et al., 2008; Lin and He, 2009; Zhao et al., 2010; Wang et al., 2010; Brody and Elhadad, 2010; Jo and Oh, 2011; Moghaddam and Ester, 2011; Sauper et al., 2011; Mukherjee and Liu, 2012; He et al., 2012). |
Related Work 2.1 Market Prediction and Social Media | The Dirichlet Processes Mixture (DPM) model is a nonparametric extension of LDA (Teh et al., 2006), which can estimate the number of topics inherent in the data itself. |
Linguistic Mapping | In this work we follow closely the Author-Topic (AT) model (Steyvers et al., 2004), which is a generalization of Latent Dirichlet Allocation ( LDA ) (Blei et al., 2003).
Linguistic Mapping | LDA is a technique that was developed to model the distribution of topics discussed in a large corpus of documents. |
Linguistic Mapping | The AT model generalizes LDA , saying that the mixture of topics is not dependent on the document itself, but rather on the authors who wrote it. |
Experiments | DF-LDA adds constraints to LDA . |
Proposed Seeded Models | The standard LDA and existing aspect and sentiment models (ASMs) are mostly governed by the phenomenon called "higher-order co-occurrence" (Heinrich, 2009), i.e., based on how often terms co-occur in different contexts.
Related Work | Existing works are based on two basic models, pLSA (Hofmann, 1999) and LDA (Blei et al., 2003). |