Abstract | A significant novelty of our adaptation technique is its incremental nature; we continuously update the topic distribution on the evolving test conversation as new utterances become available. |
Corpus Data and Baseline SMT | The training, development and test sets were partitioned at the conversation level, so that we could model a topic distribution for entire conversations during training as well as during tuning and testing.
Incremental Topic-Based Adaptation | The topic distribution is incrementally updated as the conversation history grows, and we recompute the topic similarity between the current conversation and the training conversations for each new source utterance. |
Incremental Topic-Based Adaptation | We use latent Dirichlet allocation (LDA; Blei et al., 2003) to obtain a topic distribution over conversations.
Incremental Topic-Based Adaptation | For each conversation d_i in the training collection (1,600 conversations), LDA infers a topic distribution θ_{d_i} = p(z_k | d_i) for all latent topics z_k, k = 1, ..., K, where K is the number of topics.
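A minimal sketch of this incremental inference step, assuming gensim's LdaModel as the LDA implementation (the excerpt does not name one) and toy tokenized conversations; the test conversation's bag of words is rebuilt and re-inferred after each new utterance:

```python
# Sketch: re-infer the topic distribution over the evolving test
# conversation each time a new utterance becomes available.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy stand-ins for the 1,600 tokenized training conversations.
train_convs = [["flight", "booking", "paris"], ["weather", "rain", "today"]]
dictionary = Dictionary(train_convs)
corpus = [dictionary.doc2bow(c) for c in train_convs]
lda = LdaModel(corpus, id2word=dictionary, num_topics=2, random_state=0)

history = []  # tokens of the test conversation seen so far
for utterance in [["book", "a", "flight"], ["to", "paris", "please"]]:
    history.extend(utterance)
    bow = dictionary.doc2bow(history)
    # Topic distribution over the conversation up to the current utterance.
    theta = lda.get_document_topics(bow, minimum_probability=0.0)
    print(theta)
```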
Introduction | At runtime, this model is used to infer a topic distribution over the evolving test conversation up to and including the current utterance. |
Introduction | Translation phrase pairs that originate in training conversations whose topic distribution is similar to that of the current conversation are given preference through a single similarity feature, which augments the standard phrase-based SMT log-linear model. |
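A hedged sketch of how such a single similarity feature might be computed; the cosine similarity and the max-aggregation over originating training conversations are illustrative choices, not the excerpt's confirmed definitions:

```python
import numpy as np

def topic_similarity(theta_test, theta_train):
    """Cosine similarity between two topic distributions (one plausible
    choice of similarity function)."""
    a, b = np.asarray(theta_test), np.asarray(theta_train)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def phrase_pair_feature(theta_test, origin_thetas):
    """Single log-linear feature for a phrase pair: similarity of the
    current conversation to the most similar training conversation the
    pair was extracted from (aggregation by max is an assumption)."""
    return max(topic_similarity(theta_test, t) for t in origin_thetas)
```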
Introduction | The topic distribution for the test conversation is updated incrementally for each new utterance as the available history grows. |
Relation to Prior Work | (2012), who both describe adaptation techniques where monolingual LDA topic models are used to obtain a topic distribution over the training data, followed by dynamic adaptation of the phrase table based on the inferred topic of the test document. |
Relation to Prior Work | Our proposed approach infers topic distributions incrementally as the conversation progresses. |
Relation to Prior Work | Second, we do not directly augment the translation table with the inferred topic distribution.
Abstract | We present an efficient approach for broadcast news story segmentation using a manifold learning algorithm on latent topic distributions.
Abstract | The latent topic distribution estimated by Latent Dirichlet Allocation (LDA) is used to represent each text block. |
Abstract | We employ Laplacian Eigenmaps (LE) to project the latent topic distributions into low-dimensional semantic representations while preserving the intrinsic local geometric structure. |
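A sketch of this projection step using scikit-learn's SpectralEmbedding, which implements Laplacian Eigenmaps; the random Dirichlet vectors below merely stand in for LDA topic distributions:

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

# Stand-in for N = 100 text blocks with K = 20 LDA topic proportions each.
X = np.random.dirichlet(np.ones(20), size=100)

# Laplacian Eigenmaps: build a neighborhood graph over the topic vectors and
# embed them into a low-dimensional space that preserves local geometry.
le = SpectralEmbedding(n_components=2, n_neighbors=10)
Y = le.fit_transform(X)
print(Y.shape)  # (100, 2) low-dimensional semantic representations
```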
Experimental setup | PLSA-DP: PLSA topic distributions were used to compute sentence cohesive strength.
Introduction | To further improve segmentation performance, this paper studies the use of latent topic distributions and LE, instead of term frequencies, to represent text blocks.
Our Proposed Approach | In this paper, we propose to apply LE on the LDA topic distributions, each of which is estimated from a text block.
Our Proposed Approach | β is a K × V matrix, which defines the latent topic distributions over terms.
Our Proposed Approach | Given the ASR transcripts of N text blocks, we apply the LDA algorithm to compute the corresponding latent topic distributions X = [x_1, x_2, ..., x_N].
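A sketch of estimating X and β with scikit-learn (an assumed implementation; the excerpt does not specify one). The resulting X could then be fed to the Laplacian Eigenmaps sketch above:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-ins for the ASR transcripts of N text blocks.
blocks = ["stocks fell sharply today",
          "the game ended in overtime",
          "markets rallied after the report"]
counts = CountVectorizer().fit_transform(blocks)  # N x V term counts

lda = LatentDirichletAllocation(n_components=4, random_state=0)
X = lda.fit_transform(counts)  # N x K, rows ~ p(topic | block)
# Normalize the component matrix into K x V topic-term distributions (beta).
beta = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
```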
Background and Model Setting | Next, an LDA model is learned from the set of all pseudo-documents, extracted for all predicates. The learning process results in the construction of K latent topics, where each topic t specifies a distribution over all words, denoted by p(w|t), and a topic distribution for each pseudo-document d, denoted by p(t|d).
Background and Model Setting | Within the LDA model we can derive the a-posteriori topic distribution conditioned on a particular word within a document, denoted by p(t|d, w) ∝ p(w|t) · p(t|d).
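The posterior amounts to an elementwise product followed by renormalization over topics; a small sketch with illustrative argument names:

```python
import numpy as np

def word_topic_posterior(p_w_given_t, p_t_given_d):
    """p(t | d, w) ∝ p(w | t) * p(t | d), renormalized over the K topics.
    p_w_given_t: length-K vector of p(w | t) for the observed word w;
    p_t_given_d: length-K topic distribution of the pseudo-document d."""
    unnorm = np.asarray(p_w_given_t) * np.asarray(p_t_given_d)
    return unnorm / unnorm.sum()
```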
Background and Model Setting | Dinu and Lapata (2010b) presented a slightly different similarity measure for topic distributions that performed better in their setting as well as in a related later paper on context-sensitive scoring of lexical similarity (Dinu and Lapata, 2010a). |
Discussion and Future Work | Then, given a specific candidate rule application, the LDA model is used to infer the topic distribution relevant to the context specified by the given arguments. |
Discussion and Future Work | Finally, the context-sensitive rule application score is computed as a weighted average of the per-topic word-level similarity scores, which are weighted according to the inferred topic distribution.
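The weighted average itself reduces to a dot product between the per-topic scores and the inferred topic distribution; a minimal sketch with illustrative names:

```python
import numpy as np

def rule_application_score(per_topic_scores, topic_posterior):
    """Context-sensitive score: per-topic word-level similarity scores
    weighted by the topic distribution inferred from the rule's context."""
    return float(np.dot(per_topic_scores, topic_posterior))

# e.g. rule_application_score([0.8, 0.1, 0.3], [0.6, 0.3, 0.1]) -> 0.54
```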
Discussion and Future Work | Finally, they train a classifier to translate a given target word based on these tables and the inferred topic distribution of the given document in which the target word appears. |
Introduction | Then, similarity is measured between the two topic distribution vectors corresponding to the two sides of the rule in the given context, yielding a context-sensitive score for each particular rule application. |
Introduction | Then, when applying a rule in a given context, these different scores are weighted together based on the specific topic distribution under the given context.
Experiments | We use the inferred topic distribution θ as a latent vector to represent the tweet/news.
Experiments | The problem with LDA-θ is that the inferred topic distribution vector is very sparse, with only a few nonzero values, so many tweet/news pairs receive a high similarity value as long as they are in the same topic domain.
Experiments | Hence, following (Guo and Diab, 2012b), we first compute the latent vector of a word (the topic distribution per word), then average the word latent vectors weighted by their TF-IDF values to represent the short text, which yields much better results.
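A sketch of this weighted-average representation, assuming precomputed per-word topic vectors and TF-IDF weights; the dictionary-based interface is illustrative:

```python
import numpy as np

def short_text_vector(tokens, word_topic_vecs, tfidf):
    """Represent a short text as the TF-IDF-weighted average of its words'
    topic-distribution vectors, in the spirit of the scheme above."""
    vecs, weights = [], []
    for w in tokens:
        if w in word_topic_vecs:               # skip out-of-vocabulary words
            vecs.append(word_topic_vecs[w])
            weights.append(tfidf.get(w, 1.0))  # default weight is an assumption
    if not vecs:
        return None
    return np.average(np.array(vecs), axis=0, weights=np.array(weights))
```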
Related Work | The semantics of long documents are transferred to the topic distribution of tweets. |
Abstract | Representing reports according to their topic distributions is more compact than a bag-of-words representation and can be processed faster than raw text in subsequent automated processes.
Experiments | Topic modeling of reports produces a topic distribution for each report, which can be used to represent it as a topic vector.
Experiments | With this approach, a representative topic vector for each class was composed by averaging the topic distributions of that class's reports in the training dataset.
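A sketch of the centroid construction, paired with an assumed cosine nearest-centroid decision rule (the excerpt does not state which similarity is used for classification):

```python
import numpy as np

def class_topic_centroids(doc_topics, labels):
    """Average the topic distributions of each class's training reports
    to obtain one representative topic vector per class."""
    labels = np.asarray(labels)
    return {c: doc_topics[labels == c].mean(axis=0) for c in set(labels)}

def classify(topic_vec, centroids):
    """Assign a report to the class with the most similar centroid (cosine)."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(centroids, key=lambda c: cos(topic_vec, centroids[c]))
```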
Introduction | Topic modeling is an unsupervised technique that can automatically identify themes in a given set of documents and find the topic distribution of each document.
Introduction | Representing reports according to their topic distributions is more compact and can be processed faster than raw text in subsequent automated processing. |
Abstract | We factor out the topic bias by extracting reliable training instances from the revision history which have a topic distribution similar to the labeled articles. |
Introduction | It is rather necessary to determine reliable negative instances with a topic distribution similar to that of the set of positive instances, in order to factor out the sampling bias.
Selection of Reliable Training Instances | In order to factor out the article topics as a major characteristic for distinguishing flawed articles from the set of outliers, reliable negative instances A_rel have to be sampled from the restricted topic set A_topic that contains articles with a topic distribution similar to the flawed articles in A_f (see Fig.
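One plausible selection criterion, sketched with Jensen-Shannon distance to the centroid of the flawed articles' topic distributions; both the distance measure and the top-k cutoff are assumptions, not the excerpt's stated method:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def reliable_negatives(candidate_thetas, flawed_thetas, k):
    """Return indices of the k candidate articles whose topic distribution
    is closest to the centroid of the flawed articles' distributions."""
    centroid = np.asarray(flawed_thetas).mean(axis=0)
    dists = [jensenshannon(t, centroid) for t in candidate_thetas]
    return np.argsort(dists)[:k]
```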
Building comparable corpora | given bilingual corpora, predict the topic distribution θ_{m,k} of the new documents, calculate the
Building comparable corpora | P(Z) as the prior topic distribution is assumed to be a
Introduction | (2009) adapted a monolingual topic model to a bilingual topic model in which the documents of a concept unit in different languages were assumed to share an identical topic distribution.