Abstract | A significant novelty of our adaptation technique is its incremental nature; we continuously update the topic distribution on the evolving test conversation as new utterances become available. |
Corpus Data and Baseline SMT | The training, development and test sets were partitioned at the conversation level, so that we could model a topic distribution for entire conversations during training as well as during tuning and testing.
Incremental Topic-Based Adaptation | The topic distribution is incrementally updated as the conversation history grows, and we recompute the topic similarity between the current conversation and the training conversations for each new source utterance. |
Incremental Topic-Based Adaptation | We use latent Dirichlet allocation (LDA; Blei et al., 2003) to obtain a topic distribution over conversations.
Incremental Topic-Based Adaptation | For each conversation d_i in the training collection (1,600 conversations), LDA infers a topic distribution θ_{d_i} = p(z_k | d_i) for all latent topics z_k, k = 1, ..., K, where K is the number of topics.
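A minimal sketch of this incremental inference step, assuming gensim's LdaModel as the LDA implementation (the excerpt does not name one) and toy tokenized conversations; the test conversation's bag of words is rebuilt and re-inferred after each new utterance:

```python
# Sketch: re-infer the topic distribution over the evolving test
# conversation each time a new utterance becomes available.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy stand-ins for the 1,600 tokenized training conversations.
train_convs = [["flight", "booking", "paris"], ["weather", "rain", "today"]]
dictionary = Dictionary(train_convs)
corpus = [dictionary.doc2bow(c) for c in train_convs]
lda = LdaModel(corpus, id2word=dictionary, num_topics=2, random_state=0)

history = []  # tokens of the test conversation seen so far
for utterance in [["book", "a", "flight"], ["to", "paris", "please"]]:
    history.extend(utterance)
    bow = dictionary.doc2bow(history)
    # Topic distribution over the conversation up to the current utterance.
    theta = lda.get_document_topics(bow, minimum_probability=0.0)
    print(theta)
```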
Introduction | At runtime, this model is used to infer a topic distribution over the evolving test conversation up to and including the current utterance. |
Introduction | Translation phrase pairs that originate in training conversations whose topic distribution is similar to that of the current conversation are given preference through a single similarity feature, which augments the standard phrase-based SMT log-linear model. |
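A hedged sketch of how such a single similarity feature might be computed; the cosine similarity and the max-aggregation over originating training conversations are illustrative choices, not the excerpt's confirmed definitions:

```python
import numpy as np

def topic_similarity(theta_test, theta_train):
    """Cosine similarity between two topic distributions (one plausible
    choice of similarity function)."""
    a, b = np.asarray(theta_test), np.asarray(theta_train)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def phrase_pair_feature(theta_test, origin_thetas):
    """Single log-linear feature for a phrase pair: similarity of the
    current conversation to the most similar training conversation the
    pair was extracted from (aggregation by max is an assumption)."""
    return max(topic_similarity(theta_test, t) for t in origin_thetas)
```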
Introduction | The topic distribution for the test conversation is updated incrementally for each new utterance as the available history grows. |
Relation to Prior Work | (2012), who both describe adaptation techniques where monolingual LDA topic models are used to obtain a topic distribution over the training data, followed by dynamic adaptation of the phrase table based on the inferred topic of the test document. |
Relation to Prior Work | Our proposed approach infers topic distributions incrementally as the conversation progresses. |
Relation to Prior Work | Second, we do not directly augment the translation table with the inferred topic distribution.
Abstract | We present an efficient approach for broadcast news story segmentation using a manifold learning algorithm on latent topic distributions.
Abstract | The latent topic distribution estimated by Latent Dirichlet Allocation (LDA) is used to represent each text block. |
Abstract | We employ Laplacian Eigenmaps (LE) to project the latent topic distributions into low-dimensional semantic representations while preserving the intrinsic local geometric structure. |
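A sketch of this projection step using scikit-learn's SpectralEmbedding, which implements Laplacian Eigenmaps; the random Dirichlet vectors below merely stand in for LDA topic distributions:

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

# Stand-in for N = 100 text blocks with K = 20 LDA topic proportions each.
X = np.random.dirichlet(np.ones(20), size=100)

# Laplacian Eigenmaps: build a neighborhood graph over the topic vectors and
# embed them into a low-dimensional space that preserves local geometry.
le = SpectralEmbedding(n_components=2, n_neighbors=10)
Y = le.fit_transform(X)
print(Y.shape)  # (100, 2) low-dimensional semantic representations
```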
Experimental setup | PLSA-DP: PLSA topic distributions were used to compute sentence cohesive strength.
Introduction | To further improve segmentation performance, this paper studies the use of latent topic distributions and LE, instead of term frequencies, to represent text blocks.
Our Proposed Approach | In this paper, we propose to apply LE on the LDA topic distributions, each of which is estimated from a text block.
Our Proposed Approach | β is a K × V matrix, which defines the latent topic distributions over terms.
Our Proposed Approach | Given the ASR transcripts of N text blocks, we apply the LDA algorithm to compute the corresponding latent topic distributions X = [x_1, x_2, ..., x_N].
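A sketch of estimating X and β with scikit-learn (an assumed implementation; the excerpt does not specify one). The resulting X could then be fed to the Laplacian Eigenmaps sketch above:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-ins for the ASR transcripts of N text blocks.
blocks = ["stocks fell sharply today",
          "the game ended in overtime",
          "markets rallied after the report"]
counts = CountVectorizer().fit_transform(blocks)  # N x V term counts

lda = LatentDirichletAllocation(n_components=4, random_state=0)
X = lda.fit_transform(counts)  # N x K, rows ~ p(topic | block)
# Normalize the component matrix into K x V topic-term distributions (beta).
beta = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
```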
Background and Model Setting | Next, an LDA model is learned from the set of all pseudo-documents, extracted for all predicates. The learning process results in the construction of K latent topics, where each topic t specifies a distribution over all words, denoted by p(w|t), and a topic distribution for each pseudo-document d, denoted by p(t|d).
Background and Model Setting | Within the LDA model we can derive the a-posteriori topic distribution conditioned on a particular word within a document, denoted by p(t|d, w) ∝ p(w|t) · p(t|d).
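The posterior amounts to an elementwise product followed by renormalization over topics; a small sketch with illustrative argument names:

```python
import numpy as np

def word_topic_posterior(p_w_given_t, p_t_given_d):
    """p(t | d, w) ∝ p(w | t) * p(t | d), renormalized over the K topics.
    p_w_given_t: length-K vector of p(w | t) for the observed word w;
    p_t_given_d: length-K topic distribution of the pseudo-document d."""
    unnorm = np.asarray(p_w_given_t) * np.asarray(p_t_given_d)
    return unnorm / unnorm.sum()
```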
Background and Model Setting | Dinu and Lapata (2010b) presented a slightly different similarity measure for topic distributions that performed better in their setting as well as in a related later paper on context-sensitive scoring of lexical similarity (Dinu and Lapata, 2010a). |
Discussion and Future Work | Then, given a specific candidate rule application, the LDA model is used to infer the topic distribution relevant to the context specified by the given arguments. |
Discussion and Future Work | Finally, the context-sensitive rule application score is computed as a weighted average of the per-topic word-level similarity scores, which are weighted according to the inferred topic distribution.
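The weighted average itself reduces to a dot product between the per-topic scores and the inferred topic distribution; a minimal sketch with illustrative names:

```python
import numpy as np

def rule_application_score(per_topic_scores, topic_posterior):
    """Context-sensitive score: per-topic word-level similarity scores
    weighted by the topic distribution inferred from the rule's context."""
    return float(np.dot(per_topic_scores, topic_posterior))

# e.g. rule_application_score([0.8, 0.1, 0.3], [0.6, 0.3, 0.1]) -> 0.54
```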
Discussion and Future Work | Finally, they train a classifier to translate a given target word based on these tables and the inferred topic distribution of the given document in which the target word appears. |
Introduction | Then, similarity is measured between the two topic distribution vectors corresponding to the two sides of the rule in the given context, yielding a context-sensitive score for each particular rule application. |
Introduction | Then, when applying a rule in a given context, these different scores are weighted together based on the specific topic distribution under the given context.
Experiments | We use the inferred topic distribution θ as a latent vector to represent the tweet/news.
Experiments | The problem with LDA-θ is that the inferred topic distribution vector is very sparse, with only a few nonzero values, so many tweet/news pairs receive a high similarity value as long as they are in the same topic domain.
Experiments | Hence, following (Guo and Diab, 2012b), we first compute the latent vector of a word (the topic distribution per word), then average the word latent vectors weighted by their TF-IDF values to represent the short text, which yields much better results.
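A sketch of this weighted-average representation, assuming precomputed per-word topic vectors and TF-IDF weights; the dictionary-based interface is illustrative:

```python
import numpy as np

def short_text_vector(tokens, word_topic_vecs, tfidf):
    """Represent a short text as the TF-IDF-weighted average of its words'
    topic-distribution vectors, in the spirit of the scheme above."""
    vecs, weights = [], []
    for w in tokens:
        if w in word_topic_vecs:               # skip out-of-vocabulary words
            vecs.append(word_topic_vecs[w])
            weights.append(tfidf.get(w, 1.0))  # default weight is an assumption
    if not vecs:
        return None
    return np.average(np.array(vecs), axis=0, weights=np.array(weights))
```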
Related Work | The semantics of long documents are transferred to the topic distribution of tweets. |
Abstract | Representing reports according to their topic distributions is more compact than a bag-of-words representation and can be processed faster than raw text in subsequent automated processes.
Experiments | Topic modeling of reports produces a topic distribution for each report, which can be used to represent it as a topic vector.
Experiments | With this approach, a representative topic vector for each class was composed by averaging the topic distributions of that class's reports in the training dataset.
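A sketch of the centroid construction, paired with an assumed cosine nearest-centroid decision rule (the excerpt does not state which similarity is used for classification):

```python
import numpy as np

def class_topic_centroids(doc_topics, labels):
    """Average the topic distributions of each class's training reports
    to obtain one representative topic vector per class."""
    labels = np.asarray(labels)
    return {c: doc_topics[labels == c].mean(axis=0) for c in set(labels)}

def classify(topic_vec, centroids):
    """Assign a report to the class with the most similar centroid (cosine)."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(centroids, key=lambda c: cos(topic_vec, centroids[c]))
```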
Introduction | Topic modeling is an unsupervised technique that can automatically identify themes in a given set of documents and find the topic distribution of each document.
Introduction | Representing reports according to their topic distributions is more compact and can be processed faster than raw text in subsequent automated processing. |
Abstract | We factor out the topic bias by extracting reliable training instances from the revision history which have a topic distribution similar to the labeled articles. |
Introduction | It is rather necessary to determine reliable negative instances with a topic distribution similar to that of the set of positive instances, in order to factor out the sampling bias.
Selection of Reliable Training Instances | In order to factor out the article topics as a major characteristic for distinguishing flawed articles from the set of outliers, reliable negative instances A_rel have to be sampled from the restricted topic set A_topic that contains articles with a topic distribution similar to the flawed articles in A_f (see Fig.
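One plausible selection criterion, sketched with Jensen-Shannon distance to the centroid of the flawed articles' topic distributions; both the distance measure and the top-k cutoff are assumptions, not the excerpt's stated method:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def reliable_negatives(candidate_thetas, flawed_thetas, k):
    """Return indices of the k candidate articles whose topic distribution
    is closest to the centroid of the flawed articles' distributions."""
    centroid = np.asarray(flawed_thetas).mean(axis=0)
    dists = [jensenshannon(t, centroid) for t in candidate_thetas]
    return np.argsort(dists)[:k]
```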
Building comparable corpora | given bilingual corpora, predict the topic distribution θ_{m,k} of the new documents, calculate the
Building comparable corpora | P(Z) as the prior topic distribution is assumed to be a
Introduction | (2009) adapted a monolingual topic model to a bilingual topic model in which the documents of a concept unit in different languages were assumed to share an identical topic distribution.