Index of papers in Proc. ACL that mention
  • latent semantic
Guo, Weiwei and Diab, Mona
Abstract
Previous sentence similarity work finds that latent semantics approaches to the problem do not perform well due to insufficient information in single sentences.
Experiments and Results
This is because LDA uses only 10 observed words to infer a 100-dimensional vector for a sentence, while WTMF takes advantage of the far more numerous missing words to learn more robust latent semantics vectors.
Introduction
Latent variable models, such as Latent Semantic Analysis [LSA] (Landauer et al., 1998), Probabilistic Latent Semantic Analysis [PLSA] (Hofmann, 1999), Latent Dirichlet Allocation [LDA] (Blei et al., 2003) can solve the two issues naturally by modeling the semantics of words and sentences simultaneously in the low-dimensional latent space.
Introduction
We believe that the latent semantics approaches applied to date to the SS problem have not yielded positive results due to deficient modeling of the limited contextual setting, where the sentences are typically too short to derive robust latent semantics.
Introduction
Apart from the SS setting, robust modeling of the latent semantics of short sentences/texts is becoming a pressing need due to the pervasive presence of more bursty data sets such as Twitter feeds and SMS where short contexts are an inherent characteristic of the data.
Limitations of Topic Models and LSA for Modeling Sentences
Usually latent variable models aim to find a latent semantic profile for a sentence that is most relevant to the observed words.
Limitations of Topic Models and LSA for Modeling Sentences
By explicitly modeling missing words, we set another criterion to the latent semantics profile: it should not be related to the missing words in the sentence.
Limitations of Topic Models and LSA for Modeling Sentences
It would be desirable if topic models could exploit missing words (far more data than the observed words) to render more nuanced latent semantics, so that pairs of sentences in the same domain can be differentiated.
The Proposed Approach
Accordingly, P_{·,i} is a K-dimensional latent semantics vector profile for word w_i; similarly, Q_{·,j} is the K-dimensional vector profile that represents the sentence s_j.
The Proposed Approach
This solution is quite elegant: (1) it explicitly tells the model that, in general, all missing words should not be related to the sentence; (2) meanwhile, latent semantics are mainly generalized from the observed words, and the model is not penalized too much (w_m is very small) when it is very confident that the sentence is highly related to a small subset of missing words based on their latent semantics profiles (the bank#n#1 definition sentence is related to its missing words check and loan).
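The weighted-factorization idea described above can be pictured in a few lines. The snippet below is a minimal sketch, not the authors' implementation: it assumes a dense word-by-sentence TF-IDF matrix X (zeros for missing words) and runs alternating least squares with weight 1 for observed cells and a small weight w_m for missing cells; the values K=100, w_m=0.01 and lam=20 are only illustrative.

    import numpy as np

    def wtmf(X, K=100, w_m=0.01, lam=20.0, iters=10, seed=0):
        # X: word-by-sentence matrix (TF-IDF for observed words, 0 for missing words)
        rng = np.random.default_rng(seed)
        n_words, n_sents = X.shape
        P = rng.normal(scale=0.01, size=(K, n_words))   # word profiles, column P[:, i]
        Q = rng.normal(scale=0.01, size=(K, n_sents))   # sentence profiles, column Q[:, j]
        W = np.where(X != 0, 1.0, w_m)                  # weight 1 for observed cells, w_m for missing
        I = lam * np.eye(K)
        for _ in range(iters):
            for i in range(n_words):                    # ridge-style update per word vector
                w = W[i, :]
                P[:, i] = np.linalg.solve((Q * w) @ Q.T + I, (Q * w) @ X[i, :])
            for j in range(n_sents):                    # ridge-style update per sentence vector
                w = W[:, j]
                Q[:, j] = np.linalg.solve((P * w) @ P.T + I, (P * w) @ X[:, j])
        return P, Q                                     # Q[:, j] is the latent vector of sentence j

Sentence similarity can then be computed as the cosine between the corresponding columns of Q.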
latent semantic is mentioned in 12 sentences in this paper.
Yang, Qiang and Chen, Yuqiang and Xue, Gui-Rong and Dai, Wenyuan and Yu, Yong
Image Clustering with Annotated Auxiliary Data
In this section, we present our annotation-based probabilistic latent semantic analysis algorithm (aPLSA), which extends the traditional PLSA model by incorporating annotated auxiliary image data.
Image Clustering with Annotated Auxiliary Data
3.1 Probabilistic Latent Semantic Analysis
Image Clustering with Annotated Auxiliary Data
To formally introduce the aPLSA model, we start from the probabilistic latent semantic analysis (PLSA) (Hofmann, 1999) model.
Related Works
What we need to do is to uncover this latent semantic information by finding out what is common among them.
Related Works
Probabilistic latent semantic analysis (PLSA) is a widely used probabilistic model (Hofmann, 1999), and could be considered as a probabilistic implementation of latent semantic analysis (LSA) (Deerwester et al., 1990).
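As a concrete reference point for the PLSA model mentioned above, here is a minimal EM sketch under the parameterization p(w|d) = sum_z p(z|d) p(w|z), operating on a document-term count matrix. It is not tied to any particular paper's implementation, and the topic and iteration counts are illustrative.

    import numpy as np

    def plsa(N, n_topics=20, iters=50, seed=0):
        # N: document-term count matrix (D x W)
        rng = np.random.default_rng(seed)
        D, W = N.shape
        p_z_d = rng.random((D, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)   # p(z|d)
        p_w_z = rng.random((n_topics, W)); p_w_z /= p_w_z.sum(1, keepdims=True)   # p(w|z)
        for _ in range(iters):
            # E-step: responsibilities p(z | d, w) for every (d, w) cell
            joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # D x Z x W
            resp = joint / (joint.sum(1, keepdims=True) + 1e-12)
            weighted = N[:, None, :] * resp                        # n(d, w) * p(z | d, w)
            # M-step: re-estimate the multinomials
            p_w_z = weighted.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True) + 1e-12
            p_z_d = weighted.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True) + 1e-12
        return p_z_d, p_w_z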
latent semantic is mentioned in 7 sentences in this paper.
Zhang, Duo and Mei, Qiaozhu and Zhai, ChengXiang
Abstract
Specifically, we propose a new topic model called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) which extends the Probabilistic Latent Semantic Analysis (PLSA) model by regularizing its likelihood function with soft constraints defined based on a bilingual dictionary.
Conclusion
We proposed the Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) model, which can incorporate translation knowledge in bilingual dictionaries as a regularizer to constrain the parameter estimation so that the learned topic models would be synchronized in multiple languages.
Introduction
As a robust unsupervised way to perform shallow latent semantic analysis of topics in text, probabilistic topic models (Hofmann, 1999a; Blei et al., 2003b) have recently attracted much attention.
Introduction
In this paper, we propose a novel topic model, called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) model, which can be used to mine shared latent topics from unaligned text data in different languages.
Introduction
PCLSA extends the Probabilistic Latent Semantic Analysis (PLSA) model by regularizing its likelihood function with soft constraints defined based on a bilingual dictionary.
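The kind of dictionary-based soft constraint described above can be pictured as an extra penalty added to the PLSA log-likelihood. The sketch below uses a squared-difference penalty that pulls the topic probabilities of translation pairs together; this specific functional form is an assumption for illustration and may differ from the exact regularizer used in PCLSA.

    import numpy as np

    def regularized_objective(N, p_z_d, p_w_z, dict_pairs, lam=1.0):
        # N: doc-term counts; p_z_d: p(z|d); p_w_z: p(w|z)
        # dict_pairs: list of (u, v) word-index pairs from a bilingual dictionary
        p_w_d = p_z_d @ p_w_z                                   # p(w|d) = sum_z p(z|d) p(w|z)
        log_lik = np.sum(N * np.log(p_w_d + 1e-12))             # PLSA log-likelihood
        penalty = sum(np.sum((p_w_z[:, u] - p_w_z[:, v]) ** 2)  # pull translation pairs together
                      for u, v in dict_pairs)                   # within each topic
        return log_lik - lam * penalty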
Probabilistic Cross-Lingual Latent Semantic Analysis
In this section, we present our probabilistic cross-lingual latent semantic analysis (PCLSA) model and discuss how it can be used to extract cross-lingual topics from multilingual text data.
Related Work
Many topic models have been proposed, and the two basic models are the Probabilistic Latent Semantic Analysis (PLSA) model (Hofmann, 1999a) and the Latent Dirichlet Allocation (LDA) model (Blei et al., 2003b).
latent semantic is mentioned in 7 sentences in this paper.
Nastase, Vivi and Strapparava, Carlo
Cross Language Text Categorization
Dumais et al. (1997) find semantic correspondences in parallel (different-language) corpora through latent semantic analysis (LSA).
Cross Language Text Categorization
We then use LSA, previously shown by (Dumais et al., 1997) and (Gliozzo and Strapparava, 2005) to be useful for this task, to induce the latent semantic dimensions of documents and words respectively, hypothesizing that word etymological ancestors will lead to semantic dimensions that transcend language boundaries.
Cross Language Text Categorization
3.4 Cross-lingual text categorization in a latent semantic space adding etymology
Discussion
The clue to why the increase when using LSA is lower than for English training/Italian testing lies in the way LSA operates: it relies heavily on word co-occurrences in finding the latent semantic dimensions of documents and words.
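One way to picture the role of etymology here is the following hedged sketch: each document is augmented with the etymological ancestors of its words (the `ancestors` lookup is hypothetical), so that documents in different languages share features, and LSA is then applied via truncated SVD; the dimensionality is illustrative.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD

    def augment(doc, ancestors):
        # append the etymological ancestors of each word as extra tokens
        words = doc.split()
        return " ".join(words + [a for w in words for a in ancestors.get(w, [])])

    def lsa_with_etymology(train_docs, ancestors, k=200):
        vec = TfidfVectorizer()
        X = vec.fit_transform(augment(d, ancestors) for d in train_docs)
        svd = TruncatedSVD(n_components=k).fit(X)      # shared latent semantic dimensions
        return vec, svd

    # Test documents in the other language are mapped into the same space with:
    #   Z = svd.transform(vec.transform(augment(d, ancestors) for d in test_docs))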
latent semantic is mentioned in 7 sentences in this paper.
Hingmire, Swapnil and Chakraborti, Sutanu
Background 3.1 LDA
(Chakraborti et al., 2007) empirically show that sprinkled words boost higher-order word associations and project documents with the same class labels close to each other in the latent semantic space.
Conclusions and Future Work
We have used the idea of sprinkling originally proposed in the context of supervised Latent Semantic Analysis, but the setting here is quite different.
Introduction
LDA is an unsupervised probabilistic topic model and it is widely used to discover latent semantic structure of a document collection by modeling words in the documents.
Introduction
Sprinkling (Chakraborti et al., 2007) integrates class labels of documents into Latent Semantic Indexing (LSI)(Deerwester et al., 1990).
Introduction
As LSI uses higher-order word associations (Kontostathis and Pottenger, 2006), sprinkling artificial words gives a better, class-enriched latent semantic structure.
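A minimal sketch of sprinkling, under the assumption that it amounts to appending a few artificial class-label tokens to each labeled training document before building the LSI space; the number of sprinkled tokens and the dimensionality are illustrative.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import TruncatedSVD

    def sprinkle_and_index(train_docs, train_labels, n_sprinkle=4, k=100):
        # append artificial class-label terms ("sprinkled words") to each labeled document
        sprinkled = [doc + (" CLASS_%s" % lab) * n_sprinkle
                     for doc, lab in zip(train_docs, train_labels)]
        vec = CountVectorizer()
        X = vec.fit_transform(sprinkled)
        lsi = TruncatedSVD(n_components=k).fit(X)      # class-enriched latent space
        return vec, lsi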
latent semantic is mentioned in 5 sentences in this paper.
Titov, Ivan and Kozhevnikov, Mikhail
A Model of Semantics
Figure 3: The semantics-text correspondence model with K documents sharing the same latent semantic state.
Abstract
A simple and efficient inference method recursively induces joint semantic representations for each group and discovers correspondence between lexical entries and latent semantic concepts.
Introduction
We assume that each text in a group is independently generated from a full latent semantic state corresponding to the group.
Introduction
Unsupervised learning with shared latent semantic representations presents its own challenges, as exact inference requires marginalization over possible assignments of the latent semantic state, consequently introducing nonlocal statistical dependencies between the decisions about the semantic structure of each text.
Summary and Future Work
However, exact inference for groups of documents with overlapping semantic representation is generally prohibitively expensive, as the shared latent semantics introduces nonlocal dependencies between semantic representations of individual documents.
latent semantic is mentioned in 5 sentences in this paper.
Kireyev, Kirill and Landauer, Thomas K
Abstract
We present a computational algorithm for estimating word maturity, based on modeling language acquisition with Latent Semantic Analysis.
Rethinking Word Difficulty
3 Modeling Word Meaning Acquisition with Latent Semantic Analysis
Rethinking Word Difficulty
3.1 Latent Semantic Analysis (LSA)
Rethinking Word Difficulty
An appealing choice for quantitatively modeling word meanings and their growth over time is Latent Semantic Analysis (LSA), an unsupervised method for representing word and document meaning in a multidimensional vector space.
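To make the idea of tracking meaning growth concrete, the sketch below fits LSA term vectors on cumulative corpus slices and measures how much a word's nearest-neighbour set at each stage overlaps with the one from the full corpus. This neighbourhood-overlap proxy is an illustration only, not the word-maturity metric defined in the paper, and it assumes the slices are cumulative with the last slice being the full corpus.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    def neighbours(docs, word, k=100, top_n=20):
        vec = TfidfVectorizer()
        X = vec.fit_transform(docs)
        terms = np.array(vec.get_feature_names_out())
        # rows of components_.T are LSA term vectors
        T = TruncatedSVD(n_components=min(k, X.shape[1] - 1)).fit(X).components_.T
        i = int(np.where(terms == word)[0][0])
        sims = cosine_similarity(T[i:i + 1], T)[0]
        return set(terms[np.argsort(-sims)[1:top_n + 1]])

    def maturity_curve(corpus_slices, word):
        adult = neighbours(corpus_slices[-1], word)             # full-corpus neighbourhood
        return [len(neighbours(s, word) & adult) / len(adult)   # overlap at each stage
                for s in corpus_slices]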
latent semantic is mentioned in 5 sentences in this paper.
Zweig, Geoffrey and Platt, John C. and Meek, Christopher and Burges, Christopher J.C. and Yessenalina, Ainur and Liu, Qiang
Abstract
We tackle the problem with two approaches: methods that use local lexical information, such as the n-grams of a classical language model; and methods that evaluate global coherence, such as latent semantic analysis.
Introduction
As a first step, we have approached the problem from two points of view: first by exploiting local sentence structure, and secondly by measuring a novel form of global sentence coherence based on latent semantic analysis.
Introduction
a novel method based on latent semantic analysis (LSA).
Related Work
That paper also explores the use of Latent Semantic Analysis to measure the degree of similarity between a potential replacement and its context, but the results are poorer than others.
Sentence Completion via Latent Semantic Analysis
Latent Semantic Analysis (LSA) (Deerwester et al., 1990) is a widely used method for representing words and documents in a low dimensional vector space.
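The LSA-based completion scoring can be sketched as a simple total-similarity rule: a candidate word is scored by the summed cosine similarity between its LSA vector and the vectors of the other words in the sentence. This is a simplified reading of the approach (the paper also studies refinements), and `word_vecs` is assumed to hold precomputed LSA word vectors.

    import numpy as np

    def total_similarity(candidate, context_words, word_vecs):
        # word_vecs: dict mapping a word to its LSA vector (numpy array)
        c = word_vecs.get(candidate)
        if c is None:
            return float("-inf")
        score = 0.0
        for w in context_words:
            v = word_vecs.get(w)
            if v is not None:
                score += float(c @ v / (np.linalg.norm(c) * np.linalg.norm(v) + 1e-12))
        return score

    def best_completion(candidates, context_words, word_vecs):
        return max(candidates, key=lambda c: total_similarity(c, context_words, word_vecs))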
latent semantic is mentioned in 5 sentences in this paper.
Peng, Xingyuan and Ke, Dengfeng and Xu, Bo
Abstract
Compared with the Latent Semantic Analysis with Support Vector Regression (LSA-SVR) method (representing the conventional measures), our FST method shows better performance, especially on the ASR transcriptions.
Related Work
In the LSA-SVR method, each essay transcription is represented by a latent semantic space vector, which is regarded as the features in the SVR model.
Related Work
LSA (Deerwester et al., 1990) considers the relations between the dimensions in the conventional vector space model (VSM) (Salton et al., 1975), and it can order the importance of each dimension in the Latent Semantic Space (LSS).
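The LSA-SVR baseline described above amounts to a standard pipeline: represent each essay transcription as a latent semantic space vector and regress the score with support vector regression. A minimal scikit-learn sketch, with illustrative hyperparameters that are not taken from the paper:

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.svm import SVR

    def train_lsa_svr(essays, scores, k=300):
        model = make_pipeline(TfidfVectorizer(),
                              TruncatedSVD(n_components=k),   # latent semantic space vector
                              SVR(kernel="rbf", C=1.0))
        return model.fit(essays, scores)

    # usage: model = train_lsa_svr(train_essays, train_scores)
    #        predictions = model.predict(test_essays)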
latent semantic is mentioned in 3 sentences in this paper.
Baroni, Marco and Dinu, Georgiana and Kruszewski, Germán
Conclusion
For example, the developers of Latent Semantic Analysis (Landauer and Dumais, 1997), Topic Models (Griffiths et al., 2007) and related DSMs have shown that the dimensions of these models can be interpreted as general “latent” semantic domains, which gives the corresponding models some a priori cognitive plausibility while paving the way for interesting applications.
Conclusion
Do the dimensions of predict models also encode latent semantic domains?
Introduction
(2013d) compare their predict models to “Latent Semantic Analysis” (LSA) count vectors on syntactic and semantic analogy tasks, finding that the predict models are highly superior.
latent semantic is mentioned in 3 sentences in this paper.
Poon, Hoifung
Background
Top: the dependency tree of the sentence is annotated with latent semantic states by GUSP.
Grounded Unsupervised Semantic Parsing
GUSP produces a semantic parse of the question by annotating its dependency tree with latent semantic states.
Introduction
GUSP starts with the dependency tree of a sentence and produces a semantic parse by annotating the nodes and edges with latent semantic states derived from the database.
latent semantic is mentioned in 3 sentences in this paper.
Plank, Barbara and Moschitti, Alessandro
Abstract
In this paper, we propose to combine (i) term generalization approaches such as word clustering and latent semantic analysis (LSA) and (ii) structured kernels to improve the adaptability of relation extractors to new text genres/domains.
Computational Structures for RE
We study two ways for term generalization in tree kernels: Brown words clusters and Latent Semantic Analysis (LSA), both briefly described next.
Introduction
The latter is derived in two ways: (a) with Brown word clustering (Brown et al., 1992); and (b) with Latent Semantic Analysis (LSA).
latent semantic is mentioned in 3 sentences in this paper.
Lu, Xiaoming and Xie, Lei and Leung, Cheung-Chi and Ma, Bin and Li, Haizhou
Abstract
We evaluate two approaches employing LDA and probabilistic latent semantic analysis (PLSA) distributions respectively.
Introduction
Probabilistic latent semantic analysis (PLSA) (Hofmann, 1999) is a typical instance and is widely used.
Introduction
PLSA is the probabilistic variant of latent semantic analysis (LSA) (Choi et al., 2001), and offers a more solid statistical foundation.
latent semantic is mentioned in 3 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Introduction
Thus, each latent semantic class corresponds to one of the semantic tags found in labeled data.
Markov Topic Regression - MTR
(I) Semantic Tags (S_i): Each word w_i of a given utterance with N_j words, u_j = {w_i}_{i=1}^{N_j} ∈ U, j = 1, ..., |U|, from a set of utterances U, is associated with a latent semantic tag (state) variable s_i ∈ S, where S is the set of semantic tags.
Markov Topic Regression - MTR
latent semantic tag
latent semantic is mentioned in 3 sentences in this paper.
Chang, Kai-min K. and Cherkassky, Vladimir L. and Mitchell, Tom M. and Just, Marcel Adam
Brain Imaging Experiments on Adjective-Noun Comprehension
We are currently exploring the infinite latent semantic feature model (ILFM; Griffiths & Ghahramani, 2005), which assumes a nonparametric Indian Buffet Process prior over the binary feature vector and models neural activation with a linear Gaussian model.
Brain Imaging Experiments on Adjective-Noun Comprehension
We are investigating if the compositional models also operate in the learned latent semantic space.
Introduction
There are also efforts to recover the latent semantic structure from text corpora using techniques such as LSA (Landauer & Dumais, 1997) and topic models (Blei et al., 2003).
latent semantic is mentioned in 3 sentences in this paper.
Tan, Ming and Zhou, Wenli and Zheng, Lei and Wang, Shaojun
Composite language model
Since only one pair (d, w) is observed at a time, the joint probability model is a mixture of log-linear models with the expression p(d, w) = p(d) Σ_g p(w|g) p(g|d). Typically, the number of documents and the vocabulary size are much larger than the number of latent semantic class variables.
Composite language model
Thus, latent semantic class variables function as bottleneck variables to constrain word occurrences in
Introduction
(2006) integrated n-gram, structured language model (SLM) (Chelba and Jelinek, 2000) and probabilistic latent semantic analysis (PLSA) (Hofmann, 2001) under the directed MRF framework (Wang et al., 2005) and studied the stochastic properties for the composite language model.
latent semantic is mentioned in 3 sentences in this paper.
Croce, Danilo and Giannone, Cristina and Annesi, Paolo and Basili, Roberto
A Distributional Model for Argument Classification
Latent Semantic Analysis (LSA) (Landauer and Dumais, 1997) is then applied to M to acquire meaningful representations. LSA exploits the linear transformation called Singular Value Decomposition (SVD) and produces an approximation of the original matrix M, capturing (semantic) dependencies between context vectors.
Empirical Analysis
Clustering, as discussed in Section 3.1, makes it possible to generalize lexical information: clusters of similar heads within the latent semantic space are built from the annotated examples, and they allow the model to predict the behavior of new, unseen words found in the test sentences.
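A hedged sketch of this generalization step, assuming a word-by-context co-occurrence matrix M and a `vocab_index` lookup (both hypothetical names): SVD maps words into the latent space, annotated heads are clustered there, and an unseen head is assigned to the nearest cluster. Cluster count and dimensionality are illustrative.

    import numpy as np
    from sklearn.decomposition import TruncatedSVD
    from sklearn.cluster import KMeans

    def cluster_heads(M, vocab_index, annotated_heads, k=100, n_clusters=50):
        # M: sparse word-by-context co-occurrence matrix; vocab_index: word -> row index
        latent = TruncatedSVD(n_components=k).fit_transform(M)   # latent word vectors
        rows = [vocab_index[h] for h in annotated_heads if h in vocab_index]
        km = KMeans(n_clusters=n_clusters, n_init=10).fit(latent[rows])
        return latent, km

    def cluster_of(word, latent, vocab_index, km):
        # assign a (possibly unseen) head word to its nearest head cluster
        i = vocab_index[word]
        return int(km.predict(latent[i:i + 1])[0])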
Introduction
Moreover, it generalizes lexical information about the annotated examples by applying a geometrical model, in a Latent Semantic Analysis style, inspired by a distributional paradigm (Pado
latent semantic is mentioned in 3 sentences in this paper.
Kim, Jungi and Li, Jin-Ji and Lee, Jong-Hyeok
Term Weighting and Sentiment Analysis
Statistical measures of associations between terms include estimations by the co-occurrence in the whole collection, such as Point-wise Mutual Information (PMI) and Latent Semantic Analysis (LSA).
Term Weighting and Sentiment Analysis
Latent Semantic Analysis (LSA) (Landauer and Dumais, 1997) creates a semantic space from a collection of documents to measure the semantic relatedness of words.
Term Weighting and Sentiment Analysis
For LSA, we used the online demonstration mode from the Latent Semantic Analysis page from the University of Colorado at Boulder. For PMI, we used the online API provided by the CogWorks Lab at the Rensselaer Polytechnic Institute.
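For reference, the PMI association measure mentioned above can also be estimated locally from document co-occurrence counts; the sketch below assumes the documents are given as sets of tokens.

    import math

    def pmi(term_a, term_b, docs):
        # docs: iterable of token sets; PMI = log( p(a, b) / (p(a) p(b)) )
        n = len(docs)
        a = sum(term_a in d for d in docs)
        b = sum(term_b in d for d in docs)
        ab = sum(term_a in d and term_b in d for d in docs)
        if a == 0 or b == 0 or ab == 0:
            return float("-inf")
        return math.log((ab / n) / ((a / n) * (b / n)))

    # usage with tokenized documents:
    #   docs = [set(line.split()) for line in open("corpus.txt")]
    #   print(pmi("good", "excellent", docs))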
latent semantic is mentioned in 3 sentences in this paper.