Background 3.1 LDA | (Chakraborti et al., 2007) empirically show that sprinkled words boost higher order word associations and projects documents with same class labels close to each other in latent semantic space. |
Conclusions and Future Work | We have used the idea of sprinkling originally proposed in the context of supervised Latent Semantic Analysis, but the setting here is quite different. |
Introduction | LDA is an unsupervised probabilistic topic model and it is widely used to discover latent semantic structure of a document collection by modeling words in the documents. |
Introduction | Sprinkling (Chakraborti et al., 2007) integrates class labels of documents into Latent Semantic Indexing (LSI)(Deerwester et al., 1990). |
Introduction | As LSI uses higher order word associations (Kontostathis and Pottenger, 2006), sprinkling of artificial words gives better and class-enriched latent semantic structure. |
Conclusion | For example, the developers of Latent Semantic Analysis (Landauer and Dumais, 1997), Topic Models (Griffiths et al., 2007) and related DSMs have shown that the dimensions of these models can be interpreted as general “latent” semantic domains, which gives the corresponding models some a priori cognitive plausibility while paving the way for interesting applications. |
Conclusion | Do the dimensions of predict models also encode latent semantic domains? |
Introduction | (2013d) compare their predict models to “Latent Semantic Analysis” (LSA) count vectors on syntactic and semantic analogy tasks, finding that the predict models are highly superior. |