Final Experiments | The following models are used as benchmarks: (i) PYTHY (Toutanova et al., 2007): utilizes human-generated summaries to train a sentence ranking system using a classifier model; (ii) HIERSUM (Haghighi and Vanderwende, 2009): based on hierarchical topic models.
Introduction | In particular, Haghighi and Vanderwende (2009) and Celikyilmaz and Hakkani-Tur (2010) build hierarchical topic models to identify salient sentences that contain abstract concepts rather than specific concepts.
Multi-Document Summarization Models | Some of these works (Haghighi and Vanderwende, 2009; Celikyilmaz and Hakkani-Tur, 2010) focus on the discovery of hierarchical concepts in documents (from abstract to specific) using extensions of hierarchical topic models (Blei et al., 2004) and reflect this hierarchy in the sentences.
Multi-Document Summarization Models | We utilize the advantages of previous topic models and build an unsupervised generative model that associates each word in each document with three random variables: a sentence S, a higher-level topic H, and a lower-level topic T, analogously to PAM models (Li and McCallum, 2006), i.e., a directed acyclic graph (DAG) representing mixtures of hierarchical structure, where super-topics are multinomials over sub-topics at lower levels in the DAG.
Topic Coherence for Summarization | In this section we discuss the main contribution, our two hierarchical mixture models, which improve summary generation performance through the use of tiered topic models.
Two-Tiered Topic Model - TTM | Our base model, the two-tiered topic model (TTM), is inspired by the hierarchical topic model PAM, proposed by Li and McCallum (2006).
Two-Tiered Topic Model - TTM | Figure 1: Graphical model depiction of the two-tiered topic model (TTM) described in Section 4.
Two-Tiered Topic Model - TTM | Our two-tiered topic model for salient sentence discovery generates each word in a document (Algorithm 1) as follows: for a word w_id in document d, an indicator random variable is drawn, which determines whether w_id is query related, i.e., whether w_id either appears in the query or is related to the query.
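The tiered structure described above (a higher-level topic mixing over lower-level topics, which in turn emit words) can be illustrated with a minimal generative sketch. This is not the paper's actual model or parameterization: the sizes, the uniformly drawn multinomials, and the omission of the sentence variable S and the query-relatedness indicator are all simplifying assumptions for illustration.

```python
import random

random.seed(0)

# Hypothetical sizes, chosen only for this sketch.
NUM_HIGH, NUM_LOW, VOCAB = 3, 6, 20

def rand_dist(n):
    """A random probability distribution over n outcomes (illustrative stand-in
    for a Dirichlet-drawn multinomial)."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

theta_high = rand_dist(NUM_HIGH)                           # mixture over higher-level topics
theta_low = [rand_dist(NUM_LOW) for _ in range(NUM_HIGH)]  # each higher topic mixes over lower topics
phi = [rand_dist(VOCAB) for _ in range(NUM_LOW)]           # each lower topic is a distribution over words

def draw(dist):
    """Sample an index from a discrete distribution by inverse CDF."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(dist):
        acc += p
        if r < acc:
            return i
    return len(dist) - 1

def generate_word():
    """Sample one word: higher-level topic H -> lower-level topic T -> word w."""
    h = draw(theta_high)
    t = draw(theta_low[h])
    w = draw(phi[t])
    return h, t, w

doc = [generate_word() for _ in range(10)]
```

The key structural point the sketch captures is that the lower-level topic is drawn conditioned on the higher-level topic, giving the DAG of super-topics over sub-topics that the PAM-style construction describes.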
Abstract | Topic models have been used extensively as a tool for corpus exploration, and a cottage industry has developed to tweak topic models to better encode human intuitions or to better model data. |
Abstract | However, creating such extensions requires expertise in machine learning unavailable to potential end-users of topic modeling software. |
Introduction | Probabilistic topic models, as exemplified by probabilistic latent semantic indexing (Hofmann, 1999) and latent Dirichlet allocation (LDA) (Blei et al., 2003), are unsupervised statistical techniques to discover the thematic topics that permeate a large corpus of text documents.
Introduction | Topic models have had considerable application beyond natural language processing in computer vision (Rob et al., 2005), biology (Shringarpure and Xing, 2008), and psychology (Landauer et al., 2006) in addition to their canonical application to text. |
Introduction | For text, one of the few real-world applications of topic models is corpus exploration. |
Abstract | We present a probabilistic topic model for jointly identifying properties and attributes of social media review snippets. |
Conclusion | We have presented a probabilistic topic model for identifying properties and attitudes of product review snippets. |
Introduction | We capture this idea using a Bayesian topic model where a set of properties and corresponding attribute tendencies are represented as hidden variables. |
Related Work | Finally, a number of approaches analyze review documents using probabilistic topic models (Lu and Zhai, 2008; Titov and McDonald, 2008; Mei et al., 2007). |
Experiments | We also tried the standard LDA model and the author-topic model on our data set and found that our proposed topic model was better or at least comparable in terms of finding meaningful topics. |
Method | To extract keyphrases, we first identify topics from the Twitter collection using topic models (Section 3.2). |
Method | Author-topic models have been shown to be effective for topic modeling of microblogs (Weng et al., 2010; Hong and Davison, 2010). |
Method | Given the topic model φ_t previously learned for topic t, we can set P(w|t, R = 1) to φ_{t,w}, i.e.
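Setting P(w|t, R = 1) from a learned topic-word distribution amounts to a simple lookup in the normalized distribution for topic t. The sketch below uses made-up word counts to stand in for what a learned topic model would provide; the vocabulary and counts are hypothetical.

```python
# Hypothetical topic-word counts for one topic t (stand-in for learned phi_t).
counts_t = {"travel": 30, "hotel": 25, "flight": 20, "deal": 15, "photo": 10}

# Normalize counts into a distribution: phi_t[w] approximates P(w | t).
total = sum(counts_t.values())
phi_t = {w: c / total for w, c in counts_t.items()}

# Setting P(w | t, R = 1) directly to the learned topic-word probability:
p_w_given_t = phi_t["hotel"]  # 25 / 100 = 0.25
```

In practice phi_t would come from the topic model's estimated parameters rather than raw counts, but the lookup itself is the same.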
Introduction | Hao et al. (2010) use a location-based topic model to summarize travelogues, enrich them with automatically chosen images, and provide travel recommendations.
Introduction | (2010) evaluate their geographic topic model by geolocating USA-based Twitter users based on their tweet content. |
Introduction | Their geographic topic model receives supervision from many documents/users and predicts locations for unseen documents/users. |