Index of papers in Proc. ACL that mention
  • latent variables
Feng, Yansong and Lapata, Mirella
BBC News Database
Unlike other unsupervised approaches where a set of latent variables is introduced, each defining a joint distribution on the space of keywords and image features, the relevance model captures the joint probability of images and annotated words directly, without requiring an intermediate clustering stage.
BBC News Database
achieve competitive performance with latent variable models.
BBC News Database
Each annotated image in the training set is treated as a latent variable .
Related Work
Another way of capturing co-occurrence information is to introduce latent variables linking image features with words.
latent variables is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Sun, Xu and Okazaki, Naoaki and Tsujii, Jun'ichi
Abbreviator with Nonlocal Information
2.1 A Latent Variable Abbreviator
Abbreviator with Nonlocal Information
To implicitly incorporate nonlocal information, we propose discriminative probabilistic latent variable models (DPLVMs) (Morency et al., 2007; Petrov and Klein, 2008) for abbreviating terms.
Abbreviator with Nonlocal Information
The DPLVM is a natural extension of the CRF model (see Figure 2), which is a special case of the DPLVM, with only one latent variable assigned for each label.
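To make that relationship concrete, here is a generic sketch of the usual DPLVM formulation (standard textbook notation, not necessarily the paper's): each label y is associated with a disjoint set of latent states H_y, and the label probability marginalises over them,

    p(y \mid x) = \sum_{h \in H_y} p(h \mid x)

so an ordinary CRF is recovered when every H_y contains exactly one latent state.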
Abstract
First, in order to incorporate nonlocal information into abbreviation generation tasks, we present both implicit and explicit solutions: the latent variable model, or alternatively, the label encoding approach with global information.
Introduction
Variables x, y, and h represent observation, label, and latent variables, respectively.
Introduction
discriminative probabilistic latent variable model (DPLVM) in which nonlocal information is modeled by latent variables .
latent variables is mentioned in 24 sentences in this paper.
Topics mentioned in this paper:
Yang, Qiang and Chen, Yuqiang and Xue, Gui-Rong and Dai, Wenyuan and Yu, Yong
Experiments
The entropy of g on a single latent variable z is defined to be H(g, z) ≜ −Σ_{c∈C} P(c|z) log₂ P(c|z), where C is the class
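A minimal runnable sketch of this entropy computation (hypothetical function and variable names; it assumes P(c|z) for a single latent variable z is available as a dictionary of class probabilities):

    import math

    def entropy_of_latent_variable(p_c_given_z):
        # H(g, z) = -sum over classes c of P(c|z) * log2 P(c|z), skipping zero-probability classes
        return -sum(p * math.log2(p) for p in p_c_given_z.values() if p > 0)

    # Example: a latent variable whose items split 75%/25% between two classes.
    print(entropy_of_latent_variable({"class_a": 0.75, "class_b": 0.25}))  # ~0.811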
Image Clustering with Annotated Auxiliary Data
In order to unify those two separate PLSA models, these two steps are done simultaneously with common latent variables used as a bridge linking them.
Image Clustering with Annotated Auxiliary Data
Through these common latent variables , which are now constrained by both target image data and auxiliary annotation data, a better clustering result is expected for the target data.
Image Clustering with Annotated Auxiliary Data
Let Z be the latent variable set in our aPLSA model.
latent variables is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Ó Séaghdha, Diarmuid
Conclusions and future work
The models presented here derive their predictions by modelling predicate-argument plausibility through the intermediary of latent variables .
Conclusions and future work
We also anticipate that latent variable models will prove effective for learning selectional preferences of semantic predicates (e. g., FrameNet roles) where direct estimation from a large corpus is not a viable option.
Related work
In Rooth et al.’s model each observed predicate-argument pair is probabilistically generated from a latent variable , which is itself generated from an underlying distribution on variables.
Related work
The use of latent variables, which correspond to coherent clusters of predicate-argument interactions, allows probabilities to be assigned to predicate-argument pairs which have not previously been observed by the model.
Related work
The work presented in this paper is inspired by Rooth et al.’s latent variable approach, most directly in the model described in Section 3.3.
Results
Latent variable models that use EM for inference can be very sensitive to the number of latent variables chosen.
Three selectional preference models
Each model has at least one vocabulary of Z arbitrarily labelled latent variables .
Three selectional preference models
f_zn is the number of observations where the latent variable z has been associated with the argument type n, f_zv is the number of observations where z has been associated with the predicate type v, and f_zr is the number of observations where z has been associated with the relation r.
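A small sketch of how such co-occurrence counts could be accumulated from sampled assignments (the data layout and names here are illustrative assumptions, not taken from the paper: each observation is a (z, predicate, relation, argument) tuple, where z is the currently sampled latent variable):

    from collections import Counter

    def count_latent_cooccurrences(observations):
        # observations: iterable of (z, predicate v, relation r, argument n) tuples
        f_zv, f_zr, f_zn = Counter(), Counter(), Counter()
        for z, v, r, n in observations:
            f_zv[(z, v)] += 1  # latent variable z associated with predicate type v
            f_zr[(z, r)] += 1  # latent variable z associated with relation r
            f_zn[(z, n)] += 1  # latent variable z associated with argument type n
        return f_zv, f_zr, f_zn

    f_zv, f_zr, f_zn = count_latent_cooccurrences([
        (3, "eat", "dobj", "apple"),
        (3, "eat", "dobj", "bread"),
    ])
    print(f_zv[(3, "eat")])  # 2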
Three selectional preference models
In Rooth et al.’s (1999) selectional preference model, a latent variable is responsible for generating both the predicate and argument types of an observation.
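In its standard presentation, that model factorises the joint probability of a predicate v and an argument n through the latent class c (the symbols below follow the common textbook form and may differ from this paper's notation):

    p(v, n) = \sum_{c} p(c)\, p(v \mid c)\, p(n \mid c)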
latent variables is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Wang, William Yang and Mayfield, Elijah and Naidu, Suresh and Dittmar, Jeremiah
Abstract
We propose a latent variable model to enhance historical analysis of large corpora.
Introduction
Latent variable models, such as latent Dirichlet allocation (LDA) (Blei et al., 2003) and probabilistic latent semantic analysis (PLSA) (Hofmann, 1999), have been used in the past to facilitate social science research.
Introduction
To do this we augment SAGE with two sparse latent variables that model the region and time of a document, as well as a third sparse latent variable that captures the interactions among the region, time and topic latent variables.
Related Work
For example, SVM does not have latent variables to model the subtle differences and interactions of features from different domains (e.g.
Related Work
(2010) use a latent variable model to predict geolocation information of Twitter users, and investigate geographic variations of language use.
The Sparse Mixed-Effects Model
It also incorporates latent variables τ to model the variance for each sparse deviation η.
The Sparse Mixed-Effects Model
The three major sparse deviation latent variables are η^(T), η^(R), and η^(Q).
The Sparse Mixed-Effects Model
All of the three latent variables are condi-
latent variables is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Titov, Ivan
Abstract
One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains.
Abstract
Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the label on the source domain.
Abstract
We introduce a constraint enforcing that marginal distributions of each cluster (i.e., each latent variable ) do not vary significantly across domains.
Introduction
We use generative latent variable models (LVMs) learned on all the available data: unlabeled data for both domains and on the labeled data for the source domain.
Introduction
The latent variables encode regularities observed on unlabeled data from both domains, and they are learned to be predictive of the labels on the source domain.
Introduction
The danger of this semi-supervised approach in the domain-adaptation setting is that some of the latent variables will correspond to clusters of features specific only to the source domain, and consequently, the classifier relying on this latent variable will be badly affected when tested on the target domain.
The Latent Variable Model
vectors of latent variables , to abstract away from handcrafted features.
The Latent Variable Model
The model assumes that the features and the latent variable vector are generated jointly from a globally-normalized model and then the label 3/ is generated from a conditional distribution dependent on z.
latent variables is mentioned in 23 sentences in this paper.
Topics mentioned in this paper:
Blunsom, Phil and Cohn, Trevor and Osborne, Miles
Abstract
We present a translation model which models derivations as a latent variable , in both training and decoding, and is fully discriminative and globally optimised.
Challenges for Discriminative SMT
Instead we model the translation distribution with a latent variable for the derivation, which we marginalise out in training and decoding.
Discriminative Synchronous Transduction
As the training data only provides source and target sentences, the derivations are modelled as a latent variable .
Discriminative Synchronous Transduction
Our findings echo those observed for latent variable log-linear models successfully used in monolingual parsing (Clark and Curran, 2007; Petrov et al., 2007).
Discriminative Synchronous Transduction
This method has been demonstrated to be effective for (non-convex) log-linear models with latent variables (Clark and Curran, 2004; Petrov et al., 2007).
Evaluation
Derivational ambiguity Table 1 shows the impact of accounting for derivational ambiguity in training and decoding. There are two options for training: we could use our latent variable model and optimise the probability of all derivations of the reference translation, or choose a single derivation that yields the reference and optimise its probability alone.
Evaluation
Max-translation decoding for the model trained on single derivations has only a small positive effect, while for the latent variable model the impact is much larger.
Introduction
Second, within this framework, we model the derivation, d, as a latent variable , p(e, d|f), which is marginalised out in training and decoding.
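A generic sketch of the marginalisation these excerpts describe (notation may differ from the paper): the translation probability sums over all derivations that yield the translation e from the source f,

    p(e \mid f) = \sum_{d \in \Delta(e, f)} p(e, d \mid f)

so max-translation decoding selects the e maximising this sum, whereas max-derivation decoding selects only the single highest-scoring (e, d) pair.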
latent variables is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Yates, Alexander
Related Work
Sparsity for low-order contexts has recently spurred interest in using latent variables to represent distributions over contexts in language models.
Related Work
Several authors investigate neural network models that learn not just one latent state, but rather a vector of latent variables , to represent each word in a language model (Bengio et al., 2003; Emami et al., 2003; Morin and Bengio, 2005).
Smoothing Natural Language Sequences
2.3 Latent Variable Language Model Representation
Smoothing Natural Language Sequences
Latent variable language models (LVLMs) can be used to produce just such a distributional representation.
Smoothing Natural Language Sequences
We use Hidden Markov Models (HMMs) as the main example in the discussion and as the LVLMs in our experiments, but the smoothing technique can be generalized to other forms of LVLMs, such as factorial HMMs and latent variable maximum entropy models (Ghahramani and Jordan, 1997; Smith and Eisner, 2005).
latent variables is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Parikh, Ankur P. and Cohen, Shay B. and Xing, Eric P.
Abstract
We associate each sentence with an undirected latent tree graphical model, which is a tree consisting of both observed variables (corresponding to the words in the sentence) and an additional set of latent variables that are unobserved in the data.
Abstract
However, due to the presence of latent variables , structure learning of latent trees is substantially more complicated than in observed models.
Abstract
The latent variables can incorporate various linguistic properties, such as head information, valence of dependency being generated, and so on.
latent variables is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Guo, Weiwei and Li, Hao and Ji, Heng and Diab, Mona
Abstract
The contribution of the paper is twofold: 1. we introduce the Linking-Tweets-to-News task as well as a dataset of linked tweet-news pairs, which can benefit many NLP applications; 2. in contrast to previous research which focuses on lexical features within the short texts (text-to-word information), we propose a graph-based latent variable model that models the inter short text correlations (text-to-text information).
Conclusion
We formalize the linking task as a short text modeling problem, and extract Twitter/news specific features to extract text-to-text relations, which are incorporated into a latent variable model.
Experiments
As a latent variable model, it is able to capture global topics (+1.89% ATOP over LDA-wvec); moreover, by explicitly modeling missing words, the existence of a word is also encoded in the latent vector (+2.31% TOP10 and −0.011% RR over IR model).
Experiments
The only evidence the latent variable models rely on is lexical items (WTMF-G extracts additional text-to-text correlation by word matching).
Introduction
Latent variable models are powerful by going beyond the surface word level and mapping short texts into a low dimensional dense vector (Socher et al., 2011; Guo and Diab, 2012b).
Introduction
Accordingly, we apply a latent variable model, namely, the Weighted Textual Matrix Factorization [WTMF] (Guo and Diab, 2012b; Guo and Diab, 2012c) to both the tweets and the news articles.
Introduction
Our proposed latent variable model not only models text-to-word information, but also is aware of the text-to-text information (illustrated in Figure 1): two linked texts should have similar latent vectors, accordingly the semantic picture of a tweet is completed by receiving semantics from its related tweets.
latent variables is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Shindo, Hiroyuki and Miyao, Yusuke and Fujino, Akinori and Nagata, Masaaki
Experiment
From this viewpoint, TSG utilizes surrounding symbols (NNP of NPNNP in the above example) as latent variables with which to capture context information.
Experiment
as latent variables and the search space is larger than that of a TSG when the symbol refinement model allows for more than two subcategories for each symbol.
Experiment
Our experimental results confirm that jointly modeling both latent variables using our SR-TSG assists accurate parsing.
Inference
The inference of the SR-TSG derivations corresponds to inferring two kinds of latent variables : latent symbol subcategories and latent substitution
Inference
This stepwise learning is simple and efficient in practice, but we believe that the joint learning of both latent variables is possible, and we will deal with this in future work.
Inference
This sampler simultaneously updates blocks of latent variables associated with a sentence, thus it can find MAP solutions efficiently.
latent variables is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Guo, Weiwei and Diab, Mona
Abstract
In this paper, we show that by carefully handling words that are not in the sentences (missing words), we can train a reliable latent variable model on sentences.
Abstract
Experiments on the new task and previous data sets show significant improvement of our model over baselines and other traditional latent variable models.
Experiments and Results
All the latent variable models (LSA, LDA, WTMF) are built on the same corpus: WN+Wik+Brown (393,666 sentences and 4,262,026 words).
Experiments and Results
In these latent variable models, there are several essential parameters: the weight of missing words w_m, and the dimension K. Figures 2 and 3 analyze the impact of these parameters on ATOP_test.
Introduction
Latent variable models, such as Latent Semantic Analysis [LSA] (Landauer et al., 1998), Probabilistic Latent Semantic Analysis [PLSA] (Hofmann, 1999), Latent Dirichlet Allocation [LDA] (Blei et al., 2003) can solve the two issues naturally by modeling the semantics of words and sentences simultaneously in the low-dimensional latent space.
Introduction
After analyzing the way traditional latent variable models (LSA, PLSA/LDA) handle missing words, we decide to model sentences using a weighted matrix factorization approach (Srebro and Jaakkola, 2003), which allows us to treat observed words and missing words differently.
Limitations of Topic Models and LSA for Modeling Sentences
Usually latent variable models aim to find a latent semantic profile for a sentence that is most relevant to the observed words.
latent variables is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Sun, Xu and Wang, Houfeng and Li, Wenjie
Introduction
While most of the state-of-the-art CWS systems used semi-Markov conditional random fields or latent variable conditional random fields, we simply use a single first-order conditional random fields (CRFs) for the joint modeling.
Introduction
The semi-Markov CRFs and latent variable CRFs relax the Markov assumption of CRFs to express more complicated dependencies, and therefore to achieve higher disambiguation power.
Related Work
To achieve high accuracy, most of the state-of-the-art systems are heavy probabilistic systems using semi-Markov assumptions or latent variables (Andrew, 2006; Sun et al., 2009b).
Related Work
For example, one of the state-of-the-art CWS system is the latent variable conditional random field (Sun et al., 2008; Sun and Tsujii, 2009) system presented in Sun et al.
Related Work
Those semi-Markov perceptron systems are moderately faster than the heavy probabilistic systems using semi-Markov conditional random fields or latent variable conditional random fields.
latent variables is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Nguyen, Thang and Hu, Yuening and Boyd-Graber, Jordan
Conclusion
These regularizations could improve spectral algorithms for latent variable models, improving the performance for other NLP tasks such as latent variable PCFGs (Cohen et al., 2013) and HMMs (Anandkumar et al., 2012), combining the flexibility and robustness offered by priors with the speed and accuracy of new, scalable algorithms.
Introduction
Theoretically, their latent variable formulation has served as a foundation for more robust models of other linguistic phenomena (Brody and Lapata, 2009).
Introduction
Modern topic models are formulated as a latent variable model.
Introduction
Typical solutions use MCMC (Griffiths and Steyvers, 2004) or variational EM (Blei et al., 2003), which can be viewed as local optimization: searching for the latent variables that maximize the data likelihood.
latent variables is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Abend, Omri and Cohen, Shay B. and Steedman, Mark
Background and Related Work
Our work proposes a uniform treatment to MWPs of varying degrees of compositionality, and avoids defining MWPs explicitly by modelling their LCs as latent variables .
Introduction
We present a novel approach to the task that models the selection and relative weighting of the predicate’s LCs using latent variables .
Our Proposal: A Latent LC Approach
We address the task with a latent variable log-linear model, representing the LCs of the predicates.
Our Proposal: A Latent LC Approach
We choose this model for its generality, conceptual simplicity, and because it allows us to easily incorporate various feature sets and sets of latent variables.
Our Proposal: A Latent LC Approach
The introduction of latent variables into the log-linear model leads to a non-convex objective function.
latent variables is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Nguyen, Viet-An and Boyd-Graber, Jordan and Resnik, Philip
Inference
To find the latent variables that best explain observed data, we use Gibbs sampling, a widely used Markov chain Monte Carlo inference technique (Neal, 2000; Resnik and Hardisty, 2010).
Inference
The state space is the latent variables for topic indices assigned to all tokens, z = {z_{d,n}}, and topic shifts assigned to turns, l = {l_{d,t}}.
Inference
We marginalize over all other latent variables .
Modeling Multiparty Discussions
Instead, we endow each turn with a binary latent variable l_{d,t}, called the topic shift.
Modeling Multiparty Discussions
This latent variable signifies whether the speaker changed the topic of the conversation.
Related and Future Work
as a distinct latent variable (Wang and McCallum, 2006; Eisenstein et al., 2010).
latent variables is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Pershina, Maria and Min, Bonan and Xu, Wei and Grishman, Ralph
Guided DS
We introduce a set of latent variables h_i which model human ground truth for each mention in the i-th bag and take precedence over the current model assignment z_i.
Guided DS
• z_{ij} ∈ R ∪ NR: a latent variable that denotes the relation of the j-th mention in the i-th bag
Guided DS
• h_{ij} ∈ R ∪ NR: a latent variable that denotes the refined relation of the mention x_{ij}
Introduction
(2012), we generalize the labeled data through feature selection and model this additional information directly in the latent variable approaches.
The Challenge
Instead we propose to perform feature selection to generalize human-labeled data into training guidelines, and integrate them into the latent variable model.
latent variables is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Eidelman, Vladimir and Marton, Yuval and Resnik, Philip
Conclusions and Future Work
The closed-form online update for our relative margin solution accounts for surrogate references and latent variables .
Introduction
Unfortunately, not all advances in machine learning are easy to apply to structured prediction problems such as SMT; the latter often involve latent variables and surrogate references, resulting in loss functions that have not been well explored in machine learning (McAllester and Keshet, 2011; Gimpel and Smith, 2012).
Introduction
The contributions of this paper include (1) introduction of a loss function for structured RMM in the SMT setting, with surrogate reference translations and latent variables; (2) an online gradient-based solver, RM, with a closed-form parameter update to optimize the relative margin loss; and (3) an efficient implementation that integrates well with the open source cdec SMT system (Dyer et al., 2010). In addition, (4) as our solution is not dependent on any specific QP solver, it can be easily incorporated into practically any gradient-based learning algorithm.
Introduction
First, we introduce RMM (§3.1) and propose a latent structured relative margin objective which incorporates cost-augmented hypothesis selection and latent variables .
Learning in SMT
While many derivations d ∈ D(x) can produce a given translation, we are only able to observe y; thus we model d as a latent variable.
latent variables is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Gormley, Matthew R. and Eisner, Jason
Abstract
Many models in NLP involve latent variables , such as unknown parses, tags, or alignments.
Projections
Given a relaxed joint solution to the parameters and the latent variables, one must be able to project it to a nearby feasible one, by projecting either the fractional parameters or the fractional latent variables into the feasible space and then solving exactly for the other.
Related Work
The goal of this work was to better understand and address the non-convexity of maximum-likelihood training with latent variables , especially parses.
Related Work
For supervised parsing, spectral leam-ing has been used to learn latent variable PCFGs (Cohen et al., 2012) and hidden-state dependency grammars (Luque et al., 2012).
The Constrained Optimization Task
The feature counts are constrained to be derived from the latent variables (e.g., parses), which are unknown discrete structures that must be encoded with integer variables.
latent variables is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Hu, Yuening and Zhai, Ke and Eidelman, Vladimir and Boyd-Graber, Jordan
Inference
Inference of probabilistic models discovers the posterior distribution over latent variables .
Inference
For a collection of D documents, each of which contains N_d words, the latent variables of ptLDA are: transition distributions π_{k,i} for every topic k and internal node i in the prior tree structure; multinomial distributions over topics θ_d for every document d; and topic assignments z_{dn} and paths y_{dn} for the n-th word w_{dn} in document d. The joint distribution of polylingual tree-based topic models is
Inference
approximate posterior inference to discover the latent variables that best explain our data.
latent variables is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Zhu, Jun and Zheng, Xun and Zhang, Bo
A Gibbs Sampling Algorithm
Our algorithm represents a first attempt to extend Polson’s approach (Polson et al., 2012) to deal with highly nontrivial Bayesian latent variable models.
Experiments
It is nontrivial to develop a Gibbs sampling algorithm using the similar data augmentation idea, due to the presence of latent variables and the nonlinearity of the soft-max function.
Introduction
ing due to the presence of nontrivial latent variables .
Logistic Supervised Topic Models
But the presence of latent variables poses additional challenges in carrying out a formal theoretical analysis of these surrogate losses (Lin, 2001) in the topic model setting.
Logistic Supervised Topic Models
Moreover, the latent variables Z make the inference problem harder than that of Bayesian logistic regression models (Chen et al., 1999; Meyer and Laud, 2002; Polson et al., 2012).
latent variables is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Lee, Chia-ying and Glass, James
Introduction
In contrast to the previous methods, we approach the problem by modeling the three sub-problems as well as the unknown set of sub-word units as latent variables in one nonparametric Bayesian model.
Model
In the next section, we show how to infer the value of each of the latent variables in Fig.
Problem Formulation
We model the three subtasks as latent variables in our approach.
Problem Formulation
In this section, we describe the observed data, latent variables , and auxiliary variables
Related Work
For the domain our problem is applied to, our model has to include more latent variables and is more complex.
latent variables is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Liu, Yang and Mi, Haitao and Feng, Yang and Liu, Qun
Background
(2008) present a latent variable model that describes the relationship between translation and derivation clearly.
Background
Although originally proposed for supporting large sets of nonindependent and overlapping features, the latent variable model is actually a more general form of conventional linear model (Och and Ney, 2002).
Background
Accordingly, decoding for the latent variable model can be formalized as
Related Work
They show that max-translation decoding outperforms max-derivation decoding for the latent variable model.
latent variables is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Cheung, Jackie Chi Kit and Penn, Gerald
Distributional Semantic Hidden Markov Models
This model can be thought of as an HMM with two layers of latent variables , representing events and slots in the domain.
Distributional Semantic Hidden Markov Models
Event Variables At the top level, a categorical latent variable E_t with N_E possible states represents the event that is described by clause t.
Distributional Semantic Hidden Markov Models
Slot Variables Categorical latent variables with N_S possible states represent the slot that an argument fills, and are conditioned on the event variable in the clause, E_t (i.e., P_S(S_{ta}|E_t) for the a-th slot variable).
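Taken together, these excerpts suggest a two-layer generative structure roughly of the form below (a hedged sketch; the full model also generates the arguments themselves and may include further conditioning):

    E_t ~ P(E_t \mid E_{t-1}),      E_t in {1, ..., N_E}
    S_{ta} ~ P_S(S_{ta} \mid E_t),  S_{ta} in {1, ..., N_S}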
Related Work
Distributions that generate the latent variables and hyperparameters are omitted for clarity.
latent variables is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Bodenstab, Nathan and Dunlop, Aaron and Hall, Keith and Roark, Brian
Background
Petrov and Klein (2007a) derive coarse grammars in a more statistically principled way, although the technique is closely tied to their latent variable grammar representation.
Experimental Setup
Alternative decoding methods, such as marginalizing over the latent variables in the grammar or MaxRule decoding (Petrov and Klein, 2007a) are certainly possible in our framework, but it is unknown how effective these methods will be given the heavily pruned na-
Introduction
Grammar transformation techniques such as linguistically inspired nonterminal annotations (Johnson, 1998; Klein and Manning, 2003b) and latent variable grammars (Matsuzaki et al., 2005; Petrov et al., 2006) have increased the grammar size |G| from a few thousand rules to several million in an explicitly enumerable grammar, or even more in an implicit grammar.
Introduction
Rather, the beam-width prediction model is trained to learn the rank of constituents in the maximum likelihood trees. We will illustrate this by presenting results using a latent-variable grammar, for which there is no “true” reference latent variable parse.
latent variables is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Snyder, Benjamin and Barzilay, Regina and Knight, Kevin
Inference
In order to do so, we need to integrate out all the other latent variables in our model.
Inference
To do so tractably, we use Gibbs sampling to draw each latent variable conditioned on our current sample of the others.
Inference
Even with a large number of sampling rounds, it is difficult to fully explore the latent variable space for complex unsupervised models.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Yates, Alexander
Introduction
An HMM is a generative probabilistic model that generates each word x_i in the corpus conditioned on a latent variable Y_i.
Introduction
Each Y_i in the model takes on integral values from 1 to K, and each one is generated by the latent variable for the preceding word, Y_{i−1}.
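A minimal generative sketch of the HMM structure described here (toy parameters and 0-based state indices are illustrative assumptions; transition[k] and emission[k] are the distributions conditioned on latent state k):

    import random

    def generate_sequence(length, start_probs, transition, emission):
        # Sample words x_1..x_T: each Y_i depends on Y_{i-1}, and each x_i is emitted from Y_i.
        words, state = [], None
        for _ in range(length):
            probs = start_probs if state is None else transition[state]
            state = random.choices(range(len(probs)), weights=probs)[0]
            word = random.choices(list(emission[state]), weights=list(emission[state].values()))[0]
            words.append(word)
        return words

    # Toy example with K = 2 latent states.
    print(generate_sequence(
        5,
        start_probs=[0.6, 0.4],
        transition=[[0.7, 0.3], [0.2, 0.8]],
        emission=[{"the": 0.5, "a": 0.5}, {"dog": 0.5, "runs": 0.5}],
    ))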
Introduction
In response, we introduce latent variable models of word spans, or sequences of words.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kushman, Nate and Artzi, Yoav and Zettlemoyer, Luke and Barzilay, Regina
Introduction
In both cases, the available labeled equations (either the seed set, or the full set) are abstracted to provide the model’s equation templates, while the slot filling and alignment decisions are latent variables whose settings are estimated by directly optimizing the marginal data log-likelihood.
Mapping Word Problems to Equations
In this way, the distribution over derivations y is modeled as a latent variable.
Related Work
In our approach, systems of equations are relatively easy to specify, providing a type of template structure, and the alignment of the slots in these templates to the text is modeled primarily with latent variables during learning.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hovy, Dirk
Extending the Model
By adding additional transitions, we can constrain the latent variables further.
Introduction
(2011) proposed an approach that uses co-occurrence patterns to find entity type candidates, and then learns their applicability to relation arguments by using them as latent variables in a first-order HMM.
Model
Thus all common nouns are possible types, and can be used as latent variables in an HMM.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hall, David and Berg-Kirkpatrick, Taylor and Klein, Dan
Minimum Bayes risk parsing
MBR parsing has proven especially useful in latent variable grammars.
Minimum Bayes risk parsing
Petrov and Klein (2007) showed that MBR trees substantially improved performance over Viterbi parses for latent variable grammars, earning up to 1.5 F1.
Sparsity and CPUs
For instance, in a latent variable parser, the coarse grammar would have symbols like NP, VP, etc., and the fine pass would have refined symbols NP_0, NP_1, VP_4, and so on.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Cohen, Shay B. and Collins, Michael
Introduction
This matrix form has clear relevance to latent variable models.
Related Work
Recently a number of researchers have developed provably correct algorithms for parameter estimation in latent variable models such as hidden Markov models, topic models, directed graphical models with latent variables , and so on (Hsu et al., 2009; Bailly et al., 2010; Siddiqi et al., 2010; Parikh et al., 2011; Balle et al., 2011; Arora et al., 2013; Dhillon et al., 2012; Anandkumar et al., 2012; Arora et al., 2012; Arora et al., 2013).
The Learning Algorithm for L-PCFGS
The training set does not include values for the latent variables ; this is the main challenge in learning.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Chaturvedi, Snigdha and Goldwasser, Dan and Daumé III, Hal
Intervention Prediction Models
p_i, r, and φ(t) are observed and h_i are the latent variables.
Intervention Prediction Models
In the first step, it determines the latent variable assignments for positive examples.
Intervention Prediction Models
Once this process converges for negative examples, the algorithm reassigns values to the latent variables for positive examples, and proceeds to the second step.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Berant, Jonathan and Liang, Percy
Model overview
Many existing paraphrase models introduce latent variables to describe the derivation of c from x, e.g., with transformations (Heilman and Smith, 2010; Stern and Dagan, 2011) or alignments (Haghighi et al., 2005; Das and Smith, 2009; Chang et al., 2010).
Model overview
However, we opt for a simpler paraphrase model without latent variables in the interest of efficiency.
Paraphrasing
The NLP paraphrase literature is vast and ranges from simple methods employing surface features (Wan et al., 2006), through vector space models (Socher et al., 2011), to latent variable models (Das and Smith, 2009; Wang and Manning, 2010; Stern and Dagan, 2011).
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Bamman, David and Underwood, Ted and Smith, Noah A.
Experiments
As a baseline, we also evaluate all hypotheses on a model with no latent variables whatsoever, which instead measures similarity as the average JS divergence between the empirical word distributions over each role type.
Experiments
Table 1 presents the results of this comparison; for all models with latent variables , we report the average of 5 sampling runs with different random initializations.
Model
Observed variables are shaded, latent variables are clear, and collapsed variables are dotted.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Tian, Zhenhua and Xiang, Hengheng and Liu, Ziqi and Zheng, Qinghua
RSP: A Random Walk Model for SP
LDA-SP: Another kind of sophisticated unsupervised approach for SP is latent variable models based on Latent Dirichlet Allocation (LDA).
Related Work 2.1 WordNet-based Approach
Recently, more sophisticated methods have been developed for SP based on topic models, where the latent variables (topics) take the place of semantic classes and distributional clusterings (Seaghdha, 2010; Ritter et al., 2010).
Related Work 2.1 WordNet-based Approach
Without introducing semantic classes and latent variables, Keller and Lapata (2003) use the web to obtain frequencies for unseen bigrams.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Branavan, S.R.K and Silver, David and Barzilay, Regina
Adding Linguistic Knowledge to the Monte-Carlo Framework
Game only            17.3    5.3 ± 2.7
Sentence relevance   46.7    2.8 ± 3.5
Full model           53.7    5.9 ± 3.5
Random text          40.3    4.3 ± 3.4
Latent variable      26.1    3.7 ± 3.1
Adding Linguistic Knowledge to the Monte-Carlo Framework
Method             % Wins ± Standard Error
Game only          45.7 ± 7.0
Latent variable    62.2 ± 6.9
Full model         78.8 ± 5.8
Adding Linguistic Knowledge to the Monte-Carlo Framework
The second baseline, latent variable, extends the linear action-value function Q(s, a) of the game only baseline with a set of latent variables — i.e., it is a four layer neural network, where the second layer’s units are activated only based on game information.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hu, Yuening and Boyd-Graber, Jordan and Satinoff, Brianna
Constraints Shape Topics
In topic modeling, collapsed Gibbs sampling (Griffiths and Steyvers, 2004) is a standard procedure for obtaining a Markov chain over the latent variables in the model.
Constraints Shape Topics
Typically, these only change based on assignments of latent variables in the sampler; in Section 4 we describe how changes in the model’s structure (in addition to the latent state) can be reflected in these count statistics.
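For reference, the standard collapsed Gibbs update for a plain LDA topic assignment has the form below (shown generically; the interactive constraints in this paper change how the word-topic counts are organised):

    P(z_i = k \mid z_{-i}, w) \propto (n_{d,k}^{-i} + \alpha) \cdot \frac{n_{k,w_i}^{-i} + \beta}{n_{k,\cdot}^{-i} + V\beta}

where the counts n exclude the current token i, V is the vocabulary size, and α, β are the Dirichlet hyperparameters.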
Interactively adding constraints
In the more general case, when words lack a unique path in the constraint tree, an additional latent variable specifies which possible paths in the constraint tree produced the word; this would have to be sampled.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Mylonakis, Markos and Sima'an, Khalil
Joint Translation Model
structural part and their associated probabilities define a model p(σ) over the latent variable σ determining the recursive, reordering and phrase-pair segmenting structure of translation, as in Figure 4.
Learning Translation Structure
It works iteratively on a partition of the training data, climbing the likelihood of the training data while cross-validating the latent variable values, considering for every training data point only those which can be produced by models built from the rest of the data excluding the current part.
Related Work
The rich linguistically motivated latent variable learnt by our method delivers translation performance that compares favourably to a state-of-the-art system.
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Druck, Gregory and Pang, Bo
Models
To identify refinements without labeled data, we propose a generative model of reviews (or more generally documents) with latent variables .
Models
Finally, although we motivated including the review-level latent variable y as a way to improve segment-level prediction of z, note that predictions of y are useful in and of themselves.
Models
over latent variables using the sum-product algorithm (Koller and Friedman, 2009).
latent variables is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: