BBC News Database | Unlike other unsupervised approaches where a set of latent variables is introduced, each defining a joint distribution on the space of keywords and image features, the relevance model captures the joint probability of images and annotated words directly, without requiring an intermediate clustering stage.
BBC News Database | achieve competitive performance with latent variable models. |
BBC News Database | Each annotated image in the training set is treated as a latent variable.
Related Work | Another way of capturing co-occurrence information is to introduce latent variables linking image features with words. |
Abbreviator with Nonlocal Information | 2.1 A Latent Variable Abbreviator |
Abbreviator with Nonlocal Information | To implicitly incorporate nonlocal information, we propose discriminative probabilistic latent variable models (DPLVMs) (Morency et al., 2007; Petrov and Klein, 2008) for abbreviating terms. |
Abbreviator with Nonlocal Information | The DPLVM is a natural extension of the CRF model (see Figure 2), which is a special case of the DPLVM, with only one latent variable assigned for each label. |
Abstract | First, in order to incorporate nonlocal information into abbreviation generation tasks, we present both implicit and explicit solutions: the latent variable model, or alternatively, the label encoding approach with global information. |
Introduction | Variables x, y, and h represent observation, label, and latent variables, respectively.
Introduction | discriminative probabilistic latent variable model (DPLVM) in which nonlocal information is modeled by latent variables.
Experiments | The entropy of g on a single latent variable z is defined to be H(g, z) = −Σ_{c∈C} P(c|z) log₂ P(c|z), where C is the class
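That entropy definition can be sketched directly in code (the function and variable names here are ours, not from the paper):

```python
import math

def entropy_given_z(cond_probs):
    # H(g, z) = -sum over classes c in C of P(c|z) * log2 P(c|z);
    # cond_probs lists P(c|z) for each class c.
    h = -sum(p * math.log2(p) for p in cond_probs if p > 0.0)
    return h if h > 0.0 else 0.0

# A uniform distribution over two classes carries 1 bit of entropy;
# a deterministic one carries none.
print(entropy_given_z([0.5, 0.5]))  # 1.0
print(entropy_given_z([1.0, 0.0]))  # 0.0
```

A low entropy means the latent variable z is strongly predictive of a single class.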
Image Clustering with Annotated Auxiliary Data | In order to unify those two separate PLSA models, these two steps are done simultaneously with common latent variables used as a bridge linking them. |
Image Clustering with Annotated Auxiliary Data | Through these common latent variables, which are now constrained by both target image data and auxiliary annotation data, a better clustering result is expected for the target data.
Image Clustering with Annotated Auxiliary Data | Let Z = be the latent variable set in our aPLSA model. |
Conclusions and future work | The models presented here derive their predictions by modelling predicate-argument plausibility through the intermediary of latent variables.
Conclusions and future work | We also anticipate that latent variable models will prove effective for learning selectional preferences of semantic predicates (e. g., FrameNet roles) where direct estimation from a large corpus is not a viable option. |
Related work | In Rooth et al.’s model each observed predicate-argument pair is probabilistically generated from a latent variable, which is itself generated from an underlying distribution on variables.
Related work | The use of latent variables, which correspond to coherent clusters of predicate-argument interactions, allows probabilities to be assigned to predicate-argument pairs which have not previously been observed by the model.
Related work | The work presented in this paper is inspired by Rooth et al.’s latent variable approach, most directly in the model described in Section 3.3. |
Results | Latent variable models that use EM for inference can be very sensitive to the number of latent variables chosen. |
Three selectional preference models | Each model has at least one vocabulary of Z arbitrarily labelled latent variables.
Three selectional preference models | f_zn is the number of observations where the latent variable z has been associated with the argument type n, f_zv is the number of observations where z has been associated with the predicate type v, and f_zr is the number of observations where z has been associated with the relation r.
Three selectional preference models | In Rooth et al.’s (1999) selectional preference model, a latent variable is responsible for generating both the predicate and argument types of an observation. |
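A minimal sketch of that generative structure (the toy numbers and names are ours): the latent class z generates the predicate and argument independently, so p(v, n) = Σ_z p(z) p(v|z) p(n|z), and unseen pairs still receive probability mass.

```python
def joint_prob(pz, pv_given_z, pn_given_z, v, n):
    # Rooth-style latent variable model: the latent class z generates
    # both the predicate v and the argument n independently, so
    # p(v, n) = sum_z p(z) * p(v|z) * p(n|z).
    return sum(pz[z] * pv_given_z[z].get(v, 0.0) * pn_given_z[z].get(n, 0.0)
               for z in pz)

# Two latent classes; "eat"/"apple" co-occur mostly via class 0.
pz = {0: 0.6, 1: 0.4}
pv = {0: {"eat": 0.9, "read": 0.1}, 1: {"eat": 0.1, "read": 0.9}}
pn = {0: {"apple": 0.8, "book": 0.2}, 1: {"apple": 0.1, "book": 0.9}}
print(joint_prob(pz, pv, pn, "eat", "apple"))  # 0.436
```

Even the pair ("read", "apple"), never forced to co-occur by the tables, gets nonzero probability through the shared latent classes.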
Abstract | We propose a latent variable model to enhance historical analysis of large corpora. |
Introduction | Latent variable models, such as latent Dirichlet allocation (LDA) (Blei et al., 2003) and probabilistic latent semantic analysis (PLSA) (Hofmann, 1999), have been used in the past to facilitate social science research. |
Introduction | To do this we augment SAGE with two sparse latent variables that model the region and time of a document, as well as a third sparse latent variable that captures the interactions among the region, time and topic latent variables.
Related Work | For example, SVM does not have latent variables to model the subtle differences and interactions of features from different domains (e.g. |
Related Work | (2010) use a latent variable model to predict geolocation information of Twitter users, and investigate geographic variations of language use. |
The Sparse Mixed-Effects Model | It also incorporates latent variables τ to model the variance for each sparse deviation η.
The Sparse Mixed-Effects Model | The three major sparse deviation latent variables are η^(T), η^(R) and η^(Q)
The Sparse Mixed-Effects Model | All of the three latent variables are condi- |
Abstract | One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains. |
Abstract | Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the label on the source domain. |
Abstract | We introduce a constraint enforcing that marginal distributions of each cluster (i.e., each latent variable) do not vary significantly across domains.
Introduction | We use generative latent variable models (LVMs) learned on all the available data: unlabeled data for both domains and the labeled data for the source domain.
Introduction | The latent variables encode regularities observed on unlabeled data from both domains, and they are learned to be predictive of the labels on the source domain. |
Introduction | The danger of this semi-supervised approach in the domain-adaptation setting is that some of the latent variables will correspond to clusters of features specific only to the source domain, and consequently, the classifier relying on this latent variable will be badly affected when tested on the target domain. |
The Latent Variable Model | vectors of latent variables, to abstract away from handcrafted features.
The Latent Variable Model | The model assumes that the features and the latent variable vector are generated jointly from a globally-normalized model and then the label y is generated from a conditional distribution dependent on z.
Abstract | We present a translation model which models derivations as a latent variable , in both training and decoding, and is fully discriminative and globally optimised. |
Challenges for Discriminative SMT | Instead we model the translation distribution with a latent variable for the derivation, which we marginalise out in training and decoding. |
Discriminative Synchronous Transduction | As the training data only provides source and target sentences, the derivations are modelled as a latent variable.
Discriminative Synchronous Transduction | Our findings echo those observed for latent variable log-linear models successfully used in monolingual parsing (Clark and Curran, 2007; Petrov et al., 2007). |
Discriminative Synchronous Transduction | This method has been demonstrated to be effective for (non-convex) log-linear models with latent variables (Clark and Curran, 2004; Petrov et al., 2007). |
Evaluation | Derivational ambiguity Table 1 shows the impact of accounting for derivational ambiguity in training and decoding. There are two options for training: we could use our latent variable model and optimise the probability of all derivations of the reference translation, or choose a single derivation that yields the reference and optimise its probability alone.
Evaluation | Max-translation decoding for the model trained on single derivations has only a small positive effect, while for the latent variable model the impact is much larger.
Introduction | Second, within this framework, we model the derivation, d, as a latent variable, p(e, d|f), which is marginalised out in training and decoding.
Related Work | Sparsity for low-order contexts has recently spurred interest in using latent variables to represent distributions over contexts in language models. |
Related Work | Several authors investigate neural network models that learn not just one latent state, but rather a vector of latent variables , to represent each word in a language model (Bengio et al., 2003; Emami et al., 2003; Morin and Bengio, 2005). |
Smoothing Natural Language Sequences | 2.3 Latent Variable Language Model Representation |
Smoothing Natural Language Sequences | Latent variable language models (LVLMs) can be used to produce just such a distributional representation. |
Smoothing Natural Language Sequences | We use Hidden Markov Models (HMMs) as the main example in the discussion and as the LVLMs in our experiments, but the smoothing technique can be generalized to other forms of LVLMs, such as factorial HMMs and latent variable maximum entropy models (Ghahramani and Jordan, 1997; Smith and Eisner, 2005). |
Abstract | We associate each sentence with an undirected latent tree graphical model, which is a tree consisting of both observed variables (corresponding to the words in the sentence) and an additional set of latent variables that are unobserved in the data. |
Abstract | However, due to the presence of latent variables, structure learning of latent trees is substantially more complicated than in observed models.
Abstract | The latent variables can incorporate various linguistic properties, such as head information, valence of dependency being generated, and so on. |
Abstract | The contribution of the paper is twofold: 1. we introduce the Linking-Tweets-to-News task as well as a dataset of linked tweet-news pairs, which can benefit many NLP applications; 2. in contrast to previous research which focuses on lexical features within the short texts (text-to-word information), we propose a graph-based latent variable model that models the inter short text correlations (text-to-text information).
Conclusion | We formalize the linking task as a short text modeling problem, and extract Twitter/news-specific features to capture text-to-text relations, which are incorporated into a latent variable model.
Experiments | As a latent variable model, it is able to capture global topics (+1.89% ATOP over LDA-wvec); moreover, by explicitly modeling missing words, the existence of a word is also encoded in the latent vector (+2.31% TOP10 and -0.011% RR over the IR model).
Experiments | The only evidence the latent variable models rely on is lexical items (WTMF-G extracts additional text-to-text correlation by word matching).
Introduction | Latent variable models are powerful by going beyond the surface word level and mapping short texts into a low dimensional dense vector (Socher et al., 2011; Guo and Diab, 2012b). |
Introduction | Accordingly, we apply a latent variable model, namely, the Weighted Textual Matrix Factorization [WTMF] (Guo and Diab, 2012b; Guo and Diab, 2012c) to both the tweets and the news articles. |
Introduction | Our proposed latent variable model not only models text-to-word information, but also is aware of the text-to-text information (illustrated in Figure 1): two linked texts should have similar latent vectors, accordingly the semantic picture of a tweet is completed by receiving semantics from its related tweets. |
Experiment | From this viewpoint, TSG utilizes surrounding symbols (NNP of NPNNP in the above example) as latent variables with which to capture context information. |
Experiment | as latent variables and the search space is larger than that of a TSG when the symbol refinement model allows for more than two subcategories for each symbol. |
Experiment | Our experimental results confirm that jointly modeling both latent variables using our SR-TSG assists accurate parsing.
Inference | The inference of the SR-TSG derivations corresponds to inferring two kinds of latent variables: latent symbol subcategories and latent substitution
Inference | This stepwise learning is simple and efficient in practice, but we believe that the joint learning of both latent variables is possible, and we will deal with this in future work. |
Inference | This sampler simultaneously updates blocks of latent variables associated with a sentence, thus it can find MAP solutions efficiently. |
Abstract | In this paper, we show that by carefully handling words that are not in the sentences (missing words), we can train a reliable latent variable model on sentences. |
Abstract | Experiments on the new task and previous data sets show significant improvement of our model over baselines and other traditional latent variable models. |
Experiments and Results | All the latent variable models (LSA, LDA, WTMF) are built on the same corpus: WN+Wik+Brown (393,666 sentences and 4,262,026 words).
Experiments and Results | In these latent variable models, there are several essential parameters: the weight of missing words w_m, and the dimension K. Figures 2 and 3 analyze the impact of these parameters on ATOP_test.
Introduction | Latent variable models, such as Latent Semantic Analysis [LSA] (Landauer et al., 1998), Probabilistic Latent Semantic Analysis [PLSA] (Hofmann, 1999), Latent Dirichlet Allocation [LDA] (Blei et al., 2003) can solve the two issues naturally by modeling the semantics of words and sentences simultaneously in the low-dimensional latent space. |
Introduction | After analyzing the way traditional latent variable models (LSA, PLSA/LDA) handle missing words, we decide to model sentences using a weighted matrix factorization approach (Srebro and Jaakkola, 2003), which allows us to treat observed words and missing words differently.
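The core idea can be sketched with a rank-1 weighted alternating least squares toy (our own illustration, not the authors' WTMF code): cells for missing words receive a small weight instead of being fit as fully observed zeros.

```python
def weighted_mf_rank1(X, W, iters=50):
    # Rank-1 weighted matrix factorization by alternating least squares:
    # minimise sum_ij W[i][j] * (X[i][j] - u[i] * v[j])^2.
    # Missing words get a small weight in W rather than full weight.
    n, m = len(X), len(X[0])
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        for i in range(n):  # closed-form update for each u[i]
            num = sum(W[i][j] * X[i][j] * v[j] for j in range(m))
            den = sum(W[i][j] * v[j] ** 2 for j in range(m)) or 1.0
            u[i] = num / den
        for j in range(m):  # closed-form update for each v[j]
            num = sum(W[i][j] * X[i][j] * u[i] for i in range(n))
            den = sum(W[i][j] * u[i] ** 2 for i in range(n)) or 1.0
            v[j] = num / den
    return u, v

# Observed cells (weight 1.0) dominate; missing cells (weight 0.01) barely pull.
u, v = weighted_mf_rank1([[1, 0], [1, 0]], [[1.0, 0.01], [1.0, 0.01]])
print(round(u[0] * v[0], 6))  # reconstruction of the observed cell: 1.0
```

With a high weight on missing cells the factorization would instead be dragged toward predicting zeros everywhere, which is exactly the failure mode the paper argues against.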
Limitations of Topic Models and LSA for Modeling Sentences | Usually latent variable models aim to find a latent semantic profile for a sentence that is most relevant to the observed words. |
Introduction | While most of the state-of-the-art CWS systems use semi-Markov conditional random fields or latent variable conditional random fields, we simply use a single first-order conditional random field (CRF) for the joint modeling.
Introduction | The semi-Markov CRFs and latent variable CRFs relax the Markov assumption of CRFs to express more complicated dependencies, and therefore to achieve higher disambiguation power. |
Related Work | To achieve high accuracy, most of the state-of-the-art systems are heavy probabilistic systems using semi-Markov assumptions or latent variables (Andrew, 2006; Sun et al., 2009b). |
Related Work | For example, one of the state-of-the-art CWS systems is the latent variable conditional random field (Sun et al., 2008; Sun and Tsujii, 2009) system presented in Sun et al.
Related Work | Those semi-Markov perceptron systems are moderately faster than the heavy probabilistic systems using semi-Markov conditional random fields or latent variable conditional random fields. |
Conclusion | These regularizations could improve spectral algorithms for latent variable models, improving the performance for other NLP tasks such as latent variable PCFGs (Cohen et al., 2013) and HMMs (Anandkumar et al., 2012), combining the flexibility and robustness offered by priors with the speed and accuracy of new, scalable algorithms.
Introduction | Theoretically, their latent variable formulation has served as a foundation for more robust models of other linguistic phenomena (Brody and Lapata, 2009). |
Introduction | Modern topic models are formulated as a latent variable model. |
Introduction | Typical solutions use MCMC (Griffiths and Steyvers, 2004) or variational EM (Blei et al., 2003), which can be viewed as local optimization: searching for the latent variables that maximize the data likelihood. |
Background and Related Work | Our work proposes a uniform treatment to MWPs of varying degrees of compositionality, and avoids defining MWPs explicitly by modelling their LCs as latent variables.
Introduction | We present a novel approach to the task that models the selection and relative weighting of the predicate’s LCs using latent variables . |
Our Proposal: A Latent LC Approach | We address the task with a latent variable log-linear model, representing the LCs of the predicates. |
Our Proposal: A Latent LC Approach | We choose this model for its generality, conceptual simplicity, and because it allows us to easily incorporate various feature sets and sets of latent variables.
Our Proposal: A Latent LC Approach | The introduction of latent variables into the log-linear model leads to a non-convex objective function. |
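The source of the non-convexity can be made concrete with a small sketch (function and feature names are ours): the score of an output marginalises over latent assignments h, and the resulting log-sum-exp of linear scores is not convex in the weights in general.

```python
import math

def log_marginal_score(weights, feature_fn, x, y, latent_values):
    # Latent variable log-linear model: score(x, y) =
    #   log sum_h exp(w . f(x, y, h)),
    # marginalising the latent variable h inside the log. This sum of
    # exponentials inside a log is what makes the objective non-convex.
    scores = [sum(weights.get(feat, 0.0) * val
                  for feat, val in feature_fn(x, y, h).items())
              for h in latent_values]
    m = max(scores)  # stabilised log-sum-exp
    return m + math.log(sum(math.exp(s - m) for s in scores))

# Toy feature function: a single bias feature regardless of h.
def f(x, y, h):
    return {"bias": 1.0}

print(log_marginal_score({"bias": 0.0}, f, None, None, [0, 1]))  # log(2)
```

With a single latent value the model collapses back to an ordinary (convex) log-linear model, which matches the snippet's point that the latent variables are what introduce the non-convexity.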
Inference | To find the latent variables that best explain observed data, we use Gibbs sampling, a widely used Markov chain Monte Carlo inference technique (Neal, 2000; Resnik and Hardisty, 2010). |
Inference | The state space is the latent variables for topic indices assigned to all tokens, z, and topic shifts assigned to turns, l = {l_{d,t}}.
Inference | We marginalize over all other latent variables.
Modeling Multiparty Discussions | Instead, we endow each turn with a binary latent variable l_{d,t}, called the topic shift.
Modeling Multiparty Discussions | This latent variable signifies whether the speaker changed the topic of the conversation. |
Related and Future Work | as a distinct latent variable (Wang and McCallum, 2006; Eisenstein et al., 2010).
Guided DS | We introduce a set of latent variables h_i which model human ground truth for each mention in the i-th bag and take precedence over the current model assignment z_i.
Guided DS | z_ij ∈ R ∪ NR: a latent variable that denotes the relation of the j-th mention in the i-th bag
Guided DS | h_ij ∈ R ∪ NR: a latent variable that denotes the refined relation of the mention x_ij
Introduction | (2012), we generalize the labeled data through feature selection and model this additional information directly in the latent variable approaches. |
The Challenge | Instead we propose to perform feature selection to generalize human labeled data into training guidelines, and integrate them into a latent variable model.
Conclusions and Future Work | The closed-form online update for our relative margin solution accounts for surrogate references and latent variables.
Introduction | Unfortunately, not all advances in machine learning are easy to apply to structured prediction problems such as SMT; the latter often involve latent variables and surrogate references, resulting in loss functions that have not been well explored in machine learning (McAllester and Keshet, 2011; Gimpel and Smith, 2012).
Introduction | The contributions of this paper include (1) introduction of a loss function for structured RMM in the SMT setting, with surrogate reference translations and latent variables; (2) an online gradient-based solver, RM, with a closed-form parameter update to optimize the relative margin loss; and (3) an efficient implementation that integrates well with the open source cdec SMT system (Dyer et al., 2010). In addition, (4) as our solution is not dependent on any specific QP solver, it can be easily incorporated into practically any gradient-based learning algorithm.
Introduction | First, we introduce RMM (§3.1) and propose a latent structured relative margin objective which incorporates cost-augmented hypothesis selection and latent variables . |
Learning in SMT | While many derivations d ∈ D(x) can produce a given translation, we are only able to observe y; thus we model d as a latent variable.
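The marginalisation over latent derivations can be sketched as follows (the toy derivations and names are ours). Summing p(e, d|f) over all derivations d that yield e can prefer a translation whose individual derivations are each weaker than the single best derivation, which is the max-translation versus max-derivation distinction.

```python
def max_translation(derivations):
    # Each entry is (derivation id, translation e, probability p(e, d|f)).
    # The derivation d is latent: p(e|f) = sum over d yielding e of p(e, d|f).
    totals = {}
    for d, e, p in derivations:
        totals[e] = totals.get(e, 0.0) + p
    return max(totals, key=totals.get), totals

# Two derivations of "the house" jointly outweigh the single best
# derivation d3, so max-translation and max-derivation disagree.
ds = [("d1", "the house", 0.3), ("d2", "the house", 0.3), ("d3", "a house", 0.4)]
best, totals = max_translation(ds)
print(best)  # the house (total 0.6 vs. 0.4)
```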
Abstract | Many models in NLP involve latent variables, such as unknown parses, tags, or alignments.
Projections | Given a relaxed joint solution to the parameters and the latent variables, one must be able to project it to a nearby feasible one, by projecting either the fractional parameters or the fractional latent variables into the feasible space and then solving exactly for the other. |
Related Work | The goal of this work was to better understand and address the non-convexity of maximum-likelihood training with latent variables, especially parses.
Related Work | For supervised parsing, spectral learning has been used to learn latent variable PCFGs (Cohen et al., 2012) and hidden-state dependency grammars (Luque et al., 2012).
The Constrained Optimization Task | The feature counts are constrained to be derived from the latent variables (e.g., parses), which are unknown discrete structures that must be encoded with integer variables. |
Inference | Inference of probabilistic models discovers the posterior distribution over latent variables . |
Inference | For a collection of D documents, each of which contains N_d words, the latent variables of ptLDA are: transition distributions π_{k,i} for every topic k and internal node i in the prior tree structure; multinomial distributions over topics θ_d for every document d; and topic assignments z_dn and path y_dn for the nth word w_dn in document d. The joint distribution of polylingual tree-based topic models is
Inference | approximate posterior inference to discover the latent variables that best explain our data.
A Gibbs Sampling Algorithm | Our algorithm represents a first attempt to extend Polson’s approach (Polson et al., 2012) to deal with highly nontrivial Bayesian latent variable models. |
Experiments | nontrivial to develop a Gibbs sampling algorithm using a similar data augmentation idea, due to the presence of latent variables and the nonlinearity of the soft-max function.
Introduction | ing due to the presence of nontrivial latent variables.
Logistic Supervised Topic Models | But the presence of latent variables poses additional challenges in carrying out a formal theoretical analysis of these surrogate losses (Lin, 2001) in the topic model setting. |
Logistic Supervised Topic Models | Moreover, the latent variables Z make the inference problem harder than that of Bayesian logistic regression models (Chen et al., 1999; Meyer and Laud, 2002; Polson et al., 2012). |
Introduction | In contrast to the previous methods, we approach the problem by modeling the three sub-problems as well as the unknown set of sub-word units as latent variables in one nonparametric Bayesian model. |
Model | In the next section, we show how to infer the value of each of the latent variables in Fig. |
Problem Formulation | We model the three subtasks as latent variables in our approach. |
Problem Formulation | In this section, we describe the observed data, latent variables, and auxiliary variables
Related Work | For the domain our problem is applied to, our model has to include more latent variables and is more complex. |
Background | (2008) present a latent variable model that describes the relationship between translation and derivation clearly. |
Background | Although originally proposed for supporting large sets of nonindependent and overlapping features, the latent variable model is actually a more general form of the conventional linear model (Och and Ney, 2002).
Background | Accordingly, decoding for the latent variable model can be formalized as |
Related Work | They show that max-translation decoding outperforms max-derivation decoding for the latent variable model. |
Distributional Semantic Hidden Markov Models | This model can be thought of as an HMM with two layers of latent variables , representing events and slots in the domain. |
Distributional Semantic Hidden Markov Models | Event Variables At the top level, a categorical latent variable E_t with N_E possible states represents the event that is described by clause t.
Distributional Semantic Hidden Markov Models | Slot Variables Categorical latent variables with N_S possible states represent the slot that an argument fills, and are conditioned on the event variable in the clause, E_t (i.e., P_S(S_{t,a}|E_t) for the a-th slot variable).
Related Work | Distributions that generate the latent variables and hyperparameters are omitted for clarity. |
Background | Petrov and Klein (2007a) derive coarse grammars in a more statistically principled way, although the technique is closely tied to their latent variable grammar representation. |
Experimental Setup | Alternative decoding methods, such as marginalizing over the latent variables in the grammar or MaxRule decoding (Petrov and Klein, 2007a) are certainly possible in our framework, but it is unknown how effective these methods will be given the heavily pruned na- |
Introduction | Grammar transformation techniques such as linguistically inspired nonterminal annotations (Johnson, 1998; Klein and Manning, 2003b) and latent variable grammars (Matsuzaki et al., 2005; Petrov et al., 2006) have increased the grammar size |G| from a few thousand rules to several million in an explicitly enumerable grammar, or even more in an implicit grammar. |
Introduction | Rather, the beam-width prediction model is trained to learn the rank of constituents in the maximum likelihood trees. We will illustrate this by presenting results using a latent-variable grammar, for which there is no “true” reference latent variable parse.
Inference | In order to do so, we need to integrate out all the other latent variables in our model. |
Inference | To do so tractably, we use Gibbs sampling to draw each latent variable conditioned on our current sample of the others. |
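A generic sketch of that Gibbs sweep (the function names and the toy conditional are ours): each latent variable is redrawn from its conditional distribution given the current sample of all the others.

```python
import random

def gibbs_sweeps(init, conditionals, rounds, seed=0):
    # One Gibbs round redraws every latent variable from its conditional
    # distribution given the current values of the other variables.
    rng = random.Random(seed)
    state = dict(init)
    samples = []
    for _ in range(rounds):
        for var, cond in conditionals.items():
            probs = cond(state)  # {value: P(var = value | rest of state)}
            vals = list(probs)
            state[var] = rng.choices(vals, weights=[probs[v] for v in vals])[0]
        samples.append(dict(state))
    return samples

# Toy conditional: z is a fair coin independent of the rest of the state.
conds = {"z": lambda s: {0: 0.5, 1: 0.5}}
samples = gibbs_sweeps({"z": 0}, conds, 200)
```

In a real model each conditional would depend on the rest of the state, and, as the snippet notes, a finite number of rounds may still fail to explore the full latent space.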
Inference | Even with a large number of sampling rounds, it is difficult to fully explore the latent variable space for complex unsupervised models. |
Introduction | An HMM is a generative probabilistic model that generates each word x_i in the corpus conditioned on a latent variable Y_i.
Introduction | Each Y_i in the model takes on integral values from 1 to K, and each one is generated by the latent variable for the preceding word, Y_{i−1}.
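That generative process can be sketched as forward sampling (the function and the toy distributions are ours): draw each state from the transition row of the previous state, then draw a word from that state's emission distribution.

```python
import random

def generate_hmm(trans, emit, length, start=1, seed=0):
    # HMM forward generation: each latent state Y_i is drawn conditioned
    # on Y_{i-1}, then the word x_i is drawn from the emission
    # distribution of Y_i.
    rng = random.Random(seed)
    y, states, words = start, [], []
    for _ in range(length):
        row = trans[y]
        y = rng.choices(list(row), weights=list(row.values()))[0]
        states.append(y)
        em = emit[y]
        words.append(rng.choices(list(em), weights=list(em.values()))[0])
    return states, words

# Two latent states with deterministic emissions, so the latent state
# is fully recoverable from the generated word.
trans = {1: {1: 0.5, 2: 0.5}, 2: {1: 0.5, 2: 0.5}}
emit = {1: {"a": 1.0}, 2: {"b": 1.0}}
states, words = generate_hmm(trans, emit, 10)
```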
Introduction | In response, we introduce latent variable models of word spans, or sequences of words. |
Introduction | In both cases, the available labeled equations (either the seed set, or the full set) are abstracted to provide the model’s equation templates, while the slot filling and alignment decisions are latent variables whose settings are estimated by directly optimizing the marginal data log-likelihood. |
Mapping Word Problems to Equations | In this way, the distribution over derivations y is modeled as a latent variable.
Related Work | In our approach, systems of equations are relatively easy to specify, providing a type of template structure, and the alignment of the slots in these templates to the text is modeled primarily with latent variables during learning. |
Extending the Model | By adding additional transitions, we can constrain the latent variables further. |
Introduction | (2011) proposed an approach that uses co-occurrence patterns to find entity type candidates, and then learns their applicability to relation arguments by using them as latent variables in a first-order HMM. |
Model | Thus all common nouns are possible types, and can be used as latent variables in an HMM. |
Minimum Bayes risk parsing | MBR parsing has proven especially useful in latent variable grammars. |
Minimum Bayes risk parsing | Petrov and Klein (2007) showed that MBR trees substantially improved performance over Viterbi parses for latent variable grammars, earning up to 1.5 F1.
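The contrast with Viterbi selection can be sketched abstractly (the toy candidates and gain function are ours): MBR picks the candidate with the highest expected gain under the model distribution rather than the single most probable candidate.

```python
def mbr_decode(candidates, gain):
    # candidates: {output y: model probability p(y)}.
    # Minimum Bayes risk picks argmax_y sum_y' p(y') * gain(y, y').
    return max(candidates,
               key=lambda y: sum(p * gain(y, yp) for yp, p in candidates.items()))

def gain(y, yp):
    # Toy similarity: identical outputs score 1; "B" and "C" are
    # near-duplicates of each other; "A" resembles nothing else.
    if y == yp:
        return 1.0
    return 0.9 if {y, yp} == {"B", "C"} else 0.0

probs = {"A": 0.40, "B": 0.35, "C": 0.25}
print(mbr_decode(probs, gain))  # B: expected gain 0.575 beats A's 0.40
```

Viterbi would choose "A" (the single most probable output), while MBR chooses "B" because probability mass on similar candidates is pooled, which mirrors how MBR benefits latent variable grammars where many refined derivations back the same tree.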
Sparsity and CPUs | For instance, in a latent variable parser, the coarse grammar would have symbols like NP, VP, etc., and the fine pass would have refined symbols N P0, N P1, VP4, and so on. |
Introduction | This matrix form has clear relevance to latent variable models. |
Related Work | Recently a number of researchers have developed provably correct algorithms for parameter estimation in latent variable models such as hidden Markov models, topic models, directed graphical models with latent variables , and so on (Hsu et al., 2009; Bailly et al., 2010; Siddiqi et al., 2010; Parikh et al., 2011; Balle et al., 2011; Arora et al., 2013; Dhillon et al., 2012; Anandkumar et al., 2012; Arora et al., 2012; Arora et al., 2013). |
The Learning Algorithm for L-PCFGS | The training set does not include values for the latent variables; this is the main challenge in learning.
Intervention Prediction Models | p_i, r and φ(t) are observed and h_i are the latent variables.
Intervention Prediction Models | In the first step, it determines the latent variable assignments for positive examples. |
Intervention Prediction Models | Once this process converges for negative examples, the algorithm reassigns values to the latent variables for positive examples, and proceeds to the second step. |
Model overview | Many existing paraphrase models introduce latent variables to describe the derivation of c from x, e.g., with transformations (Heilman and Smith, 2010; Stern and Dagan, 2011) or alignments (Haghighi et al., 2005; Das and Smith, 2009; Chang et al., 2010).
Model overview | However, we opt for a simpler paraphrase model without latent variables in the interest of efficiency. |
Paraphrasing | The NLP paraphrase literature is vast and ranges from simple methods employing surface features (Wan et al., 2006), through vector space models (Socher et al., 2011), to latent variable models (Das and Smith, 2009; Wang and Manning, 2010; Stern and Dagan, 2011). |
Experiments | As a Baseline, we also evaluate all hypotheses on a model with no latent variables whatsoever, which instead measures similarity as the average JS divergence between the empirical word distributions over each role type.
Experiments | Table 1 presents the results of this comparison; for all models with latent variables , we report the average of 5 sampling runs with different random initializations. |
Model | Observed variables are shaded, latent variables are clear, and collapsed variables are dotted. |
RSP: A Random Walk Model for SP | LDA-SP: Another kind of sophisticated unsupervised approaches for SP are latent variable models based on Latent Dirichlet Allocation (LDA). |
Related Work 2.1 WordNet-based Approach | Recently, more sophisticated methods for SP have been developed based on topic models, where the latent variables (topics) take the place of semantic classes and distributional clusterings (Seaghdha, 2010; Ritter et al., 2010).
Related Work 2.1 WordNet-based Approach | Without introducing semantic classes and latent variables, Keller and Lapata (2003) use the web to obtain frequencies for unseen bigrams.
Adding Linguistic Knowledge to the Monte-Carlo Framework | Game only 17.3 5.3 ± 2.7; Sentence relevance 46.7 2.8 ± 3.5; Full model 53.7 5.9 ± 3.5; Random text 40.3 4.3 ± 3.4; Latent variable 26.1 3.7 ± 3.1
Adding Linguistic Knowledge to the Monte-Carlo Framework | Method, % Wins, Standard Error: Game only 45.7 ± 7.0; Latent variable 62.2 ± 6.9; Full model 78.8 ± 5.8
Adding Linguistic Knowledge to the Monte-Carlo Framework | The second baseline, latent variable, extends the linear action-value function Q(s, a) of the game only baseline with a set of latent variables — i.e., it is a four layer neural network, where the second layer’s units are activated only based on game information. |
Constraints Shape Topics | In topic modeling, collapsed Gibbs sampling (Griffiths and Steyvers, 2004) is a standard procedure for obtaining a Markov chain over the latent variables in the model. |
Constraints Shape Topics | Typically, these only change based on assignments of latent variables in the sampler; in Section 4 we describe how changes in the model’s structure (in addition to the latent state) can be reflected in these count statistics. |
Interactively adding constraints | In the more general case, when words lack a unique path in the constraint tree, an additional latent variable specifies which possible paths in the constraint tree produced the word; this would have to be sampled. |
Joint Translation Model | structural part and their associated probabilities define a model p(σ) over the latent variable σ determining the recursive, reordering and phrase-pair segmenting structure of translation, as in Figure 4.
Learning Translation Structure | It works iteratively on a partition of the training data, climbing the likelihood of the training data while cross-validating the latent variable values, considering for every training data point only those which can be produced by models built from the rest of the data excluding the current part.
Related Work | The rich linguistically motivated latent variable learnt by our method delivers translation performance that compares favourably to a state-of-the-art system. |
Models | To identify refinements without labeled data, we propose a generative model of reviews (or more generally documents) with latent variables . |
Models | Finally, although we motivated including the review-level latent variable y as a way to improve segment-level prediction of z, note that predictions of y are useful in and of themselves.
Models | over latent variables using the sum-product algorithm (Koller and Friedman, 2009). |