Index of papers in Proc. ACL that mention
  • Gibbs sampling
Mochihashi, Daichi and Yamada, Takeshi and Ueda, Naonori
Abstract
In this paper, we propose a new Bayesian model for fully unsupervised word segmentation and an efficient blocked Gibbs sampler combined with dynamic programming for inference.
Experiments
Since our algorithm converges rather fast, we ran the Gibbs sampler of trigram NPYLM for 200 iterations to obtain the results in Table 1.
Experiments
In all cases we removed all whitespaces to yield raw character strings for inference, and set L = 4 for Chinese and L = 8 for Japanese to run the Gibbs sampler for 400 iterations.
Experiments
Notice that analyzing test data is not easy for the character-wise Gibbs sampler of previous work.
Inference
To find the hidden word segmentation w of a string s = c_1 ... c_N, which is equivalent to the vector of binary hidden variables z = z_1 ... z_N, the simplest approach is to build a Gibbs sampler that randomly selects a character c_i and draws a binary decision z_i as to whether there is a word boundary, and then updates the language model according to the new segmentation (Goldwater et al., 2006; Xu et al., 2008).
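The excerpt above describes the character-wise sampler: each interior position carries a binary boundary variable that is resampled given its neighbours. A minimal sketch of that step, assuming a generic `word_prob` interface for word probabilities (the cited models are collapsed Dirichlet-process language models whose counts would also have to be decremented and re-added around this step; that bookkeeping is omitted here):

```python
import random

def resample_boundary(chars, boundaries, i, word_prob):
    """Resample the binary variable z_i: is there a word boundary before chars[i]?

    `boundaries` is a set of word-start positions (0 is always in it);
    `word_prob(w)` is an assumed interface returning the probability of word w
    under the current language model.
    """
    left = max(b for b in boundaries if b < i)                    # fixed boundary on the left
    right = min([b for b in boundaries if b > i] + [len(chars)])  # fixed boundary on the right

    whole = "".join(chars[left:right])                            # hypothesis z_i = 0: one word
    left_word = "".join(chars[left:i])                            # hypothesis z_i = 1: two words
    right_word = "".join(chars[i:right])

    p_join = word_prob(whole)
    p_split = word_prob(left_word) * word_prob(right_word)

    if random.random() < p_split / (p_join + p_split):
        boundaries.add(i)        # place a boundary before chars[i]
    else:
        boundaries.discard(i)    # merge into a single word
```

A full sweep visits every interior position in random order and then updates the language model, as the excerpt describes.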
Inference
4.1 Blocked Gibbs sampler
Inference
Instead, we propose a sentence-wise Gibbs sampler of word segmentation using efficient dynamic programming, as shown in Figure 3.
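The blocked alternative resamples a whole sentence at once by forward filtering and backward sampling. The sketch below is a simplification under a unigram word model with maximum word length L; `word_prob` is again an assumed interface, not the paper's NPYLM (which combines word and character n-gram probabilities and removes the sentence's current words from the model before resampling):

```python
import random

def sample_segmentation(chars, word_prob, L):
    """Blocked sampling of one sentence's segmentation by forward filtering /
    backward sampling under a unigram word model with maximum word length L."""
    N = len(chars)
    alpha = [0.0] * (N + 1)      # alpha[t]: total probability of chars[:t] over all segmentations
    alpha[0] = 1.0
    for t in range(1, N + 1):
        for j in range(max(0, t - L), t):
            alpha[t] += alpha[j] * word_prob("".join(chars[j:t]))

    words, t = [], N             # backward pass: sample the last word, then recurse
    while t > 0:
        starts = list(range(max(0, t - L), t))
        weights = [alpha[j] * word_prob("".join(chars[j:t])) for j in starts]
        r, j = random.random() * sum(weights), starts[-1]
        for s, w in zip(starts, weights):
            r -= w
            if r <= 0:
                j = s
                break
        words.append("".join(chars[j:t]))
        t = j
    return list(reversed(words))
```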
Introduction
However, they are still naïve with respect to word spellings, and the inference is very slow owing to inefficient Gibbs sampling.
Introduction
Section 4 describes an efficient blocked Gibbs sampler that leverages dynamic programming for inference.
Gibbs sampling is mentioned in 11 sentences in this paper.
Zhu, Jun and Zheng, Xun and Zhang, Bo
A Gibbs Sampling Algorithm
Now, we present a simple and efficient Gibbs sampling algorithm for the generalized Bayesian logistic supervised topic models.
A Gibbs Sampling Algorithm
3.2 Inference with Collapsed Gibbs Sampling
A Gibbs Sampling Algorithm
Although we can do Gibbs sampling to infer the complete posterior distribution q(η, λ, Θ, Z, Φ) and thus q(η, Θ, Z, Φ) by ignoring λ, the mixing rate would be slow due to the large sample space.
Abstract
We address these issues by: 1) introducing a regularization constant to better balance the two parts based on an optimization formulation of Bayesian inference; and 2) developing a simple Gibbs sampling algorithm by introducing auxiliary Polya-Gamma variables and collapsing out Dirichlet variables.
Introduction
Second, to solve the intractable posterior inference problem of the generalized Bayesian logistic supervised topic models, we present a simple Gibbs sampling algorithm by exploring the ideas of data augmentation (Tanner and Wong, 1987; van Dyk and Meng, 2001; Holmes and Held, 2006).
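The data-augmentation idea is easiest to see on plain Bayesian logistic regression, the building block of the logistic supervised topic model: conditioned on Polya-Gamma auxiliary variables, the weights have a Gaussian conditional, so no accept/reject step is needed. The sketch below is that simplified setting, not the authors' full sampler; `sample_pg` approximates PG(1, c) by truncating its infinite-sum representation, and the prior precision `b0` is an illustrative choice:

```python
import numpy as np

def sample_pg(c, trunc=200, rng=np.random):
    """Approximate draw from the Polya-Gamma distribution PG(1, c), using its
    infinite sum-of-gammas representation truncated at `trunc` terms."""
    k = np.arange(1, trunc + 1)
    g = rng.gamma(1.0, 1.0, size=trunc)
    return np.sum(g / ((k - 0.5) ** 2 + c ** 2 / (4 * np.pi ** 2))) / (2 * np.pi ** 2)

def gibbs_logistic_regression(X, y, iters=1000, b0=1.0, rng=np.random):
    """Gibbs sampler for Bayesian logistic regression with a N(0, (1/b0) I) prior:
    alternately draw the PG auxiliary variables and the weights, whose conditional
    is Gaussian, so there is no Metropolis-Hastings accept/reject step."""
    n, d = X.shape
    beta = np.zeros(d)
    kappa = y - 0.5                                   # y must be in {0, 1}
    samples = []
    for _ in range(iters):
        omega = np.array([sample_pg(abs(x @ beta), rng=rng) for x in X])
        V = np.linalg.inv(X.T @ (omega[:, None] * X) + b0 * np.eye(d))
        m = V @ (X.T @ kappa)
        beta = rng.multivariate_normal(m, V)
        samples.append(beta)
    return np.array(samples)
```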
Introduction
Then, we develop a simple and efficient Gibbs sampling algorithm with analytic conditional distributions, without Metropolis-Hastings accept/reject steps.
Introduction
For Bayesian LDA models, we can also explore the conjugacy of the Dirichlet-Multinomial prior-likelihood pairs to collapse out the Dirichlet variables (i.e., topics and mixing proportions) to do collapsed Gibbs sampling, which can have better mixing rates (Griffiths and Steyvers, 2004).
Gibbs sampling is mentioned in 21 sentences in this paper.
Hingmire, Swapnil and Chakraborti, Sutanu
Background 3.1 LDA
In this paper we estimate approximate posterior inference using collapsed Gibbs sampling (Griffiths and Steyvers, 2004).
Background 3.1 LDA
The Gibbs sampling equation used to update the assignment of a topic t to the word w ∈ W at position n in document d, conditioned on α_t, β_w, is:
Background 3.1 LDA
We use the subscript (d, ¬n) to denote that the current token z_{d,n} is ignored in the Gibbs sampling update.
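Equation 1 itself is not reproduced in this index, but the update has the standard collapsed-LDA form of Griffiths and Steyvers (2004). A sketch for a single token, assuming symmetric priors alpha and beta and incrementally maintained count arrays n_dk (document-topic), n_kw (topic-word) and n_k (topic totals):

```python
import numpy as np

def resample_token(d, w, z_old, n_dk, n_kw, n_k, alpha, beta, rng=np.random):
    """Collapsed Gibbs update for one token: remove its current assignment from
    the counts (the (d, not-n) convention), sample a new topic k with probability
    proportional to (n_dk[d,k] + alpha) * (n_kw[k,w] + beta) / (n_k[k] + V*beta),
    then add the counts back."""
    V = n_kw.shape[1]
    n_dk[d, z_old] -= 1; n_kw[z_old, w] -= 1; n_k[z_old] -= 1

    p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
    z_new = rng.choice(len(p), p=p / p.sum())

    n_dk[d, z_new] += 1; n_kw[z_new, w] += 1; n_k[z_new] += 1
    return z_new
```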
Experimental Evaluation
1. Infer T topics on D for LDA using collapsed Gibbs sampling.
Experimental Evaluation
Update M_D using the collapsed Gibbs sampling update in Equation 1.
Experimental Evaluation
Infer |C| topics on the sprinkled document corpus D using the collapsed Gibbs sampling update.
Topic Sprinkling in LDA
We then update the new LDA model using collapsed Gibbs sampling.
Topic Sprinkling in LDA
We then infer a set of |C| topics on the sprinkled dataset using collapsed Gibbs sampling, where C is the set of class labels of the training documents.
Topic Sprinkling in LDA
We modify the collapsed Gibbs sampling update in Equation 1 to carry class label information while inferring topics.
Gibbs sampling is mentioned in 10 sentences in this paper.
Feng, Yang and Cohn, Trevor
Experiments
For each data set, Gibbs sampling was performed on the training set in each direction (source-to-target and target-to-source), initialized using GIZA++. We used the grow heuristic to combine the GIZA++ alignments in both directions (Koehn et al., 2003), which we then intersect with the predictions of GIZA++ in the relevant translation direction.
Experiments
The two Gibbs samplers were “burned in” for the first 1000 iterations, after which we ran a further 500 iterations selecting every 50th sample.
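That schedule is ordinary burn-in plus thinning; a generic sketch, in which `gibbs_sweep` and `state` are placeholders rather than the authors' aligner:

```python
def run_chain(state, gibbs_sweep, burn_in=1000, extra=500, thin=50):
    """Run `burn_in` sweeps without recording anything, then `extra` more sweeps,
    keeping every `thin`-th state; with the values above this keeps 10 samples."""
    samples = []
    for it in range(burn_in + extra):
        state = gibbs_sweep(state)                 # one full Gibbs sweep over the corpus
        if it >= burn_in and (it - burn_in + 1) % thin == 0:
            samples.append(state)
    return samples
```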
Experiments
Because the data set is small, we performed Gibbs sampling on a single processor.
Gibbs Sampling
To train the model, we use Gibbs sampling, a Markov Chain Monte Carlo (MCMC) technique for posterior inference.
Gibbs Sampling
Our Gibbs sampler operates by sampling an update to the alignment of each target word in the corpus.
Gibbs Sampling
(2009a) of using multiple processors to perform approximate Gibbs sampling, which they showed achieved equivalent performance to the exact Gibbs sampler.
Model
This makes the approach more suitable for learning alignments, e. g., to account for word fertilities (see §3.3), while also permitting inference using Gibbs sampling (§4).
Gibbs sampling is mentioned in 9 sentences in this paper.
Hu, Yuening and Boyd-Graber, Jordan and Satinoff, Brianna
Constraints Shape Topics
3.1 Gibbs Sampling for Topic Models
Constraints Shape Topics
In topic modeling, collapsed Gibbs sampling (Griffiths and Steyvers, 2004) is a standard procedure for obtaining a Markov chain over the latent variables in the model.
Constraints Shape Topics
Given M documents, the state of a Gibbs sampler for LDA consists of topic assignments for each token in the corpus and is represented as Z = {z_{1,1}, ..., z_{1,N_1}, z_{2,1}, ..., z_{M,N_M}}.
Discussion
As presented here, the technique for incorporating constraints is closely tied to inference with Gibbs sampling.
Interactively adding constraints
In the implementation of a Gibbs sampler, unassignment is done by setting a token’s topic assignment to an invalid topic (e.g., -1, as we use here) and decrementing any counts associated with that word.
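A sketch of that unassignment step, assuming the same kind of count arrays as the collapsed-LDA sketch above; -1 marks the invalid topic:

```python
def unassign(token_id, assignments, doc_of, word_of, n_dk, n_kw, n_k):
    """Set a token's topic to the invalid value -1 and decrement the counts it
    contributed, so the next Gibbs sweep treats it as unseen and resamples it."""
    z = assignments[token_id]
    if z == -1:                                    # already unassigned
        return
    d, w = doc_of[token_id], word_of[token_id]
    n_dk[d, z] -= 1
    n_kw[z, w] -= 1
    n_k[z] -= 1
    assignments[token_id] = -1
```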
Simulation Experiment
Next, we perform one of the strategies for state ablation, add additional iterations of Gibbs sampling, use the newly obtained topic distribution of each document as the feature vector, and perform classification on the test/train split.
Simulation Experiment
Each is averaged over five different chains using 10 additional iterations of Gibbs sampling per round (other numbers of iterations are discussed in Section 6.4).
Simulation Experiment
Figure 4 shows the effect of using different numbers of Gibbs sampling iterations after changing a constraint.
Gibbs sampling is mentioned in 9 sentences in this paper.
Chen, Zhiyuan and Mukherjee, Arjun and Liu, Bing
AKL: Using the Learned Knowledge
Most importantly, due to the use of the new form of knowledge, AKL’s inference mechanism (Gibbs sampler) is entirely different from that of MC-LDA (Section 5.2), which results in superior performance (Section 6).
AKL: Using the Learned Knowledge
In short, our modeling contributions are (1) the capability of handling more expressive knowledge in the form of clusters, (2) a novel Gibbs sampler to deal with inappropriate knowledge.
AKL: Using the Learned Knowledge
5.2 The Gibbs Sampler
Gibbs sampling is mentioned in 8 sentences in this paper.
Ravi, Sujith
Bayesian MT Decipherment via Hash Sampling
Doing standard collapsed Gibbs sampling in this scenario would be very slow and intractable.
Bayesian MT Decipherment via Hash Sampling
To do collapsed Gibbs sampling under this model, we would perform the following steps during sampling:
Bayesian MT Decipherment via Hash Sampling
So, during decipherment training a standard collapsed Gibbs sampler will waste most of its time on expensive computations that will be discarded in the end anyways.
Conclusion
To summarize, our method is significantly faster than previous methods based on EM or Bayesian with standard Gibbs sampling and obtains better results than any previously published methods for the same task.
Decipherment Model for Machine Translation
In spite of using Bayesian inference, which is typically slow in practice (with standard Gibbs sampling), we show later that our method is scalable and permits decipherment training using more complex translation models (with several additional parameters).
Decipherment Model for Machine Translation
with this problem by using a fast, efficient sampler based on hashing that allows us to speed up the Bayesian inference significantly whereas standard Gibbs sampling would be extremely slow.
Experiments and Results
The table also demonstrates the significant speedup achieved by the hash sampler over a standard Gibbs sampler for the same model (~85 times faster when using a 2-gram LM).
Feature-based representation for Source and Target
Additionally, performing Bayesian inference with such a complex model using standard Gibbs sampling can be very slow in practice.
Gibbs sampling is mentioned in 8 sentences in this paper.
Zhang, Yuan and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi and Globerson, Amir
Experimental Setup
Therefore, the first-order distribution is not well-defined and we only employ Gibbs sampling for simplicity.
Introduction
Our first strategy is akin to Gibbs sampling and samples a new head for each word in the sentence, modifying one arc at a time.
Results
iteration of this sampler makes multiple changes to the tree, in contrast to a single-edge change of the Gibbs sampler.
Sampling-Based Dependency Parsing with Global Features
3.2.1 Gibbs Sampling
Sampling-Based Dependency Parsing with Global Features
One shortcoming of the Gibbs sampler is that it only changes one variable (arc) at a time.
Sampling-Based Dependency Parsing with Global Features
Note that blocked Gibbs sampling would be exponential in K, and is thus very slow already at K = 4.
Gibbs sampling is mentioned in 7 sentences in this paper.
Blunsom, Phil and Cohn, Trevor
Background
However this work approximated the derivation of the Gibbs sampler (omitting the interdependence between events when sampling from a collapsed model), resulting in a model which underperformed Brown et al.
Experiments
We have omitted the results for the HMM-LM as experimentation showed that the local Gibbs sampler became hopelessly stuck, failing to
The PYP-HMM
In order to induce a tagging under this model we use Gibbs sampling, a Markov chain Monte Carlo (MCMC) technique for drawing samples from the posterior distribution over the tag sequences given observed word sequences.
The PYP-HMM
We present two different sampling strategies: First, a simple Gibbs sampler which randomly samples an update to a single tag given all other tags; and second, a type-level sampler which updates all tags for a given word under a
The PYP-HMM
Gibbs samplers: Both our Gibbs samplers perform the same calculation of conditional tag distributions, and involve first decrementing all trigrams and emissions affected by a sampling action, and then reintroducing the trigrams one at a time, conditioning their probabilities on the updated counts and table configurations as we progress.
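The decrement-then-reintroduce pattern for the simple single-tag sampler looks roughly as follows. Plain smoothed count tables (`trans`, `emit`, with hypothetical prob/add/remove methods) stand in for the Pitman-Yor restaurant bookkeeping, and the three affected trigrams are reintroduced one at a time while scoring each candidate tag, as the excerpt describes:

```python
import random

def resample_tag(i, tags, words, trans, emit, tagset):
    """Resample tags[i] in a trigram HMM, assuming the sequence is padded with
    sentinel tags at both ends. `trans` and `emit` are hypothetical count tables
    with prob(context, outcome), add(...) and remove(...) methods."""
    affected = [((tags[i-2], tags[i-1]), tags[i]),
                ((tags[i-1], tags[i]), tags[i+1]),
                ((tags[i], tags[i+1]), tags[i+2])]
    for ctx, nxt in affected:                      # decrement everything position i touches
        trans.remove(ctx, nxt)
    emit.remove(tags[i], words[i])

    weights = []
    for t in tagset:                               # score each candidate tag
        trial = [((tags[i-2], tags[i-1]), t),
                 ((tags[i-1], t), tags[i+1]),
                 ((t, tags[i+1]), tags[i+2])]
        w = 1.0
        for ctx, nxt in trial:                     # reintroduce one at a time,
            w *= trans.prob(ctx, nxt)              # conditioning on counts added so far
            trans.add(ctx, nxt)
        w *= emit.prob(t, words[i])
        for ctx, nxt in trial:                     # undo the trial additions
            trans.remove(ctx, nxt)
        weights.append(w)

    tags[i] = random.choices(tagset, weights=weights, k=1)[0]
    for ctx, nxt in [((tags[i-2], tags[i-1]), tags[i]),
                     ((tags[i-1], tags[i]), tags[i+1]),
                     ((tags[i], tags[i+1]), tags[i+2])]:
        trans.add(ctx, nxt)                        # add back the chosen configuration
    emit.add(tags[i], words[i])
```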
Gibbs sampling is mentioned in 7 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Experiments
Each topic model uses Gibbs sampling for inference and parameter learning.
Experiments
For testing we iterated the Gibbs sampler using the trained model for 10 iterations on the testing data.
Experiments
For fair comparison, each benchmark topic model is provided with prior information on word-semantic tag distributions based on the labeled training data; hence, each of the K latent topics is assigned to one of the K semantic tags at the beginning of Gibbs sampling.
Markov Topic Regression - MTR
We use blocked Gibbs sampling, in which the topic assignments z_k and hyper-parameters β_k are alternately sampled at each Gibbs sampling lag period g, given all other variables.
Markov Topic Regression - MTR
At each lag period g of the Gibbs sampling, K
Markov Topic Regression - MTR
At the start of the Gibbs sampling, we designate the
Gibbs sampling is mentioned in 7 sentences in this paper.
Snyder, Benjamin and Barzilay, Regina
Model
In practice, we never deal with such distributions directly, but rather integrate over them during Gibbs sampling.
Model
We achieve these aims by performing Gibbs sampling.
Model
Sampling: We follow (Neal, 1998) in the derivation of our blocked and collapsed Gibbs sampler.
Gibbs sampling is mentioned in 6 sentences in this paper.
Andrews, Nicholas and Eisner, Jason and Dredze, Mark
Abstract
We present a block Gibbs sampler for posterior inference and an empirical evaluation on several datasets.
Inference by Block Gibbs Sampling
We use a block Gibbs sampler which, from an initial state, repeats these steps: 1.
Inference by Block Gibbs Sampling
The topics of context words are assumed exchangeable, and so we re-sample them using Gibbs sampling (Griffiths and Steyvers, 2004).
Inference by Block Gibbs Sampling
Unfortunately, this is prohibitively expensive for the (nonexchangeable) topics of the named mentions c. A Gibbs sampler would have to choose a new value for c.z with probability proportional to the resulting joint probability of the full sample.
Gibbs sampling is mentioned in 6 sentences in this paper.
Yogatama, Dani and Sim, Yanchuan and Smith, Noah A.
Learning and Inference
In the E-step, we perform collapsed Gibbs sampling to obtain distributions over row and column indices for every mention, given the current value of the hyperparameters.
Learning and Inference
Also, our model has interdependencies among column indices of a mention. A standard Gibbs sampling procedure breaks down these dependencies.
Learning and Inference
This kind of blocked Gibbs sampling was proposed by Jensen et al.
Gibbs sampling is mentioned in 6 sentences in this paper.
Titov, Ivan and McDonald, Ryan
The Model
Following Titov and McDonald (2008) we use a collapsed Gibbs sampling algorithm that was derived for the MG-LDA model based on the Gibbs sampling method proposed for LDA in (Griffiths and Steyvers, 2004).
The Model
Gibbs sampling is an example of a Markov Chain Monte Carlo algorithm (Geman and Geman, 1984).
The Model
In Gibbs sampling, variables are sequentially sampled from their distributions conditioned on all other variables in the model.
Gibbs sampling is mentioned in 6 sentences in this paper.
Ritter, Alan and Mausam and Etzioni, Oren
Experiments
Next we used collapsed Gibbs sampling to infer a distribution over topics, θ, for each of the relations in the primary corpus (based solely on tuples in the training set) using the topics from the generalization corpus.
Experiments
To evaluate how well our topic-class associations carry over to unseen relations, we used the same random sample of 100 relations from the pseudo-disambiguation experiment. For each argument of each relation we picked the top two topics according to frequency in the 5 Gibbs samples.
Previous Work
Additionally we perform full Bayesian inference using collapsed Gibbs sampling, in which parameters are integrated out (Griffiths and Steyvers, 2004).
Topic Models for Selectional Prefs.
For all the models we use collapsed Gibbs sampling for inference, in which each of the hidden variables (e.g., z_{r,1} and z_{r,2} in LinkLDA) is sampled sequentially conditioned on a full assignment to all others, integrating out the parameters (Griffiths and Steyvers, 2004).
Topic Models for Selectional Prefs.
In addition, there are several scalability enhancements such as SparseLDA (Yao et al., 2009), and an approximation of the Gibbs Sampling procedure can be efficiently parallelized (Newman et al., 2009).
Gibbs sampling is mentioned in 5 sentences in this paper.
Lee, Chia-ying and Glass, James
Inference
We employ Gibbs sampling (Gelman et al., 2004) to approximate the posterior distribution of the hidden variables in our model.
Inference
To apply Gibbs sampling to our problem, we need to derive the conditional posterior distributions of each hidden variable of the model.
Inference
2, the Gibbs sampler can draw a new value for CM by sampling from the normalized distribution.
Introduction
We implement the inference process using Gibbs sampling.
Gibbs sampling is mentioned in 5 sentences in this paper.
Reisinger, Joseph and Pasca, Marius
Experimental Setup 4.1 Data Analysis
Per-Node Distribution: In stDA and ssLDA, attribute rankings can be constructed directly for each WN concept c, by computing the likelihood of attribute w attaching to c, L(c|w) = p(w|c), averaged over all Gibbs samples (discarding a fixed number of samples for burn-in).
Hierarchical Topic Models 3.1 Latent Dirichlet Allocation
This distribution can be approximated efficiently using Gibbs sampling .
Hierarchical Topic Models 3.1 Latent Dirichlet Allocation
An efficient Gibbs sampling procedure is given in (Blei et al., 2003a).
Results
Precision was manually evaluated relative to 23 concepts chosen for broad coverage. Table 1 shows precision at n and the Mean Average Precision (MAP). In all LDA-based models, the Bayes average posterior is taken over all Gibbs samples
Results
Inset plots show log-likelihood of each Gibbs sample, indicating convergence except in the case of nCRP.
Gibbs sampling is mentioned in 5 sentences in this paper.
Shindo, Hiroyuki and Miyao, Yusuke and Fujino, Akinori and Nagata, Masaaki
Experiment
After that, to infer the substitution sites, we initialized the model with the final sample from a run on the small training set, and used the Gibbs sampler for 2000 iterations.
Inference
In each splitting step, we use two types of blocked MCMC algorithm: the sentence-level blocked Metropolis-Hastings (MH) sampler and the tree-level blocked Gibbs sampler, while (Petrov et al., 2006) use a different MLE-based model and the EM algorithm.
Inference
The tree-level blocked Gibbs sampler focuses on the type of SR-TSG rules and simultaneously up-
Inference
After the inference of symbol subcategories, we use Gibbs sampling to infer the substitution sites of parse trees as described in (Cohn and Lapata, 2009; Post and Gildea, 2009).
Gibbs sampling is mentioned in 5 sentences in this paper.
O'Connor, Brendan and Stewart, Brandon M. and Smith, Noah A.
Experiments
Posteriors are saved and averaged from 11 Gibbs samples (every 100 iterations from 9,000 to 10,000) for analysis.
Experiments
where n refers to the averaged Gibbs samples’ counts of event tuples having frame k and a particular verb path, and N is the number of token comparisons (i.e.
Inference
After randomly initializing all η_{k,s,r,t}, inference is performed by a blocked Gibbs sampler, alternating resamplings for three major groups of variables: the language model (z, φ), the context model (α, γ, β, p), and the η, θ variables, which bottleneck between the submodels.
Inference
find that experimenting with different models is easier in the Gibbs sampling framework.
Inference
While Gibbs sampling for logistic normal priors is possible using auxiliary variable methods (Mimno et al., 2008; Holmes and Held, 2006; Polson et al., 2012), it can be slow to converge.
Gibbs sampling is mentioned in 5 sentences in this paper.
Cohen, Shay B. and Johnson, Mark
Bayesian inference for PCFGs
The algorithms we give here are based on their Gibbs sampler, which in each iteration first samples parse trees
Bayesian inference for PCFGs
(1–3) for P(θ | t, α) into the generic Gibbs sampler framework of Johnson et al.
Bayesian inference for PCFGs
Figure 1 plots the density of F1 scores (compared to the gold standard) resulting from the Gibbs sampler, using all three approaches.
Introduction
We show how to modify the Gibbs sampler described by Johnson et al.
Introduction
Perhaps surprisingly, we show that the Gibbs sampler as defined by Johnson et al.
Gibbs sampling is mentioned in 5 sentences in this paper.
Snyder, Benjamin and Naseem, Tahira and Barzilay, Regina
Experimental setup
This model uses the same inference procedure as our bilingual model (Gibbs sampling).
Experimental setup
We also reimplemented the original EM version of CCM and found virtually no difference in performance when using EM or Gibbs sampling.
Model
We use Gibbs sampling (Hastings, 1970) to draw trees for each sentence conditioned on those drawn for
Model
This use of a tractable proposal distribution and acceptance ratio is known as the Metropolis-Hastings algorithm and it preserves the convergence guarantee of the Gibbs sampler (Hastings, 1970).
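The Metropolis-Hastings correction they refer to has the generic form below; `propose`, `proposal_prob` and `target_prob` are placeholders for the tractable proposal distribution and the true model probability, not the parser's actual components:

```python
import random

def mh_within_gibbs_step(current, propose, proposal_prob, target_prob):
    """One Metropolis-Hastings step inside a Gibbs sweep: draw a candidate from a
    tractable proposal q and accept it with the MH ratio, which corrects for the
    mismatch between q and the true conditional and so preserves the sampler's
    stationary distribution. proposal_prob(a, b) is q(a | b)."""
    candidate = propose(current)
    ratio = (target_prob(candidate) * proposal_prob(current, candidate)) / \
            (target_prob(current) * proposal_prob(candidate, current))
    return candidate if random.random() < min(1.0, ratio) else current
```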
Gibbs sampling is mentioned in 4 sentences in this paper.
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro and Takamura, Hiroya and Okumura, Manabu
Bilingual Infinite Tree Model
(2007) presented a sampling algorithm for the infinite tree model, which is based on the Gibbs sampling in the direct assignment representation for iHMM (Teh et al., 2006).
Bilingual Infinite Tree Model
In Gibbs sampling, individual hidden state variables are resampled conditioned on all other variables.
Bilingual Infinite Tree Model
Beam sampling does not suffer from slow convergence as in Gibbs sampling by sampling the whole state variables at once.
Gibbs sampling is mentioned in 4 sentences in this paper.
Kim, Young-Bum and Snyder, Benjamin
Experiments
our Gibbs sampling inference method for the type-based HMM, even in the absence of multilingual priors.
Inference
4.2 Gibbs Sampling
Inference
To sample values (t, z, α, β) from their posterior (the integrand of Equation 1), we use Gibbs sampling, a Monte Carlo technique that constructs a Markov chain over a high-dimensional sample space by iteratively sampling each variable conditioned on the currently drawn sample values for the others, starting from a random initialization.
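That definition can be made concrete with the textbook bivariate-Gaussian example, which is unrelated to the tagging model itself but shows the mechanics: each variable is redrawn from its exact conditional given the current value of the other.

```python
import numpy as np

def gibbs_bivariate_normal(rho, iters=5000, rng=np.random):
    """Gibbs sampling from a standard bivariate normal with correlation rho:
    starting from a random point, repeatedly redraw each coordinate from its
    conditional given the current value of the other."""
    x, y = rng.normal(), rng.normal()              # random initialization
    sd = np.sqrt(1.0 - rho ** 2)
    samples = []
    for _ in range(iters):
        x = rng.normal(rho * y, sd)                # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, sd)                # y | x ~ N(rho*x, 1 - rho^2)
        samples.append((x, y))
    return np.array(samples)
```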
Results
Simply using our Gibbs sampler with symmetric priors boosts the performance up to 96%.
Gibbs sampling is mentioned in 4 sentences in this paper.
Yamangil, Elif and Shieber, Stuart M.
Abstract
We formalize nonparametric Bayesian STSG with epsilon alignment in full generality, and provide a Gibbs sampling algorithm for posterior inference tailored to the task of extractive sentence compression.
Evaluation
We compared the Gibbs sampling compressor (GS) against a version of maximum a posteriori EM (with Dirichlet parameter greater than 1) and a discriminative STSG based on SVM training (Cohn and Lapata, 2008) (SVM).
The STSG Model
3.2 Posterior inference via Gibbs sampling
The STSG Model
We use Gibbs sampling (Geman and Geman, 1984), a Markov chain Monte Carlo (MCMC) method, to sample from the posterior (3).
Gibbs sampling is mentioned in 4 sentences in this paper.
Hu, Yuening and Zhai, Ke and Eidelman, Vladimir and Boyd-Graber, Jordan
Experiments
For Gibbs sampling, we use implementations available in Hu and Boyd-Graber (2012) for tLDA; and Mallet (McCallum, 2002) for LDA and pLDA.
Inference
We use a collapsed Gibbs sampler for tree-based topic models to sample the path y_{d,n} and topic assignment z_{d,n} for word w_{d,n},
Inference
For topic z and path y, instead of variational updates, we use a Gibbs sampler within a document.
Inference
This equation embodies how this is a hybrid algorithm: the first term resembles the Gibbs sampling term encoding how much a document prefers a topic, while the second term encodes the expectation under the variational distribution of how much a path is preferred by this topic,
Gibbs sampling is mentioned in 4 sentences in this paper.
Zou, Bowei and Zhou, Guodong and Zhu, Qiaoming
Baselines
Here, the topics are extracted from all the documents in the *SEM 2012 shared task using the LDA Gibbs Sampling algorithm (Griffiths, 2002).
Baselines
where Rel(w, r_m) is the weight of word w in topic r_m, calculated by the LDA Gibbs sampling algorithm.
Baselines
Topic Modeler: For estimating the transition probability Pt(i,m), we employ GibbsLDA++, an LDA model using the Gibbs sampling technique for parameter estimation and inference.
Gibbs sampling is mentioned in 4 sentences in this paper.
Wang, Xiaolin and Utiyama, Masao and Finch, Andrew and Sumita, Eiichiro
Introduction
(2010) used the local best alignment to increase the speed of the Gibbs sampling in training but the impact on accuracy was not explored.
Introduction
To this end, we model bilingual UWS under a similar framework with monolingual UWS in order to improve efficiency, and replace Gibbs sampling with expectation maximization (EM) in training.
Methods
E(P(F_{k'} | F)) = P(F_{k'} | f, M) in a similar manner to the marginalization in the Gibbs sampling process which we are replacing;
Gibbs sampling is mentioned in 3 sentences in this paper.
Yang, Bishan and Cardie, Claire
Approach
For constraints with higher-order structures, we use Gibbs Sampling (Geman and Geman, 1984) to approximate the expectations.
Approach
For documents where the higher-order constraints apply, we use the same Gibbs sampler as described above to infer the most likely label assignment, otherwise, we use the Viterbi algorithm.
Experiments
For approximate inference with higher-order constraints, we perform 2000 Gibbs sampling iterations, where the first 1000 iterations are burn-in iterations.
Gibbs sampling is mentioned in 3 sentences in this paper.
Zhai, Ke and Williams, Jason D
Experiments
We run the Gibbs samplers for 1000 iterations and update all hyper-parameters using slice sampling (Neal, 2003; Wallach, 2008) every 10 iterations.
Latent Structure in Dialogues
We also assume symmetric Dirichlet priors on all multinomial distributions and apply collapsed Gibbs sampling.
Latent Structure in Dialogues
All probabilities can be computed using the collapsed Gibbs sampler for LDA (Griffiths and Steyvers, 2004).
Gibbs sampling is mentioned in 3 sentences in this paper.
Branavan, S.R.K. and Chen, Harr and Eisenstein, Jacob and Barzilay, Regina
Experimental Setup
To improve the model’s convergence rate, we perform two initialization steps for the Gibbs sampler.
Experimental Setup
Inference The final point estimate used for testing is an average (for continuous variables) or a mode (for discrete variables) over the last 1,000 Gibbs sampling iterations.
Posterior Sampling
We employ Gibbs sampling, previously used in NLP by Finkel et al.
Gibbs sampling is mentioned in 3 sentences in this paper.
Bamman, David and Underwood, Ted and Smith, Noah A.
Experiments
All experiments are run with 50 iterations of Gibbs sampling to collect samples for the personas p, alternating with maximization steps for η.
Model
Rather than adopting a fully Bayesian approach (e.g., sampling all variables), we infer these values using stochastic EM, alternating between collapsed Gibbs sampling for each p and maximizing with respect to η.
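A skeleton of that stochastic-EM alternation; `gibbs_sweep_personas` and `maximize_regression` are placeholders for the collapsed sampling of the persona assignments p and the maximization over the regression weights (written eta here, following the excerpt):

```python
def stochastic_em(personas, eta, data, gibbs_sweep_personas, maximize_regression,
                  rounds=50):
    """Alternate an E-like step (collapsed Gibbs sweeps over the discrete persona
    assignments with eta held fixed) and an M-like step (maximizing with respect
    to eta with the sampled assignments held fixed)."""
    for _ in range(rounds):
        personas = gibbs_sweep_personas(personas, eta, data)   # sample p given eta
        eta = maximize_regression(personas, data)              # maximize over eta given p
    return personas, eta
```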
Model
We assume the reader is familiar with collapsed Gibbs sampling as used in latent-variable NLP models.
Gibbs sampling is mentioned in 3 sentences in this paper.
Quirk, Chris
Abstract
Recent approaches instead use more principled approximate inference techniques such as Gibbs sampling for parameter estimation.
Evaluation
The next line shows the fertility HMM with approximate posterior computation from Gibbs sampling but with final alignment selected by the Viterbi algorithm.
HMM alignment
estimate the posterior distribution using Markov chain Monte Carlo methods such as Gibbs sampling (Zhao and Gildea, 2010).
Gibbs sampling is mentioned in 3 sentences in this paper.
Ó Séaghdha, Diarmuid
Related work
The combination of a well-defined probabilistic model and Gibbs sampling procedure for estimation guarantee (eventual) convergence and the avoidance of degenerate solutions.
Three selectional preference models
Following Griffiths and Steyvers (2004), we estimate the model by Gibbs sampling.
Three selectional preference models
As suggested by the similarity between (4) and (2), the ROOTH-LDA model can be estimated by an LDA-like Gibbs sampling procedure.
Gibbs sampling is mentioned in 3 sentences in this paper.
Si, Jianfeng and Mukherjee, Arjun and Liu, Bing and Li, Qing and Li, Huayi and Deng, Xiaotie
Related Work 2.1 Market Prediction and Social Media
We use collapsed Gibbs sampling (Bishop, 2006) for model inference.
Related Work 2.1 Market Prediction and Social Media
Only non-opinion words in tweets are used for Gibbs sampling.
Related Work 2.1 Market Prediction and Social Media
The actual topic priors for topic links are governed by the four cases of the Gibbs sampler.
Gibbs sampling is mentioned in 3 sentences in this paper.
Ravi, Sujith and Knight, Kevin
Machine Translation as a Decipherment Task
Sampling IBM Model 3: We use point-wise Gibbs sampling to estimate the IBM Model 3 parameters.
Word Substitution Decipherment
channel. We perform inference using point-wise Gibbs sampling (Geman and Geman, 1984).
Word Substitution Decipherment
Parallelized Gibbs sampling: Secondly, we parallelize our sampling step using a Map-Reduce framework.
Gibbs sampling is mentioned in 3 sentences in this paper.
Mareċek, David and Straka, Milan
Experiments
We have also found that the Gibbs sampler does not always converge to a similar grammar.
Inference
We employ the Gibbs sampling algorithm (Gilks et al., 1996).
Introduction
Section 5 describes the inference algorithm based on Gibbs sampling.
Gibbs sampling is mentioned in 3 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek
Final Experiments
For our models, we ran Gibbs samplers for 2000 iterations for each configuration throwing out first 500 samples as burn-in.
Two-Tiered Topic Model - TTM
We use Gibbs sampling which allows a combination of estimates from several local maxima of the posterior distribution.
Two-Tiered Topic Model - TTM
We obtain DS during Gibbs sampling (in §4.1), which indicates a saliency score of each sentence s_j ∈ S, j = 1, ..., |S|:
Gibbs sampling is mentioned in 3 sentences in this paper.
Börschinger, Benjamin and Johnson, Mark and Demuth, Katherine
Experiments 4.1 The data
To test our Gibbs sampling inference procedure, we ran it on artificial data generated according to the model itself.
The computational model
..., for unknown n. A major insight in Goldwater’s work is that rather than sampling over the latent variables in the model directly (the number of which we don’t even know), we can instead perform Gibbs sampling over a set of boundary variables b_1, ..., b_{|W|-1}.
The computational model
Figure 5: The relation between the observed sequence of segments (bottom), the boundary variables b_1, ..., b_{|W|-1} that the Gibbs sampler operates over (in squares), the latent sequence of surface forms and the latent sequence of underlying forms.
Gibbs sampling is mentioned in 3 sentences in this paper.
He, Yulan and Lin, Chenghua and Alani, Harith
Introduction
The previously proposed JST model uses the sentiment prior information in the Gibbs sampling inference step, so that a sentiment label will only be sampled if the current word token has no prior sentiment as defined in a sentiment lexicon.
Joint Sentiment-Topic (JST) Model
Gibbs sampling was used to estimate the posterior distribution by sequentially sampling each variable of interest, z and l here, from the distribution over
Joint Sentiment-Topic (JST) Model
In our experiment, α was updated every 25 iterations during the Gibbs sampling procedure.
Gibbs sampling is mentioned in 3 sentences in this paper.
Mukherjee, Arjun and Liu, Bing
Experiments
The variations in the results are due to the random initialization of the Gibbs sampler.
Proposed Seeded Models
We employ collapsed Gibbs sampling (Griffiths and Steyvers, 2004) for posterior inference.
Related Work
(2011) relied on user feedback during Gibbs sampling iterations.
Gibbs sampling is mentioned in 3 sentences in this paper.
Diao, Qiming and Jiang, Jing and Zhu, Feida and Lim, Ee-Peng
Experiments
Each model was run for 500 iterations of Gibbs sampling.
Method
We use collapsed Gibbs sampling to obtain samples of the hidden variable assignment and to estimate the model parameters from these samples.
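Estimating the multinomial parameters from a collapsed sample follows the usual posterior-mean recipe; a sketch with symmetric priors, using the same count-array conventions as the LDA sketches above (not the authors' exact model, which has additional variables):

```python
import numpy as np

def estimate_lda_params(n_dk, n_kw, alpha, beta):
    """Point estimates from one collapsed Gibbs sample, via the posterior means
    theta[d,k] proportional to n_dk[d,k] + alpha and
    phi[k,w] proportional to n_kw[k,w] + beta."""
    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    phi = (n_kw + beta) / (n_kw + beta).sum(axis=1, keepdims=True)
    return theta, phi
```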
Method
Due to space limit, we only show the derived Gibbs sampling formulas as follows.
Gibbs sampling is mentioned in 3 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek
Experiments
For Base-MCM and WebPrior-MCM, we run the Gibbs sampler for 2000 iterations with the first 500 samples as burn-in.
MultiLayer Context Model - MCM
Thus, we use a Markov Chain Monte Carlo (MCMC) method, specifically Gibbs sampling, to model the posterior distribution P(D_u, A_{ud}, S_{ujd} | ...) by obtaining samples (D_u, A_{ud}, S_{ujd}) drawn from this distribution.
MultiLayer Context Model - MCM
During Gibbs sampling, we keep track of the frequency of draws of domain, dialog act and slot indicating n-grams w_j in the M_D, M_A and M_S matrices, respectively.
Gibbs sampling is mentioned in 3 sentences in this paper.