Abstract | Finding the optimal model parameters is then usually a difficult nonconvex optimization problem.
Abstract | We search for the maximum-likelihood model parameters and corpus parse, subject to posterior constraints. |
Introduction | The node branches on a single model parameter θ_m to partition its subspace.
Introduction | A variety of ways to find better local optima have been explored, including heuristic initialization of the model parameters (Spitkovsky et al., 2010a), random restarts (Smith, 2006), and annealing (Smith and Eisner, 2006; Smith, 2006). |
Introduction | search with certificates of ε-optimality for both the corpus parse and the model parameters.
The Constrained Optimization Task | The nonlinear constraints ensure that the model parameters are true log-probabilities. |
The Constrained Optimization Task | Notation (table fragment): feature / model parameter index; sentence index; conditional distribution index; number of model parameters.
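The Constrained Optimization Task | As a hedged sketch of those nonlinear constraints (writing θ_m for the model parameters and c for a conditional distribution, as in the notation above; the paper's exact indexing may differ), each conditional distribution's probabilities must exponentiate and sum to one while every parameter stays non-positive:
\[
\sum_{m \in c} \exp(\theta_m) = 1 \quad \text{for each conditional distribution } c, \qquad \theta_m \le 0 .
\]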
Experimental results | For the 44 and 230 million token corpora, all sentences are automatically parsed and used to initialize the model parameters, while for the 1.3 billion token corpus, we parse the sentences from a portion of the corpus that contains 230 million tokens, then use them to initialize the model parameters.
Experimental results | Nevertheless, experimental results show that this approach is effective in providing initial values for the model parameters.
Training algorithm | The objective of maximum likelihood estimation is to maximize the likelihood L(D, p) with respect to the model parameters p.
Training algorithm | and denote T_N as the collection of N-best parse trees for the sentences over the entire corpus D under model parameter p.
Training algorithm | mate model parameters.
Introduction | From these corpora, we estimate translation model parameters: word-to-word translation tables, fertilities, distortion parameters, phrase tables, syntactic transformations, etc.
Introduction | A language model P(e) is typically used in SMT decoding (Koehn, 2009), but here P(e) actually plays a central role in training the translation model parameters.
Machine Translation as a Decipherment Task | During decipherment training, our objective is to estimate the model parameters θ in order to maximize the probability of the foreign corpus f. From Equation 4 we have:
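Machine Translation as a Decipherment Task | (The equation referenced here did not survive extraction. As a hedged reconstruction of the usual decipherment objective, with e ranging over candidate source-language sentences and P(e) the language model mentioned elsewhere, it takes a form like:)
\[
\hat{\theta} = \arg\max_{\theta} \; \prod_{f} \sum_{e} P(e)\, P_{\theta}(f \mid e) .
\]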
Machine Translation as a Decipherment Task | For Bayesian MT decipherment, we set a high prior value on the language model (10^4) and use sparse priors for the IBM 3 model parameters t, n, d, p (0.01, 0.01, 0.01, 0.01).
Word Substitution Decipherment | During decipherment, our goal is to estimate the channel model parameters θ.
Word Substitution Decipherment | These methods are attractive for their ability to manage uncertainty about model parameters and allow one to incorporate prior knowledge during inference. |
Base Models | be the value of feature i for subtree r over sentence s, and let E_θ[f_i|s] be the expected value of feature i in sentence s, based on the current model parameters θ.
Hierarchical Joint Learning | After training has been completed, we retain only the joint model’s parameters.
Hierarchical Joint Learning | The first summation in this equation computes the log-likelihood of each model, using the data and parameters which correspond to that model, and the prior likelihood of that model’s parameters, based on a Gaussian prior centered around the top-level, non-model-specific parameters θ*, and with model-specific variance σ_m.
Hierarchical Joint Learning | We need to compute partial derivatives in order to optimize the model parameters.
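Hierarchical Joint Learning | As a sketch of those partial derivatives under the Gaussian prior described above (writing O for the full objective, L_m for model m's data log-likelihood, and reusing θ_m, θ*, and σ_m from the previous sentences; the paper's exact objective may also place a prior on θ*), each model-specific parameter receives its own likelihood gradient plus a pull toward the shared top-level parameters:
\[
\frac{\partial \mathcal{O}}{\partial \theta_{m,i}} = \frac{\partial \mathcal{L}_m}{\partial \theta_{m,i}} - \frac{\theta_{m,i} - \theta^*_i}{\sigma_m^2}, \qquad
\frac{\partial \mathcal{O}}{\partial \theta^*_i} = \sum_{m} \frac{\theta_{m,i} - \theta^*_i}{\sigma_m^2} .
\]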
In our dataset only 11% of Candidate Relations are valid. | Initialization: Model parameters θ_x = 0 and θ_c = 0.
In our dataset only 11% of Candidate Relations are valid. | As before, θ_x is the vector of model parameters, and φ_x is the feature function.
In our dataset only 11% of Candidate Relations are valid. | Therefore, during learning, we need to find the model parameters that maximize expected future reward (Sutton and Barto, 1998). |
Model | Update the model parameters, using the low-level planner’s success or failure as the source of supervision.
Model | where θ_c is the vector of model parameters.
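Model | A minimal sketch of this kind of reward-driven update, assuming a log-linear policy over candidate actions and a scalar reward derived from the low-level planner's success or failure (all function and variable names here are illustrative, not the paper's):
```python
import numpy as np

def policy_probs(theta, feature_matrix):
    """Log-linear policy over candidate actions; feature_matrix has one
    row of features per candidate action."""
    scores = feature_matrix @ theta
    scores -= scores.max()                     # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

def reward_update(theta, feature_matrix, chosen, reward, lr=0.1):
    """One policy-gradient-style step: make the chosen action more likely
    when the reward (e.g. planner success = +1, failure = -1) is positive."""
    probs = policy_probs(theta, feature_matrix)
    expected_features = probs @ feature_matrix
    gradient = reward * (feature_matrix[chosen] - expected_features)
    return theta + lr * gradient
```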
Models considered 2.1 Basic Conditional Random Fields | The model parameters, then, form the parameters of the leaves of this hierarchy.
Models considered 2.1 Basic Conditional Random Fields | (3) represent the likelihood of the data in each domain given their corresponding model parameters; the second line represents the likelihood of each model parameter in each domain given the hyper-parameter of its parent in the tree hierarchy of features; and the last term goes over the entire tree T except the leaf nodes.
Models considered 2.1 Basic Conditional Random Fields | We perform MAP estimation for each model parameter as well as the hyper-parameters.
Detailed generative story | This is a conditional log-linear model parameterized by φ, where φ_k ∼ N(0, σ²).
Overview and Related Work | For learning, we iteratively adjust our model’s parameters to better explain our samples. |
Overview and Related Work | (2012) we use topics as the contexts, but learn mention topics jointly with other model parameters.
Parameter Estimation | E-step: Collect samples by MCMC simulation as in §5, given the current model parameters θ and φ.
Model | We assume the generative model operates by first generating the model parameters from a set of Dirichlet distributions. |
Model | • Generating Model Parameters: For every pair of feature type f and phrase tag z, draw a multinomial distribution parameter θ^f_z from a Dirichlet prior P(θ^f_z).
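Model | A minimal sketch of that first generative step, assuming symmetric Dirichlet priors and illustrative feature types and phrase tags (none of these names or sizes come from the paper):
```python
import numpy as np

rng = np.random.default_rng(0)

feature_types = ["word", "pos"]            # illustrative feature types f
phrase_tags = ["NP", "VP", "PP"]           # illustrative phrase tags z
num_outcomes = {"word": 1000, "pos": 45}   # outcomes per feature type
alpha = 0.1                                # symmetric Dirichlet hyperparameter

# For every (feature type f, phrase tag z) pair, draw a multinomial
# parameter vector theta[f][z] from a Dirichlet prior.
theta = {
    f: {z: rng.dirichlet(alpha * np.ones(num_outcomes[f])) for z in phrase_tags}
    for f in feature_types
}
```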
Model | Learning the Model: During inference, we want to estimate the hidden specification trees t given the observed natural language specifications w, after integrating the model parameters out, i.e.
Robust perceptron learning | Antagonistic adversaries choose transformations informed by the current model parameters w, but random adversaries randomly select transformations from a predefined set of possible transformations, e.g. |
Robust perceptron learning | In an online setting, feature bagging can be modelled as a game between a learner and an adversary, in which (a) the adversary can only choose between deleting transformations, (b) the adversary cannot see the model parameters when choosing a transformation, and (c) the adversary only moves in between passes over the data.
Robust perceptron learning | LRA is an adversarial game in which the two players are unaware of each other’s current move and, in particular, where the adversary does not see the model parameters and only randomly corrupts the data points.
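Robust perceptron learning | A minimal sketch of such a random-adversary game, respecting constraints (a)-(c) above: the adversary only deletes features, cannot see the weights, and moves only between passes (the binary-classification setup and all names are illustrative, not the paper's):
```python
import random

def random_adversary_perceptron(data, num_feats, num_passes=5,
                                deletions_per_pass=10, seed=0):
    """Perceptron with a random adversary: before each pass the adversary
    deletes a random subset of feature indices without looking at the
    current weights, and never moves mid-pass."""
    rng = random.Random(seed)
    w = [0.0] * num_feats
    for _ in range(num_passes):
        deleted = set(rng.sample(range(num_feats),
                                 min(deletions_per_pass, num_feats)))
        for feats, label in data:          # feats: {index: value}, label: +1 or -1
            score = sum(v * w[i] for i, v in feats.items() if i not in deleted)
            if label * score <= 0:         # mistake-driven update on surviving features
                for i, v in feats.items():
                    if i not in deleted:
                        w[i] += label * v
    return w
```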
Introduction | For example, in Expectation-Maximization (Dempster et al., 1977), the Expectation (E) step computes the posterior distribution over possible completions of the data, and the Maximization (M) step reestimates the model parameters as |
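Introduction | (The re-estimation formula is missing from the extract. For a generic multinomial parameterization, the M-step sets each parameter to a normalized expected count gathered in the E-step, a hedged sketch of which is:)
\[
\theta^{(t+1)}(y \mid x) = \frac{\mathbb{E}_{\theta^{(t)}}\!\left[\operatorname{count}(x, y)\right]}{\sum_{y'} \mathbb{E}_{\theta^{(t)}}\!\left[\operatorname{count}(x, y')\right]} .
\]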
Word Alignment | Different models parameterize this probability distribution in different ways. |
Word Alignment | It also contains most of the model’s parameters and is where overfitting occurs most. |
Experiments | This can explain the success of response-based learning: Lexical and structural variants of reference translations can be used to boost model parameters towards translations with positive feedback, while the same translations might be considered as negative examples in standard structured learning. |
Introduction | Here, learning proceeds by “trying out” translation hypotheses, receiving a response from interacting in the task, and converting this response into a supervision signal for updating the model parameters.
Response-based Online Learning | (2010) or Goldwasser and Roth (2013) describe a response-driven learning framework for the area of semantic parsing: Here, a meaning representation is “tried out” by iteratively generating system outputs, receiving feedback from world interaction, and updating the model parameters.
Model Training | gym] is the plausible score for the best translation candidate given the model parameters W and V.
Phrase Pair Embedding | Table 1: The relationship between the size of training data and the number of model parameters.
Phrase Pair Embedding | Table 1 shows the relationship between the size of training data and the number of model parameters.
Experimental Setup | We should note that since our model parameter A is represented and learned in the low-rank form, we only have to store and maintain the low-rank projections Uφ_h, Vφ_m, and Wφ_{h,m} rather than explicitly calculate the feature tensor φ_h ⊗ φ_m ⊗ φ_{h,m}.
Problem Formulation | We will directly learn a low-rank tensor A (because r is small) in this form as one of our model parameters.
Problem Formulation | where θ ∈ ℝ^L, U ∈ ℝ^{r×n}, V ∈ ℝ^{r×n}, and W ∈ ℝ^{r×d} are the model parameters to be learned.
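Problem Formulation | As a sketch of how the low-rank parameters are used (φ_h, φ_m, and φ_{h,m} denote the head, modifier, and arc feature vectors mentioned elsewhere in this section; the paper's full scoring function may include additional terms), the tensor score decomposes into r rank-one components:
\[
s_{\text{tensor}}(h, m) \;=\; \sum_{i=1}^{r} \big[U\phi_h\big]_i \,\big[V\phi_m\big]_i \,\big[W\phi_{h,m}\big]_i .
\]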
Approaches | Unsupervised Grammar Induction: Our first method for grammar induction is fully unsupervised Viterbi EM training of the Dependency Model with Valence (DMV) (Klein and Manning, 2004), with uniform initialization of the model parameters.
Related Work | (2010a) show that Viterbi (hard) EM training of the DMV with simple uniform initialization of the model parameters yields higher accuracy models than standard soft-EM |
Related Work | In Viterbi EM, the E-step finds the maximum likelihood corpus parse given the current model parameters.
Decipherment Model for Machine Translation | During decipherment training, our objective is to estimate the model parameters in order to maximize the probability of the source text f as suggested by Ravi and Knight (2011b). |
Decipherment Model for Machine Translation | Instead, we propose a new Bayesian inference framework to estimate the translation model parameters.
Introduction | The parallel corpora are used to estimate translation model parameters involving word-to-word translation tables, fertilities, distortion, phrase translations, syntactic transformations, etc. |
Generative state tracking | where φ_i(x, y) are feature functions jointly defined on features and labels, and λ_i are the model parameters.
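Generative state tracking | As a sketch of the standard log-linear form these symbols usually parameterize (the normalization over a label set y' is an assumption, not necessarily the paper's exact model):
\[
p_{\lambda}(y \mid x) \;=\; \frac{\exp\!\big(\sum_i \lambda_i \phi_i(x, y)\big)}{\sum_{y'} \exp\!\big(\sum_i \lambda_i \phi_i(x, y')\big)} .
\]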
Generative state tracking | This formulation also decouples the number of model parameters (i.e.
Generative state tracking | Second, model parameters in DISCIND are trained independently of competing hypotheses. |
Model | The Bernoulli parameter of a pixel inside a glyph bounding box depends on the pixel’s location inside the box (as well as on d_i and z_i, but for simplicity of exposition, we temporarily suppress this dependence) and on the model parameters governing glyph shape (for each character type c, the parameter matrix φ_c specifies the shape of the character’s glyph).
Results and Analysis | (2010), we use a regularization term in the optimization of the log-linear model parameters φ_c during the M-step.
Results and Analysis | Figure 8: The central glyph is a representation of the initial model parameters for the glyph shape for g, and surrounding this are the learned parameters for documents from various years. |
Decipherment | These methods are attractive for their ability to manage uncertainty about model parameters and allow one to incorporate prior knowledge during inference. |
Decipherment | Our goal is to estimate the channel model parameters θ in order to maximize the probability of the observed ciphertext c:
Decipherment | The base distribution P0 represents prior knowledge about the model parameter distributions. |
Association Model | Basic Interpolation: This smoothing model, P_interp(e|q), linearly combines our foreground and background models using a model parameter α:
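Association Model | (The interpolation formula itself is not shown in the extract; a hedged sketch, writing P_fg and P_bg as illustrative names for the foreground and background models named in the sentence, is:)
\[
P_{\text{interp}}(e \mid q) \;=\; \alpha\, P_{\text{fg}}(e \mid q) \;+\; (1 - \alpha)\, P_{\text{bg}}(e \mid q) .
\]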
Association Model | Section 5.2 outlines our procedure for learning the model parameters for both P_interp(e|q) and
Experimental Results | 5.2.1 Model Parameters |
Adding Linguistic Knowledge to the Monte-Carlo Framework | Since our model is a nonlinear approximation of the underlying action-value function of the game, we learn model parameters by applying nonlinear regression to the observed final utilities from the simulated roll-outs. |
Adding Linguistic Knowledge to the Monte-Carlo Framework | The resulting update to model parameters θ is of the form:
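Adding Linguistic Knowledge to the Monte-Carlo Framework | (The update formula is missing from the extract. For nonlinear regression of an action-value estimate Q(s, a; θ) toward an observed final utility R with learning rate α, a gradient update of this general shape is:)
\[
\Delta\theta \;=\; \alpha \,\big(R - Q(s, a; \theta)\big)\, \nabla_{\theta} Q(s, a; \theta) .
\]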
Adding Linguistic Knowledge to the Monte-Carlo Framework | We use the same experimental settings across all methods, and all model parameters are initialized to zero. |
A Model of Semantics | We select the model parameters θ by maximizing the marginal likelihood of the data, where the data D is given in the form of groups w =
Empirical Evaluation | When estimating the model parameters, we followed the training regime prescribed by Liang et al. (2009).
Inference with NonContradictory Documents | In the supervised case, where a and m are observable, estimation of the generative model parameters is generally straightforward. |
Challenges for Discriminative SMT | This itself provides robustness to noisy data, in addition to the explicit regularisation from a prior over the model parameters.
Discriminative Synchronous Transduction | Here k ranges over the model’s features, and Λ = {λ_k} are the model parameters (weights for their corresponding features).
Discriminative Synchronous Transduction | Each L-BFGS iteration requires the objective value and its gradient with respect to the model parameters . |
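Discriminative Synchronous Transduction | A self-contained sketch of the value-and-gradient callback such an L-BFGS loop needs (the log-linear objective, array shapes, and names here are illustrative, not the paper's translation model):
```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood_and_grad(lam, feats, labels):
    """Illustrative objective: negative conditional log-likelihood of a
    log-linear model plus an L2 prior, returning (value, gradient) as
    L-BFGS expects.  feats has shape (num_examples, num_labels, num_feats)."""
    scores = feats @ lam                              # (N, K)
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability
    log_z = np.log(np.exp(scores).sum(axis=1))        # (N,)
    gold = scores[np.arange(len(labels)), labels]
    nll = -(gold - log_z).sum() + 0.5 * lam @ lam
    probs = np.exp(scores - log_z[:, None])           # (N, K)
    expected = np.einsum("nk,nkf->f", probs, feats)   # expected feature counts
    observed = feats[np.arange(len(labels)), labels].sum(axis=0)
    grad = -(observed - expected) + lam
    return nll, grad

# Usage sketch:
# lam0 = np.zeros(num_feats)
# result = minimize(neg_log_likelihood_and_grad, lam0, args=(feats, labels),
#                   method="L-BFGS-B", jac=True)
```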