Index of papers in Proc. ACL 2012 that mention
  • hyperparameters
Vaswani, Ashish and Huang, Liang and Chiang, David
Conclusion
Even though we have used a small set of gold-standard alignments to tune our hyperparameters, we found that performance was fairly robust to variation in the hyperparameters, and translation performance was good even when gold-standard alignments were unavailable.
Experiments
We have implemented our algorithm as an open-source extension to GIZA++. Usage of the extension is identical to standard GIZA++, except that the user can switch the ℓ0 prior on or off, and adjust the hyperparameters α and β.
Experiments
We set the hyperparameters α and β by tuning on gold-standard word alignments (to maximize F1) when possible.
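Tuning two hyperparameters to maximize F1 on gold alignments amounts to a small grid search. The sketch below is a minimal, hypothetical illustration: the `toy_f1` scorer and the candidate grids are stand-ins, since in practice the scorer would run the aligner and evaluate F1 against the gold alignments.

```python
import itertools

def tune(f1, alphas, betas):
    """Return the (alpha, beta) pair maximizing the F1 scorer."""
    return max(itertools.product(alphas, betas), key=lambda ab: f1(*ab))

# Hypothetical stand-in scorer that peaks at alpha = 0.1, beta = 10;
# a real f1 would score the aligner's output against gold alignments.
toy_f1 = lambda a, b: -((a - 0.1) ** 2 + (b - 10) ** 2)

best = tune(toy_f1, [0.01, 0.1, 1.0], [1, 10, 100])
```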
Experiments
The fact that we had to use hand-aligned data to tune the hyperparameters α and β means that our method is no longer completely unsupervised.
Method
The hyperparameter β controls the tightness of the approximation, as illustrated in Figure 1.
"hyperparameters" is mentioned in 9 sentences in this paper.
Lee, Chia-ying and Glass, James
Experimental Setup
Table 1: The values of the hyperparameters of our model, where μd and λd are the dth entry of the mean and the diagonal of the inverse covariance matrix of training data.
Experimental Setup
Hyperparameters and Training Iterations The values of the hyperparameters of our model are shown in Table 1, where μd and λd are the dth entry of the mean and the diagonal of the inverse covariance matrix computed from training data.
Inference
We use p(· | ·) to denote a conditional posterior probability given observed data, all the other variables, and hyperparameters for the model.
Inference
The conjugate prior we use for the two variables is a normal-Gamma distribution with hyperparameters μ0, κ0, α0 and β0 (Murphy, 2007).
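The normal-Gamma conjugate update mentioned here has a standard closed form (notation following Murphy, 2007). The sketch below shows that generic update and a posterior draw; the data and prior values are illustrative, not the paper's.

```python
import numpy as np

def normal_gamma_posterior(x, mu0, kappa0, alpha0, beta0):
    """Conjugate update for a Gaussian with unknown mean and precision
    under a normal-Gamma prior NG(mu0, kappa0, alpha0, beta0)."""
    x = np.asarray(x, dtype=float)
    n, xbar = len(x), x.mean()
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = (beta0 + 0.5 * ((x - xbar) ** 2).sum()
              + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n))
    return mu_n, kappa_n, alpha_n, beta_n

# Posterior draw: precision ~ Gamma (rate parametrization), then mean | precision.
mu_n, kappa_n, alpha_n, beta_n = normal_gamma_posterior(
    [1.0, 2.0, 3.0], mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0)
rng = np.random.default_rng(0)
lam = rng.gamma(alpha_n, 1.0 / beta_n)            # precision
mu = rng.normal(mu_n, 1.0 / np.sqrt(kappa_n * lam))  # mean given precision
```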
Inference
Assume we use a symmetric Dirichlet distribution with a positive hyperparameter
Model
Figure 2, where the shaded circle denotes the observed feature vectors, and the squares denote the hyperparameters of the priors used in our model.
Results
6In the future, we plan to extend the model and infer the values of these hyperparameters from data directly.
"hyperparameters" is mentioned in 7 sentences in this paper.
Takamatsu, Shingo and Sato, Issei and Nakagawa, Hiroshi
Experiments
The averages of hyperparameters of PROP were 0.84 ± 0.05 for λ and 0.85 ± 0.10 for the threshold.
Experiments
Proposed Model (PROP): Using the training data, we determined the two hyperparameters, λ and the threshold to round φrs to 1 or 0, so that they maximized the F value.
Experiments
On the other hand, our model learns parameters such as σr for each relation and thus the hyperparameter of our model does not directly affect its performance.
Generative Model
In this section, we consider relation r since parameters are conditionally independent if relation r and the hyperparameter are given.
Generative Model
λ is the hyperparameter and mst is constant.
Generative Model
where 0 ≤ λ ≤ 1 is the hyperparameter that controls how strongly φrs is affected by the main labeling process explained in the previous subsection.
"hyperparameters" is mentioned in 6 sentences in this paper.
Bansal, Mohit and Klein, Dan
Experiments
We develop our features and tune their hyperparameter values on the ACE04 development set and then use these on the ACE04 test set. On the ACE05 and ACE05-ALL datasets, we directly transfer our Web features and their hyperparameter values from the ACE04 dev-set, without any retuning.
Semantics via Web Features
To capture this effect, we create a feature that indicates whether there is a match in the top k seeds of the two headwords (where k is a hyperparameter to tune).
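A top-k seed-match feature of this kind reduces to a set intersection over truncated ranked lists. The sketch below is a hypothetical illustration: the headwords and their seed lists are invented, and a real system would extract the ranked seeds from Web data.

```python
def topk_seed_match(seeds1, seeds2, k):
    """Fire the feature if the top-k seed lists of two headwords overlap."""
    return len(set(seeds1[:k]) & set(seeds2[:k])) > 0

# Hypothetical ranked seed lists for the headwords "president" and "leader".
president_seeds = ["obama", "bush", "clinton", "lincoln"]
leader_seeds = ["chief", "obama", "head", "boss"]

match_at_2 = topk_seed_match(president_seeds, leader_seeds, k=2)
match_at_1 = topk_seed_match(president_seeds, leader_seeds, k=1)
```

Increasing k trades precision for recall, which is why it is left as a hyperparameter to tune.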
Semantics via Web Features
We first collect the POS tags (using length-2 character prefixes to indicate coarse parts of speech) of the seeds matched in the top k′ seed lists of the two headwords, where k′ is another hyperparameter to tune.
Semantics via Web Features
We tune a separate bin-size hyperparameter for each of these three features.
"hyperparameters" is mentioned in 4 sentences in this paper.
Nguyen, Viet-An and Boyd-Graber, Jordan and Resnik, Philip
Evaluating Topic Shift Tendency
2008 Elections To obtain a posterior estimate of π (Figure 3) we create 10 chains with hyperparameters sampled from the uniform distribution U(0, 1) and averaged π over 10 chains (as described in Section 5).
Inference
Marginal counts are represented with · and ∗ represents all hyperparameters.
Topic Segmentation Experiments
Initial hyperparameter values are sampled from U(0, 1) to favor sparsity; statistics are collected after 500 burn-in iterations with a lag of 25 iterations over a total of 5000 iterations; and slice sampling (Neal, 2003) optimizes hyperparameters.
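Slice sampling (Neal, 2003) needs only the log-density of the hyperparameter up to a constant. The sketch below implements the generic univariate stepping-out-and-shrinkage update and, as a purely illustrative target, applies it to a standard-normal log-density rather than any model posterior from these papers.

```python
import math
import numpy as np

def slice_sample(logp, x0, w=1.0, rng=None, max_steps=100):
    """One univariate slice-sampling update (Neal, 2003): draw a slice
    height under logp(x0), step out to bracket the slice, then shrink."""
    rng = np.random.default_rng() if rng is None else rng
    log_y = logp(x0) + math.log(rng.uniform())  # log height of the slice
    # Step out: expand the interval until both ends fall off the slice.
    left = x0 - rng.uniform() * w
    right = left + w
    for _ in range(max_steps):
        if logp(left) <= log_y:
            break
        left -= w
    for _ in range(max_steps):
        if logp(right) <= log_y:
            break
        right += w
    # Shrink: sample uniformly until a point lands on the slice.
    while True:
        x1 = rng.uniform(left, right)
        if logp(x1) > log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1

# Example: draw 2,000 samples from a standard-normal log-density.
rng = np.random.default_rng(0)
x, samples = 0.5, []
for _ in range(2000):
    x = slice_sample(lambda v: -0.5 * v * v, x, rng=rng)
    samples.append(x)
```

Because every proposal is either accepted or used to shrink the bracket, the update has no tunable acceptance rate, which makes it convenient for hyperparameters whose scale is unknown.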
"hyperparameters" is mentioned in 3 sentences in this paper.
Shindo, Hiroyuki and Miyao, Yusuke and Fujino, Akinori and Nagata, Masaaki
Inference
4.3 Hyperparameter Estimation
Inference
We treat hyperparameters {d, θ} as random variables and update their values for every MCMC iteration.
Inference
We place a prior on the hyperparameters as follows: d ∼ Beta(1, 1), θ ∼ Gamma(1, 1).
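Resampling such hyperparameters once per MCMC iteration is often done with a simple Metropolis-Hastings step. The sketch below updates a discount d under a flat Beta(1, 1) prior; the log-likelihood is a hypothetical stand-in (the actual likelihood over tree fragments is not reproduced here).

```python
import math
import random

def mh_update_discount(d, loglik, rng):
    """One Metropolis-Hastings update for the discount d under a flat
    Beta(1, 1) prior, using a uniform independence proposal on (0, 1)."""
    d_new = rng.random()
    # Flat prior + uniform proposal: the acceptance ratio reduces to
    # the likelihood ratio.
    if math.log(rng.random()) < loglik(d_new) - loglik(d):
        return d_new
    return d

# Hypothetical stand-in log-likelihood that favors large discounts.
loglik = lambda d: 10.0 * math.log(d)

rng = random.Random(0)
d, trace = 0.5, []
for _ in range(2000):
    d = mh_update_discount(d, loglik, rng)
    trace.append(d)
```

The same pattern applies to θ under its Gamma(1, 1) prior, with the prior density folded into the acceptance ratio.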
"hyperparameters" is mentioned in 3 sentences in this paper.