Index of papers in Proc. ACL 2010 that mention
  • hyperparameters
Kazama, Jun'ichi and De Saeger, Stijn and Kuroda, Kow and Murata, Masaki and Torisawa, Kentaro
Background
The Dirichlet distribution is parametrized by hyperparameters α_k (> 0).
Background
where C(k) is the frequency of choice k in data D. For example, C(k) = C(w_i, f_k) in the estimation of p(…). This is very simple: we just need to add the observed counts to the hyperparameters.
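The conjugate update quoted above can be sketched in a few lines (a minimal illustration of the Dirichlet-multinomial posterior; the function name and the count values are hypothetical, not from the paper):

```python
import numpy as np

def dirichlet_posterior(alpha, counts):
    """Posterior hyperparameters of a Dirichlet prior after observing
    multinomial counts: simply add the observed counts C(k) to alpha."""
    return np.asarray(alpha, dtype=float) + np.asarray(counts, dtype=float)

# Symmetric prior alpha_k = 1.0 over 3 choices, observed counts C(k) = [5, 2, 0]
post = dirichlet_posterior([1.0, 1.0, 1.0], [5, 2, 0])
print(post)  # [6. 3. 1.]
```

The posterior is again a Dirichlet, which is what makes the "just add the counts" update valid.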
Experiments
We randomly chose 200 sets each for sets “A” and “B.” Set “A” is a development set to tune the value of the hyperparameters and
Experiments
As for BCb, we assumed that all of the hyperparameters had the same value, i.e., α_k = α.
Experiments
Because tuning hyperparameters involves the possibility of overfitting, its robustness should be assessed.
Method
Note that with the Dirichlet prior, α'_k = α_k + C(w_1, f_k) and β'_k = β_k + C(w_2, f_k), where α_k and β_k are the hyperparameters of the priors of w_1 and w_2, respectively.
Method
To put it all together, we can obtain a new Bayesian similarity measure on words, which can be calculated only from the hyperparameters for the Dirichlet prior, α and β, and the observed counts C(w_i, f_k).
hyperparameters is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Prettenhofer, Peter and Stein, Benno
Experiments
Special emphasis is put on corpus construction, determination of upper bounds and baselines, and a sensitivity analysis of important hyperparameters.
Experiments
SGD receives two hyperparameters as input: the number of iterations T, and the regularization parameter A.
Experiments
Recall that CL-SCL receives three hyperparameters as input: the number of pivots m, the dimensionality of the cross-lingual representation k,
Introduction
Third, an in-depth analysis with respect to important hyperparameters such as the ratio of labeled and unlabeled documents, the number of pivots, and the optimum dimensionality of the cross-lingual representation.
hyperparameters is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Turian, Joseph and Ratinov, Lev-Arie and Bengio, Yoshua
Supervised evaluation tasks
After choosing hyperparameters to maximize the dev F1, we would retrain the model using these hyperparameters on the full 8936-sentence training set, and evaluate on test.
Supervised evaluation tasks
One hyperparameter was l2-regularization sigma, which for most models was optimal at 2 or 3.2.
Supervised evaluation tasks
The word embeddings also required a scaling hyperparameter, as described in Section 7.2.
Unlabeled Data
We can scale the embeddings by a hyperparameter, to control their standard deviation.
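The scaling step described above can be sketched as follows (an assumed implementation; the target value 0.1 and the matrix shape are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(1000, 50))        # embedding matrix (words x dims)

target_std = 0.1                        # scaling hyperparameter (assumed value)
E_scaled = E * (target_std / E.std())   # rescale so the std matches the target

print(round(E_scaled.std(), 6))         # ≈ 0.1
```

A single multiplicative constant suffices because scaling every entry by c scales the standard deviation by exactly c.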
hyperparameters is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ó Séaghdha, Diarmuid
Experimental setup
Unless stated otherwise, all results are based on runs of 1,000 iterations with 100 classes, with a 200-iteration burn-in period after which hyperparameters were re-estimated every 50 iterations. The probabilities estimated by the models (P(n|v, r) for LDA and P(n, v|r) for ROOTH- and DUAL-LDA) were sampled every 50 iterations post-burn-in and averaged over three runs to smooth out variance.
Results
(2009) demonstrate that LDA is relatively insensitive to the choice of topic vocabulary size Z when the α and β hyperparameters are optimised appropriately during estimation.
Results
In fact, we do not find that performance becomes significantly less robust when hyperparameter re-estimation is deactivated; correlation scores simply drop by a small amount (1–2 points), irrespective of the Z chosen.
Three selectional preference models
Given a dataset of predicate-argument combinations and values for the hyperparameters α and β, the probability model is determined by the class assignment counts f_zn and f_zv.
hyperparameters is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Yamangil, Elif and Shieber, Stuart M.
Evaluation
All hyperparameters α_c, β_c were held constant at α, β for simplicity and were fit using grid search over α ∈ [10^-6, 10^6], β ∈ [10^-3, 0.5].
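A grid search over two hyperparameter ranges like those quoted above can be sketched as follows (the grid resolution and the score() objective are hypothetical stand-ins for the paper's held-out model fit):

```python
import itertools
import math

# Grid mirroring the quoted ranges: alpha in [1e-6, 1e6], beta in [1e-3, 0.5]
alphas = [10.0 ** e for e in range(-6, 7)]
betas = [1e-3, 1e-2, 1e-1, 0.5]

def score(alpha, beta):
    # Toy objective standing in for held-out likelihood (not from the paper):
    # peaked at alpha = 1.0, beta = 0.1.
    return -(math.log10(alpha) ** 2) - (beta - 0.1) ** 2

# Evaluate every (alpha, beta) pair and keep the best-scoring one.
best = max(itertools.product(alphas, betas), key=lambda ab: score(*ab))
print(best)  # (1.0, 0.1) under this toy objective
```

Searching a log-spaced grid for α is the natural choice here, since the quoted range spans twelve orders of magnitude.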
Evaluation
Hyperparameters were handled the same way as for GS.
The STSG Model
The hyperparameters can be incorporated into the generative model as random variables; however, we opt to fix these at various constants to investigate different levels of sparsity.
The STSG Model
Assuming fixed hyperparameters α = {α_c} and β = {β_c}, our inference problem is to find the posterior distribution of the derivation sequences
hyperparameters is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Snyder, Benjamin and Barzilay, Regina and Knight, Kevin
Inference
Recall that ν_e is a hyperparameter for the Dirichlet prior on G_0 and depends on the value of the corresponding indicator variable λ_e.
Inference
Recall that each sparsity indicator λ_e determines the value of the corresponding hyperparameter ν_e of the Dirichlet prior for the character-edit base distribution G_0.
Model
The prior on the base distribution G_0 is a Dirichlet distribution with hyperparameters ν, i.e., G_0 ∼ Dirichlet(ν).
hyperparameters is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: