Conclusion | Because we are interested in applying our techniques to languages for which no labeled resources are available, we paid particular attention to minimizing the number of free parameters and used the same hyperparameters for all language pairs.
Experiments and Results | We paid particular attention to minimizing the number of free parameters and used the same hyperparameters for all language pairs, rather than attempting language-specific tuning.
Experiments and Results | While we tried to minimize the number of free parameters in our model, there are a few hyperparameters that need to be set. |
Experiments and Results | Fortunately, performance was stable across various values, and we were able to use the same hyperparameters for all languages. |
POS Projection | …, |V_f|) are the label distributions over the foreign-language vertices, and μ and ν are hyperparameters that we discuss in §6.4.
Constraints Shape Topics | In this model, α, β, and η are Dirichlet hyperparameters set by the user; their role is explained below.
Constraints Shape Topics | where T_{d,k} is the number of times topic k is used in document d, P_{k,w_{d,n}} is the number of times the type w_{d,n} is assigned to topic k, α and β are the hyperparameters of the two Dirichlet distributions, and B is the number of top-level branches (this is the vocabulary size for vanilla LDA).
Constraints Shape Topics | In order to make the constraints effective, we set the constraint word-distribution hyperparameter η to be much larger than the hyperparameter for the distribution over constraints and vocabulary, β.
Simulation Experiment | The hyperparameters for all experiments are α = 0.1, β = 0.01, and η = 100.
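A quick way to see what these Dirichlet hyperparameters do is to draw symmetric Dirichlet samples at the two extremes. This is an illustrative sketch, not the paper's sampler: the dimensionality K and the use of NumPy are assumptions made only for the demonstration.

```python
import numpy as np

# Symmetric Dirichlet draws at the two concentration extremes from the
# simulation experiment. K = 10 is an assumed dimensionality chosen
# purely for illustration.
rng = np.random.default_rng(0)
K = 10
beta, eta = 0.01, 100.0

sparse = rng.dirichlet(np.full(K, beta), size=1000)  # small beta: sparse draws
smooth = rng.dirichlet(np.full(K, eta), size=1000)   # large eta: near-uniform draws

# A tiny concentration puts almost all mass on one component, while a
# large one spreads mass evenly -- which is why setting eta >> beta ties
# the words of a constraint tightly together.
print(sparse.max(axis=1).mean())  # close to 1
print(smooth.max(axis=1).mean())  # close to 1/K
```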
The PYP-HMM | The arrangement of customers at tables defines a clustering which exhibits a power-law behavior controlled by the hyperparameters a and b. |
The PYP-HMM | Sampling hyperparameters We treat the hyperparameters {(a_x, b_x), x ∈ (U, B, T, E, O)} as random variables in our model and infer their values.
The PYP-HMM | The result of this hyperparameter inference is that there are no user-tunable parameters in the model, an important feature that we believe helps explain its consistently high performance across test settings.
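The power-law clustering described above can be simulated directly. The function `pyp_seating`, the seed, and the particular (a, b) values below are illustrative assumptions, but the seating rule is the standard Pitman-Yor Chinese restaurant process.

```python
import random

def pyp_seating(n_customers, a, b, seed=0):
    """Seat customers by the Pitman-Yor Chinese restaurant process.

    a in [0, 1) is the discount and b > 0 the strength; the number of
    occupied tables grows roughly like n**a, the power law noted above.
    """
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers at table k
    for n in range(n_customers):
        # New table with probability (b + a * K) / (n + b), K = len(tables).
        if rng.random() * (n + b) < b + a * len(tables):
            tables.append(1)
        else:
            # Existing table k with probability (tables[k] - a) / (n + b).
            r = rng.uniform(0.0, n - a * len(tables))
            for k, c in enumerate(tables):
                r -= c - a
                if r <= 0.0:
                    tables[k] += 1
                    break
            else:  # guard against floating-point rounding
                tables[-1] += 1
    return tables

# A larger discount yields far more tables, i.e. a heavier power-law tail.
many = pyp_seating(2000, a=0.8, b=1.0)
few = pyp_seating(2000, a=0.1, b=1.0)
```

Because the hyperparameters control the clustering this strongly, inferring them from data (as the model does) rather than fixing them by hand matters.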
Experimental Setup | Training Regimes and Hyperparameters For each run of our model we perform three random restarts to convergence and select the posterior with the lowest final free energy.
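The restart-selection regime can be sketched in a few lines; `train_with_restarts` and `toy_fit` are hypothetical stand-ins, not the paper's actual trainer.

```python
import random

def train_with_restarts(fit, n_restarts=3):
    """Run `fit` from several seeds and keep the run with the lowest
    final free energy, mirroring the regime described above."""
    runs = [fit(seed) for seed in range(n_restarts)]
    return min(runs, key=lambda run: run[0])

# Toy stand-in: one "training run" returns (final_free_energy, posterior).
def toy_fit(seed):
    rng = random.Random(seed)
    return rng.uniform(10.0, 20.0), {"seed": seed}

energy, posterior = train_with_restarts(toy_fit)
```

Because local optima differ across seeds, keeping only the lowest-free-energy run makes results less sensitive to initialization.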
Experimental Setup | Dirichlet hyperparameters are set to 0.1. |
Model | Fixed hyperparameters are subscripted with zero. |