Conclusion and future work | In this paper all of the hyperparameters α_A were tied and varied simultaneously, but it is desirable to learn these from data as well. |
Conclusion and future work | Just before the camera-ready version of this paper was due, we developed a method for estimating the hyperparameters: we put a vague Gamma hyper-prior on each α_A and sampled it using Metropolis-Hastings with a sequence of increasingly narrow Gamma proposal distributions. This produced results for each model that are as good as or better than the best ones reported in Table 1. |
Word segmentation with adaptor grammars | We tied the Dirichlet Process concentration parameters α, and performed runs with α = 1, 10, 100 and 1000; apart from this, no attempt was made to optimize the hyperparameters. |
Word segmentation with adaptor grammars | It may be possible to correct this by “tuning” the grammar’s hyperparameters, but we did not attempt this here. |
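
The hyperparameter estimation mentioned in the conclusion (a vague Gamma hyper-prior on each α_A, sampled by Metropolis-Hastings with increasingly narrow Gamma proposals) might look roughly like the sketch below. It is not the authors' implementation. It assumes that each adapted nonterminal's cache is a Chinese restaurant process summarized by its table and customer counts, and that each proposal is a Gamma distribution whose mean is the current value. The function names, the prior settings (shape 0.1, scale 100) and the proposal shapes (1, 10, 100) are all hypothetical choices made for illustration.

```python
import math
import random


def log_gamma_pdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def log_crp_likelihood(alpha, n_tables, n_customers):
    """Log probability of a CRP seating arrangement, keeping only the
    terms that depend on the concentration parameter alpha."""
    return (n_tables * math.log(alpha)
            + math.lgamma(alpha) - math.lgamma(alpha + n_customers))


def log_posterior(alpha, n_tables, n_customers, prior_shape, prior_scale):
    """Unnormalized log posterior of alpha under a Gamma hyper-prior."""
    return (log_gamma_pdf(alpha, prior_shape, prior_scale)
            + log_crp_likelihood(alpha, n_tables, n_customers))


def resample_alpha(alpha, n_tables, n_customers,
                   proposal_shapes=(1.0, 10.0, 100.0),
                   prior_shape=0.1, prior_scale=100.0, rng=random):
    """One Metropolis-Hastings pass over a sequence of Gamma proposals.

    Each proposal has mean equal to the current alpha; a larger shape
    gives a narrower proposal. All default values are hypothetical.
    """
    for shape in proposal_shapes:
        proposal = rng.gammavariate(shape, alpha / shape)
        if proposal <= 0.0:  # guard against floating-point underflow
            continue
        # The Gamma proposal is asymmetric, so the Hastings correction
        # q(alpha | proposal) / q(proposal | alpha) is required.
        log_ratio = (
            log_posterior(proposal, n_tables, n_customers,
                          prior_shape, prior_scale)
            - log_posterior(alpha, n_tables, n_customers,
                            prior_shape, prior_scale)
            + log_gamma_pdf(alpha, shape, proposal / shape)
            - log_gamma_pdf(proposal, shape, alpha / shape)
        )
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            alpha = proposal
    return alpha
```

In a full sampler, a step like this would presumably be run for each adapted nonterminal between sweeps over the training data, using the current table and customer counts of that nonterminal's cache.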