Boosting-style algorithm | This can be used, for example, when the experts are derived from probabilistic models. |
Introduction | Furthermore, these methods typically assume the use of probabilistic models, which is not a requirement in our learning scenario. |
Introduction | Other ensembles of probabilistic models have also been considered in text and speech processing: forming a product of probabilistic models via the intersection of lattices (Mohri et al., 2008), combining the posteriors of probabilistic grammars trained with EM from different starting points (Petrov, 2010), and other, more intricate system-combination techniques in speech recognition (Fiscus, 1997). |
Introduction | In this paper we use a basic neural network architecture and a lexicalized probability model to create a powerful MT decoding feature. |
Model Variations | Formally, the probability model is: |
Neural Network Joint Model (NNJM) | far too sparse for standard probability models such as Kneser-Ney back-off (Kneser and Ney, 1995) or Maximum Entropy (Rosenfeld, 1996). |
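Since this snippet contrasts the NNJM's large context with classical back-off smoothing, a minimal sketch of interpolated Kneser-Ney bigram estimation may help make the reference concrete. This is an illustration only: the function name, the fixed discount d = 0.75, and the toy corpus are assumptions, not taken from the cited papers.

    from collections import Counter, defaultdict

    def kneser_ney_bigram(tokens, d=0.75):
        """Interpolated Kneser-Ney bigram probabilities with a fixed discount d."""
        bigram_counts = Counter(zip(tokens, tokens[1:]))
        history_counts = Counter(tokens[:-1])
        # Distinct continuations seen after each history word w1.
        continuations_after = defaultdict(set)
        # Distinct histories each word w2 follows (for the continuation probability).
        histories_before = defaultdict(set)
        for w1, w2 in bigram_counts:
            continuations_after[w1].add(w2)
            histories_before[w2].add(w1)
        total_bigram_types = len(bigram_counts)

        def p(w2, w1):
            # Continuation probability: in how many distinct contexts w2 appears,
            # normalized by the total number of bigram types.
            p_cont = len(histories_before[w2]) / total_bigram_types
            c1 = history_counts[w1]
            if c1 == 0:
                return p_cont  # unseen history: back off entirely
            # Interpolation weight: discounted mass freed from w1's continuations.
            lam = d * len(continuations_after[w1]) / c1
            return max(bigram_counts[(w1, w2)] - d, 0) / c1 + lam * p_cont

        return p

    tokens = "the cat sat on the mat the cat ran".split()
    p = kneser_ney_bigram(tokens)
    print(p("cat", "the"))

The continuation probability is what distinguishes Kneser-Ney from plain absolute discounting: a word that completes many distinct contexts receives more back-off mass than one that is frequent in only a single context.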
Abstract | However, these new methods lack the rich priors associated with probabilistic models. |
Adding Regularization | For example, if we are seeking the MLE of a probabilistic model parameterized by θ, p(x | θ), adding a regularization term r(θ) = Σᵢ θᵢ² |
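To make the regularized-MLE objective concrete, here is a minimal sketch that maximizes the log-likelihood minus an L2 penalty λ·Σᵢ θᵢ² by gradient ascent. The model choice (a univariate unit-variance Gaussian mean), the value of λ, the step size, and the function name are assumptions for illustration, not the setup of the quoted paper.

    import numpy as np

    def regularized_mle_gaussian_mean(x, lam=0.1, lr=0.1, steps=500):
        """Estimate the mean theta of a unit-variance Gaussian by maximizing
        sum_i log N(x_i; theta, 1) - lam * theta**2 with gradient ascent."""
        theta = 0.0
        n = len(x)
        for _ in range(steps):
            # d/dtheta of the log-likelihood: sum_i (x_i - theta)
            grad_loglik = np.sum(x - theta)
            # d/dtheta of the penalty -lam * theta**2: -2 * lam * theta
            grad_penalty = -2.0 * lam * theta
            theta += lr * (grad_loglik + grad_penalty) / n
        return theta

    x = np.random.default_rng(0).normal(loc=2.0, scale=1.0, size=100)
    print(regularized_mle_gaussian_mean(x))  # shrunk toward 0
    print(x.mean())                          # unregularized MLE

Setting the gradient to zero gives the closed form θ* = Σᵢ xᵢ / (n + 2λ), which the iteration approaches: the penalty shrinks the estimate toward zero, and its effect vanishes as n grows.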
Regularization Improves Topic Models | This is the typical evaluation for probabilistic models. |