BBC News Database | Unlike other unsupervised approaches where a set of latent variables is introduced, each defining a joint distribution on the space of keywords and image features, the relevance model captures the joint probability of images and annotated words directly, without requiring an intermediate clustering stage. |
BBC News Database | achieve competitive performance with latent variable models. |
BBC News Database | Each annotated image in the training set is treated as a latent variable.
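The relevance-model idea sketched above (joint probability of words and image features, with each training image acting as a latent variable) is commonly written in roughly the following form; this is a hedged reconstruction, assuming a cross-media relevance model in which `T` denotes the training set, `J` a training image, `w` a keyword, and `f_1, ..., f_n` the feature regions of the query image:

```latex
% Sketch of a cross-media relevance model estimate (assumed form,
% not necessarily the exact model of the source paper):
% the joint probability of a word w and image features f_1..f_n is
% obtained by marginalising over training images J, avoiding any
% intermediate clustering stage.
P(w, f_1, \dots, f_n) \;=\; \sum_{J \in T} P(J)\, P(w \mid J) \prod_{i=1}^{n} P(f_i \mid J)
```

Here the sum over `J` plays the role of the latent variable: no explicit cluster assignments are learned, and the per-image distributions `P(w|J)` and `P(f_i|J)` are typically smoothed maximum-likelihood estimates.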
Related Work | Another way of capturing co-occurrence information is to introduce latent variables linking image features with words. |
Abstract | We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised.
Challenges for Discriminative SMT | Instead we model the translation distribution with a latent variable for the derivation, which we marginalise out in training and decoding. |
Discriminative Synchronous Transduction | As the training data only provides source and target sentences, the derivations are modelled as a latent variable.
Discriminative Synchronous Transduction | Our findings echo those observed for latent variable log-linear models successfully used in monolingual parsing (Clark and Curran, 2007; Petrov et al., 2007). |
Discriminative Synchronous Transduction | This method has been demonstrated to be effective for (non-convex) log-linear models with latent variables (Clark and Curran, 2004; Petrov et al., 2007). |
Evaluation | Derivational ambiguity Table 1 shows the impact of accounting for derivational ambiguity in training and decoding. There are two options for training: we could use our latent variable model and optimise the probability of all derivations of the reference translation, or choose a single derivation that yields the reference and optimise its probability alone.
Evaluation | Max-translation decoding for the model trained on single derivations has only a small positive effect, while for the latent variable model the impact is much larger.
Introduction | Second, within this framework, we model the derivation, d, as a latent variable, p(e, d|f), which is marginalised out in training and decoding.
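The marginalisation described above can be sketched as follows; this is a hedged reconstruction assuming a globally normalised log-linear model over derivations, where `f` is the source sentence, `e` a translation, `d` a derivation, `Φ` a feature vector, and `λ` the model weights (the symbol names beyond `e`, `d`, `f` are illustrative, not taken from the source):

```latex
% Translation probability with the derivation d marginalised out:
% sum over all derivations whose yield is the translation e.
p(e \mid f) \;=\; \sum_{d \,:\, \mathrm{yield}(d) = e} p(e, d \mid f)

% Assumed log-linear parameterisation over derivations,
% globally normalised by the partition function Z(f):
p(e, d \mid f) \;=\; \frac{\exp\big(\lambda \cdot \Phi(d, e, f)\big)}{Z(f)},
\qquad
Z(f) \;=\; \sum_{e'} \sum_{d' \,:\, \mathrm{yield}(d') = e'} \exp\big(\lambda \cdot \Phi(d', e', f)\big)
```

Training on this objective optimises the total probability mass of all derivations of the reference translation, rather than that of any single derivation, which matches the latent-variable training option discussed in the Evaluation section.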