Conclusions | We address overfitting issues by cross-validating while climbing the likelihood of the training data, and propose solutions to increase the efficiency and accuracy of decoding. |
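To make the criterion concrete, one way to write a leave-one-part-out likelihood is sketched below; the notation is ours and only illustrates the idea, as the paper's exact estimator may differ in its details. The training corpus $D$ is split into $K$ parts $D_1,\dots,D_K$, and each part is scored with parameters estimated from the remaining parts:

$$\mathcal{L}_{\mathrm{CV}} = \sum_{k=1}^{K} \sum_{(e,f) \in D_k} \log p_{\theta(D \setminus D_k)}(e,f)$$

Under such an objective, a rule observed only within $D_k$ receives no probability from $\theta(D \setminus D_k)$, so memorising a single part of the data cannot raise the score.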
Introduction | Estimating such grammars under a Maximum Likelihood criterion is known to be plagued by strong overfitting, leading to degenerate estimates (DeNero et al., 2006). |
Introduction | In contrast, our learning objective not only avoids overfitting the training data but, most importantly, learns joint stochastic synchronous grammars that directly aim at generalising to unseen instances. |
Learning Translation Structure | On the other hand, estimating the parameters under Maximum-Likelihood Estimation (MLE) for the latent translation structure model is bound to overfit by memorising whole sentence-pairs, as discussed in (Mylonakis and Sima’an, 2010), with the resulting grammar estimate unable to generalise beyond the training data. |
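To see why memorisation maximises likelihood, consider the standard MLE objective over the training pairs, again in our own illustrative notation:

$$\hat{\theta} = \arg\max_{\theta} \prod_{(e,f) \in D} p_\theta(e,f)$$

If the grammar contains, for every training pair, a rule rewriting the start symbol directly to the whole pair, $S \to \langle e, f \rangle$, then concentrating the probability mass on these rules maximises the product: each training pair is derived in a single step with probability equal to its relative frequency, while unseen pairs receive probability zero.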
Learning Translation Structure | However, apart from overfitting towards long phrase-pairs, a grammar with millions of structural rules is also liable to overfit towards degenerate latent structures which, while fitting the training data well, have limited applicability to unseen sentences. |
Learning Translation Structure | The CV criterion, apart from avoiding overfitting, results in discarding the structural rules which are only found in a single part of the training corpus, leading to a more compact grammar while still retaining millions of structural rules that are more likely to generalise. |
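The filtering effect of the CV criterion can be sketched in a few lines of code. The sketch below is a minimal illustration under our own assumptions: the `extract_rules` function and the hard keep/discard decision are hypothetical simplifications of what is, in the paper, a soft estimation procedure.

```python
from collections import defaultdict

def cv_filter_rules(parts, extract_rules):
    """Keep only rules that receive cross-validated support.

    `parts` is a list of corpus parts; `extract_rules` maps a part
    to the set of rules extractable from it. When a part is held
    out, only rules licensed by the remaining parts may explain it,
    so a rule occurring in a single part is never supported.
    """
    support = defaultdict(int)
    for k, held_out in enumerate(parts):
        # Rules available for explaining the held-out part k:
        # everything extractable from the other parts.
        licensed = set()
        for j, part in enumerate(parts):
            if j != k:
                licensed |= extract_rules(part)
        # A rule counts only if it occurs in the held-out part
        # AND is licensed by the remaining parts.
        for rule in extract_rules(held_out):
            if rule in licensed:
                support[rule] += 1
    return {rule for rule, count in support.items() if count > 0}
```

A rule found in only one part is absent from `licensed` exactly when its own part is held out, so it ends up with zero support and is discarded, which is the compaction effect described above.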
Related Work | We show that a translation system based on such a joint model can perform competitively with conditional probability models when it is augmented with a rich latent hierarchical structure trained so as to avoid overfitting. |
Related Work | Cohn and Blunsom (2009) sample rules of the form proposed in (Galley et al., 2004) from a Bayesian model, employing Dirichlet Process priors favouring smaller rules to avoid overfitting. |