Index of papers in Proc. ACL 2008 that mention
  • overfitting
Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel
Conclusion
On top of these hard constraints, the sparse prior of VB helps make the model less prone to overfitting to infrequent phrase pairs, and thus improves the quality of the phrase pairs the model learns.
Experiments
Using EM, because of overfitting, AER drops first and increases again as the number of iterations varies from 1 to 10.
Experiments
The gain is especially large on the test data set, indicating VB is less prone to overfitting.
Introduction
In this direction, Expectation Maximization at the phrase level was proposed by Marcu and Wong (2002), who, however, experienced two major difficulties: computational complexity and controlling overfitting.
Introduction
Computational complexity arises from the exponentially large number of decompositions of a sentence pair into phrase pairs; overfitting is a problem because as EM attempts to maximize the likelihood of its training data, it prefers to directly explain a sentence pair with a single phrase pair.
Introduction
We address the tendency of EM to overfit by using Bayesian methods, where sparse priors assign greater mass to parameter vectors with fewer nonzero values, therefore favoring shorter, more frequent phrases.
Variational Bayes for ITG
If we do not put any constraint on the distribution of phrases, EM overfits the data by memorizing every sentence pair.
Overfitting is mentioned in 7 sentences in this paper.
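The EM-versus-VB contrast running through the extracts above lends itself to a short illustration. Below is a minimal sketch, not the paper's implementation (the function names and toy counts are illustrative): it compares EM's relative-frequency update for multinomial phrase-pair probabilities with the mean-field variational Bayes update under a symmetric sparse Dirichlet prior, whose digamma-based weights discount low-count phrase pairs.

```python
import numpy as np
from scipy.special import digamma

def em_update(counts):
    """EM M-step: plain relative-frequency normalization. A rare phrase
    pair that explains a whole sentence keeps its full expected count,
    which is how EM comes to memorize sentence pairs."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def vb_update(counts, alpha=0.01):
    """Mean-field VB update under a symmetric Dirichlet(alpha) prior:
    weight_k = exp(digamma(c_k + alpha) - digamma(sum_j c_j + K*alpha)).
    Since exp(digamma(x)) is roughly x - 0.5 for x above 1, small
    expected counts are discounted far more heavily than large ones,
    and the resulting weights deliberately sum to less than one."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum() + counts.size * alpha
    return np.exp(digamma(counts + alpha) - digamma(total))

# Expected counts for three phrase pairs: one frequent, two singletons.
counts = [50.0, 1.0, 1.0]
print(em_update(counts))  # singletons keep their full relative frequency
print(vb_update(counts))  # singletons shrink sharply toward zero
```

With a concentration parameter below one, the prior favors parameter vectors with fewer effectively nonzero entries, matching the sparsity argument quoted from the paper's introduction.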
Saha, Sujan Kumar and Mitra, Pabitra and Sarkar, Sudeshna
Abstract
These methods tend to overfit when the available training corpus is limited, especially if the number of features is large or the number of values for a feature is large.
Conclusion
This is probably due to a reduction in overfitting.
Introduction
In an effort to reduce overfitting , they use a combination of a Gaussian prior and early-stopping.
Introduction
This is due to overfitting, which is a serious problem in most NLP tasks for resource-poor languages, where annotated data is scarce.
Maximum Entropy Based Model for Hindi NER
From the above discussion, it is clear that the system suffers from overfitting if a large number of features are used to train the system.
Overfitting is mentioned in 5 sentences in this paper.
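The remedies quoted above, a Gaussian prior combined with early stopping, correspond to standard regularization machinery. A minimal sketch, assuming scikit-learn (1.1 or later) and synthetic data standing in for the Hindi NER corpus: a Gaussian prior on MaxEnt weights is equivalent to an L2 penalty, and early stopping halts training once held-out accuracy stops improving.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, SGDClassifier

rng = np.random.default_rng(0)
# Toy stand-in for a small NER training set with many features:
# few examples, many dimensions, the regime the paper associates
# with overfitting.
X = rng.normal(size=(200, 500))
y = (X[:, 0] > 0).astype(int)

# A Gaussian prior on MaxEnt weights is equivalent to an L2 penalty;
# smaller C means a tighter prior, i.e. stronger shrinkage to zero.
maxent = LogisticRegression(penalty="l2", C=0.5, max_iter=500).fit(X, y)

# Early stopping: hold out a validation split and stop once its score
# has not improved for n_iter_no_change epochs.
maxent_es = SGDClassifier(loss="log_loss", penalty="l2", alpha=1e-3,
                          early_stopping=True, validation_fraction=0.2,
                          n_iter_no_change=5, random_state=0).fit(X, y)

print(maxent.score(X, y), maxent_es.score(X, y))
```

Reducing the feature set, the paper's other remedy, attacks the same problem from the input side: with fewer parameters to fit, the model has less capacity to memorize a scarce annotated corpus.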