Abstract | Supervised sequence-labeling systems in natural language processing often suffer from data sparsity because they use word types as features in their prediction tasks.
Introduction | Data sparsity and high dimensionality are the twin curses of statistical natural language processing (NLP). |
Introduction | The negative effects of data sparsity have been well-documented in the NLP literature. |
Introduction | Our technique is particularly well-suited to handling data sparsity because performance on rare words can be improved by supplementing the training data with additional unannotated text containing more examples of those words.
Related Work | Sophisticated smoothing techniques like modified Kneser-Ney and Katz smoothing (Chen and Goodman, 1996) smooth together the predictions of unigram, bigram, trigram, and potentially higher n-gram sequences to obtain accurate probability estimates in the face of data sparsity.
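As a rough illustration of the idea (simple linear interpolation with hand-picked weights, not the actual modified Kneser-Ney or Katz recipes, whose discounts and weights are estimated from the data), combining unigram, bigram, and trigram estimates can be sketched as:

```python
from collections import Counter

def train_counts(tokens):
    # Collect unigram, bigram, and trigram counts from a token stream.
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return uni, bi, tri

def interpolated_prob(w3, w1, w2, uni, bi, tri, lambdas=(0.5, 0.3, 0.2)):
    # P(w3 | w1, w2) as a weighted mix of trigram, bigram, and unigram
    # maximum-likelihood estimates. The lambdas here are illustrative
    # constants; real smoothing methods tune them (or per-context
    # discounts) on held-out data.
    total = sum(uni.values())
    p_uni = uni[w3] / total
    p_bi = bi[(w2, w3)] / uni[w2] if uni[w2] else 0.0
    p_tri = tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0
    l3, l2, l1 = lambdas
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

tokens = "the cat sat on the mat the cat ran".split()
uni, bi, tri = train_counts(tokens)
p = interpolated_prob("sat", "the", "cat", uni, bi, tri)
```

Because the lower-order estimates are nonzero even for unseen trigrams, the interpolated probability never collapses to zero, which is exactly how smoothing counters sparse higher-order counts.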
Smoothing Natural Language Sequences | For supervised sequence-labeling problems in NLP, the most important “complicating factor” that we seek to avoid through smoothing is the data sparsity associated with word-based representations. |
Smoothing Natural Language Sequences | Importantly, we seek distributional representations that will provide features that are common in both training and test data, to avoid data sparsity.
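A minimal sketch of this idea, with a hypothetical hand-built cluster map standing in for representations that would normally be induced from large unannotated corpora: two word types that never co-occur in the training data can still share a cluster feature, so the feature is common to both training and test data.

```python
# Hypothetical cluster assignments; in practice these would be induced
# distributionally from unannotated text, not written by hand.
clusters = {
    "cat": "ANIMAL", "dog": "ANIMAL",
    "ran": "MOTION", "walked": "MOTION",
}

def features(word):
    # Back off from the sparse word-type identity to a denser cluster
    # feature: rare or unseen words still fire a shared feature as long
    # as they appeared in the unannotated data used to build clusters.
    feats = {"word=" + word: 1.0}
    if word in clusters:
        feats["cluster=" + clusters[word]] = 1.0
    return feats
```

Even if only "cat" appears in annotated training data, a test-time "dog" activates the same `cluster=ANIMAL` feature, so the learned weight transfers.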
Results and Discussion | The curves suggest that the data sparseness problem could be the reason for the differences in performance. |
Results and Discussion | The curve for the latent variable approach demonstrates that it did not suffer from a severe data sparseness problem.
Results and Discussion | In addition, the training data of the English task is much smaller than for the Chinese task, which could make the models more sensitive to data sparseness.
Abstract | Most previously developed systems are CFG-based and make extensive use of a treepath feature, which suffers from data sparsity due to its use of explicit tree configurations. |
Abstract | CCG affords ways to augment treepath-based features to overcome these data sparsity issues. |
Potential Advantages to using CCG | Because there are a number of different treepaths that correspond to a single relation (figure 2), this approach can suffer from data sparsity.