Abstract | Supervised sequence-labeling systems in natural language processing often suffer from data sparsity because they use word types as features in their prediction tasks.
Introduction | Data sparsity and high dimensionality are the twin curses of statistical natural language processing (NLP). |
Introduction | The negative effects of data sparsity have been well-documented in the NLP literature. |
Introduction | Our technique is particularly well-suited to handling data sparsity because performance on rare words can be improved by supplementing the training data with additional unannotated text containing more examples of those words.
Related Work | Sophisticated smoothing techniques like modified Kneser-Ney and Katz smoothing (Chen and Goodman, 1996) smooth together the predictions of unigram, bigram, trigram, and potentially higher n-gram sequences to obtain accurate probability estimates in the face of data sparsity.
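As a rough illustration of the idea (simple linear interpolation with hand-picked weights, not the actual modified Kneser-Ney or Katz recipes, whose discounts and weights are estimated from the data), combining unigram, bigram, and trigram estimates can be sketched as:

```python
from collections import Counter

def train_counts(tokens):
    # Collect unigram, bigram, and trigram counts from a token stream.
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return uni, bi, tri

def interpolated_prob(w3, w1, w2, uni, bi, tri, lambdas=(0.5, 0.3, 0.2)):
    # P(w3 | w1, w2) as a weighted mix of trigram, bigram, and unigram
    # maximum-likelihood estimates. The lambdas here are illustrative
    # constants; real smoothing methods tune them (or per-context
    # discounts) on held-out data.
    total = sum(uni.values())
    p_uni = uni[w3] / total
    p_bi = bi[(w2, w3)] / uni[w2] if uni[w2] else 0.0
    p_tri = tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0
    l3, l2, l1 = lambdas
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

tokens = "the cat sat on the mat the cat ran".split()
uni, bi, tri = train_counts(tokens)
p = interpolated_prob("sat", "the", "cat", uni, bi, tri)
```

Because the lower-order estimates are nonzero even for unseen trigrams, the interpolated probability never collapses to zero, which is exactly how smoothing counters sparse higher-order counts.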
Smoothing Natural Language Sequences | For supervised sequence-labeling problems in NLP, the most important “complicating factor” that we seek to avoid through smoothing is the data sparsity associated with word-based representations. |
Smoothing Natural Language Sequences | Importantly, we seek distributional representations that will provide features that are common in both training and test data, to avoid data sparsity.
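A minimal sketch of this idea, with a hypothetical hand-built cluster map standing in for representations that would normally be induced from large unannotated corpora: two word types that never co-occur in the training data can still share a cluster feature, so the feature is common to both training and test data.

```python
# Hypothetical cluster assignments; in practice these would be induced
# distributionally from unannotated text, not written by hand.
clusters = {
    "cat": "ANIMAL", "dog": "ANIMAL",
    "ran": "MOTION", "walked": "MOTION",
}

def features(word):
    # Back off from the sparse word-type identity to a denser cluster
    # feature: rare or unseen words still fire a shared feature as long
    # as they appeared in the unannotated data used to build clusters.
    feats = {"word=" + word: 1.0}
    if word in clusters:
        feats["cluster=" + clusters[word]] = 1.0
    return feats
```

Even if only "cat" appears in annotated training data, a test-time "dog" activates the same `cluster=ANIMAL` feature, so the learned weight transfers.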
Results and Discussion | The curves suggest that the data sparseness problem could be the reason for the differences in performance. |
Results and Discussion | The curve for the latent variable approach demonstrates that it did not suffer from a severe data sparseness problem.
Results and Discussion | In addition, the training data of the English task is much smaller than for the Chinese task, which could make the models more sensitive to data sparseness.
Abstract | Most previously developed systems are CFG-based and make extensive use of a treepath feature, which suffers from data sparsity due to its use of explicit tree configurations. |
Abstract | CCG affords ways to augment treepath-based features to overcome these data sparsity issues. |
Potential Advantages to using CCG | Because there are a number of different treepaths that correspond to a single relation (figure 2), this approach can suffer from data sparsity.