Abstract | We demonstrate that distributional representations of word types, trained on unannotated text, can be used to improve the performance of supervised sequence labelers on rare words. |
Introduction | We investigate the use of distributional representations, which model the probability distribution of a word’s context, as a technique for finding smoothed representations of word sequences. |
Introduction | That is, we use the distributional representations to share information across unannotated examples of the same word type. |
Introduction | We then compute features of the distributional representations and provide them as input to our supervised sequence labelers. |
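Introduction | As a concrete illustration, the sketch below estimates a context distribution for each word type from unannotated text and turns it into discrete features for a sequence labeler; the window size, function names, and feature format are illustrative assumptions rather than the exact construction used in this work. |

```python
from collections import Counter, defaultdict

def context_distributions(sentences, window=1):
    """Estimate a distributional representation for each word type: the
    empirical distribution over words occurring within `window` positions
    of it in unannotated text. (Illustrative sketch; the representations
    in this work may be built differently.)"""
    counts = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][sent[j]] += 1
    dists = {}
    for word, ctx in counts.items():
        total = sum(ctx.values())
        dists[word] = {c: n / total for c, n in ctx.items()}
    return dists

def top_context_features(dist, k=3):
    """Turn a word type's context distribution into discrete features
    (its k most probable context words), suitable as extra inputs to a
    supervised sequence labeler."""
    ranked = sorted(dist.items(), key=lambda kv: -kv[1])
    return ["ctx=" + w for w, _ in ranked[:k]]

# Every token of the same word type receives the same features, so
# information is shared across annotated and unannotated examples.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
reps = context_distributions(corpus)
print(top_context_features(reps["sat"]))  # e.g. ['ctx=cat', 'ctx=dog']
```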
Smoothing Natural Language Sequences | Importantly, we seek distributional representations that provide features common to both the training and test data, in order to avoid data sparsity. |
Smoothing Natural Language Sequences | In the next three sections, we develop three techniques for smoothing text using distributional representations. |
Smoothing Natural Language Sequences | This gives greater weight to words with more idiosyncratic distributions and may improve the informativeness of a distributional representation. |
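Smoothing Natural Language Sequences | One plausible realization of such a weighting, assumed here purely for illustration, is a TF-IDF-style rescaling of the raw context counts that downweights context words co-occurring with many different word types: |

```python
import math
from collections import Counter

def idf_weighted_contexts(counts):
    """Rescale raw context counts by an inverse-document-frequency style
    factor, so that context words co-occurring with fewer word types
    (i.e., more idiosyncratic distributions) receive greater weight.
    This TF-IDF-style scheme is an illustrative assumption, not
    necessarily the weighting used in this work. `counts` maps each
    word type to a Counter over its context words."""
    num_types = len(counts)
    doc_freq = Counter()  # number of word types each context word appears with
    for ctx in counts.values():
        doc_freq.update(ctx.keys())
    return {
        word: {c: n * math.log(num_types / doc_freq[c]) for c, n in ctx.items()}
        for word, ctx in counts.items()
    }
```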