Introduction | We show that simultaneously learning syllable structure and collocations improves word segmentation accuracy compared to models that learn these independently. |
Introduction | This paper applies adaptor grammars to word segmentation and morphological acquisition. |
Word segmentation with adaptor grammars | We now turn to linguistic applications of adaptor grammars, specifically to models of unsupervised word segmentation. |
Word segmentation with adaptor grammars | Table 1: Word segmentation f-score results for all models, as a function of DP concentration parameter α. |
Word segmentation with adaptor grammars | Table 1 summarizes the word segmentation f-scores for all models described in this paper. |
A single character is used if no suffix occurs 10 times. | In a full understanding system, output of the word segmenter would be passed to morphological and local syntactic processing. |
A single character is used if no suffix occurs 10 times. | Because standard models of morphological learning don’t address the interaction with word segmentation, WordEnds does a simple version of this repair process using a placeholder algorithm called Mini-morph. |
Previous work | Word segmentation experiments by Christiansen and Allen (1997) and Harrington et al. |
The task in more detail | The datasets are informal conversations in which debatable word segmentations are rare. |
The task in more detail | A theory of word segmentation must explain how affixes differ from freestanding function words. |
Previous Work | Nakagawa (2004) combines word-level and character-level information for Chinese and Japanese word segmentation. |
Previous Work | (of all words in a given sentence) and the POS tagging (of the known words) is based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters. |
Previous Work | Their experimental results show that the method achieves higher accuracy than state-of-the-art methods for Chinese and Japanese word segmentation. |
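
The Viterbi search over a lattice of all possible word segmentations mentioned above can be illustrated with a minimal sketch. It is not the implementation of any cited system: the function name `viterbi_segment`, the unigram log-probability lexicon, the `max_len` limit, and the flat `unk_logprob` penalty for unknown single characters are all assumptions made for illustration, and the sketch leaves out POS tagging and character classification entirely.

```python
import math


def viterbi_segment(text, word_logprob, max_len=4, unk_logprob=-20.0):
    """Return the highest-scoring segmentation of `text` under a unigram model.

    word_logprob: dict mapping known words to log-probabilities (hypothetical lexicon).
    max_len:      longest candidate word considered, in characters (assumption).
    unk_logprob:  score given to any unknown single character (assumption).
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score of any segmentation of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word ending at i
    best[0] = 0.0

    for end in range(1, n + 1):
        # Each (start, end) pair is one edge in the segmentation lattice.
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_logprob:
                score = word_logprob[word]
            elif len(word) == 1:
                score = unk_logprob  # fallback keeps every position reachable
            else:
                continue
            candidate = best[start] + score
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Follow back-pointers from the end of the string to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Toy lexicon with made-up probabilities, for demonstration only.
toy_lexicon = {
    "the": math.log(0.5),
    "dog": math.log(0.3),
    "do": math.log(0.1),
    "g": math.log(0.05),
}
segmentation = viterbi_segment("thedog", toy_lexicon)
```

Because the score of a path is a sum of per-word scores, the best segmentation ending at each position depends only on the best segmentations of shorter prefixes, which is what makes the dynamic program valid. A joint model of the kind described above would attach tags to lattice edges and score transitions between them as well.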