Abstract | In this paper, we propose a novel neural network model for Chinese word segmentation called Max-Margin Tensor Neural Network (MMTNN). |
Abstract | Despite Chinese word segmentation being a specific case, MMTNN can be easily generalized and applied to other sequence labeling tasks. |
Conventional Neural Network | Formally, in the Chinese word segmentation task, we have a character dictionary D of size |D|. Unless otherwise specified, the character dictionary is extracted from the training set and unknown characters are mapped to a special symbol that is not used elsewhere. |
Conventional Neural Network | In Chinese word segmentation, the most prevalent tag set T is the BMES tag set, which uses 4 tags to carry word boundary information. |
Conventional Neural Network | (2013) modeled Chinese word segmentation as a series of |
Introduction | (2011) to Chinese word segmentation and POS tagging and proposed a perceptron-style algorithm to speed up the training process with negligible loss in performance. |
Introduction | We evaluate the performance of Chinese word segmentation on the PKU and MSRA benchmark datasets from the second International Chinese Word Segmentation Bakeoff (Emerson, 2005), which are commonly used for evaluation of Chinese word segmentation. |
Introduction | We propose a Max-Margin Tensor Neural Network for Chinese word segmentation without feature engineering. |
Max-Margin Tensor Neural Network | In Chinese word segmentation, a proper modeling of the tag-tag interaction, tag-character interaction and character-character interaction is very important. |
Abstract | The focus of recent studies on Chinese word segmentation, part-of-speech (POS) tagging and parsing has been shifting from words to characters. |
Conclusion | A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-speech Tagging. |
Conclusion | Word Lattice Reranking for Chinese Word Segmentation |
Conclusion | An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging. |
Introduction | In recent years, the focus of research on Chinese word segmentation, part-of-speech (POS) tagging and parsing has been shifting from words toward characters. |
Abstract | Automatic extraction of new words is an indispensable precursor to many NLP tasks such as Chinese word segmentation, named entity extraction, and sentiment analysis. |
Experiment | The posts were then part-of-speech tagged using a Chinese word segmentation tool named ICTCLAS (Zhang et al., 2003). |
Introduction | Automatic extraction of new words is indispensable to many tasks such as Chinese word segmentation, machine translation, named entity extraction, question answering, and sentiment analysis. |
Introduction | New word detection is one of the most critical issues in Chinese word segmentation. |
Methodology | Obviously, in order to obtain the value of 3(wi), some particular Chinese word segmentation tool is required. |
Character-Level Dependency Tree | Zhao (2009) was the first to study character-level dependencies, arguing that since no consistent word boundaries exist over Chinese word segmentation, dependency-based representations of word structures serve as a good alternative for Chinese word segmentation. |
Character-Level Dependency Tree | (2012) proposed a joint model for Chinese word segmentation, POS-tagging and dependency parsing, studying the influence of the joint model and character features on parsing. Their model is extended from the arc-standard transition-based model, and can be regarded as an alternative to the arc-standard model of our work when pseudo intra-word dependencies are used. |
Introduction | First, character-level trees circumvent the issue that no universal standard exists for Chinese word segmentation. |
Introduction | In the well-known Chinese word segmentation bakeoff tasks, for example, different segmentation standards have been used by different data sets (Emerson, 2005). |
Abstract | Both the Omni-word feature and the soft constraint make better use of sentence information and minimize the influences caused by Chinese word segmentation and parsing. |
Introduction | The lack of orthographic words makes Chinese word segmentation difficult. |
Related Work | (2008; 2010) also pointed out that, due to the inaccuracy of Chinese word segmentation and parsing, the tree kernel based approach is inappropriate for Chinese relation extraction. |
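
The BMES tag set mentioned in the Conventional Neural Network excerpt can be illustrated with a small sketch. It assumes the usual reading of the four tags (B = beginning of a multi-character word, M = middle, E = end, S = single-character word). The function names `words_to_bmes` and `bmes_to_words` are hypothetical and do not come from any of the excerpted papers.

```python
def words_to_bmes(words):
    """Turn a list of segmented words into one BMES tag per character."""
    tags = []
    for word in words:
        if not word:
            continue  # skip empty strings so no stray tags are produced
        if len(word) == 1:
            tags.append("S")
        else:
            # B, then M for every interior character, then E
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars, tags):
    """Rebuild words from characters and their BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # keep a trailing unfinished word instead of dropping it
        words.append(current)
    return words


segmented = ["我", "喜欢", "自然语言"]
tags = words_to_bmes(segmented)
restored = bmes_to_words("".join(segmented), tags)
```

Seen this way, segmentation becomes a sequence labeling problem: a model predicts one tag per character, and the word boundaries follow deterministically from the tags.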