SciSurf: Index of "Chinese word" in Proc. ACL 2014

Index of papers in Proc. ACL 2014 that mention

Chinese word

Seen in text as:

Chinese word (42)
Chinese Word (10)
Chinese words (9)

Seen in 56 sentences in 7 papers.

1. Max-Margin Tensor Neural Network for Chinese Word Segmentation

Pei, Wenzhe and Ge, Tao and Chang, Baobao

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	In this paper, we propose a novel neural network model for Chinese word segmentation called Max-Margin Tensor Neural Network (MMTNN).
Abstract	Despite Chinese word segmentation being a specific case, MMTNN can be easily generalized and applied to other sequence labeling tasks.
Conventional Neural Network	Formally, in the Chinese word segmentation task, we have a character dictionary D of size |D|. Unless otherwise specified, the character dictionary is extracted from the training set and unknown characters are mapped to a special symbol that is not used elsewhere.
Conventional Neural Network	In Chinese word segmentation, the most prevalent tag set T is BMES tag set, which uses 4 tags to carry word boundary information.
Conventional Neural Network	(2013) modeled Chinese word segmentation as a series of
Introduction	(2011) to Chinese word segmentation and POS tagging and proposed a perceptron-style algorithm to speed up the training process with negligible loss in performance.
Introduction	We evaluate the performance of Chinese word segmentation on the PKU and MSRA benchmark datasets in the second International Chinese Word Segmentation Bakeoff (Emerson, 2005) which are commonly used for evaluation of Chinese word segmentation.
Introduction	We propose a Max-Margin Tensor Neural Network for Chinese word segmentation without feature engineering.
Max-Margin Tensor Neural Network	In Chinese word segmentation, a proper modeling of the tag-tag interaction, tag-character interaction and character-character interaction is very important.

Chinese word is mentioned in 16 sentences in this paper.
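
The BMES tag set quoted above uses four tags to carry word boundary information. As a rough illustration only (not code from the paper), the sketch below assumes the usual reading of the tags: B for the first character of a multi-character word, M for a middle character, E for the last character, and S for a single-character word. The function names and the sample sentence are hypothetical.

```python
def words_to_bmes(words):
    """Map a segmented sentence (list of words) to one BMES tag per character."""
    tags = []
    for word in words:
        if not word:
            continue  # ignore empty strings rather than emit a bogus "BE"
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character, any middle characters, last character
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars, tags):
    """Recover words from characters and their BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words


segmented = ["我", "喜欢", "自然语言"]  # hypothetical segmented sentence
tags = words_to_bmes(segmented)
print(tags)
print(bmes_to_words("".join(segmented), tags))
```

Under this encoding, segmentation becomes a per-character labeling problem, which is the sense in which the excerpts describe it as a sequence labeling task.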

2. A Joint Graph Model for Pinyin-to-Chinese Conversion with Typo Correction

Jia, Zhongye and Zhao, Hai

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	We will also report the conversion error rate (ConvER) proposed by (Zheng et al., 2011a), which is the ratio of the number of mistyped pinyin word that is not converted to the right Chinese word over the total number of mistyped pinyin words.
Experiments	According to our empirical observation, emission probabilities are mostly 1 since most Chinese words have unique pronunciation.
Experiments	, 201 1a) performed an experiment that 2,000 sentences of 11,968 Chinese words were entered by 5 native speakers.
Introduction	However, every Chinese word inputted into computer or cellphone cannot be typed through one-to-one mapping of key-to-letter inputting directly, but has to go through an IME as there are thousands of Chinese characters for inputting while only 26 letter keys are available in the keyboard.
Pinyin Input Method Model	Without word delimiters, linguists have argued on what a Chinese word really is for a long time and that is why there is always a primary word segmentation treatment in most Chinese language processing tasks (Zhao et al., 2006; Huang and Zhao, 2007; Zhao and Kit, 2008; Zhao et al., 2010; Zhao and Kit, 2011; Zhao et al., 2013).
Pinyin Input Method Model	A Chinese word may contain from 1 to over 10 characters due to different word segmentation conventions.
Pinyin Input Method Model	Nevertheless, pinyin syllable segmentation is a much easier problem compared to Chinese word segmentation.

Chinese word is mentioned in 10 sentences in this paper.
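
The conversion error rate (ConvER) quoted above is a simple ratio, so a small worked example may help. The sketch below is an illustration under assumptions, not the authors' evaluation code: the record layout (`typed`, `intended`, `converted`, `reference`), the function name, and the sample data are all hypothetical, and returning 0.0 when there are no mistyped words is a choice made here.

```python
def conversion_error_rate(records):
    """ConvER = (mistyped pinyin words NOT converted to the right Chinese word)
               / (all mistyped pinyin words).

    Each record is a dict with hypothetical keys:
      typed      pinyin as actually entered
      intended   pinyin the user meant to enter
      converted  Chinese word produced by the input method
      reference  Chinese word the user wanted
    """
    mistyped = [r for r in records if r["typed"] != r["intended"]]
    if not mistyped:
        return 0.0  # assumption: define the rate as 0 when nothing was mistyped
    wrong = sum(1 for r in mistyped if r["converted"] != r["reference"])
    return wrong / len(mistyped)


sample = [
    # typed correctly, so it is not counted in ConvER at all
    {"typed": "nihao", "intended": "nihao", "converted": "你好", "reference": "你好"},
    # mistyped, but still converted to the right word
    {"typed": "shijei", "intended": "shijie", "converted": "世界", "reference": "世界"},
    # mistyped and converted to the wrong word
    {"typed": "pengyuo", "intended": "pengyou", "converted": "盆友", "reference": "朋友"},
]
print(conversion_error_rate(sample))
```

Only the two mistyped entries enter the denominator, and one of them is converted incorrectly, so the rate for this sample is one half.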

3. Chinese Morphological Analysis with Character-level POS Tagging

Shen, Mo and Liu, Hongxiao and Kawahara, Daisuke and Kurohashi, Sadao

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	The focus of recent studies on Chinese word segmentation, part-of-speech (POS) tagging and parsing has been shifting from words to characters.
Conclusion	A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-speech Tagging.
Conclusion	Word Lattice Reranking for Chinese Word Segmentation
Conclusion	An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging.
Introduction	In recent years, the focus of research on Chinese word segmentation, part-of-speech (POS) tagging and parsing has been shifting from words toward characters.

Chinese word is mentioned in 9 sentences in this paper.

4. Omni-word Feature and Soft Constraint for Chinese Relation Extraction

Chen, Yanping and Zheng, Qinghua and Zhang, Wei

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Both Omni-word feature and soft constraint make a better use of sentence information and minimize the influences caused by Chinese word segmentation and parsing.
Feature Construction	Furthermore, for a single Chinese word, occurrences of 4 characters are frequent.
Feature Construction	First, the specificity of Chinese word-formation indicates that the subphrases of Chinese word (or phrase) are also informative.
Introduction	The difficulty of Chinese IE is that Chinese words are written next to each other without delimiter in between.
Introduction	Lacking of orthographic word makes Chinese word segmentation difficult.
Related Work	(2008; 2010) also pointed out that, due to the inaccuracy of Chinese word segmentation and parsing, the tree kernel based approach is inappropriate for Chinese relation extraction.

Chinese word is mentioned in 6 sentences in this paper.

5. New Word Detection for Sentiment Analysis

Huang, Minlie and Ye, Borui and Wang, Yichen and Chen, Haiqiang and Cheng, Junjun and Zhu, Xiaoyan

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Automatic extraction of new words is an indispensable precursor to many NLP tasks such as Chinese word segmentation, named entity extraction, and sentiment analysis.
Experiment	The posts were then part-of-speech tagged using a Chinese word segmentation tool named ICTCLAS (Zhang et al., 2003).
Introduction	Automatic extraction of new words is indispensable to many tasks such as Chinese word segmentation, machine translation, named entity extraction, question answering, and sentiment analysis.
Introduction	New word detection is one of the most critical issues in Chinese word segmentation.
Introduction	Statistics show that more than 1000 new Chinese words appear every
Methodology	Obviously, in order to obtain the value of 3(wi), some particular Chinese word segmentation tool is required.

Chinese word is mentioned in 6 sentences in this paper.

6. Character-Level Chinese Dependency Parsing

Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Character-Level Dependency Tree	The results demonstrate that the structures of Chinese words are not difficult to predict, and confirm the fact that Chinese word structures have some common syntactic patterns.
Character-Level Dependency Tree	Zhao (2009) was the first to study character-level dependencies; they argue that since no consistent word boundaries exist over Chinese word segmentation, dependency-based representations of word structures serve as a good alternative for Chinese word segmentation.
Character-Level Dependency Tree	(2012) proposed a joint model for Chinese word segmentation, POS-tagging and dependency parsing, studying the influence of joint model and character features for parsing. Their model is extended from the arc-standard transition-based model, and can be regarded as an alternative to the arc-standard model of our work when pseudo intra-word dependencies are used.
Introduction	First, character-level trees circumvent the issue that no universal standard exists for Chinese word segmentation.
Introduction	In the well-known Chinese word segmentation bakeoff tasks, for example, different segmentation standards have been used by different data sets (Emerson, 2005).

Chinese word is mentioned in 5 sentences in this paper.

7. Toward Better Chinese Word Segmentation for SMT via Bilingual Constraints

Zeng, Xiaodong and Chao, Lidia S. and Wong, Derek F. and Trancoso, Isabel and Tian, Liang

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	This study investigates on building a better Chinese word segmentation model for statistical machine translation.
Experiments	All other nine CWS models outperform the CS baseline which does not try to identify Chinese words at all.
Introduction	They leverage such mappings to either constitute a Chinese word dictionary for maximum-matching segmentation (Xu et al., 2004), or form labeled data for training a sequence labeling model (Paul et al., 2011).
Introduction	This paper proposes an alternative Chinese Word Segmentation (CWS) model adapted to the SMT task, which seeks not only to maintain the advantages of a monolingual supervised model, having hand-annotated linguistic knowledge, but also to assimilate the relevant bilingual segmenta-

Chinese word is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

CRFs (30)
segmentations (18)
treebank (16)