Index of papers in Proc. ACL 2012 that mention
  • Chinese word
Sun, Xu and Wang, Houfeng and Li, Wenjie
Abstract
We present a joint model for Chinese word segmentation and new word detection.
Introduction
The major problem of Chinese word segmentation is the ambiguity.
Introduction
In this paper, we present high dimensional new features, including word-based features and enriched edge (label-transition) features, for the joint modeling of Chinese word segmentation (CWS) and new word detection (NWD).
Introduction
We propose a joint model for Chinese word segmentation and new word detection.
Related Work
Conventional approaches to Chinese word segmentation treat the problem as a sequential labeling task (Xue, 2003; Peng et al., 2004; Tseng et al., 2005; Asahara et al., 2005; Zhao et al., 2010).
System Architecture
This phenomenon will also undermine the performance of Chinese word segmentation.
System Architecture
The B, I, E labels have been widely used in previous work of Chinese word segmentation (Sun et al., 2009b).
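As an illustration of the sequential labeling view described in the excerpts above, here is a minimal sketch of converting between a segmented sentence and per-character B, I, E labels. It is not taken from the paper: the function names are hypothetical, and tagging a single-character word as "B" is an assumption made here (some schemes add a separate label for that case).

```python
def words_to_tags(words):
    """Map a list of words to one B/I/E tag per character.

    Assumption: a single-character word is tagged "B".
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("B")
        else:
            # First character B, last character E, everything between I.
            tags.extend(["B"] + ["I"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and tags; every "B" starts a new word."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


segmented = ["我", "喜欢", "自然语言"]
tags = words_to_tags(segmented)
restored = tags_to_words("".join(segmented), tags)
```

Under this convention the round trip is lossless, so a sequence labeler that predicts one tag per character fully determines a segmentation.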
System Architecture
We used benchmark datasets provided by the second International Chinese Word Segmentation Bakeoff to test our proposals.
Chinese word is mentioned in 10 sentences in this paper.
Zhao, Qiuye and Marcus, Mitch
Abstract
We show for both English POS tagging and Chinese word segmentation that with proper representation, large number of deterministic constraints can be learned from training examples, and these are useful in constraining probabilistic inference.
Abstract
In this work, we explore deterministic constraints for two fundamental NLP problems, English POS tagging and Chinese word segmentation.
Abstract
For Chinese word segmentation (CWS), which can be formulated as character tagging, analogous constraints can be learned with the same templates as English POS tagging.
Chinese word is mentioned in 9 sentences in this paper.
Sun, Weiwei and Wan, Xiaojun
About Heterogeneous Annotations
For Chinese word segmentation and POS tagging, supervised learning has become a dominant paradigm.
About Heterogeneous Annotations
Take Chinese word segmentation for example.
Abstract
We address the issue of consuming heterogeneous annotation data for Chinese word segmentation and part-of-speech tagging.
Conclusion
Our theoretical and empirical analysis of two representative popular corpora highlights two essential characteristics of heterogeneous annotations which are explored to reduce approximation and estimation errors for Chinese word segmentation and POS tagging.
Experiments
Previous studies on joint Chinese word segmentation and POS tagging have used the CTB in experiments.
Introduction
This paper explores heterogeneous annotations to reduce both approximation and estimation errors for Chinese word segmentation and part-of-speech (POS) tagging, which are fundamental steps for more advanced Chinese language processing tasks.
Chinese word is mentioned in 6 sentences in this paper.
Liu, Chang and Ng, Hwee Tou
Discussion and Future Work
Chinese word segmentation.
Experiments
We use the Stanford Chinese word segmenter (Tseng et al., 2005) and POS tagger (Toutanova et al., 2003) for preprocessing and Cilin for synonym
Experiments
In all our experiments here we use TESLA-CELAB with n-grams for n up to four, since the vast majority of Chinese words, and therefore synonyms, are at most four characters long.
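To make the n-gram limit in the excerpt above concrete, here is a minimal sketch that enumerates all character n-grams of a string up to length four. It only illustrates n-gram extraction and is not the TESLA-CELAB metric itself; the function name and the `max_n` parameter are hypothetical.

```python
def char_ngrams(text, max_n=4):
    """Return all character n-grams of text for n = 1..max_n, shortest first."""
    return [
        text[i:i + n]
        for n in range(1, max_n + 1)
        for i in range(len(text) - n + 1)  # empty range when text is shorter than n
    ]


ngrams = char_ngrams("自然语言处理")
```

Capping `max_n` at four means every candidate unit is short enough to correspond to a typical Chinese word, which is the motivation the excerpt gives.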
Chinese word is mentioned in 3 sentences in this paper.