Index of papers in Proc. ACL 2013 that mention

Chinese word

Seen in text as:

Chinese word (43)
Chinese words (13)
Chinese Word (3)

Seen in 57 sentences in 8 papers.

1. Chinese Parsing Exploiting Characters

Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Character-based Chinese Parsing	Trained using annotated word structures, our parser also analyzes the internal structures of Chinese words.
Introduction	Frequently-occurring character sequences that express certain meanings can be treated as words, while most Chinese words have syntactic structures.
Related Work	Zhao (2009) studied character-level dependencies for Chinese word segmentation by formalizing the segmentation task in a dependency parsing framework.
Related Work	They use it as a joint framework to perform Chinese word segmentation, POS tagging and syntax parsing.
Related Work	They exploit a generative maximum entropy model for character-based constituent parsing, and find that POS information is very useful for Chinese word segmentation, but high-level syntactic information seems to have little effect on segmentation.
Word Structures and Syntax Trees	Unlike alphabetical languages, Chinese characters convey meanings, and the meaning of most Chinese words takes roots in their character.
Word Structures and Syntax Trees	Chinese words have internal structures (Xue, 2001; Ma et al., 2012).
Word Structures and Syntax Trees	Zhang and Clark (2010) found that the first character in a Chinese word is a useful indicator of the word’s POS.

Chinese word is mentioned in 13 sentences in this paper.

2. Improving Chinese Word Segmentation on Micro-blog Using Rich Punctuations

Zhang, Longkai and Li, Li and He, Zhengyan and Wang, Houfeng and Sun, Ni

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	While no segmented corpus of micro-blogs is available to train Chinese word segmentation model, existing Chinese word segmentation tools cannot perform equally well as in ordinary news texts.
Abstract	In this paper we present an effective yet simple approach to Chinese word segmentation of micro-blog.
Experiment	We use the benchmark datasets provided by the second International Chinese Word Segmentation Bakeoff as the labeled data.
Experiment	The first two are both famous Chinese word segmentation tools: ICTCLAS and Stanford Chinese word segmenter, which are widely used in NLP related to word segmentation.
Experiment	Stanford Chinese word segmenter is a CRF-based segmentation tool and its segmentation standard is chosen as the PKU standard, which is the same as ours.
Introduction	These new features of micro-blogs make the Chinese Word Segmentation (CWS) models trained on the source domain, such as news corpus, fail to perform equally well when transferred to texts from micro-blogs.
Our method	Chinese word segmentation problem might be treated as a character labeling problem which gives each character a label indicating its position in one word.
Related Work	Recent studies show that character sequence labeling is an effective formulation of Chinese word segmentation (Low et al., 2005; Zhao et al., 2006a,b; Chen et al., 2006; Xue, 2003).
Related Work	(1998) takes advantage of the huge amount of raw text to solve Chinese word segmentation problems.
Related Work	Besides, Sun and Xu (2011) uses a sequence labeling framework, while unsupervised statistics are used as discrete features in their model, which prove to be effective in Chinese word segmentation.

Chinese word is mentioned in 12 sentences in this paper.

3. Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study

Jiang, Wenbin and Sun, Meng and Lü, Yajuan and Yang, Yating and Liu, Qun

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	With Chinese word segmentation as a case study, experiments show that the segmenter enhanced with the Chinese wikipedia achieves significant improvement on a series of testing sets from different domains, even with a single classifier and local features.
Conclusion and Future Work	Experiments on Chinese word segmentation show that, the enhanced word segmenter achieves significant improvement on testing sets of different domains, although using a single classifier with only local features.
Experiments	We use the Penn Chinese Treebank 5.0 (CTB) (Xue et al., 2005) as the existing annotated corpus for Chinese word segmentation.
Experiments	Table 4: Comparison with state-of-the-art work in Chinese word segmentation.
Experiments	Table 4 shows the comparison with other work in Chinese word segmentation.
Introduction	Taking Chinese word segmentation for example, the state-of-the-art models (Xue and Shen, 2003; Ng and Low, 2004; Gao et al., 2005; Nakagawa and Uchimoto, 2007; Zhao and Kit, 2008; Jiang et al., 2009; Zhang and Clark, 2010; Sun, 2011b; Li, 2011) are usually trained on human-annotated corpora such as the Penn Chinese Treebank (CTB) (Xue et al., 2005), and perform quite well on corresponding test sets.
Introduction	In the rest of the paper, we first briefly introduce the problems of Chinese word segmentation and the character classification model in section
Related Work	Li and Sun (2009) extracted character classification instances from raw text for Chinese word segmentation, resorting to the indication of punctuation marks between characters.
Related Work	Sun and Xu (Sun and Xu, 2011) utilized the features derived from large-scaled unlabeled text to improve Chinese word segmentation.

Chinese word is mentioned in 10 sentences in this paper.

4. Argument Inference from Relevant Event Mentions in Chinese Argument Extraction

Li, Peifeng and Zhu, Qiaoming and Zhou, Guodong

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimentation	Finally, all the sentences in the corpus are divided into words using a Chinese word segmentation tool (ICTCLAS) with all entities annotated in the corpus kept.
Inferring Inter-Sentence Arguments on Relevant Event Mentions	The second issue is that the Chinese word order in a sentence is rather agile for the open
Inferring Inter-Sentence Arguments on Relevant Event Mentions	(2012a) find out that sometimes two trigger mentions are within a Chinese word whose morphological structure is Coordination.
Inferring Inter-Sentence Arguments on Relevant Event Mentions	The relation between those event mentions whose triggers merge a Chinese word or share the subject and the object are Parallel.

Chinese word is mentioned in 6 sentences in this paper.

5. Mining Informal Language from Chinese Microtext: Joint Word Recognition and Segmentation

Wang, Aobo and Kan, Min-Yen

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	We exploit this reliance as an opportunity: recognizing the relation between informal word recognition and Chinese word segmentation, we propose to model the two tasks jointly.
Conclusion	There is a close dependency between Chinese word segmentation (CWS) and informal word recognition (IWR).
Introduction	This example illustrates the mutual dependency between Chinese word segmentation (henceforth, CWS) and informal word recognition (IWR) that should be solved jointly.
Methodology	Given an input Chinese microblog post, our method simultaneously segments the sentences into words (the Chinese Word Segmentation, CWS, task), and marks the component words as informal or formal ones (the Informal Word Recognition, IWR, task).
Methodology	(*) If C_k C_{k+1} (i-4 < k < i+4) is not a Chinese word recorded in dictionaries: CPMI-N@k+i; CPMI-M@k+i; CDiff@k+i; PYPMI-N@k+i; PYPMI-M@k+i; PYDiff@k+i

Chinese word is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

SVM (13)
baseline systems (11)
CRF (11)

6. Word Alignment Modeling with Context Dependent Deep Neural Network

Yang, Nan and Liu, Shujie and Li, Mu and Zhou, Ming and Yu, Nenghai

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments and Results	By analyzing the results, we found out that for both baseline and our model, a large part of missing alignment links involves stop words like English words “the”, “a”, “it” and Chinese words “de”.
Experiments and Results	As Chinese language lacks morphology, the single form and plural form of a noun in English often correspond to the same Chinese word, thus it is desirable that the two English words should have similar word embeddings.
Introduction	As shown in example (a) of Figure 1, in word pair {“juda” => “mammoth”}, the Chinese word “juda” is a common word, but
Introduction	For example (b) in Figure 1, for the word pair {“yibula” => “Yibula”}, both the Chinese word “yibula” and English word “Yibula” are rare name entities, but the words around them are very common, which are {“nongmin”, “shuo”} for Chinese side and {“farmer”, “said”} for the English side.
Training	For example, many Chinese words can act as a verb, noun and adjective without any change, while their English counterparts are distinct words with quite different word embeddings due to their different syntactic roles.

Chinese word is mentioned in 5 sentences in this paper.

7. Enlisting the Ghost: Modeling Empty Categories for Machine Translation

Xiang, Bing and Luo, Xiaoqiang and Zhou, Bowen

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Results	We first predict pro and PRO with our annotation model for all Chinese sentences in the parallel training data, with pro and PRO inserted between the original Chinese words.
Integrating Empty Categories in Machine Translation	One of the other frequent ECs, OP, appears in the Chinese relative clauses, which usually have a Chinese word “De” aligned to the target side “that” or “which”.
Introduction	Consequently, “that” is incorrectly aligned to the second to the last Chinese word “De”, due to their high co-occurrence frequency in the training data.

Chinese word is mentioned in 3 sentences in this paper.

8. Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
Introduction	As far as we know, however, these methods have not yet been applied to resolve the problem of joint Chinese word segmentation (CWS) and POS tagging.
Method	This study introduces a novel semi-supervised approach for joint Chinese word segmentation and POS tagging.

Chinese word is mentioned in 3 sentences in this paper.
