Index of papers in Proc. ACL 2012 that mention
  • POS tagging
Gardent, Claire and Narayan, Shashi
Experiment and Results
One feature of our approach is that it permits mining the data for tree patterns of arbitrary size using different types of labelling information (POS tags, dependencies, word forms and any combination thereof).
Experiment and Results
4.3.1 Mining on single labels (word form, POS tag or dependency)
Experiment and Results
Mining on a single label permits (i) assessing the relative impact of each category in a given label category and (ii) identifying different sources of errors depending on the type of label considered (POS tag, dependency or word form).
POS tagging is mentioned in 18 sentences in this paper.
Hatori, Jun and Matsuzaki, Takuya and Miyao, Yusuke and Tsujii, Jun'ichi
Abstract
We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese.
Abstract
Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models.
Abstract
In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing.
Introduction
Furthermore, the word-level information is often augmented with the POS tags, which, along with segmentation, form the basic foundation of statistical NLP.
Introduction
Because the tasks of word segmentation and POS tagging have strong interactions, many studies have been devoted to the task of joint word segmentation and POS tagging for languages such as Chinese (e.g.
Introduction
This is because some of the segmentation ambiguities cannot be resolved without considering the surrounding grammatical constructions encoded in a sequence of POS tags.
Related Works
In Chinese, Luo (2003) proposed a joint constituency parser that performs segmentation, POS tagging, and parsing within a single character-based framework.
POS tagging is mentioned in 46 sentences in this paper.
Li, Zhenghua and Liu, Ting and Che, Wanxiang
Dependency Parsing
Given an input sentence $x = w_0 w_1 \ldots w_n$ and its POS tag sequence $t = t_0 t_1 \ldots t_n$, the goal of dependency parsing is to build a dependency tree as depicted in Figure 1, denoted by $d = \{(h, m, l) : 0 \le h \le n, 0 < m \le n, l \in L\}$, where $(h, m, l)$ indicates a directed arc from the head word (also called father) $w_h$ to the modifier (also called child or dependent) $w_m$ with a dependency label $l$, and $L$ is the label set.
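As a rough illustration of the notation in the sentence above (not code from the paper), a dependency tree can be held as a set of (h, m, l) triples over token indices, with index 0 as the artificial root. The sentence, tags, and label names below are invented for the example, and `is_well_formed` is a hypothetical helper that checks only the index ranges, label membership, and the one-head-per-word property; it does not test for cycles.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    head: int      # h, with 0 <= h <= n (0 is the artificial root w_0)
    modifier: int  # m, with 0 < m <= n
    label: str     # l, drawn from the label set L


def is_well_formed(arcs, n, labels):
    """Check index ranges, label membership, and exactly one head per word."""
    in_range = all(
        0 <= a.head <= n and 0 < a.modifier <= n and a.label in labels
        for a in arcs
    )
    modifiers = sorted(a.modifier for a in arcs)
    single_head = modifiers == list(range(1, n + 1))
    return in_range and single_head


# Illustrative input only; position 0 holds the root symbol.
words = ["<root>", "She", "reads", "books"]
tags = ["<root>", "PRP", "VBZ", "NNS"]
labels = {"ROOT", "SBJ", "OBJ"}
tree = {Arc(0, 2, "ROOT"), Arc(2, 1, "SBJ"), Arc(2, 3, "OBJ")}
```

Storing arcs as a set mirrors the set-builder definition directly; a parser would more commonly keep a head array indexed by modifier, which enforces the single-head property by construction.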
Dependency Parsing with QG Features
The type of the TP is conjoined with the related words and POS tags, such that the QG-enhanced parsing models can make more elaborate decisions based on the context.
Experiments and Analysis
CDT and CTB5/6 adopt different POS tag sets, and converting from one tag set to another is difficult (Niu et al., 2009). To overcome this problem, we use the People’s Daily corpus (PD), a large-scale corpus annotated with word segmentation and POS tags, to train a statistical POS tagger.
Experiments and Analysis
The tagger produces a universal layer of POS tags for both the source and target treebanks.
Experiments and Analysis
For all models used in current work (POS tagging and parsing), we adopt averaged perceptron to train the feature weights (Collins, 2002).
POS tagging is mentioned in 24 sentences in this paper.
Lippincott, Thomas and Korhonen, Anna and Ó Séaghdha, Diarmuid
Conclusions and future work
Second, simply treating POS tags within a small window of the verb as pseudo-GRs produces state-of-the-art results without the need for a parsing model.
Conclusions and future work
In fact, by integrating results from unsupervised POS tagging (Teichert and Daume III, 2009) we could render this approach fully domain- and language-independent.
Introduction
Second, by replacing the syntactic features with an approximation based on POS tags, we achieve state-of-the-art performance without relying on error-prone unlexicalized or domain-specific lexicalized parsers.
Methodology
The CONLL format is a common language for comparing output from dependency parsers: each lexical item has an index, lemma, POS tag, tGR in which it is the dependent, and index to the corresponding head.
Methodology
Table 2 shows the three variations we tested: the simple tGR type, with parameterization for the POS tags of head and dependent, and with closed-class POS tags (determiners, pronouns and prepositions) lexicalized.
Methodology
An unlexicalized parser cannot distinguish these based just on POS tags, while a lexicalized parser requires a large treebank.
Previous work
Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993).
Previous work
Their study employed unsupervised POS tagging and parsing, and measures of selectional preference and argument structure as complementary features for the classifier.
Results
Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model (Lippincott et al., 2010).
Results
Third, lexicalizing the closed-class POS tags introduces semantic information outside the scope of the alternation-based definition of subcategorization.
POS tagging is mentioned in 12 sentences in this paper.
Rastrow, Ariya and Dredze, Mark and Khudanpur, Sanjeev
Experiments
The dependency parser and POS tagger are trained on supervised data and up-trained on data labeled by the CKY-style bottom-up constituent parser of Huang et al.
Experiments
We use the POS tagger to generate tags for dependency training to match the test setting.
Incorporating Syntactic Structures
Long-span models (generative or discriminative, N-best or hill climbing) rely on auxiliary tools, such as a POS tagger or a parser, for extracting features for each hypothesis during rescoring, and during training for discriminative models.
Incorporating Syntactic Structures
A major complexity factor is due to processing 100s or 1000s of hypotheses for each speech utterance, even during hill climbing, each of which must be POS tagged and parsed.
Incorporating Syntactic Structures
For integer-typed features the mapping is trivial; for string-typed features (e.g., a POS tag identity) we use a mapping of the corresponding vocabulary to integers.
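The mapping described in the sentence above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `FeatureIndexer` is a hypothetical name, integers pass through unchanged, and string values receive consecutive ids in order of first appearance.

```python
class FeatureIndexer:
    """Hypothetical helper that maps feature values to integer ids."""

    def __init__(self):
        # Vocabulary of string-typed values seen so far -> integer id.
        self.vocab = {}

    def encode(self, value):
        # Integer-typed features map to themselves (the trivial case).
        if isinstance(value, int):
            return value
        # String-typed features get the next free id on first sight.
        if value not in self.vocab:
            self.vocab[value] = len(self.vocab)
        return self.vocab[value]


indexer = FeatureIndexer()
encoded = [indexer.encode(v) for v in ["NN", "VB", "NN", 7]]
```

In practice each feature type would likely need its own vocabulary so that ids from different string features do not collide with each other or with raw integer values; that separation is left out here for brevity.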
Syntactic Language Models
where h.w and h.t denote the word identity and the POS tag of the corresponding exposed head word.
Up-Training
We apply up-training to improve the accuracy of both our fast POS tagger and dependency parser.
POS tagging is mentioned in 16 sentences in this paper.
Sun, Weiwei and Uszkoreit, Hans
Abstract
From the perspective of structural linguistics, we explore paradigmatic and syntagmatic lexical relations for Chinese POS tagging, an important and challenging task for Chinese language processing.
Introduction
Automatically assigning POS tags to words plays an important role in parsing, word sense disambiguation, as well as many other NLP applications.
Introduction
While state-of-the-art tagging systems have achieved accuracies above 97% on English, Chinese POS tagging has proven to be more challenging and obtained accuracies about 93-94% (Tseng et al., 2005b; Huang et al., 2007, 2009; Li et al., 2011).
Introduction
It is generally accepted that Chinese POS tagging often requires more sophisticated language processing techniques that are capable of drawing inferences from more subtle linguistic knowledge.
State-of-the-Art
In some cases, the methods work well without large modifications, such as German POS tagging.
POS tagging is mentioned in 35 sentences in this paper.
Sun, Weiwei and Wan, Xiaojun
About Heterogeneous Annotations
For Chinese word segmentation and POS tagging, supervised learning has become a dominant paradigm.
About Heterogeneous Annotations
Although several institutions to date have released their segmented and POS tagged data, acquiring sufficient quantities of high quality training examples is still a major bottleneck.
About Heterogeneous Annotations
The statistics after colons are how many times this POS tag pair appears among the 3561 words that are consistently segmented.
Introduction
In particular, joint word segmentation and POS tagging is addressed as a two step process.
Joint Chinese Word Segmentation and POS Tagging
words, word segmentation and POS tagging are important initial steps for Chinese language processing.
Joint Chinese Word Segmentation and POS Tagging
Two kinds of approaches are popular for joint word segmentation and POS tagging.
Joint Chinese Word Segmentation and POS Tagging
In this kind of approach, the task is formulated as the classification of characters into POS tags with boundary information.
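To make the character-classification formulation in the sentence above concrete, here is a small sketch. The two-way B/I boundary scheme, the function names, and the example sentence are assumptions chosen for illustration; the quoted paper may use a different label inventory.

```python
def to_char_labels(tagged_words):
    """Turn (word, POS) pairs into per-character labels such as 'B-NN'."""
    labels = []
    for word, pos in tagged_words:
        for i, ch in enumerate(word):
            # 'B' marks the first character of a word, 'I' any later one.
            boundary = "B" if i == 0 else "I"
            labels.append((ch, f"{boundary}-{pos}"))
    return labels


def from_char_labels(char_labels):
    """Recover (word, POS) pairs from per-character labels."""
    words = []
    for ch, label in char_labels:
        boundary, pos = label.split("-", 1)
        if boundary == "B" or not words:
            words.append([ch, pos])   # start a new word
        else:
            words[-1][0] += ch        # extend the current word
    return [(w, p) for w, p in words]


# Illustrative sentence with made-up gold tags.
sentence = [("我", "PN"), ("喜欢", "VV"), ("音乐", "NN")]
char_labels = to_char_labels(sentence)
```

Because each label carries both the boundary and the POS tag, a single sequence labeller over characters yields segmentation and tagging at once, and `from_char_labels` inverts the encoding.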
Structure-based Stacking
Table 1: Mapping between CTB and PPD POS Tags.
POS tagging is mentioned in 18 sentences in this paper.
Zhao, Qiuye and Marcus, Mitch
Abstract
We show for both English POS tagging and Chinese word segmentation that with proper representation, large number of deterministic constraints can be learned from training examples, and these are useful in constraining probabilistic inference.
Abstract
“assign label t to word w” for POS tagging.
Abstract
In this work, we explore deterministic constraints for two fundamental NLP problems, English POS tagging and Chinese word segmentation.
POS tagging is mentioned in 31 sentences in this paper.
Chambers, Nathanael
Learning Time Constraints
n-gram POS: The 4-gram and 3-gram of POS tags that end with the year
Previous Work
Kanhabua and Norvag (2008; 2009) extended this approach with the same model, but expanded its unigrams with POS tags, collocations, and tf-idf scores.
Timestamp Classifiers
Word Classes: include only nouns, verbs, and adjectives as labeled by a POS tagger
Timestamp Classifiers
on POS tags and tf-idf scores.
Timestamp Classifiers
Typed Dependency POS: Similar to Typed Dependency, this feature uses POS tags of the dependency relation’s governor.
POS tagging is mentioned in 5 sentences in this paper.
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef
Experiments
Examining translation rules extracted from the training data shows that there are 72,366 types of non-terminals with respect to 33 types of POS tags .
Head-Driven HPB Translation Model
Instead of collapsing all non-terminals in the source language into a single symbol X as in Chiang (2007), given a word sequence $f_i^j$ from position $i$ to position $j$, we first find heads and then concatenate the POS tags of these heads as $f_i^j$’s nonterminal symbol.
Head-Driven HPB Translation Model
We look for initial phrase pairs that contain other phrases and then replace sub-phrases with POS tags corresponding to their heads.
Introduction
Here, each Chinese word is attached with its POS tag and Pinyin.
POS tagging is mentioned in 4 sentences in this paper.
Liu, Chang and Ng, Hwee Tou
Discussion and Future Work
We can then award partial scores for related words, such as those identified as such by WordNet or those with the same POS tags.
Experiments
However, its use of POS tags and synonym dictionaries prevents its use at the character-level.
Experiments
We use the Stanford Chinese word segmenter (Tseng et al., 2005) and POS tagger (Toutanova et al., 2003) for preprocessing and Cilin for synonym
Introduction
However, many different segmentation standards exist for different purposes, such as Microsoft Research Asia (MSRA) for Named Entity Recognition (NER), Chinese Treebank (CTB) for parsing and part-of-speech (POS) tagging, and City University of Hong Kong (CITYU) and Academia Sinica (AS) for general word segmentation and POS tagging.
POS tagging is mentioned in 4 sentences in this paper.
Green, Spence and DeNero, John
A Class-based Model of Agreement
The coarse categories are the universal POS tag set described by Petrov et al.
A Class-based Model of Agreement
For Arabic, we used the coarse POS tags plus definiteness and the so-called phi features (gender, number, and person). For example, السيارة ‘the car’ would be tagged “Noun+Def+Sg+Fem”.
Discussion of Translation Results
For comparison, +POS indicates our class-based model trained on the 11 coarse POS tags only (e.g., “Noun”).
POS tagging is mentioned in 3 sentences in this paper.
Huang, Zhiheng and Chang, Yi and Long, Bo and Crespo, Jean-Francois and Dong, Anlei and Keerthi, Sathiya and Wu, Su-Lin
Experiments
We apply 1-best and k-best sequential decoding algorithms to five NLP tagging tasks: Penn TreeBank (PTB) POS tagging, CoNLL 2000 joint POS tagging and chunking, CoNLL 2003 joint POS tagging, chunking and named entity tagging, HPSG supertagging (Matsuzaki et al., 2007) and a search query named entity recognition (NER) dataset.
Experiments
As in (Kaji et al., 2010), we combine the POS tags and chunk tags to form joint tags for CoNLL 2000 dataset, e.g., NN|B-NP.
Experiments
Similarly we combine the POS tags, chunk tags, and named entity tags to form joint tags for CoNLL 2003 dataset, e.g., PRP$|I-NP|O.
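The joint-tag construction in the two sentences above amounts to zipping parallel tag layers and joining them with a separator. The sketch below is one plausible way to do this; `join_tags` and `split_tags` are hypothetical names, and the token-level tag sequences are invented to reproduce the quoted example tags.

```python
def join_tags(*layers):
    """Zip parallel tag sequences into joint tags separated by '|'."""
    if len({len(layer) for layer in layers}) > 1:
        raise ValueError("all tag layers must have the same length")
    return ["|".join(tags) for tags in zip(*layers)]


def split_tags(joint_tags):
    """Split joint tags back into one tuple of layer tags per token."""
    return [tuple(tag.split("|")) for tag in joint_tags]


# Illustrative two-token example with made-up annotations.
pos = ["NN", "PRP$"]
chunk = ["B-NP", "I-NP"]
ner = ["O", "O"]

joint_2000 = join_tags(pos, chunk)       # POS + chunk layers
joint_2003 = join_tags(pos, chunk, ner)  # POS + chunk + NER layers
```

Joining layers this way lets a single sequence decoder predict all layers at once, at the cost of a label set that grows with the number of observed tag combinations.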
POS tagging is mentioned in 3 sentences in this paper.