Abstract | We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese.
Abstract | Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models.
Abstract | In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing. |
Introduction | Furthermore, the word-level information is often augmented with the POS tags, which, along with segmentation, form the basic foundation of statistical NLP.
Introduction | Because the tasks of word segmentation and POS tagging have strong interactions, many studies have been devoted to the task of joint word segmentation and POS tagging for languages such as Chinese (e.g. |
Introduction | This is because some of the segmentation ambiguities cannot be resolved without considering the surrounding grammatical constructions encoded in a sequence of POS tags.
Related Works | In Chinese, Luo (2003) proposed a joint constituency parser that performs segmentation, POS tagging, and parsing within a single character-based framework.
Abstract | From the perspective of structural linguistics, we explore paradigmatic and syntagmatic lexical relations for Chinese POS tagging, an important and challenging task for Chinese language processing.
Introduction | Automatically assigning POS tags to words plays an important role in parsing, word sense disambiguation, as well as many other NLP applications. |
Introduction | While state-of-the-art tagging systems have achieved accuracies above 97% on English, Chinese POS tagging has proven to be more challenging, with accuracies of about 93–94% (Tseng et al., 2005b; Huang et al., 2007, 2009; Li et al., 2011).
Introduction | It is generally accepted that Chinese POS tagging often requires more sophisticated language processing techniques that are capable of drawing inferences from more subtle linguistic knowledge. |
State-of-the-Art | In some cases, the methods work well without large modifications, such as German POS tagging.
Experiments | The dependency parser and POS tagger are trained on supervised data and up-trained on data labeled by the CKY-style bottom-up constituent parser of Huang et al.
Experiments | We use the POS tagger to generate tags for dependency training to match the test setting. |
Incorporating Syntactic Structures | Long-span models — generative or discriminative, N-best or hill climbing — rely on auxiliary tools, such as a POS tagger or a parser, for extracting features for each hypothesis during rescoring, and during training for discriminative models.
Incorporating Syntactic Structures | A major complexity factor is due to processing 100s or 1000s of hypotheses for each speech utterance, even during hill climbing, each of which must be POS tagged and parsed. |
Incorporating Syntactic Structures | For integer-typed features the mapping is trivial; for string-typed features (e.g., a POS tag identity) we use a mapping of the corresponding vocabulary to integers.
Syntactic Language Models | where h.w and h.t denote the word identity and the POS tag of the corresponding exposed head word. |
Up-Training | We apply up-training to improve the accuracy of both our fast POS tagger and dependency parser. |
Algorithm | To estimate this joint distribution, PSH samples are extracted from the training corpus using unsupervised POS taggers (Clark, 2003; Abend et al., 2010) and an unsupervised parser (Seginer, 2007). |
Algorithm | This parser is unique in its ability to induce a bracketing (unlabeled parsing) from raw text (without even using POS tags) with strong results.
Algorithm | We continue by tagging the corpus using Clark’s unsupervised POS tagger (Clark, 2003) and the unsupervised Prototype Tagger (Abend et al., 2010).
Conclusion | The algorithm applies a state-of-the-art unsupervised parser and POS tagger to collect statistics from a large raw text corpus.
Core-Adjunct in Previous Work | In addition, supervised models utilize supervised parsers and POS taggers, while the current state-of-the-art in unsupervised parsing and POS tagging is considerably worse than their supervised counterparts. |
Core-Adjunct in Previous Work | First, all works use manual or supervised syntactic annotations, usually including a POS tagger.
Experimental Setup | This scenario decouples the accuracy of the algorithm from the quality of the unsupervised POS tagging.
Experimental Setup | Finally, we experiment on a scenario where even argument identification on the test set is not provided, but performed by the algorithm of Abend et al. (2009), which uses neither syntactic nor SRL annotation but does utilize a supervised POS tagger.
Introduction | However, no work has tackled this task in a fully unsupervised manner. Unsupervised models reduce reliance on the costly and error-prone manual multilayer annotation (POS tagging, parsing, core-adjunct tagging) commonly used for this task.
Abstract | Overall, we can say that the improvements are small and not significant using automatic POS tags, contrary to previously published results using gold POS tags (Agirre et al., 2011). |
Experimental Framework | We modified the system in order to add semantic features, combining them with wordforms and POS tags, on the parent and child nodes of each arc.
Introduction | using MaltParser on gold POS tags.
Introduction | In this work, we will investigate the effect of semantic information using predicted POS tags . |
Related work | (2011) successfully introduced WordNet classes in a dependency parser, obtaining improvements on the full PTB using gold POS tags, trying different combinations of semantic classes.
Results | For all the tests, we used a perceptron POS-tagger (Collins, 2002), trained on WSJ sections 2–21, to assign POS tags automatically to both the training (using 10-way jackknifing) and test data, obtaining a POS tagging accuracy of 97.32% on the test data.
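The jackknifing procedure described above can be sketched as follows. This is a minimal illustration only: `train_tagger` and `tag` are hypothetical stand-ins for training and applying the perceptron tagger, and the fold-splitting scheme is an assumption (the paper does not specify how sections are partitioned).

```python
# Minimal sketch of k-way jackknifing: each fold of the training data is
# tagged by a model trained on the remaining k-1 folds, so the training
# set receives automatic tags of the same quality as at test time.
# train_tagger and tag are hypothetical stand-ins for a real tagger.

def jackknife(sentences, train_tagger, tag, k=10):
    folds = [sentences[i::k] for i in range(k)]
    tagged = []
    for i, held_out in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_tagger(train)
        tagged.extend(tag(model, s) for s in held_out)
    return tagged
```

The point of the scheme is that the parser is never trained on gold tags it will not see at test time.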
Results | Overall, we see that the small improvements do not confirm the previous results on Penn2Malt, MaltParser and gold POS tags.
Results | One of the obstacles for automatic parsers is the presence of incorrect POS tags due to automatic tagging.
Abstract | t_i is the POS tag of w_i.
Abstract | The word embeddings are used during the learning process, but the final decoder that the learning algorithm outputs maps a POS tag sequence x to a parse tree.
Abstract | While ideally we would want to use the word information in decoding as well, much of the syntax of a sentence is determined by the POS tags, and a relatively high level of accuracy can be achieved by learning, for example, a supervised parser from POS tag sequences.
Conclusions and future work | Second, simply treating POS tags within a small window of the verb as pseudo-GRs produces state-of-the-art results without the need for a parsing model. |
Conclusions and future work | In fact, by integrating results from unsupervised POS tagging (Teichert and Daume III, 2009) we could render this approach fully domain- and language-independent. |
Introduction | Second, by replacing the syntactic features with an approximation based on POS tags, we achieve state-of-the-art performance without relying on error-prone unlexicalized or domain-specific lexicalized parsers.
Methodology | The CoNLL format is a common language for comparing output from dependency parsers: each lexical item has an index, lemma, POS tag, tGR in which it is the dependent, and index to the corresponding head.
Methodology | Table 2 shows the three variations we tested: the simple tGR type, with parameterization for the POS tags of head and dependent, and with closed-class POS tags (determiners, pronouns and prepositions) lexicalized. |
Methodology | An unlexicalized parser cannot distinguish these based just on POS tags, while a lexicalized parser requires a large treebank.
Previous work | Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993). |
Previous work | Their study employed unsupervised POS tagging and parsing, and measures of selectional preference and argument structure as complementary features for the classifier. |
Results | Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model (Lippincott et al., 2010). |
Results | Third, lexicalizing the closed-class POS tags introduces semantic information outside the scope of the alternation-based definition of subcategorization. |
Abstract | In this paper, we address the problem of web-domain POS tagging using a two-phase approach. |
Abstract | The representation is integrated as features into a neural network that serves as a scorer for an easy-first POS tagger.
Introduction | However, state-of-the-art POS taggers in the literature (Collins, 2002; Shen et al., 2007) are mainly optimized on the Penn Treebank (PTB), and when shifted to web data, tagging accuracies drop significantly (Petrov and McDonald, 2012).
Introduction | We integrate the learned encoder with a set of well-established features for POS tagging (Ratnaparkhi, 1996; Collins, 2002) in a single neural network, which is applied as a scorer to an easy-first POS tagger.
Introduction | We choose the easy-first tagging approach since it has been demonstrated to give higher accuracies than the standard left-to-right POS tagger (Shen et al., 2007; Ma et al., 2013). |
Learning from Web Text | This may partly be due to the fact that unlike computer vision tasks, the input structure of POS tagging or other sequential labelling tasks is relatively simple, and a single nonlinear layer is enough to model the interactions within the input (Wang and Manning, 2013). |
Neural Network for POS Disambiguation | The main challenge to designing the neural network structure is: on the one hand, we hope that the model can take advantage of information provided by the learned WRRBM, which reflects general properties of web texts, so that the model generalizes well in the web domain; on the other hand, we also hope to improve the model’s discriminative power by utilizing well-established POS tagging features, such as those of Ratnaparkhi (1996).
Neural Network for POS Disambiguation | Under the output layer, the network consists of two modules: the web-feature module, which incorporates knowledge from the pre-trained WRRBM, and the sparse-feature module, which makes use of other POS tagging features. |
Neural Network for POS Disambiguation | For POS tagging, we found that a simple linear layer yields satisfactory accuracies.
Dependency Parsing | Given an input sentence x = w0 w1 ... wn and its POS tag sequence t = t0 t1 ... tn, the goal of dependency parsing is to build a dependency tree as depicted in Figure 1, denoted by d = {(h, m, l) : 0 ≤ h ≤ n, 0 < m ≤ n, l ∈ L}, where (h, m, l) indicates a directed arc from the head word (also called father) wh to the modifier (also called child or dependent) wm with a dependency label l, and L is the label set.
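The arc-set definition above can be made concrete with a small well-formedness check. This is a sketch for illustration only; actual parsers enforce these constraints during decoding rather than checking them after the fact.

```python
def is_valid_dependency_tree(arcs, n):
    """Check that arcs {(h, m, l)} form a tree over words w_1..w_n
    rooted at the dummy root w_0: heads in 0..n, modifiers in 1..n,
    one head per modifier, and no cycles."""
    heads = {}
    for h, m, l in arcs:
        if not (0 <= h <= n and 0 < m <= n) or m in heads:
            return False          # index out of range, or m has two heads
        heads[m] = h
    if len(heads) != n:
        return False              # every word needs exactly one head
    for m in heads:               # every word must reach the root
        seen = set()
        node = m
        while node != 0:
            if node in seen:
                return False      # cycle detected
            seen.add(node)
            node = heads[node]
    return True
```

The dummy root w0 and the single-head constraint are what make d a tree rather than an arbitrary directed graph.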
Dependency Parsing with QG Features | The type of the TP is conjoined with the related words and POS tags, such that the QG-enhanced parsing models can make more elaborate decisions based on the context.
Experiments and Analysis | CDT and CTB5/6 adopt different POS tag sets, and converting from one tag set to another is difficult (Niu et al., 2009). To overcome this problem, we use the People’s Daily corpus (PD), a large-scale corpus annotated with word segmentation and POS tags, to train a statistical POS tagger.
Experiments and Analysis | The tagger produces a universal layer of POS tags for both the source and target treebanks. |
Experiments and Analysis | For all models used in the current work (POS tagging and parsing), we adopt the averaged perceptron to train the feature weights (Collins, 2002).
About Heterogeneous Annotations | For Chinese word segmentation and POS tagging, supervised learning has become a dominant paradigm.
About Heterogeneous Annotations | Although several institutions to date have released their segmented and POS tagged data, acquiring sufficient quantities of high quality training examples is still a major bottleneck. |
About Heterogeneous Annotations | The statistics after the colons indicate how many times each POS tag pair appears among the 3,561 words that are consistently segmented.
Introduction | In particular, joint word segmentation and POS tagging is addressed as a two step process. |
Joint Chinese Word Segmentation and POS Tagging | words, word segmentation and POS tagging are important initial steps for Chinese language processing. |
Joint Chinese Word Segmentation and POS Tagging | Two kinds of approaches are popular for joint word segmentation and POS tagging.
Joint Chinese Word Segmentation and POS Tagging | In this kind of approach, the task is formulated as the classification of characters into POS tags with boundary information. |
Structure-based Stacking | Table 1: Mapping between CTB and PPD POS Tags.
Distribution Prediction | As we go on to show in Section 6, this enables us to use the same distribution prediction method for both POS tagging and sentiment classification. |
Domain Adaptation | We consider two DA tasks: (a) cross-domain POS tagging (Section 4.1), and (b) cross-domain sentiment classification (Section 4.2). |
Domain Adaptation | 4.1 Cross-Domain POS Tagging |
Domain Adaptation | manually POS tagged) sentence, we select its neighbours 7N) in the source domain as additional features.
Introduction | • Using the learnt distribution prediction model, we propose a method to learn a cross-domain POS tagger.
Related Work | words that appear in both the source and target domains) to adapt a POS tagger to a target domain. |
Related Work | Choi and Palmer (2012) propose a cross-domain POS tagging method by training two separate models: a generalised model and a domain-specific model. |
Related Work | Adding latent states to the smoothing model further improves the POS tagging accuracy (Huang and Yates, 2012). |
Experiment and Results | One feature of our approach is that it permits mining the data for tree patterns of arbitrary size using different types of labelling information (POS tags, dependencies, word forms and any combination thereof).
Experiment and Results | 4.3.1 Mining on single labels (word form, POS tag or dependency) |
Experiment and Results | Mining on a single label permits (i) assessing the relative impact of each category in a given label category and (ii) identifying different sources of errors depending on the type of label considered (POS tag, dependency or word form).
Introduction | Such a comparison brings up another crucial question: “Do existing POS taggers and chunkers
Introduction | Nevertheless, a great number of researchers have used existing POS taggers and chunkers to analyze the writing of learners of English. |
Introduction | For instance, error detection methods normally use a POS tagger and/or a chunker in the error detection process. |
Method | Considering this, we determined a basic rule as follows: “Use the Penn Treebank tag set and preserve the original texts as much as possible.” To handle such errors, we made several modifications and added two new POS tags (CE and UK) and another two for chunking (XP and PH), which are described below. |
Method | Note that each POS tag is hyphenated. |
UK and XP stand for unknown and X phrase, respectively. | 5.1 POS Tagging |
UK and XP stand for unknown and X phrase, respectively. | HMM-based and CRF-based POS taggers were tested on the shallow-parsed corpus. |
UK and XP stand for unknown and X phrase, respectively. | Both use the Penn Treebank POS tag set. |
Approach Overview | The focus of this work is on building POS taggers for foreign languages, assuming that we have an English POS tagger and some parallel text between the two languages. |
Approach Overview | The POS distributions over the foreign trigram types are used as features to learn a better unsupervised POS tagger (§5). |
Experiments and Results | We extracted only the words and their POS tags from the treebanks.
Experiments and Results | (2011) provide a mapping A from the fine-grained language-specific POS tags in the foreign treebank to the universal POS tags.
Graph Construction | Graph construction for structured prediction problems such as POS tagging is nontrivial: on the one hand, using individual words as the vertices throws away the context |
Graph Construction | They considered a semi-supervised POS tagging scenario and showed that one can use a graph over trigram types, and edge weights based on distributional similarity, to improve a supervised conditional random field tagger. |
Introduction | Unfortunately, the best completely unsupervised English POS tagger (that does not make use of a tagging dictionary) reaches only 76.1% accuracy (Christodoulopoulos et al., 2010), making its practical usability questionable at best. |
Introduction | Our final average POS tagging accuracy of 83.4% compares very favorably to the average accuracy of Berg-Kirkpatrick et al.’s monolingual unsupervised state-of-the-art model (73.0%), and considerably bridges the gap to fully supervised POS tagging performance (96.6%). |
POS Induction | After running label propagation (LP), we compute tag probabilities for foreign word types x by marginalizing the POS tag distributions of foreign trigrams u_i = x⁻ x x⁺ over the left and right contexts.
POS Induction | This tag vector is constructed for every word in the foreign vocabulary and will be used to provide features for the unsupervised foreign language POS tagger.
POS Induction | For English POS tagging, Berg-Kirkpatrick et al.
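The marginalization over trigram contexts described above can be sketched as follows. This is an illustration only: `trigram_tag_dists` is a hypothetical mapping from trigram types (left context, word, right context) to tag distributions, and the uniform averaging over trigram types is an assumption; the paper's exact weighting may differ.

```python
from collections import defaultdict

def word_type_tag_probs(trigram_tag_dists):
    """Marginalize the POS tag distributions of trigram types over
    their left and right contexts, yielding one tag distribution per
    word type x. Input maps (left, x, right) -> {tag: prob}."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for (left, x, right), dist in trigram_tag_dists.items():
        counts[x] += 1
        for tag, p in dist.items():
            totals[x][tag] += p
    # average uniformly over the trigram types containing x
    return {x: {t: p / counts[x] for t, p in dist.items()}
            for x, dist in totals.items()}
```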
Mention Extraction System | These are a combination of w_i itself, its POS tag, and its integer offset from the last word (lw) in the mention.
Mention Extraction System | These features are meant to capture the word and POS tag sequences in mentions. |
Mention Extraction System | Contextual We extract the word C-1,-1 immediately before mi, the word C+1,+1 immediately after mi, and their associated POS tags P.
Relation Extraction System | POS features If there is a single word between the two mentions, we extract its POS tag.
Relation Extraction System | Given the hw of m, Pi,j refers to the sequence of POS tags in the immediate context of hw (we exclude the POS tag of hw).
Relation Extraction System | The offsets i and j denote the position (relative to hw) of the first and last POS tag respectively. |
Syntactico-Semantic Structures | • If u* is not empty, we require that it satisfies any of the following POS tag sequences: JJ+ \/ JJ and JJ?
Syntactico-Semantic Structures | These are (optional) POS tag sequences that normally start a valid noun phrase. |
Syntactico-Semantic Structures | • We use two patterns to differentiate between premodifier relations and possessive relations, by checking for the existence of POS tags PRP$, WP$, POS, and the word “’s”.
Discussion | Except in Row 8 and Row 11, when the two head nouns of an entity pair were combined as a semantic pair and when the POS tag was combined with the entity type, performance decreased.
Discussion | Comparing reference set (5) with reference set (3), the Head noun and the adjacent entity POS tag achieve better performance when used as singletons.
Discussion | In this paper, for a better demonstration of the constraint condition, we still use Position Sensitive as the default setting when using the Head noun and the adjacent entity POS tag.
Feature Construction | All the employed features are simply classified into five categories: Entity Type and Subtype, Head Noun, Position Feature, POS Tag and Omni-word Feature. |
Feature Construction | POS Tag: In our model, we use only the adjacent entity POS tags, which lie on the two sides of the entity mention.
Feature Construction | These POS tags are labelled by the ICTCLAS package.
Experiments | This sample is manually labeled with three annotations: capitalization, POS tags, and segmentation, according to the description of these annotations in Figure 1.
Experiments | Table 1: Summary of query annotation performance for capitalization (CAP), POS tagging (TAG) and segmentation. |
Experiments | In the case of POS tagging, the decisions are ternary, and hence we report the classification accuracy.
Independent Query Annotations | On the other hand, given a sentence from a corpus that is relevant to the query, such as “Hawaiian Falls is a family-friendly water-park”, the word “falls” is correctly identified by a standard POS tagger as a proper noun.
Independent Query Annotations | (2010), an estimate of p(C_i|r) is a smoothed estimator that combines the information from the retrieved sentence r with the information about unigrams (for capitalization and POS tagging) and bigrams (for segmentation) from a large n-gram corpus (Brants and Franz, 2006).
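A smoothed estimator of this kind can be sketched as a linear interpolation. This is only one common form of smoothing, shown for illustration; the mixing weight `lam` is hypothetical, and the cited work's exact smoothing scheme may differ.

```python
def smoothed_annotation_prob(p_sentence, p_corpus, lam=0.5):
    """Linearly interpolate the evidence from a retrieved sentence r
    with an estimate from a large n-gram corpus. lam is a hypothetical
    mixing weight between the two sources."""
    return lam * p_sentence + (1.0 - lam) * p_corpus
```

The interpolation lets reliable corpus statistics back off the noisy per-sentence evidence when the retrieved sentence is uninformative.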
Joint Query Annotation | Many query annotations that are useful for IR can be represented using this simple form, including capitalization, POS tagging , phrase chunking, named entity recognition, and stopword indicators, to name just a few. |
Joint Query Annotation | For instance, imagine that we need to perform two annotations: capitalization and POS tagging . |
Query Annotation Example | In this scheme, each query is marked up using three annotations: capitalization, POS tags, and segmentation indicators.
Related Work | Most of the previous work on query annotation focuses on performing a particular annotation task (e.g., segmentation or POS tagging) in isolation.
Approaches | A typical pipeline consists of a POS tagger, dependency parser, and semantic role labeler.
Approaches | Brown Clusters We use fully unsupervised Brown clusters (Brown et al., 1992) in place of POS tags . |
Approaches | We define the DMV such that it generates sequences of word classes: either POS tags or Brown clusters as in Spitkovsky et al. |
Experiments | Our experiments are subtractive, beginning with all supervision available and then successively removing (a) dependency syntax, (b) morphological features, (c) POS tags, and (d) lemmas.
Experiments | The CoNLL-2009 Shared Task (Hajic et al., 2009) dataset contains POS tags, lemmas, morphological features, syntactic dependencies, predicate senses, and semantic role annotations for 7 languages: Catalan, Chinese, Czech, English, German, Japanese, and Spanish.
Experiments | We first compare our models trained as a pipeline, using all available supervision (syntax, morphology, POS tags, lemmas) from the CoNLL-2009 data.
Introduction | • Use of Brown clusters in place of POS tags for low-resource SRL.
Related Work | (2012) limit their exploration to a small set of basic features, and include high-resource supervision in the form of lemmas, POS tags, and morphology available from the CoNLL 2009 data.
Related Work | Our experiments also consider ‘longer’ pipelines that include earlier stages: a morphological analyzer, POS tagger, and lemmatizer.
Features | • Coordination In a coordinate structure, the two adjacent conjuncts usually agree with each other on POS tags and their span lengths.
Features | Therefore, we add different features to capture POS tag and span length consistency in a coordinate structure. |
Features | • Span Length This feature captures the distribution of the binned span length of each POS tag.
Introduction | When proposing a small move, i.e., sampling a head of the word, we can also jointly sample its POS tag from a set of alternatives provided by the tagger. |
Sampling-Based Dependency Parsing with Global Features | For instance, we can sample the POS tag, the dependency relation or morphology information.
Sampling-Based Dependency Parsing with Global Features | POS correction scenario in which only the predicted POS tags are provided in the testing phase, while both gold and predicted tags are available for the training set. |
Sampling-Based Dependency Parsing with Global Features | We extend our model such that it jointly learns how to predict a parse tree and also correct the predicted POS tags for a better parsing performance. |
Abstract | In particular, we extend the monolingual infinite tree model (Finkel et al., 2007) to a bilingual scenario: each hidden state (POS tag) of a source-side dependency tree emits a source word together with its aligned target word, either jointly (joint model), or independently (independent model).
Abstract | Evaluations of Japanese-to-English translation on the NTCIR-9 data show that our induced Japanese POS tags for dependency trees improve the performance of a forest-to-string SMT system.
Introduction | However, dependency parsing, which is a popular choice for Japanese, can incorporate only shallow syntactic information, i.e., POS tags, compared with the richer syntactic phrasal categories in constituency parsing.
Introduction | Figure 1: Examples of Existing Japanese POS Tags and Dependency Structures |
Introduction | If we could discriminate POS tags for two cases, we might improve the performance of a Japanese-to-English SMT system. |
Abstract | In terms of robustness, we try using different types of external data to increase lexical coverage, and find that simple POS tags have the most effect, increasing coverage on unseen data by up to 45%. |
Abstract | Even using vanilla POS tags we achieve some efficiency gains, but when using detailed lexical types as supertags we manage to halve parsing time with minimal loss of coverage or precision. |
Background | Supertagging is the process of assigning probable ‘supertags’ to words before parsing to restrict parser ambiguity, where a supertag is a tag that includes more specific information than the typical POS tags.
Parser Restriction | In these experiments we look at two methods of restricting the parser, first by using POS tags and then using lexical types. |
Parser Restriction | We use TreeTagger (Schmid, 1994) to produce POS tags and then open class words are restricted if the POS tagger assigned a tag with a probability over a certain threshold. |
Parser Restriction | Table 1: Results obtained when restricting the parser lexicon according to the POS tag, where words are restricted according to a threshold of POS probabilities.
Background | The best performing model interpolates a word trigram model with a trigram model that chains a POS model with a supertag model, where the POS model conditions on the previous two POS tags, and the supertag model conditions on the previous two POS tags as well as the current one. |
The Approach | Clark (2002) notes in his parsing experiments that the POS tags of the surrounding words are highly informative. |
The Approach | As discussed below, a significant gain in hypertagging accuracy resulted from including features sensitive to the POS tags of a node’s parent, the node itself, and all of its arguments and modifiers. |
The Approach | Predicting these tags requires the use of a separate POS tagger, which operates in a manner similar to the hypertagger itself, though exploiting a slightly different set of features (e.g., including features corresponding to the four-character prefixes and suffixes of rare logical predication names).
A Generative PCFG Model | The entries in such a lexicon may be thought of as meaningful surface segments paired up with their PoS tags, l_i = (s_i, p_i), but note that a surface segment s need not be a space-delimited token.
A Generative PCFG Model | (1996) who consider the kind of probabilities a generative parser should get from a PoS tagger, and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context, but rather, it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
Model Preliminaries | A Hebrew surface token may have several readings, each corresponding to a sequence of segments and their corresponding PoS tags.
Model Preliminaries | We refer to different readings as different analyses whereby the segments are deterministic given the sequence of PoS tags . |
Model Preliminaries | We refer to a segment and its assigned PoS tag as a lexeme, and so analyses are in fact sequences of lexemes. |
Modern Hebrew Structure | Such discrepancies can be aligned via an intermediate level of PoS tags . |
Modern Hebrew Structure | PoS tags impose a unique morphological segmentation on surface tokens and present a unique valid yield for syntactic trees. |
Previous Work on Hebrew Processing | Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task. |
Background | To perform segmentation and tagging simultaneously in a uniform framework, according to Ng and Low (2004), the tag is composed of a word boundary part and a POS part, e.g., “B_NN” refers to the first character in a word with POS tag “NN”.
Background | As for the POS tag, we shall use the 33 tags in the Chinese Treebank.
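The composition of boundary and POS parts described above can be sketched as follows. This is an illustration under the assumption of a four-symbol B/M/E/S boundary alphabet (begin, middle, end, single-character word); the exact boundary scheme varies between papers.

```python
def character_tags(words_with_pos):
    """Compose per-character tags for joint segmentation and tagging:
    e.g. a two-character word tagged NN yields B_NN, E_NN, and a
    single-character word tagged NN yields S_NN."""
    tags = []
    for word, pos in words_with_pos:
        if len(word) == 1:
            tags.append(("S", pos))
        else:
            tags.append(("B", pos))
            tags.extend(("M", pos) for _ in word[1:-1])
            tags.append(("E", pos))
    return ["%s_%s" % (b, p) for b, p in tags]
```

With this encoding, joint segmentation and tagging reduces to a single character-level sequence labelling problem.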
Introduction | The traditional way of segmentation and tagging is performed in a pipeline approach, first segmenting a sentence into words, and then assigning each word a POS tag . |
Introduction | The pipeline approach is very simple to implement, but frequently causes error propagation, given that wrong segmentations in the earlier stage harm the subsequent POS tagging (Ng and Low, 2004).
Introduction | The joint approaches of word segmentation and POS tagging (joint S&T) are proposed to resolve these two tasks simultaneously. |
Method | In fact, the sparsity is also a common phenomenon among character-based CWS and POS tagging . |
Method | The performance measurement indicators for word segmentation and POS tagging (joint S&T) are the balanced F-score, F = 2PR/(P+R), the harmonic mean of precision (P) and recall (R), and out-of-vocabulary recall (OOV-R).
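The balanced F-score named above, the harmonic mean of precision and recall, is a one-liner:

```python
def f_score(p, r):
    # Balanced F-score F = 2PR / (P + R), the harmonic mean of
    # precision P and recall R; defined as 0 when both are 0.
    return 2.0 * p * r / (p + r) if p + r > 0 else 0.0
```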
Related Work | There are few explorations of semi-supervised approaches for CWS or POS tagging in previous works. |
Character-based Chinese Parsing | To produce character-level trees for Chinese NLP tasks, we develop a character-based parsing model, which can jointly perform word segmentation, POS tagging and phrase-structure parsing. |
Character-based Chinese Parsing | We make two extensions to their work to enable joint segmentation, POS tagging and phrase-structure parsing from the character level. |
Character-based Chinese Parsing | First, we split the original SHIFT action into SHIFT-SEPARATE(t) and SHIFT-APPEND, which jointly perform the word segmentation and POS tagging tasks.
Introduction | Compared to a pipeline system, the advantages of a joint system include reduction of error propagation, and the integration of segmentation, POS tagging and syntax features. |
Introduction | To analyze word structures in addition to phrase structures, our character-based parser naturally performs word segmentation, POS tagging and parsing jointly.
Introduction | We extend their shift-reduce framework, adding more transition actions for word segmentation and POS tagging, and defining novel features that capture character information.
Word Structures and Syntax Trees | They made use of this information to help joint word segmentation and POS tagging . |
Word Structures and Syntax Trees | In particular, we mark the original nodes that represent POS tags in CTB-style trees with “-t”, and insert our word structures as unary subnodes of the “-t” nodes. |
Abstract | In this paper we present an unsupervised algorithm for identifying verb arguments, where the only type of annotation required is POS tagging . |
Algorithm | This parser is unique in that it is able to induce a bracketing (unlabeled parsing) from raw text (without even using POS tags ) achieving state-of-the-art results. |
Algorithm | The only type of supervised annotation we use is POS tagging . |
Algorithm | We use the taggers MX-POST (Ratnaparkhi, 1996) for English and Tree-Tagger (Schmid, 1994) for Spanish, to obtain POS tags for our model. |
Introduction | A standard SRL algorithm requires thousands to tens of thousands of sentences annotated with POS tags, syntactic annotation and SRL annotation. |
Introduction | Rasooli and Faili (2012) and Bisk and Hockenmaier (2012) made some efforts to boost the verbocentricity of the inferred structures; however, both of the approaches require manual identification of the POS tags marking the verbs, which renders them useless when unsupervised POS tags are employed. |
Related Work | Our dependency model contained a submodel which directly prioritized subtrees that form reducible sequences of POS tags . |
Related Work | Reducibility scores of given POS tag sequences were estimated using a large corpus of Wikipedia articles. |
Related Work | The weakness of this approach was the fact that longer sequences of POS tags are very sparse and no reducibility scores could be estimated for them. |
STOP-probability estimation | Hereinafter, P_stop(c_h, dir) denotes the STOP-probability we want to estimate from a large corpus; c_h is the head's POS tag and dir is the direction in which the STOP probability is estimated. |
STOP-probability estimation | For each POS tag c_h in the given corpus, we first compute its left and right "raw" scores S_stop(c_h, left) and S_stop(c_h, right) as the relative number of times a word with POS tag c_h was in the first (or last) position in a reducible sequence found in the corpus. |
STOP-probability estimation | Their main purpose is to sort the POS tags according to their “reducibility”. |
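The raw-score computation described above can be sketched as follows. The data structures are illustrative, and normalizing by each tag's corpus frequency is our assumption about what "relative number of times" means here:

```python
from collections import Counter

def raw_stop_scores(reducible_seqs, tag_counts):
    """For each POS tag c, compute the relative number of times a word
    with tag c appeared in the first (left) or last (right) position
    of a reducible sequence, normalized by the tag's corpus count."""
    first = Counter(seq[0] for seq in reducible_seqs if seq)
    last = Counter(seq[-1] for seq in reducible_seqs if seq)
    scores = {}
    for tag, n in tag_counts.items():
        scores[tag] = (first[tag] / n, last[tag] / n)  # (left, right)
    return scores

# Toy corpus: reducible POS-tag sequences and overall tag counts
seqs = [["DT", "NN"], ["JJ", "NN"], ["RB"]]
counts = {"DT": 4, "NN": 6, "JJ": 2, "RB": 1}
print(raw_stop_scores(seqs, counts)["NN"])  # -> (0.0, 0.3333333333333333)
```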
Abstract | In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron. |
Experiments | We use the standard split for dependency parsing and the split used by Ratnaparkhi (1996) for POS tagging. |
Experiments | For dependency parsing, POS tags of the training set are generated using 10-fold jackknifing. |
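The 10-fold jackknifing procedure mentioned above can be sketched as follows: the training set is split into 10 folds, and each fold is tagged by a model trained on the other nine. The `train_tagger` and `tag` functions are hypothetical placeholders, not from the cited work:

```python
def jackknife(sentences, train_tagger, tag, k=10):
    """Return automatic tags for every training sentence, where each
    fold is tagged by a model trained on the remaining k-1 folds."""
    folds = [sentences[i::k] for i in range(k)]
    tagged = []
    for i, held_out in enumerate(folds):
        rest = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_tagger(rest)
        tagged.extend(tag(model, s) for s in held_out)
    return tagged
```

This way the training data receives automatic (rather than gold) tags, so the parser sees tag noise comparable to what it will encounter at test time.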
Experiments | For dependency parsing, we assume gold segmentation and POS tags for the input. |
Introduction | The proposed solution is general and can also be applied to other algorithms that exhibit spurious ambiguity, such as easy-first POS tagging (Ma et al., 2012) and transition-based dependency parsing with dynamic oracle (Goldberg and Nivre, 2012). |
Introduction | In this paper, we report experimental results on both easy-first dependency parsing and POS tagging (Ma et al., 2012). |
Introduction | We show that both easy-first POS tagging and dependency parsing can be improved significantly from beam search and global learning. |
Training | wp denotes the head word of p, tp denotes the POS tag of wp. |
Algorithm | We induce the number of POS tags of each word type at this step. |
Algorithm | Furthermore, they will have the same POS tags . |
Experiments | As a result, this method inaccurately induces POS tags for the occurrences of word types with high gold tag perplexity. |
Experiments | In other words, we assume that the number of different POS tags of each word type is equal to 2. |
Introduction | Part-of-speech (POS) tagging is an important preprocessing step for many natural language processing applications, because grammatical rules are not functions of individual words; instead, they are functions of word categories. |
Introduction | Unlike supervised POS tagging systems, POS induction systems make use of unsupervised methods. |
Introduction | Type-based methods suffer from POS ambiguity because one POS tag is assigned to each word type. |
Experiments | We investigate the use of smoothing in two test systems, conditional random field (CRF) models for POS tagging and chunking. |
Experiments | Our baseline CRF system for POS tagging follows the model described by Lafferty et al. |
Experiments | In addition to the transition, word-level, and orthographic features, we include features relating automatically-generated POS tags and the chunk labels. |
Introduction | effects of our smoothing techniques on two sequence-labeling tasks, POS tagging and chunking, to answer the following: I. |
Introduction | Our best smoothing technique improves a POS tagger by 11% on OOV words, and a chunker by an impressive 21% on OOV words. |
Abstract | First, to resolve the error propagation problem of the traditional pipeline approach, we incorporate POS tagging into the syntactic parsing process. |
Introduction | First, POS tagging is typically performed separately as a preliminary step, and POS tagging errors will propagate to the parsing process. |
Introduction | This problem is especially severe for languages where POS tagging accuracy is relatively low. This is the case for Chinese, where there are fewer contextual clues that can be used to inform the tagging process, and some of the tagging decisions are actually influenced by the syntactic structure of the sentence. |
Introduction | First, we integrate POS tagging into the parsing process and jointly optimize these two processes simultaneously. |
Joint POS Tagging and Parsing with Nonlocal Features | To address the drawbacks of the standard transition-based constituent parsing model (described in Section 1), we propose a model to jointly solve POS tagging and constituent parsing with nonlocal features. |
Joint POS Tagging and Parsing with Nonlocal Features | 3.1 Joint POS Tagging and Parsing |
Joint POS Tagging and Parsing with Nonlocal Features | POS tagging is often taken as a preliminary step for transition-based constituent parsing; therefore, the accuracy of POS tagging greatly affects parsing performance. |
Transition-based Constituent Parsing | Figure 1: Two constituent trees for an example sentence w0w1w2 with POS tags abc. |
Transition-based Constituent Parsing | For example, in Figure 1, for the input sentence w0w1w2 and its POS tags abc, our parser can construct two parse trees using the action sequences given below these trees. |
Abstract | We test the efficacy of this method in the context of Chinese word segmentation and part-of-speech tagging, where no segmentation and POS tagging standards are widely accepted due to the lack of morphology in Chinese. |
Experiments | For example, currently, most Chinese constituency and dependency parsers are trained on some version of CTB, using its segmentation and POS tagging as the de facto standards. |
Experiments | Therefore, we expect the knowledge adapted from PD to lead to a more precise CTB-style segmenter and POS tagger, which would in turn reduce error propagation to parsing (and translation). |
Introduction | Figure 1: Incompatible word segmentation and POS tagging standards between CTB (upper) and People's Daily (lower). |
Introduction | Our experiments show that adaptation from PD to CTB results in a significant improvement in segmentation and POS tagging , with error reductions of 30.2% and 14%, respectively. |
Segmentation and Tagging as Character Classification | While in Joint S&T, each word is further annotated with a POS tag: |
Segmentation and Tagging as Character Classification | where t_k (k = 1..m) denotes the POS tag for the word C_{e_{k-1}+1:e_k}. |
Segmentation and Tagging as Character Classification | In Ng and Low (2004), Joint S&T can also be treated as a character classification problem, where a boundary tag is combined with a POS tag in order to give the POS information of the word containing these characters. |
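The combined boundary/POS labeling can be sketched as follows, using B/I/E/S boundary tags joined with each word's POS tag (the exact tag naming is illustrative; Ng and Low (2004) use a similar four-way boundary scheme):

```python
def char_labels(words_with_tags):
    """Convert a segmented, POS-tagged sentence into per-character
    labels combining a boundary tag (B/I/E/S) with the POS tag."""
    labels = []
    for word, pos in words_with_tags:
        if len(word) == 1:
            labels.append(("S", pos))      # single-character word
        else:
            labels.append(("B", pos))      # word-initial character
            labels.extend(("I", pos) for _ in word[1:-1])  # internal
            labels.append(("E", pos))      # word-final character
    return [f"{b}-{p}" for b, p in labels]

print(char_labels([("中国", "NR"), ("人", "NN")]))
# -> ['B-NR', 'E-NR', 'S-NN']
```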
Abstract | In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging . |
Background | In the joint word segmentation and POS tagging process, the task is to predict a path |
Background | p is its POS tag, and a "#" symbol denotes the number of elements in each variable. |
Background | words found in the system’s word dictionary, have regular POS tags . |
Introduction | Word segmentation and POS tagging results are required as inputs to other NLP tasks, such as phrase chunking, dependency parsing, and machine translation. |
Introduction | Word segmentation and POS tagging in a joint process have received much attention in recent research and have shown improvements over a pipelined fashion (Ng and Low, 2004; Nakagawa and Uchimoto, 2007; Zhang and Clark, 2008; Jiang et al., 2008a; Jiang et al., 2008b). |
Introduction | In the joint word segmentation and POS tagging process, one serious problem is caused by unknown words, which are defined as words that are not found in a training corpus or in a system's word dictionary. |
Policies for correct path selection | We can directly estimate the statistics of known words from an annotated corpus where a sentence is already segmented into words and assigned POS tags . |
Policies for correct path selection | 3We consider a word and its POS tag a single entry. |
Abstract | We show for both English POS tagging and Chinese word segmentation that, with proper representation, a large number of deterministic constraints can be learned from training examples, and these are useful in constraining probabilistic inference. |
Abstract | "assign label t to word w" for POS tagging. |
Abstract | In this work, we explore deterministic constraints for two fundamental NLP problems, English POS tagging and Chinese word segmentation. |
Abstract | We propose the first tagset designed for the task of character-level POS tagging . |
Abstract | We propose a method that performs character-level POS tagging jointly with word segmentation and word-level POS tagging . |
Character-level POS Tagset | We propose a tagset for the task of character-level POS tagging . |
Chinese Morphological Analysis with Character-level POS | Previous studies have shown that jointly processing word segmentation and POS tagging is preferable to pipeline processing, which can propagate errors (Nakagawa and Uchimoto, 2007; Kruengkrai et al., 2009). |
Chinese Morphological Analysis with Character-level POS | Baseline features: For word-level nodes that represent known words, we use the symbols w, p and l to denote the word form, POS tag and length of the word, respectively. |
Chinese Morphological Analysis with Character-level POS | Proposed features: For word-level nodes, the function CP_pair(w) returns the pair of the character-level POS tags of the first and last characters of w, and CP_all(w) returns the sequence of character-level POS tags of w. If either the pair or the sequence of character-level POS is ambiguous, which means there are multiple paths in the sub-lattice of the word-level node, then the values on the current best path (with local context) during the Viterbi search will be returned. |
Evaluation | To evaluate our proposed method, we have conducted two sets of experiments on CTB5: word segmentation, and joint word segmentation and word-level POS tagging . |
Evaluation | The results of the word segmentation experiment and the joint experiment of segmentation and POS tagging are shown in Table 5(a) and Table 5(b), respectively. |
Introduction | with Character-level POS Tagging |
Introduction | We propose the first tagset designed for the task of character-level POS tagging , based on which we manually annotate the entire CTB5. |
Introduction | We propose a method that performs character-level POS tagging jointly with word segmentation and word-level POS tagging . |
Experiments | But only sentence boundaries, POS tags and NER labels were kept as the annotation of the corpus. |
Introduction | IR can easily make use of this knowledge: for a when-question, IR retrieves sentences with tokens labeled as DATE by NER, or POS-tagged as CD. |
Introduction | Moreover, our approach extends easily beyond fixed answer types such as named entities: we are already using POS tags as a demonstration. |
Method | We let the trained QA system guide the query formulation when performing coupled retrieval with Indri (Strohman et al., 2005), given a corpus already annotated with POS tags and NER labels. |
Method | Since NER and POS tags are not lexicalized, they accumulate many more counts (i.e. |
Method | NER Types First We found NER labels better indicators of expected answer types than POS tags . |
A Latent Variable Parser | The Berkeley parser has been applied to the TüBa-D/Z corpus in the constituent parsing shared task of the ACL-2008 Workshop on Parsing German (Petrov and Klein, 2008), achieving F1-measures of 85.10% and 83.18% with and without gold-standard POS tags, respectively. |
Experiments | As part of our experiment design, we investigated the effect of providing gold POS tags to the parser, and the effect of incorporating edge labels into the nonterminal labels for training and parsing. |
Experiments | In all cases, gold annotations which include gold POS tags were used when training the parser. |
Experiments | This table shows the results after five iterations of grammar modification, parameterized over whether we provide gold POS tags for parsing, and edge labels for training and parsing. |
Introduction | the unlexicalized, latent variable-based Berkeley parser (Petrov et al., 2006). Without any language- or model-dependent adaptation, we achieve state-of-the-art results on the TüBa-D/Z corpus (Telljohann et al., 2004), with an F1-measure of 95.15% using gold POS tags. |
Introduction | It is found that the three techniques perform about equally well, with F1 of 94.1% using POS tags from the TnT tagger, and 98.4% with gold tags. |
Abstract | Experiments on three tasks (POS tagging, joint POS tagging and chunking, and supertagging) show that the new algorithm is several orders of magnitude faster than the basic Viterbi and a state-of-the-art algorithm, CARPEDIEM (Esposito and Radicioni, 2009). |
Introduction | Now they are indispensable in a wide range of NLP tasks including chunking, POS tagging , NER and so on (Sha and Pereira, 2003; Tsuruoka and Tsujii, 2005; Lin and Wu, 2009). |
Introduction | For example, there are more than 40 and 2000 labels in POS tagging and supertagging, respectively (Brants, 2000; Matsuzaki et al., 2007). |
Introduction | As we shall see later, we need over 300 labels to reduce joint POS tagging and chunking to a single sequence labeling problem. |
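The cross-product reduction described above can be sketched minimally; the label inventories below are toy examples, not the ones used in the cited work, and in practice presumably only label pairs attested in the training data are kept, which is how the count stays near 300 rather than the full product:

```python
from itertools import product

# Reduce joint POS tagging and chunking to one sequence labeling task
# by taking the cross-product of the two label sets.
pos_tags = ["NN", "VB", "JJ", "DT"]
chunk_tags = ["B-NP", "I-NP", "B-VP", "O"]
joint_labels = [f"{c}|{p}" for c, p in product(chunk_tags, pos_tags)]
print(len(joint_labels))  # -> 16
```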
Experimental Setup | These datasets include manually annotated dependency trees, POS tags and morphological information. |
Experimental Setup | In contrast, assume we take the cross-product of the auxiliary word vector values, POS tags and lexical items of a word and its context, and add the crossed values into a normal model (in gbhmm). |
Introduction | This low dimensional syntactic abstraction can be thought of as a proxy to manually constructed POS tags . |
Introduction | For instance, on the English dataset, the low-rank model trained without POS tags achieves 90.49% on first-order parsing, while the baseline gets 86.70% if trained under the same conditions, and 90.58% if trained with 12 core POS tags . |
Problem Formulation | pos, form, lemma and morph stand for the fine POS tag , word form, word lemma and the morphology feature (provided in CoNLL format file) of the current word. |
Problem Formulation | For example, pos-p means the POS tag to the left of the current word in the sentence. |
Problem Formulation | Other possible features include, for example, the label of the arc h —> m, the POS tags between the head and the modifier, boolean flags which indicate the occurrence of in-between punctuation or conjunctions, etc. |
Results | The rationale is that given all other features, the model would induce representations that play a similar role to POS tags . |
Results | Table 4: The first three columns show parsing results when models are trained without POS tags . |
Results | the performance of a parser trained with 12 Core POS tags . |
Character-Level Dependency Tree | system, each word is initialized by the action SHW with a POS tag , before being incrementally modified by a sequence of intra-word actions, and finally being completed by the action PW. |
Character-Level Dependency Tree | L and R denote the two elements over which the dependencies are built; the subscripts lc1 and rc1 denote the leftmost and rightmost children, respectively; the subscripts lc2 and rc2 denote the second leftmost and second rightmost children, respectively; w denotes the word; t denotes the POS tag; c denotes the head character; ls_w and rs_w denote the smallest left and right subwords, respectively, as shown in Figure 2. |
Character-Level Dependency Tree | Since the first element of the queue can be shifted onto the stack by either SH or AR, it is more difficult to assign a POS tag to each word by using a single action. |
Conversion Process | Since we are applying these to CCGbank NP structures rather than the Penn Treebank, the POS tag based heuristics are sufficient to determine heads accurately. |
Conversion Process | Some POS tags require special behaviour. |
Conversion Process | Accordingly, we do not alter tokens with POS tags of DT and PRP$. Instead, their sibling node is given the category N and their parent node is made the head. |
Experiments | Table 3: Parsing results with gold-standard POS tags |
Experiments | Table 4: Parsing results with automatic POS tags |
Experiments | We have also experimented with using automatically assigned POS tags . |
NER features | Many of these features generalise the head words and/or POS tags that are already part of the feature set. |
NER features | There are already features in the model describing each combination of the children’s head words and POS tags , which we extend to include combinations with |
Experimental Setup | The first stage, ASR, yields an automatic transcription, which is followed by the POS tagging stage. |
Experimental Setup | The steps for automatic assessment of overall proficiency follow an analogous process (either including the POS tagger or not), depending on the objective measure being evaluated. |
Experimental Setup | 5.3.2 POS tagger |
Related Work | The idea of capturing differences in POS tag distributions for classification has been explored in several previous studies. |
Related Work | In the area of text-genre classification, POS tag distributions have been found to capture genre differences in text (Feldman et al., 2009; Marin et al., 2009); in a language testing context, it has been used in grammatical error detection and essay scoring (Chodorow and Leacock, 2000; Tetreault and Chodorow, 2008). |
Shallow-analysis approach to measuring syntactic complexity | Consider the two sentence fragments below taken from actual responses (the bigrams of interest and their associated POS tags are boldfaced). |
Data and Tools | The set of POS tags needs to be consistent across languages and treebanks. |
Data and Tools | For this reason we use the universal POS tag set of Petrov et al. |
Data and Tools | POS tags are not available for parallel data in the Europarl and Kaist corpus, so we need to pro- |
Parsing experiments | For example, fdep contains lexicalized “in-between” features that depend on the head and modifier words as well as a word lying in between the two; in contrast, previous work has generally defined in-between features for POS tags only. |
Parsing experiments | As in previous work, English evaluation ignores any token whose gold-standard POS tag is one of { `` '' : , . }. |
Parsing experiments | First, we define 4-gram features that characterize the four relevant indices using words and POS tags; examples include POS 4-grams and mixed 4-grams with one word and three POS tags . |
Related work | These indices allow the use of arbitrary features predicated on the position of the grandparent (e. g., word identity, POS tag, contextual POS tags ) without affecting the asymptotic complexity of the parsing algorithm. |
Clustering phrase pairs directly using the K-means algorithm | Using a scheme based on source and target phrases while accounting for phrase size, with 36 word classes (the size of the Penn English POS tag set) for both languages, yields a grammar with (36 + 2 × 36²)² = 6.9M nonterminal labels. |
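The label-count arithmetic above can be checked directly. The reading of N + 2N² as one class or an ordered pair of classes per phrase side, squared over (source, target) pairs, is our interpretation of the formula:

```python
# With N word classes per language, each phrase side admits
# N + 2*N^2 labelings, and the full label is a (source, target)
# pair, squaring the count.
N = 36
per_side = N + 2 * N**2
total = per_side**2
print(per_side, total)  # -> 2628 6906384
```

So the grammar has about 6.9 million nonterminal labels, as the text states.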
Conclusion and discussion | Crucially, our methods only rely on “shallow” lexical tags, either generated by POS taggers or by automatic clustering of words into classes. |
Conclusion and discussion | Using automatically obtained word clusters instead of POS tags yields essentially the same results, thus making our methods applicable to all languages pairs with parallel corpora, whether syntactic resources are available for them or not. |
Conclusion and discussion | On the other extreme, the clustering based approach labels phrases based on the contained words alone.8 The POS grammar represents an intermediate point on this spectrum, since POS tags can change based on surrounding words in the sentence; and the position of the K-means model depends on the influence of the phrase contexts on the clustering process. |
Experiments | The source and target language parses for the syntax-augmented grammar, as well as the POS tags for our POS-based grammars were generated by the Stanford parser (Klein and Manning, 2003). |
Experiments | Our approach, using target POS tags (‘POS-tgt (no phr. |
Experiments | , 36 (the number of Penn Treebank POS tags, used for the 'POS' models, is 36). For 'Clust', we see a comfortably wide plateau of nearly-identical scores from N = 7, ... |
Hard rule labeling from word classes | We use the simple term ‘tag’ to stand for any kind of word-level analysis—a syntactic, statistical, or other means of grouping word types or tokens into classes, possibly based on their position and context in the sentence, POS tagging being the most obvious example. |
Related work | (2007) improve the statistical phrase-based MT model by injecting supertags, lexical information such as the POS tag of the word and its subcategorization information, into the phrase table, resulting in generalized phrases with placeholders in them. |
Approach | These targeted morphological features are effective during LP because words that share them are much more likely to actually share POS tags . |
Approach | Since the LP graph contains a node for each corpus token, and each node is labeled with a distribution over POS tags , the graph provides a corpus of sentences labeled with noisy tag distributions along with an expanded tag dictionary. |
Data | tokenized and labeled with POS tags by two linguistics graduate students, each of whom was studying one of the languages. |
Data | The KIN and MLG data have 12 and 23 distinct POS tags , respectively. |
Data | The PTB uses 45 distinct POS tags . |
Experiments | Moreover, since large gains in accuracy can be achieved by spending a small amount of time just annotating word types with POS tags, we are led to conclude that time should be spent annotating types or tokens instead of developing an FST. |
Introduction | Haghighi and Klein (2006) develop a model in which a POS-tagger is learned from a list of POS tags and just three “prototype” word types for each tag, but their approach requires a vector space to compute the distributional similarity between prototypes and other word types in the corpus. |
Introduction | We evaluate the effectiveness of our method by using linear-chain conditional random fields (CRFs) and three traditional NLP tasks, namely, text chunking (shallow parsing), named entity recognition, and POS tagging . |
Log-Linear Models | The model is used for a variety of sequence labeling tasks such as POS tagging , chunking, and named entity recognition. |
Log-Linear Models | We evaluate the effectiveness of our training algorithm using linear-chain CRF models and three NLP tasks: text chunking, named entity recognition, and POS tagging. |
Log-Linear Models | The features used in this experiment were unigrams and bigrams of neighboring words, and unigrams, bigrams and trigrams of neighboring POS tags . |
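The feature templates described above can be sketched as follows; the feature-string format is our own illustration, not the cited paper's:

```python
def ngram_features(words, tags, i):
    """Features at position i: unigrams and bigrams of neighboring
    words, and unigrams, bigrams and trigrams of neighboring POS tags."""
    feats = []
    for off in (-1, 0, 1):  # unigrams
        if 0 <= i + off < len(words):
            feats.append(f"w[{off}]={words[i + off]}")
            feats.append(f"t[{off}]={tags[i + off]}")
    for off in (-1, 0):  # bigrams
        if 0 <= i + off and i + off + 1 < len(words):
            feats.append(f"w[{off}]|w[{off+1}]={words[i+off]}|{words[i+off+1]}")
            feats.append(f"t[{off}]|t[{off+1}]={tags[i+off]}|{tags[i+off+1]}")
    if 1 <= i < len(words) - 1:  # tag trigram
        feats.append(f"t[-1]|t[0]|t[+1]={tags[i-1]}|{tags[i]}|{tags[i+1]}")
    return feats
```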
Experimental Setup | Relations were extracted using regular expressions over the output of a POS tagger and an NP chunker. |
Experimental Setup | We use a Maximum Entropy POS Tagger , trained on the Penn Treebank, and the WordNet lemmatizer, both implemented within the NLTK package (Loper and Bird, 2002). |
Experimental Setup | To obtain a coarse-grained set of POS tags , we collapse the tag set to 7 categories: nouns, verbs, adjectives, adverbs, prepositions, the word “to” and a category that includes all other words. |
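A sketch of such a tag-collapsing map follows; the exact assignment of Penn Treebank tags to the 7 categories is our assumption, not taken from the cited paper:

```python
# Collapse fine-grained PTB tags into 7 coarse categories:
# nouns, verbs, adjectives, adverbs, prepositions, "to", and other.
COARSE = {
    "NN": "noun", "NNS": "noun", "NNP": "noun", "NNPS": "noun",
    "VB": "verb", "VBD": "verb", "VBG": "verb", "VBN": "verb",
    "VBP": "verb", "VBZ": "verb",
    "JJ": "adj", "JJR": "adj", "JJS": "adj",
    "RB": "adv", "RBR": "adv", "RBS": "adv",
    "IN": "prep", "TO": "to",
}

def coarse_tag(ptb_tag: str) -> str:
    return COARSE.get(ptb_tag, "other")

print([coarse_tag(t) for t in ["NNS", "VBD", "TO", "DT"]])
# -> ['noun', 'verb', 'to', 'other']
```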
Our Proposal: A Latent LC Approach | 1We use a POS tagger to identify content words. |
Our Proposal: A Latent LC Approach | In addition, we use POS-based features that encode the most frequent POS tag for the word lemma and the second most frequent POS tag (according to R). |
Our Proposal: A Latent LC Approach | Information about the second most frequent POS tag can be important in identifying light verb constructions, such as “take a swim” or “give a smile”, where the object is derived from a verb. |
Abstract | The method is almost free of linguistic resources (except POS tags ), and requires no elaborated linguistic rules. |
Conclusion | almost knowledge-free (except POS tags ) framework. |
Conclusion | The method is almost free of linguistic resources (except POS tags ), and does not rely on elaborated linguistic rules. |
Introduction | This framework is fully unsupervised and purely data-driven, and requires very lightweight linguistic resources (i.e., only POS tags ). |
Methodology | In order to obtain lexical patterns, we can define regular expressions with POS tags 2 and apply the regular expressions on POS tagged texts. |
Methodology | 2Such expressions are very simple and easy to write because we only need to consider POS tags of adverbial and auxiliary word. |
Methodology | Our algorithm is similar in spirit to double propagation (Qiu et al., 2011); however, the differences are apparent: firstly, we use very lightweight linguistic information (only POS tags); secondly, our major contributions are statistical measures that address the following key issues: first, measuring the utility of lexical patterns; second, measuring the possibility of a candidate word being a new word. |
Dependency Parsing: Baseline | pos: POS tag of word |
Dependency Parsing: Baseline | cpos1: coarse POS, the first letter of the POS tag of the word |
Dependency Parsing: Baseline | cpos2: coarse POS, the first two letters of the POS tag of the word |
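The coarse-POS features in the feature table above can be sketched minimally (an illustration of the prefix-truncation idea, not the baseline's actual code):

```python
def coarse_pos_features(fine_tag: str) -> dict:
    """Baseline-style features: the full fine-grained POS tag plus the
    first one and first two letters as coarse POS (e.g., NNS -> N, NN)."""
    return {
        "pos": fine_tag,
        "cpos1": fine_tag[:1],
        "cpos2": fine_tag[:2],
    }

print(coarse_pos_features("NNS"))
# -> {'pos': 'NNS', 'cpos1': 'N', 'cpos2': 'NN'}
```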
Exploiting the Translated Treebank | Chinese words should be strictly segmented according to the guideline before POS tags and dependency relations are annotated. |
Exploiting the Translated Treebank | The difference is that rootscore counts how often the given POS tag occurs as ROOT, and pairscore counts how often a combination of two POS tags occurs in a dependency relationship. |
Treebank Translation and Dependency Transformation | Bind the POS tag and dependency relation of a word with the word itself; 2. |
Treebank Translation and Dependency Transformation | After the target sentence is generated, the attached POS tags and dependency information of each English word will also be transferred to each corresponding Chinese word. |
Ad hoc rule detection | Units of comparison To determine similarity, one can compare dependency relations, POS tags , or both. |
Ad hoc rule detection | Thus, we use the pairs of dependency relations and POS tags as the units of comparison. |
Additional information | One method which does not have this problem of overflagging uses a “lexicon” of POS tag pairs, examining relations between POS, irrespective of position. |
Evaluation | We use the gold standard POS tags for all experiments. |
Evaluation | For example, the parsed rule TA —> IG:IG RO has a correct dependency relation (IG) between the POS tags IG and its head RO, yet is assigned a whole rule score of 2 and a bigram score of 20. |
Evaluation | This is likely due to the fact that Alpino has the smallest label set of any of the corpora, with only 24 dependency labels and 12 POS tags (cf. |
Annotator disagreements across domains and languages | In this study, we had between 2 and 10 individual annotators with degrees in linguistics annotate different kinds of English text with POS tags, e.g., newswire text (PTB WSJ Section 00), transcripts of spoken language (from a database containing transcripts of conversations, Talkbank), as well as Twitter posts. |
Annotator disagreements across domains and languages | We instructed annotators to use the 12 universal POS tags of Petrov et al. |
Annotator disagreements across domains and languages | Experiments with variation n-grams on WSJ (Dickinson and Meurers, 2003) and the French data lead us to estimate that the fine-to-coarse mapping of POS tags disregards about 20% of observed tag-pair confusion types, most of which relate to fine-grained verb and noun distinctions, e.g., past participle versus past tense in "[..] criminal lawyers speculated/VBD vs. VBN that [..]". |
Related work | (2014) use small samples of doubly-annotated POS data to estimate annotator reliability and show how those metrics can be implemented in the loss function when inducing POS taggers to reflect confidence we can put in annotations. |
Related work | They show that not biasing the theory towards a single annotator but using a cost-sensitive learning scheme makes POS taggers more robust and more applicable for downstream tasks. |
Conclusion | WOE can run in two modes: a CRF extractor (WOEPOS) trained with shallow features like POS tags; a pattern classifier (WOEparse) learned from dependency path patterns. |
Related Work | Shallow or Deep Parsing: Shallow features, like POS tags , enable fast extraction over large-scale corpora (Davidov et al., 2007; Banko et al., 2007). |
Wikipedia-based Open IE | NLP Annotation: As we discuss fully in Section 4 (Experiments), we consider several variations of our system; one version, WOEparse, uses parser-based features, while another, WOEPOS , uses shallow features like POS tags , which may be more quickly computed. |
Wikipedia-based Open IE | Depending on which version is being trained, the preprocessor uses OpenNLP to supply POS tags and NP-chunk annotations, or uses the Stanford Parser to create a dependency parse. |
Wikipedia-based Open IE | We learn two kinds of extractors, one (WOEparse) using features from dependency-parse trees and the other (WOEPOS) limited to shallow features like POS tags . |
Conclusion | Our approach was superior to previous approaches across 12 multilingual cross-domain POS tagging datasets, with an average error reduction of 4% over a structured perceptron baseline. |
Experiments | POS tagging accuracy is known to be very sensitive to domain shifts. |
Experiments | (2011) report a POS tagging accuracy on social media data of 84% using a tagger that achieves an accuracy of about 97% on newspaper data. |
Experiments | While POS taggers can often recover the part of speech of a previously unseen word from the context it occurs in, this is harder than for previously seen words. |
Introduction | This paper considers the POS tagging problem, i.e. |
Introduction | Several authors have noted how POS tagging performance is sensitive to cross-domain shifts (Blitzer et al., 2006; Daume III, 2007; Jiang and Zhai, 2007), and while most authors have assumed known target distributions and pool unlabeled target data in order to automatically correct cross-domain bias (Jiang and Zhai, 2007; Foster et al., 2010), methods such as feature bagging (Sutton et al., 2006), learning with random adversaries (Globerson and Roweis, 2006) and LOO-regularization (Dekel and Shamir, 2008) have been proposed to improve performance on unknown target distributions. |
Introduction | Section 4 presents experiments on POS tagging and discusses how to evaluate cross-domain performance. |
Learning Algorithm | Features (1-5) are extracted for each role and capture their presence, first POS tag and word, length and position within the roles present for that instance.
Learning Algorithm | A1-postag is extracted for the following POS tags : DT, JJ, PRP, CD, RB, VB and WP; A1-keyword for the following words: any, anybody, anymore, anyone, anything, anytime, anywhere, certain, enough, full, many, much, other, some, specifics, too and until.
Learning Algorithm | These lists of POS tags and keywords were extracted after manual examination of training examples and aim at signaling whether this role corresponds to the focus.
Evaluation | Table 5 shows the result of the disambiguation when we only take into account the POS tag of the unknown tokens. |
Introduction | On the one hand, this tagset is much larger than the largest tagset used in English (from 17 tags in most unsupervised POS tagging experiments, to the 46 tags of the WSJ corpus and the roughly 150 tags of the LOB corpus).
Introduction | On average, each token in the 42M corpus is given 2.7 possible analyses by the analyzer (much higher than the average 1.41 POS tag ambiguity reported in English (Dermatas and Kokkinakis, 1995)). |
Previous Work | At the word level, a segmented word is attached to a POS; the character model is based on the observed characters and their classification: Begin of word (B), In the middle of a word (I), End of word (E), or a single-character word (S). They apply Baum-Welch training over a segmented corpus, where the segmentation of each word and its character classification is observed, and the POS tagging is ambiguous.
Previous Work | (of all words in a given sentence) and the POS tagging (of the known words) is based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters. |
Previous Work | They report a very slight improvement on Hebrew and Arabic supervised POS taggers . |
Experiments | predicate x in w (e.g., (Boston, Boston)), and (iii) predicates for each POS tag in {JJ, NN, NNS} (e.g., (JJ, size), (JJ, area), etc.
Experiments | We also define an augmented lexicon L+ which includes a prototype word x for each predicate appearing in (iii) above (e.g., (large, size)), which cancels the predicates triggered by x's POS tag .
Experiments | SEMRESP requires a lexicon of 1.42 words per non-value predicate, Word-Net features, and syntactic parse trees; DCS requires only words for the domain-independent predicates (overall, around 0.5 words per non-value predicate), POS tags , and very simple indicator features. |
Experiments of Grammar Formalism Conversion | (2008) used POS tag information, dependency structures and dependency tags in test set for conversion. |
Experiments of Grammar Formalism Conversion | Similarly, we used POS tag information in the test set to restrict search space of the parser for generation of better N-best parses. |
Experiments of Parsing | CDT consists of 60k Chinese sentences, annotated with POS tag information and dependency structure information (including 28 POS tags, and 24 dependency tags) (Liu et al., 2006).
Experiments of Parsing | We did not use POS tag information as inputs to the parser in our conversion method due to the difficulty of conversion from CDT POS tags to CTB POS tags . |
Experiments of Parsing | We used the POS tagged People's Daily corpus (Jan. 1998~Jun.
Our Two-Step Solution | ” (a preposition, with “BA” as its POS tag in CTB), and the head of IP-OBJ is 3% [El ” . |
Introduction | - Part-of-speech (POS) tags and morphological features: POS tags indicate (or counter-indicate) the possible presence of a named entity at word level or at word sequence level. |
Related Work | Benajiba and Rosso (2007) improved their system by incorporating POS tags to improve NE boundary detection. |
Related Work | Benajiba and Rosso (2008) used CRF sequence labeling and incorporated many language specific features, namely POS tagging , base-phrase chunking, Arabic tokenization, and adjectives indicating nationality. |
Related Work | Using POS tagging generally improved recall at the expense of precision, leading to overall improvements in F-measure. |
Introduction | For the chunker and POS tagger , the drop-offs are less severe: 94.89 to 91.73, and 97.36 to 94.73. |
Introduction | We use an open source CRF software package to implement our CRF models.1 We use words, POS tags , chunk labels, and the predicate label at the preceding and following nodes as features for our Baseline system. |
Introduction | - POS before, after predicate: the POS tag of the tokens immediately preceding and following the predicate
Experiments | Performance of POS tagging is an important factor in our methods because they are based on word/POS sequences.
Experiments | Existing POS taggers might not perform well on nonnative English texts because they are normally developed to analyze native English texts. |
Methods | In this language model, content words in n-grams are replaced with their corresponding POS tags . |
Methods | Finally, words are replaced with their corresponding POS tags; for the following words, word tokens are used as their corresponding POS tags : coordinating conjunctions, determiners, prepositions, modals, predeterminers, possessives, pronouns, question adverbs. |
Methods | At this point, the special POS tags BOS and EOS are added at the beginning and end of each sentence, respectively. |
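The preprocessing pipeline described above (replace content words with POS tags, keep closed-class tokens, add sentence markers) can be sketched as follows; the closed-class keep-list here is an illustrative assumption, not the paper's exact list:

```python
# Sketch of the POS-generalized language model preprocessing: content words
# become their POS tags, words in a closed-class keep-list keep their surface
# form, and BOS/EOS markers frame each sentence. KEEP_AS_TOKEN is illustrative.
KEEP_AS_TOKEN = {"CC", "DT", "IN", "MD", "PDT", "POS", "PRP", "PRP$", "WRB"}

def generalize(tagged_sentence):
    """tagged_sentence: list of (word, pos) pairs."""
    out = ["BOS"]
    for word, pos in tagged_sentence:
        # closed-class words keep their token; content words become their tag
        out.append(word if pos in KEEP_AS_TOKEN else pos)
    out.append("EOS")
    return out

print(generalize([("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")]))
# ['BOS', 'the', 'NN', 'VBZ', 'EOS']
```

An n-gram model would then be trained on these generalized sequences rather than on raw word sequences.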
Dependency parsing experiments | The only features that are not cached are the ones that include contextual POS tags , since their miss rate is relatively high. |
Dependency parsing for machine translation | - a predicted POS tag tj; - a dependency score sj.
Dependency parsing for machine translation | We write h-word, h-pos, m-word, m-pos to refer to head and modifier words and POS tags , and append a numerical value to shift the word offset either to the left or to the right (e.g., h-pos+1 is the POS to the right of the head word). |
Dependency parsing for machine translation | It is quite similar to the McDonald (2005a) feature set, except that it does not include the set of all POS tags that appear between each candidate head-modifier pair (i , j). |
Evaluation | The part-of-speech tags in all datasets were replaced with the universal POS tags of Petrov et al. |
Model Transfer | This may have a negative effect on the performance of a monolingual model, since most part-of-speech tagsets are more fine-grained than the universal POS tags considered here. |
Model Transfer | Since the finer-grained POS tags often reflect more language-specific phenomena, however, they would only be useful for very closely related languages in the cross-lingual setting. |
Model Transfer | If Synt is enabled too, it also uses the POS tags of the argument’s parent, children and siblings. |
Related Work | Cross-lingual annotation projection (Yarowsky et al., 2001) approaches have been applied extensively to a variety of tasks, including POS tagging (Xi and Hwa, 2005; Das and Petrov, 2011), morphology segmentation (Snyder and Barzilay, 2008), verb classification (Merlo et al., 2002), mention detection (Zitouni and Florian, 2008), LFG parsing (Wroblewska and Frank, 2009), information extraction (Kim et al., 2010), SRL (Pado and Lapata, 2009; van der Plas et al., 2011; Annesi and Basili, 2010; Tonelli and Pi-anta, 2008), dependency parsing (Naseem et al., 2012; Ganchev et al., 2009; Smith and Eisner, 2009; Hwa et al., 2005) or temporal relation pre- |
Training | 2. surrounding: lslex (the previous word), rslex (the next word), lspos (lslex’s POS tag), rspos (rslex’s POS tag ), lsparent (lslex’s parent), rsparent
Training | 3. nonlocal: lanchorslex (the previous anchor’s word), ranchorslex (the next anchor’s word), lanchorspos (lanchorslex’s POS tag), ranchorspos (ranchorslex’s POS tag ).
Training | of mosl_int_spos (mosl_int_slex’s POS tag ), mosl_ext_spos (mosl_ext_slex’s POS tag), mosr_int_slex (the actual word.
Abstract | Many would be better modeled by POS tag unigrams (with no word information) or by longer n-grams consisting of either words, POS tags , or a combination of the two. |
Abstract | Each n-gram is a sequence of words, POS tags or a combination of words and POS tags |
Abstract | or a POS tag . |
Experiments | We used the Tokyo tagger (Tsuruoka and Tsujii, 2005) to POS tag the English tokens, and generated parses using the first-order model of McDonald et al. |
Experiments | For Bulgarian we trained the Stanford POS tagger (Toutanova et al., 2003) on the Bul- |
Experiments | The Spanish Europarl data was POS tagged with the FreeLing language analyzer (Atserias et al., 2006). |
Experiments | Moreover, all POS tag features from English are duplicated with coarse-grained POS tags provided by CoNLL-X. |
Experiments | Before parsing, POS tags were assigned to the training set by using 20-way jackknifing. |
Experiments | For the automatic generation of POS tags , we used the domain-specific model of Choi and Palmer (2012a)’s tagger, which gave 97.5% accuracy on the English evaluation set (0.2% higher than Collins (2002)’s tagger). |
Related work | Bohnet and Nivre (2012) introduced a transition-based system that jointly performed POS tagging and dependency parsing. |
Experiments | Table 2: Domain Adaptation performance in F-measure on Semantic Tagging on Movie Target domain and POS tagging on QBank=QuestionBank.
Related Work and Motivation | In (Subramanya et al., 2010) an efficient iterative SSL method is described for syntactic tagging, using graph-based learning to smooth POS tag posteriors. |
Semi-Supervised Semantic Labeling | In (Subramanya et al., 2010), a new SSL method is described for adapting syntactic POS tagging of sentences in newswire articles along with search queries to a target domain of natural language (NL) questions. |
Semi-Supervised Semantic Labeling | The unlabeled POS tag posteriors are then smoothed using a graph-based learning algorithm. |
Semi-Supervised Semantic Labeling | Later, using Viterbi decoding, they select the 1-best POS tag sequence.
Decoding | where the first two terms are translation and language model probabilities, e(D) is the target string (English sentence) for derivation D, the third and fourth terms are the dependency language model probabilities on the target side computed with words and POS tags separately, De(D) is the target dependency tree of D, the fifth is the parsing probability of the source-side tree TC(D) ∈ FC, ill(D) is the penalty for the number of ill-formed dependency structures in D, and the last two terms are derivation and translation length penalties, respectively.
Decoding | In order to alleviate the problem of data sparseness, we also compute a dependency language model for POS tags over a dependency tree.
Decoding | the POS tag information on the target side for each constituency-to-dependency rule. |
Experiments | We also store the POS tag information for each word in dependency trees, and compute two different dependency language models for words and POS tags in dependency tree separately. |
CD | CCM learns to predict a set of brackets over a string (in practice, a string of POS tags ) by jointly estimating constituent and distituent strings and contexts using an iterative EM-like procedure (though, as noted by Smith and Eisner (2004), CCM is deficient as a generative model). |
Introduction | Recent work (Headden III et al., 2009; Cohen and Smith, 2009; Hanig, 2010; Spitkovsky et al., 2010) has largely built on the dependency model with valence of Klein and Manning (2004), and is characterized by its reliance on gold-standard part-of-speech (POS) annotations: the models are trained on and evaluated using sequences of POS tags rather than raw tokens.
Introduction | An exception which learns from raw text and makes no use of POS tags is the common cover links parser (CCL, Seginer 2007). |
Tasks and Benchmark | Importantly, until recently it was the only unsupervised raw text constituent parser to produce results competitive with systems which use gold POS tags (Klein and Manning, 2002; Klein and Manning, 2004; Bod, 2006), and the recent improved raw-text parsing results of Reichart and Rappoport (2010) make direct use of CCL without modification.
Tasks and Benchmark | Finally, CCL outperforms most published POS-based models when those models are trained on unsupervised word classes rather than gold POS tags . |
Bilingual subtree constraints | For the source part, we replace nouns and verbs using their POS tags (coarse grained tags). |
Bilingual subtree constraints | For example, we have the subtree pair: “(society):2-(fringe):0” and “fringes(W_2):0-of:1-society(W_1):2”, where “of” does not have a corresponding word, the POS tag of “(society)” is N, and the POS tag of “(fringe)” is N. The source part of the rule becomes “N:2-N:0” and the target part becomes “W_2:0-of:1-W_1:2”.
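The rule generalization in this example can be sketched as follows; the node encoding (word, coarse tag, head index) and the set of tags to generalize are illustrative assumptions:

```python
# Sketch of generalizing a source subtree into a rule: nouns and verbs are
# replaced by their coarse POS tags, other words keep their surface form,
# and each node carries its head index, as in "N:2-N:0" above.
GENERALIZE_TAGS = {"N", "V"}

def make_source_rule(nodes):
    """nodes: list of (word, coarse_pos, head_index) triples."""
    parts = []
    for word, pos, head in nodes:
        sym = pos if pos in GENERALIZE_TAGS else word
        parts.append(f"{sym}:{head}")
    return "-".join(parts)

print(make_source_rule([("society", "N", 2), ("fringe", "N", 0)]))
# N:2-N:0
```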
Experiments | For Chinese unannotated data, we used the XIN_CMN portion of Chinese Gigaword Version 2.0 (LDC2009T14) (Huang, 2009), which has approximately 311 million words whose segmentation and POS tags are given. |
Experiments | We used the MMA system (Kruengkrai et al., 2009) trained on the training data to perform word segmentation and POS tagging and used the Baseline Parser to parse all the sentences in the data. |
Experiments | The POS tags were assigned by the MXPOST tagger trained on training data. |
Learning Time Constraints | n-gram POS: The 4-gram and 3-gram of POS tags that end with the year
Previous Work | Kanhabua and Norvag (2008; 2009) extended this approach with the same model, but expanded its unigrams with POS tags , collocations, and tf-idf scores. |
Timestamp Classifiers | Word Classes: include only nouns, verbs, and adjectives as labeled by a POS tagger |
Timestamp Classifiers | on POS tags and tf-idf scores. |
Timestamp Classifiers | Typed Dependency POS: Similar to Typed Dependency, this feature uses POS tags of the dependency relation’s governor. |
Baselines | Following is a list of features adopted in the two baselines, for both BaselineC4.5 and BaselineSVM, > Basic features: first token and its part-of-speech (POS) tag of the focus candidate; the number of tokens in the focus candidate; relative position of the focus candidate among all the roles present in the sentence; negated verb and its POS tag of the negative expression;
Baselines | > Syntactic features: the sequence of words from the beginning of the governing VP to the negated verb; the sequence of POS tags from the beginning of the governing VP to the negated verb; whether the governing VP contains a CC; whether the governing VP contains a RB. |
Baselines | > Semantic features: the syntactic label of semantic role A1; whether A1 contains POS tag DT, JJ, PRP, CD, RB, VB, and WP, as defined in Blanco and Moldovan (2011); whether A1 contains token any, anybody, anymore, anyone, anything, anytime, anywhere, certain, enough, full, many, much, other, some, specifics, too, and until, as defined in Blanco and Moldovan (2011); the syntactic label of the first semantic role in the sentence; the semantic label of the last semantic role in the sentence; the thematic role for A0/A1/A2/A3/A4 of the negated predicate.
Experiments | For both English and Chinese data, we used tenfold jackknifing (Collins, 2000) to automatically assign POS tags to the training data. |
Experiments | For English POS tagging, we adopted SVMTool, and for Chinese POS tagging
Experiments | we employed the Stanford POS tagger . |
Semi-supervised Parsing with Large Data | Word clusters are regarded as lexical intermediaries for dependency parsing (Koo et al., 2008) and POS tagging (Sun and Uszkoreit, 2012). |
Paraphrasing | Deletions: deleted lemma and POS tag
Paraphrasing | x_{i:j} and c_{i':j'} denote spans from x and c. pos(x_{i:j}) and lemma(x_{i:j}) denote the POS tag and lemma sequence of x_{i:j}.
Paraphrasing | For a pair (x, c), we also consider as candidate associations the set B (represented implicitly), which contains token pairs (x_i, c_{i'}) such that x_i and c_{i'} share the same lemma, the same POS tag , or are linked through a derivation link on WordNet (Fellbaum, 1998).
Use of external MWE resources | 6We use the version available in the POS tagger MElt (Denis and Sagot, 2009). |
Use of external MWE resources | The MWE analyzer is a CRF-based sequential labeler, which, given a tokenized text, jointly performs MWE segmentation and POS tagging (of simple tokens and of MWEs), both tasks mutually helping each other.
Use of external MWE resources | The MWE analyzer integrates, among others, features computed from the external lexicons described in section 5.1, which greatly improve POS tagging (Denis and Sagot, 2009) and MWE segmentation (Constant and Tel-lier, 2012). |
Experiments and Analysis | We build a CRF-based bigram part-of-speech (POS) tagger with the features described in (Li et al., 2012), and produce POS tags for all train/development/test/unlabeled sets (10-way jackknifing for training sets).
Experiments and Analysis | (2012) and Bohnet and Nivre (2012) use joint models for POS tagging and dependency parsing, significantly outperforming their pipeline counterparts. |
Experiments and Analysis | Our approach can be combined with their work to utilize unlabeled data to improve both POS tagging and parsing simultaneously. |
Supervised Dependency Parsing | t_i denotes the POS tag of w_i. b is an index between h and m. dir(i, j) and dist(i, j) denote the direction and distance of the dependency (i, j).
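In this notation, a typical first-order feature set over a head-modifier arc can be sketched as follows; the template names and the distance bucketing are illustrative assumptions, not the paper's exact templates:

```python
# Hedged sketch of first-order dependency arc features: t_i is the POS tag
# of w_i, and each template is conjoined with the arc direction and a
# bucketed head-modifier distance.
def arc_features(words, tags, h, m):
    d = "R" if m > h else "L"
    dist = min(abs(h - m), 5)            # bucket long distances together
    ctx = f"{d}:{dist}"
    return [
        f"hw={words[h]}:{ctx}",              # head word
        f"ht={tags[h]}:{ctx}",               # head POS
        f"mw={words[m]}:{ctx}",              # modifier word
        f"ht-mt={tags[h]}-{tags[m]}:{ctx}",  # head/modifier tag bigram
    ]

feats = arc_features(["John", "saw", "Mary"], ["NNP", "VBD", "NNP"], 1, 2)
print(feats[0])  # hw=saw:R:1
```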
Argument Identification | - bag of words in a
Argument Identification | - bag of POS tags in a
Argument Identification | - the set of dependency labels of the predicate’s children
Argument Identification | - dependency path conjoined with the POS tag of a’s head
Experiments | Before parsing the data, it is tagged with a POS tagger trained with a conditional random field (Lafferty et al., 2001) with the following emission features: word, the word cluster, word suffixes of length l, 2 and 3, capitalization, whether it has a hyphen, digit and punctuation. |
Frame Identification with Embeddings | Let the lexical unit (the lemma conjoined with a coarse POS tag ) for the marked predicate be 6. |
Experiments | Examining translation rules extracted from the training data shows that there are 72,366 types of non-terminals with respect to 33 types of POS tags . |
Head-Driven HPB Translation Model | Instead of collapsing all non-terminals in the source language into a single symbol X as in Chiang (2007), given a word sequence f_i^j from position i to position j, we first find heads and then concatenate the POS tags of these heads as f_i^j's nonterminal symbol.
Head-Driven HPB Translation Model | We look for initial phrase pairs that contain other phrases and then replace sub-phrases with POS tags corresponding to their heads. |
Introduction | Here, each Chinese word is attached with its POS tag and Pinyin. |
Background | The C&C supertagger is similar to the Ratnaparkhi (1996) tagger, using features based on words and POS tags in a five-word window surrounding the target word, and defining a local probability distribution over supertags for each word in the sentence, given the previous two supertags. |
Data | For supertagger evaluation, one thousand sentences were manually annotated with CCG lexical categories and POS tags . |
Introduction | Since the CCG lexical category set used by the supertagger is much larger than the Penn Treebank POS tag set, the accuracy of supertagging is much lower than POS tagging ; hence the CCG supertagger assigns multiple supertags1 to a word, when the local context does not provide enough information to decide on the correct supertag. |
Introduction | (2003) were unable to improve the accuracy of POS tagging using self-training. |
Experiments | For English, we use the automatically-assigned POS tags produced by an implementation of the POS tagger of Collins (2002). |
Experiments | While for Chinese, we just use the gold-standard POS tags following the tradition. |
Experiments | Both English and Chinese sentences are tagged by the implementations of the POS tagger of Collins (2002), which trained on WSJ and CTB 5.0 respectively.
Word-Pair Classification Model | Each feature is composed of some words and POS tags surrounding word i and/or word j, as well as an optional distance representation between these two words.
Introduction | The most ambiguous word has 7 different POS tags associated with it. |
Minimized models for supertagging | We also wish to scale our methods to larger data settings than the 24k word tokens in the test data used in the POS tagging task. |
Minimized models for supertagging | On the simpler task of unsupervised POS tagging with a dictionary, we compared our method versus directly solving the original integer program and found that the minimization (in terms of grammar size) achieved by our method is close to the optimal solution for the original objective and yields the same tagging accuracy far more efficiently.
Minimized models for supertagging | Ravi and Knight (2009) exploited this to iteratively improve their POS tag model: since the first minimization procedure is seeded with a noisy grammar and tag dictionary, iterating the IP procedure with progressively better grammars further improves the model. |
Testing SRL Performance | When trained on arguments identified via the unsupervised POS tagger , noun pattern features promoted agent interpretations of tran- |
Unsupervised Parsing | To implement this division into function and content words, we start with a list of function word POS tags and then find words that appear predominantly with these POS tags , using tagged WSJ data (Marcus et al., 1993).
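The split described above can be sketched as follows; the function-word tag list and the majority (0.5) threshold are illustrative assumptions:

```python
# Sketch of dividing the vocabulary into function and content words: a word
# counts as a function word if it appears predominantly with function-word
# POS tags in a tagged corpus.
from collections import Counter, defaultdict

FUNCTION_TAGS = {"DT", "IN", "CC", "TO", "MD", "PRP", "RP"}

def find_function_words(tagged_corpus):
    counts = defaultdict(Counter)
    for sent in tagged_corpus:
        for word, tag in sent:
            counts[word][tag] += 1
    func = set()
    for word, tag_counts in counts.items():
        n_func = sum(c for t, c in tag_counts.items() if t in FUNCTION_TAGS)
        if n_func / sum(tag_counts.values()) > 0.5:
            func.add(word)
    return func

corpus = [[("the", "DT"), ("run", "NN")], [("the", "DT"), ("run", "VB")]]
print(find_function_words(corpus))  # {'the'}
```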
Unsupervised Parsing | Smaller numbers are better, indicating less information lost in moving from the HMM states to the gold POS tags . |
Unsupervised Parsing | We first evaluate these parsers (the first stage of our SRL system) on unsupervised POS tagging . |
A Motivating Example | POS tags Excellent/JJ and/CC broad/JJ |
Sentiment Sensitive Thesaurus | We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs). |
Sentiment Sensitive Thesaurus | In addition to word-level sentiment features, we replace words with their POS tags to create |
Sentiment Sensitive Thesaurus | POS tags generalize the word-level sentiment features, thereby reducing feature sparseness. |
Composite language model | The SLM is based on statistical parsing techniques that allow syntactic analysis of sentences; it assigns a probability p(W, T) to every sentence W and every possible binary parse T. The terminals of T are the words of W with POS tags , and the nodes of T are annotated with phrase headwords and nonterminal labels.
Composite language model | A word-parse k-prefix has a set of exposed heads h_{-m}, ..., h_{-1}, with each head being a pair (headword, nonterminal label), or in the case of a root-only tree (word, POS tag ).
Composite language model | An m-th order SLM (m-SLM) has three operators to generate a sentence: WORD-PREDICTOR predicts the next word w_{k+1} based on the m leftmost exposed headwords h_{-m}, ..., h_{-1} in the word-parse k-prefix with probability p(w_{k+1} | h_{-m}, ..., h_{-1}), and then passes control to the TAGGER; the TAGGER predicts the POS tag t_{k+1} of the next word w_{k+1} based on the next word w_{k+1} and the POS tags of the m leftmost exposed headwords in the word-parse k-prefix with probability p(t_{k+1} | w_{k+1}, h_{-m}.tag, ..., h_{-1}.tag); the CONSTRUCTOR builds the partial parse T_k from T_{k-1}, w_k, and t_k in a series of moves ending with NULL, where a parse move a is made with probability p(a | h_{-m}, ..., h_{-1}); a ∈ A = {(unary, NTlabel), (adjoin-left, NTlabel), (adjoin-right, NTlabel), null}.
Training algorithm | The TAGGER and CONSTRUCTOR are conditional probabilistic models of the type p(u | z_1, ..., z_n) where u, z_1, ..., z_n belong to a mixed set of words, POS tags, NTtags, CONSTRUCTOR actions (u only), and z_1, ..., z_n form a linear Markov chain.
Abstract | We describe a novel method for the task of unsupervised POS tagging with a dictionary, one that uses integer programming to explicitly search for the smallest model that explains the data, and then uses EM to set parameter values. |
Introduction | The classic Expectation Maximization (EM) algorithm has been shown to perform poorly on POS tagging , when compared to other techniques, such as Bayesian methods. |
Introduction | (2008) depart from the Bayesian framework and show how EM can be used to learn good POS taggers for Hebrew and English, when provided with good initial conditions. |
What goes wrong with EM? | The overall POS tag distribution learnt by EM is relatively uniform, as noted by Johnson (2007), and it tends to assign an equal number of tokens to each
Discussion and Future Work | We can then award partial scores for related words, such as those identified as such by WordNet or those with the same POS tags . |
Experiments | However, its use of POS tags and synonym dictionaries prevents its use at the character-level. |
Experiments | We use the Stanford Chinese word segmenter (Tseng et al., 2005) and POS tagger (Toutanova et al., 2003) for preprocessing and Cilin for synonym |
Introduction | However, many different segmentation standards exist for different purposes, such as Microsoft Research Asia (MSRA) for Named Entity Recognition (NER), Chinese Treebank (CTB) for parsing and part-of-speech (POS) tagging, and City University of Hong Kong (CITYU) and Academia Sinica (AS) for general word segmentation and POS tagging .
Co-training strategy for prosodic event detection | As described in Section 4, we use two classifiers for the prosodic event detection task based on two different information sources: one is the acoustic evidence extracted from the speech signal of an utterance; the other is the lexical and syntactic evidence such as syllables, words, POS tags and phrasal boundary information. |
Previous work | (2007) applied a co-training method in POS tagging using an agreement-based selection strategy.
Prosodic event detection method | 0 Accent detection: syllable identity, lexical stress (exist or not), word boundary information (boundary or not), and POS tag . |
Prosodic event detection method | 0 IPB and Break index detection: POS tag , the ratio of syntactic phrases the word initiates, and the ratio of syntactic phrases the word terminates. |
Experiments | Since the dictionary is not explicitly annotated with PoS tags, we first took the intersection of the training corpus and the dictionary words, and assigned all the possible PoS tags to the words which appeared in the corpus.
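The dictionary construction described in this footnote can be sketched as follows; all names and the toy corpus are illustrative:

```python
# Sketch of building a tagged dictionary from an untagged one: keep only
# dictionary words seen in the training corpus and attach every PoS tag
# they were observed with there.
from collections import defaultdict

def build_tagged_dictionary(dictionary_words, tagged_corpus):
    observed = defaultdict(set)
    for sent in tagged_corpus:
        for word, tag in sent:
            observed[word].add(tag)
    # intersection of dictionary and corpus, with observed tags attached
    return {w: sorted(observed[w]) for w in dictionary_words if w in observed}

corpus = [[("bank", "NN"), ("banks", "VBZ")], [("bank", "VB")]]
print(build_tagged_dictionary({"bank", "river"}, corpus))
# {'bank': ['NN', 'VB']}
```

Words in the dictionary but absent from the corpus (here "river") receive no tags and are dropped, matching the intersection step in the footnote.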
Experiments | Proper noun performance for the Stanford segmenter is not shown since it does not assign PoS tags . |
Word Segmentation Model | Here, w_i and w_{i-1} denote the current and previous word in question, and t_i and t_{i-1} are level-j PoS tags assigned to them.
Word Segmentation Model | The Japanese dictionary and the corpus we used have 6 levels of PoS tag hierarchy, while the Chinese ones have only one level, which is why some of the PoS features are not included in Chinese.
Introduction | 3.1 and the POS tags come from a PCFG. |
Introduction | The standard RNN essentially ignores all POS tags and syntactic categories and each nonterminal node is associated with the same neural network (i.e., the weights across nodes are fully tied). |
Introduction | While this results in a powerful composition function that essentially depends on the words being combined, the number of model parameters explodes and the composition functions do not capture the syntactic commonalities between similar POS tags or syntactic categories. |
Experimental Evaluation | model is approximate, because we used different preprocessing tools: MXPOST for POS tagging (Ratnaparkhi, 1996), MSTParser for parsing (McDonald et al., 2005), and Dan Bikel’s interface (http://www.
QG for Paraphrase Modeling | For unobserved cases, the conditional probability is estimated by backing off to the parent POS tag and child direction. |
QG for Paraphrase Modeling | We estimate the distributions over dependency labels, POS tags , and named entity classes using the transformed treebank (footnote 4). |
QG for Paraphrase Modeling | The parameters θ to be learned include the class priors, the conditional distributions of the dependency labels given the various configurations, the POS tags given POS tags , the NE tags given NE
Introduction | In one of the first efforts to enrich the source in word-based SMT, Ueffing and Ney (2003) used part-of-speech (POS) tags, in order to deal with the verb conjugation of Spanish and Catalan; so, POS tags were used to identify the pronoun+verb sequence and splice these two words into one term. |
Introduction | In their presentation of the factored SMT models, Koehn and Hoang (2007) describe experiments for translating from English to German, Spanish and Czech, using morphology tags added on the morphologically rich side, along with POS tags . |
Methods for enriching input | The POS tag of this noun is then used to identify if it is plural or singular. |
Methods for enriching input | The word “aspects” is found, which has a POS tag that shows it is a plural noun. |
Clustering-based word representations | Ushioda (1996) presents an extension to the Brown clustering algorithm, and learns hierarchical clusterings of words as well as phrases, which they apply to POS tagging .
Clustering-based word representations | Li and McCallum (2005) use an HMM-LDA model to improve POS tagging and Chinese Word Segmentation. |
Clustering-based word representations | (2009) use an HMM to assign POS tags to words, which in turn improves the accuracy of the PCFG-based Hebrew parser.
Conditional Random Fields for Sequence Labeling | Many NLP tasks, such as POS tagging , chunking, or NER, are sequence labeling problems where a sequence of class labels y = (y_1,.
Conditional Random Fields for Sequence Labeling | Input units x_j are usually tokens, class labels y_j can be POS tags or entity classes.
Introduction | When used for sequence labeling tasks such as POS tagging , chunking, or named entity recogni- |
Experiments | For example, one template returns the top category on the stack plus its head word, together with the first word and its POS tag on the queue. |
Experiments | Another template returns the second category on the stack, together with the POS tag of its head word. |
Experiments | We use 10-fold cross validation for POS tagging and supertagging the training data, and automatically assigned POS tags for all experiments. |
A Latent Variable CCG Parser | In the supertagging literature, POS tagging and supertagging are distinguished — POS tags are the traditional Penn treebank tags (e. g. NN, VBZ and DT) and supertags are CCG categories. |
A Latent Variable CCG Parser | However, because the Petrov parser trained on CCGbank has no notion of Penn treebank POS tags , we can only evaluate the accuracy of the supertags. |
A Latent Variable CCG Parser | Despite the lack of POS tags in the Petrov parser, we can see that it performs slightly better than the Clark and Curran parser. |
Setting of the experiment | A decision tree (C4.5, Release 8) is used to detect false starts, trained on the POS tags and trigger-word status of the first and last four words of sentences from a training set. |
Setting of the experiment | For (both WH- and Yes/No) question identification, another C4.5 classifier was trained on 2,000 manually annotated sentences using utterance length, POS bigram occurrences, and the POS tags and trigger-word status of the first and last five words of an utterance.
Setting of the experiment | Taking ASR transcripts as input, we use the Brill tagger (Brill, 1995) to assign POS tags to each word. |
Conditional Random Fields | 3-grm: 10.74% / 14.3M vs. 14.59% / 0.3M; 5-grm: 8.48% / 132.5M vs. 11.54% / 2.5M. POS tagging |
Conditional Random Fields | For the POS tagging task, BCD appears impractically slow to train compared with the other approaches (SGD takes about 40 minutes to train, OWL-QN about 1 hour), due to the simultaneous increase in the sequence length and in the number of observations. |
Conditional Random Fields | Based on this observation, we have designed an incremental training strategy for the POS tagging task, where more specific features are progressively incorporated into the model if the corresponding less specific feature is active. |
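The incremental strategy just described adds a more specific feature only when its less specific counterpart is already active in the model. A minimal sketch of that gating step, with hypothetical feature names and a weight-based notion of "active":

```python
# Sketch of incremental feature incorporation: a specific feature template
# is admitted only if its less specific "backoff" feature is active
# (here: has a nonzero learned weight). All names are illustrative.

def expand_features(weights, specific_to_backoff):
    """Return the specific features whose backoff feature is active."""
    return [f for f, backoff in specific_to_backoff.items()
            if weights.get(backoff, 0.0) != 0.0]

weights = {"suffix=ing": 0.8, "suffix=ly": 0.0}
mapping = {"word=running": "suffix=ing", "word=quickly": "suffix=ly"}
print(expand_features(weights, mapping))  # ['word=running']
```

This keeps the active feature set small early in training and lets it grow only along directions the model has already found useful.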
Experiments | POS tag at beginning and end of the EDU |
Implementation | The dependency structure and POS tags are obtained from MaltParser (Nivre et al., 2007). |
Model | While such feature learning approaches have proven to increase robustness for parsing, POS tagging, and NER (Miller et al., 2004; Koo et al., 2008; Turian et al., 2010), they would seem to have an especially promising role for discourse, where training data is relatively sparse and ambiguity is considerable. |
Our Approach | As shown in Table 2, we classify the features used in WikiCiKE into three categories: format features, POS tag features and token features. |
Our Approach | POS tag features: POS tag of current token; POS tags of previous 5 tokens; POS tags of |
Abstract | On CCGbank we achieve a labelled dependency F-measure of 88.8% on gold POS tags, and 86.7% on automatic part-of-speech tags, the best reported results for this task. |
Conclusion and Future Work | In future work we plan to integrate the POS tagger, which is crucial to parsing accuracy (Clark and Curran, 2004b). |
Experiments | To the best of our knowledge, the results obtained with BP and DD are the best reported results on this task using gold POS tags. |
A Class-based Model of Agreement | The coarse categories are the universal POS tag set described by Petrov et al. |
A Class-based Model of Agreement | For Arabic, we used the coarse POS tags plus definiteness and the so-called phi features (gender, number, and person). For example, the word for 'the car' would be tagged "Noun+Def+Sg+Fem". |
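Tags like "Noun+Def+Sg+Fem" compose a coarse POS category with morphological phi features. A sketch of that composition; the feature names, ordering, and function name are illustrative assumptions:

```python
# Sketch: building a class label from a coarse POS tag plus morphological
# "phi" features, in the "Noun+Def+Sg+Fem" style quoted above.

def class_tag(coarse, definite=False, number=None, gender=None):
    parts = [coarse]
    if definite:
        parts.append("Def")
    if number:                 # e.g. "Sg" or "Pl"
        parts.append(number)
    if gender:                 # e.g. "Fem" or "Masc"
        parts.append(gender)
    return "+".join(parts)

print(class_tag("Noun", definite=True, number="Sg", gender="Fem"))
# Noun+Def+Sg+Fem
```

Words lacking a feature simply omit that component, so the label set stays compact while still encoding agreement-relevant morphology.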
Discussion of Translation Results | For comparison, +POS indicates our class-based model trained on the 11 coarse POS tags only (e.g., “Noun”). |
Experiments | We apply 1-best and k-best sequential decoding algorithms to five NLP tagging tasks: Penn TreeBank (PTB) POS tagging, CoNLL 2000 joint POS tagging and chunking, CoNLL 2003 joint POS tagging, chunking and named entity tagging, HPSG supertagging (Matsuzaki et al., 2007), and a search query named entity recognition (NER) dataset. |
Experiments | As in (Kaji et al., 2010), we combine the POS tags and chunk tags to form joint tags for CoNLL 2000 dataset, e.g., NN|B-NP. |
Experiments | Similarly we combine the POS tags, chunk tags, and named entity tags to form joint tags for CoNLL 2003 dataset, e.g., PRP$|I-NP|O. |
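Forming joint tags of the kind shown above (e.g., NN|B-NP or PRP$|I-NP|O) is just a position-wise concatenation of parallel tag sequences:

```python
# Sketch: forming joint tags by concatenating per-task tag sequences
# with "|", as in "NN|B-NP" (POS|chunk) or "PRP$|I-NP|O" (POS|chunk|NER).

def join_tags(*tag_sequences):
    """Zip parallel tag sequences into one joint-tag sequence."""
    return ["|".join(tags) for tags in zip(*tag_sequences)]

pos = ["PRP$", "NN"]
chunk = ["I-NP", "I-NP"]
ner = ["O", "O"]
print(join_tags(pos, chunk, ner))  # ['PRP$|I-NP|O', 'NN|I-NP|O']
```

A single sequence labeler over the joint tag set then predicts all tasks at once, at the cost of a larger label space.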
Abstract | When tagging with CTAGS, one can use any statistical POS tagging method such as HMMs, Maximum Entropy Classifiers, Bayesian Networks, CRFs, etc., followed by the CTAG to MSD recovery. |
Abstract | [Figure: training a POS tagger with manual+automatic rules for MSD recovery; tagging pipeline: input data → labeling with CTAGs → MSD recovery → output data.] |
Abstract | Also, our POS tagger detected cases where the annotation in the Gold Standard was erroneous. |
Error Classification | For this reason, we include POS tag 1-, 2-, 3-, and 4-grams in the set of features we sorted in the previous paragraph. |
Error Classification | For each error ei, we select POS tag n-grams from the top thousand features of the information-gain-sorted list to count toward the Ap+i and Api aggregation features. |
Error Classification | This feature type may also help with Confusing Phrasing because the list of POS tag n-grams our annotator generated for its Ap+i contains useful features like DT NNS VBZ VBN (e.g., “these signals has been”), which captures noun-verb disagreement. |
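Extracting the POS-tag n-grams these snippets rank by information gain is straightforward; a sketch over n = 1..4, using the "DT NNS VBZ VBN" pattern cited above:

```python
# Sketch: extracting POS-tag n-grams (n = 1..4) from a tagged sentence,
# the kind of feature sorted by information gain in the passage above.

def pos_ngrams(tags, max_n=4):
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tags) - n + 1):
            grams.append(" ".join(tags[i:i + n]))
    return grams

tags = ["DT", "NNS", "VBZ", "VBN"]
print("DT NNS VBZ VBN" in pos_ngrams(tags))  # True: the disagreement pattern
```

Ranking these n-grams by information gain against the error labels then picks out patterns like the noun-verb disagreement above.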
Generating reference reordering from parallel sentences | the Model 1 probabilities between pairs of words linked in the alignment a, features that inspect source and target POS tags and parses (if available) and features that inspect the alignments of adjacent words in the source and target sentence. |
Generating reference reordering from parallel sentences | We conjoin the msd (minimum signed distance) with the POS tags to allow the model to capture the fact that the alignment error rate may be higher for some POS tags than others (e.g., we have observed verbs have a higher error rate in Urdu-English alignments). |
Reordering model | where θ is a learned vector of weights and Φ is a vector of binary feature functions that inspect the words and POS tags of the source sentence at and around positions m and n. We use the features (Φ) described in Visweswariah et al. |
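The linear score described above (a learned weight vector dotted with binary feature functions over the words and POS tags at and around positions m and n) can be sketched as follows; the specific feature templates are illustrative assumptions, not the paper's:

```python
# Sketch of a linear reordering score: a dot product of learned weights
# with binary features over words/POS tags at positions m and n.
# The feature templates below are illustrative assumptions.

def features(words, tags, m, n):
    return {
        f"pos_pair={tags[m]}_{tags[n]}",
        f"word_m={words[m]}",
        f"word_n={words[n]}",
        f"dir={'fwd' if n > m else 'bwd'}",
    }

def score(theta, words, tags, m, n):
    # With binary features, the dot product is the sum of active weights.
    return sum(theta.get(f, 0.0) for f in features(words, tags, m, n))

theta = {"pos_pair=VB_NN": 1.5, "dir=fwd": 0.5}
print(score(theta, ["eat", "rice"], ["VB", "NN"], 0, 1))  # 2.0
```

Representing the feature vector as a set of active indicator names keeps the dot product sparse: only fired features contribute weight.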
Experiments | We trained CRFs for opinion entity identification using the following features: indicators for words, POS tags, and lexicon features (the subjectivity strength of the word in the Subjectivity Lexicon). |
Model | Words and POS tags: the words contained in the candidate and their POS tags. |
Model | For features, we use words, POS tags, phrase types, lexicon and semantic frames (see Section 3.2.1 for details) to capture the properties of the opinion expression, and also features that capture the context of the opinion expression: |
Chinese Empty Category Prediction | leftmost child label or POS tag; rightmost child label or POS tag; label or POS tag of the head child; the number of child nodes |
Chinese Empty Category Prediction | left-sibling label or POS tag |
Chinese Empty Category Prediction | right-sibling label or POS tag |