Index of papers in Proc. ACL 2011 that mention
  • POS tags
Bendersky, Michael and Croft, W. Bruce and Smith, David A.
Experiments
This sample is manually labeled with three annotations: capitalization, POS tags, and segmentation, according to the description of these annotations in Figure 1.
Experiments
Table 1: Summary of query annotation performance for capitalization (CAP), POS tagging (TAG) and segmentation.
Experiments
In the case of POS tagging, the decisions are ternary, and hence we report the classification accuracy.
Independent Query Annotations
On the other hand, given a sentence from a corpus that is relevant to the query, such as “Hawaiian Falls is a family-friendly water-park”, the word “falls” is correctly identified by a standard POS tagger as a proper noun.
Independent Query Annotations
(2010), an estimate of p(C_i | r) is a smoothed estimator that combines the information from the retrieved sentence r with the information about unigrams (for capitalization and POS tagging) and bigrams (for segmentation) from a large n-gram corpus (Brants and Franz, 2006).
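The combination described in this excerpt can be sketched as a simple linear interpolation between sentence-level and corpus-level evidence. The function name and the mixing weight `lam` below are illustrative assumptions, not the authors' implementation.

```python
def smoothed_annotation_prob(count_in_sentence, sentence_len,
                             count_in_corpus, corpus_total, lam=0.8):
    """Linearly interpolate evidence from a retrieved sentence with
    background statistics from a large n-gram corpus (sketch only)."""
    p_sentence = count_in_sentence / sentence_len if sentence_len else 0.0
    p_corpus = count_in_corpus / corpus_total
    return lam * p_sentence + (1.0 - lam) * p_corpus

# e.g., a capitalized form seen in 1 of 10 sentence tokens, and in
# 700k of 1M corpus occurrences:
p = smoothed_annotation_prob(1, 10, 700_000, 1_000_000)  # 0.22
```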
Joint Query Annotation
Many query annotations that are useful for IR can be represented using this simple form, including capitalization, POS tagging, phrase chunking, named entity recognition, and stopword indicators, to name just a few.
Joint Query Annotation
For instance, imagine that we need to perform two annotations: capitalization and POS tagging.
Query Annotation Example
In this scheme, each query is marked up using three annotations: capitalization, POS tags, and segmentation indicators.
Related Work
Most of the previous work on query annotation focuses on performing a particular annotation task (e.g., segmentation or POS tagging) in isolation.
POS tags is mentioned in 16 sentences in this paper.
Chan, Yee Seng and Roth, Dan
Mention Extraction System
These are a combination of the word itself, its POS tag, and its integer offset from the last word (lw) in the mention.
Mention Extraction System
These features are meant to capture the word and POS tag sequences in mentions.
Mention Extraction System
Contextual: We extract the word C-1,-1 immediately before m_i, the word C+1,+1 immediately after m_i, and their associated POS tags P.
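A minimal sketch of such contextual features; the dictionary keys and function name are assumptions for illustration, not the authors' feature names.

```python
def contextual_features(tokens, pos, start, end):
    """Return the word immediately before and after the mention span
    tokens[start:end+1], together with their POS tags."""
    feats = {}
    if start > 0:
        feats["w-1"], feats["p-1"] = tokens[start - 1], pos[start - 1]
    if end + 1 < len(tokens):
        feats["w+1"], feats["p+1"] = tokens[end + 1], pos[end + 1]
    return feats

tokens = ["the", "United", "States", "president"]
tags   = ["DT", "NNP", "NNP", "NN"]
feats = contextual_features(tokens, tags, 1, 2)  # mention: "United States"
```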
Relation Extraction System
POS features: If there is a single word between the two mentions, we extract its POS tag.
Relation Extraction System
Given the hw of m, P_m refers to the sequence of POS tags in the immediate context of hw (we exclude the POS tag of hw).
Relation Extraction System
The offsets i and j denote the positions (relative to hw) of the first and last POS tag, respectively.
Syntactico-Semantic Structures
• If u* is not empty, we require that it satisfies any of the following POS tag sequences: JJ+ \/ JJ and JJ?
Syntactico-Semantic Structures
These are (optional) POS tag sequences that normally start a valid noun phrase.
Syntactico-Semantic Structures
• We use two patterns to differentiate between premodifier relations and possessive relations, by checking for the existence of POS tags PRP$, WP$, POS, and the word “’s”.
POS tags is mentioned in 12 sentences in this paper.
Das, Dipanjan and Petrov, Slav
Approach Overview
The focus of this work is on building POS taggers for foreign languages, assuming that we have an English POS tagger and some parallel text between the two languages.
Approach Overview
The POS distributions over the foreign trigram types are used as features to learn a better unsupervised POS tagger (§5).
Experiments and Results
9We extracted only the words and their POS tags from the treebanks.
Experiments and Results
(2011) provide a mapping A from the fine-grained language-specific POS tags in the foreign treebank to the universal POS tags.
Graph Construction
Graph construction for structured prediction problems such as POS tagging is nontrivial: on the one hand, using individual words as the vertices throws away the context
Graph Construction
They considered a semi-supervised POS tagging scenario and showed that one can use a graph over trigram types, and edge weights based on distributional similarity, to improve a supervised conditional random field tagger.
Introduction
Unfortunately, the best completely unsupervised English POS tagger (that does not make use of a tagging dictionary) reaches only 76.1% accuracy (Christodoulopoulos et al., 2010), making its practical usability questionable at best.
Introduction
Our final average POS tagging accuracy of 83.4% compares very favorably to the average accuracy of Berg-Kirkpatrick et al.’s monolingual unsupervised state-of-the-art model (73.0%), and considerably bridges the gap to fully supervised POS tagging performance (96.6%).
POS Induction
After running label propagation (LP), we compute tag probabilities for foreign word types x by marginalizing the POS tag distributions of foreign trigrams u_i = x- x x+ over the left and right contexts.
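The marginalization step described in this excerpt can be sketched as follows: sum each trigram type's tag distribution into its center word and renormalize. This is a sketch of the idea, not the authors' code; data and names are illustrative.

```python
from collections import defaultdict

def word_tag_distribution(trigram_tag_dists):
    """Marginalize POS tag distributions over trigram types
    (x-, x, x+) across left/right contexts to obtain a per-word-type
    tag distribution (illustrative sketch)."""
    mass = defaultdict(lambda: defaultdict(float))
    for (_left, word, _right), dist in trigram_tag_dists.items():
        for tag, p in dist.items():
            mass[word][tag] += p
    # normalize per word type
    return {w: {t: m / sum(tm.values()) for t, m in tm.items()}
            for w, tm in mass.items()}

dists = {("la", "casa", "blanca"): {"NOUN": 0.9, "VERB": 0.1},
         ("una", "casa", "grande"): {"NOUN": 0.7, "VERB": 0.3}}
wd = word_tag_distribution(dists)  # wd["casa"]["NOUN"] == 0.8
```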
POS Induction
This vector is constructed for every word in the foreign vocabulary and will be used to provide features for the unsupervised foreign language POS tagger.
POS Induction
For English POS tagging, Berg-Kirkpatrick et al.
POS tags is mentioned in 18 sentences in this paper.
Nagata, Ryo and Whittaker, Edward and Sheinman, Vera
Introduction
Such a comparison brings up another crucial question: “Do existing POS taggers and chunkers…”
Introduction
Nevertheless, a great number of researchers have used existing POS taggers and chunkers to analyze the writing of learners of English.
Introduction
For instance, error detection methods normally use a POS tagger and/or a chunker in the error detection process.
Method
Considering this, we determined a basic rule as follows: “Use the Penn Treebank tag set and preserve the original texts as much as possible.” To handle such errors, we made several modifications and added two new POS tags (CE and UK) and another two for chunking (XP and PH), which are described below.
Method
Note that each POS tag is hyphenated.
UK and XP stand for unknown and X phrase, respectively.
5.1 POS Tagging
HMM-based and CRF-based POS taggers were tested on the shallow-parsed corpus.
Both use the Penn Treebank POS tag set.
POS tags is mentioned in 18 sentences in this paper.
Zollmann, Andreas and Vogel, Stephan
Clustering phrase pairs directly using the K-means algorithm
Using a scheme based on source and target phrases, accounting for phrase size, with 36 word classes (the size of the Penn English POS tag set) for both languages, yields a grammar with (36 + 2 × 36²)² = 6.9M nonterminal labels.
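The label-count arithmetic in this excerpt can be verified directly: with 36 word classes, labels per language combine single classes and ordered pairs of classes, and the two languages' label sets are crossed. (The interpretation of the terms follows the formula as quoted; only the arithmetic is checked here.)

```python
# (36 + 2 * 36**2)**2 = 6.9M nonterminal labels
N = 36
labels_per_language = N + 2 * N ** 2      # 36 + 2592 = 2628
nonterminals = labels_per_language ** 2   # 6,906,384, i.e. ~6.9M
```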
Conclusion and discussion
Crucially, our methods only rely on “shallow” lexical tags, either generated by POS taggers or by automatic clustering of words into classes.
Conclusion and discussion
Using automatically obtained word clusters instead of POS tags yields essentially the same results, thus making our methods applicable to all languages pairs with parallel corpora, whether syntactic resources are available for them or not.
Conclusion and discussion
On the other extreme, the clustering-based approach labels phrases based on the contained words alone. The POS grammar represents an intermediate point on this spectrum, since POS tags can change based on surrounding words in the sentence; and the position of the K-means model depends on the influence of the phrase contexts on the clustering process.
Experiments
The source and target language parses for the syntax-augmented grammar, as well as the POS tags for our POS-based grammars were generated by the Stanford parser (Klein and Manning, 2003).
Experiments
Our approach, using target POS tags (‘POS-tgt (no phr.
Experiments
, 36 (the number of Penn Treebank POS tags, used for the ‘POS’ models, is 36). For ‘Clust’, we see a comfortably wide plateau of nearly-identical scores from N = 7, …
Hard rule labeling from word classes
We use the simple term ‘tag’ to stand for any kind of word-level analysis—a syntactic, statistical, or other means of grouping word types or tokens into classes, possibly based on their position and context in the sentence, POS tagging being the most obvious example.
Related work
(2007) improve the statistical phrase-based MT model by injecting supertags, lexical information such as the POS tag of the word and its subcategorization information, into the phrase table, resulting in generalized phrases with placeholders in them.
POS tags is mentioned in 9 sentences in this paper.
Blanco, Eduardo and Moldovan, Dan
Learning Algorithm
Features (1—5) are extracted for each role and capture their presence, first POS tag and word, length and position within the roles present for that instance.
Learning Algorithm
A1-postag is extracted for the following POS tags: DT, JJ, PRP, CD, RB, VB and WP; A1-keyword for the following words: any, anybody, anymore, anyone, anything, anytime, anywhere, certain, enough, fall, many, much, other, some, specifics, too and until.
Learning Algorithm
These lists of POS tags and keywords were extracted after manual examination of training examples and aim at signaling whether this role corresponds to the focus.
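The trigger-list feature extraction described in these excerpts can be sketched as below. The function and feature-name strings are assumptions for illustration; only the POS tag and keyword lists come from the excerpt above.

```python
POSTAG_TRIGGERS = {"DT", "JJ", "PRP", "CD", "RB", "VB", "WP"}
KEYWORD_TRIGGERS = {
    "any", "anybody", "anymore", "anyone", "anything", "anytime",
    "anywhere", "certain", "enough", "fall", "many", "much", "other",
    "some", "specifics", "too", "until",
}

def role_trigger_features(tagged_role):
    """Fire a postag feature for each listed POS tag and a keyword
    feature for each listed word occurring in the role (sketch)."""
    feats = set()
    for word, tag in tagged_role:
        if tag in POSTAG_TRIGGERS:
            feats.add("postag=" + tag)
        if word.lower() in KEYWORD_TRIGGERS:
            feats.add("keyword=" + word.lower())
    return feats
```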
POS tags is mentioned in 6 sentences in this paper.
Liang, Percy and Jordan, Michael and Klein, Dan
Experiments
predicate x in w (e.g., (Boston, Boston)), and (iii) predicates for each POS tag in {JJ, NN, NNS} (e.g., (JJ, size), (JJ, area), etc.
Experiments
We also define an augmented lexicon L+ which includes a prototype word x for each predicate appearing in (iii) above (e.g., (large, size)), which cancels the predicates triggered by x’s POS tag.
Experiments
SEMRESP requires a lexicon of 1.42 words per non-value predicate, WordNet features, and syntactic parse trees; DCS requires only words for the domain-independent predicates (overall, around 0.5 words per non-value predicate), POS tags, and very simple indicator features.
POS tags is mentioned in 6 sentences in this paper.
Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael
Abstract
Many would be better modeled by POS tag unigrams (with no word information) or by longer n-grams consisting of either words, POS tags, or a combination of the two.
Abstract
Each n-gram is a sequence of words, POS tags, or a combination of words and POS tags.
Abstract
or a POS tag.
POS tags is mentioned in 5 sentences in this paper.
Ponvert, Elias and Baldridge, Jason and Erk, Katrin
CD
CCM learns to predict a set of brackets over a string (in practice, a string of POS tags) by jointly estimating constituent and distituent strings and contexts using an iterative EM-like procedure (though, as noted by Smith and Eisner (2004), CCM is deficient as a generative model).
Introduction
Recent work (Headden III et al., 2009; Cohen and Smith, 2009; Hanig, 2010; Spitkovsky et al., 2010) has largely built on the dependency model with valence of Klein and Manning (2004), and is characterized by its reliance on gold-standard part-of-speech (POS) annotations: the models are trained on and evaluated using sequences of POS tags rather than raw tokens.
Introduction
An exception which learns from raw text and makes no use of POS tags is the common cover links parser (CCL, Seginer 2007).
Tasks and Benchmark
Importantly, until recently it was the only unsupervised raw-text constituent parser to produce results competitive with systems which use gold POS tags (Klein and Manning, 2002; Klein and Manning, 2004; Bod, 2006), and the recent improved raw-text parsing results of Reichart and Rappoport (2010) make direct use of CCL without modification.
Tasks and Benchmark
Finally, CCL outperforms most published POS-based models when those models are trained on unsupervised word classes rather than gold POS tags.
POS tags is mentioned in 5 sentences in this paper.
Bollegala, Danushka and Weir, David and Carroll, John
A Motivating Example
POS tags Excellent/JJ and/CC broad/JJ
Sentiment Sensitive Thesaurus
We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs).
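The POS-based content-word filter described above can be sketched with Penn Treebank tag prefixes; the prefix set is a standard assumption (NN* nouns, VB* verbs, JJ* adjectives, RB* adverbs), not necessarily the authors' exact filter.

```python
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")  # nouns, verbs, adj., adv.

def content_words(tagged_tokens):
    """Keep tokens whose Penn Treebank POS tag marks a content word."""
    return [w for w, t in tagged_tokens if t.startswith(CONTENT_TAG_PREFIXES)]

tagged = [("Excellent", "JJ"), ("and", "CC"), ("broad", "JJ"),
          ("selection", "NN")]
words = content_words(tagged)  # ['Excellent', 'broad', 'selection']
```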
Sentiment Sensitive Thesaurus
In addition to word-level sentiment features, we replace words with their POS tags to create
Sentiment Sensitive Thesaurus
POS tags generalize the word-level sentiment features, thereby reducing feature sparseness.
POS tags is mentioned in 4 sentences in this paper.
Tan, Ming and Zhou, Wenli and Zheng, Lei and Wang, Shaojun
Composite language model
The SLM is based on statistical parsing techniques that allow syntactic analysis of sentences; it assigns a probability p(W, T) to every sentence W and every possible binary parse T. The terminals of T are the words of W with POS tags, and the nodes of T are annotated with phrase headwords and nonterminal labels.
Composite language model
A word-parse k-prefix has a set of exposed heads h_-m, …, h_-1, with each head being a pair (headword, nonterminal label), or in the case of a root-only tree, (word, POS tag).
Composite language model
An m-th order SLM (m-SLM) has three operators to generate a sentence: the WORD-PREDICTOR predicts the next word w_k+1 based on the m leftmost exposed headwords h_-m, …, h_-1 in the word-parse k-prefix with probability p(w_k+1 | h_-m, …, h_-1), and then passes control to the TAGGER; the TAGGER predicts the POS tag t_k+1 of the next word based on the next word w_k+1 and the POS tags of the m leftmost exposed headwords in the word-parse k-prefix with probability p(t_k+1 | w_k+1, h_-m.tag, …, h_-1.tag); the CONSTRUCTOR builds the partial parse T_k from T_k-1, w_k, and t_k in a series of moves ending with NULL, where each parse move a is made with probability p(a | h_-m, …, h_-1); a ∈ A = {(unary, NTlabel), (adjoin-left, NTlabel), (adjoin-right, NTlabel), null}.
Training algorithm
The TAGGER and CONSTRUCTOR are conditional probabilistic models of the type p(u | z_1, …, z_n), where u and z_1, …, z_n belong to a mixed set of words, POS tags, NTtags, and CONSTRUCTOR actions (u only), and z_1, …, z_n form a linear Markov chain.
POS tags is mentioned in 4 sentences in this paper.
Auli, Michael and Lopez, Adam
Abstract
On CCGbank we achieve a labelled dependency F-measure of 88.8% on gold POS tags, and 86.7% on automatic part-of-speech tags, the best reported results for this task.
Conclusion and Future Work
In future work we plan to integrate the POS tagger, which is crucial to parsing accuracy (Clark and Curran, 2004b).
Experiments
To the best of our knowledge, the results obtained with BP and DD are the best reported results on this task using gold POS tags.
POS tags is mentioned in 3 sentences in this paper.