Index of papers in Proc. ACL that mention
  • POS tags
Hatori, Jun and Matsuzaki, Takuya and Miyao, Yusuke and Tsujii, Jun'ichi
Abstract
We propose the first joint model for word segmentation, POS tagging , and dependency parsing for Chinese.
Abstract
Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging , and dependency parsing models.
Abstract
In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing.
Introduction
Furthermore, the word-level information is often augmented with the POS tags , which, along with segmentation, form the basic foundation of statistical NLP.
Introduction
Because the tasks of word segmentation and POS tagging have strong interactions, many studies have been devoted to the task of joint word segmentation and POS tagging for languages such as Chinese (e.g.
Introduction
This is because some of the segmentation ambiguities cannot be resolved without considering the surrounding grammatical constructions encoded in a sequence of POS tags .
Related Works
In Chinese, Luo (2003) proposed a joint constituency parser that performs segmentation, POS tagging , and parsing within a single character-based framework.
POS tags is mentioned in 46 sentences in this paper.
Topics mentioned in this paper:
Sun, Weiwei and Uszkoreit, Hans
Abstract
From the perspective of structural linguistics, we explore paradigmatic and syntagmatic lexical relations for Chinese POS tagging , an important and challenging task for Chinese language processing.
Introduction
Automatically assigning POS tags to words plays an important role in parsing, word sense disambiguation, as well as many other NLP applications.
Introduction
While state-of-the-art tagging systems have achieved accuracies above 97% on English, Chinese POS tagging has proven to be more challenging and obtained accuracies about 93-94% (Tseng et al., 2005b; Huang et al., 2007, 2009; Li et al., 2011).
Introduction
It is generally accepted that Chinese POS tagging often requires more sophisticated language processing techniques that are capable of drawing inferences from more subtle linguistic knowledge.
State-of-the-Art
In some cases, the methods work well without large modifications, such as German POS tagging .
POS tags is mentioned in 35 sentences in this paper.
Topics mentioned in this paper:
Rastrow, Ariya and Dredze, Mark and Khudanpur, Sanjeev
Experiments
The dependency parser and POS tagger are trained on supervised data and up-trained on data labeled by the CKY-style bottom-up constituent parser of Huang et al.
Experiments
We use the POS tagger to generate tags for dependency training to match the test setting.
Incorporating Syntactic Structures
Long-span models — generative or discriminative, N -best or hill climbing — rely on auxiliary tools, such as a POS tagger or a parser, for extracting features for each hypothesis during rescoring, and during training for discriminative models.
Incorporating Syntactic Structures
A major complexity factor is due to processing 100s or 1000s of hypotheses for each speech utterance, even during hill climbing, each of which must be POS tagged and parsed.
Incorporating Syntactic Structures
For integer typed features the mapping is trivial, for string typed features (e. g. a POS tag identity) we use a mapping of the corresponding vocabulary to integers.
Syntactic Language Models
where h.w and h.t denote the word identity and the POS tag of the corresponding exposed head word.
Up-Training
We apply up-training to improve the accuracy of both our fast POS tagger and dependency parser.
POS tags is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Abend, Omri and Rappoport, Ari
Algorithm
To estimate this joint distribution, PSH samples are extracted from the training corpus using unsupervised POS taggers (Clark, 2003; Abend et al., 2010) and an unsupervised parser (Seginer, 2007).
Algorithm
This parser is unique in its ability to induce a bracketing (unlabeled parsing) from raw text (without even using POS tags ) with strong results.
Algorithm
We continue by tagging the corpus using Clark’s unsupervised POS tagger (Clark, 2003) and the unsupervised Prototype Tagger (Abend et al., 2010)2.
Conclusion
The algorithm applies state-of-the-art unsupervised parser and POS tagger to collect statistics from a large raw text corpus.
Core-Adjunct in Previous Work
In addition, supervised models utilize supervised parsers and POS taggers, while the current state-of-the-art in unsupervised parsing and POS tagging is considerably worse than their supervised counterparts.
Core-Adjunct in Previous Work
First, all works use manual or supervised syntactic annotations, usually including a POS tagger .
Experimental Setup
This scenario decouples the accuracy of the algorithm from the quality of the unsupervised POS tagging .
Experimental Setup
Finally, we experiment on a scenario where even argument identification on the test set is not provided, but performed by the algorithm of (Abend et al., 2009), which uses neither syntactic nor SRL annotation but does utilize a supervised POS tagger .
Introduction
However, no work has tackled this task in a fully unsupervised manner. Unsupervised models reduce reliance on the costly and error-prone manual multilayer annotation (POS tagging, parsing, core-adjunct tagging) commonly used for this task.
POS tags is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Bengoetxea, Kepa and Agirre, Eneko and Nivre, Joakim and Zhang, Yue and Gojenola, Koldo
Abstract
Overall, we can say that the improvements are small and not significant using automatic POS tags, contrary to previously published results using gold POS tags (Agirre et al., 2011).
Experimental Framework
We modified the system in order to add semantic features, combining them with wordforms and POS tags , on the parent and child nodes of each arc.
Introduction
using MaltParser on gold POS tags .
Introduction
In this work, we will investigate the effect of semantic information using predicted POS tags .
Related work
(2011) successfully introduced WordNet classes in a dependency parser, obtaining improvements on the full PTB using gold POS tags, trying different combinations of semantic classes.
Results
For all the tests, we used a perceptron POS-tagger (Collins, 2002), trained on WSJ sections 2-21, to assign POS tags automatically to both the training (using 10-way jackknifing) and test data, obtaining a POS tagging accuracy of 97.32% on the test data.
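For readers unfamiliar with the jackknifing step mentioned above, a minimal sketch follows. The `train_tagger` factory and its `.tag()` interface are hypothetical placeholders, not the authors' implementation; the point is only that each fold is tagged by a model that never saw it during training.

```python
def jackknife_tags(sentences, train_tagger, n_folds=10):
    """Assign automatic POS tags to a training corpus via n-fold jackknifing.

    Each fold is tagged by a tagger trained on the remaining folds, so no
    sentence is tagged by a model that saw it during training.
    `train_tagger` is a hypothetical factory returning an object with a
    `.tag(sentence)` method (e.g. an averaged perceptron tagger).
    """
    folds = [sentences[i::n_folds] for i in range(n_folds)]
    tagged = []
    for i, held_out in enumerate(folds):
        train_data = [s for j, fold in enumerate(folds) if j != i for s in fold]
        tagger = train_tagger(train_data)
        tagged.extend(tagger.tag(s) for s in held_out)
    return tagged
```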
Results
Overall, we see that the small improvements do not confirm the previous results on Penn2Malt, MaltParser and gold POS tags .
Results
One of the obstacles of automatic parsers is the presence of incorrect POS tags due to auto-
POS tags is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Parikh, Ankur P. and Cohen, Shay B. and Xing, Eric P.
Abstract
x_i is the POS tag of w_i.
Abstract
The word embeddings are used during the learning process, but the final decoder that the learning algorithm outputs maps a POS tag sequence x to a parse tree.
Abstract
While ideally we would want to use the word information in decoding as well, much of the syntax of a sentence is determined by the POS tags, and relatively high level of accuracy can be achieved by learning, for example, a supervised parser from POS tag sequences.
POS tags is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Lippincott, Thomas and Korhonen, Anna and Ó Séaghdha, Diarmuid
Conclusions and future work
Second, simply treating POS tags within a small window of the verb as pseudo-GRs produces state-of-the-art results without the need for a parsing model.
Conclusions and future work
In fact, by integrating results from unsupervised POS tagging (Teichert and Daume III, 2009) we could render this approach fully domain- and language-independent.
Introduction
Second, by replacing the syntactic features with an approximation based on POS tags , we achieve state-of-the-art performance without relying on error-prone unlexicalized or domain-specific lexicalized parsers.
Methodology
The CONLL format is a common language for comparing output from dependency parsers: each lexical item has an index, lemma, POS tag , tGR in which it is the dependent, and index to the corresponding head.
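For concreteness, a schematic CoNLL-style fragment with the columns described above might look as follows; the exact column inventory and ordering vary between CoNLL shared tasks, so this is only an illustrative simplification.

```
# ID  FORM      LEMMA     POS  HEAD  DEPREL
1     Economic  economic  JJ   2     amod
2     news      news      NN   3     nsubj
3     hit       hit       VBD  0     root
```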
Methodology
Table 2 shows the three variations we tested: the simple tGR type, with parameterization for the POS tags of head and dependent, and with closed-class POS tags (determiners, pronouns and prepositions) lexicalized.
Methodology
An unlexicalized parser cannot distinguish these based just on POS tags , while a lexicalized parser requires a large treebank.
Previous work
Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993).
Previous work
Their study employed unsupervised POS tagging and parsing, and measures of selectional preference and argument structure as complementary features for the classifier.
Results
Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model (Lippincott et al., 2010).
Results
Third, lexicalizing the closed-class POS tags introduces semantic information outside the scope of the alternation-based definition of subcategorization.
POS tags is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Ma, Ji and Zhang, Yue and Zhu, Jingbo
Abstract
In this paper, we address the problem of web-domain POS tagging using a two-phase approach.
Abstract
The representation is integrated as features into a neural network that serves as a scorer for an easy-first POS tagger .
Introduction
However, state-of-the-art POS taggers in the literature (Collins, 2002; Shen et al., 2007) are mainly optimized on the Penn Treebank (PTB), and when shifted to web data, tagging accuracies drop significantly (Petrov and McDonald, 2012).
Introduction
We integrate the learned encoder with a set of well-established features for POS tagging (Ratnaparkhi, 1996; Collins, 2002) in a single neural network, which is applied as a scorer to an easy-first POS tagger .
Introduction
We choose the easy-first tagging approach since it has been demonstrated to give higher accuracies than the standard left-to-right POS tagger (Shen et al., 2007; Ma et al., 2013).
Learning from Web Text
This may partly be due to the fact that unlike computer vision tasks, the input structure of POS tagging or other sequential labelling tasks is relatively simple, and a single nonlinear layer is enough to model the interactions within the input (Wang and Manning, 2013).
Neural Network for POS Disambiguation
The main challenge to designing the neural network structure is: on the one hand, we hope that the model can take the advantage of information provided by the learned WRRBM, which reflects general properties of web texts, so that the model generalizes well in the web domain; on the other hand, we also hope to improve the model’s discriminative power by utilizing well-established POS tagging features, such as those of Ratnaparkhi (1996).
Neural Network for POS Disambiguation
Under the output layer, the network consists of two modules: the web-feature module, which incorporates knowledge from the pre-trained WRRBM, and the sparse-feature module, which makes use of other POS tagging features.
Neural Network for POS Disambiguation
For POS tagging , we found that a simple linear layer yields satisfactory accuracies.
POS tags is mentioned in 20 sentences in this paper.
Topics mentioned in this paper:
Li, Zhenghua and Liu, Ting and Che, Wanxiang
Dependency Parsing
Given an input sentence x = w_0 w_1 ... w_n and its POS tag sequence t = t_0 t_1 ... t_n, the goal of dependency parsing is to build a dependency tree as depicted in Figure 1, denoted by d = {(h, m, l) : 0 ≤ h ≤ n, 0 < m ≤ n, l ∈ L}, where (h, m, l) indicates a directed arc from the head word (also called father) w_h to the modifier (also called child or dependent) w_m with a dependency label l, and L is the label set.
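Under this definition a dependency tree is just a set of labelled arcs over token positions. A minimal sketch of that data structure, assuming nothing about the authors' implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arc:
    head: int      # h: index of the head word; 0 is the artificial root w_0
    modifier: int  # m: index of the modifier word, 1..n
    label: str     # l: dependency label drawn from the label set L

def is_single_headed(arcs, n):
    """Basic well-formedness check: every token 1..n receives exactly one head.
    (A full tree check would additionally require acyclicity.)"""
    heads = {a.modifier for a in arcs}
    return len(arcs) == n and heads == set(range(1, n + 1))
```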
Dependency Parsing with QG Features
The type of the TP is conjoined with the related words and POS tags , such that the QG—enhanced parsing models can make more elaborate decisions based on the context.
Experiments and Analysis
CDT and CTB5/6 adopt different POS tag sets, and converting from one tag set to another is difficult (Niu et al., 2009). To overcome this problem, we use the People's Daily corpus (PD), a large-scale corpus annotated with word segmentation and POS tags, to train a statistical POS tagger.
Experiments and Analysis
The tagger produces a universal layer of POS tags for both the source and target treebanks.
Experiments and Analysis
For all models used in current work ( POS tagging and parsing), we adopt averaged perceptron to train the feature weights (Collins, 2002).
POS tags is mentioned in 24 sentences in this paper.
Topics mentioned in this paper:
Sun, Weiwei and Wan, Xiaojun
About Heterogeneous Annotations
For Chinese word segmentation and POS tagging , supervised learning has become a dominant paradigm.
About Heterogeneous Annotations
Although several institutions to date have released their segmented and POS tagged data, acquiring sufficient quantities of high quality training examples is still a major bottleneck.
About Heterogeneous Annotations
The statistics after colons are how many times this POS tag pair appears among the 3561 words that are consistently segmented.
Introduction
In particular, joint word segmentation and POS tagging is addressed as a two step process.
Joint Chinese Word Segmentation and POS Tagging
words, word segmentation and POS tagging are important initial steps for Chinese language processing.
Joint Chinese Word Segmentation and POS Tagging
Two kinds of approaches are popular for joint word segmentation and POS tagging .
Joint Chinese Word Segmentation and POS Tagging
In this kind of approach, the task is formulated as the classification of characters into POS tags with boundary information.
Structure-based Stacking
Table 1: Mapping between CTB and PPD POS Tags .
POS tags is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Bollegala, Danushka and Weir, David and Carroll, John
Distribution Prediction
As we go on to show in Section 6, this enables us to use the same distribution prediction method for both POS tagging and sentiment classification.
Domain Adaptation
We consider two DA tasks: (a) cross-domain POS tagging (Section 4.1), and (b) cross-domain sentiment classification (Section 4.2).
Domain Adaptation
4.1 Cross-Domain POS Tagging
Domain Adaptation
manually POS tagged) sentence, we select its neighbours in the source domain as additional features.
Introduction
0 Using the learnt distribution prediction model, we propose a method to learn a cross-domain POS tagger .
Related Work
words that appear in both the source and target domains) to adapt a POS tagger to a target domain.
Related Work
Choi and Palmer (2012) propose a cross-domain POS tagging method by training two separate models: a generalised model and a domain-specific model.
Related Work
Adding latent states to the smoothing model further improves the POS tagging accuracy (Huang and Yates, 2012).
POS tags is mentioned in 22 sentences in this paper.
Topics mentioned in this paper:
Gardent, Claire and Narayan, Shashi
Experiment and Results
One feature of our approach is that it permits mining the data for tree patterns of arbitrary size using different types of labelling information ( POS tags , dependencies, word forms and any combination thereof).
Experiment and Results
4.3.1 Mining on single labels (word form, POS tag or dependency)
Experiment and Results
Mining on a single label permits (i) assessing the relative impact of each category in a given label category and (ii) identifying different sources of errors depending on the type of label considered ( POS tag , dependency or word form).
POS tags is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Nagata, Ryo and Whittaker, Edward and Sheinman, Vera
Introduction
Such a comparison brings up another crucial question: “Do existing POS taggers and chun-
Introduction
Nevertheless, a great number of researchers have used existing POS taggers and chunkers to analyze the writing of learners of English.
Introduction
For instance, error detection methods normally use a POS tagger and/or a chunker in the error detection process.
Method
Considering this, we determined a basic rule as follows: “Use the Penn Treebank tag set and preserve the original texts as much as possible.” To handle such errors, we made several modifications and added two new POS tags (CE and UK) and another two for chunking (XP and PH), which are described below.
Method
Note that each POS tag is hyphenated.
5.1 POS Tagging
UK and XP stand for unknown and X phrase, respectively.
5.1 POS Tagging
HMM-based and CRF-based POS taggers were tested on the shallow-parsed corpus.
5.1 POS Tagging
Both use the Penn Treebank POS tag set.
POS tags is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Das, Dipanjan and Petrov, Slav
Approach Overview
The focus of this work is on building POS taggers for foreign languages, assuming that we have an English POS tagger and some parallel text between the two languages.
Approach Overview
The POS distributions over the foreign trigram types are used as features to learn a better unsupervised POS tagger (§5).
Experiments and Results
9We extracted only the words and their POS tags from the treebanks.
Experiments and Results
(2011) provide a mapping A from the fine-grained language specific POS tags in the foreign treebank to the universal POS tags .
Graph Construction
Graph construction for structured prediction problems such as POS tagging is nontrivial: on the one hand, using individual words as the vertices throws away the context
Graph Construction
They considered a semi-supervised POS tagging scenario and showed that one can use a graph over trigram types, and edge weights based on distributional similarity, to improve a supervised conditional random field tagger.
Introduction
Unfortunately, the best completely unsupervised English POS tagger (that does not make use of a tagging dictionary) reaches only 76.1% accuracy (Christodoulopoulos et al., 2010), making its practical usability questionable at best.
Introduction
Our final average POS tagging accuracy of 83.4% compares very favorably to the average accuracy of Berg-Kirkpatrick et al.’s monolingual unsupervised state-of-the-art model (73.0%), and considerably bridges the gap to fully supervised POS tagging performance (96.6%).
PCS Induction
After running label propagation (LP), we compute tag probabilities for foreign word types x by marginalizing the POS tag distributions of foreign trigrams u_i = x- x x+ over the left and right contexts.
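The marginalization step can be pictured as a count-weighted average of the tag distributions of all trigram types whose centre word is the word type in question. The sketch below is a simplified illustration under that reading; in Das and Petrov's actual model the trigram distributions are label-propagation posteriors, which are simply taken as given here.

```python
from collections import defaultdict

def word_tag_distribution(trigram_tag_dists, trigram_counts):
    """Marginalize POS distributions over trigram types (x-, x, x+) onto the centre word x.

    trigram_tag_dists: {(left, word, right): {tag: prob}}
    trigram_counts:    {(left, word, right): corpus count}
    """
    totals = defaultdict(lambda: defaultdict(float))
    mass = defaultdict(float)
    for trigram, dist in trigram_tag_dists.items():
        _, word, _ = trigram
        count = trigram_counts.get(trigram, 1)
        mass[word] += count
        for tag, prob in dist.items():
            totals[word][tag] += count * prob
    return {w: {t: v / mass[w] for t, v in tags.items()} for w, tags in totals.items()}
```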
PCS Induction
This vector tag is constructed for every word in the foreign vocabulary and will be used to provide features for the unsupervised foreign language POS tagger .
PCS Induction
For English POS tagging , Berg-Kirkpatrick et al.
POS tags is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Chan, Yee Seng and Roth, Dan
Mention Extraction System
These are a combination of w_i itself, its POS tag, and its integer offset from the last word (lw) in the mention.
Mention Extraction System
These features are meant to capture the word and POS tag sequences in mentions.
Mention Extraction System
Contextual: We extract the word C_{-1,-1} immediately before m_i, the word C_{+1,+1} immediately after m_i, and their associated POS tags P.
Relation Extraction System
POS features If there is a single word between the two mentions, we extract its POS tag .
Relation Extraction System
Given the hw of m, P_{i,j} refers to the sequence of POS tags in the immediate context of hw (we exclude the POS tag of hw).
Relation Extraction System
The offsets i and j denote the position (relative to hw) of the first and last POS tag respectively.
Syntactico-Semantic Structures
• If u* is not empty, we require that it satisfies any of the following POS tag sequences: JJ+ \/ JJ and JJ?
Syntactico-Semantic Structures
These are (optional) POS tag sequences that normally start a valid noun phrase.
Syntactico-Semantic Structures
• We use two patterns to differentiate between premodifier relations and possessive relations, by checking for the existence of POS tags PRP$, WP$, POS, and the word “'s”.
POS tags is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Chen, Yanping and Zheng, Qinghua and Zhang, Wei
Discussion
Except in Row 8 and Row 11, when the two head nouns of an entity pair were combined as a semantic pair and when the POS tag was combined with the entity type, the performance decreased.
Discussion
Comparing the reference set (5) with the reference set (3), the Head noun and adjacent entity POS tag get a better performance when used as singletons.
Discussion
In this paper, for a better demonstration of the constraint condition, we still use the Position Sensitive as the default setting to use the Head noun and the adjacent entity POS tag.
Feature Construction
All the employed features are simply classified into five categories: Entity Type and Subtype, Head Noun, Position Feature, POS Tag and Omni-word Feature.
Feature Construction
POS Tag: In our model, we use only the adjacent entity POS tags , which lie in two sides of the entity mention.
Feature Construction
These POS tags are labelled by the ICTCLAS packagez.
POS tags is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Bendersky, Michael and Croft, W. Bruce and Smith, David A.
Experiments
This sample is manually labeled with three annotations: capitalization, POS tags , and segmentation, according to the description of these annotations in Figure 1.
Experiments
Table 1: Summary of query annotation performance for capitalization (CAP), POS tagging (TAG) and segmentation.
Experiments
In case of POS tagging , the decisions are ternary, and hence we report the classification accuracy.
Independent Query Annotations
On the other hand, given a sentence from a corpus that is relevant to the query, such as “Hawaiian Falls is a family-friendly water-park”, the word “falls” is correctly identified by a standard POS tagger as a proper noun.
Independent Query Annotations
(2010), an estimate of p(C_i|r) is a smoothed estimator that combines the information from the retrieved sentence r with the information about unigrams (for capitalization and POS tagging) and bigrams (for segmentation) from a large n-gram corpus (Brants and Franz, 2006).
Joint Query Annotation
Many query annotations that are useful for IR can be represented using this simple form, including capitalization, POS tagging , phrase chunking, named entity recognition, and stopword indicators, to name just a few.
Joint Query Annotation
For instance, imagine that we need to perform two annotations: capitalization and POS tagging .
Query Annotation Example
In this scheme, each query is marked-up using three annotations: capitalization, POS tags , and segmentation indicators.
Related Work
Most of the previous work on query annotation focuses on performing a particular annotation task (e.g., segmentation or POS tagging ) in isolation.
POS tags is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Gormley, Matthew R. and Mitchell, Margaret and Van Durme, Benjamin and Dredze, Mark
Approaches
A typical pipeline consists of a POS tagger , dependency parser, and semantic role labeler.
Approaches
Brown Clusters We use fully unsupervised Brown clusters (Brown et al., 1992) in place of POS tags .
Approaches
We define the DMV such that it generates sequences of word classes: either POS tags or Brown clusters as in Spitkovsky et al.
Experiments
Our experiments are subtractive, beginning with all supervision available and then successively removing (a) dependency syntax, (b) morphological features, (c) POS tags , and (d) lemmas.
Experiments
The CoNLL-2009 Shared Task (Hajic et al., 2009) dataset contains POS tags , lemmas, morphological features, syntactic dependencies, predicate senses, and semantic roles annotations for 7 languages: Catalan, Chinese, Czech, English, German, Japanese,4 Spanish.
Experiments
We first compare our models trained as a pipeline, using all available supervision (syntax, morphology, POS tags , lemmas) from the CoNLL-2009 data.
Introduction
• Use of Brown clusters in place of POS tags for low-resource SRL.
Related Work
(2012) limit their exploration to a small set of basic features, and included high-resource supervision in the form of lemmas, POS tags , and morphology available from the CoNLL 2009 data.
Related Work
Our experiments also consider ‘longer’ pipelines that include earlier stages: a morphological analyzer, POS tagger , lemmatizer.
POS tags is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Zhang, Yuan and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi and Globerson, Amir
Features
• Coordination: In a coordinate structure, the two adjacent conjuncts usually agree with each other on POS tags and their span lengths.
Features
Therefore, we add different features to capture POS tag and span length consistency in a coordinate structure.
Features
• Span Length: This feature captures the distribution of the binned span length of each POS tag.
Introduction
When proposing a small move, i.e., sampling a head of the word, we can also jointly sample its POS tag from a set of alternatives provided by the tagger.
Sampling-Based Dependency Parsing with Global Features
For instance, we can sample the POS tag , the dependency relation or morphology information.
Sampling-Based Dependency Parsing with Global Features
POS correction scenario in which only the predicted POS tags are provided in the testing phase, while both gold and predicted tags are available for the training set.
Sampling-Based Dependency Parsing with Global Features
We extend our model such that it jointly learns how to predict a parse tree and also correct the predicted POS tags for a better parsing performance.
POS tags is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro and Takamura, Hiroya and Okumura, Manabu
Abstract
In particular, we extend the monolingual infinite tree model (Finkel et al., 2007) to a bilingual scenario: each hidden state ( POS tag ) of a source-side dependency tree emits a source word together with its aligned target word, either jointly (joint model), or independently (independent model).
Abstract
Evaluations of Japanese-to-English translation on the NTCIR-9 data show that our induced Japanese POS tags for dependency trees improve the performance of a forest-to-string SMT system.
Introduction
However, dependency parsing, which is a popular choice for Japanese, can incorporate only shallow syntactic information, i.e., POS tags , compared with the richer syntactic phrasal categories in constituency parsing.
Introduction
Figure 1: Examples of Existing Japanese POS Tags and Dependency Structures
Introduction
If we could discriminate POS tags for two cases, we might improve the performance of a Japanese-to-English SMT system.
POS tags is mentioned in 51 sentences in this paper.
Topics mentioned in this paper:
Dridan, Rebecca and Kordoni, Valia and Nicholson, Jeremy
Abstract
In terms of robustness, we try using different types of external data to increase lexical coverage, and find that simple POS tags have the most effect, increasing coverage on unseen data by up to 45%.
Abstract
Even using vanilla POS tags we achieve some efficiency gains, but when using detailed lexical types as supertags we manage to halve parsing time with minimal loss of coverage or precision.
Background
Supertagging is the process of assigning probable ‘supertags’ to words before parsing to restrict parser ambiguity, where a supertag is a tag that includes more specific information than the typical POS tags .
Parser Restriction
In these experiments we look at two methods of restricting the parser, first by using POS tags and then using lexical types.
Parser Restriction
We use TreeTagger (Schmid, 1994) to produce POS tags and then open class words are restricted if the POS tagger assigned a tag with a probability over a certain threshold.
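A minimal sketch of the threshold-based restriction just described, assuming a hypothetical tagger interface (`best_tag`) and an illustrative open-class tag set; it is not the authors' code, only an instance of the idea of trusting the tagger only when it is confident.

```python
OPEN_CLASS = frozenset({"NN", "NNS", "VB", "VBD", "JJ", "RB"})  # illustrative subset

def restrict_lexicon(tokens, tagger, threshold=0.9):
    """Per token, return the set of POS tags the parser may consider,
    or None for 'unrestricted'."""
    restrictions = []
    for tok in tokens:
        tag, prob = tagger.best_tag(tok)   # hypothetical: most probable tag and its probability
        if tag in OPEN_CLASS and prob > threshold:
            restrictions.append({tag})     # parser lexicon restricted to the predicted tag
        else:
            restrictions.append(None)      # closed-class or low-confidence: leave open
    return restrictions
```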
Parser Restriction
Table 1: Results obtained when restricting the parser lexicon according to the POS tag , where words are restricted according to a threshold of POS probabilities.
POS tags is mentioned in 48 sentences in this paper.
Topics mentioned in this paper:
Espinosa, Dominic and White, Michael and Mehay, Dennis
Background
The best performing model interpolates a word trigram model with a trigram model that chains a POS model with a supertag model, where the POS model conditions on the previous two POS tags, and the supertag model conditions on the previous two POS tags as well as the current one.
The Approach
Clark (2002) notes in his parsing experiments that the POS tags of the surrounding words are highly informative.
The Approach
As discussed below, a significant gain in hypertagging accuracy resulted from including features sensitive to the POS tags of a node’s parent, the node itself, and all of its arguments and modifiers.
The Approach
Predicting these tags requires the use of a separate POS tagger , which operates in a manner similar to the hypertagger itself, though exploiting a slightly different set of features (e. g., including features corresponding to the four-character prefixes and suffixes of rare logical predication names).
POS tags is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Goldberg, Yoav and Tsarfaty, Reut
A Generative PCFG Model
The entries in such a lexicon may be thought of as meaningful surface segments paired up with their PoS tags, l_i = (s_i, p_i), but note that a surface segment s need not be a space-delimited token.
A Generative PCFG Model
(1996) who consider the kind of probabilities a generative parser should get from a PoS tagger, and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context, but rather, it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
Model Preliminaries
A Hebrew surface token may have several readings, each of which corresponds to a sequence of segments and their corresponding PoS tags.
Model Preliminaries
We refer to different readings as different analyses whereby the segments are deterministic given the sequence of PoS tags .
Model Preliminaries
We refer to a segment and its assigned PoS tag as a lexeme, and so analyses are in fact sequences of lexemes.
Modern Hebrew Structure
Such discrepancies can be aligned via an intermediate level of PoS tags .
Modern Hebrew Structure
PoS tags impose a unique morphological segmentation on surface tokens and present a unique valid yield for syntactic trees.
Previous Work on Hebrew Processing
Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
POS tags is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Background
To perform segmentation and tagging simultaneously in a uniform framework, according to Ng and Low (2004), the tag is composed of a word boundary part and a POS part, e.g., “B_NN” refers to the first character in a word with POS tag “NN”.
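A small sketch of this joint boundary+POS encoding; the B/I/E/S boundary alphabet below is assumed for illustration (the cited work uses its own boundary symbols), but the principle of pairing a boundary symbol with a word-level POS tag is as quoted above.

```python
def to_character_labels(words_with_tags):
    """Encode a segmented, POS-tagged sentence as per-character joint labels.

    Example: [("中国", "NR")] -> [("中", "B_NR"), ("国", "E_NR")]
    """
    labels = []
    for word, pos in words_with_tags:
        if len(word) == 1:
            labels.append((word, "S_" + pos))          # single-character word
        else:
            labels.append((word[0], "B_" + pos))        # word-initial character
            for ch in word[1:-1]:
                labels.append((ch, "I_" + pos))         # word-internal character
            labels.append((word[-1], "E_" + pos))       # word-final character
    return labels
```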
Background
As for the POS tag, we shall use the 33 tags in the Chinese Treebank.
Introduction
The traditional way of segmentation and tagging is performed in a pipeline approach, first segmenting a sentence into words, and then assigning each word a POS tag .
Introduction
The pipeline approach is very simple to implement, but frequently causes error propagation, given that wrong segmentations in the earlier stage harm the subsequent POS tagging (Ng and Low, 2004).
Introduction
The joint approaches of word segmentation and POS tagging (joint S&T) are proposed to resolve these two tasks simultaneously.
Method
In fact, the sparsity is also a common phenomenon among character-based CWS and POS tagging .
Method
The performance measurement indicators for word segmentation and POS tagging (joint S&T) are the balanced F-score, F = 2PR/(P+R), the harmonic mean of precision (P) and recall (R), and the out-of-vocabulary recall (OOV-R).
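For concreteness, a small sketch of how these metrics could be computed from sets of gold and predicted units; the (start, end, word) span representation is an assumption made for illustration (a tag would be appended to each tuple for joint S&T).

```python
def prf(gold, pred):
    """Precision, recall and balanced F-score over sets of (start, end, word) spans."""
    correct = len(gold & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def oov_recall(gold, pred, training_vocab):
    """Recall restricted to gold spans whose word form (third field) is unseen in training."""
    oov = {span for span in gold if span[2] not in training_vocab}
    return len(oov & pred) / len(oov) if oov else 0.0
```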
Related Work
There are few explorations of semi-supervised approaches for CWS or POS tagging in previous works.
POS tags is mentioned in 20 sentences in this paper.
Topics mentioned in this paper:
Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting
Character-based Chinese Parsing
To produce character-level trees for Chinese NLP tasks, we develop a character-based parsing model, which can jointly perform word segmentation, POS tagging and phrase-structure parsing.
Character-based Chinese Parsing
We make two extensions to their work to enable joint segmentation, POS tagging and phrase-structure parsing from the character level.
Character-based Chinese Parsing
First, we split the original SHIFT action into SHIFT-SEPARATE(t) and SHIFT-APPEND, which jointly perform the word segmentation and POS tagging tasks.
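The two shift actions can be illustrated with a minimal functional sketch of a single transition step; the (stack, buffer) state representation below is an assumption made for illustration and is a simplification of the authors' transition system.

```python
def shift_separate(stack, buffer, tag):
    """Start a new word, tagged `tag`, from the next character on the buffer."""
    char, rest = buffer[0], buffer[1:]
    return stack + [(char, tag)], rest

def shift_append(stack, buffer):
    """Append the next character to the word currently on top of the stack."""
    char, rest = buffer[0], buffer[1:]
    word, tag = stack[-1]
    return stack[:-1] + [(word + char, tag)], rest
```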
Introduction
Compared to a pipeline system, the advantages of a joint system include reduction of error propagation, and the integration of segmentation, POS tagging and syntax features.
Introduction
To analyze word structures in addition to phrase structures, our character-based parser naturally performs joint word segmentation, POS tagging and parsing jointly.
Introduction
We extend their shift-reduce framework, adding more transition actions for word segmentation and POS tagging , and defining novel features that capture character information.
Word Structures and Syntax Trees
They made use of this information to help joint word segmentation and POS tagging .
Word Structures and Syntax Trees
In particular, we mark the original nodes that represent POS tags in CTB-style trees with “-t”, and insert our word structures as unary subnodes of the “-t” nodes.
POS tags is mentioned in 35 sentences in this paper.
Topics mentioned in this paper:
Abend, Omri and Reichart, Roi and Rappoport, Ari
Abstract
In this paper we present an unsupervised algorithm for identifying verb arguments, where the only type of annotation required is POS tagging .
Algorithm
This parser is unique in that it is able to induce a bracketing (unlabeled parsing) from raw text (without even using POS tags ) achieving state-of-the-art results.
Algorithm
The only type of supervised annotation we use is POS tagging .
Algorithm
We use the taggers MX-POST (Ratnaparkhi, 1996) for English and Tree-Tagger (Schmid, 1994) for Spanish, to obtain POS tags for our model.
Introduction
A standard SRL algorithm requires thousands to dozens of thousands sentences annotated with POS tags , syntactic annotation and SRL annotation.
POS tags is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Mareċek, David and Straka, Milan
Introduction
Rasooli and Faili (2012) and Bisk and Hockenmaier (2012) made some efforts to boost the verbocentricity of the inferred structures; however, both of the approaches require manual identification of the POS tags marking the verbs, which renders them useless when unsupervised POS tags are employed.
Related Work
Our dependency model contained a submodel which directly prioritized subtrees that form reducible sequences of POS tags .
Related Work
Reducibility scores of given POS tag sequences were estimated using a large corpus of Wikipedia articles.
Related Work
The weakness of this approach was the fact that longer sequences of POS tags are very sparse and no reducibility scores could be estimated for them.
STOP-probability estimation
Hereinafter, P_stop(ch, dir) denotes the STOP-probability we want to estimate from a large corpus; ch is the head's POS tag and dir is the direction in which the STOP probability is estimated.
STOP-probability estimation
For each POS tag ch in the given corpus, we first compute its left and right “raw” scores S_stop(ch, left) and S_stop(ch, right) as the relative number of times a word with POS tag ch was in the first (or last) position in a reducible sequence found in the corpus.
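One plausible reading of this “relative number” is the fraction of a tag's occurrences that begin (or end) some reducible sequence. A sketch under that assumption, taking the reducible-sequence spans as given:

```python
from collections import Counter

def raw_stop_scores(tagged_corpus, reducible_spans):
    """Left/right raw reducibility scores per POS tag.

    tagged_corpus:   list of sentences, each a list of POS tags
    reducible_spans: per sentence, a list of (start, end) inclusive token indices
                     of reducible sequences found in that sentence
    """
    tag_counts, left, right = Counter(), Counter(), Counter()
    for tags, spans in zip(tagged_corpus, reducible_spans):
        tag_counts.update(tags)
        for start, end in spans:
            left[tags[start]] += 1    # tag opens a reducible sequence
            right[tags[end]] += 1     # tag closes a reducible sequence
    s_left = {t: left[t] / tag_counts[t] for t in tag_counts}
    s_right = {t: right[t] / tag_counts[t] for t in tag_counts}
    return s_left, s_right
```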
STOP-probability estimation
Their main purpose is to sort the POS tags according to their “reducibility”.
POS tags is mentioned in 21 sentences in this paper.
Topics mentioned in this paper:
Ma, Ji and Zhu, Jingbo and Xiao, Tong and Yang, Nan
Abstract
In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron.
Experiments
We use the standard split for dependency parsing and the split used by (Ratnaparkhi, 1996) for POS tagging .
Experiments
For dependency parsing, POS tags of the training set are generated using 10-fold jackknifing.
Experiments
For dependency parsing, we assume gold segmentation and POS tags for the input.
Introduction
The proposed solution is general and can also be applied to other algorithms that exhibit spurious ambiguity, such as easy-first POS tagging (Ma et al., 2012) and transition-based dependency parsing with dynamic oracle (Goldberg and Nivre, 2012).
Introduction
In this paper, we report experimental results on both easy-first dependency parsing and POS tagging (Ma et al., 2012).
Introduction
We show that both easy-first POS tagging and dependency parsing can be improved significantly from beam search and global learning.
Training
wp denotes the head word of p, tp denotes the POS tag of wp.
POS tags is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Cirik, Volkan
Algorithm
We induce the number of POS tags of a word type at this step.
Algorithm
Furthermore, they will have the same POS tags .
Experiments
As a result, this method inaccurately induces POS tags for the occurrences of word types with high gold tag perplexity.
Experiments
In other words, we assume that the number of different POS tags of each word type is equal to 2.
Introduction
Part-of-speech or POS tagging is an important preprocessing step for many natural language processing applications, because grammatical rules are not functions of individual words; instead, they are functions of word categories.
Introduction
Unlike supervised POS tagging systems, POS induction systems make use of unsupervised methods.
Introduction
Type based methods suffer from POS ambiguity because one POS tag is assigned to each word type.
POS tags is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Yates, Alexander
Experiments
We investigate the use of smoothing in two test systems, conditional random field (CRF) models for POS tagging and chunking.
Experiments
Our baseline CRF system for POS tagging follows the model described by Lafferty et al.
Experiments
In addition to the transition, word-level, and orthographic features, we include features relating automatically-generated POS tags and the chunk labels.
Introduction
effects of our smoothing techniques on two sequence-labeling tasks, POS tagging and chunking, to answer the following: I.
Introduction
Our best smoothing technique improves a POS tagger by 11% on OOV words, and a chunker by an impressive 21% on OOV words.
POS tags is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Wang, Zhiguo and Xue, Nianwen
Abstract
First, to resolve the error propagation problem of the traditional pipeline approach, we incorporate POS tagging into the syntactic parsing process.
Introduction
First, POS tagging is typically performed separately as a preliminary step, and POS tagging errors will propagate to the parsing process.
Introduction
This problem is especially severe for languages where the POS tagging accuracy is relatively low, and this is the case for Chinese where there are fewer contextual clues that can be used to inform the tagging process and some of the tagging decisions are actually influenced by the syntactic structure of the sentence.
Introduction
First, we integrate POS tagging into the parsing process and jointly optimize these two processes simultaneously.
Joint POS Tagging and Parsing with Nonlocal Features
To address the drawbacks of the standard transition-based constituent parsing model (described in Section 1), we propose a model to jointly solve POS tagging and constituent parsing with nonlocal features.
Joint POS Tagging and Parsing with Nonlocal Features
3.1 Joint POS Tagging and Parsing
Joint POS Tagging and Parsing with Nonlocal Features
POS tagging is often taken as a preliminary step for transition-based constituent parsing, therefore the accuracy of POS tagging would greatly affect parsing performance.
Transition-based Constituent Parsing
Figure 1: Two constituent trees for an example sentence w0w1w2 with POS tags abc.
Transition-based Constituent Parsing
For example, in Figure 1, for the input sentence w0w1w2 and its POS tags abc, our parser can construct two parse trees using the action sequences given below these trees.
POS tags is mentioned in 37 sentences in this paper.
Topics mentioned in this paper:
Jiang, Wenbin and Huang, Liang and Liu, Qun
Abstract
We test the efficacy of this method in the context of Chinese word segmentation and part-of-speech tagging, where no segmentation and POS tagging standards are widely accepted due to the lack of morphology in Chinese.
Experiments
For example, currently, most Chinese constituency and dependency parsers are trained on some version of CTB, using its segmentation and POS tagging as the de facto standards.
Experiments
Therefore, we expect the knowledge adapted from PD will lead to a more precise CTB-style segmenter and POS tagger, which would in turn reduce the error propagation to parsing (and translation).
Introduction
Figure 1: Incompatible word segmentation and POS tagging standards between CTB (upper) and People's Daily (below).
Introduction
Our experiments show that adaptation from PD to CTB results in a significant improvement in segmentation and POS tagging , with error reductions of 30.2% and 14%, respectively.
Segmentation and Tagging as Character Classification
While in Joint S&T, each word is further annotated with a POS tag:
Segmentation and Tagging as Character Classification
where t_k (k = 1..m) denotes the POS tag for the word C_{e_{k-1}+1 : e_k}.
Segmentation and Tagging as Character Classification
In Ng and Low (2004), Joint S&T can also be treated as a character classification problem, where a boundary tag is combined with a POS tag in order to give the POS information of the word containing these characters.
POS tags is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Kruengkrai, Canasai and Uchimoto, Kiyotaka and Kazama, Jun'ichi and Wang, Yiou and Torisawa, Kentaro and Isahara, Hitoshi
Abstract
In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging .
Background
In joint word segmentation and the POS tagging process, the task is to predict a path
Background
p is its POS tag, and a “#” symbol denotes the number of elements in each variable.
Background
words found in the system’s word dictionary, have regular POS tags .
Introduction
Word segmentation and POS tagging results are required as inputs to other NLP tasks, such as phrase chunking, dependency parsing, and machine translation.
Introduction
Word segmentation and POS tagging in a joint process have received much attention in recent research and have shown improvements over a pipelined fashion (Ng and Low, 2004; Nakagawa and Uchimoto, 2007; Zhang and Clark, 2008; Jiang et al., 2008a; Jiang et al., 2008b).
Introduction
In joint word segmentation and the POS tagging process, one serious problem is caused by unknown words, which are defined as words that are not found in a training corpus or in a sys-
Policies for correct path selection
We can directly estimate the statistics of known words from an annotated corpus where a sentence is already segmented into words and assigned POS tags .
Policies for correct path selection
3We consider a word and its POS tag a single entry.
POS tags is mentioned in 26 sentences in this paper.
Topics mentioned in this paper:
Zhao, Qiuye and Marcus, Mitch
Abstract
We show for both English POS tagging and Chinese word segmentation that, with proper representation, a large number of deterministic constraints can be learned from training examples, and these are useful in constraining probabilistic inference.
Abstract
“assign label t to word w” for POS tagging.
Abstract
In this work, we explore deterministic constraints for two fundamental NLP problems, English POS tagging and Chinese word segmentation.
POS tags is mentioned in 31 sentences in this paper.
Topics mentioned in this paper:
Shen, Mo and Liu, Hongxiao and Kawahara, Daisuke and Kurohashi, Sadao
Abstract
We propose the first tagset designed for the task of character-level POS tagging .
Abstract
We propose a method that performs character-level POS tagging jointly with word segmentation and word-level POS tagging .
Character-level POS Tagset
We propose a tagset for the task of character-level POS tagging .
Chinese Morphological Analysis with Character-level POS
Previous studies have shown that jointly processing word segmentation and POS tagging is preferable to pipeline processing, which can propagate errors (Nakagawa and Uchimoto, 2007; Kruengkrai et a1., 2009).
Chinese Morphological Analysis with Character-level POS
Baseline features: For word-level nodes that represent known words, we use the symbols w, p and l to denote the word form, POS tag and length of the word, respectively.
Chinese Morphological Analysis with Character-level POS
Proposed features: For word-level nodes, the function CP_pair(w) returns the pair of the character-level POS tags of the first and last characters of w, and CP_all(w) returns the sequence of character-level POS tags of w. If either the pair or the sequence of character-level POS is ambiguous, which means there are multiple paths in the sub-lattice of the word-level node, then the values on the current best path (with local context) during the Viterbi search will be returned.
Evaluation
To evaluate our proposed method, we have conducted two sets of experiments on CTB5: word segmentation, and joint word segmentation and word-level POS tagging .
Evaluation
The results of the word segmentation experiment and the joint experiment of segmentation and POS tagging are shown in Table 5(a) and Table 5(b), respectively.
Introduction
with Character-level POS Tagging
Introduction
We propose the first tagset designed for the task of character-level POS tagging , based on which we manually annotate the entire CTB5.
Introduction
We propose a method that performs character-level POS tagging jointly with word segmentation and word-level POS tagging .
POS tags is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Yao, Xuchen and Van Durme, Benjamin and Clark, Peter
Experiments
But only sentence boundaries, POS tags and NER labels were kept as the annotation of the corpus.
Introduction
IR can easily make use of this knowledge: for a when question, IR retrieves sentences with tokens labeled as DATE by NER, or POS tagged as CD.
Introduction
Moreover, our approach extends easily beyond fixed answer types such as named entities: we are already using POS tags as a demonstration.
Method
We let the trained QA system guide the query formulation when performing coupled retrieval with Indri (Strohman et al., 2005), given a corpus already annotated with POS tags and NER labels.
Method
Since NER and POS tags are not lexicalized they accumulate many more counts (i.e.
Method
NER Types First We found NER labels better indicators of expected answer types than POS tags .
POS tags is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Cheung, Jackie Chi Kit and Penn, Gerald
A Latent Variable Parser
The Berkeley parser has been applied to the TuBa-D/Z corpus in the constituent parsing shared task of the ACL-2008 Workshop on Parsing German (Petrov and Klein, 2008), achieving an F1-measure of 85.10% and 83.18% with and without gold-standard POS tags, respectively.
Experiments
As part of our experiment design, we investigated the effect of providing gold POS tags to the parser, and the effect of incorporating edge labels into the nonterminal labels for training and parsing.
Experiments
In all cases, gold annotations which include gold POS tags were used when training the parser.
Experiments
This table shows the results after five iterations of grammar modification, parameterized over whether we provide gold POS tags for parsing, and edge labels for training and parsing.
Introduction
the unlexicalized, latent variable-based Berkeley parser (Petrov et al., 2006). Without any language- or model-dependent adaptation, we achieve state-of-the-art results on the TuBa-D/Z corpus (Telljohann et al., 2004), with an F1-measure of 95.15% using gold POS tags.
Introduction
It is found that the three techniques perform about equally well, with F1 of 94.1% using POS tags from the TnT tagger, and 98.4% with gold tags.
POS tags is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Kaji, Nobuhiro and Fujiwara, Yasuhiro and Yoshinaga, Naoki and Kitsuregawa, Masaru
Abstract
Experiments on three tasks (POS tagging, joint POS tagging and chunking, and supertagging) show that the new algorithm is several orders of magnitude faster than the basic Viterbi and a state-of-the-art algorithm, CARPEDIEM (Esposito and Radicioni, 2009).
Introduction
Now they are indispensable in a wide range of NLP tasks including chunking, POS tagging , NER and so on (Sha and Pereira, 2003; Tsuruoka and Tsujii, 2005; Lin and Wu, 2009).
Introduction
For example, there are more than 40 and 2000 labels in POS tagging and supertagging, respectively (Brants, 2000; Matsuzaki et al., 2007).
Introduction
As we shall see later, we need over 300 labels to reduce joint POS tagging and chunking into the single sequence labeling problem.
POS tags is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Lei, Tao and Xin, Yu and Zhang, Yuan and Barzilay, Regina and Jaakkola, Tommi
Experimental Setup
These datasets include manually annotated dependency trees, POS tags and morphological information.
Experimental Setup
In contrast, assume we take the cross-product of the auxiliary word vector values, POS tags and lexical items of a word and its context, and add the crossed values into a normal model (in φ_hm).
Introduction
This low dimensional syntactic abstraction can be thought of as a proxy to manually constructed POS tags .
Introduction
For instance, on the English dataset, the low-rank model trained without POS tags achieves 90.49% on first-order parsing, while the baseline gets 86.70% if trained under the same conditions, and 90.58% if trained with 12 core POS tags .
Problem Formulation
pos, form, lemma and morph stand for the fine POS tag , word form, word lemma and the morphology feature (provided in CoNLL format file) of the current word.
Problem Formulation
For example, pos-p means the POS tag to the left of the current word in the sentence.
Problem Formulation
Other possible features include, for example, the label of the arc h → m, the POS tags between the head and the modifier, boolean flags which indicate the occurrence of in-between punctuations or conjunctions, etc.
Results
The rationale is that given all other features, the model would induce representations that play a similar role to POS tags .
Results
Table 4: The first three columns show parsing results when models are trained without POS tags .
Results
the performance of a parser trained with 12 Core POS tags .
POS tags is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting
Character-Level Dependency Tree
system, each word is initialized by the action SHW with a POS tag , before being incrementally modified by a sequence of intra-word actions, and finally being completed by the action PW.
Character-Level Dependency Tree
L and R denote the two elements over which the dependencies are built; the subscripts lc1 and rc1 denote the leftmost and rightmost children, respectively; the subscripts lc2 and rc2 denote the second leftmost and second rightmost children, respectively; w denotes the word; t denotes the POS tag; c denotes the head character; ls_w and rs_w denote the smallest left and right subwords, respectively, as shown in Figure 2.
Character-Level Dependency Tree
Since the first element of the queue can be shifted onto the stack by either SH or AR, it is more difficult to assign a POS tag to each word by using a single action.
POS tags is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Vadas, David and Curran, James R.
Conversion Process
Since we are applying these to CCGbank NP structures rather than the Penn Treebank, the POS tag-based heuristics are sufficient to determine heads accurately.
Conversion Process
Some POS tags require special behaviour.
Conversion Process
Accordingly, we do not alter tokens with POS tags of DT and PRP$. Instead, their sibling node is given the category N and their parent node is made the head.
Experiments
Table 3: Parsing results with gold-standard POS tags
Experiments
Table 4: Parsing results with automatic POS tags
Experiments
We have also experimented with using automatically assigned POS tags .
NER features
Many of these features generalise the head words and/or POS tags that are already part of the feature set.
NER features
There are already features in the model describing each combination of the children’s head words and POS tags , which we extend to include combinations with
POS tags is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Bhat, Suma and Xue, Huichao and Yoon, Su-Youn
Experimental Setup
The first stage, ASR, yields an automatic transcription, which is followed by the POS tagging stage.
Experimental Setup
The steps for automatic assessment of overall proficiency follow an analogous process (either including the POS tagger or not), depending on the objective measure being evaluated.
Experimental Setup
5.3.2 POS tagger
Related Work
The idea of capturing differences in POS tag distributions for classification has been explored in several previous studies.
Related Work
In the area of text-genre classification, POS tag distributions have been found to capture genre differences in text (Feldman et al., 2009; Marin et al., 2009); in a language testing context, it has been used in grammatical error detection and essay scoring (Chodorow and Leacock, 2000; Tetreault and Chodorow, 2008).
Shallow-analysis approach to measuring syntactic complexity
Consider the two sentence fragments below taken from actual responses (the bigrams of interest and their associated POS tags are boldfaced).
POS tags is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Ma, Xuezhe and Xia, Fei
Data and Tools
The set of POS tags needs to be consistent across languages and treebanks.
Data and Tools
For this reason we use the universal POS tag set of Petrov et al.
Data and Tools
POS tags are not available for parallel data in the Europarl and Kaist corpus, so we need to pro-
POS tags is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Koo, Terry and Collins, Michael
Parsing experiments
For example, fdep contains lexicalized “in-between” features that depend on the head and modifier words as well as a word lying in between the two; in contrast, previous work has generally defined in-between features for POS tags only.
Parsing experiments
As in previous work, English evaluation ignores any token whose gold-standard POS tag is one of { `` '' :
Parsing experiments
First, we define 4-gram features that characterize the four relevant indices using words and POS tags; examples include POS 4-grams and mixed 4-grams with one word and three POS tags .
Related work
These indices allow the use of arbitrary features predicated on the position of the grandparent (e. g., word identity, POS tag, contextual POS tags ) without affecting the asymptotic complexity of the parsing algorithm.
POS tags is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Zollmann, Andreas and Vogel, Stephan
Clustering phrase pairs directly using the K-means algorithm
Using a scheme based on source and target phrases with accounting for phrase size, with 36 word classes (the size of the Penn English POS tag set) for both languages, yields a grammar with (36 + 2 × 36²)² = 6.9m nonterminal labels.
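As a quick sanity check of the arithmetic quoted above, a minimal sketch; the grouping into 36 + 2·36² candidate labels per language side and the multiplicative combination of source and target sides is one reading of the quoted formula, not code from the paper:

```python
# Back-of-the-envelope check of the nonterminal-label count quoted above.
num_classes = 36                                 # size of the Penn English POS tag set
per_side = num_classes + 2 * num_classes ** 2    # 2628 candidate labels per language side
total_labels = per_side ** 2                     # 6,906,384, i.e. roughly 6.9m
print(per_side, total_labels)
```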
Conclusion and discussion
Crucially, our methods only rely on “shallow” lexical tags, either generated by POS taggers or by automatic clustering of words into classes.
Conclusion and discussion
Using automatically obtained word clusters instead of POS tags yields essentially the same results, thus making our methods applicable to all languages pairs with parallel corpora, whether syntactic resources are available for them or not.
Conclusion and discussion
On the other extreme, the clustering based approach labels phrases based on the contained words alone.8 The POS grammar represents an intermediate point on this spectrum, since POS tags can change based on surrounding words in the sentence; and the position of the K-means model depends on the influence of the phrase contexts on the clustering process.
Experiments
The source and target language parses for the syntax-augmented grammar, as well as the POS tags for our POS-based grammars were generated by the Stanford parser (Klein and Manning, 2003).
Experiments
Our approach, using target POS tags (‘POS-tgt (no phr.
Experiments
, 36 (the number of Penn treebank POS tags, used for the ‘POS’ models, is 36). For ‘Clust’, we see a comfortably wide plateau of nearly-identical scores from N = 7, ...
Hard rule labeling from word classes
We use the simple term ‘tag’ to stand for any kind of word-level analysis—a syntactic, statistical, or other means of grouping word types or tokens into classes, possibly based on their position and context in the sentence, POS tagging being the most obvious example.
Related work
(2007) improve the statistical phrase-based MT model by injecting supertags, lexical information such as the POS tag of the word and its subcategorization information, into the phrase table, resulting in generalized phrases with placeholders in them.
POS tags is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Garrette, Dan and Mielens, Jason and Baldridge, Jason
Approach
These targeted morphological features are effective during LP because words that share them are much more likely to actually share POS tags .
Approach
Since the LP graph contains a node for each corpus token, and each node is labeled with a distribution over POS tags , the graph provides a corpus of sentences labeled with noisy tag distributions along with an expanded tag dictionary.
Data
tokenized and labeled with POS tags by two linguistics graduate students, each of whom was studying one of the languages.
Data
The KIN and MLG data have 12 and 23 distinct POS tags , respectively.
Data
The PTB uses 45 distinct POS tags .
Experiments3
Moreover, since large gains in accuracy can be achieved by spending a small amount of time just annotating word types with POS tags , we are led to conclude that time should be spent annotating types or tokens instead of developing an FST.
Introduction
Haghighi and Klein (2006) develop a model in which a POS-tagger is learned from a list of POS tags and just three “prototype” word types for each tag, but their approach requires a vector space to compute the distributional similarity between prototypes and other word types in the corpus.
POS tags is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Tsuruoka, Yoshimasa and Tsujii, Jun'ichi and Ananiadou, Sophia
Introduction
We evaluate the effectiveness of our method by using linear-chain conditional random fields (CRFs) and three traditional NLP tasks, namely, text chunking (shallow parsing), named entity recognition, and POS tagging .
Log-Linear Models
The model is used for a variety of sequence labeling tasks such as POS tagging , chunking, and named entity recognition.
Log-Linear Models
We evaluate the effectiveness our training algorithm using linear-chain CRF models and three NLP tasks: text chunking, named entity recognition, and POS tagging .
Log-Linear Models
The features used in this experiment were unigrams and bigrams of neighboring words, and unigrams, bigrams and trigrams of neighboring POS tags .
POS tags is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Abend, Omri and Cohen, Shay B. and Steedman, Mark
Experimental Setup
Relations were extracted using regular expressions over the output of a POS tagger and an NP chunker.
Experimental Setup
We use a Maximum Entropy POS Tagger , trained on the Penn Treebank, and the WordNet lemmatizer, both implemented within the NLTK package (Loper and Bird, 2002).
Experimental Setup
To obtain a coarse-grained set of POS tags , we collapse the tag set to 7 categories: nouns, verbs, adjectives, adverbs, prepositions, the word “to” and a category that includes all other words.
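A minimal sketch of such a 7-way collapse of Penn Treebank tags; the exact tag-to-class assignments below are illustrative assumptions, not the mapping used in the paper:

```python
def coarse_tag(ptb_tag: str, word: str) -> str:
    """Collapse a fine-grained PTB tag into one of 7 coarse categories."""
    if word.lower() == "to":
        return "TO"                     # the word "to" gets its own category
    if ptb_tag.startswith("NN"):
        return "NOUN"
    if ptb_tag.startswith("VB"):
        return "VERB"
    if ptb_tag.startswith("JJ"):
        return "ADJ"
    if ptb_tag.startswith("RB"):
        return "ADV"
    if ptb_tag == "IN":
        return "PREP"                   # IN also covers subordinating conjunctions
    return "OTHER"                      # everything else

print(coarse_tag("NNS", "dogs"), coarse_tag("TO", "to"))   # NOUN TO
```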
Our Proposal: A Latent LC Approach
1We use a POS tagger to identify content words.
Our Proposal: A Latent LC Approach
In addition, we use POS-based features that encode the most frequent POS tag for the word lemma and the second most frequent POS tag (according to R).
Our Proposal: A Latent LC Approach
Information about the second most frequent POS tag can be important in identifying light verb constructions, such as “take a swim” or “give a smile”, where the object is derived from a verb.
POS tags is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Huang, Minlie and Ye, Borui and Wang, Yichen and Chen, Haiqiang and Cheng, Junjun and Zhu, Xiaoyan
Abstract
The method is almost free of linguistic resources (except POS tags ), and requires no elaborated linguistic rules.
Conclusion
almost knowledge-free (except POS tags ) framework.
Conclusion
The method is almost free of linguistic resources (except POS tags ), and does not rely on elaborated linguistic rules.
Introduction
This framework is fully unsupervised and purely data-driven, and requires very lightweight linguistic resources (i.e., only POS tags ).
Methodology
In order to obtain lexical patterns, we can define regular expressions with POS tags 2 and apply the regular expressions on POS tagged texts.
Methodology
2Such expressions are very simple and easy to write because we only need to consider POS tags of adverbial and auxiliary word.
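A minimal sketch of matching such a lexical pattern over POS-tagged text, assuming tokens are rendered as word/POS pairs; the tag set, the example sentence, and the specific pattern (an adverb-tagged token followed by a candidate word) are hypothetical illustrations, not the paper's actual rules:

```python
import re

tagged = "这个/r 手机/n 很/d 给力/n 的/u"          # hypothetical POS-tagged snippet
# pattern: an adverb-tagged token (/d) immediately followed by a candidate token (/n)
pattern = re.compile(r"(\S+)/d\s+(\S+)/n\b")

for adverb, candidate in pattern.findall(tagged):
    print(adverb, "->", candidate)                # 很 -> 给力
```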
Methodology
Our algorithm is in spirit to double propagation (Qiu et al., 2011), however, the differences are apparent in that: firstly, we use very lightweight linguistic information (except POS tags ); secondly, our major contributions are to propose statistical measures to address the following key issues: first, to measure the utility of lexical patterns; second, to measure the possibility of a candidate word being a new word.
POS tags is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Zhao, Hai and Song, Yan and Kit, Chunyu and Zhou, Guodong
Dependency Parsing: Baseline
pos POS tag of word
Dependency Parsing: Baseline
cpos1 coarse POS: the first letter of POS tag of word
Dependency Parsing: Baseline
cpos2 coarse POS: the first two letters of POS tag of word
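Read this way, the coarse-POS features are simple prefixes of the fine-grained tag; a minimal sketch (the feature names follow the rows above, the dict layout is illustrative):

```python
def cpos_features(pos: str) -> dict:
    return {
        "pos": pos,         # full POS tag of the word
        "cpos1": pos[:1],   # coarse POS: first letter of the POS tag
        "cpos2": pos[:2],   # coarse POS: first two letters of the POS tag
    }

print(cpos_features("VBD"))   # {'pos': 'VBD', 'cpos1': 'V', 'cpos2': 'VB'}
```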
Exploiting the Translated Treebank
Chinese word should be strictly segmented according to the guideline before POS tags and dependency relations are annotated.
Exploiting the Translated Treebank
The difference is, rootscore counts for the given POS tag occurring as ROOT, and pairscore counts for two POS tag combination occurring for a dependent relationship.
Treebank Translation and Dependency Transformation
Bind POS tag and dependency relation of a word with itself; 2.
Treebank Translation and Dependency Transformation
After the target sentence is generated, the attached POS tags and dependency information of each English word will also be transferred to each corresponding Chinese word.
POS tags is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Dickinson, Markus
Ad hoc rule detection
Units of comparison To determine similarity, one can compare dependency relations, POS tags , or both.
Ad hoc rule detection
Thus, we use the pairs of dependency relations and POS tags as the units of comparison.
Additional information
One method which does not have this problem of overflagging uses a “lexicon” of POS tag pairs, examining relations between POS, irrespective of position.
Evaluation
We use the gold standard POS tags for all experiments.
Evaluation
For example, the parsed rule TA —> IG:IG RO has a correct dependency relation (IG) between the POS tags IG and its head RO, yet is assigned a whole rule score of 2 and a bigram score of 20.
Evaluation
This is likely due to the fact that Alpino has the smallest label set of any of the corpora, with only 24 dependency labels and 12 POS tags (cf.
POS tags is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Plank, Barbara and Hovy, Dirk and Sogaard, Anders
Annotator disagreements across domains and languages
In this study, we had between 2-10 individual annotators with degrees in linguistics annotate different kinds of English text with POS tags , e.g., newswire text (PTB WSJ Section 00), transcripts of spoken language (from a database containing transcripts of conversations, Talkbankl), as well as Twitter posts.
Annotator disagreements across domains and languages
We instructed annotators to use the 12 universal POS tags of Petrov et al.
Annotator disagreements across domains and languages
Experiments with variation n-grams on WSJ (Dickinson and Meurers, 2003) and the French data lead us to estimate that the fine-to-coarse mapping of POS tags disregards about 20% of observed tag-pair confusion types, most of which relate to fine-grained verb and noun distinctions, e.g. past participle versus past in “[..] criminal lawyers speculated/VBD vs. VBN that [..]”.
Related work
(2014) use small samples of doubly-annotated POS data to estimate annotator reliability and show how those metrics can be implemented in the loss function when inducing POS taggers to reflect confidence we can put in annotations.
Related work
They show that not biasing the theory towards a single annotator but using a cost-sensitive learning scheme makes POS taggers more robust and more applicable for downstream tasks.
POS tags is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Wu, Fei and Weld, Daniel S.
Conclusion
WOE can run in two modes: a CRF extractor (WOEPOS) trained with shallow features like POS tags ; a pattern classfier (WOEparse) learned from dependency path patterns.
Related Work
Shallow or Deep Parsing: Shallow features, like POS tags , enable fast extraction over large-scale corpora (Davidov et al., 2007; Banko et al., 2007).
Wikipedia-based Open IE
NLP Annotation: As we discuss fully in Section 4 (Experiments), we consider several variations of our system; one version, WOEparse, uses parser-based features, while another, WOEPOS , uses shallow features like POS tags , which may be more quickly computed.
Wikipedia-based Open IE
Depending on which version is being trained, the preprocessor uses OpenNLP to supply POS tags and NP—chunk annotations — or uses the Stanford Parser to create a dependency parse.
Wikipedia-based Open IE
We learn two kinds of extractors, one (WOEparse) using features from dependency-parse trees and the other (WOEPOS) limited to shallow features like POS tags .
POS tags is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Sogaard, Anders
Conclusion
Our approach was superior to previous approaches across 12 multilingual cross-domain POS tagging datasets, with an average error reduction of 4% over a structured perceptron baseline.
Experiments
POS tagging accuracy is known to be very sensitive to domain shifts.
Experiments
(2011) report a POS tagging accuracy on social media data of 84% using a tagger that achieves an accuracy of about 97% on newspaper data.
Experiments
While POS taggers can often recover the part of speech of a previously unseen word from the context it occurs in, this is harder than for previously seen words.
Introduction
This paper considers the POS tagging problem, i.e.
Introduction
Several authors have noted how POS tagging performance is sensitive to cross-domain shifts (Blitzer et al., 2006; Daume III, 2007; Jiang and Zhai, 2007), and while most authors have assumed known target distributions and pool unlabeled target data in order to automatically correct cross-domain bias (Jiang and Zhai, 2007; Foster et al., 2010), methods such as feature bagging (Sutton et al., 2006), learning with random adversaries (Globerson and Roweis, 2006) and LOO-regularization (Dekel and Shamir, 2008) have been proposed to improve performance on unknown target distributions.
Introduction
Section 4 presents experiments on POS tagging and discusses how to evaluate cross-domain performance.
POS tags is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Blanco, Eduardo and Moldovan, Dan
Learning Algorithm
Features (1—5) are extracted for each role and capture their presence, first POS tag and word, length and position within the roles present for that instance.
Learning Algorithm
A1-postag is extracted for the following POS tags: DT, JJ, PRP, CD, RB, VB and WP; A1-keyword for the following words: any, anybody, anymore, anyone, anything, anytime, anywhere, certain, enough, full, many, much, other, some, specifics, too and until.
Learning Algorithm
These lists of POS tags and keywords were extracted after manual examination of training examples and aim at signaling whether this role corresponds to the focus.
POS tags is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Adler, Meni and Goldberg, Yoav and Gabay, David and Elhadad, Michael
Evaluation
Table 5 shows the result of the disambiguation when we only take into account the POS tag of the unknown tokens.
Introduction
On the one hand, this tagset is much larger than the largest tagset used in English (from 17 tags in most unsupervised POS tagging experiments, to the 46 tags of the WSJ corpus and the about 150 tags of the LOB corpus).
Introduction
On average, each token in the 42M corpus is given 2.7 possible analyses by the analyzer (much higher than the average 1.41 POS tag ambiguity reported in English (Dermatas and Kokkinakis, 1995)).
Previous Work
At the word level, a segmented word is attached to a POS, where the character model is based on the observed characters and their classification: Begin of word, In the middle of a word, End of word, or the character is a word by itself (S). They apply Baum-Welch training over a segmented corpus, where the segmentation of each word and its character classification is observed, and the POS tagging is ambiguous.
Previous Work
(of all words in a given sentence) and the POS tagging (of the known words) is based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters.
Previous Work
They report a very slight improvement on Hebrew and Arabic supervised POS taggers .
POS tags is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Liang, Percy and Jordan, Michael and Klein, Dan
Experiments
predicate x in w (e.g., (Boston, Boston)), and (iii) predicates for each POS tag in {JJ, NN, NNS} (e.g., (JJ, size), (JJ, area), etc.
Experiments
We also define an augmented lexicon L+ which includes a prototype word x for each predicate appearing in (iii) above (e.g., (large, size)), which cancels the predicates triggered by x's POS tag.
Experiments
SEMRESP requires a lexicon of 1.42 words per non-value predicate, Word-Net features, and syntactic parse trees; DCS requires only words for the domain-independent predicates (overall, around 0.5 words per non-value predicate), POS tags , and very simple indicator features.
POS tags is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Niu, Zheng-Yu and Wang, Haifeng and Wu, Hua
Experiments of Grammar Formalism Conversion
(2008) used POS tag information, dependency structures and dependency tags in test set for conversion.
Experiments of Grammar Formalism Conversion
Similarly, we used POS tag information in the test set to restrict search space of the parser for generation of better N-best parses.
Experiments of Parsing
CDT consists of 60k Chinese sentences, annotated with POS tag information and dependency structure information (including 28 POS tags and 24 dependency tags) (Liu et al., 2006).
Experiments of Parsing
We did not use POS tag information as inputs to the parser in our conversion method due to the difficulty of conversion from CDT POS tags to CTB POS tags .
Experiments of Parsing
We used the POS tagged People Daily corpus (Jan. 1998~Jun.
Our Two-Step Solution
” (a preposition, with “BA” as its POS tag in CTB), and the head of IP-OBJ is 3% [El ” .
POS tags is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Darwish, Kareem
Introduction
- Part-of-speech (POS) tags and morphological features: POS tags indicate (or counter-indicate) the possible presence of a named entity at word level or at word sequence level.
Related Work
Benajiba and Rosso (2007) improved their system by incorporating POS tags to improve NE boundary detection.
Related Work
Benajiba and Rosso (2008) used CRF sequence labeling and incorporated many language specific features, namely POS tagging , base-phrase chunking, Arabic tokenization, and adjectives indicating nationality.
Related Work
Using POS tagging generally improved recall at the expense of precision, leading to overall improvements in F-measure.
POS tags is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei and Yates, Alexander
Introduction
For the chunker and POS tagger , the drop-offs are less severe: 94.89 to 91.73, and 97.36 to 94.73.
Introduction
We use an open source CRF software package to implement our CRF models.1 We use words, POS tags , chunk labels, and the predicate label at the preceding and following nodes as features for our Baseline system.
Introduction
POS before, after predicate: the POS tag of the tokens immediately preceding and following the predicate
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Nagata, Ryo and Whittaker, Edward
Experiments
Performance of POS tagging is an important factor in our methods because they are based on word/POS sequences.
Experiments
Existing POS taggers might not perform well on nonnative English texts because they are normally developed to analyze native English texts.
Methods
In this language model, content words in n-grams are replaced with their corresponding POS tags .
Methods
Finally, words are replaced with their corresponding POS tags; for the following words, word tokens are used as their corresponding POS tags : coordinating conjunctions, determiners, prepositions, modals, predeterminers, possessives, pronouns, question adverbs.
Methods
At this point, the special POS tags BOS and EOS are added at the beginning and end of each sentence, respectively.
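A rough sketch of the sequence construction these excerpts describe: content words are replaced by their POS tags, tokens from the listed function-word classes are kept as word tokens, and BOS/EOS markers are added. The concrete set of kept tags below is an assumption approximating the classes named above, not the paper's exact list:

```python
KEEP_TAGS = {"CC", "DT", "IN", "MD", "PDT", "POS", "PRP", "PRP$", "WRB"}

def to_pos_sequence(tagged_sentence):
    seq = ["BOS"]
    for word, tag in tagged_sentence:
        # keep function words as tokens, replace content words by their POS tag
        seq.append(word.lower() if tag in KEEP_TAGS else tag)
    seq.append("EOS")
    return seq

print(to_pos_sequence([("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]))
# ['BOS', 'the', 'NN', 'VBZ', 'EOS']
```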
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Galley, Michel and Manning, Christopher D.
Dependency parsing experiments
The only features that are not cached are the ones that include contextual POS tags , since their miss rate is relatively high.
Dependency parsing for machine translation
a predicted POS tag t_j; a dependency score s_j.
Dependency parsing for machine translation
We write h-word, h-pos, m-word, m-pos to refer to head and modifier words and POS tags , and append a numerical value to shift the word offset either to the left or to the right (e.g., h-pos+1 is the POS to the right of the head word).
Dependency parsing for machine translation
It is quite similar to the McDonald (2005a) feature set, except that it does not include the set of all POS tags that appear between each candidate head-modifier pair (i , j).
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Kozhevnikov, Mikhail and Titov, Ivan
Evaluation
The part-of-speech tags in all datasets were replaced with the universal POS tags of Petrov et al.
Model Transfer
This may have a negative effect on the performance of a monolingual model, since most part-of-speech tagsets are more fine-grained than the universal POS tags considered here.
Model Transfer
Since the finer-grained POS tags often reflect more language-specific phenomena, however, they would only be useful for very closely related languages in the cross-lingual setting.
Model Transfer
If Synt is enabled too, it also uses the POS tags of the argument’s parent, children and siblings.
Related Work
Cross-lingual annotation projection (Yarowsky et al., 2001) approaches have been applied extensively to a variety of tasks, including POS tagging (Xi and Hwa, 2005; Das and Petrov, 2011), morphology segmentation (Snyder and Barzilay, 2008), verb classification (Merlo et al., 2002), mention detection (Zitouni and Florian, 2008), LFG parsing (Wroblewska and Frank, 2009), information extraction (Kim et al., 2010), SRL (Pado and Lapata, 2009; van der Plas et al., 2011; Annesi and Basili, 2010; Tonelli and Pi-anta, 2008), dependency parsing (Naseem et al., 2012; Ganchev et al., 2009; Smith and Eisner, 2009; Hwa et al., 2005) or temporal relation pre-
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Setiawan, Hendra and Zhou, Bowen and Xiang, Bing and Shen, Libin
Training
2. surrounding: lslex (the previous word), rslex (the next word), lspos (lslex's POS tag), rspos (rslex's POS tag), lsparent (lslex's parent), rsparent
Training
3. nonlocal: lanchorslex (the previous anchor's word), ranchorslex (the next anchor's word), lanchorspos (lanchorslex's POS tag), ranchorspos (ranchorslex's POS tag).
Training
mosl_int_spos (mosl_int_slex's POS tag), mosl_ext_spos (mosl_ext_slex's POS tag), mosr_int_slex (the actual word.
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael
Abstract
Many would be better modeled by POS tag unigrams (with no word information) or by longer n-grams consisting of either words, POS tags , or a combination of the two.
Abstract
Each n-gram is a sequence of words, POS tags or a combination of words and POS tags
Abstract
or a POS tag .
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ganchev, Kuzman and Gillenwater, Jennifer and Taskar, Ben
Experiments
We used the Tokyo tagger (Tsuruoka and Tsujii, 2005) to POS tag the English tokens, and generated parses using the first-order model of McDonald et al.
Experiments
For Bulgarian we trained the Stanford POS tagger (Toutanova et al., 2003) on the Bul-
Experiments
The Spanish Europarl data was POS tagged with the FreeLing language analyzer (Atserias et al., 2006).
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Choi, Jinho D. and McCallum, Andrew
Experiments
Moreover, all POS tag features from English are duplicated with coarse-grained POS tags provided by CoNLL-X.
Experiments
Before parsing, POS tags were assigned to the training set by using 20-way jackknifing.
Experiments
For the automatic generation of POS tags , we used the domain-specific model of Choi and Palmer (2012a)’s tagger, which gave 97.5% accuracy on the English evaluation set (0.2% higher than Collins (2002)’s tagger).
Related work
Bohnet and Nivre (2012) introduced a transition-based system that jointly performed POS tagging and dependency parsing.
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Experiments
Table 2: Domain Adaptation performance in F-measure on Semantic Tagging on the Movie target domain and POS tagging on QBank (QuestionBank).
Related Work and Motivation
In (Subramanya et al., 2010) an efficient iterative SSL method is described for syntactic tagging, using graph-based learning to smooth POS tag posteriors.
Semi-Supervised Semantic Labeling
In (Subramanya et al., 2010), a new SSL method is described for adapting syntactic POS tagging of sentences in newswire articles along with search queries to a target domain of natural language (NL) questions.
Semi-Supervised Semantic Labeling
The unlabeled POS tag posteriors are then smoothed using a graph-based learning algorithm.
Semi-Supervised Semantic Labeling
Later, using Viterbi decoding, they select the 1-best POS tag sequence.
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Mi, Haitao and Liu, Qun
Decoding
where the first two terms are translation and language model probabilities, e(o) is the target string (English sentence) for derivation o, the third and fourth items are the dependency language model probabilities on the target side computed with words and POS tags separately, D_e(o) is the target dependency tree of o, the fifth one is the parsing probability of the source side tree T_C(o) ∈ F_C, ill(o) is the penalty for the number of ill-formed dependency structures in o, and the last two terms are derivation and translation length penalties, respectively.
Decoding
In order to alleviate the problem of data sparseness, we also compute a dependency language model for POS tags over a dependency tree.
Decoding
the POS tag information on the target side for each constituency-to-dependency rule.
Experiments
We also store the POS tag information for each word in dependency trees, and compute two different dependency language models for words and POS tags in dependency tree separately.
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ponvert, Elias and Baldridge, Jason and Erk, Katrin
CD
CCM learns to predict a set of brackets over a string (in practice, a string of POS tags ) by jointly estimating constituent and distituent strings and contexts using an iterative EM-like procedure (though, as noted by Smith and Eisner (2004), CCM is deficient as a generative model).
Introduction
Recent work (Headden III et al., 2009; Cohen and Smith, 2009; Hanig, 2010; Spitkovsky et al., 2010) has largely built on the dependency model with valence of Klein and Manning (2004), and is characterized by its reliance on gold-standard part-of-speech (POS) annotations: the models are trained on and evaluated using sequences of POS tags rather than raw tokens.
Introduction
An exception which learns from raw text and makes no use of POS tags is the common cover links parser (CCL, Seginer 2007).
Tasks and Benchmark
Importantly, until recently it was the only unsupervised raw text constituent parser to produce results competitive with systems which use gold POS tags (Klein and Manning, 2002; Klein and Manning, 2004; Bod, 2006); the recent improved raw-text parsing results of Reichart and Rappoport (2010) make direct use of CCL without modification.
Tasks and Benchmark
Finally, CCL outperforms most published POS-based models when those models are trained on unsupervised word classes rather than gold POS tags .
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Chen, Wenliang and Kazama, Jun'ichi and Torisawa, Kentaro
Bilingual subtree constraints
For the source part, we replace nouns and verbs using their POS tags (coarse grained tags).
Bilingual subtree constraints
For example, we have the subtree pair: “(society):2-(fringe):0” and “fringes(W_2):0-of:1-society(W_1):2”, where “of” does not have a corresponding word, the POS tag of “(society)” is N, and the POS tag of “(fringe)” is N. The source part of the rule becomes “N:2-N:0” and the target part becomes “W_2:0-of:1-W_1:2”.
Experiments
For Chinese unannotated data, we used the XIN_CMN portion of Chinese Gigaword Version 2.0 (LDC2009T14) (Huang, 2009), which has approximately 311 million words whose segmentation and POS tags are given.
Experiments
We used the MMA system (Kruengkrai et al., 2009) trained on the training data to perform word segmentation and POS tagging and used the Baseline Parser to parse all the sentences in the data.
Experiments
The POS tags were assigned by the MXPOST tagger trained on training data.
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Chambers, Nathanael
Learning Time Constraints
n-gram POS: the 4-gram and 3-gram of POS tags that end with the year
Previous Work
Kanhabua and Norvag (2008; 2009) extended this approach with the same model, but expanded its unigrams with POS tags , collocations, and tf-idf scores.
Timestamp Classifiers
Word Classes: include only nouns, verbs, and adjectives as labeled by a POS tagger
Timestamp Classifiers
on POS tags and tf-idf scores.
Timestamp Classifiers
Typed Dependency POS: Similar to Typed Dependency, this feature uses POS tags of the dependency relation’s governor.
POS tags is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Zou, Bowei and Zhou, Guodong and Zhu, Qiaoming
Baselines
Following is a list of features adopted in the two baselines, for both BaselineC4.5 and BaselineSVM. > Basic features: first token and its part-of-speech (POS) tag of the focus candidate; the number of tokens in the focus candidate; relative position of the focus candidate among all the roles present in the sentence; negated verb and its POS tag of the negative expression;
Baselines
> Syntactic features: the sequence of words from the beginning of the governing VP to the negated verb; the sequence of POS tags from the beginning of the governing VP to the negated verb; whether the governing VP contains a CC; whether the governing VP contains a RB.
Baselines
> Semantic features: the syntactic label of semantic role A1; whether A1 contains POS tag DT, JJ, PRP, CD, RB, VB, and WP, as defined in Blanco and Moldovan (2011); whether A1 contains token any, anybody, anymore, anyone, anything, anytime, anywhere, certain, enough, full, many, much, other, some, specifics, too, and until, as defined in Blanco and Moldovan (2011); the syntactic label of the first semantic role in the sentence; the semantic label of the last semantic role in the sentence; the thematic role for A0/A1/A2/A3/A4 of the negated predicate.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhu, Muhua and Zhang, Yue and Chen, Wenliang and Zhang, Min and Zhu, Jingbo
Experiments
For both English and Chinese data, we used tenfold jackknifing (Collins, 2000) to automatically assign POS tags to the training data.
Experiments
For English POS tagging, we adopted SVMTool, 3 and for Chinese POS tagging
Experiments
we employed the Stanford POS tagger .
Semi-supervised Parsing with Large Data
Word clusters are regarded as lexical intermediaries for dependency parsing (Koo et al., 2008) and POS tagging (Sun and Uszkoreit, 2012).
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Berant, Jonathan and Liang, Percy
Paraphrasing
Deletions Deleted lemma and POS tag
Paraphrasing
x_{i:j} and c_{i':j'} denote spans from x and c. pos(x_{i:j}) and lemma(x_{i:j}) denote the POS tag and lemma sequence of x_{i:j}.
Paraphrasing
For a pair (x, c), we also consider as candidate associations the set B (represented implicitly), which contains token pairs (x_i, c_{i'}) such that x_i and c_{i'} share the same lemma, the same POS tag, or are linked through a derivation link on WordNet (Fellbaum, 1998).
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Candito, Marie and Constant, Matthieu
Use of external MWE resources
6We use the version available in the POS tagger MElt (Denis and Sagot, 2009).
Use of external MWE resources
The MWE analyzer is a CRF-based sequential labeler, which, given a tokenized text, jointly performs MWE segmentation and POS tagging (of simple tokens and of MWEs), both tasks mutually helping each other9.
Use of external MWE resources
The MWE analyzer integrates, among others, features computed from the external lexicons described in section 5.1, which greatly improve POS tagging (Denis and Sagot, 2009) and MWE segmentation (Constant and Tel-lier, 2012).
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Li, Zhenghua and Zhang, Min and Chen, Wenliang
Experiments and Analysis
We build a CRF-based bigram part-of-speech (POS) tagger with the features described in (Li et al., 2012), and produce POS tags for all train/development/test/unlabeled sets (10-way jackknifing for training sets).
Experiments and Analysis
(2012) and Bohnet and Nivre (2012) use joint models for POS tagging and dependency parsing, significantly outperforming their pipeline counterparts.
Experiments and Analysis
Our approach can be combined with their work to utilize unlabeled data to improve both POS tagging and parsing simultaneously.
Supervised Dependency Parsing
t_i denotes the POS tag of w_i. b is an index between h and m. dir(i, j) and dist(i, j) denote the direction and distance of the dependency (i, j).
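In that notation, first-order arc features are simple conjunctions of head/modifier words, POS tags, direction and bucketed distance; a minimal sketch with an illustrative (not the paper's full) template set:

```python
def arc_features(words, tags, h, m):
    """Features for a candidate dependency arc h -> m."""
    direction = "R" if m > h else "L"
    dist = min(abs(h - m), 5)                       # bucketed distance
    feats = [
        f"h_w={words[h]}|m_w={words[m]}|{direction}{dist}",
        f"h_t={tags[h]}|m_t={tags[m]}|{direction}{dist}",
        f"h_w={words[h]}|h_t={tags[h]}|m_t={tags[m]}",
    ]
    for b in range(min(h, m) + 1, max(h, m)):       # "in-between" POS features
        feats.append(f"h_t={tags[h]}|b_t={tags[b]}|m_t={tags[m]}")
    return feats

words = ["economic", "news", "had", "little", "effect"]
tags = ["JJ", "NN", "VBD", "JJ", "NN"]
print(arc_features(words, tags, h=2, m=4))
```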
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Hermann, Karl Moritz and Das, Dipanjan and Weston, Jason and Ganchev, Kuzman
Argument Identification
bag of words in a; bag of POS tags in a
Argument Identification
the set of dependency labels of the predicate's children; dependency path conjoined with the POS tag of a's head
Experiments
Before parsing the data, it is tagged with a POS tagger trained with a conditional random field (Lafferty et al., 2001) with the following emission features: word, the word cluster, word suffixes of length 1, 2 and 3, capitalization, whether it has a hyphen, digit and punctuation.
Frame Identification with Embeddings
Let the lexical unit (the lemma conjoined with a coarse POS tag ) for the marked predicate be 6.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef
Experiments
Examining translation rules extracted from the training data shows that there are 72,366 types of non-terminals with respect to 33 types of POS tags .
Head-Driven HPB Translation Model
Instead of collapsing all non-terminals in the source language into a single symbol X as in Chiang (2007), given a word sequence f_i^j from position i to position j, we first find heads and then concatenate the POS tags of these heads as f_i^j's nonterminal symbol.
Head-Driven HPB Translation Model
We look for initial phrase pairs that contain other phrases and then replace sub-phrases with POS tags corresponding to their heads.
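A minimal sketch of labeling a source span by the POS tags of its head words, under the assumption that a head of the span is any token whose dependency parent falls outside the span; the head-finding convention and the '+' join are illustrative, not the paper's exact definition:

```python
def span_label(span, parents, tags):
    """Nonterminal label for span (i, j): concatenated POS tags of the span heads."""
    i, j = span
    head_tags = [tags[k] for k in range(i, j + 1)
                 if not (i <= parents[k] <= j)]     # parent outside the span => head
    return "+".join(head_tags) if head_tags else "X"

# toy example: "the red car", where "car" is headed from outside the span
print(span_label((0, 2), parents=[2, 2, 3], tags=["DT", "JJ", "NN"]))   # NN
```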
Introduction
Here, each Chinese word is attached with its POS tag and Pinyin.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Kummerfeld, Jonathan K. and Roesner, Jessika and Dawborn, Tim and Haggerty, James and Curran, James R. and Clark, Stephen
Background
The C&C supertagger is similar to the Ratnaparkhi (1996) tagger, using features based on words and POS tags in a five-word window surrounding the target word, and defining a local probability distribution over supertags for each word in the sentence, given the previous two supertags.
Data
For supertagger evaluation, one thousand sentences were manually annotated with CCG lexical categories and POS tags .
Introduction
Since the CCG lexical category set used by the supertagger is much larger than the Penn Treebank POS tag set, the accuracy of supertagging is much lower than POS tagging ; hence the CCG supertagger assigns multiple supertags1 to a word, when the local context does not provide enough information to decide on the correct supertag.
Introduction
(2003) were unable to improve the accuracy of POS tagging using self-training.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Jiang, Wenbin and Liu, Qun
Experiments
For English, we use the automatically-assigned POS tags produced by an implementation of the POS tagger of Collins (2002).
Experiments
While for Chinese, we just use the gold-standard POS tags following the tradition.
Experiments
Both English and Chinese sentences are tagged by the implementations of the POS tagger of Collins (2002), which were trained on WSJ and CTB 5.0 respectively.
Word-Pair Classification Model
Each feature is composed of some words and POS tags surrounding word i and/or word j, as well as an optional distance representation between these two words.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ravi, Sujith and Baldridge, Jason and Knight, Kevin
Introduction
The most ambiguous word has 7 different POS tags associated with it.
Minimized models for supertagging
We also wish to scale our methods to larger data settings than the 24k word tokens in the test data used in the POS tagging task.
Minimized models for supertagging
On the simpler task of unsupervised POS tagging with a dictionary, we compared our method versus directly solving IP-original and found that the minimization (in terms of grammar size) achieved by our method is close to the optimal solution for the original objective and yields the same tagging accuracy far more efficiently.
Minimized models for supertagging
Ravi and Knight (2009) exploited this to iteratively improve their POS tag model: since the first minimization procedure is seeded with a noisy grammar and tag dictionary, iterating the IP procedure with progressively better grammars further improves the model.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Connor, Michael and Gertner, Yael and Fisher, Cynthia and Roth, Dan
Testing SRL Performance
When trained on arguments identified via the unsupervised POS tagger , noun pattern features promoted agent interpretations of tran-
Unsupervised Parsing
To implement this division into function and content words, we start with a list of function word POS tags and then find words that appear predominantly with these POS tags, using tagged WSJ data (Marcus et al., 1993).
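A rough sketch of that splitting heuristic: count, for each word type in a tagged corpus, how often it carries a function-word POS tag and call it a function word if that fraction is high. The tag list and the 0.5 threshold below are illustrative assumptions:

```python
from collections import Counter, defaultdict

FUNCTION_TAGS = {"DT", "IN", "CC", "TO", "MD", "PRP", "PRP$", "WDT", "RP", "EX"}

def find_function_words(tagged_corpus, threshold=0.5):
    counts = defaultdict(Counter)                 # word -> Counter of its POS tags
    for sentence in tagged_corpus:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    function_words = set()
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        func = sum(c for t, c in tag_counts.items() if t in FUNCTION_TAGS)
        if func / total > threshold:
            function_words.add(word)
    return function_words

corpus = [[("The", "DT"), ("dog", "NN")], [("the", "DT"), ("walk", "VB")]]
print(find_function_words(corpus))   # {'the'}
```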
Unsupervised Parsing
Smaller numbers are better, indicating less information lost in moving from the HMM states to the gold POS tags .
Unsupervised Parsing
We first evaluate these parsers (the first stage of our SRL system) on unsupervised POS tagging .
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Bollegala, Danushka and Weir, David and Carroll, John
A Motivating Example
POS tags Excellent/JJ and/CC broad/JJ
Sentiment Sensitive Thesaurus
We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs).
Sentiment Sensitive Thesaurus
In addition to word-level sentiment features, we replace words with their POS tags to create
Sentiment Sensitive Thesaurus
POS tags generalize the word-level sentiment features, thereby reducing feature sparseness.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Tan, Ming and Zhou, Wenli and Zheng, Lei and Wang, Shaojun
Composite language model
The SLM is based on statistical parsing techniques that allow syntactic analysis of sentences; it assigns a probability p(W, T) to every sentence W and every possible binary parse T. The terminals of T are the words of W with POS tags, and the nodes of T are annotated with phrase headwords and nonterminal labels.
Composite language model
A word-parse k-prefix has a set of exposed heads h_m, - - - , h_1, with each head being a pair (headword, nonterminal label), or in the case of a root-only tree (word, POS tag ).
Composite language model
An m-th order SLM (m-SLM) has three operators to generate a sentence: WORD-PREDICTOR predicts the next word w_{k+1} based on the m leftmost exposed headwords h_{-m}, ..., h_{-1} in the word-parse k-prefix with probability p(w_{k+1} | h_{-m}, ..., h_{-1}), and then passes control to the TAGGER; the TAGGER predicts the POS tag t_{k+1} of the next word w_{k+1} based on the next word w_{k+1} and the POS tags of the m leftmost exposed headwords in the word-parse k-prefix with probability p(t_{k+1} | w_{k+1}, h_{-m}.tag, ..., h_{-1}.tag); the CONSTRUCTOR builds the partial parse T_k from T_{k-1}, w_k, and t_k in a series of moves ending with NULL, where a parse move a is made with probability p(a | h_{-m}, ..., h_{-1}); a ∈ A = {(unary, NTlabel), (adjoin-left, NTlabel), (adjoin-right, NTlabel), null}.
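Read as a generative story, the three operators imply a factorization of the joint sentence/parse probability roughly of the following form; this is a sketch based only on the operator probabilities quoted above (indexing and end-of-sentence details vary across SLM formulations):

```latex
P(W,T) \;=\; \prod_{k}
  \Big[\, p\big(w_k \mid h_{-m},\ldots,h_{-1}\big)\,
          p\big(t_k \mid w_k,\, h_{-m}.\mathrm{tag},\ldots,h_{-1}.\mathrm{tag}\big)
          \prod_{i=1}^{N_k} p\big(a_i^{k} \mid h_{-m},\ldots,h_{-1}\big) \Big]
```

where the exposed heads h_{-m}, ..., h_{-1} are updated after every CONSTRUCTOR move and N_k is the number of moves taken after word k.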
Training algorithm
The TAGGER and CONSTRUCTOR are conditional probabilistic models of the type p(u | z_1, ..., z_n) where u, z_1, ..., z_n belong to a mixed set of words, POS tags, NTtags, CONSTRUCTOR actions (u only), and z_1, ..., z_n form a linear Markov chain.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ravi, Sujith and Knight, Kevin
Abstract
We describe a novel method for the task of unsupervised POS tagging with a dictionary, one that uses integer programming to explicitly search for the smallest model that explains the data, and then uses EM to set parameter values.
Introduction
The classic Expectation Maximization (EM) algorithm has been shown to perform poorly on POS tagging , when compared to other techniques, such as Bayesian methods.
Introduction
(2008) depart from the Bayesian framework and show how EM can be used to learn good POS taggers for Hebrew and English, when provided with good initial conditions.
What goes wrong with EM?
The overall POS tag distribution learnt by EM is relatively uniform, as noted by Johnson (2007), and it tends to assign an equal number of tokens to each
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Liu, Chang and Ng, Hwee Tou
Discussion and Future Work
We can then award partial scores for related words, such as those identified as such by WordNet or those with the same POS tags .
Experiments
However, its use of POS tags and synonym dictionaries prevents its use at the character-level.
Experiments
We use the Stanford Chinese word segmenter (Tseng et al., 2005) and POS tagger (Toutanova et al., 2003) for preprocessing and Cilin for synonym
Introduction
However, many different segmentation standards eXist for different purposes, such as Microsoft Research Asia (MSRA) for Named Entity Recognition (NER), Chinese Treebank (CTB) for parsing and part-of-speech (POS) tagging, and City University of Hong Kong (CITYU) and Academia Sinica (AS) for general word segmentation and POS tagging .
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Jeon, Je Hun and Liu, Yang
Co-training strategy for prosodic event detection
As described in Section 4, we use two classifiers for the prosodic event detection task based on two different information sources: one is the acoustic evidence extracted from the speech signal of an utterance; the other is the lexical and syntactic evidence such as syllables, words, POS tags and phrasal boundary information.
Previous work
(2007) applied co-training method in POS tagging using agreement-based selection strategy.
Prosodic event detection method
Accent detection: syllable identity, lexical stress (exist or not), word boundary information (boundary or not), and POS tag.
Prosodic event detection method
IPB and Break index detection: POS tag, the ratio of syntactic phrases the word initiates, and the ratio of syntactic phrases the word terminates.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Hagiwara, Masato and Sekine, Satoshi
Experiments
4Since the dictionary is not explicitly annotated with PoS tags, we firstly took the intersection of the training corpus and the dictionary words, and assigned all the possible PoS tags to the words which appeared in the corpus.
Experiments
Proper noun performance for the Stanford segmenter is not shown since it does not assign PoS tags .
Word Segmentation Model
Here, w_i and w_{i-1} denote the current and previous word in question, and t_i and t_{i-1} are the level-j PoS tags assigned to them.
Word Segmentation Model
1The Japanese dictionary and the corpus we used have 6 levels of PoS tag hierarchy, while the Chinese ones have only one level, which is why some of the PoS features are not included in Chinese.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Socher, Richard and Bauer, John and Manning, Christopher D. and Andrew Y., Ng
Introduction
3.1 and the POS tags come from a PCFG.
Introduction
The standard RNN essentially ignores all POS tags and syntactic categories and each nonterminal node is associated with the same neural network (i.e., the weights across nodes are fully tied).
Introduction
While this results in a powerful composition function that essentially depends on the words being combined, the number of model parameters explodes and the composition functions do not capture the syntactic commonalities between similar POS tags or syntactic categories.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Das, Dipanjan and Smith, Noah A.
Experimental Evaluation
model is approximate, because we used different preprocessing tools: MX-POST for POS tagging (Ratnaparkhi, 1996), MSTParser for parsing (McDonald et al., 2005), and Dan Bikel’s interface (http: //WWW .
QG for Paraphrase Modeling
For unobserved cases, the conditional probability is estimated by backing off to the parent POS tag and child direction.
QG for Paraphrase Modeling
We estimate the distributions over dependency labels, POS tags , and named entity classes using the transformed treebank (footnote 4).
QG for Paraphrase Modeling
The parameters θ to be learned include the class priors, the conditional distributions of the dependency labels given the various configurations, the POS tags given POS tags, the NE tags given NE
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Avramidis, Eleftherios and Koehn, Philipp
Introduction
In one of the first efforts to enrich the source in word-based SMT, Ueffing and Ney (2003) used part-of-speech (POS) tags, in order to deal with the verb conjugation of Spanish and Catalan; so, POS tags were used to identify the pronoun+verb sequence and splice these two words into one term.
Introduction
In their presentation of the factored SMT models, Koehn and Hoang (2007) describe experiments for translating from English to German, Spanish and Czech, using morphology tags added on the morphologically rich side, along with POS tags .
Methods for enriching input
The POS tag of this noun is then used to identify if it is plural or singular.
Methods for enriching input
The word “aspects” is found, which has a POS tag that shows it is a plural noun.
POS tags is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Turian, Joseph and Ratinov, Lev-Arie and Bengio, Yoshua
Clustering-based word representations
Ushioda (1996) presents an extension to the Brown clustering algorithm, and learn hierarchical clusterings of words as well as phrases, which they apply to POS tagging .
Clustering-based word representations
Li and McCallum (2005) use an HMM-LDA model to improve POS tagging and Chinese Word Segmentation.
Clustering-based word representations
(2009) use an HMM to assign POS tags to words, which in turns improves the accuracy of the PCFG—based Hebrew parser.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Tomanek, Katrin and Hahn, Udo
Conditional Random Fields for Sequence Labeling
Many NLP tasks, such as POS tagging , chunking, or NER, are sequence labeling problems where a sequence of class labels 3] = (3/1,.
Conditional Random Fields for Sequence Labeling
Input units 553- are usually tokens, class labels yj can be POS tags or entity classes.
Introduction
When used for sequence labeling tasks such as POS tagging , chunking, or named entity recogni-
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Xu, Wenduan and Clark, Stephen and Zhang, Yue
Experiments
For example, one template returns the top category on the stack plus its head word, together with the first word and its POS tag on the queue.
Experiments
Another template returns the second category on the stack, together with the POS tag of its head word.
Experiments
We use 10-fold cross validation for POS tagging and supertagging the training data, and automatically assigned POS tags for all experiments.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Fowler, Timothy A. D. and Penn, Gerald
A Latent Variable CCG Parser
In the supertagging literature, POS tagging and supertagging are distinguished — POS tags are the traditional Penn treebank tags (e. g. NN, VBZ and DT) and supertags are CCG categories.
A Latent Variable CCG Parser
However, because the Petrov parser trained on CCGbank has no notion of Penn treebank POS tags , we can only evaluate the accuracy of the supertags.
A Latent Variable CCG Parser
Despite the lack of POS tags in the Petrov parser, we can see that it performs slightly better than the Clark and Curran parser.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Penn, Gerald and Zhu, Xiaodan
Setting of the experiment
A decision tree (C4.5, Release 8) is used to detect false starts, trained on the POS tags and trigger-word status of the first and last four words of sentences from a training set.
Setting of the experiment
For (both WH- and Yes/No) question identification, another C4.5 classifier was trained on 2,000 manually annotated sentences using utterance length, POS bigram occurrences, and the POS tags and trigger-word status of the first and last five words of an utterance.
Setting of the experiment
Taking ASR transcripts as input, we use the Brill tagger (Brill, 1995) to assign POS tags to each word.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lavergne, Thomas and Cappé, Olivier and Yvon, François
Conditional Random Fields
3-grm 10.74% 14.3M 14.59% 0.3M 5-grm 8.48% 132.5M 11.54% 2.5M POS tagging
Conditional Random Fields
For the POS tagging task, BCD appears to be unpractically slower to train than the others approaches (SGD takes about 40min to train, OWL-QN about 1 hour) due the simultaneous increase in the sequence length and in the number of observations.
Conditional Random Fields
Based on this observation, we have designed an incremental training strategy for the POS tagging task, where more specific features are progressively incorporated into the model if the corresponding less specific feature is active.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ji, Yangfeng and Eisenstein, Jacob
Experiments
POS tag at beginning and end of the EDU
Implementation
The dependency structure and POS tags are obtained from MALT-Parser (Nivre et al., 2007).
Model
While such feature learning approaches have proven to increase robustness for parsing, POS tagging , and NER (Miller et al., 2004; Koo et al., 2008; Turian et al., 2010), they would seem to have an especially promising role for discourse, where training data is relatively sparse and ambiguity is considerable.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Zhigang and Li, Zhixing and Li, Juanzi and Tang, Jie and Z. Pan, Jeff
Our Approach
As shown in Table 2, we classify the features used in WikiCiKE into three categories: format features, POS tag features and token features.
Our Approach
POS tag features: POS tag of the current token; POS tags of the previous 5 tokens
Our Approach
POS tags of
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Lopez, Adam
Abstract
On CCGbank we achieve a labelled dependency F-measure of 88.8% on gold POS tags, and 86.7% on automatic part-of-speech tags, the best reported results for this task.
Conclusion and Future Work
In future work we plan to integrate the POS tagger , which is crucial to parsing accuracy (Clark and Curran, 2004b).
Experiments
To the best of our knowledge, the results obtained with BP and DD are the best reported results on this task using gold POS tags .
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Green, Spence and DeNero, John
A Class-based Model of Agreement
The coarse categories are the universal POS tag set described by Petrov et al.
A Class-based Model of Agreement
For Arabic, we used the coarse POS tags plus definiteness and the so-called phi features (gender, number, and person). For example, the Arabic word for ‘the car’ would be tagged “Noun+Def+Sg+Fem”.
Discussion of Translation Results
For comparison, +POS indicates our class-based model trained on the 11 coarse POS tags only (e.g., “Noun”).
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Huang, Zhiheng and Chang, Yi and Long, Bo and Crespo, Jean-Francois and Dong, Anlei and Keerthi, Sathiya and Wu, Su-Lin
Experiments
We apply 1-best and k-best sequential decoding algorithms to five NLP tagging tasks: Penn TreeBank (PTB) POS tagging, CoNLL 2000 joint POS tagging and chunking, CoNLL 2003 joint POS tagging, chunking and named entity tagging, HPSG supertagging (Matsuzaki et al., 2007) and a search query named entity recognition (NER) dataset.
Experiments
As in (Kaji et al., 2010), we combine the POS tags and chunk tags to form joint tags for CoNLL 2000 dataset, e.g., NN|B-NP.
Experiments
Similarly we combine the POS tags , chunk tags, and named entity tags to form joint tags for CoNLL 2003 dataset, e.g., PRP$|I-NP|O.
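The joint-tag construction mentioned in these excerpts is just a per-token concatenation; a minimal sketch:

```python
def joint_tag(pos, chunk, ner=None):
    """Concatenate per-token POS, chunk and (optionally) NE tags into one label."""
    parts = [pos, chunk] + ([ner] if ner is not None else [])
    return "|".join(parts)

print(joint_tag("NN", "B-NP"))          # CoNLL-2000 style: NN|B-NP
print(joint_tag("PRP$", "I-NP", "O"))   # CoNLL-2003 style: PRP$|I-NP|O
```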
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Boros, Tiberiu and Ion, Radu and Tufis, Dan
Abstract
When tagging with CTAGS, one can use any statistical POS tagging method such as HMMs, Maximum Entropy Classifiers, Bayesian Networks, CRFs, etc., followed by the CTAG to MSD recovery.
Abstract
[Figure: training uses manual+automatic rules for MSD recovery together with a POS tagger; tagging pipeline: input data -> labeling with CTAGs -> MSD recovery -> output data]
Abstract
Also, our POS tagger detected cases where the annotation in the Gold Standard was erroneous.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Persing, Isaac and Ng, Vincent
Error Classification
For this reason, we include POS tag 1, 2, 3, and 4-grams in the set of features we sort in the previous paragraph.
Error Classification
For each error 6,, we select POS tag n-grams from the top thousand features of the information gain sorted list to count toward the Ap+i and Api aggregation features.
Error Classification
This feature type may also help with Confusing Phrasing because the list of POS tag n-grams our annotator generated for its Ap+i contains useful features like DT NNS VBZ VBN (e.g., “these signals has been”), which captures noun-verb disagreement.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Visweswariah, Karthik and Khapra, Mitesh M. and Ramanathan, Ananthakrishnan
Generating reference reordering from parallel sentences
the Model 1 probabilities between pairs of words linked in the alignment a, features that inspect source and target POS tags and parses (if available) and features that inspect the alignments of adjacent words in the source and target sentence.
Generating reference reordering from parallel sentences
We conjoin the msd (minimum signed distance) with the POS tags to allow the model to capture the fact that the alignment error rate maybe higher for some POS tags than others (e.g., we have observed verbs have a higher error rate in Urdu-English alignments).
Reordering model
where θ is a learned vector of weights and Φ is a vector of binary feature functions that inspect the words and POS tags of the source sentence at and around positions m and n. We use the features (Φ) described in Visweswariah et al.
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Yang, Bishan and Cardie, Claire
Experiments
We trained CRFs for opinion entity identification using the following features: indicators for words, POS tags , and lexicon features (the subjectivity strength of the word in the Subjectivity Lexicon).
Model
Words and POS tags: the words contained in the candidate and their POS tags .
Model
For features, we use words, POS tags , phrase types, lexicon and semantic frames (see Section 3.2.1 for details) to capture the properties of the opinion expression, and also features that capture the context of the opinion expression:
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Xiang, Bing and Luo, Xiaoqiang and Zhou, Bowen
Chinese Empty Category Prediction
leftmost child label or POS tag; rightmost child label or POS tag; label or POS tag of the head child; the number of child nodes
Chinese Empty Category Prediction
left-sibling label or POS tag
Chinese Empty Category Prediction
right-sibling label or POS tag
POS tags is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: