Abstract | The focus of recent studies on Chinese word segmentation, part-of-speech (POS) tagging and parsing has been shifting from words to characters. |
Abstract | In this paper, we investigate the usefulness of character-level part-of-speech information in the task of Chinese morphological analysis.
Character-level POS Tagset | Some of these tags are directly derived from the commonly accepted word-level part-of-speech, such as noun, verb, adjective and adverb.
Conclusion | A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-speech Tagging. |
Conclusion | Chinese Part-of-speech Tagging: One-at-a-time or All-at-once?
Introduction | In recent years, the focus of research on Chinese word segmentation, part-of-speech (POS) tagging and parsing has been shifting from words toward characters. |
Introduction | In our view, since each Chinese character originated as a word with a complete and independent meaning, it should be treated as the actual minimal morphological unit in Chinese, and should therefore carry a specific part-of-speech.
Introduction | This suggests that character-level POS can be used as cues in predicting the part-of-speech of unknown words. |
MWE-dedicated Features | We use part-of-speech unigrams and bigrams in order to capture MWEs with irregular syntactic structures that might indicate the idiomaticity of a word sequence.
MWE-dedicated Features | We also integrated mixed bigrams made up of a word and a part-of-speech.
MWE-dedicated Features | We associate each word with its part-of-speech tags found in our external morphological lexicon. |
Multiword expressions | In this paper, we focus on contiguous MWEs that form a lexical unit which can be marked by a part-of-speech tag (e.g., at night is an adverb, because of is a preposition).
Resources | Compounds are identified with a specific nonterminal symbol "MWX", where X is the part-of-speech of the expression.
Resources | They have a flat structure made of the part-of-speech of their components, as shown in Figure 1.
Resources | The nonterminal tagset is composed of 14 part-of-speech labels and 24 phrasal ones (including 11 MWE labels). |
Two strategies, two discriminative models | Constant and Sigogne (2011) proposed to combine MWE segmentation and part-of-speech tagging into a single sequence labelling task by assigning to each token a tag of the form TAG+X where TAG is the part-of-speech (POS) of the lexical unit the token belongs to and X is either B (i.e.
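Two strategies, two discriminative models | A minimal sketch of this joint encoding, assuming the input is already segmented into lexical units with known POS (the unit boundaries, the tags, and the B/I convention below are illustrative, since the sentence above truncates before fully defining X):

```python
def encode_tag_plus_x(units):
    """Flatten (tokens, pos) lexical units into per-token TAG+X labels,
    with X = B for the first token of a unit and I for the rest
    (assumed convention)."""
    labels = []
    for tokens, pos in units:
        for i, token in enumerate(tokens):
            labels.append((token, f"{pos}+{'B' if i == 0 else 'I'}"))
    return labels

# "because of" forms one prepositional MWE unit:
print(encode_tag_plus_x([(["because", "of"], "P"), (["rain"], "N")]))
# [('because', 'P+B'), ('of', 'P+I'), ('rain', 'N+B')]
```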
Background and Motivation | (2011) successfully apply this idea to the transfer of dependency parsers, using part-of-speech tags as the shared representation of words. |
Model Transfer | This setup requires that we use the same feature representation for both languages; for example, part-of-speech tags and dependency relation labels should come from the same inventory.
Model Transfer | In this study we will confine ourselves to those features that are applicable to all languages in question, namely: part-of-speech tags, syntactic dependency structures and representations of the word’s identity. |
Model Transfer | Part-of-speech Tags. |
Setup | We also assume that the predicate identification information is available — in most languages it can be obtained using a relatively simple heuristic based on part-of-speech tags. |
Setup | (2011), we assume that a part-of-speech tagger is available for the target language. |
Experimental setup | The English-Urdu parallel corpus3 consists of 4,325 sentences from the first three sections of the Penn Treebank and their Urdu translations annotated at the part-of-speech level. |
Introduction | Multilingual learning has been successful for other linguistic induction tasks such as lexicon acquisition, morphological segmentation, and part-of-speech tagging (Genzel, 2005; Snyder and Barzilay, 2008; Snyder et al., 2008; Snyder |
Introduction | For each monolingual constituent, a sequence of part-of-speech tags is drawn from a language-specific distribution.
Introduction | For each pair of coupled bilingual constituents, a pair of part-of-speech sequences are drawn jointly from a cross-lingual distribution. |
Model | We treat the part-of-speech tag sequences of parallel sentences, as well as their |
Model | Under this model, the part-of-speech sequence of each span in a sentence is generated either as a constituent yield, if it is dominated by a node in the tree, or otherwise as a distituent yield.
Model | While this model is deficient (each observed subsequence of part-of-speech tags is generated many times over), its performance is far higher than that of unsupervised PCFGs.
Experiments | To extract a Hebrew morphological lexicon we assume the existence of manual morphological and part-of-speech annotations (Groves and Lowery, 2006). |
Experiments | We divide Hebrew stems into four main part-of-speech categories each with a distinct affix profile: Noun, Verb, Pronoun, and Particle. |
Experiments | For each part-of-speech category, we determine the set of allowable affixes using the annotated Bible corpus. |
Inference | First we sample the morphological segmentation of u_i, along with the part-of-speech pos of the latent stem cognate.
Inference | To do so, we enumerate each possible segmentation and part-of-speech and calculate its joint conditional probability (for notational clarity, we leave implicit the conditioning on the other samples in the corpus): |
Inference | where the summations over character-edit sequences are restricted to those which yield the segmentation (u_pre, u_stm, u_suf) and a latent cognate with part-of-speech pos.
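Inference | A schematic sketch of the enumeration step, assuming a hypothetical scorer joint_prob(pre, stm, suf, pos) that returns the unnormalized joint conditional probability of a segmentation and latent-cognate POS:

```python
def enumerate_analyses(u, pos_tags, joint_prob):
    """Score every (prefix, stem, suffix, pos) analysis of the word u;
    joint_prob is an assumed scorer, not the paper's implementation."""
    candidates = []
    for i in range(len(u) + 1):            # prefix/stem boundary
        for j in range(i, len(u) + 1):     # stem/suffix boundary
            pre, stm, suf = u[:i], u[i:j], u[j:]
            if not stm:                    # require a non-empty stem
                continue
            for pos in pos_tags:
                score = joint_prob(pre, stm, suf, pos)
                candidates.append(((pre, stm, suf, pos), score))
    return candidates  # a sampler would normalize these and draw one
```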
Model | We model prefix and suffix distributions as conditionally dependent on the part-of-speech of the stem morpheme-pair. |
Model | in stem part-of-speech.
Abstract | We study substitute vectors to solve the part-of-speech ambiguity problem in an unsupervised setting. |
Abstract | Part-of-speech tagging is a crucial preliminary process in many natural language processing applications. |
Abstract | Because many words in natural languages have more than one part-of-speech tag, resolving part-of-speech ambiguity is an important task. |
Algorithm | Previous work (Yatbaz et al., 2012) demonstrates that clustering substitute vectors of all word types alone has limited success in predicting the part-of-speech tag of a word.
Algorithm | The output of clustering induces part-of-speech categories of word tokens.
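Algorithm | A minimal sketch of the clustering step, assuming substitute vectors are precomputed (one row per token; the values below are random placeholders) and using k-means as the clustering algorithm:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical input: one substitute vector per word token
# (rows = tokens, columns = probabilities of likely substitute words).
substitute_vectors = np.random.rand(1000, 50)

# Induce POS-like categories by clustering the token vectors.
induced_tags = KMeans(n_clusters=45, n_init=10,
                      random_state=0).fit_predict(substitute_vectors)
```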
Introduction | part-of-speech or POS tagging) is an important preprocessing step for many natural language processing applications, because grammatical rules are not functions of individual words; instead, they are functions of word categories.
Introduction | In addition, we suggest that occurrences of a word with different part-of-speech categories should appear in different contexts.
Abstract | In this work we propose methods to label probabilistic synchronous context-free grammar (PSCFG) rules using only word tags, generated by either part-of-speech analysis or unsupervised word class induction. |
Experiments | This is due to the fact that for the source-tag based approach, a given chart cell in the CYK decoder, represented by a start and end position in the source sentence, almost uniquely determines the nonterminal any hypothesis in this cell can have: Disregarding part-of-speech tag ambiguity and phrase size accounting, that nonterminal will be the composition of the tags of the start and end source words spanned by that cell. |
Experiments | K-means clustering based models. To establish suitable values for the α parameters and investigate the impact of the number of clusters, we looked at the development performance over various parameter combinations for a K-means model based on source and/or target part-of-speech tags.7 As can be seen from Figure 1 (right), our method reaches its peak performance at around 50 clusters and then levels off slightly.
Hard rule labeling from word classes | Extension to a bilingually tagged corpus. While the availability of syntactic annotations for both source and target language is unlikely in most translation scenarios, some form of word tags, be it part-of-speech tags or learned word clusters (cf.
Hard rule labeling from word classes | Consider again our example sentence pair (now also annotated with source-side part-of-speech tags): |
Introduction | In this work, we propose a labeling approach that is based merely on part-of-speech analysis of the source or target language (or even both). |
Approach | For example, we can map each DOM tree node onto an integer equal to the number of children, or map each entity string onto its part-of-speech tag sequence. |
Approach | For abstract tokens with finitely many possible values (e.g., part-of-speech), we also use the normalized
Approach | On the other hand, random words on the web page tend to have more diverse lengths and part-of-speech tags. |
Experiments | For example, queries mayors of Chicago and universities in Chicago will produce entities of different lengths, part-of-speech sequences, and word distributions. |
Discussion | Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation of speech, or alternatively as a predictive tool in applications of interactive machine translation, of the kind described by Foster et al. |
Discussion | However, one may also compute the probability that the next part-of-speech in the target translation is A. |
Discussion | This can be realised by adding a rule s′ : [B → b, A → c_A] for each rule s : [B → b, A → a] from the source grammar, where A is a nonterminal representing a part-of-speech and c_A is a (pre-)terminal specific to A.
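Discussion | A small sketch of this grammar transformation, assuming synchronous rules are represented as (B, b, A, a) tuples (our own encoding, not the paper's):

```python
def add_pos_prediction_rules(rules, pos_nonterminals):
    """For each rule [B -> b, A -> a] whose target nonterminal A is a
    part-of-speech, add a rule [B -> b, A -> c_A], with c_A a fresh
    (pre-)terminal specific to A."""
    extra = [(B, b, A, f"c_{A}")
             for (B, b, A, a) in rules if A in pos_nonterminals]
    return rules + extra
```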
Introduction | Prefix probabilities can be used to compute probability distributions for the next word or part-of-speech.
Introduction | Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation, essentially in the same way as described by Jelinek and Lafferty (1991) for probabilistic context-free grammars, as discussed later in this paper. |
Related Work | Subsequent improvements use the P(o|b, b′) formula, for example, for incorporating various linguistic features like part-of-speech (Zens and Ney, 2006), syntactic information (Chang et al., 2009), dependency information (Bach et al., 2009) and predicate-argument structure (Xiong et al., 2012).
Training | In total, we consider 21 part-of-speech tags, some of which are as follows: VC (copula), DEC, DEG, DER, DEV (de-related), PU (punctuation), AD (adverbs) and P (prepositions).
Training | We train the classifiers on a rich set of binary features ranging from lexical to part-of-speech (POS) and to syntactic features. |
Training | 1. anchor-related: slex (the actual word of the source anchor), spos (the part-of-speech (POS) tag of slex), sparent (the parent of spos in the parse tree), tlex (the anchor's actual target word).
Two-Neighbor Orientation Model | In our experiments, we use a simple heuristic based on part-of-speech tags, which will be described in Section 7.
Experiments | We use the features of Zhang and Nivre (2011), except that all lexical identities are dropped from the templates during training and testing, hence inducing a ‘delexicalized’ model that employs only ‘universal’ properties from source-side treebanks, such as part-of-speech tags, labels, head-modifier distance, etc.
Introduction | In the context of part-of-speech tagging, universal representations, such as that of Petrov et al. |
Towards A Universal Treebank | (2012) as the underlying part-of-speech representation. |
Towards A Universal Treebank | For both English and Swedish, we mapped the language-specific part-of-speech tags to universal tags using the mappings of Petrov et al. |
Towards A Universal Treebank | Note that relative to the universal part-of-speech tagset of Petrov et al. |
Representations and models | In particular, our shallow tree structure is a two-level syntactic hierarchy built from word lemmas (leaves) and part-of-speech tags that are further grouped into chunks (Fig. |
Representations and models | As full syntactic parsers such as constituency or dependency tree parsers would significantly degrade in performance on noisy texts, e.g., Twitter or YouTube comments, we opted for shallow structures, which rely on simpler and more robust components: a part-of-speech tagger and a chunker. |
Representations and models | Hence, we use the CMU Twitter POS tagger (Gimpel et al., 2011; Owoputi et al., 2013) to obtain the part-of-speech tags.
Experimental Setup and Results | The first marker is the part-of-speech tag of the root and the remainder are the overt inflectional and derivational markers of the word. |
Experimental Setup and Results | Instead of using just the surface form of the word, we included the root, part-of-speech and morphological tag information into the corpus as additional factors alongside the surface form.13 Thus, a token is represented with three factors as Surface | Root | Tags where Tags are complex tags on the English side, and morphological tags on the Turkish side.14 |
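Experimental Setup and Results | A minimal illustration of the factored format; the analyses below are hypothetical examples, not taken from the paper's corpus:

```python
def factored_token(surface, root, tags):
    """Render one token in the Surface|Root|Tags factored format."""
    return f"{surface}|{root}|{tags}"

# Hypothetical analyses for illustration:
print(factored_token("evlerde", "ev", "+Noun+A3pl+Loc"))  # Turkish side
print(factored_token("houses", "house", "+NNS"))          # English side
```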
Introduction | They have reported that, given the typical complexity of Turkish words, there was a substantial percentage of words whose morphological structure was incorrect: either the morphemes were not applicable for the part-of-speech category of the root word selected, or the morphemes were in the wrong order. |
Related Work | Popovic and Ney (2004) investigated improving translation quality from inflected languages by using stems, suffixes and part-of-speech tags. |
Syntax-to-Morphology Mapping | Part-of-Speech Tags for the English words: +IN - Preposition; +PRP$ - Possessive Pronoun; +JJ - Adjective; +NN - Noun; +NNS - Plural Noun.
Conclusion | We have shown the efficacy of graph-based label propagation for projecting part-of-speech information across languages. |
Experiments and Results | 6.2 Part-of-Speech Tagset and HMM States |
Experiments and Results | While there might be some controversy about the exact definition of such a tagset, these 12 categories cover the most frequent part-of-speech categories and exist in one form or another in all of the languages that we studied.
Introduction | To make the projection practical, we rely on the twelve universal part-of-speech tags of Petrov et al.
Experiments | The part-of-speech tags for the development and test set were automatically assigned by the MXPOST tagger10, where the tagger was trained on the entire training corpus.
Related Work | (2010) created robust supervised classifiers via web-scale N-gram data for adjective ordering, spelling correction, noun compound bracketing and verb part-of-speech disambiguation. |
Web-Derived Selectional Preference Features | In this paper, we employ two different feature sets: a baseline feature set3, which draws upon “normal” information sources such as word forms and part-of-speech (POS) tags without including the web-derived selectional preference4 features, and a feature set that conjoins the baseline features with the web-derived selectional preference features.
Web-Derived Selectional Preference Features | is any token whose part-of-speech is IN |
Comparison on applications | We use the OpenNLP toolkit6 for segmentation and part-of-speech tagging. |
Content comparison of the 1911 and 1987 Thesauri | Counts per hierarchy level in the two Thesauri:
  Hierarchy         1911    1987
  Class                8       8
  Section             39      39
  Subsection          97      95
  Head Group         625     596
  Head              1044     990
  Part-of-speech    3934    3220
  Paragraph        10244    6443
  Semicolon Group  43196   59915
  Total Words      98924  225124
  Unique Words     59768  100470
Content comparison of the 1911 and 1987 Thesauri | The part-of-speech level is a little confusing, since clearly no such grouping contains an exhaustive list of all nouns, all verbs etc. |
Content comparison of the 1911 and 1987 Thesauri | We will write “POS” to indicate a structure in Roget’s and “part-of-speech” to indicate the word category in general. |
Hello. My name is Inigo Montoya. | Interestingly, this distinctiveness takes place at the level of words, but not at the level of other syntactic features: the part-of-speech composition of memorable quotes is in fact more likely with respect to newswire. |
Hello. My name is Inigo Montoya. | Thus, we can think of memorable quotes as consisting, in an aggregate sense, of unusual word choices built on a scaffolding of common part-of-speech patterns. |
Hello. My name is Inigo Montoya. | In particular, we analyze a corpus of advertising slogans, and we show that these slogans have significantly greater likelihood at both the word level and the part-of-speech level with respect to a language model trained on memorable movie quotes, compared to a corresponding language model trained on non-memorable movie quotes. |
Never send a human to do a machine’s job. | We then develop models using features based on the measures formulated earlier in this section: generality measures (the four listed in Table 4); distinctiveness measures (likelihood according to 1, 2, and 3-gram “common language” models at the lexical and part-of-speech level for each quote in the pair, their differences, and pairwise comparisons between them); and similarity-to-slogans measures (likelihood according to 1, 2, and 3-gram slogan-language models at the lexical and part-of-speech level for each quote in the pair, their differences, and pairwise comparisons between them).
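Never send a human to do a machine’s job. | A sketch of one such likelihood measure, assuming a hypothetical n-gram model interface ngram_prob(context, token); the same routine can be run over word tokens for the lexical measure or over POS tags for the part-of-speech measure:

```python
import math

def avg_log_likelihood(tokens, ngram_prob, n=2):
    """Average per-token log-probability of a sequence under an n-gram
    "common language" model; ngram_prob(context, token) is an assumed
    interface returning a probability in (0, 1]."""
    padded = ["<s>"] * (n - 1) + list(tokens)
    total = 0.0
    for i in range(n - 1, len(padded)):
        context = tuple(padded[i - n + 1:i])
        total += math.log(ngram_prob(context, padded[i]))
    return total / len(tokens)
```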
Conclusion | We proposed FWD (Frames, BOW, and part-of-speech specific DAL) features and SemTree data representations. |
Experiments | We remove stop words and use Stanford CoreNLP for part-of-speech tagging and named entity recognition. |
Methods | (2009) introduced part-of-speech specific DAL features for sentiment analysis. |
Related Work | Table 1: FWD features (Frame, bag-of-Words, part-of-speech DAL score) and their value types. |
Introduction | Syntactic relations manifest themselves in a broad range of surface indicators, ranging from morphological to lexical, including positional and part-of-speech (POS) tagging features. |
Introduction | For instance, morphological properties are closely tied to part-of-speech tags, which in turn relate to positional features. |
Introduction | The power of the low-rank model becomes evident in the absence of any part-of-speech tags. |
Results | Syntactic Abstraction without POS Since our model learns a compressed representation of feature vectors, we are interested to measure its performance when part-of-speech tags are not provided (See Table 4). |
Data and Tools | 3.3 Part-of-Speech Tagging |
Data and Tools | Several features in our parsing model involve part-of-speech (POS) tags of the input sentences. |
Data and Tools | As part-of-speech tags are also a form of syntactic analysis, this assumption weakens the applicability of our approach. |
Our Approach | By reducing unaligned edges to their delexicalized forms, we can still use delexicalized features, such as part-of-speech tags, for those unaligned edges, and can address the problem that automatically generated word alignments include errors.
Introduction | Similar representations have proven useful in domain-adaptation for part-of-speech tagging and phrase chunking (Huang and Yates, 2009). |
Introduction | Every sentence in the dataset is automatically annotated with a number of NLP pipeline systems, including part-of-speech (POS) tags, phrase chunk labels (Carreras and Marquez, 2003), named-entity tags, and full parse information by multiple parsers. |
Introduction | As with our other HMM-based models, we use the largest number of latent states that will allow the resulting model to fit in our machine’s memory — our previous experiments on representations for part-of-speech tagging suggest that more latent states are usually better. |
Experimental Results | A large number of false negatives on the part of O-CRF can be attributed to its lack of lexical features, which are often crucial when part-of-speech tagging errors are present. |
Experimental Results | recognize the positive instance, despite the incorrect part-of-speech tag.
Relation Extraction | The set of features used by O-CRF is largely similar to those used by O-NB and other state-of-the-art relation extraction systems. They include part-of-speech tags (predicted using a separately trained maximum-entropy model), regular expressions (e.g., detecting capitalization, punctuation, etc.), context words, and conjunctions of features occurring in adjacent positions within six words to the left and six words to the right of the current word.
Relation Extraction | O-CRF was built using the CRF implementation provided by MALLET (McCallum, 2002), as well as part-of-speech tagging and phrase-chunking tools available from OPENNLP.2 |
Conclusions and Future Work | In the future, we hope to apply similar multilingual models to other core unsupervised analysis tasks, including part-of-speech tagging and grammar induction, and to further investigate the role that language relatedness plays in such models. |
Experimental SetUp | The accuracy of this analyzer is reported to be 94% for full morphological analyses, and 98%-99% when part-of-speech tag accuracy is not included. |
Related Work | An example of such a property is the distribution of part-of-speech bigrams. |
Related Work | Hana et al. (2004) demonstrate that adding such statistics from an annotated Czech corpus improves the performance of a Russian part-of-speech tagger over a fully unsupervised version.
Introduction | They have become the workhorse in almost all subareas and components of NLP, including part-of-speech tagging, chunking, named entity recognition and parsing. |
Named Entity Recognition | Part-of-speech tags were used in the top-ranked systems in CoNLL 2003, as well as in many follow up studies that used the data set (Ando and Zhang 2005; Suzuki and Isozaki 2008). |
Named Entity Recognition | LDC refers to the clusters created with the smaller LDC corpus and +pos indicates the use of part-of-speech tags as features. |
Named Entity Recognition | The Top CoNLL 2003 systems all employed gazetteers or other types of specialized resources (e.g., lists of words that tend to co-occur with certain named entity types) in addition to part-of-speech tags. |
Features | Part-of-speech tags were assigned by a maximum entropy tagger trained on the Penn Treebank, and then simplified into seven categories: nouns, verbs, adverbs, adjectives, numbers, foreign words, and everything else.
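Features | A sketch of such a simplification; the exact tag-to-category mapping is our assumption, guided by standard Penn Treebank tag prefixes:

```python
def simplify_tag(ptb_tag):
    """Collapse Penn Treebank tags into the seven coarse categories
    listed above (the precise mapping is assumed, not the paper's)."""
    if ptb_tag.startswith("NN"):
        return "noun"
    if ptb_tag.startswith("VB"):
        return "verb"
    if ptb_tag.startswith("RB"):
        return "adverb"
    if ptb_tag.startswith("JJ"):
        return "adjective"
    if ptb_tag == "CD":
        return "number"
    if ptb_tag == "FW":
        return "foreign"
    return "other"
```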
Features | Part-of-speech tags are not included in the dependency path. |
Previous work | Hearst (1992) used a small number of regular expressions over words and part-of-speech tags to find examples of the hypernym relation. |
Previous work | such as Ravichandran and Hovy (2002) and Pantel and Pennacchiotti (2006) use the same formalism of learning regular expressions over words and part-of-speech tags to discover patterns indicating a variety of relations. |
Linefeed Insertion Technique | • the rightmost independent morpheme (its part-of-speech and inflected form) and the rightmost morpheme (its part-of-speech) of a bunsetsu b_i
Linefeed Insertion Technique | • whether or not the basic form or part-of-speech of the leftmost morpheme of the bunsetsu following b_i is one of the morphemes enumerated in Section 3.5.
Preliminary Analysis about Linefeed Points | Here, we focused on the basic form and part-of-speech of a morpheme. |
Preliminary Analysis about Linefeed Points | • Part-of-speech: noun-non_independent-general [0/40], noun-nai_adjective_stem [0/40], noun-non_independent-adverbial [0/27]
Conclusion and Future Work | The key innovation in the present work is the combination of unsupervised part-of-speech tagging and argument identification to permit learning in a simplified SRL system.
Conclusion and Future Work | have the luxury of treating part-of-speech tagging and semantic role labeling as separable tasks. |
Introduction | The first problem involves classifying words by part-of-speech.
Introduction | By using the HMM part-of-speech tagger in this way, we can ask how the simple structural features that we propose children start with stand up to reductions in parsing accuracy. |
Bootstrapping Recursive Patterns | We noticed that despite the specific lexico-syntactic structure of the patterns, erroneous information can be acquired due to part-of-speech tagging errors or flawed facts on the Web. |
Results | wrong part-of-speech tag; none of the above
Results | The majority of the errors that occurred are due to part-of-speech tagging.
Semantic Relations | In total, we collected 30GB raw data which was part-of-speech tagged and used for the argument and supertype extraction. |
Conditional Random Fields | Our experiments use two standard NLP tasks, phonetization and part-of-speech tagging, chosen here to illustrate two very different situations, and to allow for comparison with results reported elsewhere in the literature. |
Conditional Random Fields | 5.1.2 Part-of-Speech Tagging |
Conditional Random Fields | Our second benchmark is a part-of-speech (POS) tagging task using the Penn Treebank corpus (Marcus et al., 1993), which provides us with a quite different condition.
Introduction | Based on an efficient implementation of these algorithms, we were able to train very large CRFs containing more than a hundred output labels and up to several billion features, yielding results that are as good or better than the best reported results for two NLP benchmarks, text phonetization and part-of-speech tagging.
Experiments | 8A state-of-the-art, fully-supervised maximum entropy tagger (Clark and Curran, 2007) (which also uses part-of-speech labels) obtains 91.4% on the same train/test split. |
Grammar informed initialization for supertagging | Part-of-speech tags are atomic labels that in and of themselves encode no internal structure. |
Introduction | Creating accurate part-of-speech (POS) taggers using a tag dictionary and unlabeled data is an interesting task with practical applications. |
Introduction | Nonetheless, the methods proposed apply to realistic scenarios in which one has an electronic part-of-speech tag dictionary or a handcrafted grammar with limited coverage. |
Features | We employ a separate instance of this feature for each English part-of-speech tag: p(f | e, t).
Features | link (e, f) if the part-of-speech tag of e is t. The conditional probabilities in this table are computed from our parse trees and the baseline Model 4 alignments. |
Features | These fire for each link (e, f) and part-of-speech tag.
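Features | A minimal sketch of estimating p(f | e, t) by relative frequency, assuming alignment links are available as (e, f, t) tuples (a hypothetical input format, not the paper's data structure):

```python
from collections import Counter

def estimate_p_f_given_e_t(links):
    """Relative-frequency estimate of p(f | e, t) from a list of
    (e, f, t) alignment-link observations."""
    joint = Counter(links)                           # counts of (e, f, t)
    marginal = Counter((e, t) for e, f, t in links)  # counts of (e, t)
    return {(e, f, t): c / marginal[(e, t)]
            for (e, f, t), c in joint.items()}
```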
Background | Past research in unsupervised PoS induction has largely been driven by two different motivations: a task based perspective which has focussed on inducing word classes to improve various applications, and a linguistic perspective where the aim is to induce classes which correspond closely to annotated part-of-speech corpora. |
Background | The HMM ignores orthographic information, which is often highly indicative of a word’s part-of-speech, particularly so in morphologically rich languages.
Introduction | Unsupervised part-of-speech (PoS) induction has long been a central challenge in computational linguistics, with applications in human language learning and for developing portable language processing systems.
The PYP-HMM | In many languages morphological regularities correlate strongly with a word’s part-of-speech (e.g., suffixes in English), which we hope to capture using a basic character language model. |
Experimental Setup | This mapping technique is based on the many-to-one scheme used for evaluating unsupervised part-of-speech induction (Johnson, 2007). |
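Experimental Setup | A compact sketch of many-to-one evaluation: each induced cluster is mapped to the gold tag it most frequently co-occurs with, and tagging accuracy is scored under that mapping:

```python
from collections import Counter, defaultdict

def many_to_one_accuracy(induced, gold):
    """Johnson (2007)-style many-to-one score over parallel lists of
    induced cluster ids and gold part-of-speech tags."""
    by_cluster = defaultdict(Counter)
    for c, g in zip(induced, gold):
        by_cluster[c][g] += 1
    mapping = {c: cnt.most_common(1)[0][0] for c, cnt in by_cluster.items()}
    return sum(mapping[c] == g for c, g in zip(induced, gold)) / len(gold)

print(many_to_one_accuracy([0, 0, 1, 2, 1], ["N", "N", "V", "N", "D"]))  # 0.8
```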
Experimental Setup | We implement a clustering baseline using the CLUTO toolkit with word and part-of-speech features. |
Model | These features can encode words, part-of-speech tags, context, and so on. |
Implementation Details | We first perform word segmentation (if needed) and part-of-speech tagging. |
Implementation Details | After that, we obtain the word-segmented sentences with the part-of-speech tags. |
Parsing with dependency language model | The feature templates are outlined in Table 1, where TYPE refers to one of the types PL or PR, h_pos refers to the part-of-speech tag of x_h, h_word refers to the lexical form of x_h, ch_pos refers to the part-of-speech tag of x_ch, and ch_word refers to the lexical form of x_ch.
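Parsing with dependency language model | An illustrative instantiation of these templates; the exact feature combinations are listed in the paper's Table 1, so the conjunctions below are assumptions:

```python
def dlm_features(kind, head, child):
    """Instantiate feature templates for one head-child pair;
    head and child are (word, pos) tuples, kind is "PL" or "PR"."""
    h_word, h_pos = head
    ch_word, ch_pos = child
    return [
        f"{kind}:h_pos={h_pos},ch_pos={ch_pos}",
        f"{kind}:h_word={h_word},ch_word={ch_word}",
        f"{kind}:h_pos={h_pos},ch_word={ch_word}",
        f"{kind}:h_word={h_word},ch_pos={ch_pos}",
    ]
```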
Experiments | pendent part-of-speech.
Integrated Models | All features are conjoined with the part-of-speech tags of the words involved in the dependency to allow the guided parser to learn weights relative to different surface syntactic environments. |
Integrated Models | Unlike MSTParser, features are not explicitly defined to conjoin guide features with part-of-speech features. |
Arabic Handwriting Recognition Challenges | In this paper we consider the value of morpho-lexical and morpho-syntactic features such as lemmas and part-of-speech tags, respectively, that may allow machine learning algorithms to learn generalizations.
Experimental Settings | Ya and digit normalization; pos: the part-of-speech (POS) of the word; lem: the lemma of the word
Related Work | part-of-speech tags, which they do not use, but suggest may help. |
Introduction | base phrases in Japanese) whose head part-of-speech was automatically tagged by the Japanese morphological analyser ChaSen6 as either ‘noun’ or ‘unknown word’ according to the NAIST-jdic dictionary.7
Introduction | POS / LEMMA / DEP_LABEL: the part-of-speech / lemma / dependency label of the predicate which has ZERO.
Introduction | D_POS / D_LEMMA / D_DEP_LABEL: the part-of-speech / lemma / dependency label of the dependents of the predicate which has ZERO.
Features | features on preterminal part-of-speech tags. |
Introduction | An independent classification approach is actually very viable for part-of-speech tagging (Toutanova et al., 2003), but is problematic for parsing — if nothing else, parsing comes with a structural requirement that the output be a well-formed, nested tree. |
Other Languages | These part-of-speech taggers often incorporate substantial knowledge of each language’s morphology. |
Data: MWEs in Dependency Trees | In gold data, the MWEs appear in an expanded flat format: each MWE bears a part-of-speech and consists of a sequence of tokens (hereafter the “components” of the MWE), each having their proper POS, lemma and morphological features. |
Data: MWEs in Dependency Trees | For flat MWEs, the only missing information is the MWE part-of-speech: we concatenate it to the dep_cpd labels.
Use of external MWE resources | We tested incorporating the MWE-specific features as defined in the gold flat representation (section 3.1): the mwehead=POS feature for the MWE head token, POS being the part-of-speech of the MWE; the component=y feature for the non-first MWE component.
Instantiation | Part-of-speech: Part-of-speech information can be used to produce features that encourage certain behavior, such as avoiding the deletion of noun phrases. |
Instantiation | We generate part-of-speech information over the original raw text using a Twitter part-of-speech tagger (Ritter et al., 2011). |
Instantiation | Of course, the part-of-speech information obtained this way is likely to be noisy, and we expect our learning algorithm to take that into account. |
Split-Merge Role Induction | When the part-of-speech similarity (pos) is below a certain threshold β or when clause-level constraints (cons) are satisfied to a lesser extent than threshold γ, the score takes value zero and the merge is ruled out.
Split-Merge Role Induction | Part-of-speech Similarity. Part-of-speech similarity is also measured through cosine similarity (equation (3)).
Split-Merge Role Induction | Clusters are again represented as vectors x and y whose components correspond to argument part-of-speech tags and values to their occurrence frequency. |
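Split-Merge Role Induction | A small sketch of the part-of-speech similarity check, with illustrative vectors and threshold (β = 0.5 is a placeholder, not the paper's value):

```python
import numpy as np

def pos_similarity(x, y):
    """Cosine similarity between two clusters' POS-frequency vectors
    (component i = frequency of POS tag i among the cluster's arguments)."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

beta = 0.5  # placeholder threshold
x = np.array([10.0, 3.0, 0.0, 1.0])
y = np.array([8.0, 4.0, 1.0, 0.0])
allow_merge = pos_similarity(x, y) >= beta  # merge ruled out below beta
```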
Abstract | We present a new perceptron learning algorithm using antagonistic adversaries and compare it to previous proposals on 12 multilingual cross-domain part-of-speech tagging datasets. |
Experiments | We consider part-of-speech (POS) tagging, i.e. |
Introduction | Most learning algorithms assume that training and test data are governed by identical distributions; and more specifically, in the case of part-of-speech (POS) tagging, that training and test sentences were sampled at random and that they are identically and independently distributed. |
Abstract | We evaluate the effectiveness of our method in three applications: text chunking, named entity recognition, and part-of-speech tagging. |
Introduction | The applications range from simple classification tasks such as text classification and history-based tagging (Ratnaparkhi, 1996) to more complex structured prediction tasks such as part-of-speech (POS) tagging (Lafferty et al., 2001), syntactic parsing (Clark and Curran, 2004) and semantic role labeling (Toutanova et al., 2005). |
Log-Linear Models | 4.3 Part-Of-Speech Tagging |
Detection of New Entities | To detect noun phrases that potentially refer to entities, we apply a part-of-speech tagger to the input text. |
Evaluation | HYENA’s relatively poor performance can be attributed to the fact that its features are mainly syntactic, such as bigrams and part-of-speech tags.
Related Work | All methods use trained classifiers over a variety of linguistic features, most importantly, words and bigrams with part-of-speech tags in a mention and in the textual context preceding and following the mention. |
Introduction | Morphological taggers disambiguate morphological attributes such as part-of-speech (POS) or case, without taking syntax into account (Hajič and Hladká, 1998; Hajič et al., 2001); dependency parsers commonly assume the “pipeline” approach, relying on morphological information as part of the input (Buchholz and Marsi, 2006; Nivre et al., 2007).
Previous Work | Each of the resulting morphemes is then tagged with an atomic “part-of-speech” to indicate word class and some morphological features. |
Previous Work | According to the Latin morphological database encoded in MORPHEUS (Crane, 1991), 30% of Latin nouns can be parsed as another part-of-speech, and on average each has 3.8 possible morphological interpretations.
Experiments | To extract part-of-speech tags, phrase structure trees, and typed dependencies, we use the Stanford parser (Klein and Manning, 2003; de Marneffe et al., 2006) on both train and test sets. |
Experiments | 14MAll’s features are similar to part-of-speech tags and untyped dependency relations. |
Our framework | The part-of-speech (POS) tag of the head of the chunk; the lexical item of the head noun
Problem Definition and Notation | Structured output prediction encompasses a wide variety of NLP problems like part-of-speech tagging, parsing and machine translation. |
Problem Definition and Notation | Figure 1 illustrates this observation in the context of part-of-speech tagging. |
Problem Definition and Notation | Figure 1: Comparison of the number of instances and the number of unique observed part-of-speech structures in the Gigaword corpus.
Introduction | Prior work incorporating parse structure into machine translation (Chiang, 2010) and Semantic Role Labeling (Tsai et al., 2005; Punyakanok et al., 2008) indicates that such hierarchical structure can have great benefit over shallow labeling techniques like chunking and part-of-speech tagging.
Open/Closed Cell Classification | The feature vector x is encoded with the chart cell’s absolute and relative span width, as well as unigram and bigram lexical and part-of-speech tag items from w_{i-1} …
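Open/Closed Cell Classification | A sketch of such a feature encoding; the source sentence truncates mid-list, so the item inventory around the span boundaries below is an assumption:

```python
def cell_features(words, tags, start, end):
    """Encode one chart cell: absolute/relative span width plus lexical
    and POS items at the span boundaries (inventory assumed)."""
    n = len(words)
    width = end - start
    feats = [f"width={width}", f"rel_width={width / n:.2f}"]
    for i in (start - 1, start, end - 1, end):
        if 0 <= i < n:
            feats.append(f"w{i - start}={words[i]}")
            feats.append(f"t{i - start}={tags[i]}")
    return feats
```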
Results | Figure 4 contains a timing comparison of the three components of our final parser: Boundary FOM initialization (which includes the forward-backward algorithm over ambiguous part-of-speech tags), beam- |
Independent Query Annotations | (2010) we use a large n-gram corpus (Brants and Franz, 2006) to estimate the required probabilities for annotating the query with capitalization and segmentation markup, and a standard POS tagger1 for part-of-speech tagging of the query.
Introduction | Automatic markup of textual documents with linguistic annotations such as part-of-speech tags, sentence constituents, named entities, or semantic roles is a common practice in natural language processing (NLP). |
Related Work | The literature on query annotation includes query segmentation (Bergsma and Wang, 2007; Jones et al., 2006; Guo et al., 2008; Hagen et al., 2010; Hagen et al., 2011; Tan and Peng, 2008), part-of-speech and semantic tagging (Barr et al., 2008; Manshadi and Li, 2009; Li, 2010), named-entity recognition (Guo et al., 2009; Lu et al., 2009; Shen et al., 2008; Pasca, 2007), abbreviation disambiguation (Wei et al., 2008) and stopword detection (Lo et al., 2005; Jones and Fain, 2003). |
Abstract | Standard methods for part-of-speech tagging suffer from data sparseness when used on highly inflectional languages (which require large lexical tagset inventories). |
Abstract | Several neural network architectures have been proposed for the task of part-of-speech tagging. |
Abstract | We presented a new approach for large tagset part-of-speech tagging using neural networks. |
Conclusion and Future Work | Such diverse topics as machine translation (Dyer et al., 2008; Dyer and Resnik, 2010; Mi et al., 2008), part-of-speech tagging (Jiang et al., 2008), named entity recognition (Finkel and Manning, 2009), semantic role labelling (Sutton and McCallum, 2005; Finkel et al., 2006), and others have also been improved by combined models.
Experiments | We supply gold-standard part-of-speech tags to the parsers. |
Experiments | Next, we evaluate performance when using automatic part-of-speech tags as input to our parser.