Experimental setup | The English-Urdu parallel corpus consists of 4,325 sentences from the first three sections of the Penn Treebank and their Urdu translations annotated at the part-of-speech level. |
Introduction | Multilingual learning has been successful for other linguistic induction tasks such as lexicon acquisition, morphological segmentation, and part-of-speech tagging (Genzel, 2005; Snyder and Barzilay, 2008; Snyder et al., 2008; Snyder |
Introduction | lingual constituent, a sequence of part-of-speech tags is drawn from a language-specific distribution. |
Introduction | For each pair of coupled bilingual constituents, a pair of part-of-speech sequences is drawn jointly from a cross-lingual distribution. |
Model | We treat the part-of-speech tag sequences of parallel sentences, as well as their |
Model | Under this model, the part-of-speech sequence of each span in a sentence is generated either as a constituent yield (if it is dominated by a node in the tree) or otherwise as a distituent yield. |
Model | While this model is deficient (each observed subsequence of part-of-speech tags is generated many times over), its performance is far higher than that of unsupervised PCFGs. |
Introduction | They have become the workhorse in almost all subareas and components of NLP, including part-of-speech tagging, chunking, named entity recognition and parsing. |
Named Entity Recognition | Part-of-speech tags were used in the top-ranked systems in CoNLL 2003, as well as in many follow-up studies that used the data set (Ando and Zhang 2005; Suzuki and Isozaki 2008). |
Named Entity Recognition | LDC refers to the clusters created with the smaller LDC corpus and +pos indicates the use of part-of-speech tags as features. |
Named Entity Recognition | The top CoNLL 2003 systems all employed gazetteers or other types of specialized resources (e.g., lists of words that tend to co-occur with certain named entity types) in addition to part-of-speech tags. |
Features | Part-of-speech tags were assigned by a maximum entropy tagger trained on the Penn Treebank, and then simplified into seven categories: nouns, verbs, adverbs, adjectives, numbers, foreign words, and everything else. |
Features | Part-of-speech tags are not included in the dependency path. |
Previous work | Hearst (1992) used a small number of regular expressions over words and part-of-speech tags to find examples of the hypernym relation. |
Previous work | such as Ravichandran and Hovy (2002) and Pantel and Pennacchiotti (2006) use the same formalism of learning regular expressions over words and part-of-speech tags to discover patterns indicating a variety of relations. |
Linefeed Insertion Technique | • the rightmost independent morpheme (a part-of-speech, an inflected form) and rightmost morpheme (a part-of-speech) of a bunsetsu b_i |
Linefeed Insertion Technique | • whether or not the basic form or part-of-speech of the leftmost morpheme of the next bunsetsu of b_i is one of the morphemes enumerated in Section 3.5. |
Preliminary Analysis about Linefeed Points | Here, we focused on the basic form and part-of-speech of a morpheme. |
Preliminary Analysis about Linefeed Points | • Part-of-speech: noun-non_independent-general [0/40], noun-nai_adjective_stem [0/40], noun-non_independent-adverbial [0/27] |
Abstract | We evaluate the effectiveness of our method in three applications: text chunking, named entity recognition, and part-of-speech tagging. |
Introduction | The applications range from simple classification tasks such as text classification and history-based tagging (Ratnaparkhi, 1996) to more complex structured prediction tasks such as part-of-speech (POS) tagging (Lafferty et al., 2001), syntactic parsing (Clark and Curran, 2004) and semantic role labeling (Toutanova et al., 2005). |
Log-Linear Models | 4.3 Part-Of-Speech Tagging |