Experiments | To extract a Hebrew morphological lexicon we assume the existence of manual morphological and part-of-speech annotations (Groves and Lowery, 2006). |
Experiments | We divide Hebrew stems into four main part-of-speech categories each with a distinct affix profile: Noun, Verb, Pronoun, and Particle. |
Experiments | For each part-of-speech category, we determine the set of allowable affixes using the annotated Bible corpus. |
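A minimal sketch of collecting per-POS affix inventories from annotated tokens. The annotation tuples and affix strings below are invented placeholders, not actual corpus data:

```python
from collections import defaultdict

# Invented placeholder annotations: (prefix, stem, suffix, stem_pos).
annotations = [
    ("b", "stem1", "", "Noun"),
    ("w", "stem2", "w", "Verb"),
    ("h", "stem3", "ym", "Noun"),
]

def allowable_affixes(annotations):
    """Collect the prefixes and suffixes observed with each POS category."""
    prefixes, suffixes = defaultdict(set), defaultdict(set)
    for pre, _stem, suf, pos in annotations:
        if pre:
            prefixes[pos].add(pre)
        if suf:
            suffixes[pos].add(suf)
    return prefixes, suffixes

prefixes, suffixes = allowable_affixes(annotations)
```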
Inference | First, we sample the morphological segmentation of u_i, along with the part-of-speech p′ of the latent stem cognate.
Inference | To do so, we enumerate each possible segmentation and part-of-speech and calculate its joint conditional probability (for notational clarity, we leave implicit the conditioning on the other samples in the corpus): |
Inference | where the summations over character-edit sequences are restricted to those which yield the segmentation (u_pre, u_stm, u_suf) and a latent cognate with part-of-speech p′.
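The enumerate-score-sample step can be sketched as follows, assuming small hypothetical affix inventories and a caller-supplied unnormalized scoring function standing in for the model's joint conditional probability:

```python
import random

def segmentations(word, prefixes, suffixes):
    """Enumerate (prefix, stem, suffix) splits whose affixes appear in the
    given inventories (the empty string marks an absent affix)."""
    for i in range(len(word) + 1):
        for j in range(i, len(word) + 1):
            pre, stem, suf = word[:i], word[i:j], word[j:]
            if stem and pre in prefixes and suf in suffixes:
                yield pre, stem, suf

def sample_segmentation(word, prefixes, suffixes, score, rng=random):
    """Draw one segmentation with probability proportional to score(cand)."""
    cands = list(segmentations(word, prefixes, suffixes))
    weights = [score(c) for c in cands]
    r = rng.random() * sum(weights)
    for cand, w in zip(cands, weights):
        r -= w
        if r <= 0:
            return cand
    return cands[-1]
```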
Model | We model prefix and suffix distributions as conditionally dependent on the part-of-speech of the stem morpheme-pair. |
Experimental Setup and Results | The first marker is the part-of-speech tag of the root and the remainder are the overt inflectional and derivational markers of the word. |
Experimental Setup and Results | Instead of using just the surface form of the word, we included the root, part-of-speech and morphological tag information into the corpus as additional factors alongside the surface form. Thus, a token is represented with three factors as Surface | Root | Tags, where Tags are complex tags on the English side and morphological tags on the Turkish side.
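The factored token format amounts to a simple join of the three factors. A sketch, where the Turkish analysis shown is an illustrative example rather than one taken from the paper's corpus:

```python
def factored_token(surface, root, tags):
    """Render a token in the Surface|Root|Tags factored format."""
    return f"{surface}|{root}|{tags}"

# Illustrative example: "evlerde" ("in the houses") analysed as root "ev"
# plus plural and locative markers.
tok = factored_token("evlerde", "ev", "Noun+A3pl+Loc")
```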
Introduction | They have reported that, given the typical complexity of Turkish words, there was a substantial percentage of words whose morphological structure was incorrect: either the morphemes were not applicable for the part-of-speech category of the root word selected, or the morphemes were in the wrong order. |
Related Work | Popovic and Ney (2004) investigated improving translation quality from inflected languages by using stems, suffixes and part-of-speech tags. |
Syntax-to-Morphology Mapping | Part-of-Speech Tags for the English words: +IN - Preposition; +PRP$ - Possessive Pronoun; +JJ - Adjective; +NN - Noun; +NNS - Plural Noun.
Conclusion and Future Work | The key innovation in the present work is the combination of unsupervised part-of-speech tagging and argument identification to permit learning in a simplified SRL system.
Conclusion and Future Work | have the luxury of treating part-of-speech tagging and semantic role labeling as separable tasks. |
Introduction | The first problem involves classifying words by part-of-speech.
Introduction | By using the HMM part-of-speech tagger in this way, we can ask how the simple structural features that we propose children start with stand up to reductions in parsing accuracy. |
Introduction | Similar representations have proven useful in domain-adaptation for part-of-speech tagging and phrase chunking (Huang and Yates, 2009). |
Introduction | Every sentence in the dataset is automatically annotated with a number of NLP pipeline systems, including part-of-speech (POS) tags, phrase chunk labels (Carreras and Marquez, 2003), named-entity tags, and full parse information by multiple parsers. |
Introduction | As with our other HMM-based models, we use the largest number of latent states that will allow the resulting model to fit in our machine’s memory — our previous experiments on representations for part-of-speech tagging suggest that more latent states are usually better. |
Bootstrapping Recursive Patterns | We noticed that despite the specific lexico-syntactic structure of the patterns, erroneous information can be acquired due to part-of-speech tagging errors or flawed facts on the Web. |
Results | wrong part-of-speech tag; none of the above
Results | The majority of the observed errors are due to part-of-speech tagging.
Semantic Relations | In total, we collected 30GB raw data which was part-of-speech tagged and used for the argument and supertype extraction. |
Conditional Random Fields | Our experiments use two standard NLP tasks, phonetization and part-of-speech tagging, chosen here to illustrate two very different situations, and to allow for comparison with results reported elsewhere in the literature. |
Conditional Random Fields | 5.1.2 Part-of-Speech Tagging |
Conditional Random Fields | Our second benchmark is a part-of-speech (POS) tagging task using the PennTreeBank corpus (Marcus et al., 1993), which provides us with a quite different condition. |
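As a generic sketch, not the feature set of the cited work, observation features for a linear-chain CRF POS tagger often follow a template like this:

```python
def token_features(tokens, i):
    """Observation features for position i in a sentence; a common generic
    template (word identity, suffix, shape, and neighboring words)."""
    w = tokens[i]
    return {
        "word.lower": w.lower(),
        "suffix3": w[-3:],
        "is_capitalized": w[0].isupper(),
        "is_digit": w.isdigit(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
    }
```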
Introduction | Based on an efficient implementation of these algorithms, we were able to train very large CRFs containing more than a hundred output labels and up to several billion features, yielding results as good as or better than the best reported results for two NLP benchmarks, text phonetization and part-of-speech tagging.
Experiments | A state-of-the-art, fully-supervised maximum entropy tagger (Clark and Curran, 2007) (which also uses part-of-speech labels) obtains 91.4% on the same train/test split.
Grammar informed initialization for supertagging | Part-of-speech tags are atomic labels that in and of themselves encode no internal structure. |
Introduction | Creating accurate part-of-speech (POS) taggers using a tag dictionary and unlabeled data is an interesting task with practical applications. |
Introduction | Nonetheless, the methods proposed apply to realistic scenarios in which one has an electronic part-of-speech tag dictionary or a handcrafted grammar with limited coverage. |
Features | We employ a separate instance of this feature for each English part-of-speech tag: p( f | e, t). |
Features | link (e, f) if the part-of-speech tag of e is t. The conditional probabilities in this table are computed from our parse trees and the baseline Model 4 alignments. |
Features | These fire for each link (e, f) and part-of-speech tag.
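A count-based sketch of how a table of conditional probabilities p(f | e, t) could be estimated by relative frequency over alignment links; the link triples below are invented examples:

```python
from collections import Counter

def estimate_link_probs(links):
    """Relative-frequency estimate of p(f | e, t) from (e, f, t) links."""
    joint, marginal = Counter(), Counter()
    for e, f, t in links:
        joint[(e, t, f)] += 1
        marginal[(e, t)] += 1
    return {(e, t, f): c / marginal[(e, t)] for (e, t, f), c in joint.items()}

# Invented example links: (English word, foreign word, English POS tag).
links = [("house", "maison", "NN"),
         ("house", "maison", "NN"),
         ("house", "domicile", "NN")]
probs = estimate_link_probs(links)
```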