Abstract | In terms of robustness, we try using different types of external data to increase lexical coverage, and find that simple POS tags have the most effect, increasing coverage on unseen data by up to 45%. |
Abstract | Even using vanilla POS tags we achieve some efficiency gains, but when using detailed lexical types as supertags we manage to halve parsing time with minimal loss of coverage or precision. |
Background | Supertagging is the process of assigning probable ‘supertags’ to words before parsing to restrict parser ambiguity, where a supertag encodes more specific syntactic information than a typical POS tag.
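To make the contrast concrete, here is an illustrative (invented) pair of entries: the POS tag gives only a coarse word class, while the supertag — a CCG category in this sketch — encodes the word's full syntactic environment.

```python
# Invented lexical entries, for illustration only: POS tags vs. supertags.
TAGS = {
    "gave": {"pos": "VBD",                      # coarse class: past-tense verb
             "supertag": "((S\\NP)/NP)/NP"},    # CCG category: ditransitive verb
    "book": {"pos": "NN",                       # coarse class: common noun
             "supertag": "N"},                  # CCG category: bare noun
}

# The supertag already tells the parser "gave" needs a subject and two objects,
# information the bare POS tag VBD does not carry.
print(TAGS["gave"]["supertag"])
```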
Parser Restriction | In these experiments we look at two methods of restricting the parser, first by using POS tags and then using lexical types. |
Parser Restriction | We use TreeTagger (Schmid, 1994) to produce POS tags; open-class words are then restricted if the POS tagger assigned a tag with a probability above a certain threshold.
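A minimal sketch of this restriction scheme, with invented entry names and an assumed open-class tag list: when the tagger is confident enough about an open-class word, the parser's lexicon entries for that word are filtered down to those matching the predicted tag.

```python
# Hypothetical sketch of threshold-based lexicon restriction.
OPEN_CLASS = {"NN", "NNS", "VB", "VBD", "JJ", "RB"}  # assumed open-class tags

def restrict_lexicon(word, tag, tag_prob, lexicon, threshold=0.9):
    """Return the lexical entries the parser may consider for `word`."""
    entries = lexicon.get(word, [])
    if tag in OPEN_CLASS and tag_prob >= threshold:
        # Keep only entries whose coarse POS matches the confident prediction.
        restricted = [e for e in entries if e["pos"] == tag]
        if restricted:  # fall back to the full set if nothing matches
            return restricted
    return entries

# Invented lexicon: "saw" is ambiguous between a verb and a noun reading.
lexicon = {"saw": [{"pos": "VBD", "type": "v_np_trans_le"},
                   {"pos": "NN", "type": "n_-_c_le"}]}
print(restrict_lexicon("saw", "VBD", 0.97, lexicon))
```

Closed-class words and low-confidence predictions leave the lexicon untouched, which is what limits the coverage loss reported in Table 1.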
Parser Restriction | Table 1: Results obtained when restricting the parser lexicon according to the POS tag, where words are restricted according to a threshold of POS probabilities.
Background | The best performing model interpolates a word trigram model with a trigram model that chains a POS model with a supertag model, where the POS model conditions on the previous two POS tags, and the supertag model conditions on the previous two POS tags as well as the current one. |
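The interpolation described above can be sketched as follows; all component distributions here are toy stand-ins, and the function names are invented. The word-trigram model is mixed with a chained model P(t_i | t_{i-2}, t_{i-1}) · P(s_i | t_{i-2}, t_{i-1}, t_i).

```python
# Hedged sketch of the interpolated trigram model described above.
def interpolated_prob(w, t, s, hist, lam, p_word, p_pos, p_stag):
    """hist = (w_{i-2}, w_{i-1}, t_{i-2}, t_{i-1}); lam is the mixing weight."""
    w2, w1, t2, t1 = hist
    # POS model conditions on the previous two POS tags; the supertag model
    # conditions on those two tags plus the current one.
    chained = p_pos(t, (t2, t1)) * p_stag(s, (t2, t1, t))
    return lam * p_word(w, (w2, w1)) + (1.0 - lam) * chained

# Toy distributions, purely for illustration:
p_word = lambda w, h: 0.2
p_pos = lambda t, h: 0.5
p_stag = lambda s, h: 0.4
print(interpolated_prob("runs", "VBZ", "s_intrans",
                        ("the", "dog", "DT", "NN"), 0.6,
                        p_word, p_pos, p_stag))
```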
The Approach | Clark (2002) notes in his parsing experiments that the POS tags of the surrounding words are highly informative. |
The Approach | As discussed below, a significant gain in hypertagging accuracy resulted from including features sensitive to the POS tags of a node’s parent, the node itself, and all of its arguments and modifiers. |
The Approach | Predicting these tags requires the use of a separate POS tagger, which operates in a manner similar to the hypertagger itself, though exploiting a slightly different set of features (e.g., including features corresponding to the four-character prefixes and suffixes of rare logical predication names).
A Generative PCFG Model | The entries in such a lexicon may be thought of as meaningful surface segments paired up with their PoS tags, l_i = (s_i, p_i), but note that a surface segment s_i need not be a space-delimited token.
A Generative PCFG Model | (1996), who consider the kind of probabilities a generative parser should get from a PoS tagger, and conclude that these should be P(w|t) “and nothing fancier”.3 In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context; rather, it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
Model Preliminaries | A Hebrew surface token may have several readings, each corresponding to a sequence of segments and their PoS tags.
Model Preliminaries | We refer to different readings as different analyses, whereby the segments are deterministic given the sequence of PoS tags.
Model Preliminaries | We refer to a segment and its assigned PoS tag as a lexeme, and so analyses are in fact sequences of lexemes. |
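A small sketch of these definitions, using an invented transliterated token and made-up analyses: each analysis is a sequence of lexemes (segment, PoS pairs), and, as stated above, the tag sequence identifies the segmentation uniquely.

```python
# Hypothetical analyzer output for one surface token (invented data).
ANALYSES = {
    "bbit": [  # transliterated surface form, for illustration only
        [("b", "IN"), ("bit", "NN")],               # "in a house"
        [("b", "IN"), ("h", "DT"), ("bit", "NN")],  # "in the house"
        [("bbit", "NN")],                            # single-segment reading
    ]
}

def tag_sequences(token):
    """Each analysis is a sequence of lexemes; its PoS-tag sequence is unique."""
    return [tuple(tag for _, tag in analysis)
            for analysis in ANALYSES.get(token, [])]

print(tag_sequences("bbit"))
```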
Modern Hebrew Structure | Such discrepancies can be aligned via an intermediate level of PoS tags.
Modern Hebrew Structure | PoS tags impose a unique morphological segmentation on surface tokens and present a unique valid yield for syntactic trees. |
Previous Work on Hebrew Processing | Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task. |
Conversion Process | Since we are applying these to CCGbank NP structures rather than the Penn Treebank, the POS tag based heuristics are sufficient to determine heads accurately. |
Conversion Process | Some POS tags require special behaviour. |
Conversion Process | Accordingly, we do not alter tokens with POS tags of DT and PRP$. Instead, their sibling node is given the category N and their parent node is made the head.
Experiments | Table 3: Parsing results with gold-standard POS tags |
Experiments | Table 4: Parsing results with automatic POS tags |
Experiments | We have also experimented with using automatically assigned POS tags.
NER features | Many of these features generalise the head words and/or POS tags that are already part of the feature set. |
NER features | There are already features in the model describing each combination of the children’s head words and POS tags, which we extend to include combinations with
Evaluation | Table 5 shows the result of the disambiguation when we only take into account the POS tag of the unknown tokens. |
Introduction | On the one hand, this tagset is much larger than the largest tagset used in English (from 17 tags in most unsupervised POS tagging experiments, to the 46 tags of the WSJ corpus and the roughly 150 tags of the LOB corpus).
Introduction | On average, each token in the 42M corpus is given 2.7 possible analyses by the analyzer (much higher than the average 1.41 POS tag ambiguity reported in English (Dermatas and Kokkinakis, 1995)). |
Previous Work | At the word level, a segmented word is attached to a POS, where the character model is based on the observed characters and their classification: Begin of word (B), Inside a word (I), End of word (E), or a single-character word (S). They apply Baum-Welch training over a segmented corpus, where the segmentation of each word and its character classification are observed, and the POS tagging is ambiguous.
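The character classification just described can be sketched in a few lines; this is a generic B/I/E/S labeler, not the cited authors' implementation.

```python
# Sketch of B/I/E/S character classification for a segmented word.
def classify_chars(word):
    if len(word) == 1:
        return [(word, "S")]                       # the character is a word itself
    return ([(word[0], "B")] +                     # Begin of word
            [(c, "I") for c in word[1:-1]] +       # Inside a word
            [(word[-1], "E")])                     # End of word

print(classify_chars("cat"))  # [('c', 'B'), ('a', 'I'), ('t', 'E')]
print(classify_chars("a"))    # [('a', 'S')]
```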
Previous Work | The segmentation (of all words in a given sentence) and the POS tagging (of the known words) are based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters.
Previous Work | They report a very slight improvement on Hebrew and Arabic supervised POS taggers.
Introduction | In one of the first efforts to enrich the source in word-based SMT, Ueffing and Ney (2003) used part-of-speech (POS) tags in order to deal with the verb conjugation of Spanish and Catalan; POS tags were used to identify the pronoun+verb sequence and splice these two words into one term.
Introduction | In their presentation of the factored SMT models, Koehn and Hoang (2007) describe experiments for translating from English to German, Spanish and Czech, using morphology tags added on the morphologically rich side, along with POS tags.
Methods for enriching input | The POS tag of this noun is then used to identify if it is plural or singular. |
Methods for enriching input | The word “aspects” is found, which has a POS tag that shows it is a plural noun. |
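A minimal sketch of this number check, assuming Penn Treebank tags (the function name is invented): the tag assigned to the noun directly decides singular vs. plural.

```python
# Hypothetical sketch: read grammatical number off a Penn Treebank POS tag.
def noun_number(tag):
    if tag in ("NNS", "NNPS"):   # plural common / proper noun
        return "plural"
    if tag in ("NN", "NNP"):     # singular common / proper noun
        return "singular"
    return "unknown"

print(noun_number("NNS"))  # e.g. "aspects" is tagged NNS, hence plural
```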
Setting of the experiment | A decision tree (C4.5, Release 8) is used to detect false starts, trained on the POS tags and trigger-word status of the first and last four words of sentences from a training set. |
Setting of the experiment | For question identification (both WH- and Yes/No), another C4.5 classifier was trained on 2,000 manually annotated sentences using utterance length, POS bigram occurrences, and the POS tags and trigger-word status of the first and last five words of an utterance.
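The edge features feeding these classifiers can be sketched as follows; the trigger-word list and feature names are invented, and the window size k is a parameter (4 for the false-start detector above, 5 for question identification).

```python
# Hypothetical sketch of the edge-of-sentence features described above:
# POS tags and trigger-word status of the first and last k words.
TRIGGERS = {"um", "uh", "i", "so"}  # invented trigger-word list

def edge_features(words, tags, k=4):
    feats = {}
    for i in range(k):
        in_range = i < len(words)
        feats[f"first_{i}_pos"] = tags[i] if in_range else "<pad>"
        feats[f"first_{i}_trigger"] = in_range and words[i].lower() in TRIGGERS
        feats[f"last_{i}_pos"] = tags[-(i + 1)] if in_range else "<pad>"
        feats[f"last_{i}_trigger"] = in_range and words[-(i + 1)].lower() in TRIGGERS
    return feats

print(edge_features(["so", "we", "went"], ["RB", "PRP", "VBD"]))
```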
Setting of the experiment | Taking ASR transcripts as input, we use the Brill tagger (Brill, 1995) to assign POS tags to each word. |