Index of papers in Proc. ACL 2008 that mention
  • POS tags
Dridan, Rebecca and Kordoni, Valia and Nicholson, Jeremy
Abstract
In terms of robustness, we try using different types of external data to increase lexical coverage, and find that simple POS tags have the most effect, increasing coverage on unseen data by up to 45%.
Abstract
Even using vanilla POS tags we achieve some efficiency gains, but when using detailed lexical types as supertags we manage to halve parsing time with minimal loss of coverage or precision.
Background
Supertagging is the process of assigning probable ‘supertags’ to words before parsing to restrict parser ambiguity, where a supertag is a tag that includes more specific information than typical POS tags.
Parser Restriction
In these experiments we look at two methods of restricting the parser, first by using POS tags and then using lexical types.
Parser Restriction
We use TreeTagger (Schmid, 1994) to produce POS tags; open-class words are then restricted if the POS tagger assigned a tag with a probability over a certain threshold.
Parser Restriction
Table 1: Results obtained when restricting the parser lexicon according to the POS tag, where words are restricted according to a threshold on the POS tag probability.
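The thresholded restriction summarised in the table can be sketched as follows. This is a hypothetical illustration with an invented lexicon interface, not the authors' implementation; the open-class tag set and threshold are assumptions.

```python
# Sketch of POS-threshold lexicon restriction (hypothetical, not the
# authors' code). For each open-class word, if the tagger's best tag
# has probability above the threshold, restrict the parser's lexicon
# for that word to entries compatible with that tag.

OPEN_CLASS = {"NN", "NNS", "VB", "VBD", "JJ", "RB"}  # illustrative set

def restrict_lexicon(tagged_sentence, lexicon, threshold=0.9):
    """tagged_sentence: list of (word, tag, prob) from a POS tagger.
    lexicon: dict word -> list of (entry, tag) candidate lexical entries.
    Returns a per-word list of the lexical entries the parser may use."""
    restricted = []
    for word, tag, prob in tagged_sentence:
        entries = lexicon.get(word, [])
        if tag in OPEN_CLASS and prob > threshold:
            # keep only entries whose tag matches the confident POS tag
            entries = [(e, t) for e, t in entries if t == tag]
        restricted.append((word, entries))
    return restricted
```

Below the threshold, the word keeps its full set of lexical entries, which is how coverage is preserved at the cost of less pruning.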
POS tags is mentioned in 48 sentences in this paper.
Espinosa, Dominic and White, Michael and Mehay, Dennis
Background
The best performing model interpolates a word trigram model with a trigram model that chains a POS model with a supertag model, where the POS model conditions on the previous two POS tags, and the supertag model conditions on the previous two POS tags as well as the current one.
The Approach
Clark (2002) notes in his parsing experiments that the POS tags of the surrounding words are highly informative.
The Approach
As discussed below, a significant gain in hypertagging accuracy resulted from including features sensitive to the POS tags of a node’s parent, the node itself, and all of its arguments and modifiers.
The Approach
Predicting these tags requires the use of a separate POS tagger, which operates in a manner similar to the hypertagger itself, though exploiting a slightly different set of features (e.g., including features corresponding to the four-character prefixes and suffixes of rare logical predication names).
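The affix features mentioned above might look roughly like this; the feature names and fixed affix length are assumptions for illustration, and the paper's actual feature set is richer.

```python
# Sketch of four-character prefix/suffix features for a rare logical
# predication name (hypothetical feature names, not the paper's exact set).

def affix_features(pred_name, n=4):
    """Return prefix/suffix features of length n for a predicate name."""
    return {
        "prefix4": pred_name[:n],
        "suffix4": pred_name[-n:],
    }
```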
POS tags is mentioned in 13 sentences in this paper.
Goldberg, Yoav and Tsarfaty, Reut
A Generative PCFG Model
The entries in such a lexicon may be thought of as meaningful surface segments paired up with their PoS tags, l_i = (s_i, p_i), but note that a surface segment s_i need not be a space-delimited token.
A Generative PCFG Model
(1996) who consider the kind of probabilities a generative parser should get from a PoS tagger, and conclude that these should be P(w|t) “and nothing fancier”.3 In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context; rather, it is used as a common denominator for state indexation of all segmentation possibilities of a surface form.
Model Preliminaries
A Hebrew surface token may have several readings, each of which corresponds to a sequence of segments and their PoS tags.
Model Preliminaries
We refer to different readings as different analyses whereby the segments are deterministic given the sequence of PoS tags .
Model Preliminaries
We refer to a segment and its assigned PoS tag as a lexeme, and so analyses are in fact sequences of lexemes.
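A lexeme as defined above pairs a segment with its assigned PoS tag, so an analysis can be represented as a plain sequence of such pairs. The sketch below uses invented transliterated Hebrew data for illustration; the tags and transliteration are assumptions, not the paper's notation.

```python
# Sketch: an analysis is a sequence of lexemes, each pairing a surface
# segment with its PoS tag. Since the segmentation is deterministic
# given the tag sequence, it can be read off the lexeme sequence.
from typing import List, Tuple

Lexeme = Tuple[str, str]  # (segment, PoS tag)

def segments(analysis: List[Lexeme]) -> List[str]:
    """Recover the segmentation from a sequence of lexemes."""
    return [seg for seg, _tag in analysis]

# Illustrative analysis of a prefixed Hebrew token (transliterated):
# preposition + covert definite article + noun.
analysis = [("b", "IN"), ("h", "DT"), ("bit", "NN")]
```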
Modern Hebrew Structure
Such discrepancies can be aligned via an intermediate level of PoS tags.
Modern Hebrew Structure
PoS tags impose a unique morphological segmentation on surface tokens and present a unique valid yield for syntactic trees.
Previous Work on Hebrew Processing
Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
POS tags is mentioned in 17 sentences in this paper.
Vadas, David and Curran, James R.
Conversion Process
Since we are applying these to CCGbank NP structures rather than the Penn Treebank, the POS-tag-based heuristics are sufficient to determine heads accurately.
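A minimal POS-tag-based head heuristic of the kind referred to above might look like the following. This is a common simplification (rightmost noun-tagged child wins), not the authors' exact rule set.

```python
# Sketch of a POS-tag-based NP head heuristic (illustrative only):
# pick the rightmost child whose tag is a noun tag, else fall back
# to the rightmost child.

NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def np_head(children):
    """children: list of (token, pos_tag) pairs. Returns the head index."""
    for i in range(len(children) - 1, -1, -1):
        if children[i][1] in NOUN_TAGS:
            return i
    return len(children) - 1  # no noun tag found: rightmost child
```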
Conversion Process
Some POS tags require special behaviour.
Conversion Process
Accordingly, we do not alter tokens with POS tags of DT and PRP$. Instead, their sibling node is given the category N and their parent node is made the head.
Experiments
Table 3: Parsing results with gold-standard POS tags
Experiments
Table 4: Parsing results with automatic POS tags
Experiments
We have also experimented with using automatically assigned POS tags.
NER features
Many of these features generalise the head words and/or POS tags that are already part of the feature set.
NER features
There are already features in the model describing each combination of the children’s head words and POS tags, which we extend to include combinations with
POS tags is mentioned in 9 sentences in this paper.
Adler, Meni and Goldberg, Yoav and Gabay, David and Elhadad, Michael
Evaluation
Table 5 shows the result of the disambiguation when we only take into account the POS tag of the unknown tokens.
Introduction
On the one hand, this tagset is much larger than the largest tagset used in English (from 17 tags in most unsupervised POS tagging experiments, to the 46 tags of the WSJ corpus and the roughly 150 tags of the LOB corpus).
Introduction
On average, each token in the 42M corpus is given 2.7 possible analyses by the analyzer (much higher than the average 1.41 POS tag ambiguity reported in English (Dermatas and Kokkinakis, 1995)).
Previous Work
At the word level, a segmented word is attached to a POS, where the character model is based on the observed characters and their classification: B (begin of word), I (inside a word), E (end of word), or S (the character is a word by itself). They apply Baum-Welch training over a segmented corpus, where the segmentation of each word and its character classification are observed, and the POS tagging is ambiguous.
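The character classification described above is a BIES-style scheme. Deriving the labels from a given word segmentation can be sketched as follows (a generic illustration, not the paper's training code):

```python
# Sketch: derive BIES character labels (Begin / Inside / End / Single)
# from a word segmentation, one label per character.

def char_classes(segments):
    """segments: list of words; returns a B/I/E/S label per character."""
    labels = []
    for seg in segments:
        if len(seg) == 1:
            labels.append("S")  # single-character word
        else:
            labels.extend(["B"] + ["I"] * (len(seg) - 2) + ["E"])
    return labels
```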
Previous Work
(of all words in a given sentence) and the POS tagging (of the known words) is based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters.
Previous Work
They report a very slight improvement on Hebrew and Arabic supervised POS taggers.
POS tags is mentioned in 6 sentences in this paper.
Avramidis, Eleftherios and Koehn, Philipp
Introduction
In one of the first efforts to enrich the source side in word-based SMT, Ueffing and Ney (2003) used part-of-speech (POS) tags to deal with the verb conjugation of Spanish and Catalan: POS tags were used to identify the pronoun+verb sequence and splice these two words into one term.
Introduction
In their presentation of the factored SMT models, Koehn and Hoang (2007) describe experiments for translating from English to German, Spanish and Czech, using morphology tags added on the morphologically rich side, along with POS tags.
Methods for enriching input
The POS tag of this noun is then used to identify if it is plural or singular.
Methods for enriching input
The word “aspects” is found, which has a POS tag that shows it is a plural noun.
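The number check described above reduces to inspecting the Penn Treebank tag: NNS/NNPS mark plural nouns, NN/NNP singular ones. A minimal sketch:

```python
# Sketch: read grammatical number off a Penn Treebank noun tag,
# as in the "aspects"/NNS example above.

def noun_number(pos_tag):
    """Return 'plural', 'singular', or 'unknown' for a POS tag."""
    if pos_tag in ("NNS", "NNPS"):
        return "plural"
    if pos_tag in ("NN", "NNP"):
        return "singular"
    return "unknown"
```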
POS tags is mentioned in 4 sentences in this paper.
Penn, Gerald and Zhu, Xiaodan
Setting of the experiment
A decision tree (C4.5, Release 8) is used to detect false starts, trained on the POS tags and trigger-word status of the first and last four words of sentences from a training set.
Setting of the experiment
For question identification (both WH- and Yes/No), another C4.5 classifier was trained on 2,000 manually annotated sentences using utterance length, POS bigram occurrences, and the POS tags and trigger-word status of the first and last five words of an utterance.
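The windowed features described above (utterance length plus POS tags and trigger-word status of the first and last k words) can be sketched as below. The feature names, padding convention, and trigger-word set are assumptions for illustration, not the authors' exact configuration.

```python
# Sketch: extract decision-tree features from an utterance, using the
# POS tags and trigger-word status of the first and last k words
# (hypothetical feature names; "PAD" marks positions past the utterance).

def window_features(tags, trigger_words, words, k=5):
    """tags/words: parallel lists for one utterance; returns a feature dict."""
    feats = {"length": len(words)}
    for i in range(k):
        feats[f"first_{i}_tag"] = tags[i] if i < len(tags) else "PAD"
        feats[f"last_{i}_tag"] = tags[-(i + 1)] if i < len(tags) else "PAD"
        feats[f"first_{i}_trig"] = (i < len(words) and words[i] in trigger_words)
    return feats
```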
Setting of the experiment
Taking ASR transcripts as input, we use the Brill tagger (Brill, 1995) to assign POS tags to each word.
POS tags is mentioned in 3 sentences in this paper.