Index of papers in Proc. ACL 2011 that mention
  • part-of-speech
Zollmann, Andreas and Vogel, Stephan
Abstract
In this work we propose methods to label probabilistic synchronous context-free grammar (PSCFG) rules using only word tags, generated by either part-of-speech analysis or unsupervised word class induction.
Experiments
This is due to the fact that for the source-tag based approach, a given chart cell in the CYK decoder, represented by a start and end position in the source sentence, almost uniquely determines the nonterminal any hypothesis in this cell can have: Disregarding part-of-speech tag ambiguity and phrase size accounting, that nonterminal will be the composition of the tags of the start and end source words spanned by that cell.
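A minimal sketch of that observation, assuming one tag per source word and ignoring phrase-size accounting (the helper name and the "+" composition are illustrative, not the paper's exact label scheme):

```python
def cell_label(source_tags, start, end):
    """Compose a nonterminal label for the chart cell spanning source
    positions [start, end) from the tags of the first and last words it
    covers (phrase-size accounting omitted)."""
    return source_tags[start] + "+" + source_tags[end - 1]

# With unambiguous tags, every hypothesis in the cell spanning
# positions 1..3 (0-indexed, i.e. [1, 4)) receives the same label.
tags = ["PRP", "VBZ", "DT", "NN"]
print(cell_label(tags, 1, 4))  # "VBZ+NN"
```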
Experiments
K-means clustering based models: To establish suitable values for the model parameters and investigate the impact of the number of clusters, we looked at the development performance over various parameter combinations for a K-means model based on source and/or target part-of-speech tags. As can be seen from Figure 1 (right), our method reaches its peak performance at around 50 clusters and then levels off slightly.
Hard rule labeling from word classes
Extension to a bilingually tagged corpus While the availability of syntactic annotations for both source and target language is unlikely in most translation scenarios, some form of word tags, be it part-of-speech tags or learned word clusters (cf.
Hard rule labeling from word classes
Consider again our example sentence pair (now also annotated with source-side part-of-speech tags):
Introduction
In this work, we propose a labeling approach that is based merely on part-of-speech analysis of the source or target language (or even both).
part-of-speech is mentioned in 6 sentences in this paper.
Nederhof, Mark-Jan and Satta, Giorgio
Discussion
Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation of speech, or alternatively as a predictive tool in applications of interactive machine translation, of the kind described by Foster et al.
Discussion
However, one may also compute the probability that the next part-of-speech in the target translation is A.
Discussion
This can be realised by adding, for each rule [B → b, A → a] from the source grammar, a rule [B → b, A → c_A], where A is a nonterminal representing a part-of-speech and c_A is a (pre-)terminal specific to A.
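A rough sketch of the grammar extension just described; the rule representation and the reuse of the original rule probability are assumptions:

```python
from collections import namedtuple

# A synchronous rule [B -> b, A -> a]: source side B -> b, target side A -> a.
Rule = namedtuple("Rule", ["src_lhs", "src_rhs", "tgt_lhs", "tgt_rhs", "prob"])

def add_pos_prediction_rules(grammar, pos_nonterminals):
    """For every rule whose target left-hand side A is a part-of-speech
    nonterminal, add a copy that rewrites A to a dedicated (pre-)terminal
    c_A.  Prefix probabilities over the extended grammar can then be used
    to score the event that the next target symbol is the part-of-speech A."""
    extended = list(grammar)
    for rule in grammar:
        if rule.tgt_lhs in pos_nonterminals:
            marker = "c_" + rule.tgt_lhs          # the (pre-)terminal c_A
            extended.append(rule._replace(tgt_rhs=(marker,), prob=rule.prob))
    return extended
```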
Introduction
Prefix probabilities can be used to compute probability distributions for the next word or part-of-speech.
Introduction
Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation, essentially in the same way as described by Jelinek and Lafferty (1991) for probabilistic context-free grammars, as discussed later in this paper.
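In the spirit of Jelinek and Lafferty (1991), the next-symbol distribution follows from prefix probabilities as a simple ratio; the notation below is assumed for illustration, not quoted from the paper:

```latex
% Pref(u) = probability that the grammar generates a string with prefix u.
P(w_{k+1} = w \mid w_1 \cdots w_k)
  = \frac{\mathrm{Pref}(w_1 \cdots w_k\, w)}{\mathrm{Pref}(w_1 \cdots w_k)}
```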
part-of-speech is mentioned in 5 sentences in this paper.
Blunsom, Phil and Cohn, Trevor
Background
Past research in unsupervised PoS induction has largely been driven by two different motivations: a task based perspective which has focussed on inducing word classes to improve various applications, and a linguistic perspective where the aim is to induce classes which correspond closely to annotated part-of-speech corpora.
Background
The HMM ignores orthographic information, which is often highly indicative of a word’s part-of-speech, particularly so in morphologically rich languages.
Introduction
Unsupervised part-of-speech (PoS) induction has long been a central challenge in computational linguistics, with applications in human language learning and for developing portable language processing systems.
The PYP-HMM
In many languages morphological regularities correlate strongly with a word’s part-of-speech (e.g., suffixes in English), which we hope to capture using a basic character language model.
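As a loose, hypothetical illustration of why orthography is informative (this is not the paper's PYP character language model), even a smoothed character-bigram model trained on a handful of words separates suffix patterns such as English "-ing":

```python
from collections import defaultdict
import math

def train_char_bigram(words):
    """Character bigram counts with word-boundary markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        chars = ["<w>"] + list(w) + ["</w>"]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def logprob(word, counts, alpha=1.0, vocab=60):
    """Add-alpha smoothed log-probability of a word; vocab is an assumed
    character-alphabet size used only for smoothing."""
    chars = ["<w>"] + list(word) + ["</w>"]
    lp = 0.0
    for a, b in zip(chars, chars[1:]):
        total = sum(counts[a].values())
        lp += math.log((counts[a][b] + alpha) / (total + alpha * vocab))
    return lp

# A word ending in "-ing" scores higher under a model trained on gerunds
# than under one trained on, e.g., plural nouns.
gerunds = train_char_bigram(["running", "eating", "going", "taking"])
nouns = train_char_bigram(["cats", "dogs", "tables", "ideas"])
print(logprob("sleeping", gerunds) > logprob("sleeping", nouns))  # True
```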
part-of-speech is mentioned in 4 sentences in this paper.
Das, Dipanjan and Petrov, Slav
Conclusion
We have shown the efficacy of graph-based label propagation for projecting part-of-speech information across languages.
Experiments and Results
6.2 Part-of-Speech Tagset and HMM States
Experiments and Results
While there might be some controversy about the exact definition of such a tagset, these 12 categories cover the most frequent part-of-speech and exist in one form or another in all of the languages that we studied.
Introduction
To make the projection practical, we rely on the twelve universal part-of-speech tags of Petrov et al.
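For reference, the twelve universal tags are commonly listed as below; the fine-to-coarse lookup is an illustrative sketch, and the Penn Treebank examples are assumptions rather than the paper's mapping tables:

```python
# The twelve universal part-of-speech tags of Petrov et al.
UNIVERSAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
                  "ADP", "NUM", "CONJ", "PRT", ".", "X"}

# Illustrative fine-grained-to-universal pairs for a few Penn Treebank tags.
PTB_TO_UNIVERSAL = {"NN": "NOUN", "NNS": "NOUN", "VBZ": "VERB",
                    "JJ": "ADJ", "RB": "ADV", "IN": "ADP", "CD": "NUM"}

def to_universal(fine_tag):
    """Map a treebank-specific tag to its universal category ('X' if unknown)."""
    return PTB_TO_UNIVERSAL.get(fine_tag, "X")
```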
part-of-speech is mentioned in 4 sentences in this paper.
Zhou, Guangyou and Zhao, Jun and Liu, Kang and Cai, Li
Experiments
The part-of-speech tags for the development and test set were automatically assigned by the MXPOST tagger, where the tagger was trained on the entire training corpus.
Related Work
(2010) created robust supervised classifiers via web-scale N-gram data for adjective ordering, spelling correction, noun compound bracketing and verb part-of-speech disambiguation.
Web-Derived Selectional Preference Features
In this paper, we employ two different feature sets: a baseline feature set, which draws upon “normal” information sources such as word forms and part-of-speech (POS) without including the web-derived selectional preference features, and a feature set that conjoins the baseline features and the web-derived selectional preference features.
Web-Derived Selectional Preference Features
is any token whose part-of-speech is IN
part-of-speech is mentioned in 4 sentences in this paper.
Auli, Michael and Lopez, Adam
Conclusion and Future Work
Such diverse topics as machine translation (Dyer et al., 2008; Dyer and Resnik, 2010; Mi et al., 2008), part-of-speech tagging (Jiang et al., 2008), named entity recognition (Finkel and Manning, 2009), semantic role labelling (Sutton and McCallum, 2005; Finkel et al., 2006), and others have also been improved by combined models.
Experiments
We supply gold-standard part-of-speech tags to the parsers.
Experiments
Next, we evaluate performance when using automatic part-of-speech tags as input to our parser.
part-of-speech is mentioned in 3 sentences in this paper.
Bendersky, Michael and Croft, W. Bruce and Smith, David A.
Independent Query Annotations
(2010) we use a large N-gram corpus (Brants and Franz, 2006) to estimate the probabilities used for annotating the query with capitalization and segmentation markup, and a standard POS tagger for part-of-speech tagging of the query.
Introduction
Automatic markup of textual documents with linguistic annotations such as part-of-speech tags, sentence constituents, named entities, or semantic roles is a common practice in natural language processing (NLP).
Related Work
The literature on query annotation includes query segmentation (Bergsma and Wang, 2007; Jones et al., 2006; Guo et al., 2008; Hagen et al., 2010; Hagen et al., 2011; Tan and Peng, 2008), part-of-speech and semantic tagging (Barr et al., 2008; Manshadi and Li, 2009; Li, 2010), named-entity recognition (Guo et al., 2009; Lu et al., 2009; Shen et al., 2008; Pasca, 2007), abbreviation disambiguation (Wei et al., 2008) and stopword detection (Lo et al., 2005; Jones and Fain, 2003).
part-of-speech is mentioned in 3 sentences in this paper.
Bodenstab, Nathan and Dunlop, Aaron and Hall, Keith and Roark, Brian
Introduction
Prior work incorporating parse structure into machine translation (Chiang, 2010) and Semantic Role Labeling (Tsai et al., 2005; Punyakanok et al., 2008) indicate that such hierarchical structure can have great benefit over shallow labeling techniques like chunking and part-of-speech tagging.
Open/Closed Cell Classification
The feature vector x is encoded with the chart cell’s absolute and relative span width, as well as unigram and bigram lexical and part-of-speech tag items from w_{i-1}.
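A rough sketch of such an encoding (feature names, the word window, and the sparse-dictionary representation are assumptions, not the paper's exact specification):

```python
def cell_features(words, tags, start, end, sentence_length):
    """Sparse features for the chart cell spanning [start, end):
    absolute and relative span width plus unigram lexical and
    part-of-speech items around the cell boundaries and one tag bigram."""
    span = end - start
    feats = {
        "span_abs=%d" % span: 1.0,
        "span_rel=%.2f" % (span / sentence_length): 1.0,
    }
    for pos, name in ((start - 1, "left"), (start, "first"), (end - 1, "last")):
        if 0 <= pos < len(words):
            feats["%s_word=%s" % (name, words[pos])] = 1.0
            feats["%s_tag=%s" % (name, tags[pos])] = 1.0
    # Boundary tag bigram spanning the left edge of the cell.
    if start >= 1:
        feats["tag_bigram=%s_%s" % (tags[start - 1], tags[start])] = 1.0
    return feats
```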
Results
Figure 4 contains a timing comparison of the three components of our final parser: Boundary FOM initialization (which includes the forward-backward algorithm over ambiguous part-of-speech tags), beam-
part-of-speech is mentioned in 3 sentences in this paper.
Chen, Harr and Benson, Edward and Naseem, Tahira and Barzilay, Regina
Experimental Setup
This mapping technique is based on the many-to-one scheme used for evaluating unsupervised part-of-speech induction (Johnson, 2007).
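A minimal sketch of many-to-one accuracy as it is usually computed for unsupervised tag induction (each induced cluster is mapped to the gold label it co-occurs with most often); the helper below is hypothetical:

```python
from collections import Counter, defaultdict

def many_to_one_accuracy(induced, gold):
    """Map every induced cluster to its most frequent gold label,
    then score the induced sequence against the gold sequence."""
    cooc = defaultdict(Counter)
    for c, g in zip(induced, gold):
        cooc[c][g] += 1
    mapping = {c: counts.most_common(1)[0][0] for c, counts in cooc.items()}
    correct = sum(mapping[c] == g for c, g in zip(induced, gold))
    return correct / len(gold)

print(many_to_one_accuracy([0, 0, 1, 2, 1], ["DT", "DT", "NN", "VB", "NN"]))  # 1.0
```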
Experimental Setup
We implement a clustering baseline using the CLUTO toolkit with word and part-of-speech features.
Model
These features can encode words, part-of-speech tags, context, and so on.
part-of-speech is mentioned in 3 sentences in this paper.
Habash, Nizar and Roth, Ryan
Arabic Handwriting Recognition Challenges
In this paper we consider the value of morpho-lexical and morpho-syntactic features such as lemmas and part-of-speech tags, respectively, that may allow machine learning algorithms to learn generalizations.
Experimental Settings
Ya and digit normalization; pos: The part-of-speech (POS) of the word; lem: The lemma of the word
Related Work
part-of-speech tags, which they do not use, but suggest may help.
part-of-speech is mentioned in 3 sentences in this paper.
Iida, Ryu and Poesio, Massimo
Introduction
base phrases in Japanese) whose head part-of-speech was automatically tagged by the Japanese morphological analyser ChaSen as either ‘noun’ or ‘unknown word’ according to the NAIST-jdic dictionary.
Introduction
POS / LEMMA / DEP_LABEL: part-of-speech / lemma / dependency label of the predicate which has ZERO.
Introduction
D_POS / D_LEMMA / D_DEP_LABEL: part-of-speech / lemma / dependency label of the dependents of the predicate which has ZERO.
part-of-speech is mentioned in 3 sentences in this paper.
Lang, Joel and Lapata, Mirella
Split-Merge Role Induction
When the part-of-speech similarity (pos) is below a certain threshold β or when clause-level constraints (cons) are satisfied to a lesser extent than threshold γ, the score takes value zero and the merge is ruled out.
Split-Merge Role Induction
Part-of-speech Similarity: Part-of-speech similarity is also measured through cosine similarity (equation (3)).
Split-Merge Role Induction
Clusters are again represented as vectors x and y whose components correspond to argument part-of-speech tags and values to their occurrence frequency.
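A small sketch of the merge gating described in these excerpts, with part-of-speech vectors built from argument tag frequencies and compared by cosine similarity; function names and the handling of the base score are assumptions:

```python
import math
from collections import Counter

def pos_vector(argument_tags):
    """Frequency vector over argument part-of-speech tags for one cluster."""
    return Counter(argument_tags)

def cosine(x, y):
    """Cosine similarity of two sparse frequency vectors."""
    dot = sum(x[k] * y[k] for k in x if k in y)
    nx = math.sqrt(sum(v * v for v in x.values()))
    ny = math.sqrt(sum(v * v for v in y.values()))
    return dot / (nx * ny) if nx and ny else 0.0

def merge_score(base_score, pos_sim, cons_sat, beta, gamma):
    """Rule the merge out (score 0) when the part-of-speech similarity falls
    below beta or the clause-level constraints are satisfied less than gamma."""
    if pos_sim < beta or cons_sat < gamma:
        return 0.0
    return base_score
```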
part-of-speech is mentioned in 3 sentences in this paper.
Lee, John and Naradowsky, Jason and Smith, David A.
Introduction
Morphological taggers disambiguate morphological attributes such as part-of-speech (POS) or case, without taking syntax into account (Hajič et al., 2001); dependency parsers commonly assume the “pipeline” approach, relying on morphological information as part of the input (Buchholz and Marsi, 2006; Nivre et al., 2007).
Previous Work
Each of the resulting morphemes is then tagged with an atomic “part-of-speech” to indicate word class and some morphological features.
Previous Work
According to the Latin morphological database encoded in MORPHEUS (Crane, 1991), 30% of Latin nouns can be parsed as another part-of-speech, and on average each has 3.8 possible morphological interpretations.
part-of-speech is mentioned in 3 sentences in this paper.