Abstract | In this work we propose methods to label probabilistic synchronous context-free grammar (PSCFG) rules using only word tags, generated by either part-of-speech analysis or unsupervised word class induction. |
Experiments | This is because, for the source-tag-based approach, a given chart cell in the CYK decoder, represented by a start and end position in the source sentence, almost uniquely determines the nonterminal any hypothesis in this cell can have: disregarding part-of-speech tag ambiguity and phrase size accounting, that nonterminal will be the composition of the tags of the start and end source words spanned by that cell.
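Experiments | The cell labeling just described can be sketched in a few lines: a chart cell spanning positions i..j is labeled by composing the tags of its boundary source words. This is a hedged illustration only; the function name and the "+" composition operator are assumptions, not the paper's notation.

```python
# Illustrative sketch: label a CYK chart cell by composing the
# part-of-speech tags of the first and last source words it spans.

def cell_label(source_tags, i, j):
    """Label for the chart cell spanning source positions i..j (inclusive)."""
    return source_tags[i] + "+" + source_tags[j]

tags = ["DT", "NN", "VBZ", "JJ"]  # tags of a toy source sentence
print(cell_label(tags, 1, 3))     # NN+JJ
```

This also makes the "almost uniquely determines" point concrete: the label depends only on the span boundaries and the (mostly unambiguous) tags, not on the hypothesis inside the cell.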
Experiments | K-means clustering based models: To establish suitable values for the parameters and investigate the impact of the number of clusters, we looked at the development performance over various parameter combinations for a K-means model based on source and/or target part-of-speech tags. As can be seen from Figure 1 (right), our method reaches its peak performance at around 50 clusters and then levels off slightly.
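Experiments | As a toy sketch of the clustering step, the label vectors can be grouped with a plain Euclidean K-means. Everything here is illustrative (pure-Python implementation, synthetic two-group data); the paper's actual features and distance are not reproduced.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Tiny Euclidean K-means over feature vectors (illustrative only)."""
    rnd = random.Random(seed)
    centers = rnd.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            c = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[c].append(p)
        # recompute centroids; keep the old center if a cluster went empty
        centers = [
            [sum(xs) / len(xs) for xs in zip(*cl)] if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    assign = [min(range(k),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
              for p in points]
    return centers, assign

# two obvious groups of toy "label vectors"
pts = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 4.9)]
centers, assign = kmeans(pts, 2)
print(assign)  # points 0,1 share one cluster; points 2,3 the other
```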
Hard rule labeling from word classes | Extension to a bilingually tagged corpus While the availability of syntactic annotations for both source and target language is unlikely in most translation scenarios, some form of word tags, be it part-of-speech tags or learned word clusters (cf. |
Hard rule labeling from word classes | Consider again our example sentence pair (now also annotated with source-side part-of-speech tags): |
Introduction | In this work, we propose a labeling approach that is based merely on part-of-speech analysis of the source or target language (or even both). |
Discussion | Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation of speech, or alternatively as a predictive tool in applications of interactive machine translation, of the kind described by Foster et al. |
Discussion | However, one may also compute the probability that the next part-of-speech in the target translation is A. |
Discussion | This can be realised by adding a rule r′ : [B → b, A → c_A] for each rule r : [B → b, A → a] from the source grammar, where A is a nonterminal representing a part-of-speech and c_A is a (pre-)terminal specific to A.
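Discussion | The rule-augmentation step can be sketched for just the target side of the grammar (a real PSCFG rule also carries a synchronized source side; the dictionary representation and the "c_" naming convention below are assumptions made for illustration):

```python
# Sketch: for each rule whose target left-hand side is the part-of-speech
# nonterminal A, add a variant whose target side is the A-specific
# preterminal c_A, carrying the same probability. Parsing a prefix
# followed by c_A then yields the probability that the next
# part-of-speech is A.

def add_pos_prediction_rules(rules):
    """rules: dict mapping (A, tgt_rhs) -> prob. Returns augmented dict."""
    augmented = dict(rules)
    for (A, tgt_rhs), prob in rules.items():
        augmented[(A, ("c_" + A,))] = prob  # preterminal specific to A
    return augmented

g = {("NN", ("dog",)): 0.4, ("VB", ("runs",)): 0.6}
g2 = add_pos_prediction_rules(g)
print(("NN", ("c_NN",)) in g2)  # True
```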
Introduction | Prefix probabilities can be used to compute probability distributions for the next word or part-of-speech.
Introduction | Prefix probabilities and right prefix probabilities for PSCFGs can be exploited to compute probability distributions for the next word or part-of-speech in left-to-right incremental translation, essentially in the same way as described by Jelinek and Lafferty (1991) for probabilistic context-free grammars, as discussed later in this paper. |
Background | Past research in unsupervised PoS induction has largely been driven by two different motivations: a task-based perspective which has focussed on inducing word classes to improve various applications, and a linguistic perspective where the aim is to induce classes which correspond closely to annotated part-of-speech corpora.
Background | The HMM ignores orthographic information, which is often highly indicative of a word’s part-of-speech, particularly so in morphologically rich languages.
Introduction | Unsupervised part-of-speech (PoS) induction has long been a central challenge in computational linguistics, with applications in human language learning and for developing portable language processing systems.
The PYP-HMM | In many languages morphological regularities correlate strongly with a word’s part-of-speech (e.g., suffixes in English), which we hope to capture using a basic character language model. |
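The PYP-HMM | A minimal stand-in for this idea: suffixes alone already predict part-of-speech surprisingly well. The sketch below uses raw suffix counts rather than the character language model the text describes; data, suffix length, and function names are all illustrative.

```python
from collections import Counter, defaultdict

def train_suffix_model(tagged, n=2):
    """Count which tags co-occur with each word-final n-gram."""
    counts = defaultdict(Counter)
    for word, tag in tagged:
        counts[word[-n:]][tag] += 1
    return counts

def guess_tag(counts, word, n=2, default="NN"):
    """Predict the majority tag for the word's suffix (fallback: default)."""
    c = counts.get(word[-n:])
    return c.most_common(1)[0][0] if c else default

data = [("running", "VBG"), ("jumping", "VBG"),
        ("quickly", "RB"), ("slowly", "RB")]
m = train_suffix_model(data)
print(guess_tag(m, "walking"))  # VBG
print(guess_tag(m, "softly"))   # RB
```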
Conclusion | We have shown the efficacy of graph-based label propagation for projecting part-of-speech information across languages. |
Experiments and Results | 6.2 Part-of-Speech Tagset and HMM States |
Experiments and Results | While there might be some controversy about the exact definition of such a tagset, these 12 categories cover the most frequent parts of speech and exist in one form or another in all of the languages that we studied.
Introduction | To make the projection practical, we rely on the twelve universal part-of-speech tags of Petrov et al.
Experiments | The part-of-speech tags for the development and test set were automatically assigned by the MXPOST tagger, which was trained on the entire training corpus.
Related Work | (2010) created robust supervised classifiers via web-scale N-gram data for adjective ordering, spelling correction, noun compound bracketing and verb part-of-speech disambiguation. |
Web-Derived Selectional Preference Features | In this paper, we employ two different feature sets: a baseline feature set, which draws upon “normal” information sources such as word forms and part-of-speech (POS) tags without including the web-derived selectional preference features, and a combined feature set, which conjoins the baseline features and the web-derived selectional preference features.
Web-Derived Selectional Preference Features | is any token whose part-of-speech is IN |
Conclusion and Future Work | Such diverse topics as machine translation (Dyer et al., 2008; Dyer and Resnik, 2010; Mi et al., 2008), part-of-speech tagging (Jiang et al., 2008), named entity recognition (Finkel and Manning, 2009), semantic role labelling (Sutton and McCallum, 2005; Finkel et al., 2006), and others have also been improved by combined models.
Experiments | We supply gold-standard part-of-speech tags to the parsers. |
Experiments | Next, we evaluate performance when using automatic part-of-speech tags as input to our parser.
Independent Query Annotations | (2010), we use a large N-gram corpus (Brants and Franz, 2006) to estimate the required probabilities for annotating the query with capitalization and segmentation markup, and a standard POS tagger for part-of-speech tagging of the query.
Introduction | Automatic markup of textual documents with linguistic annotations such as part-of-speech tags, sentence constituents, named entities, or semantic roles is a common practice in natural language processing (NLP). |
Related Work | The literature on query annotation includes query segmentation (Bergsma and Wang, 2007; Jones et al., 2006; Guo et al., 2008; Hagen et al., 2010; Hagen et al., 2011; Tan and Peng, 2008), part-of-speech and semantic tagging (Barr et al., 2008; Manshadi and Li, 2009; Li, 2010), named-entity recognition (Guo et al., 2009; Lu et al., 2009; Shen et al., 2008; Pasca, 2007), abbreviation disambiguation (Wei et al., 2008) and stopword detection (Lo et al., 2005; Jones and Fain, 2003). |
Introduction | Prior work incorporating parse structure into machine translation (Chiang, 2010) and Semantic Role Labeling (Tsai et al., 2005; Punyakanok et al., 2008) indicates that such hierarchical structure can have great benefit over shallow labeling techniques like chunking and part-of-speech tagging.
Open/Closed Cell Classification | The feature vector x encodes the chart cell’s absolute and relative span width, as well as unigram and bigram lexical and part-of-speech tag items from w_{i-1}.
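Open/Closed Cell Classification | A hedged sketch of such a feature encoding: span widths plus lexical and tag items around the cell's left boundary. The feature names and the exact item inventory below are assumptions for illustration, not the paper's specification.

```python
# Illustrative feature extraction for a chart cell spanning words[i:j].

def cell_features(words, tags, i, j):
    n = len(words)
    feats = {
        "span_width": j - i,            # absolute span width
        "rel_width": (j - i) / n,       # span width relative to sentence
        "w_left": words[i - 1] if i > 0 else "<s>",   # word before the span
        "t_left": tags[i - 1] if i > 0 else "<s>",    # its tag
        "w_first": words[i],            # first word in the span
        "t_first": tags[i],             # its tag
    }
    feats["t_bigram"] = feats["t_left"] + "_" + feats["t_first"]
    return feats

words = ["the", "dog", "barks"]
tags = ["DT", "NN", "VBZ"]
f = cell_features(words, tags, 1, 3)
print(f["span_width"], f["t_bigram"])  # 2 DT_NN
```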
Results | Figure 4 contains a timing comparison of the three components of our final parser: Boundary FOM initialization (which includes the forward-backward algorithm over ambiguous part-of-speech tags), beam- |
Experimental Setup | This mapping technique is based on the many-to-one scheme used for evaluating unsupervised part-of-speech induction (Johnson, 2007). |
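Experimental Setup | The many-to-one scheme mentioned here is simple to state in code: map each induced cluster to the gold tag it most often co-occurs with, then score tagging accuracy under that mapping. The toy data below is illustrative.

```python
from collections import Counter, defaultdict

def many_to_one(induced, gold):
    """Many-to-one accuracy for unsupervised POS induction evaluation."""
    cooc = defaultdict(Counter)
    for c, g in zip(induced, gold):
        cooc[c][g] += 1
    # each induced cluster maps to its most frequent gold tag
    mapping = {c: cnt.most_common(1)[0][0] for c, cnt in cooc.items()}
    correct = sum(mapping[c] == g for c, g in zip(induced, gold))
    return correct / len(gold)

induced = [0, 0, 1, 1, 1, 2]
gold = ["NN", "NN", "VB", "VB", "NN", "DT"]
print(many_to_one(induced, gold))  # 5/6: only position 4 is mis-mapped
```

Note that many-to-one is generous: with as many clusters as tokens it trivially reaches 1.0, which is one reason it is paired with other metrics in practice.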
Experimental Setup | We implement a clustering baseline using the CLUTO toolkit with word and part-of-speech features. |
Model | These features can encode words, part-of-speech tags, context, and so on. |
Arabic Handwriting Recognition Challenges | In this paper we consider the value of morpho-lexical and morpho-syntactic features such as lemmas and part-of-speech tags, respectively, that may allow machine learning algorithms to learn generalizations.
Experimental Settings | Ya and digit normalization; pos — the part-of-speech (POS) of the word; lem — the lemma of the word
Related Work | part-of-speech tags, which they do not use, but suggest may help. |
Introduction | base phrases in Japanese) whose head part-of-speech was automatically tagged by the Japanese morphological analyser Chasen as either ‘noun’ or ‘unknown word’ according to the NAIST-jdic dictionary.
Introduction | POS / LEMMA / DEP_LABEL — the part-of-speech / lemma / dependency label of the predicate which has ZERO.
Introduction | D_POS / D_LEMMA / D_DEP_LABEL — the part-of-speech / lemma / dependency label of the dependents of the predicate which has ZERO.
Split-Merge Role Induction | When the part-of-speech similarity (pos) is below a certain threshold β or when clause-level constraints (cons) are satisfied to a lesser extent than threshold γ, the score takes value zero and the merge is ruled out.
Split-Merge Role Induction | Part-of-speech Similarity: Part-of-speech similarity is also measured through cosine similarity (equation (3)).
Split-Merge Role Induction | Clusters are again represented as vectors x and y whose components correspond to argument part-of-speech tags and values to their occurrence frequency. |
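Split-Merge Role Induction | The merge check over these tag-frequency vectors can be sketched as follows; the threshold value and the toy vectors are illustrative, and the zeroing behaviour mirrors the thresholded score described above.

```python
import math

def cosine(x, y):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def merge_score(x, y, beta=0.5):
    """Score is cosine similarity, zeroed when it falls below beta
    (i.e., the merge is ruled out)."""
    sim = cosine(x, y)
    return sim if sim >= beta else 0.0

a = [10, 2, 0]   # e.g. counts over the tags (NN, JJ, VB)
b = [8, 1, 1]
c = [0, 0, 9]
print(merge_score(a, b) > 0)  # True: similar tag profiles, merge allowed
print(merge_score(a, c))      # 0.0: dissimilar profiles, merge ruled out
```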
Introduction | Morphological taggers disambiguate morphological attributes such as part-of-speech (POS) or case, without taking syntax into account (Hajič and Hladká, 1998; Hajič et al., 2001); dependency parsers commonly assume the “pipeline” approach, relying on morphological information as part of the input (Buchholz and Marsi, 2006; Nivre et al., 2007).
Previous Work | Each of the resulting morphemes is then tagged with an atomic “part-of-speech” to indicate word class and some morphological features. |
Previous Work | According to the Latin morphological database encoded in MORPHEUS (Crane, 1991), 30% of Latin nouns can be parsed as another part-of-speech, and on average each has 3.8 possible morphological interpretations.