Evaluation | The part-of-speech tagger used to extract POS features was lgtagger (Constant and Sigogne, 2011). |
MWE-dedicated Features | We associate each word with its part-of-speech tags found in our external morphological lexicon. |
MWE-dedicated Features | Table 2: Feature templates (f) used both in the MWER and the reranker models: n is the current position in the sentence; w(i) is the word at position i; t(i) is the part-of-speech tag of w(i); if the word at absolute position i is part of a compound in the Shortest Path Segmentation, mwt(i) and mws(i) are respectively the part-of-speech tag and the internal structure of the compound, and mwpos(i) indicates its relative position in the compound (B or I). |
Multiword expressions | In this paper, we focus on contiguous MWEs that form a lexical unit which can be marked by a part-of-speech tag (e.g. at night is an adverb, because of is a preposition). |
Two strategies, two discriminative models | Constant and Sigogne (2011) proposed to combine MWE segmentation and part-of-speech tagging into a single sequence labelling task by assigning to each token a tag of the form TAG+X where TAG is the part-of-speech (POS) of the lexical unit the token belongs to and X is either B (i.e. |
Implementation Details | We first perform word segmentation (if needed) and part-of-speech tagging. |
Implementation Details | After that, we obtain the word-segmented sentences with the part-of-speech tags. |
Parsing with dependency language model | The feature templates are outlined in Table 1, where TYPE refers to one of the types PL or PR, h_pos refers to the part-of-speech tag of x_h, h_word refers to the lexical form of x_h, ch_pos refers to the part-of-speech tag of x_ch, and ch_word refers to the lexical form of x_ch. |
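
The TAG+X labelling attributed above to Constant and Sigogne (2011) can be illustrated with a small sketch. It is not the authors' implementation. The excerpt breaks off before defining X in full, so the code assumes that B marks the first token of a lexical unit and I marks every following token, in line with the "B or I" relative position mentioned for mwpos(i). The function names, the input format, and the tag names PREP and N are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical input format: one (tokens, POS tag) pair per lexical unit.
LexicalUnit = Tuple[List[str], str]


def to_tag_plus_x(units: List[LexicalUnit]) -> List[Tuple[str, str]]:
    """Flatten lexical units into (token, "TAG+X") pairs.

    Assumption: X is "B" for the first token of a unit and "I" otherwise.
    """
    labelled = []
    for tokens, pos in units:
        for k, token in enumerate(tokens):
            position = "B" if k == 0 else "I"
            labelled.append((token, f"{pos}+{position}"))
    return labelled


def from_tag_plus_x(labelled: List[Tuple[str, str]]) -> List[LexicalUnit]:
    """Recover lexical units (segmentation and POS) from TAG+X labels."""
    units: List[LexicalUnit] = []
    for token, label in labelled:
        pos, position = label.rsplit("+", 1)
        if position == "B" or not units:
            # A "B" label (or a stray leading "I") opens a new unit.
            units.append(([token], pos))
        else:
            # An "I" label extends the unit currently being built.
            units[-1][0].append(token)
    return units


# "because of" is a two-token preposition, "rain" a single-token noun.
sentence = [(["because", "of"], "PREP"), (["rain"], "N")]
labels = to_tag_plus_x(sentence)
```

Under these assumptions, `labels` pairs "because" with PREP+B, "of" with PREP+I, and "rain" with N+B, and `from_tag_plus_x(labels)` returns the original units. A single sequence labeller that predicts such labels therefore yields both the MWE segmentation and the part-of-speech tags.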