A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
Goldberg, Yoav and Tsarfaty, Reut

Article Structure

Abstract

Morphological processes in Semitic languages deliver space-delimited words which introduce multiple, distinct syntactic units into the structure of the input sentence.

Introduction

Current state-of-the-art broad-coverage parsers assume a direct correspondence between the lexical items ingrained in the proposed syntactic analyses (the yields of syntactic parse-trees) and the space-delimited tokens (henceforth, ‘tokens’) that constitute the unanalyzed surface forms (utterances).

Modern Hebrew Structure

Segmental morphology Hebrew consists of seven particles: m (“from”), f (“when”/“who”/“that”), h (“the”), w (“and”), k (“like”), l (“to”) and b (“in”).

Previous Work on Hebrew Processing

Morphological analyzers for Hebrew that analyze a surface form in isolation have been proposed by Segal (2000), Yona and Wintner (2005), and recently by the knowledge center for processing Hebrew (Itai et al., 2006).

Model Preliminaries

4.1 The Status of Space-Delimited Tokens

A Generative PCFG Model

The input for the joint task is a sequence W = w1, ..., wn of space-delimited tokens.

Experimental Setup

Previous work on morphological and syntactic disambiguation in Hebrew used different sets of data, different splits, differing annotation schemes, and different evaluation measures.

Results and Analysis

The accuracy results for segmentation, tagging and parsing using our different models and our standard data split are summarized in Table 1.

Discussion and Conclusion

Employing a PCFG-based generative framework to make both syntactic and morphological disambiguation decisions is not only theoretically clean and

Topics

PoS tags

Appears in 17 sentences as: PoS tag (2) PoS tagger (2) POS tagging (1) PoS tagging (1) PoS tags (11) PoS tags, (1)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Such discrepancies can be aligned via an intermediate level of PoS tags.
    Page 2, “Modern Hebrew Structure”
  2. PoS tags impose a unique morphological segmentation on surface tokens and present a unique valid yield for syntactic trees.
    Page 2, “Modern Hebrew Structure”
  3. Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
    Page 3, “Previous Work on Hebrew Processing”
  4. A Hebrew surface token may have several readings, each of which corresponds to a sequence of segments and their corresponding PoS tags.
    Page 3, “Model Preliminaries”
  5. We refer to different readings as different analyses whereby the segments are deterministic given the sequence of PoS tags.
    Page 3, “Model Preliminaries”
  6. We refer to a segment and its assigned PoS tag as a lexeme, and so analyses are in fact sequences of lexemes.
    Page 3, “Model Preliminaries”
  7. This means that we generate f and mnh independently depending on their corresponding PoS tags,
    Page 3, “Model Preliminaries”
  8. Each lattice arc corresponds to a segment and its corresponding PoS tag, and a path through the lattice corresponds to a specific morphological segmentation of the utterance.
    Page 4, “Model Preliminaries”
  9. Segments with the same surface form but different PoS tags are treated as different lexemes, and are represented as separate arcs (e.g. the two arcs labeled neim from node 6 to 7).
    Page 4, “Model Preliminaries”
  10. The entries in such a lexicon may be thought of as meaningful surface segments paired up with their PoS tags, li = (si, pi), but note that a surface segment si need not be a space-delimited token.
    Page 4, “A Generative PCFG Model”
  11. (1996), who consider the kind of probabilities a generative parser should get from a PoS tagger and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context, but rather it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
    Page 5, “A Generative PCFG Model”
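The lattice encoding described in sentences 8–9 can be sketched in a few lines; the node pair (6, 7) and the two neim arcs follow the example in sentence 9, while the concrete tag names (`VB`, `JJ`) are illustrative assumptions, not taken from the paper:

```python
from collections import namedtuple

# An arc pairs a surface segment with a PoS tag (a lexeme, sentence 8).
Lexeme = namedtuple("Lexeme", ["segment", "tag"])

# Lattice as an adjacency list: node -> list of (lexeme, next node).
# Same surface form with different tags = two separate arcs (sentence 9).
lattice = {
    6: [(Lexeme("neim", "VB"), 7),   # hypothetical verbal reading
        (Lexeme("neim", "JJ"), 7)],  # hypothetical adjectival reading
}

def paths(lattice, start, goal, prefix=()):
    """Enumerate lexeme sequences along lattice paths (= analyses)."""
    if start == goal:
        yield prefix
        return
    for lexeme, nxt in lattice.get(start, []):
        yield from paths(lattice, nxt, goal, prefix + (lexeme,))

# Two analyses: the same segment 'neim' under two distinct PoS tags.
analyses = list(paths(lattice, 6, 7))
```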

See all papers in Proc. ACL 2008 that mention PoS tags.

morphological analyzer

Appears in 13 sentences as: morphological analyses (2) morphological analysis (1) Morphological Analyzer (1) morphological analyzer (9) Morphological analyzers (1)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Morphological analyzers for Hebrew that analyze a surface form in isolation have been proposed by Segal (2000), Yona and Wintner (2005), and recently by the knowledge center for processing Hebrew (Itai et al., 2006).
    Page 3, “Previous Work on Hebrew Processing”
  2. Morphological disambiguators that consider a token in context (an utterance) and propose the most likely morphological analysis of an utterance (including segmentation) were presented by Bar-Haim et al.
    Page 3, “Previous Work on Hebrew Processing”
  3. Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
    Page 3, “Previous Work on Hebrew Processing”
  4. We represent all morphological analyses of a given utterance using a lattice structure.
    Page 4, “Model Preliminaries”
  5. The Input The set of analyses for a token is thus represented as a lattice in which every arc corresponds to a specific lexeme l, as shown in Figure 1. A morphological analyzer M : W → L is a function mapping sentences in Hebrew (W ∈ W) to their corresponding lattices (M(W) = L ∈ L).
    Page 4, “A Generative PCFG Model”
  6. Given a sentence W = w1, ..., wn and a morphological analyzer, we look for the most probable parse tree π s.t.
    Page 4, “A Generative PCFG Model”
  7. Since the lattice L for a given sentence W is determined by the morphological analyzer M we have
    Page 4, “A Generative PCFG Model”
  8. We first make use of our morphological analyzer to find all segmentation possibilities by chopping off all prefix sequence possibilities (including the empty prefix) and construct a lattice off of them.
    Page 5, “A Generative PCFG Model”
  9. Morphological Analyzer Ideally, we would use an off-the-shelf morphological analyzer for mapping each input token to its possible analyses.
    Page 6, “Experimental Setup”
  10. …compatible with the one of the Hebrew Treebank. For this reason, we use a data-driven morphological analyzer derived from the training data, similar to (Cohen and Smith, 2007).
    Page 6, “Experimental Setup”
  11. To control for the effect of the HSPELL-based pruning, we also experimented with a morphological analyzer that does not perform this pruning.
    Page 6, “Experimental Setup”
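The lattice construction in sentence 8 — chopping off all prefix-sequence possibilities, including the empty prefix — can be sketched as a small recursion. The particle inventory follows the seven particles listed under “Modern Hebrew Structure”; the example token `wbbit` is a hypothetical illustration, not an example from the paper:

```python
# Particle prefixes from the "Modern Hebrew Structure" section.
PARTICLES = {"m", "f", "h", "w", "k", "l", "b"}

def prefix_splits(token):
    """Yield (prefix_sequence, remainder) pairs, empty prefix included."""
    yield (), token  # empty prefix: the whole token is one segment
    if len(token) > 1 and token[0] in PARTICLES:
        for prefixes, rest in prefix_splits(token[1:]):
            yield (token[0],) + prefixes, rest

# 'wbbit' is a hypothetical surface token; each split below would become
# one candidate path when the lattice is constructed off of the splits.
splits = list(prefix_splits("wbbit"))
```

For `wbbit` this enumerates four splits, from the empty prefix `((), 'wbbit')` down to `(('w', 'b', 'b'), 'it')`.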

treebank

Appears in 12 sentences as: Treebank (4) treebank (8)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Using a treebank grammar, a data-driven lexicon, and a linguistically motivated unknown-tokens handling technique, our model outperforms previous pipelined, integrated or factorized systems for Hebrew morphological and syntactic processing, yielding an error reduction of 12% over the best published results so far.
    Page 1, “Abstract”
  2. Morphological segmentation decisions in our model are delegated to a lexeme-based PCFG, and we show that using a simple treebank grammar, a data-driven lexicon, and a linguistically motivated unknown-tokens handling technique, our model outperforms (Tsarfaty, 2006) and (Cohen and Smith, 2007) on the joint task and achieves state-of-the-art results on a par with current respective standalone models.
    Page 2, “Introduction”
  3. The development of the very first Hebrew Treebank (Sima’an et al., 2001) called for the exploration of general statistical parsing methods, but the application was at first limited.
    Page 3, “Previous Work on Hebrew Processing”
  4. Tsarfaty (2006) was the first to demonstrate that fully automatic Hebrew parsing is feasible using the newly available 5000-sentence treebank.
    Page 3, “Previous Work on Hebrew Processing”
  5. Data We use the Hebrew Treebank (Sima’an et al., 2001), provided by the knowledge center for processing Hebrew, in which sentences from the daily newspaper “Ha’aretz” are morphologically segmented and syntactically annotated.
    Page 6, “Experimental Setup”
  6. The treebank has two versions, v1.0 and v2.0, containing 5001 and 6501 sentences respectively.
    Page 6, “Experimental Setup”
  7. Unfortunately, running our setup on the v2.0 data set is currently not possible due to missing token–morpheme alignment in the v2.0 treebank.
    Page 6, “Experimental Setup”
  8. Parser and Grammar We used BitPar (Schmid, 2004), an efficient general-purpose parser, together with various treebank grammars to parse the input sentences and propose compatible morphological segmentation and syntactic analysis.
    Page 6, “Experimental Setup”
  9. We experimented with increasingly rich grammars read off of the treebank.
    Page 6, “Experimental Setup”
  10. Our first model is GTplain, a PCFG learned from the treebank after removing all functional features from the syntactic categories.
    Page 6, “Experimental Setup”
  11. The overall performance of our joint framework demonstrates that a probability distribution obtained over mere syntactic contexts using a Treebank grammar and a data-driven lexicon outperforms upper bounds proposed by previous joint disambiguation systems and achieves segmentation and parsing results on a par with those of state-of-the-art standalone applications.
    Page 8, “Discussion and Conclusion”

segmentations

Appears in 6 sentences as: SEG (1) Segal (2) segmentations (3)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Morphological analyzers for Hebrew that analyze a surface form in isolation have been proposed by Segal (2000), Yona and Wintner (2005), and recently by the knowledge center for processing Hebrew (Itai et al., 2006).
    Page 3, “Previous Work on Hebrew Processing”
  2. Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
    Page 3, “Previous Work on Hebrew Processing”
  3. Our use of an unweighted lattice reflects our belief that all the segmentations of the given input sentence are a priori equally likely; the only reason to prefer one segmentation over another is the overall syntactic context, which is modeled via the PCFG derivations.
    Page 5, “A Generative PCFG Model”
  4. (1996), who consider the kind of probabilities a generative parser should get from a PoS tagger and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context, but rather it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
    Page 5, “A Generative PCFG Model”
  5. We use the HSPELL (Har’el and Kenigsberg, 2004) wordlist as a lexeme-based lexicon for pruning segmentations involving invalid segments.
    Page 6, “Experimental Setup”
  6. To evaluate the performance on the segmentation task, we report SEG, the standard harmonic mean (F1) of segmentation Precision and Recall (as defined in Bar-Haim et al., 2005).
    Page 7, “Experimental Setup”
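The SEG measure in sentence 6 — the harmonic mean F1 of segmentation precision and recall — can be sketched as follows. Treating segments as character spans, and the toy gold/predicted strings, are assumptions for illustration, not the paper's exact matching convention:

```python
def char_spans(segments):
    """Map a segment sequence to the set of (start, end) character spans."""
    spans, pos = set(), 0
    for seg in segments:
        spans.add((pos, pos + len(seg)))
        pos += len(seg)
    return spans

def seg_f1(gold, predicted):
    """F1 over segment spans: harmonic mean of precision and recall."""
    g, p = char_spans(gold), char_spans(predicted)
    if not g or not p:
        return 0.0
    correct = len(g & p)
    precision, recall = correct / len(p), correct / len(g)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, gold `b h bit` against predicted `b hbit` shares one span, giving precision 1/2, recall 1/3, and F1 = 0.4.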

syntactic context

Appears in 5 sentences as: syntactic context (4) syntactic contexts (1)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Tsarfaty (2006) argues that for Semitic languages determining the correct morphological segmentation is dependent on syntactic context and shows that increasing information sharing between the morphological and the syntactic components leads to improved performance on the joint task.
    Page 1, “Introduction”
  2. We suggest that in unlexicalized PCFGs the syntactic context may be explicitly modeled in the derivation probabilities.
    Page 3, “Model Preliminaries”
  3. Our use of an unweighted lattice reflects our belief that all the segmentations of the given input sentence are a priori equally likely; the only reason to prefer one segmentation over another is the overall syntactic context, which is modeled via the PCFG derivations.
    Page 5, “A Generative PCFG Model”
  4. Yet we note that the better grammars without pruning outperform the poorer grammars using this technique, indicating that the syntactic context aids, to some extent, the disambiguation of unknown tokens.
    Page 7, “Results and Analysis”
  5. The overall performance of our joint framework demonstrates that a probability distribution obtained over mere syntactic contexts using a Treebank grammar and a data-driven lexicon outperforms upper bounds proposed by previous joint disambiguation systems and achieves segmentation and parsing results on a par with those of state-of-the-art standalone applications.
    Page 8, “Discussion and Conclusion”

probability distribution

Appears in 4 sentences as: probability distribution (4)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Given that weights on all outgoing arcs sum up to one, weights induce a probability distribution on the lattice paths.
    Page 4, “Model Preliminaries”
  2. (1996), who consider the kind of probabilities a generative parser should get from a PoS tagger and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context, but rather it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
    Page 5, “A Generative PCFG Model”
  3. We smooth Prf(p → (s, p)) for rare and OOV segments (s ∈ l, l ∈ L, s unseen) using a “per-tag” probability distribution over rare segments, which we estimate using relative frequency estimates for once-occurring segments.
    Page 5, “A Generative PCFG Model”
  4. The overall performance of our joint framework demonstrates that a probability distribution obtained over mere syntactic contexts using a Treebank grammar and a data-driven lexicon outperforms upper bounds proposed by previous joint disambiguation systems and achieves segmentation and parsing results on a par with those of state-of-the-art standalone applications.
    Page 8, “Discussion and Conclusion”
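The smoothing in sentence 3 — a per-tag distribution over rare segments estimated from once-occurring segments — might look like the following sketch; the toy training pairs and the exact back-off form are illustrative assumptions:

```python
from collections import Counter

# Toy (segment, tag) training pairs — an illustrative assumption.
training = [("bit", "NN"), ("neim", "JJ"), ("xtul", "NN"),
            ("bit", "NN"), ("ggg", "NN")]

seg_counts = Counter(seg for seg, _ in training)
tag_counts = Counter(tag for _, tag in training)
# Once-occurring segments stand in for unseen ones, counted per tag.
once_per_tag = Counter(tag for seg, tag in training if seg_counts[seg] == 1)

def p_seg_given_tag(seg, tag):
    """Relative-frequency P(seg | tag); unseen segments receive the
    per-tag mass estimated from once-occurring segments."""
    if seg_counts[seg] > 0:
        pairs = sum(1 for s, t in training if s == seg and t == tag)
        return pairs / tag_counts[tag]
    return once_per_tag[tag] / tag_counts[tag]
```

With these toy counts, a segment never seen in training gets the relative frequency of once-occurring segments under its candidate tag rather than zero probability.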

parse tree

Appears in 3 sentences as: parse tree (3)
In A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
  1. Given a sentence W = w1, ..., wn and a morphological analyzer, we look for the most probable parse tree π s.t.
    Page 4, “A Generative PCFG Model”
  2. Hence, our parser searches for a parse tree π over lexemes (l1, ..., lk) s.t.
    Page 4, “A Generative PCFG Model”
  3. Thus our proposed model is a proper model, assigning probability mass to all (π, L) pairs, where π is a parse tree and L is the one and only lattice that a sequence of characters (and spaces) W over our alphabet gives rise to.
    Page 5, “A Generative PCFG Model”
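Once candidate derivations over lexemes have been enumerated, the search in sentences 1–2 reduces to an argmax over products of PCFG rule probabilities. The toy grammar and candidate derivations below are illustrative assumptions, not the paper's grammar:

```python
import math

def tree_prob(tree, rule_probs):
    """P(tree) = product of the probabilities of the rules it uses."""
    return math.prod(rule_probs[rule] for rule in tree)

# Hypothetical PCFG: probabilities of rules with the same LHS sum to 1.
rule_probs = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("NN",)): 0.6,
    ("NP", ("JJ", "NN")): 0.4,
    ("VP", ("VB",)): 1.0,
}

# Each candidate is the multiset of rules used in one derivation whose
# yield is a path through the lattice.
candidates = [
    [("S", ("NP", "VP")), ("NP", ("NN",)), ("VP", ("VB",))],
    [("S", ("NP", "VP")), ("NP", ("JJ", "NN")), ("VP", ("VB",))],
]

best = max(candidates, key=lambda t: tree_prob(t, rule_probs))
```

Here the derivation using NP → NN (probability 0.6) beats the one using NP → JJ NN (0.4), so the syntactic model alone disambiguates between the two analyses.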
