Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
Bender, Emily M.

Article Structure

Abstract

This paper evaluates the LinGO Grammar Matrix, a cross-linguistic resource for the development of precision broad coverage grammars, by applying it to the Australian language Wambaya.

Introduction

Hand-built grammars are often dismissed as too expensive to build on the one hand, and too brittle on the other.

Background

2.1 The LinGO Grammar Matrix

Wambaya grammar

3.1 Development

Evaluation of Grammar Matrix

It is not possible to directly compare the development of a grammar for the same language, by the same grammar engineer, with and without the assistance of the Grammar Matrix.

Conclusion

This paper has presented a precision, hand-built grammar for the Australian language Wambaya, and through that grammar a case study evaluation of the LinGO Grammar Matrix.

Topics

word order

Appears in 8 sentences as: Word order (1) word order (8) word order” (1)
In Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
  1. The LinGO Grammar Matrix (Bender et al., 2002; Bender and Flickinger, 2005; Drellishak and Bender, 2005) is a toolkit for reducing the cost of creating broad-coverage precision grammars by prepackaging both a cross-linguistic core grammar and a series of libraries of analyses of cross-linguistically variable phenomena, such as major-constituent word order or question formation.
    Page 1, “Introduction”
  2. The current libraries include analyses of major constituent word order (SOV, SVO, etc.), sentential negation, coordination, and yes-no question formation.
    Page 2, “Background”
  3. Perhaps the most striking feature of Wambaya is its word order: it is a radically non-configurational language with a second position auxiliary/clitic cluster.
    Page 2, “Background”
  4. That is, aside from the constraint that verbal clauses require a clitic cluster (marking subject and object agreement and tense, aspect and mood) in second position, the word order is otherwise free, to the point that noun phrases can be noncontiguous, with head nouns and their modifiers separated by unrelated words.
    Page 2, “Background”
  5. • Word order: second position clitic cluster, otherwise free word order, discontinuous noun phrases
    Page 4, “Wambaya grammar”
  6. The customization-provided Wambaya-specific type definitions for word order, lexical types, and coordination constructions were used for inspiration, but most needed fairly extensive modification.
    Page 6, “Evaluation of Grammar Matrix”
  7. This is particularly unsurprising for basic word order, where the closest available option (“free word order”) was taken, in the absence of a prepackaged analysis of non-configurationality and second-position phenomena.
    Page 6, “Evaluation of Grammar Matrix”
  8. This again suggests that lexical v. phrasal amalgamation should be encoded in the libraries, and selected according to the word order pattern of the language.
    Page 7, “Evaluation of Grammar Matrix”
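
As a rough illustration of the pattern described in sentences 3 to 5, the following toy sketch enumerates the orders permitted when a clitic cluster is pinned to second position and everything else is free. It is not the Grammar Matrix analysis or the paper's implementation: the function name and the labels SUBJ, OBJ, VERB, and AUX are hypothetical, each label stands for one whole constituent, and discontinuous noun phrases are not modeled.

```python
from itertools import permutations

def second_position_orders(constituents, clitic):
    """Return every ordering of `constituents` with `clitic` fixed in second position.

    Toy assumption: each item is a single, contiguous constituent.
    """
    if not constituents:
        return []  # no first constituent for the clitic to follow
    orders = []
    for perm in permutations(constituents):
        # Slot the clitic cluster in right after the first constituent.
        orders.append([perm[0], clitic, *perm[1:]])
    return orders

# Hypothetical labels, not Wambaya data.
orders = second_position_orders(["SUBJ", "OBJ", "VERB"], "AUX")
for order in orders:
    print(" ".join(order))
```

Every ordering of the three constituents is produced, and the only invariant is the position of AUX, which is the sense in which the order is "otherwise free".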

See all papers in Proc. ACL 2008 that mention word order.

development set

Appears in 5 sentences as: development set (5)
In Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
  1. In addition, this grammar has relatively low ambiguity, assigning on average 11.89 parses per item in the development set.
    Page 3, “Wambaya grammar”
  2. Of the 92 sentences in this text, 20 overlapped with items in the development set, so the
    Page 5, “Wambaya grammar”
  3. The parsed portion of the development set (732 items) constitutes a sufficiently large corpus to train a parse selection model using the Redwoods disambiguation technology (Toutanova et al., 2005).
    Page 6, “Wambaya grammar”
  4. In the cross-validation trials on the development set, this model achieved a parse selection accuracy of 80.2% (random choice baseline: 23.9%).
    Page 6, “Wambaya grammar”
  5. from the development set and used to rank the parses of the test set.
    Page 6, “Wambaya grammar”
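
The figures in sentences 1 and 4 can look inconsistent at first: an average of 11.89 parses per item might suggest a random baseline near 1/11.89, or about 8%, not 23.9%. One plausible reading, assumed here and not stated in the source, is that the baseline is the average over items of the chance that a uniformly random pick is correct. A mean of reciprocals is never smaller than the reciprocal of the mean, and it can be much larger when ambiguity is skewed. The sketch below uses made-up parse counts (not the Wambaya data) and assumes exactly one correct parse per item; the function names are hypothetical.

```python
def random_choice_baseline(parse_counts):
    """Expected accuracy of picking one parse uniformly at random per item.

    Assumes each item has exactly one correct parse.
    """
    return sum(1.0 / n for n in parse_counts) / len(parse_counts)

def mean_ambiguity(parse_counts):
    """Average number of parses per item."""
    return sum(parse_counts) / len(parse_counts)

# Made-up counts: many items with few parses, one item with many.
counts = [1, 2, 2, 3, 4, 60]
print(f"mean parses per item:   {mean_ambiguity(counts):.2f}")
print(f"random-choice baseline: {random_choice_baseline(counts):.1%}")
print(f"1 / mean ambiguity:     {1 / mean_ambiguity(counts):.1%}")
```

With skewed counts like these, the per-item baseline sits well above the reciprocal of the mean ambiguity, which is the same direction as the gap between the two reported figures.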

noun phrases

Appears in 5 sentences as: noun phrases (5)
In Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
  1. That is, aside from the constraint that verbal clauses require a clitic cluster (marking subject and object agreement and tense, aspect and mood) in second position, the word order is otherwise free, to the point that noun phrases can be noncontiguous, with head nouns and their modifiers separated by unrelated words.
    Page 2, “Background”
  2. To relate such discontinuous noun phrases to appropriate semantic representations where ‘having-
    Page 2, “Background”
  3. • Word order: second position clitic cluster, otherwise free word order, discontinuous noun phrases
    Page 4, “Wambaya grammar”
  4. • Derived event modifiers: nominals (nouns, adjectives, noun phrases) used as event modifiers with meaning dependent on their case marking
    Page 4, “Wambaya grammar”
  5. • Coordination: of clauses and noun phrases
    Page 4, “Wambaya grammar”

semantic representations

Appears in 5 sentences as: semantic representations (5)
In Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
  1. Despite large typological differences between Wambaya and the languages on which the development of the resource was based, the Grammar Matrix is found to provide a significant jump-start in the creation of the grammar for Wambaya: With less than 5.5 person-weeks of development, the Wambaya grammar was able to assign correct semantic representations to 76% of the sentences in a naturally occurring text.
    Page 1, “Abstract”
  2. The core type hierarchy defines the basic feature geometry, the ways that heads combine with arguments and adjuncts, linking types for relating syntactic to semantic arguments, and the constraints required to compositionally build up semantic representations in the format of Minimal Recursion Semantics (Copestake et al., 2005; Flickinger and Bender, 2003).
    Page 2, “Background”
  3. To relate such discontinuous noun phrases to appropriate semantic representations where ‘having-
    Page 2, “Background”
  4. The linguistic analyses encoded in the grammar serve to map the surface strings to semantic representations (in Minimal Recursion Semantics (MRS) format (Copestake et al., 2005)).
    Page 4, “Wambaya grammar”
  5. This section has presented the Matrix-derived grammar of Wambaya, illustrating its semantic representations and analyses and measuring its performance against held-out data.
    Page 6, “Wambaya grammar”

treebanking

Appears in 4 sentences as: treebank (1) treebanking (2) treebanks (1)
In Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
  1. treebanks.
    Page 1, “Introduction”
  2. It is compatible with the broader range of DELPH-IN tools, e.g., for machine translation (Lønning and Oepen, 2006), treebanking (Oepen et al., 2004) and parse selection (Toutanova et al., 2005).
    Page 2, “Background”
  3. With no prior knowledge of this language beyond its most general typological properties, we were able to develop in under 5.5 person-weeks of development time (210 hours) a grammar able to assign appropriate analyses to 91% of the examples in the development set. The 210 hours include 25 hours of an RA’s time entering lexical entries, 7 hours spent preparing the development test suite, and 15 hours treebanking (using the LinGO Redwoods software (Oepen et al., 2004) to annotate the intended parse for each item).
    Page 3, “Wambaya grammar”
  4. The resulting treebank was used to select appropriate parameters by 10-fold cross-validation, applying the experimentation environment and feature templates of (Velldal, 2007).
    Page 6, “Wambaya grammar”
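
Sentence 4 mentions 10-fold cross-validation. As a generic sketch of that procedure (not the experimentation environment of Velldal (2007), whose interface is not described here), the function below splits item indices into ten folds and yields each train/held-out pair. The function name, the fixed seed, and the use of 732 as the item count (the size of the parsed portion of the development set reported in the paper) are illustrative assumptions.

```python
import random

def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, heldout_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the folds reproducible
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, heldout in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, heldout

splits = list(k_fold_splits(732, k=10))
for train, heldout in splits:
    # In a real experiment: fit the model on `train`, score parse selection on `heldout`.
    print(len(train), len(heldout))
```

Each item is held out exactly once, so averaging accuracy over the ten folds uses every treebanked item for evaluation without ever scoring a model on data it was trained on.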
