PDT 2.0 Requirements on a Query Language
Mírovský, Jiří

Article Structure

Abstract

Linguistically annotated treebanks play an essential part in modern computational linguistics.

Introduction

Searching in a linguistically annotated treebank is a principal task in modern computational linguistics.

Phenomena and Requirements

We make a list of linguistic phenomena that are annotated in PDT 2.0 and that determine the necessary features of a query language.

Summary of the Features

Here we summarize what features the query language has to have to suit PDT 2.0.

Conclusion

We have studied the Prague Dependency Treebank 2.0 tectogrammatical annotation manual and listed linguistic phenomena that require a special feature from any query tool for this corpus.

Topics

Treebank

Appears in 16 sentences as: Treebank (8) treebank (6) treebanks (4)
In PDT 2.0 Requirements on a Query Language
  1. Linguistically annotated treebanks play an essential part in modern computational linguistics.
    Page 1, “Abstract”
  2. The more complex the treebanks become, the more sophisticated tools are required for using them, namely for searching in the data.
    Page 1, “Abstract”
  3. We study linguistic phenomena annotated in the Prague Dependency Treebank 2.0 and create a list of requirements these phenomena set on a search tool, especially on its query language.
    Page 1, “Abstract”
  4. Searching in a linguistically annotated treebank is a principal task in modern computational linguistics.
    Page 1, “Introduction”
  5. A search tool helps extract useful information from the treebank, in order to study the language, the annotation system or even to search for errors in the annotation.
    Page 1, “Introduction”
  6. The more complex the treebank is, the more sophisticated the search tool and its query language need to be.
    Page 1, “Introduction”
  7. The Prague Dependency Treebank 2.0 (Hajič et al. 2006) is one of the most advanced manually annotated treebanks.
    Page 1, “Introduction”
  8. We study mainly the tectogrammatical layer of the Prague Dependency Treebank 2.0 (PDT 2.0), which is by far the most advanced and complex layer in the treebank, and show what requirements on a query language the annotated linguistic phenomena bring.
    Page 1, “Introduction”
  9. A description of linguistic phenomena annotated in the Tiger Treebank, along with an introduction to a search tool TigerSearch, developed especially for this treebank, is given in Brants et al.
    Page 2, “Introduction”
  10. Laura Kallmeyer (Kallmeyer 2000) studies requirements on a query language based on two examples of complex linguistic phenomena taken from the NEGRA corpus and the Penn Treebank, respectively.
    Page 2, “Introduction”

See all papers in Proc. ACL 2008 that mention Treebank.

coreference

Appears in 8 sentences as: Coreference (1) coreference (4) Coreferences (1) coreferences (2) coreferential (1)
In PDT 2.0 Requirements on a Query Language
  1. Coreference relations between nodes of certain category types are captured.
    Page 2, “Introduction”
  2. Attributes coref_text.rf and coref_gram.rf contain ids of coreferential nodes of the respective types.
    Page 2, “Introduction”
  3. 2.1.7 Coreferences
    Page 5, “Phenomena and Requirements”
  4. Two types of coreferences are annotated on the tectogrammatical layer:
    Page 5, “Phenomena and Requirements”
  5. • grammatical coreference
    Page 5, “Phenomena and Requirements”
  6. • textual coreference The current way of representing coreference uses references (t-manual, page 996).
    Page 5, “Phenomena and Requirements”
  7. If coreference, dual dependency, or valency member identity is a link between two nodes (one node referring to another), it is enough to specify the identifier of the referred node in the appropriate attribute of the referring node.
    Page 5, “Phenomena and Requirements”
  8. • secondary edges, secondary dependencies, coreferences, long-range relations
    Page 8, “Summary of the Features”
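
The reference-based representation quoted above (the referring node stores the identifier of the referred node) can be illustrated with a minimal Python sketch. It is not the PDT 2.0 data format or any existing query tool: the `TNode` class, its field names `coref_text_rf` and `coref_gram_rf` (standing in for the attributes coref_text.rf and coref_gram.rf), and the function `coreference_links` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TNode:
    """Hypothetical, heavily simplified tectogrammatical node."""
    id: str
    t_lemma: str
    # Stand-ins for coref_text.rf and coref_gram.rf: lists of node ids.
    coref_text_rf: List[str] = field(default_factory=list)
    coref_gram_rf: List[str] = field(default_factory=list)


def coreference_links(nodes: List[TNode], kind: str = "text") -> List[Tuple[TNode, TNode]]:
    """Return (referring node, referred node) pairs for one coreference type.

    A link is resolved by looking up the stored identifier; identifiers
    that point outside the given node list are skipped.
    """
    if kind not in ("text", "gram"):
        raise ValueError("kind must be 'text' or 'gram'")
    attr = "coref_text_rf" if kind == "text" else "coref_gram_rf"
    by_id: Dict[str, TNode] = {n.id: n for n in nodes}
    links = []
    for node in nodes:
        for target_id in getattr(node, attr):
            target = by_id.get(target_id)
            if target is not None:
                links.append((node, target))
    return links
```

A query language that supports such links needs some way to follow an identifier stored in one node to another node that may lie anywhere in the tree; the dictionary lookup above plays that role in the sketch.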

word order

Appears in 8 sentences as: Word Order (1) Word order (1) word order (7)
In PDT 2.0 Requirements on a Query Language
  1. 1998) are marked (attribute tfa), together with so-called deep word order reflected by the order of nodes in the annotation (attribute deepord).
    Page 2, “Introduction”
  2. In the underlying word order, nodes representing the quasi-focus, although they are contextually bound, are placed to the right of their governing node.
    Page 5, “Phenomena and Requirements”
  3. The position of rhematizers in the surface word order is quite loose; however, they almost always stand right before the expressions they rhematize, i.e.
    Page 6, “Phenomena and Requirements”
  4. the node representing the rhematizer) is placed as the closest left brother (in the underlying word order) of the first node of the expression that is in its scope.
    Page 6, “Phenomena and Requirements”
  5. Rhematizers therefore bring a further requirement on the query language: an ability to control the distance between nodes (in terms of deep word order); at the very least, the query language has to distinguish an immediate brother and relative horizontal position of nodes.
    Page 6, “Phenomena and Requirements”
  6. 2.2.3 Word Order
    Page 7, “Phenomena and Requirements”
  7. Word order is a linguistic phenomenon widely studied on the analytical layer, because it offers a perfect combination of word order (the same as in the sentence) and syntactic relations between the words.
    Page 7, “Phenomena and Requirements”
  8. The same technique as with the deep word order on the tectogrammatical layer can be used here.
    Page 7, “Phenomena and Requirements”
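
The requirement quoted above (distinguishing an immediate brother and the relative horizontal position of nodes) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the behaviour of any actual PDT search tool: the `TNode` class, its `parent` field, and the functions `deepord_distance` and `is_closest_left_brother` are hypothetical; only the attribute name deepord comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TNode:
    """Hypothetical, heavily simplified tectogrammatical node."""
    id: str
    deepord: int                  # position in the deep word order
    parent: Optional[str] = None  # id of the governing node, None for the root


def deepord_distance(a: TNode, b: TNode) -> int:
    """Signed horizontal distance from a to b; positive if b is to the right."""
    return b.deepord - a.deepord


def is_closest_left_brother(nodes: List[TNode], left: TNode, right: TNode) -> bool:
    """True if `left` is the nearest sibling to the left of `right`.

    Siblings are nodes sharing the same governing node; they are ordered
    by deepord, so other nodes (including the governing node itself) may
    stand between two adjacent siblings in the deep word order.
    """
    if left.parent is None or left.parent != right.parent:
        return False
    siblings = sorted(
        (n for n in nodes if n.parent == right.parent),
        key=lambda n: n.deepord,
    )
    ids = [n.id for n in siblings]
    if right.id not in ids:
        return False
    i = ids.index(right.id)
    return i > 0 and ids[i - 1] == left.id
```

The two functions separate the two notions named in the requirement: `deepord_distance` measures relative horizontal position over the whole tree, while `is_closest_left_brother` restricts the comparison to children of one governing node.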

dependency relations

Appears in 3 sentences as: dependency relation (1) dependency relations (2)
In PDT 2.0 Requirements on a Query Language
  1. The edges express dependency relations between nodes.
    Page 3, “Phenomena and Requirements”
  2. The predicative complement is a nonobligatory free modification (adjunct) which has a dual semantic dependency relation.
    Page 4, “Phenomena and Requirements”
  3. These two dependency relations are represented by different means (t-manual, page 376):
    Page 4, “Phenomena and Requirements”

dependency tree

Appears in 3 sentences as: dependency tree (3)
In PDT 2.0 Requirements on a Query Language
  1. The analytical layer roughly corresponds to the surface syntax of the sentence; the annotation is a single-rooted dependency tree with labeled nodes.
    Page 2, “Introduction”
  2. Again, the annotation is a dependency tree with labeled nodes (Hajičová 1998).
    Page 2, “Introduction”
  3. The representation of the tectogrammatical annotation of a sentence is a rooted dependency tree.
    Page 3, “Phenomena and Requirements”
