Computing Weakest Readings
Koller, Alexander and Thater, Stefan

Article Structure

Abstract

We present an efficient algorithm for computing the weakest readings of semantically ambiguous sentences.

Introduction

Over the past few years, there has been considerable progress in the ability of manually created large-scale grammars, such as the English Resource Grammar (ERG, Copestake and Flickinger (2000)) or the ParGram grammars (Butt et al., 2002), to parse wide-coverage text and assign it deep semantic representations.

Related work

The idea of deriving a single approximative semantic representation for ambiguous sentences goes back to Hobbs (1983); however, Hobbs only works his algorithm out for a restricted class of quantifiers, and his representations can be weaker than our weakest readings.

Underspecification

This section briefly reviews two formalisms for specifying sets of trees: dominance graphs and regular tree grammars.
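To make the second formalism concrete, here is a minimal sketch of a regular tree grammar in Python. It is our illustration, not the paper's implementation; the rule set, nonterminal names, and tree labels are invented. The point is only that a handful of rules can compactly represent a set of trees, such as the scope readings of a sentence.

    from itertools import product

    # Toy RTG for the two scope readings of "every student reads a book".
    # Rules map a nonterminal to alternatives (label, child_nonterminals);
    # all names are invented for this sketch.
    RULES = {
        "S":    [("every_x", ["R1"]), ("a_y", ["R2"])],
        "R1":   [("a_y", ["CORE"])],
        "R2":   [("every_x", ["CORE"])],
        "CORE": [("read_x_y", [])],
    }

    def enumerate_trees(nonterminal):
        """Yield every tree (a nested tuple) derivable from `nonterminal`."""
        for label, children in RULES[nonterminal]:
            for subtrees in product(*(enumerate_trees(c) for c in children)):
                yield (label,) + subtrees

    print(list(enumerate_trees("S")))
    # [('every_x', ('a_y', ('read_x_y',))), ('a_y', ('every_x', ('read_x_y',)))]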

Computing weakest readings

Now we are ready to talk about computing the weakest readings of a hypernormally connected dominance graph.
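As a concrete illustration of what "weakest" means (our toy example; the paper works with ERG analyses of Rondane sentences): in "every student reads a book", the reading that gives "a book" wide scope entails the linear-scope reading, so the linear-scope reading is the weakest of the two. In LaTeX notation:

    \exists y.\, \mathit{book}(y) \wedge \forall x.\, (\mathit{student}(x) \rightarrow \mathit{read}(x,y))
    \;\models\;
    \forall x.\, (\mathit{student}(x) \rightarrow \exists y.\, \mathit{book}(y) \wedge \mathit{read}(x,y))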

Redundancy elimination, revisited

The construction we just carried out is useful beyond the computation of weakest readings: characterize the configurations we find interesting as the relative normal forms of an annotated rewrite system R, translate R into a transducer M_R, and intersect conf(D) with the complement of the pre-image under M_R.
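To unpack this pipeline, here is a brute-force Python sketch of the target notion only. It is our illustration: the paper never enumerates readings but works on the RTG via the transducer intersection described above, which is what makes it fast. The single weakening rule and all names are invented. A reading is a weakest reading iff no weakening step turns it into another reading of the same sentence, i.e. iff it is a relative normal form.

    # Weakening rewrite rules as (outer_label, inner_label) pairs, read as
    # "outer may be rotated below inner, yielding a logically weaker tree".
    # The single rule encodes exists-forall => forall-exists.
    WEAKENING = {("a_y", "every_x")}

    def one_step_weakenings(tree):
        """Yield trees reachable from `tree` (a nested tuple) by one
        weakening step; quantifier prefixes here are unary chains."""
        if len(tree) == 1:                    # leaf: nothing to rewrite
            return
        label, child = tree
        if len(child) > 1 and (label, child[0]) in WEAKENING:
            # Rotate the two adjacent operators: f(g(x)) -> g(f(x)).
            yield (child[0], (label,) + child[1:])
        for weaker in one_step_weakenings(child):
            yield (label, weaker)

    def weakest_readings(readings):
        """Keep the relative normal forms: readings none of whose one-step
        weakenings is again a reading of the same sentence."""
        reading_set = set(readings)
        return [t for t in readings
                if not any(w in reading_set for w in one_step_weakenings(t))]

    # The two scope readings of "every student reads a book".
    readings = [("every_x", ("a_y", ("read_x_y",))),  # forall-exists (weak)
                ("a_y", ("every_x", ("read_x_y",)))]  # exists-forall (strong)
    print(weakest_readings(readings))
    # [('every_x', ('a_y', ('read_x_y',)))] -- only the weakest reading survives

Replacing the brute-force filter with the RTG intersection is what lets the same recipe scale to sentences with millions of readings.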

Evaluation

In this section, we evaluate the effectiveness and efficiency of our weakest readings algorithm on a treebank.

Conclusion

In this paper, we have shown how to compute the weakest readings of a dominance graph, characterized by an annotated rewrite system.

Topics

Treebank

Appears in 12 sentences as: Treebank (7), treebank (6)
In Computing Weakest Readings
  1. While applications should benefit from these very precise semantic representations, their usefulness is limited by the presence of semantic ambiguity: On the Rondane Treebank (Oepen et al., 2002), the ERG computes an average of several million semantic representations for each sentence, even when the syntactic analysis is fixed.
    Page 1, “Introduction”
  2. However, no such approach has been worked out in sufficient detail to support the disambiguation of treebank sentences.
    Page 1, “Introduction”
  3. It is of course completely infeasible to compute all readings and compare all pairs for entailment; but even the best known algorithm in the literature (Gabsdil and Striegnitz, 1999) is only an optimization of this basic strategy, and would take months to compute the weakest readings for the sentences in the Rondane Treebank.
    Page 1, “Introduction”
  4. We present an algorithm that computes the relative normal forms, and evaluate it on the underspecified descriptions that the ERG derives on a 624-sentence subcorpus of the Rondane Treebank.
    Page 1, “Introduction”
  5. In Section 6, we define a weakening rewrite system for the ERG and evaluate it on the Rondane Treebank.
    Page 2, “Introduction”
  6. always hnc subgraphs of D. In the worst case, G_D can be exponentially bigger than D, but in practice it turns out that the grammar size remains manageable: even the RTG for the most ambiguous sentence in the Rondane Treebank, which has about 4.5 × 10^12 scope readings, has only about 75,000 rules and can be computed in a few seconds.
    Page 4, “Underspecification”
  7. In this section, we evaluate the effectiveness and efficiency of our weakest readings algorithm on a treebank.
    Page 7, “Evaluation”
  8. We compute RTGs for all sentences in the treebank and measure how many weakest readings remain after the intersection, and how much time this computation takes.
    Page 7, “Evaluation”
  9. For our experiment, we use the Rondane treebank (version of January 2006), a “Redwoods style” (Oepen et al., 2002) treebank containing underspecified representations (USRs) in the MRS formalism (Copestake et al., 2005) for sentences from the tourism domain.
    Page 7, “Evaluation”
  10. In other words, our algorithm brings the semantic ambiguity in the Rondane Treebank down to practically useful levels at a mean runtime investment of a few milliseconds per sentence.
    Page 9, “Evaluation”
  11. Evaluating our algorithm on a subcorpus of the Rondane Treebank, we reduced the mean number of configurations of a sentence from several million to 4.5, in negligible runtime.
    Page 9, “Conclusion”

semantic representations

Appears in 11 sentences as: semantic representation (4), semantic representations (10)
In Computing Weakest Readings
  1. A corpus-based evaluation with a large-scale grammar shows that our algorithm reduces over 80% of sentences to one or two readings, in negligible runtime, and thus makes it possible to work with semantic representations derived by deep large-scale grammars.
    Page 1, “Abstract”
  2. Over the past few years, there has been considerable progress in the ability of manually created large-scale grammars, such as the English Resource Grammar (ERG, Copestake and Flickinger (2000)) or the ParGram grammars (Butt et al., 2002), to parse wide-coverage text and assign it deep semantic representations .
    Page 1, “Introduction”
  3. While applications should benefit from these very precise semantic representations, their usefulness is limited by the presence of semantic ambiguity: On the Rondane Treebank (Oepen et al., 2002), the ERG computes an average of several million semantic representations for each sentence, even when the syntactic analysis is fixed.
    Page 1, “Introduction”
  4. We follow an underspecification approach to managing ambiguity: Rather than deriving all semantic representations from the syntactic analysis, we work with a single, compact underspecified semantic representation, from which the semantic representations can then be extracted by need.
    Page 1, “Introduction”
  5. In other words, we make it feasible for the first time to build an application that uses the individual (weakest) semantic representations computed by the ERG, both in terms of the remaining ambiguity and in terms of performance.
    Page 1, “Introduction”
  6. The idea of deriving a single approximative semantic representation for ambiguous sentences goes back to Hobbs (1983); however, Hobbs only works his algorithm out for a restricted class of quantifiers, and his representations can be weaker than our weakest readings.
    Page 2, “Related work”
  7. The work presented here is related to other approaches that reduce the set of readings of an underspecified semantic representation (USR).
    Page 2, “Related work”
  8. Both of these formalisms can be used to model scope ambiguities compactly by regarding the semantic representations of a sentence as trees.
    Page 2, “Underspecification”
  9. The algorithm presented here makes it possible, for the first time, to derive a single meaningful semantic representation from the syntactic analysis of a deep grammar on a large scale.
    Page 9, “Conclusion”
  10. In the future, it will be interesting to explore how these semantic representations can be used in applications.
    Page 9, “Conclusion”
  11. We could then perform such inferences on (cleaner) semantic representations, rather than strings (as they do).
    Page 9, “Conclusion”
