Incorporating Information Status into Generation Ranking
Cahill, Aoife and Riester, Arndt

Article Structure

Abstract

We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a log-linear surface realisation ranking model.

Introduction

There are many factors that influence word order, e.g. humanness, definiteness, the linear order of grammatical functions, givenness, focus, and constituent weight.

Generation Ranking

The task we are considering is generation ranking.

Information Status

The concept of information status (Prince, 1981; Prince, 1992) involves classifying NP/PP expressions in texts according to the various ways in which they are given or new.

Asymmetries in IS

In order to find out whether IS categories are unevenly distributed within German sentences, we examine a corpus of German radio news bulletins that has been manually annotated for IS (496 annotated sentences in total) using the scheme of Riester (2008b).

Syntactic IS Asymmetries

It seems that IS could, in principle, be quite beneficial in the generation ranking task.

Generation Ranking Experiments

Using the augmented set of IS asymmetries, we design new features to be included in the original model of Cahill et al. (2007).

Discussion

In the work described here, we concentrate only on taking advantage of the information that is readily available to us.

Conclusions

In this paper we presented a novel method of including IS into the task of generation ranking.

Topics

log-linear

Appears in 10 sentences as: log-linear (10)
In Incorporating Information Status into Generation Ranking
  1. We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a log-linear surface realisation ranking model.
    Page 1, “Abstract”
  2. We build a log-linear model that incorporates these asymmetries for ranking German string realisations from input LFG F-structures.
    Page 1, “Abstract”
  3. (2007), a log-linear model based on the Lexical Functional Grammar (LFG) Framework (Kaplan and Bresnan, 1982).
    Page 1, “Generation Ranking”
  4. (2007) describe a log-linear model that uses linguistically motivated features and improves over a simple trigram language model baseline.
    Page 2, “Generation Ranking”
  5. We take this log-linear model as our starting point.
    Page 2, “Generation Ranking”
  6. These are all automatically removed from the list of features to give a total of 130 new features for the log-linear ranking model.
    Page 5, “Generation Ranking Experiments”
  7. We train the log-linear ranking model on 7759 F-structures from the TIGER treebank.
    Page 5, “Generation Ranking Experiments”
  8. We tune the parameters of the log-linear model on a small development set of 63 sentences, and carry out the final evaluation on 261 unseen sentences.
    Page 5, “Generation Ranking Experiments”
  9. We evaluate the string chosen by the log-linear model against the original treebank string in terms of exact match and BLEU score (Papineni et al., 2002).
    Page 5, “Generation Ranking Experiments”
  10. By calculating strong asymmetries between pairs of IS labels, and establishing the most frequent syntactic characteristics of these asymmetries, we designed a new set of features for a log-linear ranking model.
    Page 8, “Conclusions”

See all papers in Proc. ACL 2009 that mention log-linear.

See all papers in Proc. ACL that mention log-linear.


BLEU

Appears in 9 sentences as: BLEU (9)
In Incorporating Information Status into Generation Ranking
  1. We show that it achieves a statistically significantly higher BLEU score than the baseline system without these features.
    Page 1, “Abstract”
  2. Model BLEU Match (%)
    Page 5, “Generation Ranking Experiments”
  3. We evaluate the string chosen by the log-linear model against the original treebank string in terms of exact match and BLEU score (Papineni et al., 2002).
    Page 5, “Generation Ranking Experiments”
  4. We achieve an improvement of 0.0168 BLEU points and 1.91 percentage points in exact match.
    Page 7, “Generation Ranking Experiments”
  5. The improvement in BLEU is statistically significant (p < 0.01) using the paired bootstrap resampling significance test (Koehn, 2004).
    Page 7, “Generation Ranking Experiments”
  6. Model BLEU Match (%)
    Page 7, “Generation Ranking Experiments”
  7. The difference in BLEU score between the model of Cahill et al.
    Page 7, “Generation Ranking Experiments”
  8. Given that we only looked at IS factors within a sentence, we think that such a significant improvement in BLEU and exact match scores is very encouraging.
    Page 8, “Discussion”
  9. In comparison to a baseline model, we achieve statistically significant improvement in BLEU score.
    Page 8, “Conclusions”


word order

Appears in 7 sentences as: word order (7)
In Incorporating Information Status into Generation Ranking
  1. There are many factors that influence word order, e.g. humanness, definiteness, the linear order of grammatical functions, givenness, focus, and constituent weight.
    Page 1, “Introduction”
  2. It is common knowledge that information status (henceforth, IS) has a strong influence on syntax and word order; for instance, in inversions, where the subject follows some preposed element, Birner (1994) reports that the preposed element must not be newer in the discourse than the subject.
    Page 1, “Introduction”
  3. We believe, however, that despite this shortcoming, we can still take advantage of some of the insights gained from looking at the influence of IS on word order.
    Page 1, “Introduction”
  4. If we limit ourselves to single sentences, the task for the model is then to choose the string that is closest to the “default” expected word order (i.e.
    Page 3, “Generation Ranking”
  5. Even if some of the sentences we are learning from are marked in terms of word order, the ratios allow us to still learn the predominant order, since the marked order should occur much less frequently and the ratio will remain low.
    Page 4, “Asymmetries in IS”
  6. Pulman (1997) also uses information about parallelism to predict word order.
    Page 8, “Discussion”
  7. and Klabunde (2000) describe a sentence planner for German that annotates the propositional input with discourse-related features in order to determine the focus, and thus influence word order and accentuation.
    Page 8, “Discussion”
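
Item 5 above hints at how the asymmetries are obtained: for each pair of IS labels, count how often one precedes the other within a sentence and take the ratio of the two orders, so that a corpus containing some marked word orders still reveals the predominant one. A hedged sketch of that counting; the label names and toy corpus are invented for the example.

```python
from collections import Counter

def asymmetry_ratios(sentences):
    """For each ordered pair of IS labels, count how often the first
    precedes the second within a sentence, and convert the counts
    into order ratios.

    `sentences` is a list of label sequences in surface order.
    """
    order = Counter()
    for labels in sentences:
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                if a != b:
                    order[(a, b)] += 1  # a precedes b in this sentence
    ratios = {}
    for (a, b), n_ab in order.items():
        n_ba = order.get((b, a), 0)
        ratios[(a, b)] = n_ab / (n_ab + n_ba)  # share of a-before-b orders
    return ratios

# Toy corpus with invented labels: "given" usually precedes "new",
# with one marked sentence in the opposite order.
corpus = [["given", "new"], ["given", "new"], ["new", "given"]]
r = asymmetry_ratios(corpus)
```

A pair whose ratio is far from 0.5 in a sufficiently large corpus is a strong asymmetry; its most frequent syntactic correlates can then be turned into ranking features.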


log-linear model

Appears in 6 sentences as: log-linear model (6)
In Incorporating Information Status into Generation Ranking
  1. We build a log-linear model that incorporates these asymmetries for ranking German string realisations from input LFG F-structures.
    Page 1, “Abstract”
  2. (2007), a log-linear model based on the Lexical Functional Grammar (LFG) Framework (Kaplan and Bresnan, 1982).
    Page 1, “Generation Ranking”
  3. (2007) describe a log-linear model that uses linguistically motivated features and improves over a simple trigram language model baseline.
    Page 2, “Generation Ranking”
  4. We take this log-linear model as our starting point.
    Page 2, “Generation Ranking”
  5. We tune the parameters of the log-linear model on a small development set of 63 sentences, and carry out the final evaluation on 261 unseen sentences.
    Page 5, “Generation Ranking Experiments”
  6. We evaluate the string chosen by the log-linear model against the original treebank string in terms of exact match and BLEU score (Papineni et al., 2002).
    Page 5, “Generation Ranking Experiments”


manually annotated

Appears in 5 sentences as: manual annotation (1) manual annotations (1) manually annotate (1) manually annotated (2)
In Incorporating Information Status into Generation Ranking
  1. In order to find out whether IS categories are unevenly distributed within German sentences, we examine a corpus of German radio news bulletins that has been manually annotated for IS (496 annotated sentences in total) using the scheme of Riester (2008b).
    Page 4, “Asymmetries in IS”
  2. The problem, of course, is that we do not possess any reliable system of automatically assigning IS labels to unknown text and manual annotations are costly and time-consuming.
    Page 4, “Syntactic IS Asymmetries”
  3. (2007) present work on predicting the dative alternation in English using 14 features relating to information status which were manually annotated in their corpus.
    Page 8, “Discussion”
  4. In our work, we manually annotate a small corpus in order to learn generalisations.
    Page 8, “Discussion”
  5. From these we learn features that approximate the generalisations, enabling us to apply them to large amounts of unseen data without further manual annotation.
    Page 8, “Discussion”


BLEU score

Appears in 4 sentences as: BLEU score (4)
In Incorporating Information Status into Generation Ranking
  1. We show that it achieves a statistically significantly higher BLEU score than the baseline system without these features.
    Page 1, “Abstract”
  2. We evaluate the string chosen by the log-linear model against the original treebank string in terms of exact match and BLEU score (Papineni et al., 2002).
    Page 5, “Generation Ranking Experiments”
  3. The difference in BLEU score between the model of Cahill et al.
    Page 7, “Generation Ranking Experiments”
  4. In comparison to a baseline model, we achieve statistically significant improvement in BLEU score.
    Page 8, “Conclusions”


statistically significant

Appears in 4 sentences as: statistically significant (4) statistically significantly (1)
In Incorporating Information Status into Generation Ranking
  1. We show that it achieves a statistically significantly higher BLEU score than the baseline system without these features.
    Page 1, “Abstract”
  2. The improvement in BLEU is statistically significant (p < 0.01) using the paired bootstrap resampling significance test (Koehn, 2004).
    Page 7, “Generation Ranking Experiments”
  3. (2007) and the model that only takes syntactic-based asymmetries into account is not statistically significant, while the difference between Model 1 and this model is statistically significant (p < 0.05).
    Page 7, “Generation Ranking Experiments”
  4. In comparison to a baseline model, we achieve statistically significant improvement in BLEU score.
    Page 8, “Conclusions”


treebank

Appears in 4 sentences as: treebank (4)
In Incorporating Information Status into Generation Ranking
  1. We train the log-linear ranking model on 7759 F-structures from the TIGER treebank.
    Page 5, “Generation Ranking Experiments”
  2. We generate strings from each F-structure and take the original treebank string to be the labelled example.
    Page 5, “Generation Ranking Experiments”
  3. We evaluate the string chosen by the log-linear model against the original treebank string in terms of exact match and BLEU score (Papineni et al., 2002).
    Page 5, “Generation Ranking Experiments”
  4. This corpus contains text of a similar domain to the TIGER treebank.
    Page 8, “Discussion”
