Automatic sense prediction for implicit discourse relations in text
Pitler, Emily and Louis, Annie and Nenkova, Ani

Article Structure

Abstract

We present a series of experiments on automatically identifying the sense of implicit discourse relations, i.e., relations that are not marked with an explicit discourse connective.

Introduction

Implicit discourse relations abound in text and readers easily recover the sense of such relations during semantic interpretation.

Related Work

Experiments on implicit and explicit relations: Previous work has dealt with the prediction of discourse relation sense, but often for explicit relations and at the sentence level.

Penn Discourse Treebank

For our experiments, we use the Penn Discourse Treebank (PDTB; Prasad et al., 2008), the largest available annotated corpus of discourse relations.

Word pair features in prior work

Cross product of words: Discourse connectives are the most reliable predictors of the semantic sense of the relation (Marcu, 2000; Pitler et al., 2008).

Analysis of word pair features

For the analysis of word pair features, we use a large collection of automatically extracted explicit examples from the experiments in Blair-Goldensohn et al. (2007).

Features for sense prediction of implicit discourse relations

The contrast in the “popular”/“oblivion” example we started with above can be analyzed in terms of lexical relations (near-antonyms), but it could also be explained by the different polarities of the two words: “popular” is generally a positive word, while “oblivion” has negative connotations.
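That intuition can be turned into features by counting positive and negative words in each span and crossing the counts, so an opposite-polarity pair like “popular”/“oblivion” surfaces as a pos–neg feature. A minimal sketch, using a hypothetical mini-lexicon in place of the full sentiment lexicon the paper draws polarity tags from:

```python
# Hypothetical mini polarity lexicon; a real system would load a full
# sentiment lexicon rather than this toy dictionary.
POLARITY = {"popular": "pos", "good": "pos", "oblivion": "neg", "bad": "neg"}

def polarity_counts(span):
    """Count positive and negative words in one text span."""
    tags = [POLARITY.get(w) for w in span.lower().split()]
    return tags.count("pos"), tags.count("neg")

def polarity_features(span1, span2):
    """Counts per span plus their cross product, so that opposite-polarity
    pairs (pos in one span, neg in the other) can signal a Comparison."""
    p1, n1 = polarity_counts(span1)
    p2, n2 = polarity_counts(span2)
    return {"pos1": p1, "neg1": n1, "pos2": p2, "neg2": n2,
            "pos1*neg2": p1 * n2, "neg1*pos2": n1 * p2}
```

The crossed counts, not the raw ones, are what encode the polarity *contrast* between the two arguments.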

Classification Results

For all experiments, we used sections 2-20 of the PDTB for training and sections 21-22 for testing.

Conclusion

We have presented the first study that predicts implicit discourse relations in a realistic setting (distinguishing a relation of interest from all others, where the relations occur in their natural distributions).

Acknowledgments

This work was partially supported by NSF grants IIS-0803159, IIS-0705671 and IGERT 0504487.

Topics

word pairs

Appears in 31 sentences as: word pair (11) Word pairs (4) word pairs (18)
In Automatic sense prediction for implicit discourse relations in text
  1. We examine the most informative word pair features and find that they are not the semantically-related pairs that researchers had hoped.
    Page 1, “Introduction”
  2. Indeed, word pairs form the basic feature of most previous work on classifying implicit relations (Marcu and Echihabi, 2001; Blair-Goldensohn et al., 2007; Sporleder and Lascarides, 2008) or the simpler task of predicting which connective should be used to express a relation (Lapata and Lascarides, 2004).
    Page 3, “Word pair features in prior work”
  3. Semantic relations vs. function word pairs: If the hypothesis that word pairs trigger discourse relations were true, the analysis of unambiguous relations could be used to discover pairs of words with causal or contrastive relations holding between them.
    Page 3, “Word pair features in prior work”
  4. At the same time, feature selection is always necessary for word pairs, which are numerous and lead to data sparsity problems.
    Page 3, “Word pair features in prior work”
  5. Marcu and Echihabi (2001) considered only nouns, verbs and other cue phrases in word pairs.
    Page 3, “Word pair features in prior work”
  6. (2007) proposed several refinements of the word pair model.
    Page 3, “Word pair features in prior work”
  7. In the work we describe next, we use feature selection to investigate word pairs in detail.
    Page 3, “Word pair features in prior work”
  8. For the analysis of word pair features, we use a large collection of automatically extracted explicit examples from the experiments in Blair-Goldensohn et al. (2007).
    Page 3, “Analysis of word pair features”
  9. For the complete set of 10,000 examples, word pair features were computed.
    Page 3, “Analysis of word pair features”
  10. After removing word pairs that appear fewer than 5 times, the remaining features were ranked by information gain using the MALLET toolkit.
    Page 3, “Analysis of word pair features”
  11. Table 1 lists the word pairs with highest information gain for the Contrast vs. Other and Cause vs. Other classification tasks.
    Page 3, “Analysis of word pair features”
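The word pair pipeline the excerpts above describe can be sketched as follows. The corpus is a toy stand-in for the 10,000 explicit examples, and `word_pairs` is a hypothetical helper; the steps mirror the description: take the cross product of tokens from the two spans, drop pairs seen fewer than 5 times, then rank the survivors (here by raw frequency, standing in for MALLET's information-gain ranking):

```python
from collections import Counter
from itertools import product

def word_pairs(span1, span2):
    """Cross product of tokens from the two arguments of a relation."""
    return [(w1, w2) for w1, w2 in product(span1.lower().split(),
                                           span2.lower().split())]

# Toy stand-in for the collection of explicit examples.
corpus = [("stocks rose sharply", "investors were pleased")] * 6

# Count each pair's occurrences across the whole corpus.
counts = Counter(p for s1, s2 in corpus for p in word_pairs(s1, s2))

# Keep only pairs seen at least 5 times, then rank; the paper ranks
# the remaining features by information gain instead of frequency.
features = [p for p, c in counts.most_common() if c >= 5]
```

Even this sketch makes the sparsity problem visible: every token in one span pairs with every token in the other, so feature counts grow quadratically and aggressive filtering is unavoidable.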


f-score

Appears in 14 sentences as: f-score (17)
  1. The table lists the f-score for each of the target relations, with overall accuracy shown in brackets.
    Page 6, “Classification Results”
  2. Given that the experiments are run on the natural distribution of the data, which is skewed toward Expansion relations, the f-score is the more important measure to track.
    Page 6, “Classification Results”
  3. Our random baseline is the f-score one would achieve by randomly assigning each class in proportion to its true distribution in the test set.
    Page 6, “Classification Results”
  4. Our features provide 6% to 18% absolute improvements in f-score over the baseline for each of the four tasks.
    Page 6, “Classification Results”
  5. However, since Expansion forms the largest class of relations, its f-score is still the highest overall.
    Page 6, “Classification Results”
  6. Surprisingly, polarity was actually one of the worst classes of features for Comparison, achieving an f-score of 16.33 (in contrast to using the first, last and first three words of the sentences as features, which leads to an f-score of 21.01).
    Page 7, “Classification Results”
  7. The two most useful classes of features for recognizing Comparison relations were the first, last and first three words in the sentence and the context features that indicate the presence of a paragraph boundary or of an explicit relation just before or just after the location of the hypothesized implicit relation (19.32 f-score).
    Page 7, “Classification Results”
  8. Contingency: The two best features for the Contingency vs. Other distinction were verb information (36.59 f-score) and the first, last and first three words in the sentence (36.75 f-score).
    Page 7, “Classification Results”
  9. The features that help achieve the best f-score are all features that were found to be useful in identifying other relations.
    Page 7, “Classification Results”
  10. Yet again, the first and last words of the sentence turned out to be useful indicators for temporal relations (15.93 f-score).
    Page 7, “Classification Results”
  11. This model gives 4% better absolute f-score for Comparison and 14% for Contingency over Wordpairs-TextRels.
    Page 7, “Classification Results”
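The f-score and the random baseline described in excerpt 3 can be sketched directly. Assigning the target class at random in proportion p to its true frequency gives expected precision = recall = p, so the baseline f-score equals the class proportion itself; the counts below are hypothetical, not the paper's:

```python
def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def random_baseline(p):
    """Expected f-score of guessing the target class with probability p
    when its true proportion in the test set is also p."""
    n = 10000                 # hypothetical test-set size
    tp = p * p * n            # predicted positive and truly positive
    fp = p * (1 - p) * n      # predicted positive, truly negative
    fn = (1 - p) * p * n      # predicted negative, truly positive
    return f_score(tp, fp, fn)
```

This is why f-score, not accuracy, is the right measure on skewed data: always predicting "Other" yields high accuracy but an f-score of zero on the target relation.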


language models

Appears in 6 sentences as: language model (1) language models (5)
  1. For each sense, we created unigram and bigram language models over the implicit examples in the training set.
    Page 5, “Features for sense prediction of implicit discourse relations”
  2. We compute each example’s probability according to each of these language models.
    Page 5, “Features for sense prediction of implicit discourse relations”
  3. of the spans’ likelihoods according to the various language models.
    Page 5, “Features for sense prediction of implicit discourse relations”
  4. Expl-LM: This feature ranks the text spans according to language models derived from the explicit examples in the TextRels corpus.
    Page 5, “Features for sense prediction of implicit discourse relations”
  5. However, the corpus contains only Cause, Contrast and No-relation, hence we expect the WSJ language models to be more helpful.
    Page 5, “Features for sense prediction of implicit discourse relations”
  6. The language model features were completely useless for distinguishing contingencies from other relations.
    Page 7, “Classification Results”
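The per-sense language model features above can be sketched as one n-gram model per sense, each scored on the example's text. The add-one-smoothed bigram model and the one-sentence training sets below are illustrative stand-ins, not the paper's implementation:

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Bigram and unigram counts for add-one smoothing, plus vocab size."""
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for s in sentences:
        toks = ["<s>"] + s.split()
        vocab.update(toks)
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks, toks[1:]))
    return bigrams, unigrams, len(vocab)

def log_prob(lm, sentence):
    """Add-one-smoothed bigram log-probability of a text span."""
    bigrams, unigrams, v = lm
    toks = ["<s>"] + sentence.split()
    return sum(math.log((bigrams[b] + 1) / (unigrams[b[0]] + v))
               for b in zip(toks, toks[1:]))

# One LM per sense; an example's features are its log-probabilities
# under each sense's LM (the paper also uses the ranks of these scores).
lms = {sense: train_bigram_lm(examples)
       for sense, examples in {
           "Cause": ["prices fell because demand dropped"],
           "Contrast": ["prices fell but demand rose"]}.items()}
features = {s: log_prob(lm, "prices fell") for s, lm in lms.items()}
```

The span itself never appears as a feature; only its relative likelihood under the competing sense models does, which keeps the feature space small.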


Semantic relations

Appears in 4 sentences as: semantic relation (1) Semantic relations (1) semantic relations (1) semantically related (1)
  1. Semantic relations vs. function word pairs: If the hypothesis that word pairs trigger discourse relations were true, the analysis of unambiguous relations could be used to discover pairs of words with causal or contrastive relations holding between them.
    Page 3, “Word pair features in prior work”
  2. One approach for reducing the number of features follows the hypothesis of semantic relations between words.
    Page 3, “Word pair features in prior work”
  3. Also note that the only two features predictive of the comparison class (indicated by * in Table 1), the-it and to-it, contain only function words rather than semantically related non-function words.
    Page 4, “Analysis of word pair features”
  4. We show that the features in fact do not capture semantic relations but rather give information about function word co-occurrences.
    Page 8, “Conclusion”


Treebank

Appears in 4 sentences as: Treebank (4)
  1. For our experiments, we use the Penn Discourse Treebank , the largest existing corpus of discourse annotations for both implicit and explicit relations.
    Page 1, “Introduction”
  2. For our experiments, we use the Penn Discourse Treebank (PDTB; Prasad et al., 2008), the largest available annotated corpus of discourse relations.
    Page 2, “Penn Discourse Treebank”
  3. The PDTB contains discourse annotations over the same 2,312 Wall Street Journal (WSJ) articles as the Penn Treebank.
    Page 2, “Penn Discourse Treebank”
  4. Our final verb features were the part of speech tags (gold-standard from the Penn Treebank) of the main verb.
    Page 5, “Features for sense prediction of implicit discourse relations”
