Learning Script Knowledge with Web Experiments
Regneri, Michaela and Koller, Alexander and Pinkal, Manfred

Article Structure

Abstract

We describe a novel approach to unsupervised learning of the events that make up a script, along with constraints on their temporal ordering.

Introduction

A script is “a standardized sequence of events that describes some stereotypical human activity such as going to a restaurant or visiting a doctor” (Barr and Feigenbaum, 1981).

Related Work

Approaches to learning script-like knowledge are not new.

Scripts

Before we delve into the technical details, let us establish some terminology.

Data Acquisition

In order to automatically learn TSGs, we selected 22 scenarios for which we collected ESDs.

Temporal Script Graphs

We will now describe how we compute a temporal script graph from the collected data.

Evaluation

We evaluated the two core aspects of our system: its ability to recognize descriptions of the same event (paraphrases) and the temporal constraints it derives over the event descriptions (the happens-before relation).

Conclusion

We conclude with a summary of this paper, some discussion, and pointers to future work.

Topics

semantically similar

Appears in 7 sentences as: Semantic similarity (1), semantic similarity (2), semantically similar (4)
In Learning Script Knowledge with Web Experiments
  1. Crucially, our algorithm exploits the sequential structure of the ESDs to distinguish event descriptions that occur at different points in the script storyline, even when they are semantically similar.
    Page 1, “Introduction”
  2. 5.2 Semantic similarity
    Page 4, “Temporal Script Graphs”
  3. Intuitively, we want the MSA to prefer the alignment of two phrases if they are semantically similar, i.e.
    Page 4, “Temporal Script Graphs”
  4. with a weighted edge; the weight reflects the semantic similarity of the nodes’ event descriptions as described in Section 5.2.
    Page 7, “Evaluation”
  5. Levenshtein Baseline: This system follows the same steps as our system, but using Levenshtein distance as the measure of semantic similarity for MSA and for node merging (cf.
    Page 7, “Evaluation”
  6. The clustering system, which can’t exploit the sequential information from the ESDs, has trouble distinguishing semantically similar phrases (high recall, low precision).
    Page 8, “Evaluation”
  7. We showed that our system outperforms two baselines and sometimes approaches human-level performance, especially because it can exploit the sequential structure of the script descriptions to separate clusters of semantically similar events.
    Page 9, “Conclusion”
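
Sentence 5 above describes a baseline that swaps in Levenshtein distance as the semantic similarity measure. As a rough illustration only (the paper does not quote its normalization, so the normalized variant below is an assumption), such a measure could be sketched in Python as:

    def levenshtein(a, b):
        # Classic dynamic-programming edit distance between two strings.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                cost = 0 if ca == cb else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution
            prev = curr
        return prev[-1]

    def levenshtein_similarity(a, b):
        # Turn the distance into a [0, 1] similarity score; this
        # normalization is an assumption, not the paper's formula.
        if not a and not b:
            return 1.0
        return 1.0 - levenshtein(a, b) / max(len(a), len(b))

    # levenshtein_similarity("open the door", "open door")  ->  ~0.69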

f-score

Appears in 4 sentences as: f-score (4)
In Learning Script Knowledge with Web Experiments
  1. We calculated precision, recall, and f-score for our system, the baselines, and the upper bound as follows, with allsystem being the number of pairs labeled as paraphrase or happens-before, allgold the respective number of pairs in the gold standard, and correct the number of pairs labeled correctly by the system.
    Page 7, “Evaluation”
  2. The f-score for the upper bound is in the column upper.
    Page 7, “Evaluation”
  3. For the f-score values, we calculated the significance for the difference between our system and the baselines as well as the upper bound, using a resampling test (Edgington, 1986).
    Page 7, “Evaluation”
  4. However, recall and f-score for this trivial lower bound would be 0.
    Page 8, “Evaluation”
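
In the terms of sentence 1 above, the three measures reduce to the standard formulas; the balanced F1 weighting shown here is an assumption, as the excerpt does not state it explicitly:

    \[
    \text{precision} = \frac{correct}{all_{system}}, \qquad
    \text{recall} = \frac{correct}{all_{gold}}, \qquad
    \text{f-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
    \]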

Mechanical Turk

Appears in 3 sentences as: Mechanical Turk (3)
In Learning Script Knowledge with Web Experiments
  1. In particular, the use of the Amazon Mechanical Turk, which we use here, has been evaluated and shown to be useful for language processing tasks (Snow et al., 2008).
    Page 2, “Related Work”
  2. We presented each pair to 5 non-experts, all US residents, via Mechanical Turk.
    Page 6, “Evaluation”
  3. We want to thank Dustin Smith for the OMICS data, Alexis Palmer for her support with Amazon Mechanical Turk, Nils Bendfeldt for the creation of all web forms and Ines Rehbein for her effort
    Page 9, “Conclusion”

similarity measure

Appears in 3 sentences as: similarity measure (3)
In Learning Script Knowledge with Web Experiments
  1. On the basis of this pseudo-parse, we compute the similarity measure sim:
    Page 5, “Temporal Script Graphs”
  2. The semantic constraints check whether the event descriptions of the merged node would be sufficiently consistent according to the similarity measure from Section 5.2.
    Page 5, “Temporal Script Graphs”
  3. The Levenshtein similarity measure, on the other hand, is too restrictive and thus results in comparatively high precision, but very low recall.
    Page 8, “Evaluation”
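
Sentence 2 above describes a consistency check applied before merging nodes. One plausible reading, offered purely as an assumption about how such a check could work (the paper's actual constraints are not quoted here), is that every pair of event descriptions in the would-be merged node must clear a similarity threshold:

    def sufficiently_consistent(descriptions, sim, threshold=0.5):
        # Hypothetical check: all pairwise similarities among the merged
        # node's event descriptions must reach the threshold. Both the
        # pairwise rule and the threshold value are assumptions.
        return all(sim(a, b) >= threshold
                   for i, a in enumerate(descriptions)
                   for b in descriptions[i + 1:])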
