Joint Event Extraction via Structured Prediction with Global Features
Li, Qi and Ji, Heng and Huang, Liang

Article Structure

Abstract

Traditional approaches to the task of ACE event extraction usually rely on sequential pipelines with multiple stages, which suffer from error propagation since event triggers and arguments are predicted in isolation by independent local classifiers.

Introduction

Event extraction is an important and challenging task in Information Extraction (IE), which aims to discover event triggers with specific types and their arguments.

Event Extraction Task

In this paper we focus on the event extraction task defined in the Automatic Content Extraction (ACE) evaluation. The task defines 8 event types and 33 subtypes, such as Attack and End-Position.

Joint Framework for Event Extraction

Based on the hypothesis that facts are interdependent, we propose to use structured perceptron with inexact search to jointly extract triggers and arguments that co-occur in the same sentence.

Experiments

4.1 Data set and evaluation metric

Related Work

Most recent studies of ACE event extraction rely on a staged pipeline that consists of separate local classifiers for trigger labeling and argument labeling (Grishman et al., 2005; Ahn, 2006; Ji and Grishman, 2008; Chen and Ji, 2009; Liao and Grishman, 2010; Hong et al., 2011; Li et al., 2012a; Chen and Ng, 2012).

Conclusions and Future Work

We presented a joint framework for ACE event extraction based on structured perceptron with inexact search.

Topics

beam size

Appears in 15 sentences as: Beam size (1) beam size (13) beam sizes (2)
In Joint Event Extraction via Structured Prediction with Global Features
  1. In Section 4.5 we will show that the standard perceptron introduces many invalid updates especially with smaller beam sizes, also observed by Huang et al.
    Page 3, “Joint Framework for Event Extraction”
  2. Then the K-best partial configurations are selected to the beam, assuming the beam size is K.
    Page 4, “Joint Framework for Event Extraction”
  3. K: Beam size.
    Page 4, “Joint Framework for Event Extraction”
  4. Figure 6 shows the training curves of the averaged perceptron with respect to the performance on the development set when the beam size is 4.
    Page 7, “Experiments”
  5. 4.4 Impact of beam size
    Page 7, “Experiments”
  6. The beam size is an important hyperparameter in both training and testing.
    Page 7, “Experiments”
  7. A larger beam size increases the computational cost, while a smaller beam size may reduce performance.
    Page 7, “Experiments”
  8. Table 4 shows the performance on the development set with several different beam sizes.
    Page 7, “Experiments”
  9. When beam size = 4, the algorithm achieved the highest performance on the development set with trigger F1 = 67.9, argument F1 = 51.5, and harmonic mean = 58.6.
    Page 7, “Experiments”
  10. Based on this observation, we set the beam size to 4 for the remaining experiments.
    Page 7, “Experiments”
  11. [Figure axis label: beam size, with tick values 1, 2, 4, 8, 16, 32]
    Page 8, “Experiments”
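
The K-best selection step quoted above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the names `Config`, `expand_beam`, and `toy_score`, and the toy scoring rule itself, are assumptions introduced here. Each partial configuration in the beam is extended by every label, and only the K highest-scoring extensions are kept.

```python
import heapq
from typing import Callable, List, Sequence, Tuple

Config = Tuple[str, ...]  # hypothetical: one label per token processed so far


def expand_beam(
    beam: List[Config],
    labels: Sequence[str],
    score: Callable[[Config], float],
    k: int,
) -> List[Config]:
    """Extend each partial configuration by one label and keep the k best."""
    candidates = [config + (label,) for config in beam for label in labels]
    # nlargest returns the k highest-scoring candidates, best first.
    return heapq.nlargest(k, candidates, key=score)


def toy_score(config: Config) -> float:
    """Toy scorer (an assumption): prefers Attack, then Die, then O."""
    return sum({"Attack": 1.0, "Die": 0.5}.get(label, 0.0) for label in config)


# Start from the empty configuration and process one token with beam size 2.
beam = expand_beam([()], ["O", "Attack", "Die"], toy_score, k=2)
print(beam)
```

A larger `k` keeps more candidates alive at each step, which is where the cost and accuracy trade-off described in the sentences above comes from.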

See all papers in Proc. ACL 2013 that mention beam size.

perceptron

Appears in 15 sentences as: Perceptron (1) perceptron (15)
In Joint Event Extraction via Structured Prediction with Global Features
  1. We propose a novel joint event extraction algorithm to predict the triggers and arguments simultaneously, and use the structured perceptron (Collins, 2002) to train the joint model.
    Page 1, “Introduction”
  2. Therefore we employ beam search in decoding, and train the model using the early-update perceptron variant tailored for beam search (Collins and Roark, 2004; Huang et al., 2012).
    Page 2, “Introduction”
  3. Based on the hypothesis that facts are interdependent, we propose to use structured perceptron with inexact search to jointly extract triggers and arguments that co-occur in the same sentence.
    Page 2, “Joint Framework for Event Extraction”
  4. 3.1 Structured perceptron with beam search
    Page 2, “Joint Framework for Event Extraction”
  5. Structured perceptron is an extension to the standard linear perceptron for structured prediction, which was proposed in (Collins, 2002).
    Page 2, “Joint Framework for Event Extraction”
  6. Given a sentence instance x ∈ X, which in our case is a sentence with argument candidates, the structured perceptron involves the following decoding prob-
    Page 2, “Joint Framework for Event Extraction”
  7. The perceptron learns the model w in an online fashion.
    Page 3, “Joint Framework for Event Extraction”
  8. Figure 2 describes the skeleton of the perceptron training algorithm with beam search.
    Page 3, “Joint Framework for Event Extraction”
  9. In Section 4.5 we will show that the standard perceptron introduces many invalid updates especially with smaller beam sizes, also observed by Huang et al.
    Page 3, “Joint Framework for Event Extraction”
  10. The resulting model is called averaged perceptron (Collins, 2002).
    Page 3, “Joint Framework for Event Extraction”
  11. Figure 2: Perceptron training with beam-search (Huang et al., 2012).
    Page 3, “Joint Framework for Event Extraction”
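
To make the online update and the averaging concrete, here is a minimal sketch under stated assumptions. It uses the standard structured perceptron update, w ← w + f(x, y) − f(x, z), where y is the gold structure and z is the prediction, and averages the weight vector over all training steps. The function names and the sparse-dictionary feature representation are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Features = Dict[str, float]  # hypothetical sparse feature vector


def perceptron_update(w: Dict[str, float], gold: Features, pred: Features) -> None:
    """Apply w <- w + f(x, y) - f(x, z) in place."""
    for name, value in gold.items():
        w[name] += value
    for name, value in pred.items():
        w[name] -= value


def averaged_weights(steps: Iterable[Tuple[Features, Features]]) -> Dict[str, float]:
    """Run online updates and return the average of w over all steps."""
    w: Dict[str, float] = defaultdict(float)
    total: Dict[str, float] = defaultdict(float)
    count = 0
    for gold, pred in steps:
        if gold != pred:  # update only when the prediction is wrong
            perceptron_update(w, gold, pred)
        for name, value in w.items():
            total[name] += value  # accumulate the current weights
        count += 1
    return {name: value / count for name, value in total.items()} if count else {}
```

This naive version re-sums the whole weight vector at every step, which is easy to read but slow; practical implementations usually track timestamps per feature instead.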

sentence-level

Appears in 7 sentences as: sentence-level (7)
In Joint Event Extraction via Structured Prediction with Global Features
  1. Our approach advances state-of-the-art sentence-level event extraction, and even outperforms previous argument labeling methods which use external knowledge from other sentences and documents.
    Page 1, “Abstract”
  2. Unlike the traditional pipeline approach, we present a novel framework for sentence-level event extraction, which predicts triggers and their arguments jointly (Section 3).
    Page 2, “Introduction”
  3. In addition to our baseline, we compare against the sentence-level system reported in Hong et al.
    Page 8, “Experiments”
  4. Remarkably, compared to the cross-entity approach reported in (Hong et al., 2011), which attained 68.3% F1 for triggers and 48.3% for arguments, our approach with global features achieves even better performance on argument labeling although we only used sentence-level information.
    Page 8, “Experiments”
  5. We also show that it outperforms the sentence-level baseline reported in (Ji and Grishman, 2008; Liao and Grishman, 2010), both of which attained 59.7% F1 for triggers and 36.6% for arguments.
    Page 8, “Experiments”
  6. Our approach aims to tackle the problem of sentence-level event extraction, and therefore uses only intra-sentential evidence.
    Page 8, “Experiments”
  7. Ji and Grishman (2008) 59.7 36.6 sentence-level
    Page 9, “Related Work”

beam search

Appears in 5 sentences as: beam search (6)
In Joint Event Extraction via Structured Prediction with Global Features
  1. Therefore we employ beam search in decoding, and train the model using the early-update perceptron variant tailored for beam search (Collins and Roark, 2004; Huang et al., 2012).
    Page 2, “Introduction”
  2. 3.1 Structured perceptron with beam search
    Page 2, “Joint Framework for Event Extraction”
  3. Figure 2 describes the skeleton of the perceptron training algorithm with beam search.
    Page 3, “Joint Framework for Event Extraction”
  4. In each step of the beam search, if the prefix of oracle assignment y falls out from the beam, then the top result in the beam is returned for early update.
    Page 3, “Joint Framework for Event Extraction”
  5. In comparison, our approach is a unified framework based on beam search, which allows us to exploit arbitrary global features efficiently.
    Page 9, “Related Work”
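
The early-update condition quoted above can be sketched as a check over the beams produced at each search step. This is an illustrative sketch only: `find_early_update` and the tuple-of-labels representation are assumptions introduced here. The function returns the (top result, gold prefix) pair at the first step where the gold prefix is no longer in the beam, and `None` if the gold prefix survives every step.

```python
from typing import List, Optional, Sequence, Tuple

Config = Tuple[str, ...]  # hypothetical: one label per token processed so far


def find_early_update(
    beams: Sequence[List[Config]], gold: Config
) -> Optional[Tuple[Config, Config]]:
    """beams[i] is the ranked beam (best first) after step i + 1."""
    for step, beam in enumerate(beams, start=1):
        gold_prefix = gold[:step]
        if gold_prefix not in beam:
            # The oracle prefix fell out of the beam: stop the search here
            # and update against the current top-ranked configuration.
            return beam[0], gold_prefix
    return None


beams = [
    [("O",), ("Attack",)],
    [("O", "O"), ("O", "Attack")],
]
print(find_early_update(beams, gold=("Attack", "O")))
```

In this toy run the gold prefix is still in the beam after the first step but is pruned at the second, so the update would be made on the two-token prefix rather than on the full sentence.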

development set

Appears in 5 sentences as: development set (5)
In Joint Event Extraction via Structured Prediction with Global Features
  1. For comparison, we used the same test set with 40 newswire articles (672 sentences) as in (Ji and Grishman, 2008; Liao and Grishman, 2010) for the experiments, and randomly selected 30 other documents (863 sentences) from different genres as the development set.
    Page 7, “Experiments”
  2. We use the harmonic mean of the trigger F1 measure and the argument F1 measure to measure performance on the development set.
    Page 7, “Experiments”
  3. Figure 6 shows the training curves of the averaged perceptron with respect to the performance on the development set when the beam size is 4.
    Page 7, “Experiments”
  4. Table 4 shows the performance on the development set with several different beam sizes.
    Page 7, “Experiments”
  5. When beam size = 4, the algorithm achieved the highest performance on the development set with trigger F1 = 67.9, argument F1 = 51.5, and harmonic mean = 58.6.
    Page 7, “Experiments”
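
The development-set metric can be checked with a few lines of arithmetic. The sketch below assumes the usual two-value harmonic mean, 2ab / (a + b); the function name is hypothetical. Applied to the trigger and argument F1 scores quoted above for beam size 4, it reproduces the reported value.

```python
def harmonic_mean(trigger_f1: float, argument_f1: float) -> float:
    """Harmonic mean of two scores; defined as 0.0 when both are zero."""
    if trigger_f1 + argument_f1 == 0:
        return 0.0
    return 2 * trigger_f1 * argument_f1 / (trigger_f1 + argument_f1)


# Scores quoted in the text for beam size 4 on the development set.
print(round(harmonic_mean(67.9, 51.5), 1))  # prints 58.6
```

Because the harmonic mean is pulled toward the lower of the two scores, a configuration cannot score well on this metric by improving triggers alone.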

entity mention

Appears in 4 sentences as: entity mention (4)
In Joint Event Extraction via Structured Prediction with Global Features
  1. Event argument: an entity mention, temporal expression or value (e.g.
    Page 2, “Event Extraction Task”
  2. For example, if the nearest entity mention is “Company”, the current token is likely to be Personnel no matter whether it is End-Position or Start-Position.
    Page 5, “Joint Framework for Event Extraction”
  3. In this example, an entity mention is a Victim argument to the Die event and a Target argument to the Attack event, and the two event triggers are connected by the typed dependency advcl.
    Page 6, “Joint Framework for Event Extraction”
  4. If a partial configuration mistakenly classifies more than one entity mention as Place arguments for the same trigger, then it will be penalized.
    Page 7, “Joint Framework for Event Extraction”
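
One way to picture the global feature described in the last sentence is a count of triggers that have been assigned more than one argument with the same role. The sketch below is an assumption-laden illustration, not the paper's feature template: the `(trigger_index, entity_id, role)` triple format, the function name, and the feature name are all hypothetical. The penalty itself would come from the model learning a negative weight for the feature.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

ArgumentEdge = Tuple[int, str, str]  # hypothetical: (trigger_index, entity_id, role)


def duplicate_role_feature(
    edges: Iterable[ArgumentEdge], role: str = "Place"
) -> Dict[str, float]:
    """Fire a feature counting triggers with more than one `role` argument."""
    per_trigger = Counter(trigger for trigger, _, r in edges if r == role)
    violations = sum(1 for n in per_trigger.values() if n > 1)
    return {f"multiple_{role}": float(violations)} if violations else {}


edges = [(0, "e1", "Place"), (0, "e2", "Place"), (1, "e1", "Target")]
print(duplicate_role_feature(edges))
```

Because the feature depends on several argument decisions at once, it can only be evaluated on a partial configuration as a whole, which is why it is computed during beam search rather than by a local classifier.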

joint model

Appears in 4 sentences as: joint model (1) joint modeling (1) jointly model (1) jointly modeling (1)
In Joint Event Extraction via Structured Prediction with Global Features
  1. We propose a novel joint event extraction algorithm to predict the triggers and arguments simultaneously, and use the structured perceptron (Collins, 2002) to train the joint model.
    Page 1, “Introduction”
  2. Unfortunately, it is intractable to perform the exact search in our framework because: (1) by jointly modeling the trigger labeling and argument labeling, the search space becomes much more complex.
    Page 3, “Joint Framework for Event Extraction”
  3. To the best of our knowledge, our work is the first attempt to jointly model these two ACE event subtasks.
    Page 8, “Related Work”
  4. There has been some previous work on joint modeling for biomedical events (Riedel and McCallum, 2011a; Riedel et al., 2009; McClosky et al., 2011; Riedel and McCallum, 2011b).
    Page 9, “Related Work”

gold standard

Appears in 3 sentences as: gold standard (3)
In Joint Event Extraction via Structured Prediction with Global Features
  1. In this work, we assume that argument candidates such as entities are part of the input to the event extraction, and can come from either the gold standard or IE system output.
    Page 2, “Event Extraction Task”
  2. to denote the corresponding gold standard structure, where t_i represents the trigger assignment for the token x_i, and a_{i,j} represents the argument role label for the edge between x_i and argument candidate e_j.
    Page 3, “Joint Framework for Event Extraction”
  3. is the best-reported system in the literature based on gold standard argument candidates.
    Page 8, “Experiments”
