Towards Open-Domain Semantic Role Labeling
Croce, Danilo and Giannone, Cristina and Annesi, Paolo and Basili, Roberto

Article Structure

Abstract

Current Semantic Role Labeling technologies are based on inductive algorithms trained over large scale repositories of annotated examples.

Introduction

The availability of large scale semantic lexicons, such as FrameNet (Baker et al., 1998), allowed the adoption of a wide family of learning paradigms in the automation of semantic parsing.

Related Work

State-of-the-art approaches to frame-based SRL are based on Support Vector Machines, trained over linear models of syntactic features, e.g.

A Distributional Model for Argument Classification

High quality lexical information is crucial for robust open-domain SRL, as semantic generalization highly depends on lexical information.

Empirical Analysis

The aim of the evaluation is to measure the reachable accuracy of the simple model proposed and to compare its impact over in-domain and out-of-domain semantic role labeling tasks.

Conclusions

In this paper, a distributional approach for acquiring a semi-supervised model of argument classification (AC) preferences has been proposed.

Topics

reranking

Appears in 11 sentences as: Reranking (1) reranking (10)
In Towards Open-Domain Semantic Role Labeling
  1. This approach effectively introduces a new step in SRL, also called Joint Reranking (RR), e.g.
    Page 2, “Related Work”
  2. We thus propose to model the reranking phase (RR) as a HMM sequence labeling task.
    Page 5, “A Distributional Model for Argument Classification”
  3. In these experiments we evaluate the quality of the argument classification step against the lexical knowledge acquired from unlabeled texts and the reranking step.
    Page 7, “Empirical Analysis”
  4. The Global Prior model is obtained by applying reranking (Section 3.2) to the best n = 10 candidates provided by the Local Prior model.
    Page 7, “Empirical Analysis”
  5. 6) and the HMM-based reranking characterize the final two configurations.
    Page 8, “Empirical Analysis”
  6. the FN-BNC dataset reported in column 2, it is worth noticing that the proposed model, with backoff and global reranking, is quite effective with respect to the state-of-the-art.
    Page 8, “Empirical Analysis”
  7. Notice how the positive impact of the backoff models and the HMM reranking policy is similarly reflected by all the collections.
    Page 8, “Empirical Analysis”
  8. It then gets benefits from all the analysis stages, in particular the final HMM reranking.
    Page 9, “Empirical Analysis”
  9. The role of HMM reranking is an effective way to compensate for errors in the local argument classifications for all the three domains.
    Page 9, “Empirical Analysis”
  10. in the NTI and ANC data sets), the higher seems the impact of the reranking phase.
    Page 9, “Empirical Analysis”
  11. the estimation of lexico-grammatical preferences through distributional analysis over unlabeled data), estimation (through syntactic or lexical backoff where necessary) and reranking.
    Page 9, “Conclusions”
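
Item 2 above models the reranking phase as an HMM-style sequence labeling task: local classifiers propose n-best role sequences, and a joint model scores each whole sequence. The following is a minimal sketch of that idea; the role labels, argument identifiers, and all probabilities are hypothetical toy values, not taken from the paper.

```python
import math

# Toy HMM parameters (hypothetical): transition scores between adjacent role
# labels and emission scores of each argument given a role.
TRANS = {
    ("<s>", "Agent"): 0.6, ("<s>", "Theme"): 0.4,
    ("Agent", "Theme"): 0.7, ("Agent", "Agent"): 0.1,
    ("Theme", "Agent"): 0.3, ("Theme", "Theme"): 0.2,
}
EMIT = {
    ("Agent", "arg1"): 0.5, ("Theme", "arg1"): 0.3,
    ("Agent", "arg2"): 0.2, ("Theme", "arg2"): 0.6,
}

def sequence_score(roles, args):
    """Log-probability of one candidate role sequence under the toy HMM."""
    score, prev = 0.0, "<s>"
    for role, arg in zip(roles, args):
        score += math.log(TRANS.get((prev, role), 1e-6))  # transition term
        score += math.log(EMIT.get((role, arg), 1e-6))    # emission term
        prev = role
    return score

def rerank(candidates, args):
    """Return the best-scoring sequence among the n-best local candidates."""
    return max(candidates, key=lambda roles: sequence_score(roles, args))

best = rerank([("Agent", "Theme"), ("Theme", "Agent")], ["arg1", "arg2"])
```

The joint score can overturn locally plausible but globally inconsistent label assignments, which is the effect the reranking (RR) step is credited with in the sentences above.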

See all papers in Proc. ACL 2010 that mention reranking.

Semantic Role

Appears in 8 sentences as: Semantic Role (3) semantic role (2) semantic roles (3)
  1. Current Semantic Role Labeling technologies are based on inductive algorithms trained over large scale repositories of annotated examples.
    Page 1, “Abstract”
  2. Semantic Role Labeling (SRL) is the task of automatic recognition of individual predicates together with their major roles (e.g.
    Page 1, “Introduction”
  3. Semantic Role Labeling
    Page 1, “Introduction”
  4. More recently, the state-of-the-art frame-based semantic role labeling system discussed in (Johansson and Nugues, 2008b) reports a 19% drop in accuracy for the argument classification task when a different test domain is targeted (i.e.
    Page 2, “Introduction”
  5. A lexicalized model for individual semantic roles is first defined in order to achieve robust semantic classification local to each argument.
    Page 4, “A Distributional Model for Argument Classification”
  6. As the classification of semantic roles is strictly related to the lexical meaning of argument heads, we adopt a distributional perspective, where the meaning is described by the set of textual contexts in which words appear.
    Page 4, “A Distributional Model for Argument Classification”
  7. However, one single vector is too simplistic a representation given the rich nature of semantic roles FE.
    Page 4, “A Distributional Model for Argument Classification”
  8. The aim of the evaluation is to measure the reachable accuracy of the simple model proposed and to compare its impact over in-domain and out-of-domain semantic role labeling tasks.
    Page 6, “Empirical Analysis”

Role Labeling

Appears in 6 sentences as: Role Labeling (3) role labeling (2) role labels (1)
  1. Current Semantic Role Labeling technologies are based on inductive algorithms trained over large scale repositories of annotated examples.
    Page 1, “Abstract”
  2. Semantic Role Labeling (SRL) is the task of automatic recognition of individual predicates together with their major roles (e.g.
    Page 1, “Introduction”
  3. Semantic Role Labeling
    Page 1, “Introduction”
  4. More recently, the state-of-the-art frame-based semantic role labeling system discussed in (Johansson and Nugues, 2008b) reports a 19% drop in accuracy for the argument classification task when a different test domain is targeted (i.e.
    Page 2, “Introduction”
  5. First, local models are applied to produce role labels over individual arguments; then the joint model is used to decide the entire argument sequence among the set of the n-best competing solutions.
    Page 2, “Related Work”
  6. The aim of the evaluation is to measure the reachable accuracy of the simple model proposed and to compare its impact over in-domain and out-of-domain semantic role labeling tasks.
    Page 6, “Empirical Analysis”

semi-supervised

Appears in 6 sentences as: semi-supervised (6)
  1. Finally, the application of semi-supervised learning is attempted to increase the lexical expressiveness of the model, e.g.
    Page 2, “Introduction”
  2. A semi-supervised statistical model exploiting useful lexical information from unlabeled corpora is proposed.
    Page 2, “Introduction”
  3. Accordingly, a semi-supervised approach for reducing the costs of the manual annotation effort is proposed.
    Page 3, “Related Work”
  4. It embodies the idea that a multitask learning architecture coupled with semi-supervised learning can be effectively applied even to complex linguistic tasks such as SRL.
    Page 3, “Related Work”
  5. In this paper, a distributional approach for acquiring a semi-supervised model of argument classification (AC) preferences has been proposed.
    Page 9, “Conclusions”
  6. Moreover, dimensionality reduction methods alternative to LSA, as currently studied in semi-supervised spectral learning (Johnson and Zhang, 2008), will be experimented with.
    Page 9, “Conclusions”

Semantic Role Labeling

Appears in 5 sentences as: Semantic Role Labeling (3) semantic role labeling (2)
  1. Current Semantic Role Labeling technologies are based on inductive algorithms trained over large scale repositories of annotated examples.
    Page 1, “Abstract”
  2. Semantic Role Labeling (SRL) is the task of automatic recognition of individual predicates together with their major roles (e.g.
    Page 1, “Introduction”
  3. Semantic Role Labeling
    Page 1, “Introduction”
  4. More recently, the state-of-the-art frame-based semantic role labeling system discussed in (Johansson and Nugues, 2008b) reports a 19% drop in accuracy for the argument classification task when a different test domain is targeted (i.e.
    Page 2, “Introduction”
  5. The aim of the evaluation is to measure the reachable accuracy of the simple model proposed and to compare its impact over in-domain and out-of-domain semantic role labeling tasks.
    Page 6, “Empirical Analysis”

classification task

Appears in 4 sentences as: classification task (2) classification tasks (2)
  1. More recently, the state-of-the-art frame-based semantic role labeling system discussed in (Johansson and Nugues, 2008b) reports a 19% drop in accuracy for the argument classification task when a different test domain is targeted (i.e.
    Page 2, “Introduction”
  2. In the argument classification task, the similarity between two argument heads h1 and h2 observed in FrameNet can be computed over h1 and
    Page 4, “A Distributional Model for Argument Classification”
  3. Table 3: Accuracy on Arg classification tasks wrt different clustering policies
    Page 7, “Empirical Analysis”
  4. We measured the performance on the argument classification tasks of different models obtained by combining different choices of o with Eq.
    Page 7, “Empirical Analysis”
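
Item 2 above compares argument heads through their distributional vectors; in such models the comparison is typically a cosine similarity between context vectors. The following self-contained sketch uses toy context-count vectors (the vectors and their dimensions are hypothetical, for illustration only).

```python
import math

def cosine(u, v):
    """Cosine similarity between two distributional head vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy context-count vectors for three argument heads.
h1 = [1.0, 2.0, 0.0]
h2 = [2.0, 4.0, 0.0]  # same contexts as h1, scaled: maximally similar
h3 = [0.0, 0.0, 3.0]  # disjoint contexts: unrelated

sim_12 = cosine(h1, h2)
sim_13 = cosine(h1, h3)
```

Because cosine ignores vector length, heads that occur with the same contexts at different frequencies still come out as highly similar, which is what makes the measure suitable for comparing a test head against FrameNet-annotated heads.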

feature set

Appears in 4 sentences as: feature set (2) feature sets (2)
  1. Notice how this is also a general problem of statistical learning processes, as large fine-grained feature sets are more exposed to the risks of overfitting.
    Page 2, “Introduction”
  2. While these approaches increase the expressive power of the models to capture more general linguistic properties, they rely on complex feature sets , are more demanding about the amount of training information and increase the overall exposure to overfitting effects.
    Page 2, “Related Work”
  3. Results on the Boundary Detection BD task are obtained by training an SVM model on the same feature set presented in (Johansson and Nugues, 2008b) and are slightly below the state-of-the-art BD accuracy reported in (Coppola et al., 2009).
    Page 9, “Empirical Analysis”
  4. Given the relatively simple feature set adopted here, this result is very significant in terms of its resulting efficiency.
    Page 9, “Empirical Analysis”

in-domain

Appears in 4 sentences as: in-domain (4)
  1. The aim of the evaluation is to measure the reachable accuracy of the simple model proposed and to compare its impact over in-domain and out-of-domain semantic role labeling tasks.
    Page 6, “Empirical Analysis”
  2. The in-domain test has been run over the FrameNet annotated corpus, derived from the British National Corpus (BNC).
    Page 6, “Empirical Analysis”
  3. training FN-BNC: 134,697 / 271,560; test in-domain FN-BNC: 14,952 / 30,173
    Page 7, “Empirical Analysis”
  4. In the in-domain scenario, i.e.
    Page 8, “Empirical Analysis”

overfitting

Appears in 4 sentences as: overfitting (4)
  1. The resulting argument classification model promotes a simpler feature space that limits the potential overfitting effects.
    Page 1, “Abstract”
  2. Notice how this is also a general problem of statistical learning processes, as large fine-grained feature sets are more exposed to the risks of overfitting.
    Page 2, “Introduction”
  3. While these approaches increase the expressive power of the models to capture more general linguistic properties, they rely on complex feature sets, are more demanding about the amount of training information and increase the overall exposure to overfitting effects.
    Page 2, “Related Work”
  4. First, we propose a model that does not depend on complex syntactic information in order to minimize the risk of overfitting .
    Page 3, “A Distributional Model for Argument Classification”

semantic parsing

Appears in 4 sentences as: semantic parsing (4)
  1. The availability of large scale semantic lexicons, such as FrameNet (Baker et al., 1998), allowed the adoption of a wide family of learning paradigms in the automation of semantic parsing.
    Page 1, “Introduction”
  2. The above problems are particularly critical for frame-based shallow semantic parsing where, as opposed to more syntactic-oriented semantic labeling schemes (as Propbank (Palmer et al., 2005)), a significant mismatch exists between the semantic descriptors and the underlying syntactic annotation level.
    Page 2, “Introduction”
  3. In (Johansson and Nugues, 2008b) the impact of different grammatical representations on the task of frame-based shallow semantic parsing is studied and the poor lexical generalization problem is outlined.
    Page 2, “Related Work”
  4. The obtained results are close to the state-of-the-art in FrameNet semantic parsing.
    Page 9, “Conclusions”

CoNLL

Appears in 3 sentences as: CoNLL (3)
  1. work of (Gildea and Jurafsky, 2002) and the successful CoNLL evaluation campaigns (Carreras and Marquez, 2005).
    Page 1, “Introduction”
  2. Most of the CoNLL 2005 systems show a significant performance drop when the tested corpus, i.e.
    Page 1, “Introduction”
  3. Indeed, all the best systems in the CoNLL shared task competitions (e.g.
    Page 3, “Related Work”

feature space

Appears in 3 sentences as: feature space (3)
  1. The resulting argument classification model promotes a simpler feature space that limits the potential overfitting effects.
    Page 1, “Abstract”
  2. The model adopts a simple feature space by relying on a limited set of grammatical properties, thus reducing its learning capacity.
    Page 2, “Introduction”
  3. As we will see, the accuracy reachable through a restricted feature space is still quite close to the state-of-the-art, but interestingly the performance drops in out-of-domain tests are avoided.
    Page 2, “Introduction”

joint model

Appears in 3 sentences as: Joint Model (1) joint model (2)
  1. It incorporates strong dependencies within a comprehensive statistical joint model with a rich set of features over multiple argument phrases.
    Page 2, “Related Work”
  2. First, local models are applied to produce role labels over individual arguments; then the joint model is used to decide the entire argument sequence among the set of the n-best competing solutions.
    Page 2, “Related Work”
  3. 3.2 A Joint Model for Argument Classification
    Page 5, “A Distributional Model for Argument Classification”

Latent Semantic

Appears in 3 sentences as: Latent Semantic (2) latent semantic (1)
  1. Moreover, it generalizes lexical information about the annotated examples by applying a geometrical model, in a Latent Semantic Analysis style, inspired by a distributional paradigm (Pado
    Page 2, “Introduction”
  2. Latent Semantic Analysis (LSA) (Landauer and Dumais, 1997) is then applied to M to acquire meaningful representations. LSA exploits the linear transformation called Singular Value Decomposition (SVD) and produces an approximation of the original matrix M, capturing (semantic) dependencies between context vectors.
    Page 4, “A Distributional Model for Argument Classification”
  3. Clustering, as discussed in Section 3.1, generalizes lexical information: clusters of similar heads within the latent semantic space are built from the annotated examples, and they predict the behavior of new unseen words found in the test sentences.
    Page 7, “Empirical Analysis”
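
Item 2 above describes LSA as an SVD-based approximation of the word-by-context matrix M. The following is a minimal sketch of that step on a toy 4x4 co-occurrence matrix; the counts are hypothetical, and the paper of course operates on far larger matrices.

```python
import numpy as np

# Toy word-by-context co-occurrence matrix M (hypothetical counts).
M = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])

# SVD factors M as U @ diag(s) @ Vt; keeping only the top-k singular values
# yields the best rank-k approximation of M, which is the LSA step.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Rows of U[:, :k] * s[:k] serve as k-dimensional latent word representations,
# in which words sharing contexts end up close to each other.
word_vectors = U[:, :k] * s[:k]
```

The rank-k reconstruction smooths the raw counts, so words that never co-occurred directly but share contexts still receive similar latent vectors, which is the generalization effect exploited by the clustering in item 3.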
