Semantic Frame Identification with Distributed Word Representations
Hermann, Karl Moritz and Das, Dipanjan and Weston, Jason and Ganchev, Kuzman

Article Structure

Abstract

We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings.

Introduction

Distributed representations of words have proved useful for a number of tasks.

Overview

Early work in frame-semantic analysis was pioneered by Gildea and Jurafsky (2002).

Frame Identification with Embeddings

We continue using the example sentence from §2.2: “He runs the company.” where we want to disambiguate the frame of runs in context.

Argument Identification

Here, we briefly describe the argument identification model used in our frame-semantic parsing experiments, post frame identification.

Experiments

In this section, we present our experiments and the results achieved.

Discussion

For FrameNet, the WSABIE EMBEDDING model we propose strongly outperforms the baselines on all metrics, and sets a new state of the art.

Conclusion

We have presented a simple model that outperforms the prior state of the art on FrameNet-style frame-semantic parsing, and performs at par with one of the previous-best single-parser systems on PropBank SRL.

Topics

LOG-LINEAR

Appears in 19 sentences as: LOG-LINEAR (14) log-linear (8)
In Semantic Frame Identification with Distributed Word Representations
  1. where pθ is a log-linear model normalized over the set Ry, with features described in Table 1.
    Page 5, “Argument Identification”
  2. Inference Although our learning mechanism uses a local log-linear model, we perform inference globally on a per-frame basis by applying hard structural constraints.
    Page 5, “Argument Identification”
  3. The baselines use a log-linear model that models the following probability at training time:
    Page 6, “Experiments”
  4. For comparison with our model from §3, which we call WSABIE EMBEDDING, we implemented two baselines with the log-linear model.
    Page 6, “Experiments”
  5. We call this baseline LOG-LINEAR WORDS.
    Page 6, “Experiments”
  6. So the second baseline has the same input representation as WSABIE EMBEDDING but uses a log-linear model instead of WSABIE.
    Page 6, “Experiments”
  7. We call this model LOG-LINEAR EMBEDDING.
    Page 6, “Experiments”
  8. The WSABIE EMBEDDING model from §3 performs significantly better than the LOG-LINEAR WORDS baseline, while LOG-LINEAR EMBEDDING underperforms in every metric.
    Page 8, “Experiments”
  9. We believe that the WSABIE EMBEDDING model performs better than the LOG-LINEAR EMBEDDING baseline (that uses the same input representation) because the former setting allows examples with different labels and confusion sets to share information; this is due to the fact that all labels live in the same label space, and a single projection matrix is shared across the examples to map the input features to this space.
    Page 9, “Discussion”
  10. Consequently, the WSABIE EMBEDDING model can share more information between different examples in the training data than the LOG-LINEAR EMBEDDING model.
    Page 9, “Discussion”
  11. Since the LOG-LINEAR WORDS model always performs better than the LOG-LINEAR EMBEDDING model, we conclude that the primary benefit does not come from the input embedding representation.
    Page 9, “Discussion”
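The contrast drawn in these excerpts can be sketched in a few lines: a WSABIE-style scorer shares one projection matrix M across all frame labels, while a log-linear scorer keeps a separate weight vector per label, so its parameters interact only through the normalization constant. All sizes and values below are toy assumptions, not the authors' settings.

```python
import math
import random

random.seed(0)
n_feat, n_dim, n_frames = 6, 3, 4  # toy sizes (hypothetical)

def rand_mat(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# WSABIE-style scoring: one shared matrix M projects the input g(x) into
# the space where every frame label has an embedding, so all labels
# share information through M.
M = rand_mat(n_dim, n_feat)
frame_emb = rand_mat(n_frames, n_dim)

def wsabie_scores(g_x):
    z = [dot(row, g_x) for row in M]       # project the input once
    return [dot(e, z) for e in frame_emb]  # score against every frame

# Log-linear scoring: each frame keeps its own weight vector; the labels
# interact only via the softmax normalization.
W = rand_mat(n_frames, n_feat)

def loglinear_probs(g_x, allowed):
    z = [dot(W[y], g_x) for y in allowed]
    m = max(z)
    exps = [math.exp(s - m) for s in z]
    total = sum(exps)
    return [e / total for e in exps]
```

In the shared-projection setting, updating M for one training example moves the representation used by every label, which is one way to picture the information sharing discussed above.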

See all papers in Proc. ACL 2014 that mention LOG-LINEAR.


dependency path

Appears in 13 sentences as: dependency path (8) Dependency Paths (1) dependency paths (4)
  1. In the second example, for the predicate run, the agent “The athlete” is not a direct dependent, but is connected via a longer dependency path.
    Page 4, “Frame Identification with Embeddings”
  2. Dependency Paths To capture more relevant context, we developed a second context function as follows.
    Page 4, “Frame Identification with Embeddings”
  3. We scanned the training data for a given task (either the PropBank or the FrameNet domains) for the dependency paths that connected the gold predicates to the gold semantic arguments.
    Page 4, “Frame Identification with Embeddings”
  4. This set of dependency paths was deemed the set of possible positions in the initial vector space representation.
    Page 4, “Frame Identification with Embeddings”
  5. Thus for this context function, the block cardinality k was the sum of the number of scanned gold dependency path types and the number of dependency labels.
    Page 4, “Frame Identification with Embeddings”
  6. We performed initial experiments using context extracted from 1) direct dependents, 2) dependency paths, and 3) both.
    Page 4, “Frame Identification with Embeddings”
  7. For all our experiments, setting 3) which concatenates the direct dependents and dependency path always dominated the other two, so we only report results for this setting.
    Page 4, “Frame Identification with Embeddings”
  8. • dependency path between a’s head and the predicate
    Page 5, “Argument Identification”
  9. • the set of dependency labels of the predicate’s children • dependency path conjoined with the POS tag of a’s head
    Page 5, “Argument Identification”
  10. • dependency path conjoined with the word cluster of a’s head
    Page 5, “Argument Identification”
  11. • missingsubj, conjoined with the dependency path
    Page 5, “Argument Identification”
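A dependency path of this kind can be read off a parse by walking from the predicate up to the lowest common ancestor and back down to the argument's head. A minimal sketch over a hand-built toy tree; the sentence, indices, and label encoding are our own illustration, not the authors' data format:

```python
# Each token i has a head index (-1 for root) and an incoming dependency
# label. Toy sentence: "He wanted to run".
tokens = ["He", "wanted", "to", "run"]
head   = [1, -1, 3, 1]
label  = ["nsubj", "root", "mark", "xcomp"]

def ancestors(i):
    """Chain of nodes from i up to the root, inclusive."""
    out = [i]
    while head[i] != -1:
        i = head[i]
        out.append(i)
    return out

def dep_path(pred, arg):
    """String encoding of the path from the predicate to the argument's head:
    labels traversed upward get '^', labels traversed downward get 'v'."""
    up = ancestors(pred)
    down = ancestors(arg)
    common = next(n for n in up if n in down)  # lowest common ancestor
    rise = [label[n] + "^" for n in up[:up.index(common)]]
    fall = [label[n] + "v" for n in reversed(down[:down.index(common)])]
    return " ".join(rise + fall)

path = dep_path(3, 0)  # from "run" up through "wanted", down to "He"
```

Scanning gold predicate–argument pairs in the training data with a function like this yields the inventory of path types that the excerpts describe as positions in the initial vector space.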


semantic role

Appears in 12 sentences as: semantic role (7) semantic roles (6)
  1. Additionally, we report strong results on PropBank-style semantic role labeling in comparison to prior work.
    Page 1, “Abstract”
  2. According to the theory of frame semantics (Fillmore, 1982), a semantic frame represents an event or scenario, and possesses frame elements (or semantic roles) that participate in the
    Page 1, “Introduction”
  3. Most work on frame-semantic parsing has usually divided the task into two major subtasks: frame identification, namely the disambiguation of a given predicate to a frame, and argument identification (or semantic role labeling), the analysis of words and phrases in the sentential context that satisfy the frame’s semantic roles (Das et al., 2010; Das et al., 2014). Here, we focus on the first subtask of frame identification for given predicates; we use our novel method (§3) in conjunction with a standard argument identification model (§4) to perform full frame-semantic parsing.
    Page 1, “Introduction”
  4. Second, we present results on PropBank-style semantic role labeling (Palmer et al., 2005; Meyers et al., 2004; Marquez et al., 2008), that approach strong baselines, and are on par with prior state of the art (Punyakanok et al., 2008).
    Page 1, “Introduction”
  5. 2004; Carreras and Marquez, 2005) on PropBank semantic role labeling (SRL), it has been treated as an important NLP problem.
    Page 2, “Overview”
  6. PropBank The PropBank project (Palmer et al., 2005) is another popular resource related to semantic role labeling.
    Page 2, “Overview”
  7. Like FrameNet, it also has a lexical database that stores type information about verbs, in the form of sense frames and the possible semantic roles each frame could take.
    Page 2, “Overview”
  8. As mentioned in §1, these correspond to a frame disambiguation stage, and a stage that finds the various arguments that fulfill the frame’s semantic roles within the sentence, respectively.
    Page 2, “Overview”
  9. The frame lexicon stores the frames, corresponding semantic roles, and the lexical units associated with the frame.
    Page 3, “Frame Identification with Embeddings”
  10. From a frame lexicon, we look up the set of semantic roles Ry that associate with y.
    Page 5, “Argument Identification”
  11. By overtness, we mean the non-null instantiation of a semantic role in a frame-semantic parse.
    Page 5, “Argument Identification”
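The lexicon lookup in the last two excerpts is, in code, just a mapping from frames to role sets. The entries below are illustrative stand-ins (FrameNet defines the actual frame and role inventory):

```python
# Toy frame lexicon: frame y -> its semantic roles Ry (entries are
# illustrative, not a dump of the real lexicon).
frame_lexicon = {
    "Self_motion": {"Self_mover", "Path", "Goal"},
    "Operating_a_system": {"Operator", "System"},
}

def roles_for(frame):
    """Look up the set Ry of semantic roles that associate with frame y."""
    return frame_lexicon[frame]
```

Argument identification then only has to score candidate spans against the members of Ry for the frame chosen in the first stage.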


state of the art

Appears in 10 sentences as: state of the art (10)
  1. First, we show that for frame identification on the FrameNet corpus (Baker et al., 1998; Fillmore et al., 2003), we outperform the prior state of the art (Das et al., 2014).
    Page 1, “Introduction”
  2. Second, we present results on PropBank-style semantic role labeling (Palmer et al., 2005; Meyers et al., 2004; Marquez et al., 2008), that approach strong baselines, and are on par with prior state of the art (Punyakanok et al., 2008).
    Page 1, “Introduction”
  3. (2010) improved performance, and later set the current state of the art on this task (Das et al., 2014).
    Page 2, “Overview”
  4. This would be a standard NLP approach for the frame identification problem, but is surprisingly competitive with the state of the art.
    Page 6, “Experiments”
  5. (2014) describe the state of the art
    Page 7, “Experiments”
  6. While comparing with prior state of the art on the same corpus, we noted that Das et al.
    Page 7, “Experiments”
  7. For the SEMAFOR LEXICON setup, we also compare with the state of the art from Das
    Page 8, “Experiments”
  8. For FrameNet, the WSABIE EMBEDDING model we propose strongly outperforms the baselines on all metrics, and sets a new state of the art.
    Page 9, “Discussion”
  9. In comparison to prior work on FrameNet, even our baseline models outperform the previous state of the art .
    Page 9, “Discussion”
  10. We have presented a simple model that outperforms the prior state of the art on FrameNet-style frame-semantic parsing, and performs at par with one of the previous-best single-parser systems on PropBank SRL.
    Page 9, “Conclusion”


embeddings

Appears in 9 sentences as: embeddings (9)
  1. We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings.
    Page 1, “Abstract”
  2. We present a model that takes word embeddings as input and learns to identify semantic frames.
    Page 3, “Overview”
  3. We use word embeddings to represent the syntactic context of a particular predicate instance as a vector.
    Page 3, “Overview”
  4. First, we extract the words in the syntactic context of runs; next, we concatenate their word embeddings as described in §2.2 to create an initial vector space representation.
    Page 3, “Frame Identification with Embeddings”
  5. Formally, let x represent the actual sentence with a marked predicate, along with the associated syntactic parse tree; let our initial representation of the predicate context be g(x). Suppose that the word embeddings we start with are of dimension n. Then g is a function from a parsed sentence x to R^{nk}, where k is the number of possible syntactic context types.
    Page 3, “Frame Identification with Embeddings”
  6. So for example “He runs the company” could help the model disambiguate “He owns the company.” Moreover, since g(x) relies on word embeddings rather than word identities, information is shared between words.
    Page 4, “Frame Identification with Embeddings”
  7. If a label occurs multiple times, then the embeddings of the words below this label are averaged.
    Page 4, “Frame Identification with Embeddings”
  8. The second baseline tries to decouple the WSABIE training from the embedding input, and trains a log-linear model using the embeddings.
    Page 6, “Experiments”
  9. Hyperparameters For our frame identification model with embeddings, we search for the WSABIE hyperparameters using the development data.
    Page 6, “Experiments”
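The input representation described in these excerpts can be sketched directly: one block per syntactic context type, embeddings averaged when a label occurs multiple times, and zeros when a slot is absent. The embedding values, dimension, and slot inventory below are toy assumptions:

```python
# Toy word embeddings of dimension n, and the k context-type slots.
emb = {"he": [0.1, 0.2], "company": [0.3, 0.0], "the": [0.0, 0.1]}
n = 2                                  # embedding dimension
slots = ["nsubj", "dobj", "prep_in"]   # k possible context types

def g(context):
    """Context function g(x): context maps a dependency label to the list
    of words under it; returns a vector of length n * k, averaging the
    embeddings when a label occurs multiple times."""
    vec = []
    for s in slots:
        words = context.get(s, [])
        if words:
            block = [sum(emb[w][d] for w in words) / len(words)
                     for d in range(n)]
        else:
            block = [0.0] * n          # absent slot stays zero
        vec.extend(block)
    return vec

# "He runs the company": subject "he", object words "the" + "company".
v = g({"nsubj": ["he"], "dobj": ["the", "company"]})
```

Because positions carry meaning by slot rather than by word identity, two sentences with different words but the same syntactic shape produce comparable vectors.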


role labeling

Appears in 9 sentences as: role labeling (7) role labels (2)
  1. Additionally, we report strong results on PropBank-style semantic role labeling in comparison to prior work.
    Page 1, “Abstract”
  2. Most work on frame-semantic parsing has usually divided the task into two major subtasks: frame identification, namely the disambiguation of a given predicate to a frame, and argument identification (or semantic role labeling), the analysis of words and phrases in the sentential context that satisfy the frame’s semantic roles (Das et al., 2010; Das et al., 2014). Here, we focus on the first subtask of frame identification for given predicates; we use our novel method (§3) in conjunction with a standard argument identification model (§4) to perform full frame-semantic parsing.
    Page 1, “Introduction”
  3. Second, we present results on PropBank-style semantic role labeling (Palmer et al., 2005; Meyers et al., 2004; Marquez et al., 2008), that approach strong baselines, and are on par with prior state of the art (Punyakanok et al., 2008).
    Page 1, “Introduction”
  4. 2004; Carreras and Marquez, 2005) on PropBank semantic role labeling (SRL), it has been treated as an important NLP problem.
    Page 2, “Overview”
  5. PropBank The PropBank project (Palmer et al., 2005) is another popular resource related to semantic role labeling.
    Page 2, “Overview”
  6. Generic core role labels (of which there are seven, namely A0-A5 and AA) for the verb frames are marked in the figure. A key difference between the two annotation systems is that PropBank uses a local frame inventory, where frames are predicate-specific.
    Page 2, “Overview”
  7. Moreover, role labels, although few in number, take specific meaning for each verb frame.
    Page 2, “Overview”
  8. Note that this two-stage approach is unusual for the PropBank corpora when compared to prior work, where the vast majority of published papers have not focused on the verb frame disambiguation problem at all, only focusing on the role labeling stage (see the overview paper of Marquez et al.
    Page 2, “Overview”
  9. Finally, we presented results on PropBank-style semantic role labeling with a system that included the task of automatic verb frame identification, in tune with the FrameNet literature; we believe that such a system produces more interpretable output, both from the perspective of human understanding as well as downstream applications, than pipelines that are oblivious to the verb frame, only focusing on argument analysis.
    Page 9, “Conclusion”


log-linear model

Appears in 8 sentences as: log-linear model (7) log-linear models (1)
  1. where pθ is a log-linear model normalized over the set Ry, with features described in Table 1.
    Page 5, “Argument Identification”
  2. Inference Although our learning mechanism uses a local log-linear model, we perform inference globally on a per-frame basis by applying hard structural constraints.
    Page 5, “Argument Identification”
  3. The baselines use a log-linear model that models the following probability at training time:
    Page 6, “Experiments”
  4. For comparison with our model from §3, which we call WSABIE EMBEDDING, we implemented two baselines with the log-linear model.
    Page 6, “Experiments”
  5. So the second baseline has the same input representation as WSABIE EMBEDDING but uses a log-linear model instead of WSABIE.
    Page 6, “Experiments”
  6. However, since the input representation is shared across all frames, every other training example from all the lexical units affects the optimal estimate, since they all modify the joint parameter matrix M. By contrast, in the log-linear models each label has its own set of parameters, and they interact only via the normalization constant.
    Page 9, “Discussion”
  7. They also use a log-linear model, but they incorporate a latent variable that uses WordNet (Fellbaum, 1998) to get lexical-semantic relationships and smooths over frames for ambiguous lexical units.
    Page 9, “Discussion”
  8. Another difference is that when training the log-linear model, they normalize over all frames, while we normalize over the allowed frames for the current lexical unit.
    Page 9, “Discussion”
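The normalization difference in the last excerpt is easy to make concrete: the same scores yield different probabilities depending on whether the softmax runs over all frames or only over the frames listed for the lexical unit. The scores and frame names below are invented for illustration:

```python
import math

def softmax_over(scores, support):
    """Normalize exp(scores) over the given support set only."""
    z = {y: math.exp(scores[y]) for y in support}
    total = sum(z.values())
    return {y: v / total for y, v in z.items()}

# Toy frame scores for one predicate instance (values invented).
scores = {"RUN_OPERATE": 2.0, "RUN_MOTION": 1.0, "COOK_APPLY_HEAT": 3.0}
all_frames = list(scores)
allowed = ["RUN_OPERATE", "RUN_MOTION"]  # frames listed for this lexical unit

p_all = softmax_over(scores, all_frames)   # normalize over every frame
p_lex = softmax_over(scores, allowed)      # normalize over allowed frames only
# Restricting the normalization set redistributes probability mass over
# fewer frames, so identical scores produce different distributions.
```

Normalizing only over a lexical unit's allowed frames means frames that the predicate can never evoke do not compete for probability mass at training time.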


word embeddings

Appears in 8 sentences as: word embedding (2) word embeddings (6)
  1. We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings.
    Page 1, “Abstract”
  2. We present a model that takes word embeddings as input and learns to identify semantic frames.
    Page 3, “Overview”
  3. A word embedding is a distributed representation of meaning where each word is represented as a vector in R^n.
    Page 3, “Overview”
  4. We use word embeddings to represent the syntactic context of a particular predicate instance as a vector.
    Page 3, “Overview”
  5. First, we extract the words in the syntactic context of runs; next, we concatenate their word embeddings as described in §2.2 to create an initial vector space representation.
    Page 3, “Frame Identification with Embeddings”
  6. Formally, let x represent the actual sentence with a marked predicate, along with the associated syntactic parse tree; let our initial representation of the predicate context be g(x). Suppose that the word embeddings we start with are of dimension n. Then g is a function from a parsed sentence x to R^{nk}, where k is the number of possible syntactic context types.
    Page 3, “Frame Identification with Embeddings”
  7. So for example “He runs the company” could help the model disambiguate “He owns the company.” Moreover, since g(x) relies on word embeddings rather than word identities, information is shared between words.
    Page 4, “Frame Identification with Embeddings”
  8. as described in §3.1 but conjoins them with the word identity rather than a word embedding.
    Page 6, “Experiments”


semantic role labeling

Appears in 6 sentences as: semantic role labeling (6)
  1. Additionally, we report strong results on PropBank-style semantic role labeling in comparison to prior work.
    Page 1, “Abstract”
  2. Most work on frame-semantic parsing has usually divided the task into two major subtasks: frame identification, namely the disambiguation of a given predicate to a frame, and argument identification (or semantic role labeling), the analysis of words and phrases in the sentential context that satisfy the frame’s semantic roles (Das et al., 2010; Das et al., 2014). Here, we focus on the first subtask of frame identification for given predicates; we use our novel method (§3) in conjunction with a standard argument identification model (§4) to perform full frame-semantic parsing.
    Page 1, “Introduction”
  3. Second, we present results on PropBank-style semantic role labeling (Palmer et al., 2005; Meyers et al., 2004; Marquez et al., 2008), that approach strong baselines, and are on par with prior state of the art (Punyakanok et al., 2008).
    Page 1, “Introduction”
  4. 2004; Carreras and Marquez, 2005) on PropBank semantic role labeling (SRL), it has been treated as an important NLP problem.
    Page 2, “Overview”
  5. PropBank The PropBank project (Palmer et al., 2005) is another popular resource related to semantic role labeling.
    Page 2, “Overview”
  6. Finally, we presented results on PropBank-style semantic role labeling with a system that included the task of automatic verb frame identification, in tune with the FrameNet literature; we believe that such a system produces more interpretable output, both from the perspective of human understanding as well as downstream applications, than pipelines that are oblivious to the verb frame, only focusing on argument analysis.
    Page 9, “Conclusion”


syntactic context

Appears in 6 sentences as: syntactic context (6)
  1. We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings.
    Page 1, “Abstract”
  2. Given labeled data annotated with frame-semantic parses, we learn a model that projects the set of word representations for the syntactic context around a predicate to a low dimensional representation.
    Page 1, “Abstract”
  3. We use word embeddings to represent the syntactic context of a particular predicate instance as a vector.
    Page 3, “Overview”
  4. We could represent the syntactic context of runs as a vector with blocks for all the possible dependents warranted by a syntactic parser; for example, we could assume that positions 0 …
    Page 3, “Overview”
  5. First, we extract the words in the syntactic context of runs; next, we concatenate their word embeddings as described in §2.2 to create an initial vector space representation.
    Page 3, “Frame Identification with Embeddings”
  6. Formally, let x represent the actual sentence with a marked predicate, along with the associated syntactic parse tree; let our initial representation of the predicate context be g(x). Suppose that the word embeddings we start with are of dimension n. Then g is a function from a parsed sentence x to R^{nk}, where k is the number of possible syntactic context types.
    Page 3, “Frame Identification with Embeddings”


POS tag

Appears in 4 sentences as: POS tag (2) POS tagger (1) POS tags (1)
  1. Let the lexical unit (the lemma conjoined with a coarse POS tag) for the marked predicate be ℓ.
    Page 3, “Frame Identification with Embeddings”
  2. • bag of words in a • bag of POS tags in a
    Page 5, “Argument Identification”
  3. • the set of dependency labels of the predicate’s children • dependency path conjoined with the POS tag of a’s head
    Page 5, “Argument Identification”
  4. Before parsing the data, it is tagged with a POS tagger trained with a conditional random field (Lafferty et al., 2001) with the following emission features: word, the word cluster, word suffixes of length 1, 2 and 3, capitalization, whether it has a hyphen, digit and punctuation.
    Page 6, “Experiments”
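The emission-feature list in the last excerpt maps naturally onto a small extraction function. The feature-name strings and the cluster id are our own conventions, not the authors':

```python
import string

def emission_features(word, cluster):
    """Emission features from the description: word, word cluster,
    suffixes of length 1-3, capitalization, hyphen, digit, punctuation."""
    feats = {"word=" + word.lower(), "cluster=" + str(cluster)}
    for k in (1, 2, 3):                       # suffixes of length 1, 2, 3
        if len(word) >= k:
            feats.add(f"suf{k}={word[-k:].lower()}")
    if word[:1].isupper():
        feats.add("cap")
    if "-" in word:
        feats.add("hyphen")
    if any(c.isdigit() for c in word):
        feats.add("digit")
    if any(c in string.punctuation for c in word):
        feats.add("punct")
    return feats

f = emission_features("Well-known", 42)  # cluster id 42 is hypothetical
```

Each feature set would feed the CRF's emission factor for one token; transition features are separate and not sketched here.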


syntactic parse

Appears in 4 sentences as: syntactic parse (1) syntactic parser (1) syntactic parsers (1) syntactic parses (1)
  1. We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings.
    Page 1, “Abstract”
  2. We could represent the syntactic context of runs as a vector with blocks for all the possible dependents warranted by a syntactic parser; for example, we could assume that positions 0 …
    Page 3, “Overview”
  3. Formally, let x represent the actual sentence with a marked predicate, along with the associated syntactic parse tree; let our initial representation of the predicate context be g(x). Suppose that the word embeddings we start with are of dimension n. Then g is a function from a parsed sentence x to R^{nk}, where k is the number of possible syntactic context types.
    Page 3, “Frame Identification with Embeddings”
  4. combination of two syntactic parsers as input.
    Page 9, “Discussion”


Distributed representations

Appears in 3 sentences as: distributed representation (1) Distributed representations (1) distributed representations (1)
  1. We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings.
    Page 1, “Abstract”
  2. Distributed representations of words have proved useful for a number of tasks.
    Page 1, “Introduction”
  3. A word embedding is a distributed representation of meaning where each word is represented as a vector in R^n.
    Page 3, “Overview”


ILP

Appears in 3 sentences as: ILP (4)
  1. (2008) we use the log-probability of the local classifiers as a score in an integer linear program (ILP) to assign roles subject to hard constraints described in §5.4 and §5.5.
    Page 5, “Argument Identification”
  2. We use an off-the-shelf ILP solver for inference.
    Page 5, “Argument Identification”
  3. ILP constraints For FrameNet, we used three ILP constraints during argument identification (§4).
    Page 7, “Experiments”
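With only a handful of candidate spans and roles, the objective the ILP optimizes can be brute-forced, which makes the setup easy to see: maximize the summed log-probabilities of the local role decisions subject to a hard constraint. The spans, roles, and numbers below are invented, and a real system would call an ILP solver rather than enumerate assignments:

```python
from itertools import product

# Local log-probabilities for each candidate span and role, with "_" as
# the null (no-role) option; all values are invented for illustration.
spans = ["He", "the company"]
roles = ["Agent", "Theme", "_"]
logp = {
    ("He", "Agent"): -0.1, ("He", "Theme"): -2.0, ("He", "_"): -1.5,
    ("the company", "Agent"): -2.5, ("the company", "Theme"): -0.2,
    ("the company", "_"): -1.0,
}

def valid(assign):
    # Hard constraint: each overt role is filled by at most one span.
    overt = [r for r in assign if r != "_"]
    return len(overt) == len(set(overt))

# Globally best joint assignment under the constraint.
best = max(
    (a for a in product(roles, repeat=len(spans)) if valid(a)),
    key=lambda a: sum(logp[(s, r)] for s, r in zip(spans, a)),
)
```

The ILP formulation encodes the same objective with binary indicator variables per (span, role) pair and linear constraints, which scales far beyond what enumeration can handle.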


vector space

Appears in 3 sentences as: vector space (3)
  1. First, we extract the words in the syntactic context of runs; next, we concatenate their word embeddings as described in §2.2 to create an initial vector space representation.
    Page 3, “Frame Identification with Embeddings”
  2. This set of dependency paths was deemed the set of possible positions in the initial vector space representation.
    Page 4, “Frame Identification with Embeddings”
  3. We search for the stochastic gradient learning rate in {0.0001, 0.001, 0.01}, the margin γ ∈ {0.001, 0.01, 0.1, 1} and the dimensionality of the final vector space m ∈ {256, 512}, to maximize the frame identification accuracy of ambiguous lexical units; by ambiguous, we imply lexical units that appear in the training data or the lexicon with more than one semantic frame.
    Page 7, “Experiments”
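The hyperparameter search described above is an exhaustive sweep over a small grid. A sketch with a stand-in scoring function; the grid values mirror the (partly garbled) sets in the excerpt and should be treated as assumptions, and `dev_accuracy` stands in for actually training WSABIE and evaluating ambiguous lexical units on development data:

```python
from itertools import product

# Candidate grids (values assumed from the text).
learning_rates = [0.0001, 0.001, 0.01]
margins = [0.001, 0.01, 0.1, 1]
dims = [256, 512]

def dev_accuracy(lr, margin, m):
    # Hypothetical stand-in: a real run trains the model with these
    # settings and measures frame identification accuracy on dev data.
    return 1.0 - abs(lr - 0.001) - abs(margin - 0.01) - 1.0 / m

# Pick the configuration that maximizes development accuracy.
best = max(product(learning_rates, margins, dims),
           key=lambda cfg: dev_accuracy(*cfg))
```

With 3 × 4 × 2 = 24 configurations, exhaustive search is cheap relative to a single training run, which is why a plain grid sweep suffices here.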


word representations

Appears in 3 sentences as: Word Representations (1) word representations (2)
  1. Given labeled data annotated with frame-semantic parses, we learn a model that projects the set of word representations for the syntactic context around a predicate to a low dimensional representation.
    Page 1, “Abstract”
  2. We present a new technique for semantic frame identification that leverages distributed word representations .
    Page 1, “Introduction”
  3. Distributed Word Representations
    Page 1, “Introduction”
