The Role of Syntax in Vector Space Models of Compositional Semantics
Hermann, Karl Moritz and Blunsom, Phil

Article Structure

Abstract

Modelling the compositional process by which the meaning of an utterance arises from the meaning of its parts is a fundamental task of Natural Language Processing.

Introduction

Since Frege stated his ‘Principle of Semantic Compositionality’ in 1892, researchers have pondered both how the meaning of a complex expression is determined by the meanings of its parts, and how those parts are combined.

Background

There exist a number of formal approaches to language that provide mechanisms for compositionality.

Model

The models in this paper combine the power of recursive, vector-based models with the linguistic intuition of the CCG formalism.

Learning

In this section we briefly discuss unsupervised learning for our models.

Experiments

We describe a number of standard evaluations to determine the comparative performance of our model.

Topics

CCG

Appears in 37 sentences as: CCG (38)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. This model leverages the CCG combinatory operators to guide a nonlinear transformation of meaning within a sentence.
    Page 1, “Abstract”
  2. in this field includes the Combinatory Categorial Grammar (CCG), which also places increased emphasis on syntactic coverage (Szabolcsi, 1989).
    Page 1, “Introduction”
  3. We achieve this goal by employing the CCG formalism to consider compositional structures at any point in a parse tree.
    Page 1, “Introduction”
  4. CCG is attractive both for its transparent interface between syntax and semantics, and a small but powerful set of combinatory operators with which we can parametrise our nonlinear transformations of compositional meaning.
    Page 1, “Introduction”
  5. We present a novel class of recursive models, the Combinatory Categorial Autoencoders (CCAE), which marry a semantic process provided by a recursive autoencoder with the syntactic representations of the CCG formalism.
    Page 1, “Introduction”
  6. CCAEs make use of CCG combinators and types by conditioning each composition function on its equivalent step in a CCG proof.
    Page 2, “Introduction”
  7. In this paper we focus on CCG, a linguistically expressive yet computationally efficient grammar formalism.
    Page 2, “Background”
  8. CCG relies on combinatory logic (as opposed to lambda calculus) to build its expressions.
    Page 2, “Background”
  9. CCG has been described as having a transparent surface between the syntactic and the semantic
    Page 2, “Background”
  10. Figure 1: CCG derivation for Tina likes tigers with forward (>) and backward application (<).
    Page 2, “Background”
  11. While one could debate the relative merits of various linguistic formalisms, the existence of mature tools and resources, such as the CCGBank (Hockenmaier and Steedman, 2007), the Groningen Meaning Bank (Basile et al., 2012) and the C&C Tools (Curran et al., 2007), is another big advantage for CCG.
    Page 2, “Background”
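
The sketch below (Python, not taken from the paper) illustrates the idea in sentences 1 and 6 above: each merge of two child vectors is parametrised by the CCG combinatory operator that licensed it, so forward and backward application get their own weights. The dimensionality, parameter names and tanh nonlinearity are assumptions; the actual CCAE models also condition on CCG categories and add reconstruction terms.

import numpy as np

DIM = 50  # word-vector size used in the paper's experiments
rng = np.random.default_rng(0)

# One composition matrix per CCG combinator (illustrative; the full models
# also condition on CCG categories and use separate reconstruction weights).
combinator_W = {
    ">": rng.uniform(-0.1, 0.1, (DIM, 2 * DIM)),  # forward application
    "<": rng.uniform(-0.1, 0.1, (DIM, 2 * DIM)),  # backward application
}
combinator_b = {op: np.zeros(DIM) for op in combinator_W}

def compose(left, right, combinator):
    # Merge two child vectors; the weights depend on the combinatory step.
    W, b = combinator_W[combinator], combinator_b[combinator]
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Toy run over the derivation of "Tina likes tigers" (Figure 1):
tina, likes, tigers = (rng.standard_normal(DIM) for _ in range(3))
vp = compose(likes, tigers, ">")   # likes tigers (forward application)
sentence = compose(tina, vp, "<")  # Tina [likes tigers] (backward application)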

recursive

Appears in 10 sentences as: Recursive (1) recursive (12)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. We present a novel class of recursive models, the Combinatory Categorial Autoencoders (CCAE), which marry a semantic process provided by a recursive autoencoder with the syntactic representations of the CCG formalism.
    Page 1, “Introduction”
  2. questions: Can recursive vector space models be reconciled with a more formal notion of compositionality; and is there a role for syntax in guiding semantics in these types of models?
    Page 2, “Introduction”
  3. In terms of learning complexity and space requirements, our models strike a balance between simpler greedy approaches (Socher et al., 2011b) and the larger recursive vector-matrix models (Socher et al., 2012b).
    Page 2, “Introduction”
  4. In both sentiment and compound similarity experiments we show that our CCAE models match or better comparable recursive autoencoder models.
    Page 2, “Introduction”
  5. Another set of models that have very successfully been applied in this area are recursive autoencoders (Socher et al., 2011a; Socher et al., 2011b), which are discussed in the next section.
    Page 3, “Background”
  6. 2.3 Recursive Autoencoders
    Page 3, “Background”
  7. Extending this idea, recursive autoencoders (RAE) allow the modelling of data of variable size.
    Page 4, “Background”
  8. The recursive application of autoencoders was first introduced in Pollack (1990), whose recursive auto-associative memories learn vector representations over pre-specified recursive data structures.
    Page 4, “Background”
  9. The models in this paper combine the power of recursive, vector-based models with the linguistic intuition of the CCG formalism.
    Page 4, “Model”
  10. In this paper we have brought a more formal notion of semantic compositionality to vector space models based on recursive autoencoders.
    Page 9, “Experiments”
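
Sentences 7 and 8 above describe the core of a recursive autoencoder: an encoder compresses two child vectors into one parent vector, and a decoder is trained to reconstruct the children from that parent. A minimal single-step sketch in Python follows; parameter names and initialisation are illustrative, not the paper's.

import numpy as np

DIM = 50
rng = np.random.default_rng(0)

# Encoder and decoder parameters for one autoencoder step (illustrative names).
W_enc, b_enc = rng.uniform(-0.1, 0.1, (DIM, 2 * DIM)), np.zeros(DIM)
W_dec, b_dec = rng.uniform(-0.1, 0.1, (2 * DIM, DIM)), np.zeros(2 * DIM)

def encode(left, right):
    # Compress the concatenated children into a single parent vector.
    return np.tanh(W_enc @ np.concatenate([left, right]) + b_enc)

def reconstruction_error(left, right):
    # Unsupervised signal: how well does the parent reproduce its children?
    children = np.concatenate([left, right])
    parent = np.tanh(W_enc @ children + b_enc)
    rebuilt = np.tanh(W_dec @ parent + b_dec)
    return float(np.sum((rebuilt - children) ** 2))

error = reconstruction_error(rng.standard_normal(DIM), rng.standard_normal(DIM))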

embeddings

Appears in 9 sentences as: embeddings (9)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. We use this model to learn high dimensional embeddings for sentences and evaluate them in a range of tasks, demonstrating that the incorporation of syntax allows a concise model to learn representations that are both effective and general.
    Page 1, “Abstract”
  2. We conclude with some qualitative analysis to get a better idea of whether the combination of CCG and RAE can learn semantically expressive embeddings.
    Page 6, “Experiments”
  3. use word-vectors of size 50, initialized using the embeddings provided by Turian et al.
    Page 6, “Experiments”
  4. Experiment 1: Semi-Supervised Training In the first experiment, we use the semi-supervised training strategy described previously and initialize our models with the embeddings provided by Turian et al.
    Page 6, “Experiments”
  5. While the initialization of the word vectors with previously learned embeddings (as was previously shown by Socher et al.
    Page 7, “Experiments”
  6. Instead of initialising the model with external word embeddings, we first train it on a large amount of data with the aim of overcoming the sparsity issues encountered in the previous experiment.
    Page 7, “Experiments”
  7. In this phase only the reconstruction signal is used to learn word embeddings and transformation matrices.
    Page 7, “Experiments”
  8. By learning word embeddings and composition matrices on more data, the model is likely to generalise better.
    Page 7, “Experiments”
  9. Using the Turian embeddings instead of random initialisation did not improve results in this setup.
    Page 7, “Experiments”
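
Sentences 4 and 6 above contrast two ways of initialising the word vectors: pre-trained Turian embeddings versus random vectors later refined by unsupervised pretraining on the reconstruction signal. A small sketch of the initialisation step, assuming a plain "word v1 v2 ..." text format and a hypothetical file name:

import numpy as np

DIM = 50
rng = np.random.default_rng(0)

def init_embeddings(vocab, path=None):
    # Build a {word: vector} table, seeded from a pre-trained embedding file
    # when one is given, otherwise randomly (the two settings in the experiments).
    pretrained = {}
    if path is not None:
        with open(path, encoding="utf-8") as f:  # assumed "word v1 v2 ..." lines
            for line in f:
                word, *values = line.split()
                if len(values) == DIM:
                    pretrained[word] = np.array(values, dtype=float)
    return {w: pretrained.get(w, rng.uniform(-0.05, 0.05, DIM)) for w in vocab}

# vectors = init_embeddings(["tina", "likes", "tigers"], "turian-50d.txt")  # hypothetical file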

vector space

Appears in 8 sentences as: Vector Space (1) Vector space (1) vector space (5) vector spaces (1)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. In this paper we draw upon recent advances in the learning of vector space representations of sentential semantics and the transparent interface between syntax and semantics provided by Combinatory Categorial Grammar to introduce Combinatory Categorial Autoencoders.
    Page 1, “Abstract”
  2. questions: Can recursive vector space models be reconciled with a more formal notion of compositionality; and is there a role for syntax in guiding semantics in these types of models?
    Page 2, “Introduction”
  3. 2.2 Vector Space Models of Semantics
    Page 2, “Background”
  4. Vector space models of compositional semantics aim to fill this gap by providing a methodology for deriving the representation of an expression from those of its parts.
    Page 3, “Background”
  5. There are a number of ideas on how to define composition in such vector spaces.
    Page 3, “Background”
  6. CCG-Vector Interface: Exactly how the information contained in a CCG derivation is best applied to a vector space model of compositionality is another issue for future research.
    Page 9, “Experiments”
  7. In this paper we have brought a more formal notion of semantic compositionality to vector space models based on recursive autoencoders.
    Page 9, “Experiments”
  8. While the connections between formal linguistics and vector space approaches to NLP may not be immediately obvious, we believe that there is a case for the continued investigation of ways to best combine these two schools of thought.
    Page 9, “Experiments”
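
Sentence 5 above notes that there are several ways to define composition in a vector space; the simplest and most widely used baselines are element-wise addition and multiplication (Mitchell and Lapata, 2008). A toy illustration, not taken from the paper:

import numpy as np

def compose_additive(u, v):
    # Vector sum: order-insensitive, bag-of-words style composition.
    return u + v

def compose_multiplicative(u, v):
    # Element-wise product: emphasises dimensions shared by both words.
    return u * v

u, v = np.random.randn(50), np.random.randn(50)
added, multiplied = compose_additive(u, v), compose_multiplicative(u, v)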

distributional representations

Appears in 6 sentences as: Distributional representations (2) distributional representations (4)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. While distributional semantics is easily applied to single words, sparsity implies that attempts to directly extract distributional representations for larger expressions are doomed to fail.
    Page 1, “Introduction”
  2. Distributional representations encode an expression by its environment, assuming the context-dependent nature of meaning according to which one “shall know a word by the company it keeps” (Firth, 1957).
    Page 2, “Background”
  3. Distributional representations are frequently used to encode single words as vectors.
    Page 2, “Background”
  4. While it is theoretically possible to apply the same mechanism to larger expressions, sparsity prevents learning meaningful distributional representations for expressions much larger than single words.
    Page 3, “Background”
  5. While distributional representations frequently serve to encode single words in such approaches, this is not a strict requirement.
    Page 3, “Background”
  6. is one of the few examples where distributional representations are used for word pairs.
    Page 3, “Background”
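
Sentence 2 above states the distributional hypothesis: a word is encoded by the contexts it occurs in. A minimal sketch of building such co-occurrence count vectors from a toy corpus; the window size and corpus are illustrative.

from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    # Represent each word by counts of the words seen within +/- `window` positions.
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

corpus = [["tina", "likes", "tigers"], ["tina", "likes", "cats"]]
vectors = cooccurrence_vectors(corpus)
# vectors["likes"] -> Counter({'tina': 2, 'tigers': 1, 'cats': 1})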

parse tree

Appears in 6 sentences as: parse tree (4) parse trees (2)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. We achieve this goal by employing the CCG formalism to consider compositional structures at any point in a parse tree.
    Page 1, “Introduction”
  2. We use the parse tree to structure an RAE, so that each combinatory step is represented by an autoencoder function.
    Page 4, “Model”
  3. As an internal baseline we use model CCAE-A, which is an RAE structured along a CCG parse tree.
    Page 4, “Model”
  4. We use the C&C parser (Clark and Curran, 2007) to generate CCG parse trees for the data used in our experiments.
    Page 6, “Experiments”
  5. We assume fixed parse trees for all of the compounds (Figure 6), and use these to compute compound level vectors for all word pairs.
    Page 7, “Experiments”
  6. Our experimental findings indicate a clear advantage for a deeper integration of syntax over models that use only the bracketing structure of the parse tree.
    Page 9, “Experiments”
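
Sentences 2 and 3 above describe structuring the autoencoder along a parse tree: leaves are word vectors and every internal node applies one composition step. Below is a small sketch of that bottom-up recursion over a bracketing such as the C&C parser would provide; the tuple-based tree encoding and parameters are assumptions.

import numpy as np

DIM = 50
rng = np.random.default_rng(1)
W, b = rng.uniform(-0.1, 0.1, (DIM, 2 * DIM)), np.zeros(DIM)

def encode(left, right):
    return np.tanh(W @ np.concatenate([left, right]) + b)

def tree_vector(node, lexicon):
    # A leaf (string) is looked up; an internal node (pair) encodes its children.
    if isinstance(node, str):
        return lexicon[node]
    left, right = node
    return encode(tree_vector(left, lexicon), tree_vector(right, lexicon))

lexicon = {w: rng.standard_normal(DIM) for w in ["tina", "likes", "tigers"]}
# Bracketing (tina (likes tigers)), as a binarised parse would provide it.
sentence_vec = tree_vector(("tina", ("likes", "tigers")), lexicon)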

sentiment analysis

Appears in 4 sentences as: Sentiment Analysis (1) sentiment analysis (3)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. Previously, RAE have successfully been applied to a number of tasks including sentiment analysis, paraphrase detection, relation extraction
    Page 4, “Background”
  2. The first task of sentiment analysis allows us to compare our CCG-conditioned RAE with similar, existing models.
    Page 6, “Experiments”
  3. 5.1 Sentiment Analysis
    Page 6, “Experiments”
  4. The effect of this was highlighted by the sentiment analysis task, with the more complex models performing
    Page 8, “Experiments”

machine learning

Appears in 3 sentences as: machine learning (3)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. In this paper we bridge the gap between recent advances in machine learning and more traditional approaches within computational linguistics.
    Page 1, “Introduction”
  2. We show that this combination of state of the art machine learning and an advanced linguistic formalism translates into concise models with competitive performance on a variety of tasks.
    Page 2, “Introduction”
  3. This paper represents one step towards the reconciliation of traditional formal approaches to compositional semantics with modern machine learning.
    Page 9, “Experiments”

objective function

Appears in 3 sentences as: objective function (3)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. The gradient of the regularised objective function then becomes:
    Page 5, “Learning”
  2. We learn the gradient using backpropagation through structure (Goller and Kuchler, 1996), and minimize the objective function using L-BFGS.
    Page 5, “Learning”
  3. pred(l = 1 | v, θ) = sigmoid(W_label v + b_label)  (9)  Given our corpus of CCG parses with label pairs (N, l), the new objective function becomes:
    Page 6, “Learning”
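
Sentence 3 above gives the label predictor of Equation 9, a sigmoid over an affine transformation of the phrase vector v; in the semi-supervised setting this prediction term is combined with the reconstruction objective, and sentence 2 notes that the result is minimised with L-BFGS using gradients from backpropagation through structure. Below is a sketch of the prediction and a hypothetically weighted joint loss for a binary label; the weighting scheme and parameter names are assumptions, not the paper's exact formulation.

import numpy as np

DIM = 50
rng = np.random.default_rng(2)
W_label, b_label = rng.uniform(-0.1, 0.1, DIM), 0.0  # binary-label sketch

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_label(v):
    # Equation 9: probability that the phrase/sentence vector v has label l = 1.
    return sigmoid(W_label @ v + b_label)

def joint_objective(v, label, reconstruction_error, alpha=0.2):
    # Hypothetical mix of the supervised (cross-entropy) and unsupervised
    # (reconstruction) terms; the paper's actual weighting is not reproduced here.
    p = predict_label(v)
    cross_entropy = -(label * np.log(p) + (1 - label) * np.log(1 - p))
    return alpha * reconstruction_error + (1 - alpha) * cross_entropy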

state of the art

Appears in 3 sentences as: state of the art (3)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. We show that this combination of state of the art machine learning and an advanced linguistic formalism translates into concise models with competitive performance on a variety of tasks.
    Page 2, “Introduction”
  2. Overall, our models compare favourably with the state of the art.
    Page 8, “Experiments”
  3. With an additional, unsupervised training step we achieved results beyond the current state of the art on this task, too.
    Page 8, “Experiments”

vector representations

Appears in 3 sentences as: vector representation (1) vector representations (2)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. The recursive application of autoencoders was first introduced in Pollack (1990), whose recursive auto-associative memories learn vector representations over pre-specified recursive data structures.
    Page 4, “Background”
  2. Their purpose is to learn semantically meaningful vector representations for sentences and phrases of variable size, while the purpose of this paper is to investigate the use of syntax and linguistic formalisms in such vector-based compositional models.
    Page 4, “Model”
  3. The unsupervised method described so far learns a vector representation for each sentence.
    Page 6, “Learning”

word embeddings

Appears in 3 sentences as: word embeddings (3)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. Instead of initialising the model with external word embeddings, we first train it on a large amount of data with the aim of overcoming the sparsity issues encountered in the previous experiment.
    Page 7, “Experiments”
  2. In this phase only the reconstruction signal is used to learn word embeddings and transformation matrices.
    Page 7, “Experiments”
  3. By learning word embeddings and composition matrices on more data, the model is likely to generalise better.
    Page 7, “Experiments”

word pairs

Appears in 3 sentences as: word pairs (3)
In The Role of Syntax in Vector Space Models of Compositional Semantics
  1. is one of the few examples where distributional representations are used for word pairs.
    Page 3, “Background”
  2. The task is thus to rank these pairs of word pairs by their semantic similarity.
    Page 7, “Experiments”
  3. We assume fixed parse trees for all of the compounds (Figure 6), and use these to compute compound level vectors for all word pairs.
    Page 7, “Experiments”
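
Sentence 2 above describes ranking pairs of word pairs by semantic similarity. A common way to evaluate this (assumed here, not quoted from the paper) is to score each pair of compound vectors with cosine similarity and report Spearman correlation against the human ratings:

import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_correlation(vector_pairs, gold_scores):
    # vector_pairs: one (vec_a, vec_b) per pair of word pairs, in gold order.
    # Returns Spearman's rho between model similarities and the human ratings.
    model_scores = [cosine(a, b) for a, b in vector_pairs]
    return spearmanr(model_scores, gold_scores).correlation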
