Interpretable Semantic Vectors from a Joint Model of Brain- and Text-Based Meaning
Alona Fyshe, Partha P. Talukdar, Brian Murphy, and Tom M. Mitchell

Article Structure

Abstract

Vector space models (VSMs) represent word meanings as points in a high dimensional space.

Introduction

Vector Space Models (VSMs) represent lexical meaning by assigning each word a point in high dimensional space.

NonNegative Sparse Embedding

NonNegative Sparse Embedding (NNSE) (Murphy et al., 2012a) is an algorithm that produces a latent representation using matrix factorization.
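
As a rough illustration only, and not the authors' implementation, an NNSE-style factorization of a word-by-corpus-feature matrix X into sparse, non-negative codes A and a dictionary D can be approximated with scikit-learn's dictionary learning; the matrix names follow the paper, but the toy data and parameter values below are assumptions.

# Hypothetical sketch of an NNSE-style factorization X ~= A x D.
# DictionaryLearning stands in for the paper's solver: an L1 penalty gives
# sparse codes, positive_code=True enforces non-negativity on A.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(500, 100)))   # toy word-by-corpus-feature matrix

nnse = DictionaryLearning(
    n_components=50,             # number of latent dimensions
    alpha=1.0,                   # L1 penalty -> sparse rows of A
    fit_algorithm="cd",
    transform_algorithm="lasso_cd",
    positive_code=True,          # non-negative codes
    random_state=0,
)
A = nnse.fit_transform(X)        # sparse, non-negative word codes (500 x 50)
D = nnse.components_             # latent dimensions as rows (50 x 100)

Each column of A then corresponds to one latent dimension, and a word's representation is the small set of dimensions on which it is active, which is what makes the space interpretable.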

Joint NonNegative Sparse Embedding

We extend NNSEs to incorporate an additional source of data for a subset of the words in X, and call the approach Joint NonNegative Sparse Embeddings (JNNSEs).
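
In rough form, and with notation assumed rather than quoted from the paper (X: w x c corpus statistics; Y: w' x v brain recordings for the w' <= w words that have brain data; A: shared non-negative, sparse codes; D^(c), D^(b): corpus and brain dictionaries), a JNNSE-style joint objective simply adds a second reconstruction term over the shared codes:

% Sketch of a joint NNSE-style objective (assumed notation, not the paper's exact equation)
\min_{A,\, D^{(c)},\, D^{(b)}}
    \sum_{i=1}^{w}  \bigl\| X_{i,:} - A_{i,:} D^{(c)} \bigr\|_2^2
  + \sum_{i=1}^{w'} \bigl\| Y_{i,:} - A_{i,:} D^{(b)} \bigr\|_2^2
  + \lambda \sum_{i=1}^{w} \bigl\| A_{i,:} \bigr\|_1
\quad \text{s.t.}\quad
    A_{i,j} \ge 0,\;\;
    \bigl\| D^{(c)}_{k,:} \bigr\|_2 \le 1,\;\;
    \bigl\| D^{(b)}_{k,:} \bigr\|_2 \le 1 .

Because the brain term ranges only over the w' words with recordings, corpus-only words still constrain A through the first term, while the brain data nudges the shared codes for the words it covers.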

Data

4.1 Corpus Data

Experimental Results

Here we explore several variations of JNNSE and NNSE formulations.

Future Work and Conclusion

We are interested in pursuing many future projects inspired by the success of this model.

Topics

semantic representation

Appears in 8 sentences as: semantic representation (4) semantic representations (3) semantics represented (1)
  1. For example, multiple word senses collide in the same vector, and noise from mis-parsed sentences or spam documents can interfere with the final semantic representation.
    Page 1, “Introduction”
  2. In this work we focus on the scientific question: Can the inclusion of brain data improve semantic representations learned from corpus data?
    Page 2, “Introduction”
  3. One could also use a topic model style formulation to represent this semantic representation task.
    Page 4, “Joint NonNegative Sparse Embedding”
  4. The same idea could be applied here: the latent semantic representation generates the observed brain activity and corpus statistics.
    Page 4, “Joint NonNegative Sparse Embedding”
  5. For example, models with behavioral data (Silberer and Lapata, 2012) and models with visual information (Bruni et al., 2011; Silberer et al., 2013) have both been shown to improve semantic representations.
    Page 4, “Joint NonNegative Sparse Embedding”
  6. This gives us a semantic representation of each of the 60 words in a 218-dimensional behavioral space.
    Page 5, “Experimental Results”
  7. It is possible that some JNNSE(Brain+Text) dimensions are being used exclusively to fit brain activation data, and not the semantics represented in both brain and corpus data.
    Page 8, “Experimental Results”
  8. This result shows that neural semantic representations can create a latent representation that is faithful to unseen corpus statistics, providing further evidence that the two data sources share a strong common element.
    Page 8, “Experimental Results”

SVD

Appears in 7 sentences as: SVD (7)
  1. Typically, VSMs are created by collecting word usage statistics from large amounts of text data and applying some dimensionality reduction technique like Singular Value Decomposition (SVD).
    Page 1, “Introduction” (see the SVD sketch after this list)
  2. SVD was applied to the document and dependency statistics and the top 1000 dimensions of each type were retained.
    Page 5, “Data”
  3. The SVD matrix for the original corpus data has correlation 0.4279 to the behavioral data, also below the 95% confidence interval for all JNNSE models.
    Page 5, “Experimental Results”
  4. [Figure legend: SVD (Text)]
    Page 6, “Experimental Results”
  5. JNNSE(fMRI+Text) data performed on average 6% better than the best NNSE(Text), and exceeded even the original SVD corpus representations while maintaining interpretability.
    Page 6, “Experimental Results”
  6. [Figure legend: JNNSE(fMRI+Text), JNNSE(MEG+Text), NNSE(Text), SVD (Text); x-axis: Number of Latent Dimensions (250, 500, 1000)]
    Page 7, “Experimental Results”
  7. [Figure legend: SVD (Text)]
    Page 7, “Experimental Results”
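
The sketch below illustrates the typical pipeline from sentence 1: collect word co-occurrence counts, then compress them with a truncated SVD. The toy corpus, matrix sizes, and the log transform are placeholders, not the paper's preprocessing; only the choice of 1000 retained dimensions follows sentence 2.

# Hypothetical sketch of a text-only VSM: word-by-context counts reduced with SVD.
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(5000, 2000)).astype(float)  # toy word x context counts
counts = np.log1p(counts)                                    # dampen raw frequencies (assumed)

svd = TruncatedSVD(n_components=1000, random_state=0)        # keep the top 1000 dimensions
vectors = svd.fit_transform(counts)                          # dense word vectors (5000 x 1000)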

semantic space

Appears in 5 sentences as: semantic space (5)
  1. The sparse and nonnegative representation in A produces a more interpretable semantic space, where interpretability is quantified with a behavioral task (Chang et al., 2009; Murphy et al., 2012a).
    Page 3, “NonNegative Sparse Embedding”
  2. We compared JNNSE(Brain+Text) and NNSE(Text) models by measuring the correlation of all pairwise distances in JNNSE(Brain+Text) and NNSE(Text) space to the pairwise distances in the 218-dimensional semantic space.
    Page 5, “Experimental Results” (see the distance-correlation sketch after this list)
  3. Figure 1: Correlation of JNNSE(Brain+Text) and NNSE(Text) models with the distances in a semantic space constructed from behavioral data.
    Page 6, “Experimental Results”
  4. screwdriver and hammer) are closer in semantic space than words in different word categories, which makes some 2 vs. 2 tests more difficult than others.
    Page 6, “Experimental Results”
  5. Figure 4: The mappings (D^(b)) from latent semantic space (A) to brain space (Y) for fMRI and words from three semantic categories.
    Page 9, “Experimental Results”
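
A rough sketch of the comparison in sentence 2: correlate pairwise word distances in a learned space against pairwise distances in the 218-dimensional behavioral space. The distance metric, the use of Pearson correlation, and the toy vectors are assumptions.

# Hypothetical sketch: how well do distances in a learned space track distances
# in a reference (behavioral) space for the same 60 words?
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
latent = rng.random((60, 250))        # e.g. JNNSE(Brain+Text) vectors for the 60 words
behavioral = rng.random((60, 218))    # 218-dimensional behavioral vectors

d_latent = pdist(latent, metric="cosine")          # condensed vector of all pairwise distances
d_behavioral = pdist(behavioral, metric="cosine")

r, _ = pearsonr(d_latent, d_behavioral)
print(f"correlation with behavioral distances: {r:.4f}")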

ground truth

Appears in 4 sentences as: ground truth (4)
  1. If brain activation data encodes semantics, we theorized that including brain data in a model of semantics could result in a model more consistent with semantic ground truth.
    Page 1, “Introduction”
  2. Our results are evidence that a joint model of brain- and text-based semantics may be closer to semantic ground truth than text-only models.
    Page 2, “Introduction”
  3. To test if our joint model of Brain+Text is closer to semantic ground truth we compared the latent representation A learned via JNNSE(Brain+Text) or NNSE(Text) to an independent behavioral measure of semantics.
    Page 5, “Experimental Results”
  4. We have provided evidence that the latent representations are closer to the neural representation of semantics, and possibly, closer to semantic ground truth.
    Page 9, “Future Work and Conclusion”

objective function

Appears in 4 sentences as: objective function (3) objective function: (1)
  1. NNSE solves the following objective function:
    Page 2, “NonNegative Sparse Embedding”
  2. new objective function is:
    Page 3, “Joint NonNegative Sparse Embedding”
  3. With A or D fixed, the objective function for NNSE(Text) and JNNSE(Brain+Text) is convex.
    Page 3, “Joint NonNegative Sparse Embedding” (see the alternating-update sketch after this list)
  4. For a given value of ℓ (the number of latent dimensions) we solve the NNSE(Text) and JNNSE(Brain+Text) objective functions as detailed in Equations 1 and 4 respectively.
    Page 5, “Experimental Results”
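
To make the alternating scheme from sentence 3 concrete, here is a toy sketch, not the authors' solver: with D fixed, each row of A is a non-negative, L1-penalised regression (convex); with A fixed, D is a least-squares problem, here followed by a simple row projection as a stand-in for the unit-norm constraint. All sizes and the penalty weight are assumptions.

# Hypothetical alternating-minimization sketch for an NNSE-style objective on text only.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
w, c, ell = 200, 50, 10                      # words, corpus features, latent dimensions
X = np.abs(rng.normal(size=(w, c)))          # toy corpus statistics

A = np.abs(rng.normal(size=(w, ell)))        # initial codes
D = np.abs(rng.normal(size=(ell, c)))        # initial dictionary

for _ in range(20):
    # A-step (D fixed): non-negative lasso for each word's row of codes.
    lasso = Lasso(alpha=0.05, positive=True, fit_intercept=False, max_iter=5000)
    A = np.vstack([lasso.fit(D.T, X[i]).coef_ for i in range(w)])
    # D-step (A fixed): least squares, then project rows back onto the unit ball.
    D = np.linalg.lstsq(A, X, rcond=None)[0]
    D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1.0)

For the joint model, the A-step would additionally stack the brain reconstruction term for the words that have brain recordings, with the codes A shared across both terms.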

co-occurrence

Appears in 3 sentences as: co-occurrence (3)
  1. The basic assumption is that semantics drives a person’s language production behavior, and as a result co-occurrence patterns in written text indirectly encode word meaning.
    Page 1, “Introduction”
  2. The raw co-occurrence statistics are unwieldy, but in the compressed
    Page 1, “Introduction”
  3. Document statistics are word-document co-occurrence counts.
    Page 5, “Data”
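
Sentence 3 refers to word-document counts; below is a minimal sketch of collecting such counts over a toy corpus (the documents and all names here are made up, not the paper's data).

# Hypothetical sketch of word-document co-occurrence counts:
# rows are words, columns are documents, entries are raw counts.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the hammer hit the nail",
    "the screwdriver turned the screw",
    "the dog chased the cat",
]
vectorizer = CountVectorizer()
doc_word = vectorizer.fit_transform(docs)   # documents x words, sparse
word_doc = doc_word.T                       # words x documents counts
print(vectorizer.get_feature_names_out())
print(word_doc.toarray())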
