Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More
Douwe Kiela, Felix Hill, Anna Korhonen and Stephen Clark. In Proceedings of ACL 2014.

Article Structure

Abstract

Models that learn semantic representations from both linguistic and perceptual input outperform text-only models in many contexts and better reflect human concept acquisition.

Introduction

Multi-modal models that learn semantic concept representations from both linguistic and perceptual input were originally motivated by parallels with human concept acquisition, and evidence that many concepts are grounded in the perceptual system (Barsalou et al., 2003).

Experimental Approach

Our experiments focus on multi-modal models that extract their perceptual input automatically from images.

Improving Multi-Modal Representations

We apply image dispersion-based filtering as follows: if both concepts in an evaluation pair have an image dispersion below a given threshold, both the linguistic and the visual representations are included.
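The filtering rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes each concept comes with a set of image vectors plus a linguistic vector and a visual vector, and it uses the paper's dispersion measure (average pairwise cosine distance between a concept's image representations). Function names and the dictionary layout are my own.

```python
import numpy as np

def image_dispersion(image_vectors):
    """Average pairwise cosine distance between a concept's image vectors.

    Low dispersion (visually consistent images) suggests a concrete
    concept; high dispersion suggests a more abstract one.
    """
    vecs = np.asarray(image_vectors, dtype=float)
    normed = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = normed @ normed.T                 # cosine similarities
    iu = np.triu_indices(len(vecs), k=1)     # each unordered pair once
    return float(np.mean(1.0 - sims[iu]))    # mean cosine distance

def filtered_pair_representation(concept_a, concept_b, threshold):
    """Dispersion-based filtering for one evaluation pair: include the
    visual representations only if BOTH concepts fall below the
    threshold; otherwise fall back to the linguistic vectors alone."""
    use_visual = (image_dispersion(concept_a["images"]) < threshold
                  and image_dispersion(concept_b["images"]) < threshold)
    def rep(c):
        if use_visual:
            return np.concatenate([c["linguistic"], c["visual"]])
        return c["linguistic"]
    return rep(concept_a), rep(concept_b)
```

Because identical image vectors have cosine distance zero, a concept whose images all look alike gets dispersion near 0 and passes the filter, while a concept with orthogonal image vectors gets dispersion near 1 and is represented by its linguistic vector only.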

Concreteness and Image Dispersion

The filtering approach described thus far improves multi-modal representations because image dispersion provides a means to distinguish concrete concepts from more abstract concepts.

Conclusions

We presented a novel method, image dispersion-based filtering, that improves multi-modal representations by approximating conceptual concreteness from images and filtering model input.

Topics

im

Appears in 4 sentences as: im (4)
In Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More
  1. Such models extract information about the perceptible characteristics of words from data collected in property norming experiments (Roller and Schulte im Walde, 2013; Silberer and Lapata, 2012) or directly from ‘raw’ data sources such as images (Feng and Lapata, 2010; Bruni et al., 2012).
    Page 1, “Introduction”
  2. Multi-modal models outperform language-only models on a range of tasks, including modelling conceptual association and predicting compositionality (Bruni et al., 2012; Silberer and Lapata, 2012; Roller and Schulte im Walde, 2013).
    Page 1, “Introduction”
  3. Previous NLP-related work uses SIFT (Feng and Lapata, 2010; Bruni et al., 2012) or SURF (Roller and Schulte im Walde, 2013) descriptors for identifying points of interest in an image, quantified by 128-dimensional local descriptors.
    Page 3, “Experimental Approach”
  4. The USP norms have been used in many previous studies to evaluate semantic representations (Andrews et al., 2009; Feng and Lapata, 2010; Silberer and Lapata, 2012; Roller and Schulte im Walde, 2013).
    Page 3, “Experimental Approach”

semantic representations

Appears in 4 sentences as: semantic representations (4)
In Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More
  1. Models that learn semantic representations from both linguistic and perceptual input outperform text-only models in many contexts and better reflect human concept acquisition.
    Page 1, “Abstract”
  2. Multi-modal models in which perceptual input is filtered according to our algorithm learn higher-quality semantic representations than previous approaches, resulting in a significant performance improvement of up to 17% in captur-
    Page 1, “Introduction”
  3. This model learns high quality lexical semantic representations based on the distributional properties of words in text, and has been shown to outperform simple distributional models on applications such as semantic composition and analogical mapping (Mikolov et al., 2013b).
    Page 3, “Experimental Approach”
  4. The USP norms have been used in many previous studies to evaluate semantic representations (Andrews et al., 2009; Feng and Lapata, 2010; Silberer and Lapata, 2012; Roller and Schulte im Walde, 2013).
    Page 3, “Experimental Approach”

vector representations

Appears in 3 sentences as: vector representation (1) vector representations (2)
In Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More
  1. Generating Visual Representations Visual vector representations for each image were obtained using the well-known bag of visual words (BoVW) approach (Sivic and Zisserman, 2003).
    Page 2, “Experimental Approach”
  2. BoVW obtains a vector representation for an
    Page 2, “Experimental Approach”
  3. Generating Linguistic Representations We extract continuous vector representations (also of 50 dimensions) for concepts using the continuous log-linear skipgram model of Mikolov et al.
    Page 3, “Experimental Approach”
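The quantization step of the bag-of-visual-words approach cited above can be sketched as follows. This is a toy illustration under my own assumptions, not the paper's pipeline: it takes an image's local descriptors (e.g. 128-dimensional SIFT or SURF vectors) and a precomputed codebook of cluster centroids (typically from k-means over descriptors from many images), and counts how often each "visual word" is the nearest centroid.

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Bag-of-visual-words vector for one image: a normalised histogram
    over codebook entries, where each local descriptor votes for its
    nearest centroid (Euclidean distance)."""
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :],
                       axis=2)                    # (n_descriptors, n_words)
    words = d.argmin(axis=1)                      # nearest visual word
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                      # normalise to sum to 1
```

The resulting histogram is the image-level vector representation that the dispersion measure is then computed over.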
