Automatic Image Annotation Using Auxiliary Text Information
Feng, Yansong and Lapata, Mirella

Article Structure

Abstract

The availability of databases of images labeled with keywords is necessary for developing and evaluating image annotation models.

Introduction

As the number of image collections rapidly grows, so does the need to browse and search them.

Related Work

Automatic image annotation is a popular task in computer vision.

BBC News Database

Our database consists of news images, which are abundant.

Topics

latent variable

Appears in 13 sentences as: latent variable (8), latent variables (5)
In Automatic Image Annotation Using Auxiliary Text Information
  1. Another way of capturing co-occurrence information is to introduce latent variables linking image features with words.
    Page 2, “Related Work”
  2. Unlike other unsupervised approaches where a set of latent variables is introduced, each defining a joint distribution on the space of keywords and image features, the relevance model captures the joint probability of images and annotated words directly, without requiring an intermediate clustering stage.
    Page 3, “BBC News Database”
  3. achieve competitive performance with latent variable models.
    Page 4, “BBC News Database”
  4. Each annotated image in the training set is treated as a latent variable.
    Page 4, “BBC News Database”
  5. where D is the training database, V_I are the visual features of the image regions representing I, W_I are the keywords of I, s is a latent variable (i.e., an image-annotation pair), and P(s) is the prior probability of s. The latter is drawn from a uniform distribution:
    Page 4, “BBC News Database”
  6. where N_D is the number of latent variables in the training database D.
    Page 4, “BBC News Database”
  7. where N_{V_I} is the number of regions in image I, v_r the feature vector for region r in image I, n_s^v the number of regions in the image of latent variable s, v_i the feature vector for region i in s’s image, k the dimension of the image feature vectors, and Σ the feature covariance matrix.
    Page 4, “BBC News Database”
  8. According to equation (3), a Gaussian kernel is fit to every feature vector v_i corresponding to region i in the image of the latent variable s.
    Page 4, “BBC News Database”
  9. The probability of sampling a set of words W given a latent variable s from the underlying multiple Bernoulli distribution that has generated the training set D is:
    Page 5, “BBC News Database”
  10. where α is a smoothing parameter tuned on the development set, s_a is the annotation for the latent variable s, and s_d its corresponding document.
    Page 5, “BBC News Database”
  11. where μ is a smoothing parameter estimated on the development set, δ_{w,s_a} is a Boolean variable denoting whether w appears in the annotation s_a, and N_w is the number of latent variables that contain w in their annotations. (A code sketch of this model follows the list.)
    Page 5, “BBC News Database”
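
Taken together, excerpts 5-11 specify the annotation model end to end: a uniform prior over latent variables (training image-annotation pairs), a Gaussian-kernel estimate of P(V_I | s), and a μ-smoothed multiple-Bernoulli estimate of P(w | s). The following is a minimal sketch of that pipeline, assuming region features have already been extracted. The data layout (train as a list of (region_matrix, annotation_set) pairs), the vocabulary vocab, and the covariance Sigma are illustrative assumptions, and the α-weighted mixing of caption and document words from excerpt 10 is omitted for brevity:

    import numpy as np

    def kernel_density(v_r, s_regions, Sigma_inv, norm):
        # Eq. (3): average of Gaussian kernels centred on the feature
        # vectors v_i of the latent variable's image regions.
        diffs = s_regions - v_r                          # shape (n_s^v, k)
        d2 = np.einsum('ij,jk,ik->i', diffs, Sigma_inv, diffs)
        return np.exp(-d2).sum() / (len(s_regions) * norm)

    def p_visual(test_regions, s_regions, Sigma_inv, norm):
        # P(V_I | s): product over the regions of the test image.
        return np.prod([kernel_density(v_r, s_regions, Sigma_inv, norm)
                        for v_r in test_regions])

    def p_word(w, s_a, N_w, N_D, mu):
        # Smoothed multiple-Bernoulli estimate of P(w | s), excerpt 11:
        # (mu * [w in s_a] + N_w) / (mu + N_D).
        return (mu * (w in s_a) + N_w.get(w, 0)) / (mu + N_D)

    def annotate(test_regions, train, vocab, Sigma, mu, k_best=10):
        # Rank words by P(w, V_I) = sum_s P(s) P(V_I | s) P(w | s),
        # with the uniform prior P(s) = 1/N_D of excerpts 5-6.
        N_D = len(train)
        k = Sigma.shape[0]
        Sigma_inv = np.linalg.inv(Sigma)
        norm = np.sqrt(2 ** k * np.pi ** k * np.linalg.det(Sigma))
        N_w = {}
        for _, s_a in train:
            for w in s_a:
                N_w[w] = N_w.get(w, 0) + 1
        scores = dict.fromkeys(vocab, 0.0)
        for s_regions, s_a in train:  # each pair is one latent variable s
            p_v = p_visual(np.asarray(test_regions), np.asarray(s_regions),
                           Sigma_inv, norm)
            for w in vocab:
                scores[w] += p_v * p_word(w, s_a, N_w, N_D, mu) / N_D
        return sorted(scores, key=scores.get, reverse=True)[:k_best]

Scoring words individually (rather than whole word sets) matches how annotation keywords are ranked at test time; the full Bernoulli likelihood over the entire vocabulary is only needed during training.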

LDA

Appears in 12 sentences as: LDA (12)
In Automatic Image Annotation Using Auxiliary Text Information
  1. More sophisticated graphical models (Blei and Jordan, 2003) have also been employed, including Gaussian Mixture Models (GMM) and Latent Dirichlet Allocation (LDA).
    Page 2, “Related Work”
  2. Specifically, we use Latent Dirichlet Allocation (LDA) as our topic model (Blei et al., 2003).
    Page 5, “BBC News Database”
  3. LDA
    Page 5, “BBC News Database”
  4. Given a collection of documents and a set of latent variables (i.e., the number of topics), the LDA model estimates the probability of topics per document and the probability of words per topic.
    Page 6, “BBC News Database”
  5. For our re-ranking task, we use the LDA model to infer the m-best topics in the accompanying document.
    Page 6, “BBC News Database”
  6. However, according to the LDA model, neither W2 nor W5 is a likely topic indicator.
    Page 6, “BBC News Database”
  7. An advantage of using LDA is that at test time we can perform inference without retraining the topic model.
    Page 6, “BBC News Database”
  8. We trained an LDA model with 20 topics on our document collection using David Blei’s implementation. We used this model to re-rank the output of our annotation model according to the three most likely topics in each document. (A re-ranking sketch follows this list.)
    Page 6, “BBC News Database”
  9. Eliminating the LDA reranker from the extended model decreases F1 by 0.62%.
    Page 7, “BBC News Database”
  10. Incidentally, LDA can also be used to rerank the output of Lavrenko et al.’s (2003) model.
    Page 7, “BBC News Database”
  11. LDA also increases the performance of this model by 0.41%.
    Page 7, “BBC News Database”
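
The mechanics of the re-ranking step are easiest to see in code. Below is a minimal sketch using gensim's LdaModel as a stand-in for David Blei's implementation; the candidates list of (word, score) pairs and the linear mixing rule with weight lam are assumptions, since the excerpts do not spell out how topic and annotation scores are combined:

    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    def train_lda(tokenized_docs, num_topics=20):
        # One topic model estimated on the whole collection (excerpt 8).
        dct = Dictionary(tokenized_docs)
        corpus = [dct.doc2bow(doc) for doc in tokenized_docs]
        return dct, LdaModel(corpus, num_topics=num_topics, id2word=dct)

    def rerank(candidates, doc_tokens, dct, lda, m=3, lam=0.5):
        # Infer the m most likely topics of the accompanying document
        # (no retraining needed at test time, excerpt 7), then boost
        # candidate words that are probable under those topics.
        bow = dct.doc2bow(doc_tokens)
        topics = sorted(lda.get_document_topics(bow, minimum_probability=0.0),
                        key=lambda tp: tp[1], reverse=True)[:m]

        def topic_score(word):
            if word not in dct.token2id:
                return 0.0
            wid = dct.token2id[word]
            return sum(p_t * dict(lda.get_topic_terms(t, topn=len(dct))).get(wid, 0.0)
                       for t, p_t in topics)

        # Assumed mixing rule: convex combination of the annotation
        # model's score and the topic score.
        return sorted(candidates,
                      key=lambda ws: lam * ws[1] + (1 - lam) * topic_score(ws[0]),
                      reverse=True)

In practice the two score scales would need normalising before mixing; the convex combination here is only one plausible reading of "re-rank according to the three most likely topics".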

news articles

Appears in 7 sentences as: news article (1), News articles (1), news articles (5)
In Automatic Image Annotation Using Auxiliary Text Information
  1. We create a database of pictures that are naturally embedded into news articles and propose to use their captions as a proxy for annotation keywords.
    Page 1, “Abstract”
  2. We also demonstrate that the news article associated with the picture can be used to boost image annotation performance.
    Page 1, “Abstract”
  3. News articles associated with images and their captions spring readily to mind (e.g., BBC News, Yahoo News).
    Page 2, “Introduction”
  4. Importantly, our images are not standalone; they come with news articles whose content is shared with the image.
    Page 2, “Introduction”
  5. For example, news articles often contain images whose captions can be thought of as annotations.
    Page 3, “Related Work”
  6. Many online news providers supply pictures with news articles; some even classify news into broad topic categories (e.g., business, world, sports, entertainment).
    Page 3, “BBC News Database”
  7. We downloaded 3,361 news articles from the BBC News website. Each article was accompanied by an image and its caption.
    Page 3, “BBC News Database”

development set

Appears in 4 sentences as: development set (4)
In Automatic Image Annotation Using Auxiliary Text Information
  1. the kernel whose value is optimized on the development set.
    Page 5, “BBC News Database”
  2. where α is a smoothing parameter tuned on the development set, s_a is the annotation for the latent variable s, and s_d its corresponding document.
    Page 5, “BBC News Database”
  3. where μ is a smoothing parameter estimated on the development set, δ_{w,s_a} is a Boolean variable denoting whether w appears in the annotation s_a, and N_w is the number of latent variables that contain w in their annotations.
    Page 5, “BBC News Database”
  4. The model presented in Section 4 has a few parameters that must be selected empirically on the development set. (A tuning sketch follows this list.)
    Page 6, “BBC News Database”
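
None of the excerpts say how α and μ are searched; a plain grid search over candidate values, scoring each setting by mean F1 on the development set, is one straightforward reading. In the sketch below, annotate_fn is a hypothetical wrapper around the annotation model that exposes both parameters:

    import itertools

    def f1(pred, gold):
        # F1 between predicted and gold keyword sets.
        tp = len(set(pred) & set(gold))
        if tp == 0:
            return 0.0
        p, r = tp / len(pred), tp / len(gold)
        return 2 * p * r / (p + r)

    def tune(dev_items, annotate_fn, alphas, mus):
        # Keep the (alpha, mu) pair with the best mean F1 on the dev set.
        best, best_score = None, -1.0
        for a, m in itertools.product(alphas, mus):
            score = sum(f1(annotate_fn(x, alpha=a, mu=m), gold)
                        for x, gold in dev_items) / len(dev_items)
            if score > best_score:
                best, best_score = (a, m), score
        return best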

content words

Appears in 3 sentences as: content words (3)
In Automatic Image Annotation Using Auxiliary Text Information
  1. We randomly selected 240 image-caption pairs and manually assessed whether the caption content words (i.e., nouns, verbs, and adjectives) could describe the image.
    Page 4, “BBC News Database”
  2. We rank the document’s content words (i.e., nouns, verbs, and adjectives) according to their tf-idf weight and select the top k to be the final annotations. (See the sketch after this list.)
    Page 6, “BBC News Database”
  3. Again we only use content words (the average title length in the training set was 4.0 words).
    Page 6, “BBC News Database”
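
The tf-idf ranking in excerpt 2 is compact enough to sketch directly. POS filtering to content words is assumed to happen upstream, and the exact idf variant is an assumption, since the excerpts do not specify one:

    import math
    from collections import Counter

    def tfidf_annotations(doc_tokens, df, n_docs, k=10):
        # doc_tokens: the document's content words (nouns, verbs, adjectives);
        # df: document-frequency counts over the whole collection.
        tf = Counter(doc_tokens)
        weights = {w: tf[w] * math.log(n_docs / (1 + df.get(w, 0)))
                   for w in tf}
        return sorted(weights, key=weights.get, reverse=True)[:k]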

feature vector

Appears in 3 sentences as: feature vector (3), feature vectors (2)
In Automatic Image Annotation Using Auxiliary Text Information
  1. Secondly, the generation of feature vectors is modeled directly, so there is no need for quantization.
    Page 4, “BBC News Database”
  2. where N_{V_I} is the number of regions in image I, v_r the feature vector for region r in image I, n_s^v the number of regions in the image of latent variable s, v_i the feature vector for region i in s’s image, k the dimension of the image feature vectors, and Σ the feature covariance matrix.
    Page 4, “BBC News Database”
  3. According to equation (3), a Gaussian kernel is fit to every feature vector v_i corresponding to region i in the image of the latent variable s. (A reconstruction of equation (3) follows this list.)
    Page 4, “BBC News Database”
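
Equation (3) itself is not reproduced in this index. From the variable definitions in excerpt 2 and the continuous relevance model these excerpts build on (Lavrenko et al., 2003), it plausibly has the form:

    P(v_r \mid s) = \frac{1}{n_s^v} \sum_{i=1}^{n_s^v}
        \frac{\exp\!\left\{ -(v_r - v_i)^{\top} \Sigma^{-1} (v_r - v_i) \right\}}
             {\sqrt{2^k \pi^k \, \lvert \Sigma \rvert}}

where the kernel bandwidth is folded into Σ; each term of the sum is the Gaussian kernel that excerpt 3 describes as being fit to v_i.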

topic model

Appears in 3 sentences as: topic model (3)
In Automatic Image Annotation Using Auxiliary Text Information
  1. A simple way to implement this idea is by re-ranking our k-best list according to a topic model estimated from the entire document collection.
    Page 5, “BBC News Database”
  2. Specifically, we use Latent Dirichlet Allocation (LDA) as our topic model (Blei et al., 2003).
    Page 5, “BBC News Database”
  3. An advantage of using LDA is that at test time we can perform inference without retraining the topic model.
    Page 6, “BBC News Database”
