Learning Better Data Representation Using Inference-Driven Metric Learning
Dhillon, Paramveer S. and Talukdar, Partha Pratim and Crammer, Koby

Article Structure

Abstract

We initiate a study comparing the effectiveness of the transformed spaces learned by recently proposed supervised and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).

Introduction

Because of the high-dimensional nature of NLP datasets, estimating a large number of parameters (a parameter for each dimension), often from a limited amount of labeled data, is a challenging task for statistical learners.

Metric Learning

2.1 Relationship between Metric Learning and Linear Projection

A ← METRICLEARNER(X, S, D)

Conclusion

In this paper, we compared the effectiveness of the transformed spaces learned by recently proposed supervised and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).

Topics

learning algorithms

Appears in 11 sentences as: learning algorithm (5) learning algorithms (6)
In Learning Better Data Representation Using Inference-Driven Metric Learning
  1. We initiate a study comparing effectiveness of the transformed spaces learned by recently proposed supervised, and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
    Page 1, “Abstract”
  2. Through a variety of experiments on different real-world datasets, we find IDML-IT, a semi-supervised metric learning algorithm, to be the most effective.
    Page 1, “Abstract”
  3. Recently, several supervised metric learning algorithms have been proposed (Davis et al., 2007; Weinberger and Saul, 2009).
    Page 1, “Introduction”
  4. Even though different supervised and semi-supervised metric learning algorithms have recently been proposed, effectiveness of the transformed spaces learned by them in NLP
    Page 1, “Introduction”
  5. We find IDML-IT, a semi-supervised metric learning algorithm, to be the most effective.
    Page 1, “Introduction”
  6. We shall now review two recently proposed metric learning algorithms.
    Page 1, “Metric Learning”
  7. The ITML metric learning algorithm, which we reviewed in Section 2.2, is supervised in nature, and hence it does not exploit widely available unlabeled data.
    Page 2, “Metric Learning”
  8. In this section, we review Inference Driven Metric Learning (IDML) (Algorithm 1) (Dhillon et al., 2010), a recently proposed metric learning framework which combines an existing supervised metric learning algorithm (such as ITML) along with transductive graph-based label inference to learn a new distance metric from labeled as well as unlabeled data combined.
    Page 2, “Metric Learning”
  9. IDML starts out with the assumption that existing supervised metric learning algorithms, such as ITML, can learn a better metric if the number of available labeled instances is increased.
    Page 2, “Metric Learning”
  10. We also experimented with the supervised large-margin metric learning algorithm (LMNN) presented in (Weinberger and Saul, 2009).
    Page 4, “A ← METRICLEARNER(X, S, D)”
  11. In this paper, we compared the effectiveness of the transformed spaces learned by recently proposed supervised, and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
    Page 5, “Conclusion”
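The sentences above all turn on the idea that a learned metric induces a transformed space. The standard identity behind this — a Mahalanobis metric with PSD matrix A = PᵀP is just the Euclidean metric after projecting the data by P — can be sketched as follows. The matrices here are random placeholders, not the output of ITML, LMNN, or IDML:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(3, 5))            # illustrative 5-D -> 3-D projection
A = P.T @ P                            # induced PSD metric matrix

x, y = rng.normal(size=5), rng.normal(size=5)
d_A = (x - y) @ A @ (x - y)            # Mahalanobis distance d_A(x, y)
d_proj = np.sum((P @ x - P @ y) ** 2)  # squared Euclidean distance of Px, Py

assert np.isclose(d_A, d_proj)         # the two distances coincide
```

This is why the paper can speak interchangeably of learning a metric A and learning a data representation: any algorithm that outputs a PSD A implicitly defines a linear projection of the inputs.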

See all papers in Proc. ACL 2010 that mention learning algorithms.


semi-supervised

Appears in 10 sentences as: Semi-Supervised (2) semi-supervised (8)
In Learning Better Data Representation Using Inference-Driven Metric Learning
  1. We initiate a study comparing effectiveness of the transformed spaces learned by recently proposed supervised, and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
    Page 1, “Abstract”
  2. Through a variety of experiments on different real-world datasets, we find IDML-IT, a semi-supervised metric learning algorithm, to be the most effective.
    Page 1, “Abstract”
  3. Even though different supervised and semi-supervised metric learning algorithms have recently been proposed, effectiveness of the transformed spaces learned by them in NLP
    Page 1, “Introduction”
  4. We find IDML-IT, a semi-supervised metric learning algorithm, to be the most effective.
    Page 1, “Introduction”
  5. 2.3 Inference-Driven Metric Learning (IDML): Semi-Supervised
    Page 2, “Metric Learning”
  6. Since we are focusing on the semi-supervised learning (SSL) setting with n_l labeled and n_u unlabeled instances, the idea is to automatically label the unlabeled instances using a graph-based SSL algorithm, and then include instances with low assigned label entropy (i.e., high-confidence label assignments) in the next round of metric learning.
    Page 2, “Metric Learning”
  7. 3.3 Semi-Supervised Classification
    Page 4, “A ← METRICLEARNER(X, S, D)”
  8. In this section, we trained the GRF classifier (see Equation 3), a graph-based semi-supervised learning (SSL) algorithm (Zhu et al., 2003), using a Gaussian kernel parameterized by A = PᵀP to set edge weights.
    Page 4, “A ← METRICLEARNER(X, S, D)”
  9. In this paper, we compared the effectiveness of the transformed spaces learned by recently proposed supervised, and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
    Page 5, “Conclusion”
  10. Through a variety of experiments on different real-world NLP datasets, we demonstrated that supervised as well as semi-supervised classifiers trained on the space learned by IDML-IT consistently result in the lowest classification errors.
    Page 5, “Conclusion”
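Sentence 8 above describes graph-based SSL with Gaussian-kernel edge weights. A minimal sketch of that style of label propagation, in the spirit of the harmonic-function GRF classifier of Zhu et al. (2003), is below. The data are toy two-cluster points, and an identity matrix stands in for the learned metric A; none of this comes from the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two well-separated toy clusters of 5 points each in 2-D.
X = np.vstack([rng.normal(0, 0.3, (5, 2)), rng.normal(3, 0.3, (5, 2))])
labeled = np.array([0, 5])             # one labeled instance per cluster
y_l = np.array([0.0, 1.0])             # their labels
unlabeled = np.setdiff1d(np.arange(10), labeled)

A = np.eye(2)                          # placeholder metric; IDML would learn A
diff = X[:, None, :] - X[None, :, :]
d2 = np.einsum('ijk,kl,ijl->ij', diff, A, diff)  # pairwise d_A^2
W = np.exp(-d2)                        # Gaussian-kernel edge weights
D = np.diag(W.sum(axis=1))
L = D - W                              # graph Laplacian

# Harmonic solution for the unlabeled nodes: f_u = -L_uu^{-1} L_ul f_l
L_uu = L[np.ix_(unlabeled, unlabeled)]
L_ul = L[np.ix_(unlabeled, labeled)]
f_u = np.linalg.solve(L_uu, -L_ul @ y_l)
pred = (f_u > 0.5).astype(int)         # threshold the soft labels
```

The soft labels f_u settle near 0 for the first cluster and near 1 for the second, which is the behavior the GRF classifier relies on; a better metric A sharpens the edge weights and hence these soft labels.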


unlabeled data

Appears in 4 sentences as: unlabeled data (4)
In Learning Better Data Representation Using Inference-Driven Metric Learning
  1. IDML-IT (Dhillon et al., 2010) is another such method which exploits labeled as well as unlabeled data during metric learning.
    Page 1, “Introduction”
  2. The ITML metric learning algorithm, which we reviewed in Section 2.2, is supervised in nature, and hence it does not exploit widely available unlabeled data.
    Page 2, “Metric Learning”
  3. In this section, we review Inference Driven Metric Learning (IDML) (Algorithm 1) (Dhillon et al., 2010), a recently proposed metric learning framework which combines an existing supervised metric learning algorithm (such as ITML) along with transductive graph-based label inference to learn a new distance metric from labeled as well as unlabeled data combined.
    Page 2, “Metric Learning”
  4. In this case, we treat the set of test instances (without their gold labels) as the unlabeled data.
    Page 4, “A ← METRICLEARNER(X, S, D)”
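The IDML loop described in these sentences hinges on one filter: unlabeled instances whose inferred label distributions have low entropy are promoted to labeled status for the next metric learning round. A toy illustration of that filter is below; the soft label distributions and the threshold β are invented for illustration, not taken from the paper:

```python
import numpy as np

def label_entropy(p):
    """Shannon entropy of each row of a soft-label matrix."""
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

# Hypothetical soft labels assigned to three unlabeled instances
# by a graph-based inference step.
soft_labels = np.array([
    [0.98, 0.02],   # confident  -> promote to labeled
    [0.55, 0.45],   # uncertain  -> keep unlabeled
    [0.05, 0.95],   # confident  -> promote to labeled
])
beta = 0.3                                    # entropy threshold (free parameter)
confident = label_entropy(soft_labels) < beta
promoted = soft_labels.argmax(axis=1)[confident]
```

Only the first and third instances pass the entropy filter, so only they (with hard labels 0 and 1) would join the labeled set for the next call to the metric learner.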
