Creating Similarity: Lateral Thinking for Vertical Similarity Judgments
Veale, Tony and Li, Guofu

Article Structure

Abstract

Just as observing is more than just seeing, comparing is far more than mere matching.

Seeing is Believing (and Creating)

Similarity is a cognitive phenomenon that is both complex and subjective, yet for practical reasons it is often modeled as if it were simple and objective.

Related Work and Ideas

WordNet’s taxonomic organization of noun-senses and verb-senses — in which very general categories are successively divided into increasingly informative subcategories or instance-level ideas — allows us to gauge the overlap in information content, and thus of meaning, of two lexical concepts.

Divergent (Re)Categorization

To tap into a richer source of concept properties than WordNet’s glosses, we can use web n-grams.

Measuring and Creating Similarity

Which perspectives will be most useful and informative to a WordNet-based similarity metric?

Empirical Evaluation

Many fascinating perspectives on familiar ideas are bootstrapped from the web using similes as a starting point.

Summary and Conclusions

de Bono (1970) argues that the best solutions arise from using lateral and vertical thinking in unison.

Topics

WordNet

Appears in 51 sentences as: WordNet (58), WordNet’s (7)
In Creating Similarity: Lateral Thinking for Vertical Similarity Judgments
  1. Structured resources such as WordNet offer a convenient hierarchical means for converging on a common ground for comparison, but offer little support for the divergent thinking that is needed to creatively view one concept as another.
    Page 1, “Abstract”
  2. These lateral views complement the vertical views of WordNet , and support a system for idea exploration called Thesaurus Rex.
    Page 1, “Abstract”
  3. We show also how Thesaurus Rex supports a novel, generative similarity measure for WordNet .
    Page 1, “Abstract”
  4. This reliance on the consensus viewpoint explains why WordNet (Fellbaum, 1998) has proven so useful as a basis for computational measures of lexico-semantic similarity
    Page 1, “Seeing is Believing (and Creating)”
  5. Using WordNet , for instance, a similarity measure can vertically converge on a common superordinate category of both inputs, and generate a single numeric result based on their distance to, and the information content of, this common generalization.
    Page 2, “Seeing is Believing (and Creating)”
  6. Though WordNet is ideally structured to support vertical, convergent reasoning, its comprehensive nature means it can also be used as a solid foundation for building a more lateral and divergent model of similarity.
    Page 2, “Seeing is Believing (and Creating)”
  7. Here we will use the web as a source of diverse perspectives on familiar ideas, to complement the conventional and often narrow views codified by WordNet .
    Page 2, “Seeing is Believing (and Creating)”
  8. WordNet’s taxonomic organization of noun-senses and verb-senses — in which very general categories are successively divided into increasingly informative subcategories or instance-level ideas — allows us to gauge the overlap in information content, and thus of meaning, of two lexical concepts.
    Page 2, “Related Work and Ideas”
  9. Wu & Palmer (1994) use the depth of a lexical concept in the WordNet hierarchy as such a proxy, and thereby estimate the similarity of two lexical concepts as twice the depth of their LCS divided by the sum of their individual depths.
    Page 2, “Related Work and Ideas”
  10. Rather, when using Resnik’s metric (or that of Lin, or Jiang and Conrath) for measuring the similarity of lexical concepts in WordNet, one can use the category structure of WordNet itself to estimate information content.
    Page 3, “Related Work and Ideas”
  11. sources of information besides WordNet’s category structures.
    Page 3, “Related Work and Ideas”
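Item 9 above spells out the Wu & Palmer (1994) formula, which is simple enough to sketch directly; the concept names and depth values below are illustrative placeholders, not figures taken from WordNet:

```python
# Wu & Palmer (1994): sim(c1, c2) = 2 * depth(LCS) / (depth(c1) + depth(c2)),
# where LCS is the least common subsumer of the two concepts in the taxonomy.

def wu_palmer(depth_c1, depth_c2, depth_lcs):
    """Similarity from taxonomy depths; 1.0 when each concept is its own LCS."""
    return 2.0 * depth_lcs / (depth_c1 + depth_c2)

# Hypothetical depths: car and bicycle both at depth 10, sharing a
# wheeled_vehicle ancestor at depth 8 (invented numbers for illustration).
print(wu_palmer(10, 10, 8))  # 0.8
```

Deeper (more specific) shared ancestors push the score toward 1.0, which is exactly the vertical-convergence behavior the excerpts describe.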

See all papers in Proc. ACL 2013 that mention WordNet.

fine-grained

Appears in 17 sentences as: Fine-grained (1), fine-grained (19)
  1. A fine-grained category hierarchy permits fine-grained similarity judgments, and though WordNet is useful, its sense hierarchies are not especially fine-grained .
    Page 4, “Related Work and Ideas”
  2. However, we can automatically make WordNet subtler and more discerning, by adding new fine-grained categories to unite lexical concepts whose similarity is not reflected by any existing categories.
    Page 4, “Related Work and Ideas”
  3. Veale (2003) shows how a property that is found in the glosses of two lexical concepts, of the same depth, can be combined with their LCS to yield a new fine-grained parent category, so e.g.
    Page 4, “Related Work and Ideas”
  4. To find the stable properties that can underpin a meaningful fine-grained category for cowboy, we must seek out the properties that are so often presupposed to be salient of all cowboys that one can use them to anchor a simile, such as “swaggering like a cowboy” or “as grizzled as a cowboy”.
    Page 4, “Divergent (Re)Categorization”
  5. Since each hit will also yield a value for S via the wildcard *, and a fine-grained category PS for C, we use this approach here to harvest fine-grained categories from the web from most of our similes.
    Page 5, “Divergent (Re)Categorization”
  6. After 2 cycles we acquire 43 categories; after 3 cycles, 72; after 4 cycles, 93; and after 5 cycles, we acquire 102 fine-grained perspectives on cola, such as stimulating-drink and corrosive-substance.
    Page 5, “Divergent (Re)Categorization”
  7. Fine-grained perspectives for cola found by Thesaurus Rex on the web.
    Page 5, “Divergent (Re)Categorization”
  8. We also want any fine-grained perspective M-H to influence our similarity metric, provided it can be coherently tied into WordNet as a shared hypernym of the two lexical concepts being compared.
    Page 6, “Measuring and Creating Similarity”
  9. The denominator in (2) denotes the sum total of the size of all fine-grained categories that can be coherently added to WordNet for any term.
    Page 6, “Measuring and Creating Similarity”
  10. For a shared dimension H in the feature vectors of concepts C1 and C2, if at least one fine-grained perspective M-H has been added to WordNet between H and C1 and between H and C2, then the value of dimension H for C1 and for C2 is given by (4):
    Page 6, “Measuring and Creating Similarity”
  11. A fine-grained perspective M-H will thus influence a similarity judgment between C1 and C2 only if M-H can be coherently added to WordNet as a hypernym of C1 and C2, and if M-H enriches our view of H. Unlike Resnik (1995), Lin (1998) and Seco et al.
    Page 6, “Measuring and Creating Similarity”
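Item 3 (Veale, 2003) describes minting a new fine-grained parent by combining a property shared by two concepts' glosses with their LCS. A minimal sketch, assuming toy gloss-property sets and a hypothetical LCS name:

```python
# Sketch of fine-grained category minting: a property found in both glosses
# is fused with the LCS to name a new, more discerning parent category.
# The gloss properties and the LCS "beverage" are invented for illustration.

def mint_categories(props_1, props_2, lcs):
    """One new fine-grained parent per property shared by both glosses."""
    return sorted(f"{p}-{lcs}" for p in props_1 & props_2)

espresso = {"strong", "hot", "stimulating"}   # toy gloss properties
tea      = {"hot", "soothing", "calming"}

print(mint_categories(espresso, tea, "beverage"))  # ['hot-beverage']
```

The minted hot-beverage node would then sit between the LCS and the two concepts, giving the taxonomy the extra discernment the excerpts call for.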

similarity measure

Appears in 8 sentences as: similarity measure (5), similarity measurement (1), similarity measures (2)
  1. We show also how Thesaurus Rex supports a novel, generative similarity measure for WordNet.
    Page 1, “Abstract”
  2. Using WordNet, for instance, a similarity measure can vertically converge on a common superordinate category of both inputs, and generate a single numeric result based on their distance to, and the information content of, this common generalization.
    Page 2, “Seeing is Believing (and Creating)”
  3. To be as useful for creative tasks as they are for conventional tasks, we need to re-imagine our computational similarity measures as generative rather than selective, expansive rather than reductive, divergent as well as convergent and lateral as well as vertical.
    Page 2, “Seeing is Believing (and Creating)”
  4. Section 2 provides a brief overview of past work in the area of similarity measurement , before section 3 describes a simple bootstrapping loop for acquiring richly diverse perspectives from the web for a wide variety of familiar ideas.
    Page 2, “Seeing is Believing (and Creating)”
  5. more rounded similarity measures .
    Page 3, “Related Work and Ideas”
  6. A similarity measure can draw on other
    Page 3, “Related Work and Ideas”
  7. Their best similarity measure achieves a remarkable 0.93 correlation with human judgments on the Miller & Charles word-pair set.
    Page 3, “Related Work and Ideas”
  8. But no structural similarity measure for WordNet exhibits enough discernment to e.g.
    Page 4, “Related Work and Ideas”

n-grams

Appears in 4 sentences as: n-grams (4)
  1. To tap into a richer source of concept properties than WordNet’s glosses, we can use web n-grams .
    Page 4, “Divergent (Re)Categorization”
  2. Consider these descriptions of a cowboy from the Google n-grams (Brants & Franz, 2006).
    Page 4, “Divergent (Re)Categorization”
  3. So for each property P suggested by Google n-grams for a lexical concept C, we generate a like-simile for verbal behaviors such as swaggering and an as-as-simile for adjectives such as lonesome.
    Page 4, “Divergent (Re)Categorization”
  4. Using the Google n-grams as a source of tacit grouping constructions, we have created a comprehensive lookup table that provides Rex similarity scores for the most common (if often implicit) comparisons.
    Page 8, “Summary and Conclusions”
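Item 3 turns each candidate property into a web-testable simile: a like-simile for verbal behaviors, an as-as-simile for adjectives. A minimal sketch of that query generation, with the part-of-speech labels assumed rather than taken from Rex:

```python
# Generate the simile string used to test, via web hits, whether a
# property is presupposed salient enough to anchor a simile.

def make_simile(prop, pos, concept):
    if pos == "verb":                      # e.g. swaggering, strutting
        return f"{prop} like a {concept}"
    return f"as {prop} as a {concept}"     # e.g. lonesome, grizzled

print(make_simile("swaggering", "verb", "cowboy"))  # swaggering like a cowboy
print(make_simile("lonesome", "adj", "cowboy"))     # as lonesome as a cowboy
```

Each generated string would then be issued as a quoted web query; only properties whose similes attract hits survive as anchors for fine-grained categories.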

similarity score

Appears in 4 sentences as: similarity score (3), similarity scores (1)
  1. Negating the log of this normalized length yields a corresponding similarity score .
    Page 2, “Related Work and Ideas”
  2. Rex estimates a similarity score for each of the 1,264,827 pairings of comparable terms it finds in the Google 3-grams.
    Page 7, “Empirical Evaluation”
  3. Using the Google n-grams as a source of tacit grouping constructions, we have created a comprehensive lookup table that provides Rex similarity scores for the most common (if often implicit) comparisons.
    Page 8, “Summary and Conclusions”
  4. Comparability is not the same as similarity, and a nonzero similarity score does not mean that two concepts would ever be considered comparable by a human.
    Page 8, “Summary and Conclusions”
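Item 1's score (negating the log of a normalized path length) has the familiar Leacock & Chodorow shape; the maximum-depth constant below is an assumed value, not one taken from the paper:

```python
import math

# -log(path_len / (2 * max_depth)): shorter taxonomy paths between two
# senses yield higher similarity; the longest possible path scores 0.

def neg_log_similarity(path_len, max_depth=16):  # max_depth is illustrative
    return -math.log(path_len / (2.0 * max_depth))

# A pair two links apart should beat a pair six links apart:
print(neg_log_similarity(2) > neg_log_similarity(6))  # True
```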

human judgments

Appears in 3 sentences as: human judges (1), human judgments (2)
  1. Strube and Ponzetto (2006) show how Wikipedia can support a measure of similarity (and relatedness) that better approximates human judgments than many WordNet-based measures.
    Page 3, “Related Work and Ideas”
  2. Their best similarity measure achieves a remarkable 0.93 correlation with human judgments on the Miller & Charles word-pair set.
    Page 3, “Related Work and Ideas”
  3. We evaluate Rex by estimating how closely its judgments correlate with those of human judges on the 30-pair word set of Miller & Charles (M&C), who aggregated the judgments of multiple human raters into mean ratings for these pairs.
    Page 6, “Empirical Evaluation”
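Item 3's evaluation reduces to correlating system scores against the M&C mean ratings. A self-contained Pearson correlation over toy numbers (not the actual M&C data):

```python
# Pearson product-moment correlation, as used to compare system scores
# against aggregated human similarity ratings.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

human  = [3.92, 3.84, 0.42, 1.68]   # toy mean ratings, not real M&C values
system = [0.96, 0.91, 0.10, 0.45]   # toy system scores
print(round(pearson(human, system), 2))
```

A value near 1.0 would indicate the system ranks pairs the way the human raters do, which is the yardstick the excerpts (and the reported 0.93 correlation) use.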
