SciSurf: Index of 'Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity'

Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity

Veale, Tony

Published in Proc. ACL, 2011

Article Structure

Abstract

Information retrieval (IR) and figurative language processing (FLP) could scarcely be more different in their treatment of language and meaning.

Introduction

Words should not always be taken at face value.

Related Work and Ideas

IR works on the premise that a user can turn an information need into an effective query by anticipating the language that is used to talk about a given topic in a target collection.

Creative Text Retrieval

In language, creativity is always a matter of construal.

Applications of Creative Retrieval

The Google ngrams comprise a vast array of extracts from English web texts, of 1 to 5 words in length (Brants and Franz, 2006).

Empirical Evaluation

Though ^ is the most overtly categorical of our wildcards, all three wildcards (?, @ and ^) are categorical in nature.

Concluding Remarks

Creative information retrieval is not a single application, but a paradigm that allows us to conceive of many different kinds of application for creatively manipulating text.

Topics

WordNet

Appears in 12 sentences as: WordNet (11) WordNet’s (1)

In Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity

Techniques vary, from the use of stemmers and morphological analysis to the use of thesauri (such as WordNet; see Fellbaum, 1998; Voorhees, 1998) to pad a query with synonyms, to the use of statistical analysis to identify more appropriate context-sensitive associations and near-synonyms (e.g.
Page 2, “Related Work and Ideas”
Hearst (1992) shows how a pattern like “Xs and other Ys” can be used to construct more fluid, context-specific taxonomies than those provided by WordNet (e.g.
Page 2, “Related Work and Ideas”
A generic, lightweight resource like WordNet can provide these relations, or a richer ontology can be used if one is available (e.g.
Page 3, “Creative Text Retrieval”
But ad-hoc categories do not replace natural kinds; rather, they supplement an existing system of more-or-less rigid categories, such as the categories found in WordNet.
Page 4, “Creative Text Retrieval”
member of the category named by C. ^C can denote a fixed category in a resource like WordNet or even Wikipedia; thus, ^fruit matches any member of {apple, orange, pear, lemon} and ^animal any member of {dog, cat, mouse, deer, fox}.
Page 4, “Creative Text Retrieval”
and @ as category builders to a handcrafted gold standard like WordNet.
Page 7, “Empirical Evaluation”
Other researchers have likewise used WordNet as a gold standard for categorization experiments, and we replicate here the experimental setup of Almuhareb and Poesio (2004, 2005), which is designed to measure the effectiveness of web-acquired conceptual descriptions.
Page 7, “Empirical Evaluation”
Almuhareb and Poesio choose 214 English nouns from 13 of WordNet’s upper-level semantic categories, and proceed to harvest property values for these concepts from the web using the Hearst-like pattern “a|an|the * C is|was”.
Page 7, “Empirical Evaluation”
Let ^AP denote the set of 214 WordNet nouns used by Almuhareb and Poesio.
Page 7, “Empirical Evaluation”
When the 8,300 features in ?^AP are clustered into 13 categories, the resulting clusters have a purity of 93.4% relative to WordNet.
Page 8, “Empirical Evaluation”
Almuhareb and Poesio’s set of 214 words does not contain adjectives, and besides, WordNet does not impose a category structure on its adjectives.
Page 8, “Empirical Evaluation”


lexicalized

Appears in 3 sentences as: lexicalized (3)

In Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity

While some techniques may suggest conventional metaphors that have become lexicalized in a language, they are unlikely to identify relatively novel expressions.
Page 2, “Related Work and Ideas”
The Google ngrams can be seen as a lexicalized idea space, embedded within a larger sea of noise.
Page 5, “Applications of Creative Retrieval”
Each creative query is a jumping off point in a space of lexicalized ideas that is implied by a large corpus, with each successive match leading the user deeper into the space.
Page 5, “Applications of Creative Retrieval”
