A Re-examination of Query Expansion Using Lexical Resources
Fang, Hui

Article Structure

Abstract

Query expansion is an effective technique to improve the performance of information retrieval systems.

Introduction

Most information retrieval models (Salton et al., 1975; Fuhr, 1992; Ponte and Croft, 1998; Fang and Zhai, 2005) compute relevance scores based on matching of terms in queries and documents.

Related Work

Although the use of WordNet in query expansion has been studied by various researchers, the improvement of retrieval performance is often limited.

Query Expansion in Axiomatic Retrieval Model

Axiomatic approaches have recently been proposed and studied to develop retrieval functions (Fang and Zhai, 2005; Fang and Zhai, 2006).

Term Similarity based on Lexical Resources

In this section, we discuss a set of term similarity functions that exploit the information stored in two lexical resources: WordNet (Miller, 1990) and a dependency-based thesaurus (Lin, 1998).

Experiments

In this section, we experimentally evaluate the effectiveness of query expansion with the term similarity functions discussed in Section 4 in the axiomatic framework.

Conclusions

Query expansion is an effective technique for improving retrieval performance in information retrieval, because it can often bridge the vocabulary gaps between queries and documents.

Topics

WordNet

Appears in 23 sentences as: WordNet (25)
In A Re-examination of Query Expansion Using Lexical Resources
  1. Although handcrafted lexical resources, such as WordNet, could provide more reliable related terms, previous studies showed that query expansion using only WordNet leads to very limited performance improvement.
    Page 1, “Abstract”
  2. Intuitively, compared with co-occurrence-based thesauri, handcrafted thesauri, such as WordNet, could provide more reliable terms for query expansion.
    Page 1, “Introduction”
  3. However, previous studies failed to show any significant gain in retrieval performance when queries are expanded with terms selected from WordNet (Voorhees, 1994; Stairmand, 1997).
    Page 1, “Introduction”
  4. In this paper, we study several term similarity functions that exploit various information from two lexical resources, i.e., WordNet
    Page 1, “Introduction”
  5. We find that the most effective way to utilize the information from WordNet is to compute the term similarity based on the overlap of synset definitions.
    Page 2, “Introduction”
  6. We then present our study of using lexical resources, such as WordNet, for query expansion in Section 4, and discuss experiment results in Section 5.
    Page 2, “Introduction”
  7. Although the use of WordNet in query expansion has been studied by various researchers, the improvement of retrieval performance is often limited.
    Page 2, “Related Work”
  8. Voorhees (Voorhees, 1994) expanded queries using a combination of synonyms, hypernyms and hyponyms manually selected from WordNet, and achieved limited improvement (i.e., around -2% to
    Page 2, “Related Work”
  9. Stairmand (Stairmand, 1997) used WordNet for query expansion, but concluded that the improvement was restricted by the coverage of WordNet; no empirical results were reported.
    Page 2, “Related Work”
  10. Another way to improve retrieval performance using WordNet is to disambiguate word senses.
    Page 2, “Related Work”
  11. Voorhees (Voorhees, 1993) showed that using WordNet for word sense disambiguation degrades the retrieval performance.
    Page 2, “Related Work”

See all papers in Proc. ACL 2008 that mention WordNet.


synset

Appears in 12 sentences as: synset (8) synsets (6)
In A Re-examination of Query Expansion Using Lexical Resources
  1. We find that the most effective way to utilize the information from WordNet is to compute the term similarity based on the overlap of synset definitions.
    Page 2, “Introduction”
  2. Every node in WordNet is a synset, i.e., a set of synonyms.
    Page 3, “Term Similarity based on Lexical Resources”
  3. The definition of a synset, which is referred to as its gloss, is also provided.
    Page 3, “Term Similarity based on Lexical Resources”
  4. For a query term, all the synsets in which the term appears can be returned, along with the definitions of the synsets.
    Page 3, “Term Similarity based on Lexical Resources”
  5. Thus, we can compute the term semantic similarity based on synset definitions in the following way:
    Page 3, “Term Similarity based on Lexical Resources”
  6. where D(t) is the concatenation of the definitions for all the synsets containing term t and |D| is the number of words in the set D.
    Page 3, “Term Similarity based on Lexical Resources”
  7. Within a taxonomy, synsets are organized by their lexical relations.
    Page 3, “Term Similarity based on Lexical Resources”
  8. Thus, given a term, related terms can be found in the synsets related to the synsets containing the term.
    Page 3, “Term Similarity based on Lexical Resources”
  9. Experiment results show that the similarity function based on synset definitions is most effective.
    Page 4, “Experiments”
  10. First, the similarity function based on synset definitions is the most effective one.
    Page 5, “Experiments”
  11. As shown in Table 2, the similarity function based on synset definitions, i.e., sdef, is most effective.
    Page 7, “Experiments”
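
The sentences above describe a similarity based on the overlap of synset definitions, but the formula itself is not reproduced in this listing. A minimal sketch, assuming the overlap is measured as the size of the intersection of D(t1) and D(t2) divided by the size of their union, might look as follows. The gloss table `TOY_GLOSSES` and the names `definition_words` and `s_def` are hypothetical and chosen for illustration only; a real system would read the glosses from WordNet.

```python
import re

# Hypothetical toy glosses standing in for WordNet synset definitions.
# Each term maps to the definitions of all synsets that contain it.
TOY_GLOSSES = {
    "car": [
        "a motor vehicle with four wheels",
        "a wheeled vehicle adapted to the rails of a railroad",
    ],
    "automobile": ["a motor vehicle with four wheels"],
    "banana": ["an elongated curved tropical fruit"],
}


def definition_words(term, glosses=TOY_GLOSSES):
    """Return D(t): the set of words in the concatenated definitions of t."""
    text = " ".join(glosses.get(term, []))
    return set(re.findall(r"[a-z]+", text.lower()))


def s_def(t1, t2, glosses=TOY_GLOSSES):
    """Definition-overlap similarity, assumed here to be |D1 & D2| / |D1 | D2|."""
    d1 = definition_words(t1, glosses)
    d2 = definition_words(t2, glosses)
    union = d1 | d2
    if not union:
        # Neither term has a definition, so no overlap can be measured.
        return 0.0
    return len(d1 & d2) / len(union)
```

With this toy table, `s_def("car", "automobile")` is larger than `s_def("car", "banana")`, because the first pair shares gloss words and the second does not. Stop-word removal and stemming are omitted here and would likely matter in practice.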


significantly improve

Appears in 6 sentences as: significant improvement (1) significantly improve (4) significantly improves (1)
In A Re-examination of Query Expansion Using Lexical Resources
  1. In this paper, we reexamine this problem using recently proposed axiomatic approaches and find that, with an appropriate term weighting strategy, we are able to exploit the information from lexical resources to significantly improve the retrieval performance.
    Page 1, “Abstract”
  2. Using this similarity function in query expansion can significantly improve the retrieval performance.
    Page 2, “Introduction”
  3. Unlike previous studies, we are able to show that query expansion using only manually created lexical resources can significantly improve the retrieval performance.
    Page 2, “Introduction”
  4. By incorporating this similarity function into the axiomatic retrieval models, we show that query expansion using the information from only WordNet can lead to significant improvement of retrieval performance, which has not been shown in the previous studies (Voorhees, 1994; Stairmand, 1997).
    Page 4, “Experiments”
  5. QEdef significantly improves the retrieval performance for all the data sets.
    Page 5, “Experiments”
  6. In this paper, we reexamine the problem of query expansion using lexical resources in the recently proposed axiomatic framework and find that we are able to significantly improve retrieval performance through query expansion using only handcrafted lexical resources.
    Page 7, “Conclusions”


semantic similarity

Appears in 5 sentences as: semantic similarity (4) semantically similar (1)
In A Re-examination of Query Expansion Using Lexical Resources
  1. where s(q, d) is a semantic similarity function between two terms q and d, and f is a monotonically increasing function defined as
    Page 3, “Query Expansion in Axiomatic Retrieval Model”
  2. where β is a parameter that regulates the weighting of the original query terms and the semantically similar terms.
    Page 3, “Query Expansion in Axiomatic Retrieval Model”
  3. In our previous study (Fang and Zhai, 2006), term similarity function s is derived based on the mutual information of terms over collections that are constructed under the guidance of a set of term semantic similarity constraints.
    Page 3, “Query Expansion in Axiomatic Retrieval Model”
  4. Since the definition provides valuable information about the semantic meaning of a term, we can use the definitions of the terms to measure their semantic similarity.
    Page 3, “Term Similarity based on Lexical Resources”
  5. Thus, we can compute the term semantic similarity based on synset definitions in the following way:
    Page 3, “Term Similarity based on Lexical Resources”
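
The sentences above mention a monotonically increasing function f and a parameter that balances original query terms against semantically similar terms, but the definition of f is not reproduced in this listing. The sketch below is one possible reading, not the paper's formula: it assumes original query terms keep weight 1, and that a candidate term is weighted by the parameter (here `beta`) times its similarity to each query term, normalized by that query term's self-similarity and averaged over the query. The names `expand_query`, `toy_sim`, `TOY_SIM`, and the cutoff `k` are hypothetical.

```python
# Hypothetical toy similarity table; a real system would plug in a
# similarity function derived from a lexical resource.
TOY_SIM = {
    ("car", "car"): 1.0,
    ("car", "automobile"): 0.8,
    ("car", "vehicle"): 0.5,
    ("car", "banana"): 0.0,
}


def toy_sim(a, b):
    """Symmetric lookup; unknown pairs default to 1.0 if identical, else 0.0."""
    if (a, b) in TOY_SIM:
        return TOY_SIM[(a, b)]
    if (b, a) in TOY_SIM:
        return TOY_SIM[(b, a)]
    return 1.0 if a == b else 0.0


def expand_query(query_terms, candidates, sim, beta=0.5, k=2):
    """Return term weights: 1.0 for original terms, beta-scaled for expansions.

    Assumed weighting (illustrative only): the weight of candidate t is
    beta * mean over query terms q of sim(q, t) / sim(q, q).
    Only the k highest-scoring candidates with a positive score are kept.
    """
    weights = {q: 1.0 for q in query_terms}
    scores = {}
    for t in candidates:
        if t in weights:
            continue  # original query terms are never down-weighted
        total = 0.0
        for q in query_terms:
            self_sim = sim(q, q)
            if self_sim > 0:
                total += sim(q, t) / self_sim
        score = beta * total / len(query_terms)
        if score > 0:
            scores[t] = score
    top = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]
    weights.update(top)
    return weights
```

Calling `expand_query(["car"], ["automobile", "vehicle", "banana"], toy_sim)` keeps "car" at full weight and adds "automobile" and "vehicle" with smaller weights, while "banana" is dropped because its similarity is zero. Raising `beta` gives expansion terms more influence relative to the original query.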
