Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
Cao, Guihong and Robertson, Stephen and Nie, Jian-Yun

Article Structure

Abstract

Query expansion by word alterations (alternative forms of a word) is often used in Web search to replace word stemming.

Introduction

Word stemming is a basic NLP technique used in most Information Retrieval (IR) systems.

Related Work

Many stemmers have been implemented and used as standard processing in IR.

Generating Alteration Candidates

Our method to generate alteration candidates can be described as follows.

Bigram Expansion Model for Alteration Selection

In this section, we try to select the most suitable alterations according to the query context.

Regression Model for Alteration Selection

None of the previous selection methods considers how well an alteration would perform in retrieval.

Experiments 6.1 Experiment Settings

In this section, our aim is to evaluate the two context-sensitive word alteration selection methods.

Conclusion

Traditional IR approaches stem terms in both documents and queries.

Topics

Bigram

Appears in 25 sentences as: Bigram (16) bigram (12)
In Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
  1. The selection is made according to the appropriateness of the alteration to the query context (using a bigram language model), or according to its expected impact on the retrieval effectiveness (using a regression model).
    Page 1, “Abstract”
  2. The query context is modeled by a bigram language model.
    Page 2, “Introduction”
  3. which is the most coherent with the bigram model.
    Page 2, “Introduction”
  4. We call this model Bigram Expansion.
    Page 2, “Introduction”
  5. Both the Naive Expansion and the Bigram Expansion determine word alterations solely according to general knowledge about the language (bigram model or morphological rules), and no consideration about the possible effect of the expansion term is made.
    Page 2, “Introduction”
  6. Compared to the bigram expansion method, the regression method results in even fewer alterations, but experiments show that the retrieval effectiveness is even better.
    Page 2, “Introduction”
  7. In Sections 4 and 5, we describe the Bigram Expansion method and the Regression method, respectively.
    Page 2, “Introduction”
  8. 2007), a bigram language model is used to determine the alteration of the head word that best fits the query.
    Page 3, “Related Work”
  9. In this paper, one of the proposed methods will also use a bigram language model of the query to determine the appropriate alteration candidates.
    Page 3, “Related Work”
  10. The query context is modeled by a bigram language model as in (Peng et al.
    Page 3, “Bigram Expansion Model for Alteration Selection”
  11. In this work, we used a bigram language model to calculate the probability of each path.
    Page 4, “Bigram Expansion Model for Alteration Selection”
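
The sentences above describe scoring each "path" (one candidate word per query position) with a bigram language model and keeping the most coherent one. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the probability floor for unseen events, and all probabilities in the toy tables are assumptions made up for illustration.

```python
import math
from itertools import product

def path_log_prob(path, unigram, bigram, floor=1e-9):
    """Log-probability of a word sequence: P(e1) times the product of P(ek | ek-1).

    `floor` is an assumed stand-in probability for events missing from the tables.
    """
    logp = math.log(unigram.get(path[0], floor))
    for prev, cur in zip(path, path[1:]):
        logp += math.log(bigram.get((prev, cur), floor))
    return logp

def best_path(candidates, unigram, bigram):
    """Enumerate one candidate per query position and return the most probable path."""
    return max(product(*candidates),
               key=lambda p: path_log_prob(p, unigram, bigram))

# Toy probabilities, invented purely for illustration.
unigram = {"controlling": 0.01, "control": 0.02}
bigram = {
    ("controlling", "acid"): 0.02, ("controlling", "acidic"): 0.001,
    ("control", "acid"): 0.005, ("control", "acidic"): 0.001,
    ("acid", "rain"): 0.30, ("acidic", "rain"): 0.05,
    ("acid", "rains"): 0.01, ("acidic", "rains"): 0.01,
}
# Each inner list holds the original query word and its hypothetical alterations.
candidates = [["controlling", "control"], ["acid", "acidic"], ["rain", "rains"]]
best = best_path(candidates, unigram, bigram)
print(best)
```

Exhaustive enumeration is only practical for short queries; a dynamic-programming search over the lattice would be the usual replacement for longer ones.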

See all papers in Proc. ACL 2008 that mention Bigram.

regression model

Appears in 20 sentences as: Regression Model (2) Regression model (6) regression model (13) regression models (2)
In Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
  1. The selection is made according to the appropriateness of the alteration to the query context (using a bigram language model), or according to its expected impact on the retrieval effectiveness (using a regression model).
    Page 1, “Abstract”
  2. In this paper, we will use a regression model to predict the impact on retrieval effectiveness.
    Page 2, “Introduction”
  3. This method develops a regression model from a set of training data, and it is capable of predicting the expected change in performance when the original query is augmented by this alteration.
    Page 4, “Regression Model for Alteration Selection”
  4. 5.1 Linear Regression Model
    Page 4, “Regression Model for Alteration Selection”
  5. The goal of the regression model is to predict the performance change when a query term is augmented with an alteration.
    Page 4, “Regression Model for Alteration Selection”
  6. There are several regression models, ranging from the simplest linear regression model to nonlinear alternatives, such as a neural network (Duda et al., 2001) or a regression SVM (Bishop, 2006).
    Page 4, “Regression Model for Alteration Selection”
  7. For simplicity, we use a linear regression model here.
    Page 4, “Regression Model for Alteration Selection”
  8. We denote an instance in the feature space as X, and the weights of features are denoted as W. Then the linear regression model is defined as:
    Page 4, “Regression Model for Alteration Selection”
  9. We train the regression model by minimizing the mean square error.
    Page 5, “Regression Model for Alteration Selection”
  10. As a supervised learning method, the regression model is trained with a set of training data.
    Page 5, “Regression Model for Alteration Selection”
  11. 5.3 Features Used for Regression Model
    Page 5, “Regression Model for Alteration Selection”
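
As a rough illustration of the training step described above (a linear model fit by minimizing the mean squared error, then used to predict the performance change caused by an alteration), here is a small sketch. It uses plain batch gradient descent, which is one of several ways to minimize this objective and is not claimed to be the authors' procedure. The two feature columns, the targets, and the "keep if predicted change is positive" rule are hypothetical.

```python
def predict(weights, features):
    """Linear model: weighted sum of the feature values."""
    return sum(w * x for w, x in zip(weights, features))

def train(X, y, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the mean squared error."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j in range(d):
                grad[j] += 2.0 * err * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Each row: [bias, feature_a, feature_b]; both features are hypothetical.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
# Invented targets: change in a retrieval metric when the alteration is added.
y = [0.1, -0.1, 0.4, 0.2]

weights = train(X, y)

# Assumed decision rule: keep an alteration only if the predicted change is positive.
keep = predict(weights, [1, 0, 1]) > 0
print(weights, keep)
```

The targets here were generated from the weights 0.1, -0.2 and 0.3, so the fit should recover them closely; real training data would of course be noisy.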

language model

Appears in 8 sentences as: language model (8)
In Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
  1. The selection is made according to the appropriateness of the alteration to the query context (using a bigram language model), or according to its expected impact on the retrieval effectiveness (using a regression model).
    Page 1, “Abstract”
  2. The query context is modeled by a bigram language model.
    Page 2, “Introduction”
  3. 2007), a bigram language model is used to determine the alteration of the head word that best fits the query.
    Page 3, “Related Work”
  4. In this paper, one of the proposed methods will also use a bigram language model of the query to determine the appropriate alteration candidates.
    Page 3, “Related Work”
  5. The query context is modeled by a bigram language model as in (Peng et al.
    Page 3, “Bigram Expansion Model for Alteration Selection”
  6. In this work, we used a bigram language model to calculate the probability of each path.
    Page 4, “Bigram Expansion Model for Alteration Selection”
  7. $P(e_1, e_2, \ldots, e_i, \ldots, e_n) = P(e_1) \prod_{k=2}^{n} P(e_k \mid e_{k-1})$ (2). $P(e_k \mid e_{k-1})$ is estimated with a back-off bigram language model (Goodman, 2001).
    Page 4, “Bigram Expansion Model for Alteration Selection”
  8. In the first method proposed, the Bigram Expansion model, the query context is modeled by a bigram language model.
    Page 8, “Conclusion”
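
To make "back-off bigram language model" concrete, the sketch below estimates P(cur | prev) from counts using absolute discounting with back-off to unigram probabilities. This is one simple back-off scheme chosen for illustration; it is not necessarily the variant from Goodman (2001) or the one used in the paper. The class name, the discount value, and the toy corpus are assumptions.

```python
from collections import Counter, defaultdict

class BackoffBigram:
    """Bigram model with absolute discounting and back-off to unigrams."""

    def __init__(self, sentences, discount=0.5):
        self.d = discount
        self.uni = Counter()
        self.bi = defaultdict(Counter)
        for s in sentences:
            self.uni.update(s)
            for prev, cur in zip(s, s[1:]):
                self.bi[prev][cur] += 1
        self.total = sum(self.uni.values())

    def p_uni(self, w):
        return self.uni[w] / self.total

    def prob(self, prev, cur):
        followers = self.bi.get(prev)
        if not followers:
            # History never seen as a bigram context: use the unigram estimate.
            return self.p_uni(cur)
        hist = sum(followers.values())
        if cur in followers:
            # Seen bigram: discounted relative frequency.
            return (followers[cur] - self.d) / hist
        # Unseen bigram: share the reserved mass in proportion to unigram probability.
        reserved = self.d * len(followers) / hist
        unseen_mass = 1.0 - sum(self.p_uni(w) for w in followers)
        return reserved * self.p_uni(cur) / unseen_mass if unseen_mass > 0 else 0.0

# Toy corpus, invented for illustration.
corpus = [["acid", "rain"], ["acid", "rain"], ["acid", "test"], ["heavy", "rain"]]
lm = BackoffBigram(corpus)
p_seen = lm.prob("acid", "rain")     # seen bigram
p_unseen = lm.prob("acid", "heavy")  # unseen bigram, backed off
total = sum(lm.prob("acid", w) for w in lm.uni)
print(p_seen, p_unseen, total)
```

The mass removed from seen bigrams is exactly what is redistributed to unseen ones, so the conditional probabilities for a given history sum to one over the vocabulary.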

linear regression

Appears in 4 sentences as: Linear Regression (1) linear regression (3)
In Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
  1. 5.1 Linear Regression Model
    Page 4, “Regression Model for Alteration Selection”
  2. There are several regression models, ranging from the simplest linear regression model to nonlinear alternatives, such as a neural network (Duda et al., 2001) or a regression SVM (Bishop, 2006).
    Page 4, “Regression Model for Alteration Selection”
  3. For simplicity, we use a linear regression model here.
    Page 4, “Regression Model for Alteration Selection”
  4. We denote an instance in the feature space as X, and the weights of features are denoted as W. Then the linear regression model is defined as:
    Page 4, “Regression Model for Alteration Selection”
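
The defining equation is not reproduced on this page. For orientation only, the standard textbook form of a linear regression model over a feature vector X with weights W, together with the mean squared error objective over N training instances, can be written as below. The indices i and j, the targets y_i, and N are notation assumed here, not taken from the paper.

```latex
f(X) = W^{\top} X = \sum_{j} w_j x_j,
\qquad
\mathrm{MSE}(W) = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - W^{\top} X_i \bigr)^{2}
```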

co-occurrence

Appears in 3 sentences as: co-occurrence (3)
In Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
  1. For example, for the query “controlling acid rain”, the coherence of the alteration “acidic” is measured by the logarithm of its co-occurrence with the other query terms within a predefined window (90 words) in the corpus.
    Page 5, “Regression Model for Alteration Selection”
  2. where P(controlling...acidic...rain|window) is the co-occurrence probability of the trigram containing acidic within a predefined window (50 words).
    Page 5, “Regression Model for Alteration Selection”
  3. On the other hand, the second feature helps because it can capture some co-occurrence information no matter how long the query is.
    Page 6, “Regression Model for Alteration Selection”
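
A windowed co-occurrence feature of the kind described above could be computed roughly as follows. This sketch counts sliding windows in which all the given terms appear and takes the logarithm of that count. The sliding-window counting rule, the additive smoothing constant (needed so the logarithm is defined when the count is zero), the tiny window size, and the toy text are all assumptions for illustration.

```python
import math

def window_cooccurrence(tokens, terms, window):
    """Count sliding windows of `window` tokens that contain every term."""
    needed = set(terms)
    count = 0
    for start in range(max(1, len(tokens) - window + 1)):
        if needed <= set(tokens[start:start + window]):
            count += 1
    return count

def coherence(tokens, terms, window, smooth=0.5):
    """Log of the smoothed co-occurrence count; `smooth` is an assumed constant."""
    return math.log(window_cooccurrence(tokens, terms, window) + smooth)

# Toy corpus, invented for illustration; a real window would be far wider.
tokens = "controlling acidic rain is hard because acidic rain damages forests".split()
full = window_cooccurrence(tokens, ["controlling", "acidic", "rain"], window=3)
pair = window_cooccurrence(tokens, ["acidic", "rain"], window=3)
score = coherence(tokens, ["controlling", "acidic", "rain"], window=3)
print(full, pair, score)
```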

statistically significant

Appears in 3 sentences as: statistically significant (3)
In Selecting Query Term Alternations for Web Search by Exploiting Query Contexts
  1. We also conducted t-tests to determine whether the improvement is statistically significant.
    Page 6, “Experiments 6.1 Experiment Settings”
  2. model is statistically significant with p-value<0.05,
    Page 7, “Experiments 6.1 Experiment Settings”
  3. Moreover, the improvements on five collections are statistically significant.
    Page 7, “Experiments 6.1 Experiment Settings”
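
The page does not say which kind of t-test was run. A common choice when two systems are evaluated on the same queries is a paired t-test on per-query scores, so the sketch below assumes that setup. The scores are invented, and the critical value is the standard two-sided 0.05 table entry for 4 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(baseline, system):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [s - b for b, s in zip(baseline, system)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Invented per-query scores for a baseline and an expanded-query run.
baseline = [0.20, 0.35, 0.10, 0.40, 0.25]
system = [0.25, 0.38, 0.16, 0.42, 0.29]

t_stat, df = paired_t_statistic(baseline, system)

# Two-sided critical value at the 0.05 level for df = 4 (standard t table).
CRITICAL_T_DF4 = 2.776
significant = abs(t_stat) > CRITICAL_T_DF4
print(t_stat, df, significant)
```

Comparing against a table value keeps the sketch dependency-free; a statistics library would normally be used to obtain the exact p-value.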
