Semantic Tagging of Web Search Queries
Manshadi, Mehdi and Li, Xiao

Article Structure

Abstract

We present a novel approach to parsing web search queries for the purpose of automatically tagging them.

Introduction

Understanding users’ intent from web search queries is an important step in designing an intelligent search engine.

ID/LP Grammar

Context-free phrase structure grammars are widely used for parsing natural language.

Our Grammar Model

3.1 The basic model

We propose a set of rules of the form: (6) $S \rightarrow \{B, C\}$
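
To make the rule form concrete, here is a minimal sketch in Python of how rules with set-valued (unordered) right-hand sides could be represented and checked. The rule set, the tag names, and the matches helper are illustrative assumptions, not the paper's implementation.

```python
# A grammar whose rules have set-valued right-hand sides, as in rule (6):
# the children of a node may appear in any order. Rules here are invented.
RULES = {
    "Q": [("B", "T"), ("B", "B", "T", "U")],  # a Query over Brands, a Type, fillers
}

def matches(lhs, children, rules=RULES):
    """Return True if `children`, taken in any order, match a rule for `lhs`."""
    return any(sorted(rhs) == sorted(children) for rhs in rules.get(lhs, []))

print(matches("Q", ["T", "B"]))            # True: order is irrelevant
print(matches("Q", ["B", "U", "B", "T"]))  # True
print(matches("Q", ["T", "T"]))            # False
```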

Parsing Algorithm

4.1 Deterministic parser

A grammar for semantic tagging

As mentioned before, in our system queries are already classified into different domains such as movies, books, and products.
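
A toy illustration of the lexical layer such a domain grammar could sit on: tagging a product query by lexicon lookup. The entries below are invented; B, T, and U follow the tag abbreviations the paper uses for Brand, Type, and unlabeled words.

```python
# Hypothetical lexicon for a "products" domain.
# B = Brand, T = Type, U = unlabeled filler word.
LEXICON = {
    "canon": "B", "sony": "B",   # brand entries
    "camera": "T",               # type entries
    "vs": "U", "the": "U",       # words left unlabeled in the corpus
}

def tag_query(query):
    """Assign a semantic tag to each token by lexicon lookup."""
    return [(tok, LEXICON.get(tok.lower(), "U")) for tok in query.split()]

print(tag_query("Canon vs Sony Camera"))
# [('Canon', 'B'), ('vs', 'U'), ('Sony', 'B'), ('Camera', 'T')]
```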

Discriminative re-ranking

By using a context-free grammar, we are missing a great source of clues that can help to resolve ambiguity.
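
A minimal sketch of n-best re-ranking under stated assumptions: the parser emits candidate trees with generative scores, and a linear model over contextual features (e.g. one trained with an SVM, as in the paper's parser+SVM setup) rescores them. The interface and the feature encoding are invented for the example.

```python
def rerank(nbest, feature_fn, weights):
    """Pick the best of the parser's n-best trees with a linear re-ranker.

    nbest      : list of (tree, generative_score) pairs from the parser
    feature_fn : maps a tree to a dict of contextual feature counts
    weights    : learned feature weights (e.g. from an SVM)
    """
    def score(tree, gen_score):
        feats = feature_fn(tree)
        return gen_score + sum(weights.get(f, 0.0) * v for f, v in feats.items())

    return max(nbest, key=lambda pair: score(*pair))

# Toy usage: one contextual feature flips the parser's original ranking.
nbest = [("tree_a", -2.0), ("tree_b", -2.5)]
feats = {"tree_a": {"B_followed_by_B": 1}, "tree_b": {"B_followed_by_B": 0}}
print(rerank(nbest, lambda t: feats[t], {"B_followed_by_B": -1.0}))
# ('tree_b', -2.5)
```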

Evaluation

Our resources are a set of 21,000 manually labeled queries, a manually designed grammar, a lexicon for every tag (except Type and Attribute), and a set of regular expressions defined for Models and Attributes.
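
As an example of the flavor of regular expression that might be defined for the Model tag, the pattern below matches alphanumeric model identifiers. It is an assumption for illustration, not one of the paper's actual expressions.

```python
import re

# Hypothetical pattern for product model identifiers: a short letter
# prefix, an optional hyphen and letter, then 2-4 digits (e.g. "sd750").
MODEL_RE = re.compile(r"^[a-z]{1,4}-?[a-z]?\d{2,4}[a-z]?$", re.IGNORECASE)

for token in ["sd750", "dsc-w80", "camera", "2009"]:
    print(token, bool(MODEL_RE.match(token)))
# sd750 True / dsc-w80 True / camera False / 2009 False
```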

Summary

We introduced a novel approach for deep parsing of web search queries.

Topics

parse tree

Appears in 14 sentences as: parse tree (8) parse trees (6)
In Semantic Tagging of Web Search Queries
  1. In order to take contextual information into account, a discriminative model is used on top of the parser to re-rank the n-best parse trees generated by the parser.
    Page 1, “Abstract”
  2. To overcome this limitation, we further present a discriminative re-ranking module on top of the parser to re-rank the n-best parse trees generated by the parser using contextual features.
    Page 2, “Introduction”
  3. A CFSG parse tree
    Page 4, “Our Grammar Model”
  4. A CFSG parse tree
    Page 4, “Our Grammar Model”
  5. (9) $\sum_{j} P(A_i \rightarrow x_j) = 1$. Consider a sentence $w_1 w_2 \ldots w_n$, a parse tree $T$ of this sentence, and an interior node $v$ in $T$ labeled with $A_v$, and assume that $v_1, v_2, \ldots, v_k$ are the children of the node $v$ in $T$. We define:
    Page 4, “Our Grammar Model”
  6. The parse tree that the probabilistic model assigns to the sentence is defined as:
    Page 4, “Our Grammar Model”
  7. (12) $T_{\max} = \arg\max_{T} P(w_1 w_2 \ldots w_n, T)$, where $T$ ranges over all possible parse trees of the sentence.
    Page 4, “Our Grammar Model” (see the worked sketch after this list)
  8. This information is necessary for the termination step in order to print the parse trees.
    Page 5, “Parsing Algorithm”
  9. Two equivalent CFSG parse trees
    Page 7, “A grammar for semantic tagging”
  10. Figure (7a) shows an example of a parse tree generated for the query “Canon vs Sony Camera”, in which B, Q, and T are abbreviations for Brand, Query, and Type, and U is a special tag for words that do not fall into any other tag category and have been left unlabeled in our corpus, such as a, the, for, etc.
    Page 7, “A grammar for semantic tagging”
  11. A more careful look at the grammar shows that there is another parse tree for this query, as shown in figure (7b).
    Page 7, “A grammar for semantic tagging”
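
To make the probabilistic model quoted in items 5-7 concrete, here is a small worked sketch: by the definitions above, the probability of a tree is the product of the probabilities of the rules applied at its nodes, and the parser returns the tree maximizing the joint probability (12). All rule probabilities below are invented for illustration.

```python
from math import prod

# Invented rule probabilities. For each left-hand side they sum to 1,
# as constraint (9) requires; none of these numbers come from the paper.
P = {
    ("Q", ("B", "B", "T", "U")): 0.4,
    ("Q", ("B", "T", "U")): 0.6,
    ("B", ("canon",)): 0.5,
    ("B", ("sony",)): 0.5,
    ("T", ("camera",)): 1.0,
    ("U", ("vs",)): 1.0,
}

def tree_prob(tree):
    """P(w1...wn, T): product of rule probabilities over the tree's nodes."""
    label, children = tree
    if isinstance(children, str):                 # leaf: a terminal word
        return P[(label, (children,))]
    key = (label, tuple(sorted(c[0] for c in children)))  # RHS is unordered
    return P[key] * prod(tree_prob(c) for c in children)

# A (7a)-style analysis of "canon vs sony camera": Q -> {B, U, B, T}
tree = ("Q", [("B", "canon"), ("U", "vs"), ("B", "sony"), ("T", "camera")])
print(tree_prob(tree))  # 0.4 * 0.5 * 1.0 * 0.5 * 1.0 = 0.1
```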

labeled data

Appears in 11 sentences as: labeled data (12)
In Semantic Tagging of Web Search Queries
  1. Preparing labeled data, however, is very expensive.
    Page 2, “Introduction”
  2. Therefore, in cases where no or only a small amount of labeled data is available, these models do a poor job.
    Page 2, “Introduction”
  3. As seen later, when a large amount of labeled data is not available, the parser is the dominant component of the module and performs reasonably well.
    Page 2, “Introduction”
  4. In cases where a large amount of labeled data is available, the discriminative re-ranking is incorporated into the system and enhances the performance.
    Page 2, “Introduction”
  5. As seen later, preliminary experiments show that this hybrid generative/discriminative model performs significantly better than a CRF-based module in both the absence and the presence of labeled data.
    Page 2, “Introduction”
  6. For these reasons, we evaluate our grammar model on the task of automatic tagging of queries for which we have labeled data available.
    Page 6, “A grammar for semantic tagging”
  7. The model, however, extends the lexicon by including words discovered from labeled data (if available).
    Page 6, “A grammar for semantic tagging”
  8. It is true that the word “vs” plays a critical role in this query, indicating that the user’s intention is to compare the two brands; but, as mentioned above, such words are left unlabeled in our labeled data.
    Page 7, “A grammar for semantic tagging”
  9. In particular, when there is no or a very small amount of labeled data, a parser could still work by using unsupervised learning approaches to learn the rules, or by simply using a set of hand-built rules (as we did above for the task of semantic tagging).
    Page 7, “Discriminative re-ranking”
  10. When there is enough labeled data, a discriminative model can be trained on the labeled data to learn contextual information and to further enhance the tagging performance.
    Page 7, “Discriminative re-ranking”
  11. This is a big advantage of the parser model, because in practice providing labeled data is very expensive, but lexicons can very often be easily extracted from structured data on the web (for example, extracting movie titles from IMDb or book titles from Amazon).
    Page 8, “Summary” (see the sketch after this list)
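
A minimal sketch of the lexicon-extraction idea in item 11, assuming a hypothetical structured catalog in CSV form (the file name and column names are made up):

```python
import csv

def build_lexicon(catalog_path, column, tag):
    """Build a tag lexicon from a structured catalog file.

    Every distinct value in `column` (e.g. a product's brand field)
    becomes a lexicon entry mapped to `tag`.
    """
    lexicon = {}
    with open(catalog_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            lexicon[row[column].strip().lower()] = tag
    return lexicon

# e.g. brands = build_lexicon("products.csv", "brand", "B")
```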

CRF

Appears in 6 sentences as: CRF (6)
In Semantic Tagging of Web Search Queries
  1. MEMMs and CRFs are discriminative models; hence they are highly dependent on the training data.
    Page 2, “Introduction”
  2. As seen in this plot, when there is a small amount of training data, the parser performs better than the CRF module, and the parser+SVM module performs better than the other two.
    Page 7, “Evaluation”
  3. With a large amount of training data, the CRF and the parser have almost the same performance.
    Page 7, “Evaluation”
  4. The CRF module also uses the lexical resources and regular expressions.
    Page 7, “Evaluation”
  5. [Figure residue: CRF learning-curve legend with axis values 0.69, 0.67, 0.65]
    Page 8, “Evaluation”
  6. Test (N0 = 3000):
                P      R      F      Q
       CRF    0.815  0.812  0.813  0.509
       Parser 0.808  0.814  0.811  0.494
    Page 8, “Summary”

natural language

Appears in 5 sentences as: natural language (5) natural languages (1)
In Semantic Tagging of Web Search Queries
  1. Context-free phrase structure grammars are widely used for parsing natural language.
    Page 3, “ID/LP Grammar”
  2. There are, however, natural languages that have free word order.
    Page 3, “ID/LP Grammar”
  3. Although very intuitive, ID/LP rules are not widely used in the area of natural language processing.
    Page 3, “ID/LP Grammar”
  4. Since the length of a natural language sentence can easily reach 30-40 (and sometimes even up to 100) words, ID/LP grammar is not a practical model for natural language syntax.
    Page 3, “ID/LP Grammar” (see the illustration after this list)
  5. Similar studies in parsing natural language sentences …
    Page 7, “Discriminative re-ranking”
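
A back-of-the-envelope illustration of the blow-up behind item 4, using the standard expansion argument for ID/LP grammars: an ID rule whose unordered right-hand side has k symbols corresponds to k! ordered context-free rules, so converting an ID/LP grammar into a plain CFG quickly becomes intractable as constituents grow.

```python
from math import factorial

# Number of ordered CFG rules needed to emulate one ID rule with an
# unordered right-hand side of k symbols.
for k in (3, 5, 8, 10):
    print(f"k = {k:2d}: {factorial(k):>9,} ordered rules")
# k =  3:         6 ordered rules
# k =  5:       120 ordered rules
# k =  8:    40,320 ordered rules
# k = 10: 3,628,800 ordered rules
```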

contextual information

Appears in 4 sentences as: Contextual information (1) contextual information (3)
In Semantic Tagging of Web Search Queries
  1. In order to take contextual information into account, a discriminative model is used on top of the parser to re-rank the n-best parse trees generated by the parser.
    Page 1, “Abstract”
  2. Contextual information often plays a big role in resolving tagging ambiguities and is one of the key benefits of discriminative models such as CRFs.
    Page 2, “Introduction”
  3. When there is enough labeled data, a discriminative model can be trained on the labeled data to learn contextual information and to further enhance the tagging performance.
    Page 7, “Discriminative re-ranking”
  4. … to take contextual information into account.
    Page 8, “Summary”

CRFs

Appears in 3 sentences as: CRFs (3)
In Semantic Tagging of Web Search Queries
  1. The most widely used approaches to these problems have been sequential models, including hidden Markov models (HMMs), maximum entropy Markov models (MEMMs) (McCallum 2000), and conditional random fields (CRFs) (Lafferty et al., 2001).
    Page 1, “Introduction”
  2. Because of this limitation, Viola and Narasimhan (2007) use a discriminative context-free (phrase structure) grammar for extracting information from semistructured data and report higher performance than CRFs.
    Page 1, “Introduction”
  3. Contextual information often plays a big role in resolving tagging ambiguities and is one of the key benefits of discriminative models such as CRFs.
    Page 2, “Introduction”
