Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
Stefan Rüd, Massimiliano Ciaramita, Jens Müller, and Hinrich Schütze

Article Structure

Abstract

We use search engine results to address a particularly difficult cross-domain language processing task, the adaptation of named entity recognition (NER) from news text to web queries.

Introduction

As statistical Natural Language Processing (NLP) matures, NLP components are increasingly used in real-world applications.

Related work

Barr et al.

Standard NER features

As is standard in supervised NER, we train an NE tagger on a dataset where each token is represented as a feature vector.

Piggyback features

Feature groups URL, LEX, BOW, and MISC are piggyback features.

Experimental data

In our experiments, we train an NER classifier on an in-domain data set and test it on two different out-of-domain data sets.

Experimental setup

Recall that the input features for a token w0 consist of standard NER features (BASE and GAZ) and features derived from the search result we obtain by

Results and discussion

Table 3 summarizes the experimental results.

Conclusion

Robust cross-domain generalization is key in many NLP applications.

Topics

NER

Appears in 40 sentences as: NER (43)
In Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
  1. We use search engine results to address a particularly difficult cross-domain language processing task, the adaptation of named entity recognition ( NER ) from news text to web queries.
    Page 1, “Abstract”
  2. We achieve strong gains in NER performance on news, in-domain and out-of-domain, and on web queries.
    Page 1, “Abstract”
  3. In this paper, we use piggyback features to address a particularly hard cross-domain problem, the application of an NER system trained on news to web queries.
    Page 1, “Introduction”
  4. Thus, applying NER systems trained on news to web queries requires a robust cross-domain approach.
    Page 1, “Introduction”
  5. The lack of context and capitalization, and the noisiness of real-world web queries (tokenization irregularities and misspellings) all make NER hard.
    Page 1, “Introduction”
  6. Thus, NER for short, noisy text fragments, in the absence of capitalization, is of general importance.
    Page 1, “Introduction”
  7. NER performance is to a large extent determined by the quality of the feature representation.
    Page 2, “Introduction”
  8. While the impact of different types of features is well understood for standard NER , fundamentally different types of features can be used when leveraging search engine results.
    Page 2, “Introduction”
  9. Returning to the NE London in the query London Klondike gold rush, the feature “proportion of search engine results in which a first name precedes the token of interest” is likely to be useful in NER .
    Page 2, “Introduction”
  10. We describe standard NER features in Section 3.
    Page 2, “Introduction”
  11. The results in Section 7 show that piggyback features significantly increase NER performance.
    Page 2, “Introduction”
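
The feature described in sentence 9 above ("proportion of search engine results in which a first name precedes the token of interest") can be sketched roughly as follows. This is a minimal illustration only: the snippet format, the first-name list, and the function name are assumptions, not the paper's implementation.

    # Rough sketch of one piggyback feature (sentence 9 above): the proportion of
    # search-result snippets in which a first name precedes the token of interest.
    # FIRST_NAMES and the snippet format are illustrative assumptions.

    FIRST_NAMES = {"jack", "john", "mary", "anna"}  # assumed first-name gazetteer

    def first_name_precedes_proportion(token, snippets):
        """Fraction of snippets where a known first name immediately precedes `token`."""
        if not snippets:
            return 0.0
        hits = 0
        for snippet in snippets:
            words = snippet.lower().split()
            hits += any(
                words[i] == token.lower() and words[i - 1] in FIRST_NAMES
                for i in range(1, len(words))
            )
        return hits / len(snippets)

For the query London Klondike gold rush, snippets mentioning "Jack London" would push this value up, signalling that London may refer to a person rather than a location.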

CoNLL

Appears in 23 sentences as: CoNLL (23)
In Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
  1. (2010) show that adapting from CoNLL to MUC-7 (Chinchor, 1998) data (thus between different newswire sources), the best unsupervised feature (Brown clusters) improves F1 from .68 to .79.
    Page 2, “Related work”
  2. CoNLL trn | CoNLL tst | IEER | KDD-D | KDD-T (header row of Table 2)
    Page 6, “Experimental data”
  3. Table 2: Percentages of NEs in CoNLL , IEER, and KDD.
    Page 6, “Experimental data”
  4. As training data for all models evaluated we used the CoNLL 2003 English NER dataset, a corpus of approximately 300,000 tokens of Reuters news from 1992 annotated with person, location, organization and miscellaneous NE labels (Sang and Meulder, 2003).
    Page 6, “Experimental data”
  5. CoNLL and IEER are professionally edited and, in particular, properly capitalized news corpora.
    Page 6, “Experimental data”
  6. As capitalization is absent from queries we lowercased both CoNLL and IEER.
    Page 6, “Experimental data”
  7. We instructed workers to follow the CoNLL 2003 NER guidelines (augmented with several examples from queries that we annotated) and identify up to three NEs in a short text and copy and paste them into a box with associated multiple choice menu with the 4 CoNLL NE labels: LOC, MISC, ORG, and PER.
    Page 6, “Experimental data”
  8. IEER has fewer NEs than CoNLL , KDD has more.
    Page 6, “Experimental data”
  9. PER is about as prevalent in KDD as in CoNLL , but LOC and ORG have higher percentages, reflecting the fact that people search frequently for locations and commercial organizations.
    Page 6, “Experimental data”
  10. These differences between source domain ( CoNLL ) and target domains (IEER, KDD) add to the difficulty of cross-domain generalization in this case.
    Page 6, “Experimental data”
  11. We use BIO encoding as in the original CoNLL task (Sang and Meulder, 2003).
    Page 7, “Experimental setup”
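
Sentence 11 above mentions BIO encoding. The sketch below shows one common BIO variant over the four CoNLL labels (LOC, MISC, ORG, PER); the helper name and the (token, label-or-None) input format are assumptions for illustration, and the exact tagging scheme of the original CoNLL task may differ in detail.

    # Illustrative BIO encoder over the four CoNLL labels (LOC, MISC, ORG, PER).
    # Consecutive tokens with the same label are treated as one entity here;
    # this is a simplification, not the paper's code.

    def bio_encode(token_label_pairs):
        """Map (token, label or None) pairs to BIO tags."""
        tags, prev = [], None
        for _token, label in token_label_pairs:
            if label is None:
                tags.append("O")
            elif label == prev:
                tags.append("I-" + label)
            else:
                tags.append("B-" + label)
            prev = label
        return tags

    # e.g. [("new", "LOC"), ("york", "LOC"), ("visited", None), ("obama", "PER")]
    # -> ["B-LOC", "I-LOC", "O", "B-PER"]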

in-domain

Appears in 8 sentences as: in-domain (8)
In Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
  1. We achieve strong gains in NER performance on news, in-domain and out-of-domain, and on web queries.
    Page 1, “Abstract”
  2. Another source of world knowledge for NER is Wikipedia: Kazama and Torisawa (2007) show that pseudocategories extracted from Wikipedia help for in-domain NER.
    Page 2, “Related work”
  3. In our experiments, we train an NER classifier on an in-domain data set and test it on two different out-of-domain data sets.
    Page 5, “Experimental data”
  4. 3A reviewer points out that we use the terms in-domain and out-of-domain somewhat liberally.
    Page 6, “Experimental data”
  5. For our in-domain evaluation, we tune T on a 10% development sample of the CoNLL data and test on the remaining 10%.
    Page 7, “Experimental setup”
  6. Even though the emphasis of this paper is on cross-domain robustness, we can see that our approach also has clear in-domain benefits.
    Page 8, “Results and discussion”
  7. While the improvement due to piggyback features increases as out-of-domain data become more different from the in-domain training set, performance declines in absolute terms from .930 (CoNLL) to .681 (IEER) and .438 (KDD-T).
    Page 9, “Results and discussion”
  8. Even in-domain , we were able to get a smaller, but still noticeable improvement of 4.2% due to piggyback features.
    Page 9, “Conclusion”
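
Sentence 5 above refers to tuning T on a development sample. As a generic illustration of tuning a single parameter on a development split, the sketch below assumes T is a scalar and that a dev_f1 callable evaluating a given T on the development data is available; neither assumption comes from the paper.

    # Generic dev-set grid search for a single parameter T (illustrative only).
    # `candidate_values` and `dev_f1` are assumed, hypothetical inputs.

    def tune_T(candidate_values, dev_f1):
        """Return the candidate T with the highest development-set F1."""
        best_T, best_score = None, float("-inf")
        for T in candidate_values:
            score = dev_f1(T)  # hypothetical: train/decode with this T, return F1
            if score > best_score:
                best_T, best_score = T, score
        return best_T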

named entity

Appears in 5 sentences as: named entities (1) named entitihood (1) named entity (3)
In Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
  1. We use search engine results to address a particularly difficult cross-domain language processing task, the adaptation of named entity recognition (NER) from news text to web queries.
    Page 1, “Abstract”
  2. For example, a named entity (NE) recognizer trained on news text may tag the NE London in an out-of-domain web query like London Klondike gold rush as a location.
    Page 1, “Introduction”
  3. The value of the feature URL-MI is the average difference between the MI of PER and the other named entities .
    Page 4, “Piggyback features”
  4. The feature LEX-MI interprets words occurring before or after so as indicators of named entitihood .
    Page 5, “Piggyback features”
  5. As out-of-domain newswire evaluation data3 we use the development test data from the NIST 1999 IEER named entity corpus, a dataset of 50,000 tokens of New York Times (NYT) and Associated Press Weekly news.4 This corpus is annotated with person, location, organization, cardinal, duration, measure, and date labels.
    Page 6, “Experimental data”
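
Sentence 3 above defines URL-MI as the average difference between the MI of PER and the other named entities. Read literally, that computation looks roughly like the sketch below; how MI itself is estimated is not specified in the sentences indexed here, so mi() is left as an assumed callable.

    # Hedged sketch of "average difference between the MI of PER and the other
    # named entities" (sentence 3 above). mi(term, cls) is an assumed callable
    # returning some mutual-information score; its definition is not given here.

    NE_CLASSES = ["PER", "LOC", "ORG", "MISC"]

    def url_mi_feature(term, mi):
        """Average of mi(term, PER) - mi(term, c) over the non-PER classes c."""
        others = [c for c in NE_CLASSES if c != "PER"]
        return sum(mi(term, "PER") - mi(term, c) for c in others) / len(others)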

F1 scores

Appears in 4 sentences as: F1 score (1) F1 scores (3)
In Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
  1. Lin and Wu (2009) report an F1 score of 90.90 on the original split of the CoNLL data.
    Page 8, “Results and discussion”
  2. Our F1 scores > 92% can be explained by a combination of randomly partitioning the data and the fact that the four-class problem is easier than the five-class problem LOC-ORG-PER-MISC-O.
    Page 8, “Results and discussion”
  3. We use the t-test to compute significance on the two sets of five F1 scores from the two experiments that are being compared (two-tailed, p < .01 for t > 3.36).8 CoNLL scores that are significantly different from line c7 are marked with ∗.
    Page 8, “Results and discussion”
  4. 8We make the assumption that the distribution of F1 scores is approximately normal.
    Page 8, “Results and discussion”
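
Sentence 3 above describes a two-tailed t-test over two sets of five F1 scores, with p < .01 corresponding to t > 3.36 (8 degrees of freedom). A minimal sketch with scipy is shown below; the scores are made up for illustration.

    # Two-sample, two-tailed t-test on two sets of five F1 scores (sentence 3 above).
    # The numbers are invented for illustration; only the test itself is the point.

    from scipy.stats import ttest_ind

    run_a = [0.921, 0.925, 0.930, 0.928, 0.924]   # five F1 scores, setting A (illustrative)
    run_b = [0.901, 0.905, 0.899, 0.903, 0.907]   # five F1 scores, setting B (illustrative)

    t, p = ttest_ind(run_a, run_b)                # two-sided by default, df = 8
    print(f"t = {t:.2f}, p = {p:.4f}, significant at p < .01: {p < 0.01}")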

best results

Appears in 3 sentences as: best result (1) best results (2)
In Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
  1. ALL scores significantly different from the best results for the three datasets (lines c7, i8, k7) are marked with ∗ (see text).
    Page 7, “Experimental setup”
  2. This subset always includes the best results and a number of other combinations where feature groups are added to or removed from the optimal combination.
    Page 7, “Results and discussion”
  3. On line k7, we show results for this run for KDD-T and for runs that differ by one feature group (lines k2–k6, k8).9 The overall best result (43.8%) is achieved when using all feature groups (line k8).
    Page 8, “Results and discussion”
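
Sentences 2 and 3 above describe comparing the best feature-group combination against combinations that differ from it by one group. Enumerating those one-group variants could look like the sketch below; the group names follow the paper (BASE, GAZ, URL, LEX, BOW, MISC), but the function itself is only an illustration.

    # Enumerate feature-group combinations that differ from a best combination by
    # exactly one group added or removed (illustrative; not the paper's code).

    ALL_GROUPS = {"BASE", "GAZ", "URL", "LEX", "BOW", "MISC"}

    def one_group_variants(best):
        """All combinations obtained by toggling a single feature group."""
        return [best - {g} if g in best else best | {g} for g in ALL_GROUPS]

    # e.g. one_group_variants({"BASE", "GAZ", "URL", "LEX", "BOW"}) yields six sets,
    # one per feature group toggled.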
