Entity Linking for Tweets
Liu, Xiaohua and Li, Yitong and Wu, Haocheng and Zhou, Ming and Wei, Furu and Lu, Yi

Article Structure

Abstract

We study the task of entity linking for tweets, which tries to associate each mention in a tweet with a knowledge base entry.

Introduction

Twitter is a widely used social networking service.

Related Work

Existing entity linking work can roughly be divided into two categories.

Task Definition

Given a sequence of mentions, denoted by

Our Method

In this section, we first present the framework of our entity linking method.

Experiments

In this section, we introduce the data set and experimental settings, and present results.

Conclusions and Future work

We have presented a collective inference method that jointly links a set of tweet mentions to their corresponding entities.

Topics

knowledge base

Appears in 11 sentences as: knowledge base (11)
In Entity Linking for Tweets
  1. We study the task of entity linking for tweets, which tries to associate each mention in a tweet with a knowledge base entry.
    Page 1, “Abstract”
  2. In this work, we study the entity linking task for tweets, which maps each entity mention in a tweet to a unique entity, i.e., an entry ID of a knowledge base like Wikipedia.
    Page 1, “Introduction”
  3. linking task is generally considered as a bridge between unstructured text and structured machine-readable knowledge base, and represents a critical role in machine reading program (Singh et al., 2011).
    Page 1, “Introduction”
  4. Current entity linking methods are built on top of a large scale knowledge base such as Wikipedia.
    Page 1, “Introduction”
  5. A knowledge base consists of a set of entities, and each entity can have a variation list.
    Page 1, “Introduction”
  6. TAP (http://www.w3.org/2002/05/tapl) is a shallow knowledge base that contains a broad range of lexical and taxonomic information about popular objects like music, movies, authors, sports, autos, health, etc.
    Page 2, “Related Work”
  7. (2012) propose LIEGE, a framework to link the entities in web lists with the knowledge base, with the assumption that entities mentioned in a Web list tend to be a collection of entities of the same conceptual type.
    Page 3, “Related Work”
  8. Here, an entity refers to an item of a knowledge base.
    Page 3, “Task Definition”
  9. Following most existing work, we use Wikipedia as the knowledge base, and an entity is a definition page in Wikipedia; a mention denotes a sequence of tokens in a tweet that can be potentially linked to an entity.
    Page 3, “Task Definition”
  10. • Total is the total number of knowledge base entities;
    Page 5, “Our Method”
  11. Following most existing studies, we choose Wikipedia as our knowledge base.
    Page 6, “Experiments”
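
The quoted sentences describe a knowledge base as a set of entities, each with a variation list. A minimal sketch of that structure, and of a lookup from a mention to its candidate entities, might look as follows. The entity IDs, the variation strings, and the function names `build_candidate_index` and `candidates` are illustrative assumptions, not taken from the paper, and case-insensitive matching is a simplification chosen here.

```python
from collections import defaultdict

# Toy knowledge base (made-up entries): entity ID -> variation list.
TOY_KB = {
    "Microsoft": ["Microsoft", "MS", "MSFT"],
    "Multiple_sclerosis": ["Multiple sclerosis", "MS"],
}


def build_candidate_index(knowledge_base):
    """Invert the variation lists: lowercased surface form -> set of entity IDs."""
    index = defaultdict(set)
    for entity_id, variations in knowledge_base.items():
        for surface in variations:
            index[surface.lower()].add(entity_id)
    return index


def candidates(index, mention):
    """Return the sorted candidate entity IDs for a mention (empty if unknown)."""
    return sorted(index.get(mention.lower(), set()))


INDEX = build_candidate_index(TOY_KB)
```

An ambiguous mention such as "MS" returns more than one candidate here, which is the situation an entity linking method has to resolve.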

See all papers in Proc. ACL 2013 that mention knowledge base.

edit distance

Appears in 8 sentences as: Edit Distance (3) Edit distance (1) edit distance (4)
In Entity Linking for Tweets
  1. More specifically, we define local features, including context similarity and edit distance, to model the similarity between a mention and an entity.
    Page 2, “Introduction”
  2. Finally, we introduce a set of features to compute the similarity between mentions, including how similar the tweets containing the mentions are, whether they come from the tweets of the same account, and their edit distance.
    Page 2, “Introduction”
  3. • Edit Distance Similarity: If $\mathrm{Length}(m_i) + \mathrm{ED}(m_i, e_i) = \mathrm{Length}(e_i)$, $f_3(m_i, e_i) = 1$, otherwise 0.
    Page 5, “Our Method”
  4. $\mathrm{ED}(\cdot,\cdot)$ computes the character level edit distance.
    Page 5, “Our Method”
  5. “ms” is 2, and the edit distance between them is 7.
    Page 5, “Our Method”
  6. • $s_5(m_i, m_j)$: Edit distance related similarity between $m_i$ and $m_j$, as defined in Formula 5.
    Page 5, “Our Method”
  7. It can be seen that: 1) using only Prior Probability feature already yields a reasonable F1; and 2) Context Similarity and Edit Distance Similarity feature have little contribution to the F1, while Mention and Entity Title Similarity feature greatly boosts the F1.
    Page 7, “Experiments”
  8. denote Prior Probability, Context Similarity, Edit Distance Similarity, and Mention and Entity Title Similarity, respectively.
    Page 7, “Experiments”
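
The Edit Distance Similarity feature quoted above fires when the mention's length plus its character-level edit distance to the entity equals the entity's length, that is, when the mention can be turned into the entity title by insertions alone. A minimal sketch, assuming standard Levenshtein distance and lowercased comparison (the quoted text specifies neither); the function names are illustrative, and pairing "ms" with "Microsoft" is an assumption chosen to reproduce the quoted length 2 and distance 7:

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_distance_similarity(mention: str, entity_title: str) -> int:
    """Return 1 if Length(m) + ED(m, e) == Length(e), otherwise 0."""
    m, e = mention.lower(), entity_title.lower()  # lowercasing is an assumption
    return 1 if len(m) + edit_distance(m, e) == len(e) else 0
```

With these definitions, `edit_distance("ms", "microsoft")` is 7 and 2 + 7 = 9 matches the length of "microsoft", so the feature is 1 for that pair.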

entity mention

Appears in 6 sentences as: entities mentioned (1) entity mention (5)
In Entity Linking for Tweets
  1. Two main challenges of this task are the dearth of information in a single tweet and the rich entity mention variations.
    Page 1, “Abstract”
  2. In this work, we study the entity linking task for tweets, which maps each entity mention in a tweet to a unique entity, i.e., an entry ID of a knowledge base like Wikipedia.
    Page 1, “Introduction”
  3. That means, an entity mention often occurs in many tweets, which allows us to aggregate all related tweets to compute mention-mention similarity and mention-entity similarity.
    Page 2, “Introduction”
  4. (2012) propose LIEGE, a framework to link the entities in web lists with the knowledge base, with the assumption that entities mentioned in a Web list tend to be a collection of entities of the same conceptual type.
    Page 3, “Related Work”
  5. They propose a machine learning based approach using n-gram features, concept features, and tweet features, to identify concepts semantically related to a tweet, and for every entity mention to generate links to its corresponding Wikipedia article.
    Page 3, “Related Work”
  6. Second, we want to integrate the entity mention normalization techniques as introduced by Liu et al.
    Page 8, “Conclusions and Future work”

cosine similarity

Appears in 3 sentences as: cosine similarity (3)
In Entity Linking for Tweets
  1. SemTag uses the TAP knowledge base, and employs the cosine similarity with TF-IDF weighting scheme to compute the match degree between a mention and an entity, achieving an accuracy of around 82%.
    Page 2, “Related Work”
  2. • $s_1(m_i, m_j)$: The cosine similarity of $t(m_i)$ and $t(m_j)$; and tweets are represented as TF-IDF vectors;
    Page 5, “Our Method”
  3. • $s_2(m_i, m_j)$: The cosine similarity of $t(m_i)$ and $t(m_j)$; and tweets are represented as topic distribution vectors;
    Page 5, “Our Method”
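
The first mention-mention feature quoted above is the cosine similarity between the tweets containing the two mentions, with tweets represented as TF-IDF vectors. A minimal sketch using only the standard library follows. Whitespace tokenization and the smoothed IDF formula are assumptions made here, since the quoted text does not give the exact weighting; the function names and the sample tweets are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(tweets):
    """Map each tweet to a sparse TF-IDF vector (dict: term -> weight)."""
    tokenized = [tweet.lower().split() for tweet in tweets]
    n = len(tokenized)
    # Document frequency: number of tweets containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption): log((1 + n) / (1 + df)) + 1.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up tweets for illustration only.
VECS = tfidf_vectors(["yao ming scores again",
                      "yao ming retires",
                      "sunny weather today"])
```

Tweets that share terms get a similarity strictly between 0 and 1, while tweets with no term in common get 0. The topic-distribution variant of the feature would reuse `cosine` with a different vector representation.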

named entity

Appears in 3 sentences as: named entity (3)
In Entity Linking for Tweets
  1. Many tweet related researches are inspired, from named entity recognition (Liu et al., 2012), topic detection (Mathioudakis and Koudas, 2010), clustering (Rosa et al., 2010), to event extraction (Grinev et al., 2009).
    Page 1, “Introduction”
  2. (2012), on average a named entity has 3.3 different surface forms in tweets.
    Page 1, “Introduction”
  3. First, we assume that mentions are given, e.g., identified by some named entity recognition system.
    Page 3, “Task Definition”
