Resolving Personal Names in Email Using Context Expansion
Elsayed, Tamer and Oard, Douglas W. and Namata, Galileo

Article Structure

Abstract

This paper describes a computational approach to resolving the true referent of a named mention of a person in the body of an email.

Introduction

The increasing prevalence of informal text from which a dialog structure can be reconstructed (e.g., email or instant messaging) raises new challenges if we are to help users make sense of this cacophony.

Related Work

The problem of identity resolution in email is a special case of the more general problem referred to as “Entity Resolution.” Entity resolution is generically defined as a process of determining the mapping from references (e.g., names, phrases) observed in data to real-world entities (e.g., persons, locations).

Mention Resolution Approach

The problem we are interested in is the resolution of a personal-name mention (i.e., a named reference to a person) m, in a specific email em in the given collection of emails E, to its true referent.

Experimental Evaluation

We evaluate our mention resolution approach using four test collections, all based on the CMU version of the Enron collection; each was created by selecting a subset of that collection, selecting a set of query-mentions within emails from that subset, and creating an answer key in which each query-mention is associated with a single email address.

Conclusion

We have presented an approach to mention resolution in email that flexibly makes use of expanding contexts to accurately resolve the identity of a given mention.

Topics

lm

Appears in 12 sentences as: lm (15) lm/ (1) lm| (2)
In Resolving Personal Names in Email Using Context Expansion
  1. We define a mention m as a tuple <lm, em>, where lm is the “literal” string of characters that represents m and em is the email where m is observed. We assume that m can be resolved to a distinguishable participant for whom at least one email address is present in the collection.
    Page 2, “Mention Resolution Approach”
  2. Select a specific lexical reference lm to refer to c given the context xk.
    Page 2, “Mention Resolution Approach”
  3. The exact position in em where lm is observed should also be included in the definition, but we ignore it, assuming that all matched literal mentions in one email refer to the same identity.
    Page 2, “Mention Resolution Approach”
  4. p(c | m) = p(c | lm, X(em)). We then apply Bayes’ rule to get p(c | lm, X(em)) = p(c, lm, X(em)) / p(lm, X(em))
    Page 5, “Mention Resolution Approach”
  5. where p(lm, X(em)) is the probability of observing lm in the context.
    Page 5, “Mention Resolution Approach”
  6. We now restrict our focus to the numerator p(c, lm, X(em)), that is, the probability that the sender chose to refer to c by lm in the contextual space.
    Page 5, “Mention Resolution Approach”
  7. p(c, lm, X(em)) = Σk p(c) * p(xk(em) | c)
    Page 5, “Mention Resolution Approach”
  8. * p(lm | xk(em), c) where p(c) is the probability of selecting a candidate c, p(xk(em) | c) is the probability of selecting xk as an appropriate context in which to mention c, and p(lm | xk(em), c) is the probability of choosing to mention c by lm given that xk is the appropriate context.
    Page 5, “Mention Resolution Approach”
  9. Choosing a name-mention: To estimate p(lm | xk(em), c), we suggest that the email author would choose either to select a reference (or a modified version of a reference) that was previously mentioned in the context or just ignore the context.
    Page 6, “Mention Resolution Approach”
  10. Hence, we estimate that probability as follows: p(lm | xk(em), c) = α p(lm ∈ xk(em) | c) + (1 − α) p(lm | c), where α ∈ [0, 1] is a mixing parameter (set at 0.9 in our experiments), and p(lm | c) is estimated as in section 3.1. p(lm ∈ xk(em) | c) can be estimated as follows: p(lm ∈ xk(em) | c) = Σm′ p(lm | lm′) p(lm′ | xk) p(c | lm′)
    Page 6, “Mention Resolution Approach”
  11. where p(lm | lm′) is the probability of modifying lm′ into lm.
    Page 6, “Mention Resolution Approach”
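
The quantities quoted in the sentences above can be combined as in the following minimal sketch. It is not the authors' implementation: the function names (mention_given_context, joint_score, posterior), the example addresses, and every probability value are hypothetical, and the component probabilities are taken as given inputs rather than estimated from email. The sketch also assumes that the denominator p(lm, X(em)) is obtained by summing the joint score over the supplied candidate set.

```python
from typing import Dict, Iterable, Mapping, Tuple

ALPHA = 0.9  # mixing parameter; 0.9 is the value the text reports


def mention_given_context(p_in_context: float, p_background: float,
                          alpha: float = ALPHA) -> float:
    """p(lm | xk(em), c) = alpha * p(lm in xk(em) | c) + (1 - alpha) * p(lm | c)."""
    return alpha * p_in_context + (1.0 - alpha) * p_background


def joint_score(prior: float, contexts: Iterable[Tuple[float, float]]) -> float:
    """p(c, lm, X(em)): sum over contexts k of p(c) * p(xk | c) * p(lm | xk, c).

    Each item of `contexts` is a pair (p(xk | c), p(lm | xk, c)).
    """
    return sum(prior * p_ctx * p_mention for p_ctx, p_mention in contexts)


def posterior(joint: Mapping[str, float]) -> Dict[str, float]:
    """Normalize joint scores into p(c | lm, X(em)).

    Assumption: the denominator p(lm, X(em)) is the sum of the joint
    scores over the supplied candidate set.
    """
    total = sum(joint.values())
    if total == 0.0:
        return {c: 0.0 for c in joint}
    return {c: score / total for c, score in joint.items()}


# Toy numbers, invented for illustration only.
joint = {
    "alice@example.com": joint_score(0.5, [
        (0.6, mention_given_context(0.5, 0.1)),
        (0.4, mention_given_context(0.0, 0.1)),
    ]),
    "bob@example.com": joint_score(0.5, [
        (1.0, mention_given_context(0.0, 0.2)),
    ]),
}
ranked = posterior(joint)
best = max(ranked, key=ranked.get)
print(best, ranked)
```

Because the denominator is the same for every candidate, ranking by the joint score alone yields the same top candidate; the normalization step only matters when calibrated probabilities are wanted.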

See all papers in Proc. ACL 2008 that mention lm.


generative model

Appears in 3 sentences as: generative model (2) generative models (1)
In Resolving Personal Names in Email Using Context Expansion
  1. A generative model of mention generation is used to guide mention resolution.
    Page 1, “Abstract”
  2. Our principal contributions are the approaches we take to evidence generation (leveraging three ways of linking to other emails where evidence might be found: reply chains, social interaction, and topical similarity) and our approach to choosing among candidates (based on a generative model of reference production).
    Page 1, “Introduction”
  3. Similarly, approaches in unstructured data (e.g., text) have involved using clustering techniques over biographical facts (Mann and Yarowsky, 2003), within-document resolution (Blume, 2005), and discriminative unsupervised generative models (Li et al., 2005).
    Page 2, “Related Work”
