Automatic Term Ambiguity Detection
Tyler Baldwin, Yunyao Li, Bogdan Alexe, and Ioana R. Stanoi

Article Structure

Abstract

While the resolution of term ambiguity is important for information extraction (IE) systems, the cost of resolving each instance of an entity can be prohibitively expensive on large datasets.

Introduction

Many words, phrases, and referring expressions are semantically ambiguous.

Term Ambiguity Detection (TAD)

A term can be ambiguous in many ways.

Experimental Evaluation

3.1 Data Set

Related Work

Polysemy is a known problem for many NLP-related applications.

Conclusion

This paper introduced the term ambiguity detection task, which detects whether a term is ambiguous relative to a topical domain.

Topics

n-gram

Appears in 7 sentences as: N-gram (1) n-gram (6)
In Automatic Term Ambiguity Detection
  1. This module examines n-gram data from a large text collection.
    Page 2, “Term Ambiguity Detection (TAD)”
  2. The rationale behind the n-gram module is based on the understanding that terms appearing in non-named entity contexts are likely to be non-referential, and terms that can be non-referential are ambiguous.
    Page 2, “Term Ambiguity Detection (TAD)”
  3. Since we wish for the ambiguity detection determination to be fast, we develop our method to make this judgment solely on the n-gram probability, without the need to examine each individual usage context.
    Page 2, “Term Ambiguity Detection (TAD)”
  4. After removing stopwords from the term, we calculate the n-gram probability of the lower-cased form of the remaining words.
    Page 2, “Term Ambiguity Detection (TAD)”
  5. N-gram suggests no non-referential instances
    Page 3, “Experimental Evaluation”
  6. Effectiveness To understand the contribution of the n-gram (NG), ontology (ON), and clustering (CL) based modules, we ran each separately, as well as every possible combination.
    Page 3, “Experimental Evaluation”
  7. Of the three individual modules, the n-gram and clustering methods achieve F-measure of around 0.9, while the ontology-based module performs only modestly above baseline.
    Page 3, “Experimental Evaluation”
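
The n-gram module's judgment described above (strip stopwords, lower-case the remaining words, threshold on n-gram probability, no per-context inspection) can be sketched as follows. This is an illustrative toy, not the paper's implementation: the corpus counts, stopword list, threshold value, and function names are all hypothetical, and a real system would query n-gram data from a large text collection rather than toy unigram counts.

```python
# Hypothetical sketch of the n-gram module: terms that frequently appear
# lower-cased (i.e., in non-named-entity contexts) are likely to be
# non-referential, and terms that can be non-referential are ambiguous.

STOPWORDS = {"the", "a", "an", "of"}  # illustrative stopword list

# Toy unigram counts standing in for a large text collection's n-gram data.
NGRAM_COUNTS = {"scandal": 120, "mustang": 45, "ipad": 1}
TOTAL = sum(NGRAM_COUNTS.values())

def ngram_probability(term: str) -> float:
    """n-gram probability of the lower-cased, stopword-stripped term."""
    words = [w for w in term.lower().split() if w not in STOPWORDS]
    # A real system would query the full n-gram; this toy version
    # multiplies unigram probabilities.
    prob = 1.0
    for w in words:
        prob *= NGRAM_COUNTS.get(w, 0) / TOTAL
    return prob

def ngram_module_says_ambiguous(term: str, threshold: float = 0.01) -> bool:
    """Judge ambiguity from the n-gram probability alone, without
    examining each individual usage context."""
    return ngram_probability(term) >= threshold
```

The design point the excerpts emphasize is speed: a single probability lookup replaces examining every usage context of the term.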

See all papers in Proc. ACL 2013 that mention n-gram.


word sense

Appears in 6 sentences as: word sense (7)
In Automatic Term Ambiguity Detection
  1. Several NLP tasks, such as word sense disambiguation, word sense induction, and named entity disambiguation, address this ambiguity problem to varying degrees.
    Page 1, “Introduction”
  2. the well studied problems of named entity disambiguation (NED) and word sense disambiguation (WSD).
    Page 4, “Related Work”
  3. Both named entity and word sense disambiguation are extensively studied, and surveys on each are available (Nadeau and Sekine, 2007; Navigli, 2009).
    Page 4, “Related Work”
  4. Another task that shares similarities with TAD is word sense induction (WSI).
    Page 4, “Related Work”
  5. Unlike those approaches, the word sense induction task attempts to both figure out the number of senses a word has, and what they are.
    Page 4, “Related Work”
  6. For instance, TAD could be used to aid word sense induction more generally, or could be applied as part of other tasks such as coreference resolution.
    Page 5, “Conclusion”


named entity

Appears in 4 sentences as: named entities (1) named entity (3)
In Automatic Term Ambiguity Detection
  1. Several NLP tasks, such as word sense disambiguation, word sense induction, and named entity disambiguation, address this ambiguity problem to varying degrees.
    Page 1, “Introduction”
  2. several potential named entities it could refer to, even if the vast majority of references were to only a single entity.
    Page 4, “Experimental Evaluation”
  3. the well studied problems of named entity disambiguation (NED) and word sense disambiguation (WSD).
    Page 4, “Related Work”
  4. Both named entity and word sense disambiguation are extensively studied, and surveys on each are available (Nadeau and Sekine, 2007; Navigli, 2009).
    Page 4, “Related Work”


F-measure

Appears in 3 sentences as: F-measure (3)
In Automatic Term Ambiguity Detection
  1. Results over a dataset of entities from four product domains show that the proposed approach achieves significantly above baseline F-measure of 0.96.
    Page 1, “Abstract”
  2. Of the three individual modules, the n-gram and clustering methods achieve F-measure of around 0.9, while the ontology-based module performs only modestly above baseline.
    Page 3, “Experimental Evaluation”
  3. The final system that employed all modules produced an F-measure of 0.960, a significant (p < 0.01) absolute increase of 15.4% over the baseline.
    Page 4, “Experimental Evaluation”
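
For reference, the balanced F-measure reported in these excerpts is the harmonic mean of precision and recall. A minimal helper in the general F-beta form (the function name and the beta parameterization are my own, not from the paper):

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta measure; with beta=1 this is the balanced F-measure,
    the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With equal precision and recall p, the balanced F-measure is simply p, so a reported F-measure of 0.960 implies precision and recall are both near that level (or trade off around it).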


sense disambiguation

Appears in 3 sentences as: sense disambiguation (3)
In Automatic Term Ambiguity Detection
  1. Several NLP tasks, such as word sense disambiguation, word sense induction, and named entity disambiguation, address this ambiguity problem to varying degrees.
    Page 1, “Introduction”
  2. the well studied problems of named entity disambiguation (NED) and word sense disambiguation (WSD).
    Page 4, “Related Work”
  3. Both named entity and word sense disambiguation are extensively studied, and surveys on each are available (Nadeau and Sekine, 2007; Navigli, 2009).
    Page 4, “Related Work”


topic modeling

Appears in 3 sentences as: topic modeling (3)
In Automatic Term Ambiguity Detection
  1. To address the term ambiguity detection problem, we employ a model that combines data from language models, ontologies, and topic modeling.
    Page 1, “Abstract”
  2. To do so, we utilized the popular Latent Dirichlet Allocation (LDA (Blei et al., 2003)) topic modeling method.
    Page 2, “Term Ambiguity Detection (TAD)”
  3. Following standard procedure, stopwords and infrequent words were removed before topic modeling was performed.
    Page 2, “Term Ambiguity Detection (TAD)”
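
The preprocessing and LDA step described above might look like the following sketch, assuming scikit-learn as the LDA implementation (the paper cites Blei et al. (2003) but the excerpts do not name a toolkit). The toy corpus, the number of topics, and the `min_df` cutoff for infrequent words are all illustrative.

```python
# Minimal sketch: remove stopwords and infrequent words, then run LDA
# (Latent Dirichlet Allocation) to obtain per-document topic distributions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the mustang is a fast wild horse on the open plain",
    "the ford mustang is a classic american car with a v8 engine",
    "wild horse herds roam the plain near the river",
    "the car engine needs oil and a new v8 part",
]

# stop_words drops function words; min_df=2 drops infrequent words,
# following the standard procedure the excerpt describes.
vectorizer = CountVectorizer(stop_words="english", min_df=2)
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # one topic distribution per document
```

A term such as "mustang" whose occurrences spread across distinct topics (animals vs. cars above) could then be flagged as ambiguous; the exact clustering criterion the paper uses is not spelled out in these excerpts.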
