Index of papers in Proc. ACL 2008 that mention
  • named entity
Hermjakob, Ulf and Knight, Kevin and Daumé III, Hal
Discussion
Improved named entity translation accuracy as measured by the NEWA metric in general, and a reduction in dropped names in particular is clearly valuable to the human reader of machine translated documents as well as for systems using machine translation for further information processing.
End-to-End results
Table 2: Name translation accuracy with respect to BBN and re-annotated Gold Standard on 1730 named entities in 637 sentences.
Evaluation
General MT metrics such as BLEU, TER, METEOR are not suitable for evaluating named entity translation and transliteration, because they are not focused on named entities (NEs).
Evaluation
The general idea of the Named Entity Weak Accuracy (NEWA) metric is to
Evaluation
BBN kindly provided us with an annotated Arabic text corpus, in which named entities were marked up with their type (e.g., GPE for Geopolitical Entity) and one or more English translations.
Introduction
Not all named entities should be transliterated.
Introduction
Many named entities require a mix of transliteration and translation.
Introduction
We ask: what percentage of source-language named entities are translated correctly?
Learning what to transliterate
As already mentioned in the introduction, named entity (NE) identification followed by MT is a bad idea.
named entity is mentioned in 9 sentences in this paper.
Kazama, Jun'ichi and Torisawa, Kentaro
Abstract
We propose using large-scale clustering of dependency relations between verbs and multi-word nouns (MNs) to construct a gazetteer for named entity recognition (NER).
Experiments
We obtained 11,892 sentences with 18,677 named entities.
Experiments
We define “# e-matches” as the number of matches that also match a boundary of a named entity in the training set, and “# optimal” as the optimal number of “# e-matches” that can be achieved when we know the
Experiments
These gazetteers cover 40-50% of the named entities, and the cluster gazetteers have relatively wider coverage than the Wikipedia gazetteer has.
Gazetteer Induction 2.1 Induction by MN Clustering
Figure 2: Clean MN clusters with named entity entries (Left: car brand names.
Introduction
Gazetteers, or entity dictionaries, are important for performing named entity recognition (NER) accurately.
Introduction
from a multi-word noun (MN) to named entity categories such as “Tokyo Stock Exchange → {ORGANIZATION}”. However, since the correspondence between the labels and the NE categories can be learned by tagging models, a gazetteer will be useful as long as it returns consistent labels even if those returned are not the NE categories.
Introduction
Therefore, performing the clustering with a vocabulary that is large enough to cover the many named entities required to improve the accuracy of NER is difficult.
Related Work and Discussion
To cover most of the named entities in the data, we need much larger gazetteers.
named entity is mentioned in 9 sentences in this paper.
Arnold, Andrew and Nallapati, Ramesh and Cohen, William W.
Abstract
We present a novel hierarchical prior structure for supervised transfer learning in named entity recognition, motivated by the common structure of feature spaces for this task across natural language data sets.
Conclusions, related & future work
In this work we have introduced hierarchical feature tree priors for use in transfer learning on named entity extraction tasks.
Introduction
Consider the task of named entity recognition (NER).
Introduction
Having successfully trained a named entity classifier on this news data, now consider the problem of learning to classify tokens as names in email data.
Introduction
In particular, we develop a novel prior for named entity recognition that exploits the hierarchical feature space often found in natural language domains (§1.2) and allows for the transfer of information from labeled datasets in other domains (§1.3).
Investigation
The goal of our experiments was to see to what degree named entity recognition problems naturally conformed to hierarchical methods, and not just to achieve the highest performance possible.
named entity is mentioned in 8 sentences in this paper.
Richman, Alexander E. and Schone, Patrick
Abstract
In this paper, we describe a system by which the multilingual characteristics of Wikipedia can be utilized to annotate a large corpus of text with Named Entity Recognition (NER) tags requiring minimal human intervention and no linguistic expertise.
Abstract
We show how the Wikipedia format can be used to identify possible named entities and discuss in detail the process by which we use the Category structure inherent to Wikipedia to determine the named entity type of a proposed entity.
Conclusions
In conclusion, we have demonstrated that Wikipedia can be used to create a Named Entity Recognition system with performance comparable to one developed from 15-40,000 words of human-annotated newswire, while not requiring any linguistic expertise on the part of the user.
Introduction
Named Entity Recognition (NER) has long been a major task of natural language processing.
Training Data Generation
We elected to use the ACE Named Entity types PERSON, GPE (GeoPolitical Entities), ORGANIZATION, VEHICLE, WEAPON, LOCATION, FACILITY, DATE, TIME, MONEY, and PERCENT.
Training Data Generation
Other categories can reliably be used to determine that the article does not refer to a named entity, such as “Category:Endangered species.” We manually derived a relatively small set of key phrases, the most important of which are shown in Table 1.
Wikipedia 2.1 Structure
Toral and Munoz (2006) used Wikipedia to create lists of named entities.
Wikipedia 2.1 Structure
Cucerzan (2007), by contrast to the above, used Wikipedia primarily for Named Entity Disambiguation, following the path of Bunescu and Pasca (2006).
named entity is mentioned in 8 sentences in this paper.
Saha, Sujan Kumar and Mitra, Pabitra and Sarkar, Sudeshna
Abstract
Statistical machine learning methods are employed to train a Named Entity Recognizer from annotated data.
Abstract
A number of word similarity measures are proposed for clustering words for the Named Entity Recognition task.
Introduction
Named Entity Recognition (NER) involves locating and classifying the names in a text.
Maximum Entropy Based Model for Hindi NER
This corpus has been manually annotated and contains about 16,491 Named Entities (NEs).
named entity is mentioned in 4 sentences in this paper.
Vadas, David and Curran, James R.
NER features
Named entity recognition (NER) provides information that is particularly relevant for NP parsing, simply because entities are nouns.
NER features
coordinate structure in biological named entities.
NER features
We identify constituents that dominate tokens that all have the same NE tag, as these nodes will not cause a “crossing bracket” with the named entity.
named entity is mentioned in 4 sentences in this paper.
Yang, Fan and Zhao, Jun and Zou, Bo and Liu, Kang and Liu, Feifan
Introduction
The task of Name Entity (NE) translation is to translate a name entity from source language to target language, which plays an important role in machine translation and cross-language information retrieval (CLIR).
Statistical Transliteration Model
Because the process of named entity recognition may lose some NEs, we will reserve all the words in web corpus without any filtering.
Statistical Transliteration Model
Then for every Tik, we use the named entity recognition (NER) software to determine whether rci is a NE or not.
Statistical Transliteration Model
The training corpus for statistical transliteration model comes from the corpus of Chinese <-> English Name Entity Lists v 1.0 (LDC2005T34).
named entity is mentioned in 4 sentences in this paper.
Dridan, Rebecca and Kordoni, Valia and Nicholson, Jeremy
Conclusion
While annotating with named entity data or a lexical type supertagger were also found to increase coverage, the POS tagger had the greatest effect with up to 45% coverage increase on unseen text.
Unknown Word Handling
Since the parser has the means to accept named entity (NE) information in the input, we also experimented with using generic lexical items generated from NE data.
Unknown Word Handling
It is possible that another named entity tagger would give better results, and this may be looked at in future experiments.
named entity is mentioned in 3 sentences in this paper.
Li, Zhifei and Yarowsky, David
Background: Chinese Abbreviations
While the abbreviations mostly originate from noun phrases (in particular, named entities), other general phrases are also abbreviatable.
Unsupervised Translation Induction for Chinese Abbreviations
One may use a named entity tagger to obtain such a list.
Unsupervised Translation Induction for Chinese Abbreviations
However, this relies on the existence of a Chinese named entity tagger with high precision.
named entity is mentioned in 3 sentences in this paper.