Index of papers in Proc. ACL 2013 that mention
  • named entities
Darwish, Kareem
Abstract
Some languages lack large knowledge bases and good discriminative features for Named Entity Recognition (NER) that can generalize to previously unseen named entities.
Introduction
Named Entity Recognition (NER) is essential for a variety of Natural Language Processing (NLP) applications such as information extraction.
Introduction
- Contextual features: Certain words are indicative of the existence of named entities.
Introduction
For example, the word “said” is often preceded by a named entity of type “person” or “organization”.
named entities is mentioned in 26 sentences in this paper.
Guo, Weiwei and Li, Hao and Ji, Heng and Diab, Mona
Abstract
We show that by using a tweet-specific feature (hashtags) and a news-specific feature (named entities), as well as temporal constraints, we are able to extract text-to-text correlations and thus complete the semantic picture of a short text.
Creating Text-to-text Relations via Twitter/News Features
4.1 Hashtags and Named Entities
Creating Text-to-text Relations via Twitter/News Features
Named entities are some of the most salient features in a news article.
Creating Text-to-text Relations via Twitter/News Features
Directly applying Named Entity Recognition (NER) tools on news titles or
Introduction
such as named entities in a document.
Introduction
Named entities acquired from a news document, typically with high accuracy using Named Entity Recognition [NER] tools, may be particularly informative.
named entities is mentioned in 19 sentences in this paper.
Scheible, Christian and Schütze, Hinrich
Distant Supervision
As this classifier uses training data that is biased towards a specialized case (sentences containing the named entity types creators and characters), it does not generalize well to other S-relevance problems and thus yields lower performance on the full dataset.
Distant Supervision
First, the classifier only sees a subset of examples that contain named entities, making generalization to other types of expressions difficult.
Distant Supervision
Conditions 4-8 train supervised classifiers based on the labels from DSlabels+MinCut: (4) MaxEnt with named entities (NE); (5) MaxEnt with NE and semantic (SEM) features; (6) CRF with NE; (7) MaxEnt with NE and sequential (SQ) features; (8) MaxEnt with NE, SQ, and SEM.
Features
5.2 Named Entities
Features
As standard named entity recognition (NER) systems do not capture categories that are relevant to the movie domain, we opt for a lexicon-based approach similar to (Zhuang et al., 2006).
Features
If a capitalized word occurs, we check whether it is part of an already recognized named entity.
Related Work
We follow their approach by using IMDb to define named entity features.
Sentiment Relevance
An error analysis for the classifier trained on P&L shows that many sentences misclassified as S-relevant (fpSR) contain polar words; for example, Then, the situation turns M. In contrast, sentences misclassified as S-nonrelevant (fpSNR) contain named entities or plot and movie business vocabulary; for example, Tim Roth delivers the most impressive MM by getting the M language right.
named entities is mentioned in 11 sentences in this paper.
Rokhlenko, Oleg and Szpektor, Idan
Comparable Question Mining
As a preprocessing step for detecting comparable relations, our extraction algorithm identifies all the named entities of interest in our corpus, keeping only questions that contain at least two entities.
Comparable Question Mining
Answers containing two named entities, e.g. “Is #1 dating #2?”, our CRF tagger is trained to detect only comparable relations like “Who is prettier #1 or #2?”.
Comparable Question Mining
Input: A news article
Output: A sorted list of comparable questions
1: Identify all target named entities (NEs) in the article
2: Infer the distribution of LDA topics for the article
3: For each comparable relation R in the database, compute its relevance score to be the similarity between the topic distributions of R and the article
4: Rank all the relations according to their relevance score and pick the top M as relevant
5: for each relevant relation R in the order of relevance ranking do
6:   Filter out all the target NEs that do not pass the single entity classifier for R
7:   Generate all possible NE pairs from those that passed the single classifier
8:   Filter out all the generated NE pairs that do not pass the entity pair classifier for R
9:   Pick up the top N pairs with positive classification score to be qualified for generation
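The steps above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: steps 1 and 2 (entity identification and LDA inference) are assumed to have been done upstream, cosine similarity is an assumed choice for "similarity between the topic distributions", and the names `candidate_pairs`, `single_ok`, and `pair_score` are hypothetical stand-ins for the two classifiers.

```python
from itertools import combinations
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two topic distributions (an assumed choice)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def candidate_pairs(entities, article_topics, relations, single_ok, pair_score,
                    top_m=2, top_n=3):
    """Return {relation name: top-N entity pairs} for the top-M relevant relations.

    relations: dict mapping relation name -> topic distribution
    single_ok(relation, entity) -> bool   (hypothetical single entity classifier)
    pair_score(relation, a, b) -> float   (hypothetical entity pair classifier)
    """
    # Steps 3-4: rank relations by topic similarity and keep the top M.
    ranked = sorted(relations,
                    key=lambda r: cosine(relations[r], article_topics),
                    reverse=True)[:top_m]
    result = {}
    for rel in ranked:  # Step 5: visit relations in relevance order.
        # Step 6: drop entities rejected by the single entity classifier.
        kept = [e for e in entities if single_ok(rel, e)]
        # Steps 7-8: build all pairs and keep those with a positive pair score.
        scored = [(pair_score(rel, a, b), (a, b))
                  for a, b in combinations(kept, 2)]
        positive = sorted((s for s in scored if s[0] > 0), reverse=True)
        # Step 9: keep the N best-scoring pairs.
        result[rel] = [pair for _, pair in positive[:top_n]]
    return result
```

Because Python dictionaries preserve insertion order, iterating over the returned dictionary visits relations from most to least relevant.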
Motivation and Algorithmic Overview
Looking at the structure of comparable questions, we observed that a specific comparable relation, such as ‘better dad’ and ‘faster’, can usually be combined with named entities in several syntactic ways to construct a concrete question.
Online Question Generation
For each relevant relation, we then generate concrete questions by picking generic templates that are applicable for this relation and instantiating them with pairs of named entities appearing in the article.
Online Question Generation
To this end, we utilize two different broad-scale sources of information about named entities.
Online Question Generation
The first is DBPedia, which contains structured information on entries in Wikipedia, many of which are named entities that appear in news articles.
named entities is mentioned in 22 sentences in this paper.
Kondadadi, Ravi and Howald, Blake and Schilder, Frank
Background
(2013) where, in a given corpus, a combination of domain specific named entity tagging and clustering sentences (based on semantic predicates) were used to generate templates.
Methodology
The DRS consists of semantic predicates and named entity tags.
Methodology
In parallel, domain specific named entity tags are identified and, in conjunction with the semantic predicates, are used to create templates.
Methodology
For example, in (2), using the templates in (1e-f), the identified named entities are assigned to a clustered CuId (2ab).
named entities is mentioned in 9 sentences in this paper.
Speriosu, Michael and Baldridge, Jason
Data
Toponyms were annotated by a semiautomated process: a named entity recognizer identified toponyms, and then coordinates were assigned using simple rules and corrected by hand.
Evaluation
when a named entity recognizer is used to identify toponyms.
Evaluation
We primarily present results from experiments with gold toponyms but include an accuracy measure for comparability with results from experiments run on plain text with a named entity recognizer.
Introduction
(2010) use relationships learned between people, organizations, and locations from Wikipedia to aid in toponym resolution when such named entities are present, but do not exploit any other textual context.
Introduction
However, it is important to consider the utility of an end-to-end toponym identification and resolution system, so we also demonstrate that performance is still strong when toponyms are detected with a standard named entity recognizer.
Results
The named entity recognizer is likely better at detecting common toponyms than rare toponyms due to the na-
Results
We also measured the mean and median error distance for toponyms correctly identified by the named entity recognizer, and found that they tended to be 50-200km worse than for gold toponyms.
Results
This also makes sense given the named entity recognizer’s tendency to detect common toponyms: common toponyms tend to be more ambiguous than others.
Toponym Resolvers
To create the indirectly supervised training data for WISTR, the OpenNLP named entity recognizer detects toponyms in GEOWIKI, and candidate locations for each toponym are retrieved from GEONAMES.
named entities is mentioned in 9 sentences in this paper.
Wang, Mengqiu and Che, Wanxiang and Manning, Christopher D.
Abstract
Translated bi-texts contain complementary language cues, and previous work on Named Entity Recognition (NER) has demonstrated improvements in performance over monolingual taggers by promoting agreement of tagging decisions between the two languages.
Experimental Setup
We train the two CRF models on all portions of the OntoNotes corpus that are annotated with named entity tags, except the parallel-aligned portion which we reserve for development and test purposes.
Experimental Setup
Out of the 18 named entity types that are annotated in OntoNotes, which include person, location, date, money, and so on, we select the four most commonly seen named entity types for evaluation.
Introduction
We study the problem of Named Entity Recognition (NER) in a bilingual context, where the goal is to annotate parallel bi-texts with named entity tags.
Introduction
We can also automatically construct a named entity translation lexicon by annotating and extracting entities from bi-texts, and use it to improve MT performance (Huang and Vogel, 2002; Al-Onaizan and Knight, 2002).
Introduction
As a result, we can find complementary cues in the two languages that help to disambiguate named entity mentions (Brown et al., 1991).
Related Work
set of heuristic rules to expand a candidate named entity set generated by monolingual taggers, and then rank those candidates using a bilingual named entity dictionary.
named entities is mentioned in 8 sentences in this paper.
Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej
Experiments
All these systems incorporated lexical semantics features derived from WordNet and named entity features.
Experiments
Features used in the experiments can be categorized into six types: identical word matching (I), lemma matching (L), WordNet (WN), enhanced Lexical Semantics (LS), Named Entity matching (NE) and Answer type checking (Ans).
Experiments
Named entity matching (NE) checks whether two words are individually part of some named entities with the same type.
Lexical Semantic Models
For instance, when a word refers to a named entity, the particular sense and meaning is often not encoded.
named entities is mentioned in 7 sentences in this paper.
Yao, Xuchen and Van Durme, Benjamin and Clark, Peter
Background
(2007) proposed indexing text with their semantic roles and named entities.
Background
Queries then include constraints of semantic roles and named entities for the predicate and its arguments in the question.
Experiments
sentence-segmented and word-tokenized by NLTK (Bird and Loper, 2004), dependency-parsed by the Stanford Parser (Klein and Manning, 2003), and NER-tagged by the Illinois Named Entity Tagger (Ratinov and Roth, 2009) with an 18-label type set.
Introduction
Moreover, this approach is more robust against, e.g., entity recognition errors, because answer typing knowledge is learned from how the data was actually labeled, not from how the data was assumed to be labeled (e.g., manual templates usually assume perfect labeling of named entities, but often it is not the case
Introduction
This will be our off-the-shelf QA system, which recognizes the association between question type and expected answer types through various features based on e.g., part-of-speech tagging (POS) and named entity recognition (NER).
Introduction
Moreover, our approach extends easily beyond fixed answer types such as named entities: we are already using POS tags as a demonstration.
Method
Ogilvie (2010) showed in chapter 4.3 that keyword and named entity based retrieval actually outperformed SRL-based structured retrieval in MAP for the answer-bearing sentence retrieval task in their setting.
named entities is mentioned in 7 sentences in this paper.
Radziszewski, Adam
Related works
The task of phrase lemmatisation bears a close resemblance to a more popular task, namely lemmatisation of named entities.
Related works
type of named entities considered, those two may be solved using similar or significantly different methodologies.
Related works
Hence, the main challenge is to define a similarity metric between named entities (Piskorski et al., 2009; Kocon and Piasecki, 2012), which can be used to match different mentions of the same names.
named entities is mentioned in 6 sentences in this paper.
Wang, Aobo and Kan, Min-Yen
Discussion
Table 6: Sample Chinese freestyle named entities that are usernames.
Discussion
Another major group of errors comes from what we term freestyle named entities, as exemplified in Table 6; i.e., person names in the form of user IDs and nicknames that have fewer constraints on form in terms of length and canonical structure (not surnames with given names, as is standard in Chinese names) and may mix alphabetic characters.
Discussion
Most of these belong to the category of Person Name (PER), as defined in the CoNLL-2003 Named Entity Recognition shared task.
Methodology
In addition, we employ additional online word lists to distinguish named entities and function words from potential informal words.
named entities is mentioned in 5 sentences in this paper.
Huang, Hongzhao and Wen, Zhen and Yu, Dian and Ji, Heng and Sun, Yizhou and Han, Jiawei and Li, He
Experiments
Named entities which co-occur at least 6 times with a morph query in the same topic are selected as its target candidates.
Related Work
Other similar research lines are the TAC-KBP Entity Linking (EL) (Ji et al., 2010; Ji et al., 2011), which links a named entity in news and web documents to an appropriate knowledge base (KB) entry, the task of mining name translation pairs from comparable corpora (Udupa et al., 2009; Ji, 2009; Fung and Yee, 1998; Rapp, 1999; Shao and Ng, 2004; Hassan et al., 2007) and the link prediction problem (Adamic and Adar, 2001; Liben-Nowell and Kleinberg, 2003; Sun et al., 2011b;
Target Candidate Identification
However, obviously we cannot consider all of the named entities in these sources as target candidates due to the sheer volume of information.
Target Candidate Identification
In addition, morphs are not limited to named entity forms.
Target Candidate Ranking
Then we apply a hierarchical Hidden Markov Model (HMM) based Chinese lexical analyzer ICTCLAS (Zhang et al., 2003) to extract named entities, noun phrases and events.
named entities is mentioned in 5 sentences in this paper.
You, Gae-won and Cha, Young-rok and Kim, Jinhan and Hwang, Seung-won
Abstract
This paper studies named entity translation and proposes “selective temporality” as a new feature, as using temporal features may be harmful for translating “atemporal” entities.
Introduction
Named entity translation discovery aims at mapping entity names for people, locations, etc.
Introduction
As many new named entities appear every day in newspapers and web sites, their translations are nontrivial yet essential.
Introduction
Early efforts in named entity translation have focused on using a phonetic feature (called PH) to estimate a phonetic similarity between two names (Knight and Graehl, 1998; Li et al., 2004; Virga and Khudanpur, 2003).
Preliminaries
To identify entities, we use a CRF-based named entity tagger (Finkel et al., 2005) and a Chinese word breaker (Gao et al., 2003) for English and Chinese corpora, respectively.
named entities is mentioned in 5 sentences in this paper.
Baldwin, Tyler and Li, Yunyao and Alexe, Bogdan and Stanoi, Ioana R.
Experimental Evaluation
several potential named entities it could refer to, even if the vast majority of references were to only a single entity.
Introduction
Several NLP tasks, such as word sense disambiguation, word sense induction, and named entity disambiguation, address this ambiguity problem to varying degrees.
Related Work
the well studied problems of named entity disambiguation (NED) and word sense disambiguation (WSD).
Related Work
Both named entity and word sense disambiguation are extensively studied, and surveys on each are available (Nadeau and Sekine, 2007; Navigli, 2009).
named entities is mentioned in 4 sentences in this paper.
Pilehvar, Mohammad Taher and Jurgens, David and Navigli, Roberto
Conclusions
Specifically, we plan to investigate higher coverage inventories such as BabelNet (Navigli and Ponzetto, 2012a), which will handle texts with named entities and rare senses that are not in WordNet, and will also enable cross-lingual semantic similarity.
Experiment 1: Textual Similarity
The TLsyn system also uses Google Book Ngrams, as well as dependency parsing and named entity recognition.
Experiment 1: Textual Similarity
Additionally, because the texts often contain named entities which are not present in WordNet, we incorporated the similarity values produced by four string-based measures, which were used by other teams in the STS task: (1) longest common substring, which takes into account the length of the longest overlapping contiguous sequence of characters (substring) across two strings (Gusfield, 1997), (2) longest common subsequence, which, instead, finds the longest overlapping subsequence of two strings (Allison and Dix, 1986), (3) Greedy String Tiling, which allows reordering in strings (Wise, 1993), and (4) the character/word n-gram similarity proposed by Barrón-Cedeño et al.
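To make the difference between measures (1) and (2) concrete, here is a minimal dynamic-programming sketch. It returns raw lengths only; how the paper normalizes these lengths into similarity values is not stated in the excerpt, and the function names are illustrative.

```python
def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous run of characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def longest_common_subsequence(a: str, b: str) -> int:
    """Length of the longest shared subsequence (gaps are allowed)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
            else:
                # Skip a character from either string, whichever is better.
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[len(b)]
```

The substring measure resets to zero on any mismatch, while the subsequence measure carries the best score forward, so the latter is never smaller than the former for the same pair of strings.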
Experiment 1: Textual Similarity
Named entity features used by the TLsim system could be the reason for its better performance on the MSRpar dataset, which contains a large number of named entities.
named entities is mentioned in 4 sentences in this paper.
Nakashole, Ndapandula and Tylenda, Tomasz and Weikum, Gerhard
Evaluation
The problem in this example is incorrect segmentation of a named entity.
Introduction
Known entities are recognized and mapped to the KB using a recent tool for named entity disambiguation (Hoffart 2011).
Related Work
Tagging mentions of named entities with lexical types has been pursued in previous work.
Related Work
Most well-known is the Stanford named entity recognition (NER) tagger (Finkel 2005) which assigns coarse-grained types like person, organization, location, and other to noun phrases that are likely to denote entities.
named entities is mentioned in 4 sentences in this paper.
Razmara, Majid and Siahbani, Maryam and Haffari, Reza and Sarkar, Anoop
Experiments & Results 4.1 Experimental Setup
From the oovs, we exclude numbers as well as named entities.
Experiments & Results 4.1 Experimental Setup
We apply a simple heuristic to detect named entities: basically, words that are capitalized in the original dev/test set and do not appear at the beginning of a sentence are treated as named entities.
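A minimal sketch of such a capitalization heuristic is shown below. It assumes pre-tokenized sentences in their original casing and collects matches as a set of word types; whether the paper applies the rule per token or per type is not stated in the excerpt, and the function name is illustrative.

```python
def capitalized_non_initial(sentences):
    """Return words that appear capitalized somewhere other than sentence start.

    sentences: iterable of token lists in original (not lowercased) form.
    """
    flagged = set()
    for tokens in sentences:
        for position, token in enumerate(tokens):
            # Skip position 0: sentence-initial words are always capitalized.
            if position > 0 and token[:1].isupper():
                flagged.add(token)
    return flagged
```

Note that a word seen only at the start of a sentence is never flagged, so the heuristic trades recall for simplicity.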
Introduction
Although this is helpful in translating a small fraction of oovs such as named entities for languages with same writing systems, it harms the translation in other types of oovs and distant language pairs.
named entities is mentioned in 3 sentences in this paper.
Liu, Xiaohua and Li, Yitong and Wu, Haocheng and Zhou, Ming and Wei, Furu and Lu, Yi
Introduction
Much tweet-related research has been inspired, from named entity recognition (Liu et al., 2012), topic detection (Mathioudakis and Koudas, 2010), and clustering (Rosa et al., 2010) to event extraction (Grinev et al., 2009).
Introduction
(2012), on average a named entity has 3.3 different surface forms in tweets.
Task Definition
First, we assume that mentions are given, e.g., identified by some named entity recognition system.
named entities is mentioned in 3 sentences in this paper.
Lee, Taesung and Hwang, Seung-won
Abstract
This paper studies the problem of mining named entity translations from comparable corpora with some “asymmetry”.
Abstract
Our experimental results on English-Chinese corpora show that our selective propagation approach outperforms the previous approaches in named entity translation in terms of the mean reciprocal rank by up to 0.16 for organization names, and 0.14 in a low comparability case.
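For readers unfamiliar with the metric, mean reciprocal rank averages 1/rank of the correct translation over all queries, counting a query as 0 when the correct translation is not in the candidate list. The sketch below uses an illustrative function name and is not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the correct answer across queries.

    ranked_lists: one candidate list per query, best candidate first.
    gold: the correct answer for each query, in the same order.
    """
    total = 0.0
    for candidates, answer in zip(ranked_lists, gold):
        if answer in candidates:
            # Ranks are 1-based, so the top candidate contributes 1.0.
            total += 1.0 / (candidates.index(answer) + 1)
    return total / len(gold) if gold else 0.0
```

A difference of 0.16 in this score is therefore a difference in the average reciprocal rank, not in accuracy.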
Introduction
This task is more challenging in the presence of multilingual text, because translating named entities (NEs), such as persons, locations, or organizations, is a nontrivial task.
named entities is mentioned in 3 sentences in this paper.
Lassalle, Emmanuel and Denis, Pascal
Experiments
Depending on the document category, we found some variations as to which hierarchy was learned in each setting, but we noticed that parameters starting with right and left gramtypes often produced quite good hierarchies: for instance right gramtype -> left gramtype -> same sentence -> right named entity type.
Introduction
The main question we raise is, given a set of indicators (such as grammatical types, distance between two mentions, or named entity types), how to best partition the pool of mention pair examples in order to best discriminate coreferential pairs from non coreferential ones.
System description
We used classical features that can be found in detail in (Bengtson and Roth, 2008) and (Rahman and Ng, 2011): grammatical type and subtype of mentions, string match and substring, apposition and copula, distance (number of separating mentions/sentences/words), gender/number match, synonymy/hypernym and animacy (using WordNet), family name (based on lists), named entity types, syntactic features (gold parse) and anaphoricity detection.
named entities is mentioned in 3 sentences in this paper.