Abstract | Our NER system achieves the best current result on the widely used CoNLL benchmark.
Conclusions | Our system achieved the best current result on the CoNLL NER data set. |
Named Entity Recognition | Named entity recognition (NER) is one of the first steps in many NLP applications, including information extraction, information retrieval, and question answering.
Named Entity Recognition | 2001) is one of the most competitive NER algorithms. |
Named Entity Recognition | The CoNLL 2003 Shared Task (Tjong Kim Sang and De Meulder, 2003) offered a standard experimental platform for NER.
Active Learning for Sequence Labeling | This might, in particular, apply to NER where larger stretches of sentences do not contain any entity mention at all, or merely trivial instances of an entity class easily predictable by the current model. |
Conditional Random Fields for Sequence Labeling | Many NLP tasks, such as POS tagging, chunking, or NER, are sequence labeling problems in which a sequence of class labels y = (y_1, ..., y_n) is assigned to a sequence of input tokens.
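The label-per-token framing above can be illustrated with a minimal sketch (our own example, not from the cited papers): NER labels in the common BIO scheme, plus a helper that recovers entity spans from a labeled sequence.

```python
# One label per token in the BIO scheme: B- opens an entity, I- continues
# it, and O marks tokens outside any entity.
tokens = ["John", "Smith", "works", "at", "Google", "."]
labels = ["B-PER", "I-PER", "O", "O", "B-ORG", "O"]

def extract_entities(tokens, labels):
    """Collect (entity_text, entity_type) spans from a BIO-labeled sequence."""
    entities, current, etype = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], lab[2:]
        elif lab.startswith("I-") and current:
            current.append(tok)
        else:
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

print(extract_entities(tokens, labels))
# [('John Smith', 'PER'), ('Google', 'ORG')]
```

A CRF models the conditional probability of the whole label sequence given the token sequence; the BIO encoding is only the output representation it predicts over.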
Experiments and Results | This coincides with the assumption that SeSAL works especially well for labeling tasks where some classes occur predominantly and can, in most cases, easily be discriminated from the other classes, as is the case in the NER scenario. |
Introduction | In named entity recognition (NER), the examples selected by AL are sequences of text, typically sentences.
Introduction | In the NER scenario, e.g., large portions of the text do not contain any target entity mention at all. |
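The selection step described above can be sketched as least-confidence sampling (an illustrative stand-in, not the paper's exact utility function): the learner queries the sentence whose predicted labeling the current model is least confident about.

```python
# Illustrative active-learning selection. The confidence scores are
# stand-in numbers; in practice they would come from the current model,
# e.g. the probability of its best label sequence for each sentence.
sentences = ["s1", "s2", "s3"]
confidence = {"s1": 0.95, "s2": 0.40, "s3": 0.70}

def select_for_annotation(sentences, confidence):
    """Pick the sentence with the lowest model confidence for annotation."""
    return min(sentences, key=lambda s: confidence[s])

print(select_for_annotation(sentences, confidence))
# s2
```

Under this scheme, confidently predicted sentences (often those with no entity mentions at all) are never sent to the annotator, which is exactly the saving the text describes.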
Introduction | Our experiments are laid out in Section 5 where we compare fully and semi-supervised AL for NER on two corpora, the newspaper selection of MUC7 and PENNBIOIE, a biological abstracts corpus. |
Summary and Discussion | Our experiments in the NER scenario provide evidence for the hypothesis that the proposed approach to semi-supervised AL (SeSAL) for sequence labeling strongly reduces the amount of tokens to be manually annotated: about 60% fewer than with its fully supervised counterpart (FuSAL), and over 80% fewer than with a totally passive learning scheme based on random selection.
Summary and Discussion | In our experiments on the NER scenario, those regions were mentions of entity names or linguistic units which had a surface appearance similar to entity mentions but could not yet be correctly distinguished by the model. |
Summary and Discussion | Future research is needed to empirically investigate this area and quantify the time savings achievable with SeSAL in the NER scenario.
Asymmetric Alignment Method for Equivalent Extraction | 1) The traditional alignment method requires NER on both sides, and the NER process often introduces mistakes.
Asymmetric Alignment Method for Equivalent Extraction | The NER process is not necessary because we align the Chinese ON with the English sentence directly.
Experiments | The asymmetric alignment method can avoid the mistakes made in the NER process and give an explicit alignment matching. |
Experiments | Our method can overcome the mistakes introduced in the NER process. |
Introduction | However, named entity recognition (NER) will always introduce some mistakes.
Introduction | In order to avoid NER mistakes, we propose an asymmetric alignment method which aligns the Chinese ON with an English sentence directly and then extracts the English fragment with the largest alignment score as the equivalent.
Introduction | The asymmetric alignment method can avoid the influence of improper results of NER and generate an explicit matching between the source and the target phrases which can guarantee the precision of alignment. |
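The fragment-extraction idea described above can be sketched as follows. This is a toy illustration under our own assumptions, not the paper's alignment model: the translation lexicon, its scores, and the length penalty are all hypothetical stand-ins.

```python
# Score every contiguous English fragment against the Chinese organization
# name (ON) and return the highest-scoring fragment as the equivalent.
# LEXICON entries and the -0.2 penalty for unaligned words are illustrative.
LEXICON = {
    ("北京", "Beijing"): 1.0,
    ("大学", "University"): 1.0,
}

def fragment_score(cn_words, fragment):
    """Sum alignment scores; lightly penalize English words with no alignment."""
    score = 0.0
    for e in fragment:
        best = max(LEXICON.get((c, e), 0.0) for c in cn_words)
        score += best if best > 0 else -0.2
    return score

def best_fragment(cn_words, en_sentence):
    """Exhaustively score all contiguous fragments and keep the best one."""
    best, best_score = None, float("-inf")
    for i in range(len(en_sentence)):
        for j in range(i + 1, len(en_sentence) + 1):
            score = fragment_score(cn_words, en_sentence[i:j])
            if score > best_score:
                best, best_score = en_sentence[i:j], score
    return best

cn = ["北京", "大学"]
en = "He studies at Beijing University in China".split()
print(best_fragment(cn, en))
# ['Beijing', 'University']
```

Because the Chinese side is a known ON and only the English side is searched, no English NER step is needed, which is the asymmetry the text refers to.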
Experiments | The headline represents a converted question; in order to extract the question-type feature, we use the matching NER type between the headline and the candidate sentence to set the question-type NER match feature.
Experiments | From Section 2.2, QTCF is the question-type NER match feature, LexSem is the bundle of lexico-semantic features, and QComp comprises the matching features for subject, head, object, and three complements.
Feature Extraction for Entailment | Named-Entity Recognizer (NER): This component identifies and classifies basic entities such as proper names of persons, organizations, products, and locations; time and numerical expressions such as year, day, and month; various measurements such as weight, money, and percentage; and contact information such as address, webpage, and phone number.
Feature Extraction for Entailment | The NER module combines user-defined rules using Lesk word-sense disambiguation (Lesk, 1988), WordNet lookups (Miller, 1995), and many user-defined dictionary lookups, e.g.
Feature Extraction for Entailment | During NER extraction, we also employ phrase analysis based on our phrase-extraction utility, which uses the Stanford dependency parser (Klein and Manning, 2003).
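The dictionary-lookup part of such an NER component can be sketched as a longest-match search over user-defined gazetteers. This is our own toy illustration, not the authors' implementation; the gazetteer entries are invented examples.

```python
# Longest-match dictionary tagging: at each position, try the longest
# candidate span first so multi-word entries beat their prefixes.
GAZETTEERS = {
    "PERSON": {("Alan", "Turing")},
    "ORGANIZATION": {("United", "Nations")},
    "LOCATION": {("New", "York")},
}

def dictionary_tag(tokens):
    """Return (start, end, entity_type) spans found by longest dictionary match."""
    spans, i = [], 0
    while i < len(tokens):
        match = None
        for j in range(len(tokens), i, -1):
            candidate = tuple(tokens[i:j])
            for etype, entries in GAZETTEERS.items():
                if candidate in entries:
                    match = (i, j, etype)
                    break
            if match:
                break
        if match:
            spans.append(match)
            i = match[1]  # resume after the matched span
        else:
            i += 1
    return spans

print(dictionary_tag("Alan Turing visited New York".split()))
# [(0, 2, 'PERSON'), (3, 5, 'LOCATION')]
```

In a full system like the one described, such lookups would be only one signal, combined with disambiguation rules and WordNet evidence before a final type is assigned.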