Introduction | based Feature Reduction for MaxEnt |
Introduction | In their Maximum Entropy (MaxEnt) based approach for Hindi NER development, Saha et al. (2008) also observed that the performance of the MaxEnt-based model often decreases when a huge number of features are used in the model. |
Maximum Entropy Based Model for Hindi NER | The Maximum Entropy (MaxEnt) principle is a commonly used technique that provides the probability that a token belongs to a class. |
Maximum Entropy Based Model for Hindi NER | MaxEnt computes the probability p(o|h) for any o from the space of all possible outcomes O, and for every h from the space of all possible histories H. In NER, the history can be viewed as all information derivable from the training corpus relative to the current token. |
Maximum Entropy Based Model for Hindi NER | The computation of the probability p(o|h) of an outcome for a token in MaxEnt depends on a set of features that are helpful in making predictions about the outcome. |
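The dependence of p(o|h) on features can be illustrated with a minimal sketch of the standard MaxEnt form, p(o|h) = exp(sum_i w_i f_i(h, o)) / Z(h). The source does not describe the feature set or the weights, so the feature function, the feature names, the labels and the weights below are all hypothetical and serve only to show the computation.

```python
import math


def maxent_distribution(history, outcomes, feature_fn, weights):
    """Return p(o|h) for every outcome o, using binary features f_i(h, o)."""
    # Unnormalised log-score: sum of the weights of the features active for (h, o).
    scores = {
        o: sum(weights.get(feat, 0.0) for feat in feature_fn(history, o))
        for o in outcomes
    }
    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(scores.values())
    unnorm = {o: math.exp(s - top) for o, s in scores.items()}
    z = sum(unnorm.values())  # normalisation constant Z(h)
    return {o: v / z for o, v in unnorm.items()}


def toy_features(history, outcome):
    """Hypothetical feature function: each feature pairs a context test with an outcome."""
    feats = []
    if history["prev_word"] == "shri":      # honorific before the token (assumed cue)
        feats.append(("prev=honorific", outcome))
    if history["word"].endswith("pur"):     # place-name suffix (assumed cue)
        feats.append(("suffix=pur", outcome))
    return feats


# Hypothetical weights; a real model would estimate these from the training corpus.
toy_weights = {
    ("prev=honorific", "PERSON"): 2.0,
    ("suffix=pur", "LOCATION"): 1.5,
}

dist = maxent_distribution(
    {"word": "sharma", "prev_word": "shri"},
    ["PERSON", "LOCATION", "OTHER"],
    toy_features,
    toy_weights,
)
print(dist)
```

Only features that fire for the current history contribute to the score, which is why the outcome with the heavily weighted active feature receives most of the probability mass in this toy case.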
Methods | This is performed by weighting features to maximise the likelihood of the data; for each instance, decisions are made based on the features present at that point, so maxent classification is quite suitable for our purposes. |
Methods | As feature weights are mutually estimated, the maxent classifier is capable of taking feature dependence into account. |
Methods | By downweighting such features, maxent is capable of modelling to a certain extent the special characteristics which arise from the automatic or weakly supervised training data acquisition procedure. |
Results | Here we decided not to check whether these keywords made sense in scientific texts, but instead left this task to the maximum entropy classifier, and added only those keywords that the maxent model trained on the training dataset found reliable enough to predict the spec label on their own. |
Results | This 54-keyword maxent classifier got an Fβ=1(spec) score of 79.73%. |
Results | We manually examined all keywords that had P(spec) > 0.5 when given as a standalone instance to our maxent model, and constructed a dictionary of hedge cues from the promising candidates. |
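The P(spec) > 0.5 filter can be sketched as follows, assuming a binary maxent model written in its logistic form, with one weight per keyword and a bias term. The keywords, weights and bias here are hypothetical, not values from the source, and the manual review of the resulting candidates is not modelled.

```python
import math


def standalone_spec_probability(keyword_weight, bias=0.0):
    """P(spec) for an instance that contains only one keyword (binary maxent, logistic form)."""
    return 1.0 / (1.0 + math.exp(-(bias + keyword_weight)))


def candidate_cues(keyword_weights, bias=0.0, threshold=0.5):
    """Return the keywords whose standalone P(spec) exceeds the threshold, sorted by name."""
    probs = {
        kw: standalone_spec_probability(w, bias)
        for kw, w in keyword_weights.items()
    }
    return sorted(kw for kw, p in probs.items() if p > threshold)


# Hypothetical keyword weights, for illustration only.
hypothetical_weights = {"suggest": 1.8, "may": 1.1, "protein": -0.9, "the": -2.3}

candidates = candidate_cues(hypothetical_weights)
print(candidates)
```

Under these assumptions, a keyword passes the filter exactly when its weight plus the bias is positive, so the threshold of 0.5 separates keywords that push the model toward the spec label from those that push against it.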