Index of papers in Proc. ACL that mention
  • maximum entropy
Liu, Yang
Abstract
To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data.
Introduction
3 A Maximum Entropy Based Shift-Reduce Parsing Model
Introduction
We propose a maximum entropy model to resolve the conflicts for “h+h”:
Introduction
1. relative frequencies in two directions; 2. lexical weights in two directions; 3. phrase penalty; 4. distance-based reordering model; 5. lexicalized reordering model; 6. n-gram language model; 7. word penalty; 8. ill-formed structure penalty; 9. dependency language model; 10. maximum entropy parsing model.
maximum entropy is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Zhai, Feifei and Zhang, Jiajun and Zhou, Yu and Zong, Chengqing
Abstract
Then we propose two novel methods to handle the two PAS ambiguities for SMT accordingly: 1) inside context integration; 2) a novel maximum entropy PAS disambiguation (MEPD) model.
Conclusion and Future Work
Towards the MEPD model, we design a maximum entropy model for each ambiguous source-side PAS.
Introduction
As to the role ambiguity, we design a novel maximum entropy PAS disambiguation (MEPD) model to combine various context features, such as context words of PAS.
Maximum Entropy PAS Disambiguation (MEPD) Model
In order to handle the role ambiguities, in this section, we concentrate on utilizing a maximum entropy model to incorporate the context information for PAS disambiguation.
Maximum Entropy PAS Disambiguation (MEPD) Model
The maximum entropy model is the classical way to handle this problem:
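The equation this sentence introduces is not reproduced in the index; in generic notation (assumed here, with y a candidate disambiguation outcome for a source-side PAS ps and C(ps) its context), the classical conditional maximum entropy model is:

    p(y \mid C(ps)) = \frac{\exp\left(\sum_k \lambda_k f_k(C(ps), y)\right)}{\sum_{y'} \exp\left(\sum_k \lambda_k f_k(C(ps), y')\right)}

where the f_k are context features and the \lambda_k their learned weights.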
Maximum Entropy PAS Disambiguation (MEPD) Model
We train a maximum entropy classifier for each Sp via the off-the-shelf MaxEnt toolkit.
PAS-based Translation Framework
Thus to overcome this problem, we design two novel methods to cope with the PAS ambiguities: inside-context integration and a maximum entropy PAS disambiguation (MEPD) model.
Related Work
(2010) designed maximum entropy (ME) classifiers to do better rule selection for hierarchical phrase-based model and tree-to-string model respectively.
maximum entropy is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Szarvas, György
Conclusions
Next, as the learnt maximum entropy models show, the hedge classification task reduces to a lookup for single keywords or phrases and to the evaluation of the text based on the most relevant cue alone.
Methods
We chose not to add any weighting of features (by frequency or importance) and for the Maximum Entropy Model classifier we included binary data about whether single features occurred in the given context or not.
Methods
2.4 Maximum Entropy Classifier
Methods
Maximum Entropy Models (Berger et al., 1996) seek to maximise the conditional probability of classes, given certain observations (features).
Results
This shows that the Maximum Entropy Model in this situation could not learn any meaningful hypothesis from the co-occurrence of individually weak keywords.
Results
Here we decided not to check whether these keywords made sense in scientific texts or not, but instead left this task to the maximum entropy classifier, and added only those keywords that were found reliable enough to predict spec label alone by the maxent model trained on the training dataset.
Results
The majority of these phrases were found to be reliable enough for our maximum entropy model to predict a speculative class based on that single feature.
maximum entropy is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Dridan, Rebecca and Kordoni, Valia and Nicholson, Jeremy
Parser Restriction
Consequently, we developed a Maximum Entropy model for supertagging using the OpenNLP implementation. Similarly to Zhang and Kordoni (2006), we took training data from the gold-standard lexical types in the treebank associated with ERG (in our case, the July-07 version).
Parser Restriction
We held back the jh5 section of the treebank for testing the Maximum Entropy model.
Parser Restriction
Again, the lexical items that were to be restricted were controlled by a threshold, in this case the probability given by the maximum entropy model.
Unknown Word Handling
The same maximum entropy tagger used in Section 3 was used and each open class word was tagged with its most likely lexical type, as predicted by the maximum entropy model.
Unknown Word Handling
Again it is clear that the use of POS tags as features obviously improves the maximum entropy model, since this second model has almost 10% better coverage on our unseen texts.
maximum entropy is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Li, Junhui and Marton, Yuval and Resnik, Philip and Daumé III, Hal
Discussion
To validate this conjecture on our translation test data, we compare the reordering performance among the MR08 system, the improved systems and the maximum entropy classifiers.
Discussion
Then we evaluate the automatic reordering outputs generated from both our translation systems and maximum entropy classifiers.
Discussion
Potential improvement analysis: Table 7 also shows that our current maximum entropy classifiers have room for improvement, especially for semantic reordering.
Related Work
Ge (2010) presented a syntax-driven maximum entropy reordering model that predicted the source word translation order.
Unified Linguistic Reordering Models
In order to predict either the leftmost or rightmost reordering type for two adjacent constituents, we use a maximum entropy classifier to estimate the probability of the reordering type rt ∈ {M, DM, S, DS} as follows:
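The equation itself is elided in this excerpt; a maximum entropy estimate of this probability conventionally takes the log-linear form (notation assumed here, with A and B the two adjacent constituents):

    p(rt \mid A, B) = \frac{\exp\left(\sum_k \lambda_k f_k(A, B, rt)\right)}{\sum_{rt' \in \{M, DM, S, DS\}} \exp\left(\sum_k \lambda_k f_k(A, B, rt')\right)}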
Unified Linguistic Reordering Models
For each pair of constituents, it first extracts its leftmost and rightmost reordering types (line 6) and then gets their respective probabilities returned by the maximum entropy classifiers defined in Section 3.1.
maximum entropy is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Metallinou, Angeliki and Bohus, Dan and Williams, Jason
Generative state tracking
In this work we will use maximum entropy models.
Generative state tracking
4.1 Maximum entropy models
Generative state tracking
The maximum entropy framework (Berger et al., 1996)
maximum entropy is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Liu, Jenny and Haghighi, Aria
Abstract
We take a maximum entropy reranking approach to the problem which admits arbitrary features on a permutation of modifiers, exploiting hundreds of thousands of features in total.
Conclusion
The straightforward maximum entropy reranking approach is able to significantly outperform previous computational approaches by allowing for a richer model of the prenominal modifier ordering process.
Introduction
By mapping a set of features across the training data and using a maximum entropy reranking model, we can learn optimal weights for these features and then order each set of modifiers in the test data according to our features and the learned weights.
Introduction
In Section 3 we present the details of our maximum entropy reranking approach.
Model
At test time, we choose an ordering x ∈ π(B) using a maximum entropy reranking approach (Collins and Koo, 2005).
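Written out (notation assumed), the reranker scores every candidate permutation of the modifier bag B and keeps the highest-scoring one; because the normalizer is shared across candidates, the argmax reduces to the linear feature score:

    \hat{x} = \operatorname*{argmax}_{x \in \pi(B)} p(x \mid B) = \operatorname*{argmax}_{x \in \pi(B)} \sum_k \lambda_k f_k(x, B)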
Related Work
In this next section, we describe our maximum entropy reranking approach that tries to develop a more comprehensive model of the modifier ordering process to avoid the sparsity issues that previous approaches
maximum entropy is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Adler, Meni and Goldberg, Yoav and Gabay, David and Elhadad, Michael
Abstract
We introduce a novel algorithm that is language independent: it exploits a maximum entropy letters model trained over the known words observed in the corpus and the distribution of the unknown words in known tag contexts, through iterative approximation.
Conclusion
The algorithm we have proposed is language independent: it exploits a maximum entropy letters model trained over the known words observed in the corpus and the distribution of the unknown words in known tag contexts, through iterative approximation.
Method
Letters A maximum entropy model is built for all unknown tokens in order to estimate their tag distribution.
Method
For each possible such segmentation, the full feature vector is constructed, and submitted to the Maximum Entropy model.
Method
To address this lack of precision, we learn a maximum entropy model on the basis of the following binary features: one feature for each pattern listed in column Formation of Table 3 (40 distinct patterns) and one feature for “no pattern”.
maximum entropy is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Bergsma, Shane and Lin, Dekang and Goebel, Randy
Evaluation
For classification, we use a maximum entropy model (Berger et al., 1996), from the logistic regression package in Weka (Witten and Frank, 2005), with all default parameter settings.
Evaluation
Note that our maximum entropy classifier actually produces a probability of non-referentiality, which is thresholded at 50% to make a classification.
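As a concrete analogue of this setup (a minimal sketch, not the authors' code: scikit-learn's logistic regression stands in for the Weka package, since binary maximum entropy classification and logistic regression coincide; the features and data below are hypothetical):

    # Binary MaxEnt (logistic regression) with the 50% threshold described above.
    from sklearn.linear_model import LogisticRegression

    # Hypothetical feature vectors for occurrences of "it" and binary labels
    # (1 = non-referential, 0 = referential).
    X_train = [[0.9, 1.0, 0.0], [0.1, 0.0, 1.0], [0.7, 1.0, 1.0], [0.2, 0.0, 0.0]]
    y_train = [1, 0, 1, 0]

    model = LogisticRegression()
    model.fit(X_train, y_train)

    # The classifier outputs a probability of non-referentiality ...
    p_nonref = model.predict_proba([[0.8, 1.0, 0.0]])[0][1]
    # ... which is thresholded at 50% to make the classification.
    label = "non-referential" if p_nonref >= 0.5 else "referential"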
Results
When we inspect the probabilities produced by the maximum entropy classifier (Section 4.2), we see only a weak bias for the non-referential class on these examples, reflecting our classifier’s uncertainty.
Results
The suitability of this kind of approach to correcting some of our system’s errors is especially obvious when we inspect the probabilities of the maximum entropy model’s output decisions on the Test-200 set.
Results
Where the maximum entropy classifier makes mistakes, it does so with less confidence than when it classifies correct examples.
maximum entropy is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Espinosa, Dominic and White, Michael and Mehay, Dennis
Related Work
Additionally, as our tagger employs maximum entropy modeling, it is able to take into account a greater variety of contextual features, including those derived from parent nodes.
The Approach
3.2 Maximum Entropy Hypertagging
The Approach
The resulting contextual features and gold-standard supertag for each predication were then used to train a maximum entropy classifier model.
The Approach
Maximum entropy models describe a set of probability distributions of the form:
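The form itself is dropped from this excerpt; the standard log-linear family meant here is (generic notation):

    p(y \mid x) = \frac{1}{Z(x)} \exp\left(\sum_k \lambda_k f_k(x, y)\right), \quad Z(x) = \sum_{y'} \exp\left(\sum_k \lambda_k f_k(x, y')\right)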
maximum entropy is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Chen, Yanping and Zheng, Qinghua and Zhang, Wei
Discussion
Par in Column 4 is the number of parameters in the trained maximum entropy model, which indicates the model complexity.
Feature Construction
A maximum entropy multi-class classifier is trained and tested on the generated relation instances.
Feature Construction
To implement the maximum entropy model, the toolkit provided by Le (2004) is employed.
Introduction
We apply these approaches in a maximum entropy based system to extract relations from the ACE 2005 corpus.
Related Work
The TRE systems use techniques such as: Rules (Regulars, Patterns and Propositions) (Miller et al., 1998), Kernel method (Zhang et al., 2006b; Zelenko et al., 2003), Belief network (Roth and Yih, 2002), Linear programming (Roth and Yih, 2007), Maximum entropy (Kambhatla, 2004) or SVM (GuoDong et al., 2005).
maximum entropy is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Cheung, Jackie Chi Kit and Penn, Gerald
Introduction
This is done using a maximum entropy model (call it MAXENT).
Introduction
Then, the remaining constituents are ordered using a second maximum entropy model (MAXENT2).
Introduction
The maximum entropy models for both steps rely on the following features:
maximum entropy is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Bhat, Suma and Xue, Huichao and Yoon, Su-Youn
Conclusions
Empirically, we show that the proposed measure, based on a maximum entropy classification, satisfied the constraints of the design of an objective measure to a high degree.
Experimental Setup
5.3.4 Maximum Entropy Model Classifier
Experimental Setup
We used the maximum entropy classifier implementation in the MaxEnt toolkit.
Experimental Setup
One straightforward way of using the maximum entropy classifier’s prediction for our case is to directly use its predicted score-level: 1, 2, 3 or 4.
Models for Measuring Grammatical Competence
This is done by resorting to a maximum entropy model based approach, to which we turn next.
maximum entropy is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Meng, Xinfan and Wei, Furu and Liu, Xiaohua and Zhou, Ming and Xu, Ge and Wang, Houfeng
Experiment
Following the description in (Lu et al., 2011), we remove neutral sentences and keep only high confident positive and negative sentences as predicted by a maximum entropy classifier trained on the labeled data.
Experiment
This model uses English labeled data and Chinese labeled data to obtain initial parameters for two maximum entropy classifiers (for English documents and Chinese documents), and then conducts EM-iterations to update the parameters to gradually improve the agreement of the two monolingual classifiers on the unlabeled parallel sentences.
Related Work
(2002) compare the performance of three commonly used machine learning models (Naive Bayes, Maximum Entropy and SVM).
Related Work
They propose a method of training two classifiers based on maximum entropy formulation to maximize their prediction agreement on the parallel corpus.
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min
Abstract
The proposed sense-based translation model enables the decoder to select appropriate translations for source words according to the inferred senses for these words using maximum entropy classifiers.
Conclusion
We incorporate these learned word senses as translation evidences into maximum entropy classifiers which form the
Experiments
Our baseline system is a state-of-the-art SMT system which adapts Bracketing Transduction Grammars (Wu, 1997) to phrasal translation and equips itself with a maximum entropy based reordering model (Xiong et al., 2006).
Introduction
In order to incorporate word senses into SMT, we propose a sense-based translation model that is built on maximum entropy classifiers.
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Goto, Isao and Utiyama, Masao and Sumita, Eiichiro and Tamura, Akihiro and Kurohashi, Sadao
Experiment
The L-BFGS method (Liu and Nocedal, 1989) was used to estimate the weight parameters of maximum entropy models.
Experiment
The maximum entropy method with Gaussian prior smoothing was used to estimate the model parameters.
Proposed Method
In this work, we use the maximum entropy method (Berger et al., 1996) as a discriminative machine learning method.
Proposed Method
The reason for this is that a model based on the maximum entropy method can calculate probabilities.
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Li, Haizhou
Argument Reordering Model
After all features are extracted, we use the maximum entropy toolkit in Section 3.3 to train the maximum entropy classifier as formulated in Eq.
Predicate Translation Model
The essential component of our model is a maximum entropy classifier p_t(e|C(v)) that predicts the target translation e for a verbal predicate v given its surrounding context C(v).
Predicate Translation Model
This will increase the number of classes to be predicted by the maximum entropy classifier.
Predicate Translation Model
Using these events, we train one maximum entropy classifier per verbal predicate (16,121 verbs in total) via the off-the-shelf MaxEnt toolkit.
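A sketch of this one-classifier-per-predicate arrangement (assumed structure only; the paper used an off-the-shelf MaxEnt toolkit rather than scikit-learn, and the events below are hypothetical):

    # Train one maximum entropy (multinomial logistic regression) classifier per
    # verbal predicate, mapping context features to a target translation.
    from collections import defaultdict
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    # Hypothetical training events: (source predicate, context features, translation).
    events = [
        ("hold", {"left": "will", "right": "meeting"}, "juxing"),
        ("hold", {"left": "to", "right": "views"}, "chiyou"),
    ]

    by_predicate = defaultdict(list)
    for verb, context, translation in events:
        by_predicate[verb].append((context, translation))

    classifiers = {}
    for verb, data in by_predicate.items():
        contexts, translations = zip(*data)
        vectorizer = DictVectorizer()
        X = vectorizer.fit_transform(contexts)  # binary context features
        clf = LogisticRegression().fit(X, list(translations))
        classifiers[verb] = (vectorizer, clf)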
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Li, Haizhou
Abstract
We use a maximum entropy classifier to predict translation errors by integrating word posterior probability feature and linguistic features.
Conclusions and Future Work
In this paper, we have presented a maximum entropy based approach to automatically detect errors in translation hypotheses generated by SMT
Error Detection with a Maximum Entropy Model
For classification, we employ the maximum entropy model (Berger et al., 1996) to predict whether a word w is correct or incorrect given its feature vector.
Introduction
We integrate two sets of linguistic features into a maximum entropy (MaxEnt) model and develop a MaxEnt-based binary classifier to predict the category (correct or incorrect) for each word in a generated target sentence.
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ohno, Tomohiro and Murata, Masaki and Matsubara, Shigeki
Discussion
In the experiment described in Section 5, we used the linguistic information provided by humans as the features for the maximum entropy method.
Experiment
Here, we used the maximum entropy method tool (Zhang, 2008) with the default options except “-i 2000.”
Linefeed Insertion Technique
These probabilities are estimated by the maximum entropy method.
Linefeed Insertion Technique
4.2 Features on Maximum Entropy Method
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Saha, Sujan Kumar and Mitra, Pabitra and Sarkar, Sudeshna
Abstract
Methods like Maximum Entropy and Conditional Random Fields make use of features for the training purpose.
Abstract
The feature reduction techniques lead to a substantial performance improvement over baseline Maximum Entropy technique.
Introduction
In their Maximum Entropy (MaxEnt) based approach for Hindi NER development, Saha et al.
Maximum Entropy Based Model for Hindi NER
Maximum Entropy (MaxEnt) principle is a commonly used technique which provides the probability of a token belonging to a class.
maximum entropy is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zweig, Geoffrey and Platt, John C. and Meek, Christopher and Burges, Christopher J.C. and Yessenalina, Ainur and Liu, Qiang
Sentence Completion via Language Modeling
3.2 Maximum Entropy Class-Based N-gram Language Model
Sentence Completion via Language Modeling
The key ideas are the modeling of word n-gram probabilities with a maximum entropy model, and the use of word-class information in the definition of the features.
Sentence Completion via Language Modeling
Both components are themselves maximum entropy n-gram models in which the probability of a word or class label l given history h is determined by \frac{1}{Z(h)} \exp\left(\sum_k \lambda_k f_k(h, l)\right). The features f_k(h, l) used are the presence of various patterns in the concatenation of hl, for example whether a particular suffix is present in hl.
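For context, the class-based factorization that these two components plug into is conventionally written as follows (an assumed reconstruction; c(w) denotes the class of word w):

    P(w \mid h) = P(c(w) \mid h) \cdot P(w \mid c(w), h)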
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann
Comparative Study
4.2 Maximum entropy reordering model
Comparative Study
(Zens and Ney, 2006) proposed a maximum entropy classifier to predict the orientation of the next phrase given the current phrase.
Introduction
The classifier can be trained with maximum likelihood like Moses lexicalized reordering (Koehn et al., 2007) and hierarchical lexicalized reordering model (Galley and Manning, 2008) or be trained under maximum entropy framework (Zens and Ney, 2006).
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin
A Joint Model with Unlabeled Parallel Text
Maximum entropy (MaxEnt) models have been widely used in many NLP tasks (Berger et al., 1996; Ratnaparkhi, 1997; Smith, 2006).
Introduction
maximum entropy and SVM classifiers) as well as two alternative methods for leveraging unlabeled data (transductive SVMs (Joachims, 1999b) and co-training (Blum and Mitchell, 1998)).
Related Work
Among the popular semi-supervised methods (e.g., EM on Naïve Bayes (Nigam et al., 2000), co-training (Blum and Mitchell, 1998), transductive SVMs (Joachims, 1999b), and co-regularization (Sindhwani et al., 2005; Amini et al., 2010)), our approach employs the EM algorithm, extending it to the bilingual case based on maximum entropy.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
LIU, Xiaohua and ZHANG, Shaodian and WEI, Furu and ZHOU, Ming
Our Method
We have replaced KNN by other classifiers, such as those based on Maximum Entropy and Support Vector Machines, respectively.
Our Method
Similarly, to study the effectiveness of the CRF model, it is replaced by its alternatives, such as the HMM labeler and a beam search plus a maximum entropy based classifier.
Related Work
Other methods, such as classification based on Maximum Entropy models and sequential application of Perceptron or Winnow (Collins, 2002), are also practiced.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Scheible, Christian and Schütze, Hinrich
Distant Supervision
To increase coverage, we train a Maximum Entropy (MaxEnt) classifier (Manning and Klein, 2003)
Features
Following previous sequence classification work with Maximum Entropy models (e.g., (Ratnaparkhi, 1996)), we use selected features of adjacent sentences.
Sentiment Relevance
We divide both the SR and P&L corpora into training (50%) and test sets (50%) and train a Maximum Entropy (MaxEnt) classifier (Manning and Klein, 2003) with bag-of-word features.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Xiang, Bing and Luo, Xiaoqiang and Zhou, Bowen
Conclusions and Future Work
In this paper, we presented a novel structured approach to EC prediction, which utilizes a maximum entropy model with various syntactic features and shows significantly higher accuracy than the state-of-the-art approaches.
Experimental Results
We parse our test set with a maximum entropy based statistical parser (Ratnaparkhi, 1997) first.
Related Work
A maximum entropy model is utilized to predict the tags, but different types of ECs are not distinguished.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Tratz, Stephen and Hovy, Eduard
Automated Classification
We use a Maximum Entropy (Berger et al., 1996) classifier with a large number of boolean features, some of which are novel (e.g., the inclusion of words from WordNet definitions).
Automated Classification
Maximum Entropy classifiers have been effective on a variety of NLP problems including preposition sense disambiguation (Ye and Baldwin, 2007), which is somewhat similar to noun compound interpretation.
Automated Classification
The results for these runs using the Maximum Entropy classifier are presented in Table 4.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Boxwell, Stephen and Mehay, Dennis and Brew, Chris
Identification and Labeling Models
As in previous approaches to SRL, Brutus uses a two-stage pipeline of maximum entropy classifiers.
Introduction
For the identification and labeling steps, we train a maximum entropy classifier (Berger et al., 1996) over sections 02-21 of a version of the CCGbank corpus (Hockenmaier and Steedman, 2007) that has been augmented by projecting the Propbank semantic annotations (Boxwell and White, 2008).
Results
G&H use a generative model with a back-off lattice, whereas we use a maximum entropy classifier.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Xue, Huichao and Hwa, Rebecca
A Classifier for Merging Basic-Edits
To predict whether two basic-edits address the same writing problem more discriminatively, we train a Maximum Entropy binary classifier based on features extracted from relevant contexts for the basic edits.
Experimental Setup
MaxEntMerger We use the Maximum Entropy classifier to predict whether we should merge the two edits, as described in Section 3.4.
Experimental Setup
We use a Maximum Entropy classifier along with features suggested by Swanson and Yamangil for this task.
maximum entropy is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: