Index of papers in Proc. ACL that mention

MaxEnt

Seen in text as:

MaxEnt (98)
MaXEnt (57)
MAXENT (14)
maxent (13)
maxe (6)
Maxent (6)

Seen in 186 sentences in 24 papers.

1. Word Clustering and Word Selection Based Feature Reduction for MaxEnt Based Hindi NER

Saha, Sujan Kumar and Mitra, Pabitra and Sarkar, Sudeshna

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	based Feature Reduction for MaxEnt
Introduction	In their Maximum Entropy ( MaxEnt ) based approach for Hindi NER development, Saha et al.
Introduction	(2008) also observed that the performance of the MaxEnt based model often decreases when huge number of features are used in the model.
Maximum Entropy Based Model for Hindi NER	Maximum Entropy ( MaxEnt ) principle is a commonly used technique which provides probability of belongingness of a token to a class.
Maximum Entropy Based Model for Hindi NER	MaxEnt computes the probability $p(o \mid h)$ for any $o$ from the space of all possible outcomes $O$, and for every $h$ from the space of all possible histories $H$. In NER, history can be viewed as all information derivable from the training corpus relative to the current token.
Maximum Entropy Based Model for Hindi NER	The computation of probability ($p(o \mid h)$) of an outcome for a token in MaxEnt depends on a set of features that are helpful in making predictions about the outcome.
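
The excerpts above describe p(o | h) as depending on a set of features. A minimal sketch of that computation follows, assuming the standard log-linear form in which p(o | h) is proportional to the exponential of a weighted feature sum. The feature functions, weights, labels, and the example token are hypothetical illustrations, not the features or values used in the indexed paper.

```python
import math

def maxent_probs(history, outcomes, features, weights):
    """Return p(o | h) for each outcome o under a log-linear (MaxEnt) model."""
    # Unnormalized log-score: weighted sum of feature values for (history, outcome).
    scores = {
        o: sum(w * f(history, o) for f, w in zip(features, weights))
        for o in outcomes
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    unnorm = {o: math.exp(s - top) for o, s in scores.items()}
    z = sum(unnorm.values())  # normalizer Z(h)
    return {o: u / z for o, u in unnorm.items()}

# Hypothetical binary features over (history, outcome) pairs.
def prev_is_honorific(h, o):
    return 1.0 if h["prev"] == "Shri" and o == "PER" else 0.0

def ends_with_pur(h, o):
    return 1.0 if h["word"].endswith("pur") and o == "LOC" else 0.0

def other_bias(h, o):
    return 1.0 if o == "O" else 0.0

features = [prev_is_honorific, ends_with_pur, other_bias]
weights = [2.0, 1.5, 0.5]  # made-up values; real weights are learned from data
history = {"word": "Sharma", "prev": "Shri"}
probs = maxent_probs(history, ["PER", "LOC", "O"], features, weights)
```

Here `probs` is a distribution over the three hypothetical labels; because only the honorific feature and the bias feature fire, the person label receives the largest share.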

MaxEnt is mentioned in 16 sentences in this paper.


2. A Sense-Based Translation Model for Statistical Machine Translation

Xiong, Deyi and Zhang, Min

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Decoding with Sense-Based Translation Model	MaxEnt classifiers
Decoding with Sense-Based Translation Model	Once we get sense clusters for word tokens in test sentences, we load pre-trained MaxEnt classifiers of the corresponding word types.
Experiments	We trained our MaxEnt classifiers with the off-the-shelf MaxEnt tool. We performed 100 iterations of the L-BFGS algorithm implemented in the training toolkit on the collected training events from the sense-annotated data as described in Section 3.2.
Experiments	It took an average of 57.5 seconds for training a Maxent classifier.
Sense-Based Translation Model	entropy ( MaxEnt ) based classifier that is used to predict the translation probability $p(\tilde{e} \mid C(c))$.
Sense-Based Translation Model	The MaxEnt classifier can be formulated as follows.
Sense-Based Translation Model	This is not a issue for the MaxEnt classifier as it can deal with arbitrary overlapping features (Berger et al., 1996).

MaxEnt is mentioned in 14 sentences in this paper.


3. Confidence Measure for Word Alignment

Huang, Fei

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Alignment Link Confidence Measure	We combine the HMM alignment, the BM alignment and the MaxEnt alignment (ME) using the above link selection algorithm.
Alignment Link Confidence Measure	Figure 3 shows such an example, where alignment errors in the MaxEnt alignment are shown with dotted lines.
Improved MaxEnt Aligner with Confidence-based Link Filtering	In addition to the alignment combination, we also improve the performance of the MaxEnt aligner through confidence-based alignment link filtering.
Improved MaxEnt Aligner with Confidence-based Link Filtering	Here we select the MaxEnt aligner because it has
Introduction	In section 4 we show how to improve a MaxEnt word alignment quality by removing low confidence alignment links, which also leads to improved translation quality as shown in section 5.
Related Work	Regarding word alignment combination, in addition to the commonly used “intersection-union-refine” approach (Och and Ney, 2003), (Ayan and Dorr, 2006b) and (Ayan et al., 2005) combined alignment links from multiple word alignment based on a set of linguistic and alignment features within the MaxEnt framework or a neural net model.
Sentence Alignment Confidence Measure	HMM 54.72 -0.710 BM 62.53 -0.699 MaxEnt 69.26 -0.699
Sentence Alignment Confidence Measure	We randomly selected 512 Chinese-English (CE) sentence pairs and generated word alignment using the MaxEnt aligner (Ittycheriah and Roukos, 2005).
Sentence Alignment Confidence Measure	For each sentence pair in the CE test set, we calculate the confidence scores of the HMM alignment, the Block Model alignment and the MaxEnt alignment, then select the alignment with the highest confidence score.
Translation	We extract phrase translation tables from the baseline MaxEnt word alignment as well as the alignment with confidence-based link filtering, then translate the test set with each phrase translation table.

MaxEnt is mentioned in 12 sentences in this paper.


4. Shallow Analysis Based Assessment of Syntactic Complexity for Automated Speech Scoring

Bhat, Suma and Xue, Huichao and Yoon, Su-Youn

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Setup	Subsequently, the feature extraction stage (a VSM or a MaxEnt model as the case may be) generates the syntactic complexity feature which is then incorporated in a multiple linear regression model to generate a score.
Experimental Setup	We used the maximum entropy classifier implementation in the MaxEnt toolkit.
Experimental Setup	The results that follow are based on MaxEnt classifier’s parameter settings initialized to zero.
Models for Measuring Grammatical Competence	The inductive classifier we use here is the maximum-entropy model ( MaxEnt ) which has been used to solve several statistical natural language processing problems with much success (Berger et al., 1996; Borthwick et al., 1998; Borthwick, 1999; Pang et al., 2002; Klein et al., 2003; Rosenfeld, 2005).
Models for Measuring Grammatical Competence	The productive feature engineering aspects of incorporating features into the discriminative MaxEnt classifier motivate the model choice for the problem at hand.
Models for Measuring Grammatical Competence	In particular, the ability of the MaxEnt model’s estimation routine to handle overlapping (correlated) features makes it directly applicable to address the first limitation of the VSM model.

MaxEnt is mentioned in 15 sentences in this paper.


5. Ordering Prenominal Modifiers with a Reranking Approach

Liu, Jenny and Haghighi, Aria

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Analysis	MAXENT seems to outperform the CLASS BASED baseline because it learns more from the training data.
Analysis	[Figure legend: MaxEnt, ClassBased]
Experiments	To evaluate our system ( MAXENT ) and our baselines, we partitioned the corpora into training and testing data.
Experiments	For each NP in the test data, we generated a set of modifiers and looked at the predicted orderings of the MAXENT , CLASS BASED, and GOOGLE N-GRAM methods.
Results	The MAXENT model consistently outperforms CLASS BASED across all test corpora and sequence lengths for both tokens and types, except when testing on the Brown and Switchboard corpora for modifier sequences of length 5, for which neither approach is able to make any correct predictions.
Results	MAXENT also outperforms the GOOGLE N-GRAM baseline for almost all test corpora and sequence lengths.
Results	For the Switchboard test corpus token and type accuracies, the GOOGLE N-GRAM baseline is more accurate than MAXENT for sequences of length 2 and overall, but the accuracy of MAXENT is competitive with that of GOOGLE N-GRAM.

MaxEnt is mentioned in 16 sentences in this paper.

Topics mentioned in this paper:

MAXENT (16)
n-gram (15)
reranking (7)

6. Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

A Joint Model with Unlabeled Parallel Text	Maximum entropy ( MaxEnt ) models have been widely used in many NLP tasks (Berger et al., 1996; Ratnaparkhi, 1997; Smith, 2006).
A Joint Model with Unlabeled Parallel Text	With MaxEnt , we learn from the input data:
A Joint Model with Unlabeled Parallel Text	When $\lambda_1$ is 0, the algorithm ignores the unlabeled data and degenerates to two MaxEnt models trained on only the labeled data.
Experimental Setup 4.1 Data Sets and Preprocessing	MaxEnt: This method learns a MaxEnt classifier for each language given the monolingual labeled data; the unlabeled data is not used.
Results and Analysis	By making use of the unlabeled parallel data, our proposed approach improves the accuracy, compared to MaxEnt , by 8.12% (or 33.27% error reduction) on English and 3.44% (or 16.92% error reduction) on Chinese in the first setting, and by 5.07% (or 19.67% error reduction) on English and 3.87% (or 19.4% error reduction) on Chinese in the second setting.
Results and Analysis	8 Significance is tested using paired t-tests with p<0.05: denotes statistical significance compared to the corresponding performance of MaxEnt ; * denotes statistical significance compared to SVM; and r denotes statistical significance compared to Co-SVM.
Results and Analysis	When $\lambda_1$ is set to 0, the joint model degenerates to two MaxEnt models trained with only the labeled data.
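
The results excerpt above pairs each absolute accuracy gain with a relative error reduction. The sketch below shows how the two quantities relate, assuming error is defined as one minus accuracy. The function names are hypothetical, and the baseline error it derives is implied arithmetic from the two quoted figures, not a number reported in the excerpt.

```python
def relative_error_reduction(baseline_acc, improved_acc):
    """Fraction of the baseline error removed by the improved system."""
    baseline_err = 1.0 - baseline_acc
    improved_err = 1.0 - improved_acc
    return (baseline_err - improved_err) / baseline_err

def implied_baseline_error(abs_gain, rel_reduction):
    """Baseline error rate implied by an absolute gain and a relative reduction."""
    # abs_gain = rel_reduction * baseline_err, so divide to recover baseline_err.
    return abs_gain / rel_reduction

# English, first setting: 8.12 points absolute, 33.27% relative error reduction.
english_baseline_err = implied_baseline_error(0.0812, 0.3327)

# Sanity check on made-up accuracies: 75% -> 80% removes a fifth of the error.
toy_reduction = relative_error_reduction(0.75, 0.80)
```

Under these assumptions the quoted English figures imply a baseline error of roughly 24%, which is consistent with a relative reduction of about one third for an 8-point gain.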

MaxEnt is mentioned in 11 sentences in this paper.


7. Labeling Documents with Timestamps: Learning from their Time Expressions

Chambers, Nathanael

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments and Results	The MaxEnt classifiers are also from the Stanford toolkit, and both the document and year mention classifiers use its default settings (quadratic prior).
Experiments and Results	MaxEnt Unigram is our new discriminative model for this task.
Experiments and Results	MaxEnt Time is the discriminative model with rich time features (but not NER) as described in Section 3.3.2 (Time+NER includes NER).
Learning Time Constraints	Figure 2: Distribution over years for a single document as output by a MaxEnt classifier.
Learning Time Constraints	We train a MaxEnt model on each year mention, to be described next.
Learning Time Constraints	We use a MaxEnt classifier trained on the individual year mentions.
Timestamp Classifiers	We used a MaxEnt model and evaluated with the same filtering methods based
Timestamp Classifiers	Ultimately, this MaxEnt model vastly outperforms these NLLR models.
Timestamp Classifiers	The above language modeling and MaxEnt approaches are token-based classifiers that one could apply to any topic classification domain.

MaxEnt is mentioned in 15 sentences in this paper.

Topics mentioned in this paper:

MaxEnt (15)
unigrams (15)
NER (7)

8. Sentiment Relevance

Scheible, Christian and Schütze, Hinrich

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Distant Supervision	To increase coverage, we train a Maximum Entropy ( MaxEnt ) classifier (Manning and Klein,
Distant Supervision	The MaxEnt model achieves an F1 of 61.2% on the SR corpus (Table 3, line 2).
Distant Supervision	As described in Section 4, each document is represented as a graph of sentences and weights between sentences and source/sink nodes representing SR/SNR are set to the confidence values obtained from the distantly trained MaxEnt classifier.
Sentiment Relevance	We divide both the SR and P&L corpora into training (50%) and test sets (50%) and train a Maximum Entropy ( MaxEnt ) classifier (Manning and Klein, 2003) with bag-of-word features.

MaxEnt is mentioned in 10 sentences in this paper.


9. Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

Zhao, Bing and Lee, Young-Suk and Luo, Xiaoqiang and Li, Liu

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussions and Conclusions	We achieved a high accuracy of 84.7% for predicting such boundaries using MaxEnt model on machine parse trees.
Elementary Trees to String Grammar	During training, we label nodes with translation boundaries, as one additional function tag; during decoding, we employ the MaxEnt model to predict the translation boundary label probability for each span associated with a subgraph y, and discourage derivations accordingly for using nonterminals over the non-translation boundary span.
Experiments	To learn our MaxEnt models defined in § 3.3, we collect the events during extracting elm2str grammar in training time, and learn the model using improved iterative scaling.
Experiments	There are 16 thousand human parse trees with human alignment; additional 1 thousand human parse and aligned sent-pairs are used as unseen test set to verify our MaxEnt models and parsers.
Experiments	It showed our MaxEnt model is very accurate using human trees: 94.5% of accuracy, and about 84.7% of accuracy for using the machine parsed trees.
Introduction	The boundary cases were not addressed in the previous literature for trees, and here we include them in our feature sets for learning a MaxEnt model to predict the transformations.
Introduction	The rest of the paper is organized as follows: in section 2, we analyze the projectable structures using human aligned and parsed data, to identify the problems for SCFG in general; in section 3, our proposed approach is explained in detail, including the statistical operators using a MaxEnt model; in section 4, we illustrate the integration of the proposed approach in our decoder; in section 5, we present experimental results; in section 6, we conclude with discussions and future work.

MaxEnt is mentioned in 9 sentences in this paper.


10. Hedge Classification in Biomedical Texts with a Weakly Supervised Selection of Keywords

Szarvas, György

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Methods	This is performed by weighting features to maximise the likelihood of data and, for each instance, decisions are made based on features present at that point, thus maxent classification is quite suitable for our purposes.
Methods	As feature weights are mutually estimated, the maxent classifier is capable of taking feature dependence into account.
Methods	By downweighting such features, maxent is capable of modelling to a certain extent the special characteristics which arise from the automatic or weakly supervised training data acquisition procedure.
Results	Here we decided not to check whether these keywords made sense in scientific texts or not, but instead left this task to the maximum entropy classifier, and added only those keywords that were found reliable enough to predict spec label alone by the maxent model trained on the training dataset.
Results	This 54-keyword maxent classifier got an $F_{\beta=1}$(spec) score of 79.73%.
Results	We manually examined all keywords that had a P(spec) > 0.5 given as a standalone instance for our maxent model, and constructed a dictionary of hedge cues from the promising candidates.

MaxEnt is mentioned in 8 sentences in this paper.


11. Error Detection for Statistical Machine Translation Using Linguistic Features

Xiong, Deyi and Zhang, Min and Li, Haizhou

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Error Detection with a Maximum Entropy Model	We tune our model feature weights using an off-the-shelf MaxEnt toolkit (Zhang, 2004).
Error Detection with a Maximum Entropy Model	During test, if the probability $p(\text{correct} \mid \psi)$ is larger than $p(\text{incorrect} \mid \psi)$ according to the trained MaxEnt model, the word is labeled as correct otherwise incorrect.
Experiments	Starting with MaxEnt models with single linguistic feature or word posterior probability based feature, we incorporated additional features incrementally by combining features together.
Experiments	We conducted three groups of experiments using the MaxEnt based error detection model with various feature combinations.
Experiments	Using discrete word posterior probabilities as features in the MaxEnt based error detection model is marginally better than word posterior probability thresholding in terms of CER, but obtains a 13.79% relative improvement in F measure.
Introduction	We integrate two sets of linguistic features into a maximum entropy ( MaxEnt ) model and develop a MaxEnt-based binary classifier to predict the category (correct or incorrect) for each word in a generated target sentence.

MaxEnt is mentioned in 7 sentences in this paper.


12. Encoding Relation Requirements for Relation Extraction via Joint Inference

Chen, Liwei and Feng, Yansong and Huang, Songfang and Qin, Yong and Zhao, Dongyan

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusions	Furthermore, our framework is scalable for other local sentence level extractors in addition to the MaxEnt model.
Experiments	Our ILP model and its variants all outperform Mintz++ in precision in both datasets, indicating that our approach helps filter out incorrect predictions from the output of MaxEnt model.
Experiments	However, in the Riedel’s dataset, Mintz++, the MaxEnt relation extractor, does not perform well, and our framework cannot improve its performance.
Experiments	Hence, our framework does not perform well due to the poor performance of MaxEnt extractor and the lack of clues.
The Framework	By adopting ILP, we can combine the local information including MaxEnt confidence scores and the implicit relation backgrounds that are embedded into global consistencies of the entity tuples together.

MaxEnt is mentioned in 7 sentences in this paper.


13. Fast Consensus Decoding over Translation Forests

DeNero, John and Chiang, David and Knight, Kevin

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Consensus Decoding Algorithms	The standard Viterbi decoding objective is to find $e^* = \arg\max_e \lambda \cdot \theta(f, e)$.
Consensus Decoding Algorithms	$\tilde{e} = \arg\max_e \mathbb{E}_{P(e' \mid f)}[S(e; e')] = \arg\max_e \sum_{e' \in E} P(e' \mid f) \cdot S(e; e')$
Consensus Decoding Algorithms	$\arg\max_{e \in E} \mathbb{E}_{P(e' \mid f)}[S(e; e')] = \arg\max_e \sum_{e' \in E} P(e' \mid f) \cdot \sum_j \omega_j(e) \cdot \phi_j(e')$
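
The consensus objective in the excerpts above picks the translation with the highest expected similarity to the other candidates. A minimal sketch follows, assuming a similarity that decomposes linearly over features, so that the expectation can be computed once and reused for every candidate. It works over a small k-best list rather than a translation forest, and the word-count features, the indicator weighting, and the toy posteriors are hypothetical choices made only for illustration.

```python
from collections import Counter

def expected_features(kbest):
    """Expected word counts under the posterior; kbest is [(tokens, prob), ...]."""
    expected = Counter()
    for tokens, prob in kbest:
        for word, count in Counter(tokens).items():
            expected[word] += prob * count
    return expected

def consensus_decode(kbest):
    """Return the candidate with the highest expected similarity."""
    expected = expected_features(kbest)

    def score(tokens):
        # Hypothetical weighting: 1 for each distinct word the candidate contains,
        # multiplied by that word's expected count under the posterior.
        return sum(expected[word] for word in set(tokens))

    return max(kbest, key=lambda item: score(item[0]))[0]

# Toy k-best list with made-up, normalized posteriors.
kbest = [
    (["a", "dog", "sat"], 0.4),
    (["the", "cat", "sat"], 0.3),
    (["the", "cat", "sits"], 0.3),
]
viterbi = max(kbest, key=lambda item: item[1])[0]
consensus = consensus_decode(kbest)
```

In this toy list the single most probable candidate and the consensus choice differ, because two lower-probability candidates agree on most of their words and pool their posterior mass.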

MaxEnt is mentioned in 6 sentences in this paper.


14. Nonparametric Learning of Phonological Constraints in Optimality Theory

Doyle, Gabriel and Bicknell, Klinton and Levy, Roger

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiment	To establish performance for the phonological standard, we use the IBPOT learner to find constraint weights but do not update M. The resultant learner is essentially MaxEnt OT with the weights estimated through Metropolis sampling instead of gradient ascent.
Introduction	We consider this question by examining the dominant framework in modern phonology, Optimality Theory (Prince and Smolensky, 1993, OT), implemented in a log-linear framework, MaxEnt OT (Goldwater and Johnson, 2003), with output forms’ probabilities based on a weighted sum of
Phonology and Optimality Theory 2.1 OT structure	In IBPOT, we use the log-linear EVAL developed by Goldwater and Johnson (2003) in their MaxEnt OT system.
Phonology and Optimality Theory 2.1 OT structure	MEOT also is motivated by the general MaxEnt framework, whereas most other OT formulations are ad hoc constructions specific to phonology.
Phonology and Optimality Theory 2.1 OT structure	In MaxEnt OT, each constraint has a weight, and the candidates’ scores are the sums of the weights of violated constraints.
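
The last excerpt above describes candidate scores as sums of the weights of violated constraints. A minimal sketch of turning such scores into output probabilities follows, assuming the usual log-linear convention that a larger weighted violation total means a lower probability. The constraint names, weights, and candidate forms are hypothetical and are not taken from the indexed paper.

```python
import math

def maxent_ot_probs(violations, weights):
    """Map each candidate to a probability from its weighted constraint violations.

    violations: {candidate: [violation count for each constraint]}
    weights:    [non-negative weight for each constraint]
    """
    # Penalty of a candidate: sum of weights times violation counts.
    penalties = {
        cand: sum(w * v for w, v in zip(weights, counts))
        for cand, counts in violations.items()
    }
    # Shift by the smallest penalty so the exponentials stay well scaled.
    lowest = min(penalties.values())
    unnorm = {cand: math.exp(-(p - lowest)) for cand, p in penalties.items()}
    z = sum(unnorm.values())
    return {cand: u / z for cand, u in unnorm.items()}

# Hypothetical constraints: [NoCoda, Max], with made-up weights.
weights = [2.0, 1.0]
violations = {
    "pat": [1, 0],  # keeps the final consonant, violating NoCoda once
    "pa": [0, 1],   # deletes it, violating Max once
}
ot_probs = maxent_ot_probs(violations, weights)
```

With these made-up weights the candidate that violates the lighter constraint is preferred, but the other candidate keeps non-zero probability, which is the main difference from a strict constraint ranking.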

MaxEnt is mentioned in 5 sentences in this paper.


15. Cut the noise: Mutually reinforcing reordering and alignments for improved machine translation

Visweswariah, Karthik and Khapra, Mitesh M. and Ramanathan, Ananthakrishnan

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Generating reference reordering from parallel sentences	This model was significantly better than the MaxEnt aligner (Ittycheriah and Roukos, 2005) and is also flexible in the sense that it allows for arbitrary features to be introduced while still keeping training and decoding tractable by using a greedy decoding algorithm that explores potential alignments in a small neighborhood of the current alignment.
Generating reference reordering from parallel sentences	The model thus needs a reasonably good initial alignment to start with for which we use the MaxEnt aligner (Ittycheriah and Roukos, 2005) as in McCarley et al.
Results and Discussions	None - 35.5 Manual 180K 52.5 MaxEnt 70.0 3.9M 49.5
Results and Discussions	We see that the quality of the alignments matter a great deal to the reordering model; using MaxEnt alignments cause a degradation in performance over just using a small set of manual word alignments.

MaxEnt is mentioned in 4 sentences in this paper.


16. Improving Chinese Word Segmentation on Micro-blog Using Rich Punctuations

Zhang, Longkai and Li, Li and He, Zhengyan and Wang, Houfeng and Sun, Ni

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiment	Method       P      R      F      OOV-R
	Stanford     0.861  0.853  0.857  0.639
	ICTCLAS      0.812  0.861  0.836  0.602
	Li-Sun       0.707  0.820  0.760  0.734
	Maxent       0.868  0.844  0.856  0.760
	No-punc      0.865  0.829  0.846  0.760
	No-balance   0.869  0.877  0.873  0.757
	Our method   0.875  0.875  0.875  0.773
Experiment	Maxent only uses the PKU data for training, with neither punctuation information nor self-training framework incorporated.
Experiment	The comparison of Maxent and No-punctuation

MaxEnt is mentioned in 4 sentences in this paper.


17. Adaptive HTER Estimation for Document-Specific MT Post-Editing

Huang, Fei and Xu, Jian-Ming and Ittycheriah, Abraham and Roukos, Salim

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Document-specific MT System	ment (HMM (Vogel et al., 1996) and MaxEnt (Ittycheriah and Roukos, 2005) alignment models, phrase pair extraction, MT model training (Ittycheriah and Roukos, 2007) and LM model training.
Related Work	Target part-of-speech and null dependency link are exploited in a MaxEnt classifier to improve the MT quality estimation (Xiong et al., 2010).
Static MT Quality Estimation	• 17 decoding features, including phrase translation probabilities (source-to-target and target-to-source), word translation probabilities (also in both directions), maxent probabilities¹, word count, phrase count, distor-
Static MT Quality Estimation	¹ The maxent probability is the translation probability

MaxEnt is mentioned in 4 sentences in this paper.


18. A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation

Li, Junhui and Marton, Yuval and Resnik, Philip and Daumé III, Hal

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	It shows that 1) as expected, our classifiers do worse on the harder semantic reordering prediction than syntactic reordering prediction; 2) thanks to the high accuracy obtained by the maxent classifiers, integrating either the syntactic or the semantic reordering constraints results in better reordering performance from both syntactic and semantic perspectives; 3) in terms of the mutual impact, the syntactic reordering models help improving semantic reordering more than the semantic reordering
Discussion	                    Syntactic       Semantic
	                    l-m    rm     l-m    rm
	MR08                75.0   78.0   66.3   68.5
	+syn-reorder        78.4   80.9   69.0   70.2
	+sem-reorder        76.0   78.8   70.7   72.7
	+both               78.6   81.7   70.6   72.1
	Maxent Classifier   80.7   85.6   70.9   73.5
Experiments	tactic parsing and semantic role labeling on the Chinese sentences, then train the models by using MaxEnt toolkit with L1 regularizer (Tsuruoka et al., 2009). Table 3 shows the reordering type distribution over the training data.
Related Work	Marton and Resnik (2008) employed soft syntactic constraints with weighted binary features and no MaxEnt model.

MaxEnt is mentioned in 4 sentences in this paper.


19. Infusion of Labeled Data into Distant Supervision for Relation Extraction

Pershina, Maria and Min, Bonan and Xu, Wei and Grishman, Ralph

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Available at http://nlp.stanford.edu/software/mimlre.shtml.	[Figure legend: Guided DS, Semi-MIML, DS+upsampling, MaxEnt]
Available at http://nlp.stanford.edu/software/mimlre.shtml.	Our baselines: 1) MaxEnt is a supervised maximum entropy baseline trained on a human-labeled data; 2) DS+upsampling is an upsampling experiment, where MIML was trained on a mix of a distantly-labeled and human-labeled data; 3) Semi-MIML is a recent semi-supervised extension.
The Challenge	We experimentally tested alternative feature sets by building supervised Maximum Entropy ( MaxEnt ) models using the hand-labeled data (Table 3), and selected an effective combination of three features from the full feature set used by Surdeanu et al., (2011):
The Challenge	Table 3: Performance of a MaxEnt , trained on hand-labeled data using all features (Surdeanu et al., 2011) vs using a subset of two (types of entities, dependency path), or three (adding a span word) features, and evaluated on the test set.

MaxEnt is mentioned in 4 sentences in this paper.


20. A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

Liu, Yang

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	                    BLEU     TER
	standard            34.79    56.93
	+ depLM             35.29*   56.17
	+ maxent            35.40    56.09**
	+ depLM & maxent    35.71    55.87
Introduction	Adding dependency language model (“depLM”) and the maximum entropy shift-reduce parsing model ( “maxent” ) significantly improves BLEU and TER on the development set, both separately and jointly.

MaxEnt is mentioned in 3 sentences in this paper.


21. A Joint Model for Discovery of Aspects in Utterances

Celikyilmaz, Asli and Hakkani-Tur, Dilek

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

MultiLayer Context Model - MCM	and predicted dialog act by $\arg\max_a P(a \mid u d^*)$:
MultiLayer Context Model - MCM	* N M311 a; = arg maXa [6351 a * HF”, M1: ] (6)
MultiLayer Context Model - MCM	For each segment $w_{uj}$ in $u$, its predicted slot are determined by $\arg\max_s P(s_j \mid w_{uj}, d^*, s_{j-1})$:

MaxEnt is mentioned in 3 sentences in this paper.


22. Enlisting the Ghost: Modeling Empty Categories for Machine Translation

Xiang, Bing and Luo, Xiaoqiang and Zhou, Bowen

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	In this paper we present a comprehensive treatment of ECs by first recovering them with a structured MaxEnt model with a rich set of syntactic and lexical features, and then incorporating the predicted ECs into a Chinese-to-English machine translation task through multiple approaches, including the extraction of EC-specific sparse features.
Chinese Empty Category Prediction	We propose a structured MaxEnt model for predicting ECs.
Chinese Empty Category Prediction	(1) is the familiar log linear (or MaxEnt ) model, where $f_k(e_{i-1}, T, e_i)$ is the feature function and

MaxEnt is mentioned in 3 sentences in this paper.


23. Measuring Sentiment Annotation Complexity of Text

Joshi, Aditya and Mishra, Abhijit and Senthamilselvan, Nivvedan and Bhattacharyya, Pushpak

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	We use three sentiment classification techniques: Naïve Bayes, MaxEnt and SVM with unigrams, bigrams and trigrams as features.
Discussion	MaxEnt (Movie) -0.29 (72.17) MaxEnt (Twitter) -0.26 (71.68) SVM (Movie) -0.24 (66.27) SVM (Twitter) -0.19 (73.15)
Discussion	MaxEnt has the highest negative correlation of -0.29 and -0.26.

MaxEnt is mentioned in 3 sentences in this paper.


24. A Convolutional Neural Network for Modelling Sentences

Kalchbrenner, Nal and Grefenstette, Edward and Blunsom, Phil

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	unigram, bigram, trigram 92.6 MAXENT POS, chunks, NE, supertags
Experiments	unigram, bigram, trigram 93.6 MAXENT POS, wh-word, head word
Experiments	SVM 81.6 BINB 82.7 MAXENT 83.0 MAX-TDNN 78.8 NBOW 80.9 DCNN 87.4

MaxEnt is mentioned in 3 sentences in this paper.
