Distant Supervision | To increase coverage, we train a Maximum Entropy (MaxEnt) classifier (Manning and Klein, 2003).
Distant Supervision | The MaxEnt model achieves an F1 of 61.2% on the SR corpus (Table 3, line 2). |
Distant Supervision | As described in Section 4, each document is represented as a graph of sentences, and the weights of edges between sentences and the source/sink nodes representing SR/SNR are set to the confidence values obtained from the distantly trained MaxEnt classifier.
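The graph construction above is the familiar s-t minimum-cut formulation: each sentence gets an edge to the source (SR) weighted by its SR confidence and an edge to the sink (SNR) weighted by its SNR confidence, and pairwise edges encourage neighboring sentences to share a label. A minimal sketch, assuming per-sentence confidences and symmetric pair weights (all function and parameter names here are hypothetical, not the paper's code):

```python
from collections import deque

def min_cut_partition(n, source_w, sink_w, pair_w):
    """Label sentences 0..n-1 as SR/SNR via an s-t minimum cut.
    source_w[i]: classifier confidence that sentence i is SR (edge s->i);
    sink_w[i]:   confidence that it is SNR (edge i->t);
    pair_w[(i, j)]: association weight between sentences i and j."""
    S, T = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[S][i] = source_w[i]
        cap[i][T] = sink_w[i]
    for (i, j), w in pair_w.items():
        cap[i][j] += w
        cap[j][i] += w

    def bfs():  # shortest augmenting path (Edmonds-Karp)
        parent = [-1] * (n + 2)
        parent[S] = S
        q = deque([S])
        while q:
            u = q.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    if v == T:
                        return parent
                    q.append(v)
        return None

    while (parent := bfs()) is not None:
        f, v = float("inf"), T
        while v != S:                 # bottleneck capacity on the path
            f = min(f, cap[parent[v]][v])
            v = parent[v]
        v = T
        while v != S:                 # push flow, update residual graph
            cap[parent[v]][v] -= f
            cap[v][parent[v]] += f
            v = parent[v]

    seen = [False] * (n + 2)          # source side of the cut = SR
    seen[S] = True
    q = deque([S])
    while q:
        u = q.popleft()
        for v in range(n + 2):
            if not seen[v] and cap[u][v] > 1e-12:
                seen[v] = True
                q.append(v)
    return [i for i in range(n) if seen[i]]

# Three sentences (toy numbers): sentence 1 stays on the SNR side because
# its SNR confidence outweighs its pairwise link to sentence 0.
sr = min_cut_partition(3, [0.9, 0.2, 0.8], [0.1, 0.8, 0.2], {(0, 1): 0.5})
```

The cut trades off per-sentence classifier confidence against pairwise coherence, so a borderline sentence can be flipped by strongly linked neighbors.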
Sentiment Relevance | We divide both the SR and P&L corpora into training (50%) and test (50%) sets and train a Maximum Entropy (MaxEnt) classifier (Manning and Klein, 2003) with bag-of-words features.
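A binary MaxEnt classifier over bag-of-words features is equivalent to logistic regression; the sketch below is an illustrative stand-in trained by stochastic gradient ascent, not the Stanford toolkit (Manning and Klein, 2003) used in the paper, and the toy documents are invented:

```python
import math
from collections import defaultdict

def train_maxent(docs, labels, epochs=200, lr=0.5):
    """Binary MaxEnt (logistic regression) over bag-of-words features,
    trained with plain stochastic gradient ascent on the log-likelihood."""
    w = defaultdict(float)                      # one weight per word + bias
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            z = w["__bias__"] + sum(w[t] for t in doc.split())
            p = 1.0 / (1.0 + math.exp(-z))      # P(SR | doc)
            g = y - p                           # log-likelihood gradient
            w["__bias__"] += lr * g
            for t in doc.split():
                w[t] += lr * g
    return w

def predict(w, doc):
    """Return P(SR | doc) under the trained weights."""
    z = w["__bias__"] + sum(w[t] for t in doc.split())
    return 1.0 / (1.0 + math.exp(-z))

# Toy SR/SNR data (invented for illustration); 1 = SR, 0 = SNR.
docs = ["great plot strong acting", "the film runs 120 minutes",
        "terrible pacing weak script", "released in cinemas friday"]
labels = [1, 0, 1, 0]
w = train_maxent(docs, labels)
p_sr = predict(w, "strong script")              # well above 0.5
```

In practice one would use a regularized off-the-shelf implementation; the point here is only the model form, a log-linear score over sparse word features.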
Generating reference reordering from parallel sentences | This model was significantly better than the MaxEnt aligner (Ittycheriah and Roukos, 2005). It is also flexible in that it allows arbitrary features to be introduced while keeping training and decoding tractable: a greedy decoding algorithm explores potential alignments in a small neighborhood of the current alignment.
Generating reference reordering from parallel sentences | The model thus needs a reasonably good initial alignment to start from, for which we use the MaxEnt aligner (Ittycheriah and Roukos, 2005), as in McCarley et al.
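The greedy neighborhood search described above can be sketched as hill climbing: start from the initial (MaxEnt) alignment and repeatedly apply the best score-improving move in a small neighborhood. This is a sketch under the assumption that a move relinks a single target word and that the scoring function is an arbitrary pluggable callable; the names and the toy scorer are hypothetical:

```python
def greedy_align(score, src_len, tgt_len, init):
    """Greedy hill climbing over word alignments.
    align[j] = source index linked to target word j, or -1 for unaligned.
    Repeatedly take the single best relinking move until none improves
    the (arbitrary, feature-based) score."""
    align = list(init)
    best = score(align)
    improved = True
    while improved:
        improved = False
        for j in range(tgt_len):
            for i in [-1] + list(range(src_len)):   # -1 = leave j unaligned
                if i == align[j]:
                    continue
                cand = align[:j] + [i] + align[j + 1:]
                s = score(cand)
                if s > best:                        # strict improvement only
                    align, best, improved = cand, s, True
    return align, best

# Toy scorer (illustrative): prefer the diagonal, penalize unaligned words.
score = lambda a: -sum(abs(a[j] - j) if a[j] >= 0 else 2
                       for j in range(len(a)))
best_alignment, best_score = greedy_align(score, 3, 3, init=[2, 2, 2])
```

Because only one link changes per move, each candidate differs locally from the current alignment, which is what keeps decoding tractable; the quality of the result depends on the starting point, hence the need for a good initial alignment.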
Results and Discussions | None - 35.5; Manual 180K 52.5; MaxEnt 70.0 3.9M 49.5
Results and Discussions | We see that the quality of the alignments matters a great deal to the reordering model; using MaxEnt alignments causes a degradation in performance compared to using just a small set of manual word alignments.
Experiment | Method P R F OOV-R
Experiment | Stanford 0.861 0.853 0.857 0.639
Experiment | ICTCLAS 0.812 0.861 0.836 0.602
Experiment | Li-Sun 0.707 0.820 0.760 0.734
Experiment | Maxent 0.868 0.844 0.856 0.760
Experiment | No-punc 0.865 0.829 0.846 0.760
Experiment | No-balance 0.869 0.877 0.873 0.757
Experiment | Our method 0.875 0.875 0.875 0.773
Experiment | Maxent uses only the PKU data for training, incorporating neither punctuation information nor the self-training framework.
Experiment | The comparison of Maxent and No-punctuation |
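The P, R, F, and OOV-R columns in the table above can be computed by matching character spans between the gold and predicted segmentations; a word is correct only if both its boundaries match. A minimal sketch (the paper's exact OOV definition may differ; the example sentence and vocabulary are invented):

```python
def spans(words):
    """Map a segmentation to (start, end, word) character spans."""
    out, i = [], 0
    for w in words:
        out.append((i, i + len(w), w))
        i += len(w)
    return out

def prf(gold, pred, train_vocab=frozenset()):
    """Segmentation P/R/F via exact span matching, plus recall on gold
    words outside the training vocabulary (OOV-R)."""
    g, p = spans(gold), spans(pred)
    pset = {(s, e) for s, e, _ in p}
    hit = [(s, e, w) for s, e, w in g if (s, e) in pset]
    P = len(hit) / len(p)
    R = len(hit) / len(g)
    F = 2 * P * R / (P + R) if P + R else 0.0
    oov = [w for _, _, w in g if w not in train_vocab]
    oov_hit = [w for _, _, w in hit if w not in train_vocab]
    oov_r = len(oov_hit) / len(oov) if oov else 0.0
    return P, R, F, oov_r

# Gold "I / like / Beijing" vs. a prediction that oversplits "like":
P, R, F, oov_r = prf(["我", "喜欢", "北京"],
                     ["我", "喜", "欢", "北京"],
                     train_vocab={"我", "喜欢"})
```

Here the oversplit word costs both precision (two wrong spans) and recall (one gold word missed), while the OOV word 北京 is recovered, so OOV-R is perfect on this toy example.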
Introduction | System BLEU TER
Introduction | standard 34.79 56.93
Introduction | + depLM 35.29* 56.17**
Introduction | + maxent 35.40** 56.09**
Introduction | + depLM & maxent 35.71** 55.87**
Introduction | Adding the dependency language model (“depLM”) and the maximum entropy shift-reduce parsing model (“maxent”) significantly improves BLEU and TER on the development set, both separately and jointly.
Abstract | In this paper we present a comprehensive treatment of ECs by first recovering them with a structured MaxEnt model with a rich set of syntactic and lexical features, and then incorporating the predicted ECs into a Chinese-to-English machine translation task through multiple approaches, including the extraction of EC-specific sparse features. |
Chinese Empty Category Prediction | We propose a structured MaxEnt model for predicting ECs.
Chinese Empty Category Prediction | Equation (1) is the familiar log-linear (or MaxEnt) model, where f_k(e_{i-1}, T, e_i) is the feature function and λ_k its weight.
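Under the snippet's notation, where the model predicts an EC tag e_i conditioned on the previous tag e_{i-1} and the parse tree T, Equation (1) presumably has the standard log-linear form (a reconstruction from the surrounding text, not the paper's verbatim equation):

```latex
p(e_i \mid e_{i-1}, T) =
  \frac{\exp\bigl(\sum_k \lambda_k f_k(e_{i-1}, T, e_i)\bigr)}
       {\sum_{e'} \exp\bigl(\sum_k \lambda_k f_k(e_{i-1}, T, e')\bigr)}
```

The denominator normalizes over all candidate tags e', so each f_k fires on a (previous tag, tree, current tag) configuration and λ_k is its learned weight.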