Experiments | Indeed, the difference between our best results and those of Elsner and Charniak are not statistically significant. |
Experiments | Table 3 finally shows that syntactic information improves the performance of our system (though not significantly) and gives the best results (PACC). |
Experiments | The best results, which present a statistically significant improvement when compared to the random baseline, are obtained when distance information and the number of entities “shared” by two sentences are taken into account (PW). |
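The “shared entities” feature mentioned above can be sketched in a few lines; the function name and set-based representation are illustrative, not from the paper:

```python
# Count entities occurring in both sentences; the entity sets are assumed
# to be given (e.g., by a named-entity tagger or noun-phrase extractor).
def shared_entities(ents_a, ents_b):
    return len(set(ents_a) & set(ents_b))

print(shared_entities({"Obama", "Congress"}, {"Congress", "Senate"}))  # → 1
```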
The Entity Grid Model | These extensions led to the best results reported so far for the sentence ordering task. |
Abstract | Results on the Penn Treebank show that our conversion method achieves 42% error reduction over the previous best result. |
Abstract | Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing, and the use of unlabeled data by self-training further increases the parsing f-score to 85.2%, resulting in 6% error reduction over the previous best result. |
Experiments of Grammar Formalism Conversion | The best result of Xia et al. |
Experiments of Grammar Formalism Conversion | Finally Q-10-method achieved an f-score of 93.8% on WSJ section 22, an absolute 4.4% improvement (42% error reduction) over the best result of Xia et al. |
Experiments of Parsing | Moreover, the use of unlabeled data further boosted the parsing performance to 85.2%, an absolute 1.0% improvement over the previous best result presented in Burkett and Klein (2008). |
Introduction | Our conversion method achieves 93.8% f-score on dependency trees produced from WSJ section 22, resulting in 42% error reduction over the previous best result for DS to PS conversion. |
Introduction | When coupled with the self-training technique, a reranking parser with CTB and converted CDT as labeled data achieves 85.2% f-score on the CTB test set, an absolute 1.0% improvement (6% error reduction) over the previous best result for Chinese parsing. |
Evaluation of NE Recognition | From the table we observe that the best result is obtained when k is 100. |
Evaluation of NE Recognition | Similarly, when we deal with all the words in the corpus (17,465 words), we obtain the best results when the words are clustered into 1,100 clusters. |
Evaluation of NE Recognition | The best result is obtained when important words for the two preceding and two following positions (defined in Section 4.3) are selected. |
Maximum Entropy Based Model for Hindi NER | While experimenting with static word features, we have observed that a window of the previous and next two words (w_{i-2}...w_{i+2}) gives the best result (69.09) using the word features only. |
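The ±2-word window above can be sketched as follows; this is a generic illustration of static window features, not the paper's code, and the feature-key format is an assumption:

```python
# Extract the tokens in a +/-size window around position i as named
# features, padding with None at sentence boundaries.
def window_features(words, i, size=2):
    feats = {}
    for offset in range(-size, size + 1):
        j = i + offset
        feats["w[%+d]" % offset] = words[j] if 0 <= j < len(words) else None
    return feats

sent = ["Mohan", "lives", "in", "New", "Delhi"]
print(window_features(sent, 3))
```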
Word Clustering | The value of k (number of clusters) was varied until the best result was obtained. |
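The tuning loop described above (vary k, keep the value with the highest held-out score) can be sketched generically; `evaluate` here stands in for the NER F-measure computation, which the text does not spell out:

```python
# Grid-search over candidate cluster counts; returns the best k and all scores.
def best_k(candidates, evaluate):
    scores = {k: evaluate(k) for k in candidates}
    return max(scores, key=scores.get), scores

# Toy scoring function peaking at k = 100, mirroring the observation above.
k, _ = best_k([50, 80, 100, 200], lambda k: -abs(k - 100))
print(k)  # → 100
```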
Conclusion | Thus, the MAP goes up from 29.6% (best result on the balanced corpora) to 42.3% (best result on the unbalanced corpora) in the breast cancer domain, and from 16.5% to 26.0% in the diabetes domain. |
Conclusion | Here, the MAP goes up from 42.3% (best result on the unbalanced corpora) to 46.9% (best result on the unbalanced corpora with prediction) in the breast cancer domain, and from 26.0% to 29.8% in the diabetes domain. |
Experiments and Results | We chose the balanced corpora where the standard approach has shown the best results in the previous experiment, namely [breast cancer corpus 12] and [diabetes corpus 7]. |
Experiments and Results | We can see that the best results are obtained by the Sourcepred approach for both comparable corpora. |
Experiments and Results | We can also notice that the Balanced + Prediction approach slightly outperforms the baseline, while the Unbalanced + Prediction approach gives the best results. |
Introduction | Our named entity recognition system achieves an F1-score of 90.90 on the CoNLL 2003 English data set, which is about 1 point higher than the previous best result. |
Named Entity Recognition | Table 4 summarizes the evaluation results for our NER system and compares it with the two best results on the data set in the literature, as well as the top-3 systems in CoNLL 2003. |
Named Entity Recognition | The best F-score of 90.90, which is about 1 point higher than the previous best result, is obtained with a combination of clusters. |
Query Classification | The best result is achieved with multiple phrasal clusterings. |
Abstract | By evaluating our model on the TempEval data, we show that this approach leads to about 2% higher accuracy for all three types of relations, and to the best results for the task when compared to those of other machine-learning-based systems. |
Introduction | In comparison to other participants of the “TempEval” challenge our approach is very competitive: for two out of the three tasks we achieve the best results reported so far, by a margin of at least 2%. |
Results | We see that for task A, our global model improves an already strong local model to reach the best results both for strict scores (with a margin of 3 percentage points) and relaxed scores (with a margin of 5 percentage points). |
Results | We also achieve competitive relaxed scores, which are close to the TempEval best results. |
Experiments | The best results are in bold; the differences among them are not statistically significant (according to McNemar’s test with p = .05). |
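McNemar's test, used above to compare the top systems, can be computed directly from the two systems' per-item correctness; this is the standard chi-square formulation with continuity correction, not code from the paper:

```python
import math

def mcnemar(correct_a, correct_b):
    """p-value of McNemar's test given per-item booleans for two systems."""
    b = sum(a and not c for a, c in zip(correct_a, correct_b))  # A right, B wrong
    c = sum(c and not a for a, c in zip(correct_a, correct_b))  # B right, A wrong
    if b + c == 0:
        return 1.0
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)   # continuity-corrected statistic
    return math.erfc(math.sqrt(chi2 / 2))    # survival fn. of chi-square, 1 dof

p = mcnemar([True] * 90 + [False] * 10, [True] * 88 + [False] * 12)
print(p > 0.05)  # → True: these two systems do not differ significantly
```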
Experiments | The best result for PNDP+ (the PA algorithm using all features besides the DBN features) on the development set is with λ = 100 and 5 epochs. |
Experiments | The best result for Pegasos with the same features on the development set is with λ = 0.01 and 10 epochs. |
Extensions | Best result in each column in bold. |
Discussion | This means that the best configuration for PP-attachment does not always produce the best results for parsing. |
Results | The SFU representation produces the best results for Bikel (F-score 0.010 above baseline), while for Charniak the best performance is obtained with word+SF (F-score 0.007 above baseline). |
Results | For both parsers the best results are achieved with SFU, which was also the best configuration for parsing with Bikel. |
Results | Comparing the semantic representations, the best results are achieved with SFU, as we saw in the gold-standard PP-attachment case. |
Evaluation | We compared our best results with the participating systems of the task. |
Evaluation | As shown in the table, our best results outperform all of the systems and the MFS baseline. |
Evaluation | It places our best results within the distribution of all 20 unsupervised/knowledge-based participating systems. |
Abstract | Our model obtains the best results to date on recent shared task data for Arabic, Chinese, and English. |
Conclusion | We evaluated our system on all three languages from the CoNLL 2012 Shared Task and present the best results to date on these data sets. |
Introduction | We obtain the best results to date on these data sets.1 |
Results | In Table 2 we compare the results of the nonlocal system (This paper) to the best results from the CoNLL 2012 Shared Task. Specifically, this includes Fernandes et al.’s (2012) system for Arabic and English (denoted Fernandes), and Chen and Ng’s (2012) system for Chinese (denoted C&N). |
Multitask Quality Estimation 4.1 Experimental Setup | Adding per-annotator noise to the pooled model provides a boost in performance; however, the best results are obtained using the Combined kernel, which brings together the strengths of both the independent and pooled settings. |
Multitask Quality Estimation 4.1 Experimental Setup | The MTL model trained on 500 samples had an MAE of 0.7082 ± 0.0042, close to the best results from the full dataset in Table 2, despite using % as much data: here we use % as many training instances where each is singly (cf. |
Multitask Quality Estimation 4.1 Experimental Setup | However, making use of all these layers of metadata together gives substantial further improvements, reaching the best result with CombinedA,S,T. |
Experiments | Best result on |
Experiments | We experimented with various discriminative learners on DEV, including logistic regression, perceptron and SVM, and found L1 regularized logistic regression to give the best result . |
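The L1-regularized logistic regression that won the comparison above can be sketched with proximal gradient descent; this is a generic illustration (an off-the-shelf learner was presumably used), and all hyperparameters and names here are made up. L1 regularization drives irrelevant weights to exactly zero, which is one reason it can beat perceptron and SVM when many features are noisy:

```python
import math

def train_l1_logreg(X, y, lam=0.1, lr=0.1, epochs=200):
    """Batch logistic regression with an L1 penalty via proximal gradient steps."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - yi) * xi[j]
        for j in range(d):
            w[j] -= lr * grad[j] / len(X)
            # Soft-thresholding: the proximal step for the L1 penalty.
            w[j] = math.copysign(max(abs(w[j]) - lr * lam, 0.0), w[j])
    return w

# Feature 1 predicts the label; feature 2 is pure noise.
X = [[1.0, 0.5], [0.9, -0.5], [-1.0, 0.5], [-0.8, -0.5]]
y = [1, 1, 0, 0]
w = train_l1_logreg(X, y)
print(w[1] == 0.0)  # → True: the noise weight is shrunk exactly to zero
```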
Experiments | The final F1 of 42.0% is a relative improvement of one third over the previous best result of 31.4% (Berant et al., 2013). |
Discussion and conclusion | Automated configuration selection yielded positive results, yet the system with context size one and an L2 language-model component often produced the best results. |
Experiments & Results | Here we observe that a context width of one yields the best results . |
Experiments & Results | This combination of a classifier with context size one and trigram-based language model proves to be most effective and reaches the best results so far. |
Experiments | The best result is achieved by combining DT and PSQ with DK and VW. |
Experiments | Borda voting gives the best result under MAP, which is probably due to the adjustment of the interpolation parameter for MAP on the development set. |
Experiments | Under NDCG and PRES, LinLearn achieves the best results, showing the advantage of automatically learning combination weights, which leads to stable results across the various metrics. |
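Borda voting over ranked retrieval lists can be sketched as follows; this is the standard formulation with illustrative document ids, not the paper's implementation. Each input ranking awards a document points inversely proportional to its rank, and the fused ranking sorts by total points:

```python
# Fuse several ranked lists of document ids with a Borda count.
def borda_fuse(rankings):
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0) + (n - pos)  # top rank earns most points
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda d: (-scores[d], d))

fused = borda_fuse([["d1", "d2", "d3"], ["d2", "d1", "d3"], ["d2", "d3", "d1"]])
print(fused[0])  # → d2
```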
Abstract | In our best result (on Assamese), our approach can predict 29% of the token-based out-of-vocabulary words with a small amount of unlabeled training data. |
Evaluation | The best results (again, except for Pashto) are achieved using one of the three reranking methods (reranking by trigraph probabilities or morpheme boundaries) as opposed to doing no reranking. |
Introduction | In our best result (on Assamese), we show that our approach can predict 29% of the token-based out-of-vocabulary words with a small amount of unlabeled training data. |
Experiments and Results | The three best results and the median from TAC-KBP 2012 systems are shown in the remaining columns for the sake of comparison. |
Experiments and Results | We observe that the complete algorithm (co-references, named entity labels and MDP) provides the best results on PER NE links. |
Experiments and Results | On GPE and ORG entities, the simple application of MDP without prior corrections obtains the best results . |
Prediction Experiments | Although increasing the vocabulary size W exponentially makes the feature space more sparse, SME obtains its best result at W = 2^13, where the relative improvement over SAGE and SVM is 16.8% and 22.9% respectively (p < 0.001 under all comparisons). |
Prediction Experiments | As the number of topics K increases, we can see SAGE consistently increase its accuracy, obtaining its best result when K = 30. |
Prediction Experiments | Except for a tie between the two models when K = 10, SME outperforms SAGE for all subsequent values of K. Similar to the region task, SME achieves the best result when K is sparser (p < 0.01 when K = 40 and K = 50). |
Experimental setup | ALL scores significantly different from the best results for the three datasets (lines c7, i8, k7) are marked with ∗ (see text). |
Results and discussion | This subset always includes the best results and a number of other combinations where feature groups are added to or removed from the optimal combination. |
Results and discussion | On line k7, we show results for this run for KDD-T and for runs that differ by one feature group (lines k2–k6, k8). The overall best result (43.8%) is achieved when using all feature groups (line k8). |
Abstract | Our best results allow for 81.57% accuracy. |
Experiments and Results | However, as shown in Figure 4, style and content combined provided the best results . |
Experiments and Results | We found the best results to have an accuracy of 79.96% and 81.57% for 1979 and 1984 respectively using BOW, interests, online behavior, and all lexical-stylistic features. |
Conclusions, Summary and Future Work | the achievements of four different standard ML methods toward the goal of achieving the best results, as opposed to the other systems, which mainly focused on one ML method each. |
Experiments | Specifically, the AlWC_osWC feature variant achieves the best result with 87.75% accuracy. |
Experiments | Table 2 shows that SVM achieved the best result with 96.09% accuracy. |