Abstract | However, it has no statistically significant impact in terms of F-score as incorrect multiword expression recognition has important side effects on parsing. |
Evaluation | In order to establish the statistical significance of results between two parsing experiments in terms of F1 and UAS, we used a unidirectional t-test for two independent samples. |
Evaluation | The statistical significance between two MWE identification experiments was established by using McNemar’s test (Gillick and Cox, 1989). |
Evaluation | The results of the two experiments are considered statistically significant with the computed value p < 0.01. |
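As a rough illustration of the McNemar’s test mentioned above, the sketch below computes an exact two-sided p-value from paired per-item correctness flags for two systems. The source does not say whether the exact or the chi-square variant was used, so the exact binomial form here is an assumption, and the function name `mcnemar_exact` and its inputs are hypothetical.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value for two systems scored on the same items.

    correct_a, correct_b: equal-length sequences of booleans, one flag per item,
    saying whether each system got that item right (hypothetical input format).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both systems must be scored on the same items")

    # Only discordant pairs carry information about a difference.
    a_only = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    b_only = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = a_only + b_only
    if n == 0:
        return 1.0  # the systems never disagree

    # Under the null hypothesis each discordant pair favours either system
    # with probability 1/2, so the smaller count follows Binomial(n, 0.5).
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With a threshold such as the p < 0.01 quoted above, a result would be called significant when the returned value falls below 0.01.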
Experiments | The differences between scores marked with l are not statistically significant. |
Experiments | Our model outperforms all other generative models, though the improvement over the n-gram model is not statistically significant. |
Experiments | We did not find that the use of our syntactic language model yielded any statistically significant increase in BLEU score. |
Experiments | They provide statistically significant improvements in review and segment F1, as well as accuracy, over the baseline models. |
Experiments | Augmenting RS-MiXMiX with sequential dependencies, yielding RS-MiXHMM, provides a moderate (though not statistically significant) improvement in segment F1. |
Experiments | As a result, in addition to yielding alignments, RSA-MiXHMM provides small improvements over RS-MiXHMM (though they are not statistically significant). |
Discussion of Translation Results | We realized smaller, yet statistically significant, gains on the mixed genre data sets. |
Discussion of Translation Results | The baseline contained 78 errors, while our system produced 66 errors, a statistically significant 15.4% error reduction at p ≤ 0.01 according to a paired t-test. |
Experiments | The MT06 result is statistically significant at p ≤ 0.01; MT08 is significant at p ≤ 0.02. |
Experiments | We evaluated translation quality with BLEU-4 (Papineni et al., 2002) and computed statistical significance with the approximate randomization method of Riezler and Maxwell (2005). |
Experiments | Statistical significance in BLEU differences |
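To make the approximate randomization idea concrete, here is a minimal generic sketch, not the exact procedure of Riezler and Maxwell (2005). It assumes paired per-sentence scores for two systems and, as a stand-in for BLEU, uses the mean of those scores as the metric; since BLEU is a corpus-level measure, a real setup would pass a `metric` that recomputes BLEU from the shuffled outputs. The function name, parameters, and defaults are all hypothetical.

```python
import random


def approximate_randomization(scores_a, scores_b, metric=None, trials=10000, seed=0):
    """Estimate a two-sided p-value for the metric difference between two systems.

    scores_a, scores_b: paired per-sentence values (hypothetical input format).
    metric: maps a list of per-sentence values to one corpus-level number;
            defaults to the mean, which is only a placeholder for BLEU.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("the two systems must be scored on the same sentences")
    if metric is None:
        metric = lambda xs: sum(xs) / len(xs)

    rng = random.Random(seed)
    observed = abs(metric(scores_a) - metric(scores_b))
    at_least_as_extreme = 0

    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        # Under the null hypothesis the system labels are exchangeable,
        # so swap each sentence pair with probability 1/2.
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(metric(shuffled_a) - metric(shuffled_b)) >= observed:
            at_least_as_extreme += 1

    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

The returned value is an estimate whose resolution is limited by `trials`: with 10,000 shuffles the smallest reportable p-value is about 0.0001.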
Experiments | Such an improvement is statistically significant (p < 0.01). |
Experiments | Such a gain, which is statistically significant, confirms the effectiveness of semantic features. |
Experimental Results | The up arrow denotes a statistically significant performance improvement over the previous method (above) at a p-value of 0.05; the short line ’-’ denotes no statistically significant difference. |
Experimental Results | which use only the independent sentence-level features, do not perform very well here, and there is no statistically significant performance difference between them. |
Experimental Results | We also find that LCRF, which uses the local context information between sentences, performs better than the LR method in precision and F1 with statistical significance. |
Experiments | We also derive statistical significance of the results by using the model described in (Yeh, 2000) and implemented in (Pado, 2006). |
Experiments | Third, PTK, which produces more general structures, improves over BR by almost 1.5 (statistically significant result) when using our dependency structures GRCT and LCT. |
Experiments | Finally, the best model of SPTK (i.e., using LCT) improves over the best PTK (i.e., using LCT) by almost 1 point (statistically significant result): this difference is only given by lexical similarity. |
I’m ready for my closeup. | For the null hypothesis of random guessing, these results are statistically significant, p < 2^-6 ≈ .016. |
I’m ready for my closeup. | Table 2 shows that all the subjects performed (sometimes much) better than chance, and against the null hypothesis that all subjects are guessing randomly, the results are statistically significant, p < 2^-6 ≈ .016. |
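The quoted bound can be checked with a one-line calculation. The sketch below assumes the exponent 6 is the number of subjects and that, under random guessing, each subject independently scores above chance with probability at most 1/2; neither assumption is stated in the excerpt, and the function name is hypothetical.

```python
def all_above_chance_bound(n_subjects, p_above=0.5):
    """Upper bound on the probability that every subject beats chance by luck.

    Assumes subjects guess independently and each exceeds chance with
    probability at most p_above (0.5 here, an assumption).
    """
    return p_above ** n_subjects


bound = all_above_chance_bound(6)  # 2^-6 = 1/64
print(bound, round(bound, 3))
```

Under these assumptions the bound is exactly 1/64 = 0.015625, which rounds to the .016 reported in the text.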
Never send a human to do a machine’s job. | Accuracies statistically significantly greater than bag-of-words according to a two-tailed t-test are indicated with *(p<.05) and **(p<.01). |
Experiments and Analysis | The p-values in parentheses present the statistical significance of the improvements. |
Experiments and Analysis | The improvements shown in parentheses are all statistically significant (p < 10^-5). |
Experiments and Analysis | The improvements shown in parentheses are all statistically significant (p < 10^-5). |