Experimental setup | In order to speed up the experiments, a random sample of 2000 words was drawn from the pool and presented to the active learner each time. |
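The pool-subsampling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the pool contents and the `sample_candidate_pool` helper are assumptions for the example.

```python
import random

def sample_candidate_pool(pool, k=2000, seed=None):
    """Draw a random subset of the unlabelled pool.

    Scoring every unlabelled example on each active-learning round is
    expensive, so the learner is shown only a random sample of size k.
    """
    rng = random.Random(seed)
    if len(pool) <= k:
        return list(pool)
    return rng.sample(pool, k)

# Hypothetical unlabelled pool of words
pool = [f"word{i}" for i in range(10000)]
candidates = sample_candidate_pool(pool, k=2000, seed=0)
print(len(candidates))  # 2000
```

Fixing the seed makes each round reproducible while still varying the sample across rounds if the seed is changed.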
Results | The dashed curves in Figure 1 represent the baseline performance with no clustering, no context ordering, random sampling, and ALINE, unless otherwise noted. |
Results | For instance, on the Spanish dataset, random sampling reached 97% word accuracy after 1420 words had been annotated, whereas QBB did so with only 510 words — a 64% reduction in labelling effort. |
Results | It is important to note that empirical comparisons of different active learning techniques have shown that random sampling establishes a very strong baseline. |
Empirical Evaluation | Ideally, we would like to evaluate a random sample of the more than 1,000 languages represented in PANDICTIONARY. However, a high-quality evaluation of translation between two languages requires a person who is fluent in both languages. |
Empirical Evaluation | We provided our evaluators with a random sample of translations into their native language. |
Empirical Evaluation | To carry out this comparison, we randomly sampled 1,000 senses from English Wiktionary and ran the three algorithms over them. |
Seed diversity | We randomly sample Sgold from two sets of correct terms extracted from the evaluation cache. |
Unsupervised bagging | One approach is to use uniform random sampling from restricted sections of Lhand. |
Unsupervised bagging | We performed random sampling from the top 100, 200 and 500 terms of Lhand. |
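The restricted-section sampling used for bagging can be sketched as below. This is an illustrative sketch under stated assumptions: `l_hand` stands in for the ranked list Lhand, and `sample_from_top` is a hypothetical helper, not code from the paper.

```python
import random

def sample_from_top(l_hand, top_k, n, seed=None):
    """Uniformly sample n terms from the top-k section of a ranked list."""
    rng = random.Random(seed)
    section = l_hand[:top_k]           # restrict to the highest-ranked terms
    return rng.sample(section, min(n, len(section)))

# Hypothetical ranked term list standing in for Lhand
l_hand = [f"term{i}" for i in range(1000)]

# Draw one bag from each restricted section (top 100, 200, 500)
bags = {top_k: sample_from_top(l_hand, top_k, n=50, seed=1)
        for top_k in (100, 200, 500)}
```

Restricting the sample to the head of the list trades diversity for precision: smaller sections yield cleaner but more redundant bags.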