Index of papers in Proc. ACL 2013 that mention
  • random sample
De Benedictis, Flavio and Faralli, Stefano and Navigli, Roberto
Comparative Evaluation
First, we randomly sampled 100 terms from our gold standard for each domain and each of the three languages.
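A minimal sketch of that sampling step, assuming a hypothetical gold[domain][language] dictionary (the paper does not specify its data layout):

    import random

    # Draw n gold-standard terms per (domain, language) stratum.
    # 'gold' maps domain -> language -> list of gold terms (hypothetical layout).
    def sample_terms(gold, n=100, seed=0):
        rng = random.Random(seed)
        return {(dom, lang): rng.sample(terms, n)
                for dom, by_lang in gold.items()
                for lang, terms in by_lang.items()}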
Comparative Evaluation
Table 9: Number of domain glosses (from a random sample of 100 gold standard terms per domain) retrieved using Google Define and GlossBoot.
Comparative Evaluation
As for the precision of the extracted terms, we randomly sampled 50% of them for each system.
Experimental Setup
To calculate precision we randomly sampled 5% of the retrieved terms and asked two human annotators to manually tag their domain pertinence (with adjudication in case of disagreement; κ = .62, indicating substantial agreement).
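The agreement figure is Cohen's kappa; values between 0.61 and 0.80 are conventionally read as "substantial agreement" (Landis and Koch, 1977). A self-contained sketch of the statistic over the two annotators' labels:

    from collections import Counter

    def cohen_kappa(a, b):
        # Observed agreement corrected for chance agreement (Cohen, 1960).
        n = len(a)
        p_o = sum(x == y for x, y in zip(a, b)) / n
        ca, cb = Counter(a), Counter(b)
        p_e = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)
        return (p_o - p_e) / (1 - p_e)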
Experimental Setup
Precision was determined on a random sample of 5% of the acquired glosses for each domain and language.
random sample is mentioned in 6 sentences in this paper.
Melamud, Oren and Berant, Jonathan and Dagan, Ido and Goldberger, Jacob and Szpektor, Idan
Discussion and Future Work
We therefore focused on comparing the performance of our two-level scheme with state-of-the-art prior topic-level and word-level models of distributional similarity, over a random sample of inference rule applications.
Experimental Settings
Rule applications were generated by randomly sampling extractions from ReVerb, such as (‘Jack’, ‘agree with’, ‘Jill’), and then sampling possible rules for each, such as ‘agree with → feel sorry for’.
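A sketch of that two-stage sampling, with toy stand-ins for the ReVerb extractions and the rule index (both hypothetical here):

    import random

    # Hypothetical stand-ins for the real resources: ReVerb extraction
    # triples and a relation -> candidate-rules index.
    extractions = [('Jack', 'agree with', 'Jill')]
    rules = {'agree with': ['agree with -> feel sorry for']}

    rng = random.Random(0)
    extraction = rng.choice(extractions)     # stage 1: sample an extraction
    rule = rng.choice(rules[extraction[1]])  # stage 2: sample a rule for its relation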
Introduction
In order to promote replicability and equal-term comparison with our results, we based our experiments on publicly available datasets, both for unsupervised learning of the evaluated models and for testing them over a random sample of rule applications.
Results
However, our result suggests that topic-level models might not be robust enough when applied to a random sample of inferences.
random sample is mentioned in 4 sentences in this paper.
Mukherjee, Arjun and Liu, Bing
Empirical Evaluation
To learn the Max-Ent parameters λ, we randomly sampled 500 terms from the held-out data (10 threads in our corpus which were excluded from the evaluation of the tasks in §6.2 and §6.3) appearing at least 10 times, labeled them as topical (361) or AD-expressions (139), and used the corresponding features of each term (in the context of the posts where it occurs, §3) to train the Max-Ent model.
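A maximum-entropy classifier is equivalent to (multinomial) logistic regression, so the training step can be sketched with scikit-learn; the feature names below are invented placeholders, not the paper's feature set:

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    # Hypothetical feature dicts for two sampled terms;
    # label 1 = topical, 0 = AD-expression.
    X_dicts = [{'pos=NN': 1.0, 'first_person': 1.0},
               {'pos=VB': 1.0, 'second_person': 1.0}]
    y = [1, 0]

    vec = DictVectorizer()
    clf = LogisticRegression()  # Max-Ent is logistic regression by another name
    clf.fit(vec.fit_transform(X_dicts), y)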
Empirical Evaluation
Instead, we randomly sampled 500 pairs (≈34% of the population) for evaluation.
Phrase Ranking based on Relevance
To compute coverage, we randomly sampled 500 documents from the corpus and listed the candidate n-grams in the collection of 500 sampled documents.
Phrase Ranking based on Relevance
We then computed the coverage to see how many of the relevant terms in the random sample were also present in the top k phrases from the ranked candidate n-grams.
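A sketch of that coverage measure as described (the names are illustrative, not the paper's):

    def coverage_at_k(ranked_ngrams, relevant_terms, k):
        # Share of relevant terms (from the 500 sampled documents) that
        # also appear among the top-k ranked candidate n-grams.
        top_k = set(ranked_ngrams[:k])
        return len(set(relevant_terms) & top_k) / len(relevant_terms)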
random sample is mentioned in 4 sentences in this paper.
Flati, Tiziano and Navigli, Roberto
Experiment 1: Oxford Lexical Predicates
We show in Table 3 the precision@k calculated over a random sample of 50 lexical predicates. As can be seen, while the quality of the classes is high for low values of k, performance gradually degrades as we let k increase.
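precision@k here is the usual fraction of correct items among the top k returned; a one-function sketch under that reading (names are mine):

    def precision_at_k(ranked_items, judged_correct, k):
        # Fraction of the top-k returned items judged correct.
        return sum(item in judged_correct for item in ranked_items[:k]) / k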
Experiment 1: Oxford Lexical Predicates
Starting from the lexical predicate items obtained as described in Section 4.2, we selected those items belonging to a random sample of 20 usage notes among those provided by the Oxford dictionary, totaling 3,245 items.
Experiment 1: Oxford Lexical Predicates
For tuning the parameter we used a held-out set of 8 verbs, randomly sampled from the lexical predicates not used in the dataset.
random sample is mentioned in 3 sentences in this paper.
Tanigaki, Koichi and Shiba, Mitsuteru and Munaka, Tatsuji and Sagisaka, Yoshinori
Conclusions
Moreover, our smoothing model, though unsupervised, provides reliable supervision when sufficiently many random samples of words are available as nearby words.
Discussion
Generally speaking, statistical reliability increases as the number of random samples increases.
Discussion
Therefore, we can conclude that if sufficiently many random samples of nearby words are provided, our smoothing model is reliable, though it is trained in an unsupervised fashion.
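The reliability claim follows the standard sampling-error argument: for n independent random samples, the standard error of an estimated mean shrinks as

    \mathrm{SE}(\hat{\mu}) = \frac{\sigma}{\sqrt{n}}

so larger samples of nearby words yield proportionally more stable estimates.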
random sample is mentioned in 3 sentences in this paper.