Biological validation analysis | Genes with common annotations are considered true positives.
Biological validation analysis | The performance is based on the number of candidate genes that are considered true positives.
Biological validation analysis | To quantify the statistical significance of a given number of true positives at a given iteration step we use a sliding window approach: At each iteration step i, we consider the same number of candidate genes as there are seed genes for the respective disease. |
Comparison with existing methods | a higher ratio of true positives TP/(TP+FP). |
Discussion | This can be used to estimate the expected true positive rate in the predictions and is particularly convenient for predicting new disease associations, where the total number of proteins involved in a disease is not known. |
Estimating the recovery rate | As expected, the highest rate of true positives is achieved in early iterations, so the highest ranked proteins are most likely to be part of the original full module. |
Estimating the recovery rate | Indeed, estimating the true positive rate is inherently difficult as the true set of proteins is by definition unknown. |
Validating disease modules | Hence, the true positive rate can be estimated by removing varying fractions of seed proteins. |
Correlation analysis in using bound estimates protease captures known pair correlations | From these 1275 pairs, the 127 (top 10%) pairs with the highest MI when calculated using the MSA were selected as the putative true positives to which we compared our procedure. |
Correlation analysis in using bound estimates protease captures known pair correlations | In total, 1275 pairs are plotted (127 putative true positives) that are common to both our deep sequencing dataset and the Stanford HIVDB downloadable protease dataset (see Materials and Methods).
Correlation analysis in using bound estimates protease captures known pair correlations | True positives were determined through a mutual information calculation similar to the calculations in [3]. |
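The selection rule described in these rows (rank all 1275 pairs by MSA-based mutual information and keep the top 10%) can be sketched as follows. This is a minimal illustration, not the authors' code: `top_fraction`, the 51 hypothetical positions (chosen only because they yield 1275 pairs), and the random stand-in MI values are all assumptions.

```python
import itertools
import random


def top_fraction(pair_scores, fraction):
    """Return the set of pairs whose scores fall in the top `fraction`.

    The cutoff count is floored, so 10% of 1275 pairs gives 127.
    """
    k = int(len(pair_scores) * fraction)
    ranked = sorted(pair_scores.items(), key=lambda kv: kv[1], reverse=True)
    return {pair for pair, _ in ranked[:k]}


# Hypothetical input: 51 positions give 51 * 50 / 2 = 1275 pairs.
rng = random.Random(0)
pairs = list(itertools.combinations(range(51), 2))
mi = {pair: rng.random() for pair in pairs}  # stand-in for real MI values

putative_true_positives = top_fraction(mi, 0.10)
print(len(pairs), len(putative_true_positives))
```

Flooring the cutoff is a design choice made here so that 10% of 1275 gives the 127 pairs quoted in the text; a rounding rule would need to be checked against the original Methods.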
Supporting Information | As in Fig 5, shown are the top 5% of 1275 pairs with 127 putative true positives from Stanford HIVDB.
Supporting Information | As in Fig 5 and S3 Fig, shown are the top 5% of 1275 pairs with 127 putative true positives from Stanford HIVDB.
(figure panel; axis label: Time) | The likelihood of false positives is greatly reduced, but so is the likelihood of identifying true positives.
Simulated data benchmarks | We use it to further assess the importance of considering asymmetric waveforms, and we explore how multiple hypothesis correction impacts the results when the true positives represent a relatively small fraction of the simulated time series, as we expect to be the case in genome-wide studies.
Simulated data benchmarks | The receiver operating characteristic (ROC) curve plots the true positive rate (TPR) as a function of the false positive rate (FPR) as the threshold for calling a time series as a positive is varied. |
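To make the ROC construction concrete, the sketch below sweeps the calling threshold over a small set of scores and records (FPR, TPR) at each step. It is an illustration under assumptions: the function names, the toy scores, and the convention that higher scores mean "more likely positive" are not taken from the source.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points as the threshold is lowered past each score.

    `labels` holds 1 for a true positive series and 0 for a true negative.
    Items with tied scores are called together in a single step.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative")

    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data: three positives and three negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(area_under_curve(curve))
```

The area under the curve equals the fraction of positive-negative pairs in which the positive scores higher, which is 8 of 9 pairs for this toy input.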
Simulated data benchmarks | A score of 1 indicates that a method correctly identified all true positives and true negatives, while a score of -1 indicates that a method yielded all false positives and false negatives.
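The score is not named in this excerpt. A score with exactly these endpoints is the Matthews correlation coefficient, so the sketch below assumes that is what is meant; if the source uses a different statistic, the formula does not apply. The function name and the zero-denominator convention are also assumptions.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    chosen here for illustration).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Perfect calls: every positive and negative is identified correctly.
print(matthews_corrcoef(tp=10, tn=90, fp=0, fn=0))
# Fully inverted calls: only false positives and false negatives.
print(matthews_corrcoef(tp=0, tn=0, fp=90, fn=10))
```

Unlike accuracy, this coefficient stays informative when true positives are a small fraction of the series, which is the setting described for the simulated benchmarks.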
Discussion | Within the correctly predicted interactions (i.e., true positives), we included detailed information about the network routes for Flurbiprofen and Ibuprofen.
nAnnoLyze benchmark | First, the precision, defined as the ratio between the true positives (TP; true drug-protein interactions found by nAnnoLyze) and the sum of TP and false positives (FP; a link between a drug and a protein not in the PDB).
nAnnoLyze benchmarking | 1A) with an optimal threshold at -2.5 local Z-score resulting in a precision of 0.63 and coverage of 0.19 corresponding to 1,148 true positive predictions (Fig.
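The precision definition above, TP / (TP + FP), and the figures reported in this row can be connected with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and because 0.63 is a rounded value the implied totals are approximate rather than reported results.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp)


def implied_predictions(tp, reported_precision):
    """Approximate TP + FP implied by a TP count and a rounded precision."""
    return round(tp / reported_precision)


tp = 1148                               # true positive predictions from the text
total = implied_predictions(tp, 0.63)   # approximate TP + FP
fp = total - tp                         # approximate false positives
print(total, fp, round(precision(tp, fp), 2))
```

Under these assumptions, roughly 1,800 predictions in total would pass the threshold, of which about two thirds are true positives.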