Biomarker discovery through functional and network analysis | The decrease of mutual information in ranked genes follows an inverse logarithmic relationship (Fig. |
Biomarker discovery through functional and network analysis | For each classifier, we selected the gene subset that accounts for the top 10% of the mutual information content of all genes, yielding feature sets that range from 49 to 136 genes. |
Feature selection by mutual information | Feature selection by mutual information |
Feature selection by mutual information | Mutual information is a statistical measure of dependence [69] that has been widely applied in feature selection to find an informative subset of features for model training [70].
Feature selection by mutual information | In our work, each of the eight models was trained with the top k-ranked genes based on their mutual information (MI) with the label, where the MI between a gene's expression X and the label Y is measured (in bits) by I(X;Y) = Σ_{x,y} p(x,y) log2[ p(x,y) / (p(x) p(y)) ].
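The text does not show how MI is estimated in practice; a minimal numpy sketch of the standard plug-in estimator on discretized values might look like the following (the function name `mi_bits` and the toy inputs are ours, not the paper's):

```python
import numpy as np

def mi_bits(x, y):
    """Mutual information (in bits) between two discrete variables,
    estimated from their joint co-occurrence counts (plug-in estimator)."""
    x, y = np.asarray(x), np.asarray(y)
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # joint counts
    pxy = joint / joint.sum()                # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)      # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)      # marginal p(y)
    nz = pxy > 0                             # skip zero cells to avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])).sum())

# Perfectly dependent binary variables carry 1 bit; independent ones carry 0.
print(mi_bits([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
print(mi_bits([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

Ranking genes then amounts to computing `mi_bits(gene_expression, label)` per gene and sorting in descending order.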
Selection of most informative genes and functional enrichment analysis | The most informative genes are selected by measuring the mutual information (in bits) between each gene and each of the characteristic variables, and then selecting the top 10% of genes based on their information content.
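One way to read "the top 10% of the mutual information content" is a cumulative cutoff: take the highest-MI genes until their MI values sum to 10% of the total over all genes. A hedged sketch of that rule (the helper name and the exact cutoff interpretation are our assumptions, not stated in the text):

```python
import numpy as np

def top_information_genes(mi, frac=0.10):
    """Indices of the highest-MI genes that together account for `frac`
    of the summed MI over all genes (hypothetical cutoff rule)."""
    mi = np.asarray(mi, dtype=float)
    order = np.argsort(mi)[::-1]       # genes from most to least informative
    cum = np.cumsum(mi[order])         # running total of MI, best genes first
    n = int(np.searchsorted(cum, frac * mi.sum())) + 1
    return order[:n]

mi = [5.0, 3.0, 1.0, 1.0]                    # toy per-gene MI values
print(top_information_genes(mi, frac=0.10))  # [0]
```

Because ranked MI decays steeply, a small cumulative fraction selects few genes, consistent with the 49-to-136-gene feature sets reported above.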
Selection of most informative genes and functional enrichment analysis | In addition to DAVID, we have performed a GSEA analysis [75] where each gene is ranked by its mutual information (S9 Table). |
Supporting Information | The intersection of the feature gene set when mutual information (MI) and differential expression (DEG) are used for ranking. |
Supporting Information | Ranked list of all genes in the EcoGEC compendium based on their mutual information for the phase, growth, and aerobic classifiers, before and after iterative learning.
Supporting Information | Ranks and mutual information of the genes selected in each classifier of carbon source and oxygen supply. |
Caution about correlation | This concern extends to methods based on mutual information (e.g., relevance networks [17]) since, as Fig. 1 shows, the bivariate joint distribution of relative abundances (from which mutual information is estimated) can be quite different from the bivariate joint distribution of the absolute abundances that gave rise to them.
Correlations between relative abundances tell us absolutely nothing | This many-to-one mapping means that other measures of statistical association (e.g., rank correlations or mutual information) will not tell us anything either when applied to purely relative data.
Introduction | Thus, relative data is also problematic for mutual information and other distributional measures of association. |
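A toy simulation makes the compositional artifact concrete (all parameters here are illustrative assumptions, not from the source): two taxa with independent absolute abundances acquire a strong spurious association once the data are closed to relative abundances by a dominant, highly variable third taxon.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
a = rng.lognormal(mean=0.0, sigma=0.1, size=n)   # two independent taxa
b = rng.lognormal(mean=0.0, sigma=0.1, size=n)
c = rng.lognormal(mean=3.0, sigma=1.0, size=n)   # dominant, variable taxon
total = a + b + c
ra, rb = a / total, b / total                     # closure: relative abundances

r_abs = np.corrcoef(a, b)[0, 1]    # near 0: the taxa really are independent
r_rel = np.corrcoef(ra, rb)[0, 1]  # strongly positive: shared denominator
print(round(r_abs, 3), round(r_rel, 3))
```

The same shared-denominator effect distorts any association measure estimated from the closed data, mutual information included, which is the point made above.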