Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
Christopher R. S. Banerji, Simone Severini, Carlos Caldas, Andrew E. Teschendorff

Abstract

Intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, is related to cancer stem cells and is also considered a potential prognostic indicator in oncology. The measurement of cancer stem cell abundance and intra-tumour heterogeneity in a clinically relevant manner, however, currently presents a challenge. Here we propose signalling entropy, a measure of signalling pathway promiscuity derived from a sample’s genome-wide gene expression profile, as an estimate of the stemness of a tumour sample. By considering over 500 mixtures of diverse cellular expression profiles, we reveal that signalling entropy also associates with intra-tumour heterogeneity. By analysing 3668 breast cancer and 1692 lung adenocarcinoma samples, we further demonstrate that signalling entropy correlates negatively with survival, outperforming leading clinical gene expression based prognostic tools. Signalling entropy is found to be a general prognostic measure, valid in different breast cancer clinical subgroups, as well as within stage I lung adenocarcinoma. We find that its prognostic power is driven by genes involved in cancer stem cells and treatment resistance. In summary, by approximating both stemness and intra-tumour heterogeneity, signalling entropy provides a powerful prognostic measure across different epithelial cancers.

Author Summary

The Cancer Stem Cell (CSC) hypothesis, the idea that a small population of tumour cells have the capacity to seed and grow the tumour, and intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, have long been considered the basis of potential prognostic indicators in oncology. The identification of CSC based expression signatures and the measurement of intra-tumour heterogeneity, for an assessment of prognostic power in a clinically relevant manner, however, currently presents a challenge. Most proposed methodologies require the collection of new data sets and thus are limited in sample size, making them difficult to validate. Here we consider signalling entropy, a measure of signalling pathway promiscuity, as a means of quantifying the stemness and heterogeneity of any given cancer sample, applicable to publicly available data sets. By considering over 5300 primary tumour samples from both breast and lung cancer patients, we here demonstrate that signalling entropy provides a more robust and general prognostic measure than other leading clinical prognostic indicators.

Introduction

The abundance of CSCs is considered likely to be of prognostic value as well as a source of intra-tumour heterogeneity, a feature that has long been considered of possible prognostic value in oncology [3–6]. Although putative CSCs have been identified by surface marker expression for several malignancies, isolated, and demonstrated to be chemotherapeutic resistant [7–11], it remains a significant challenge to obtain a prognostic measure of their abundance from tumour bulk gene expression profiles across multiple malignancies. Embryonic Stem (ES) cell gene expression signatures are clear candidates for such a measure and indeed have been demonstrated to be prognostic in breast and lung cancer [12–15]. Their overall prognostic significance seems limited, however, and they are unable to discriminate CSCs from the tumour bulk [12, 16]. The clinical assessment of intra-tumour heterogeneity also poses a significant challenge, with current experimental approaches requiring multiple biopsies per tumour, leaving them severely limited in sample size [17–19]. We posited that an expression based measure of signalling promiscuity may quantify the stemness of a tumour in a manner which is related to intra-tumour heterogeneity, and thus provide us with an improved prognostic model.

Specifically, we consider signalling entropy which is computed from the integration of a sample’s genome-wide gene expression profile with an interactome, and provides an overall measure of the signalling promiscuity in the sample [16]. We note that the term signalling entropy was chosen, as opposed to alternatives such as interactome/network entropy, to emphasise the fact that our measure quantifies network traffic (signalling) as opposed to network topology. Importantly, as shown by us previously, signalling entropy correlates with stemness and differentiation potential within distinct cellular developmental lineages [16]. Indeed, we showed that human embryonic stem cells and induced pluripotent stem cells exhibited the highest levels of signalling entropy, with adult stem cells (e.g. hematopoietic stem cells) showing significantly lower values, and terminally differentiated cells exhibiting the lowest entropy values within a lineage [16]. These results were derived mostly from cell-lines, which are characterised by relatively homogeneous cell populations, and were further validated in time-course differentiation experiments [16]. Importantly, we also demonstrated that cancerous tissue displays a higher signalling entropy than its healthy counterpart [16, 20], with CSCs showing higher values than the tumour bulk [16]. Thus, signalling entropy provides an approximation of the stemness of a cellular sample. In addition to quantifying stemness of the signalling regime of a homogeneous cell population, signalling entropy, if computed over a heterogeneous cell population, should also quantify the intercellular diversity in pathway activation. To investigate this we performed an analytical investigation of signalling entropy, coupled with empirical validation. We derived a sufficient condition on the expression profiles of homogeneous cell populations for signalling entropy to be a measure of intra-sample heterogeneity on average.
We subsequently verified that this condition is satisfied by considering 33 distinct adult tissue expression profiles corresponding to 528 pairwise mixtures. Thus, we show that signalling entropy is a good candidate for a correlate of intra-sample heterogeneity.

We here compute signalling entropy for a total of 5360 tumour samples, focusing on two highly heterogeneous cancers, non-small cell lung cancer (NSCLC) and breast cancer, which constitute the two leading causes of cancer death worldwide [21]. Survival rates for early stage NSCLC are particularly poor [21, 22], and identification of prognostic and predictive biomarkers within the stage I stratum is considered a high priority [23]. In breast cancer, the power of gene expression based prognostic indicators, such as OncotypeDX and MammaPrint [24, 25], is highly subtype dependent [26, 27] and a clinical breast cancer prognostic signature which is independent of estrogen receptor (ER) status is lacking. Most importantly, current gene expression based prognostic indicators ignore CSC contributions and intra-tumour heterogeneity [17]. Thus, signalling entropy, a measure of both cell anaplasia and intra-tumour heterogeneity, may form the basis of a general and more robust prognostic indicator. By examining gene expression profiles of over 3500 primary breast cancers and 1300 lung adenocarcinomas, we here demonstrate that signalling entropy is prognostic in breast cancer, regardless of ER status, and in lung adenocarcinomas, within the stage I stratum.

Results

Rationale of signalling entropy as a prognostic measure

Briefly, we employ the mass-action principle to derive, for each sample, a stochastic matrix $p_{ij}$, describing the interaction probability of the proteins encoded by genes $i$ and $j$ in the given sample. The signalling entropy is then computed as the normalised entropy rate of the Markov chain described by $p_{ij}$. This entropy rate gives a steady state measure of the disorder (or promiscuity) in signalling information flow over the network in the given sample (Materials and Methods).

Importantly, we also demonstrated that signalling entropy is elevated in CSCs as compared to the tumour bulk [16]. Thus, given a homogeneous cell population, a high signalling entropy suggests that signalling within each cell is very promiscuous and that the cells may therefore have a plastic stem cell like phenotype. However, a heterogeneous sample, consisting of cells with distinct, though not necessarily promiscuous signalling regimes, should also on average display a high signalling entropy, suggesting that signalling entropy may associate with intra-tumour heterogeneity (Fig. 1A). To investigate whether signalling entropy associates with intra-sample heterogeneity, we considered our measure evaluated for three theoretical samples: namely two homogeneous samples consisting only of cell type x or y respectively, and a third heterogeneous sample consisting of a 50:50 mixture of cell types x and y. It is clear that if cell type x has an expression profile that maximises signalling entropy and cell type y does not, then the signalling entropy of the mixture will be lower than the signalling entropy of x; thus signalling entropy is not a point-wise measure of heterogeneity. However, as most biologically realistic cell types have distinct expression profiles, corresponding to the existence of non-overlapping active pathways between cell type pairs [29], we posited that the signalling entropy of a mixed sample may be higher than that of a homogeneous sample on average. By appealing to detailed balance we examined a closed form expression for signalling entropy. It is a consequence of simple algebra that if signalling entropy is super-additive over the set of biologically admissible expression profiles (i.e., $\text{Signalling Entropy}\left(\frac{x+y}{2}\right) > \frac{1}{2}\,\text{Signalling Entropy}(x) + \frac{1}{2}\,\text{Signalling Entropy}(y)$) then signalling entropy will on average be elevated in mixed samples as opposed to homogeneous samples (Materials and Methods, S1 Text, S8 Fig, S9 Fig, S10 Fig and S11 Fig).
We thus derived a condition for point-wise super-additivity of our measure and then considered a data set of gene expression profiles for 33 distinct adult tissues, representing 528 possible pairwise mixtures [29]. For every possible mixture the derived condition for super-additivity was satisfied (Fig. 1B). Hence the signalling entropies of the mixed samples were significantly higher than those of homogeneous samples on average (Fig. 1C). This provides strong evidence that signalling entropy is a correlate of intra-sample heterogeneity. Thus, signalling entropy associates with tumour stemness in a manner associated with CSC abundance and intra-tumour heterogeneity, making our measure a good candidate for an improved prognostic indicator.
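The mixture comparison described above can be illustrated numerically. The following is a minimal Python sketch under stated assumptions, not the implementation used in this study: a 50:50 mixture of two cell types is modelled as the mean (x + y)/2 of their expression vectors, the stationary distribution is taken in its detailed-balance form, and the unnormalised entropy rate is used (the normalising constant depends only on the network, so it does not affect the comparison). The function names, the toy network and the toy profiles are hypothetical.

```python
import math
from itertools import combinations


def entropy_rate(adj, x):
    """Unnormalised entropy rate of the mass-action random walk.

    adj : symmetric 0/1 adjacency matrix (list of lists), connected network
    x   : positive expression value per node
    """
    n = len(adj)
    # (A x)_i: total expression over the neighbours of node i.
    ax = [sum(adj[i][k] * x[k] for k in range(n)) for i in range(n)]
    # Detailed balance: the stationary weight of node i is x_i * (A x)_i / z.
    z = sum(x[i] * ax[i] for i in range(n))
    rate = 0.0
    for i in range(n):
        # Local entropy of row i, where p_ij = x_j / (A x)_i for neighbours j.
        s_i = -sum((x[j] / ax[i]) * math.log(x[j] / ax[i])
                   for j in range(n) if adj[i][j])
        rate += x[i] * ax[i] / z * s_i
    return rate


def midpoint_superadditive(adj, profiles):
    """For each pair of profiles, test S((x+y)/2) > (S(x) + S(y)) / 2."""
    results = {}
    for (a, x), (b, y) in combinations(profiles.items(), 2):
        mix = [(xi + yi) / 2 for xi, yi in zip(x, y)]
        lhs = entropy_rate(adj, mix)
        rhs = 0.5 * (entropy_rate(adj, x) + entropy_rate(adj, y))
        results[(a, b)] = lhs > rhs
    return results


# Toy network: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy_adj = [[0, 1, 1, 0],
           [1, 0, 1, 0],
           [1, 1, 0, 1],
           [0, 0, 1, 0]]
toy_profiles = {"tissue_a": [1.0, 4.0, 2.0, 3.0],
                "tissue_b": [3.0, 1.0, 2.0, 5.0],
                "tissue_c": [2.0, 2.0, 6.0, 1.0]}
for pair, holds in midpoint_superadditive(toy_adj, toy_profiles).items():
    print(pair, holds)
```

With 33 tissue profiles, the pairwise loop would visit 33 × 32 / 2 = 528 mixtures, matching the count quoted above.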

Signalling entropy is prognostic in the major subtypes of breast cancer

This data set profiles a large number of clinical variables and thus is a suitable platform to examine the clinical associations of our measure. Using outcome first as a binary phenotype, we observed that patients who died of breast cancer had a higher signalling entropy than patients who were alive at last follow up, a result which was seen in both METABRIC subsets (p < 1e-7). Using a Cox proportional hazards model, on 5 year censored survival data, we ascertained that high signalling entropy is associated with increased risk of death in breast cancer (c-index = 0.6, p < 1.1e-6). Stratifying patients into 3 groups, representing the 3 tertiles of the signalling entropy distribution, revealed that tumours with a high entropy exhibited a doubling of the hazard rate compared to low entropy tumours.
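For readers unfamiliar with the quantities reported here, the following Python sketch illustrates censoring at a fixed horizon, a concordance index, and a tertile split. It is an illustration only: it assumes Harrell's pairwise definition of the c-index (the text does not state which estimator was used), it does not fit a Cox model, and all function names are hypothetical.

```python
def censor(times, events, horizon):
    """Administratively censor follow-up at `horizon` (e.g. 5 years)."""
    c_times = [min(t, horizon) for t in times]
    c_events = [bool(e) and t <= horizon for t, e in zip(times, events)]
    return c_times, c_events


def concordance_index(times, events, scores):
    """Harrell's c: fraction of comparable pairs ordered correctly.

    A pair (i, j) is comparable when sample i had the event strictly
    before the follow-up time of sample j. The pair is concordant when
    the sample that died earlier has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties in score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def tertile_groups(scores):
    """Label each sample 0 (low), 1 (middle) or 2 (high) by rank tertile."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = min(2, 3 * rank // len(scores))
    return groups


# Toy data: follow-up in years, event flag (1 = death), entropy-like score.
times, events = censor([1.0, 2.0, 7.0, 4.0], [1, 1, 1, 1], horizon=5.0)
print(concordance_index(times, events, [0.9, 0.8, 0.3, 0.5]))
print(tertile_groups([0.1, 0.5, 0.9, 0.2, 0.6, 0.95]))
```

A c-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context to the reported value of 0.6.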

In addition, signalling entropy was also found to be independent of a prognostic ES cell signature described by Ben-Porath et al. [12] and the prognostic grade signature described by Sotiriou et al. [31] (S1 Text, S1 Fig & S2 Fig). Signalling entropy was significantly prognostic within each tumour grade stratum; notably it was prognostic within the grade 2 stratum in both METABRIC data sets (p < 0.036), an important result given the difficulty in deciding treatment courses in this intermediate prognosis group [31]. The fact that signalling entropy is prognostic independently of all other measures of cell anaplasia suggests that our measure may be capturing more than just the stemness of a tumour sample, and that intra-tumour heterogeneity may be contributing to its prognostic power.

Venet et al. described prognostic associations for a number of random gene expression signatures in breast cancer [32]. To ascertain whether random effects may be driving our findings, we evaluated the prognostic associations of the three random gene expression signatures described by Venet et al. We found that only one was prognostic in both discovery and validation METABRIC data sets and that its prognostic power was determined by ER status (S3 Fig). To further assess the impact of random effects and the importance of our network, we randomised the gene expression profiles of the METABRIC data sets over the network. Performing 5 randomisations and recomputing signalling entropy for the 1980 samples in both METABRIC data sets revealed that randomised signalling entropy did not display robust prognostic associations independently of ER status. We are therefore confident that the prognostic power of signalling entropy is not driven by random effects.

All these datasets described both ER positive and negative tumours with accompanying clinical outcome, profiled on either Affymetrix or Illumina platforms and totalling 1688 samples [33–40] (S1 Table). Meta-analysis revealed that signalling entropy is prognostic across both ER positive and ER negative samples (ER positive: c-index = 0.63, 95% CI = (0.604, 0.657), p = 8.5e-15; ER negative: c-index = 0.57, 95% CI = (0.538, 0.602), p = 0.032, Fig. 2A). Five of the additional eight data sets also described histological tumour grade for each sample, allowing us to further confirm that signalling entropy is prognostic within the grade 2 stratum (c-index = 0.63, 95% CI = (0.581, 0.675), p = 1.05e-6, Fig. 2A).

In a meta-analysis over the 10 breast cancer validation sets we found that, unlike signalling entropy, MammaPrint was not significantly prognostic over ER negative samples (Fig. 2B).

Due to differences in the normalisation between RT-PCR and microarrays, a direct comparison between our measure and OncotypeDX is difficult to perform. Moreover, not all the genes required for computing the OncotypeDX recurrence score were present in all the array platforms considered. However, using a microarray version of OncotypeDX, we found that it performed comparably to signalling entropy across both ER positive (signalling entropy vs. OncotypeDX: p = 0.13) and ER negative samples (signalling entropy vs. OncotypeDX: p = 0.7, S4 Fig). Thus signalling entropy is prognostic in the two major clinical subtypes of breast cancer and hence is a more robust prognostic indicator than MammaPrint.

Signalling entropy is prognostic in stage I lung adenocarcinoma

To evaluate the clinical associations of our measure we first computed signalling entropy for each microarray sample in The Director’s Challenge dataset profiling 398 tumours [42], and for the 455 lung adenocarcinoma RNA-seq tumour samples downloaded from The Cancer Genome Atlas (TCGA) database (http://cancergenome.nih.gov/). We found that signalling entropy was significantly lower in lung adenocarcinoma patients who were alive at last follow up as opposed to those who had died (p < 0.03). Fitting Cox proportional hazard models to 3 year censored data revealed that an increased signalling entropy implied a worse prognosis in lung adenocarcinoma (c-index = 0.6, p < 0.007). We again separated patients into tertiles of the signalling entropy distribution and found that high signalling entropy conferred almost a doubling of the hazard rate, as assessed over the first 3 years following diagnosis (HR = 1.9, p < 0.02). Signalling entropy was found to be associated with tumour stage, grade and smoking status, in both TCGA and Director’s Challenge data sets, yet importantly the prognostic power of signalling entropy was independent of these clinical variables (S1 Text, S5 Fig & S6 Fig). It is of particular note that signalling entropy is significantly prognostic whether computed from microarray or RNA-seq data sets; this result attests to the biological relevance of our measure, which is not masked by experimental technique.

Sub-staging by size is currently the standard clinical approach to stratify stage I tumours. However, on meta-analysis we found that this stratification, unlike signalling entropy, was not significantly prognostic over the stage I stratum (Fig. 3B).

Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes

Moreover, our measure associates with tumour grade and ER status in breast cancer and thus the factors driving its prognostic power independently of these variables are unclear. We posited that the prognostic power of our measure, independent of ER status and grade, may be captured by the expression of a small number of genes, analogously to the way the prognostic power of tumour grade was captured by the expression of the 97 gene Sotiriou et al. signature [31].

We then refined this gene set by fitting a Cox proportional hazards model on 5 year censored data using all the identified genes as covariates and deleting genes which were not significantly prognostic independently of others in the gene set. This resulted in a small set of 81 genes, 10 of which were negatively correlated with signalling entropy and 71 of which were positively correlated (S2 Table). A Signalling Entropy prognostic score (SE score) was then defined as the t-statistic evaluating the hypothesis that the 71 positively correlated genes are expressed more highly than the 10 negatively correlated genes (after z-score normalising the data).
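The SE score definition can be sketched in a few lines of Python. This is a hedged illustration rather than the procedure used in this study: it assumes each gene is z-scored across samples and that the t-statistic is Welch's unequal-variance form, neither of which is specified above. The function names, gene labels and values are hypothetical.

```python
import math
from statistics import mean, stdev, variance


def zscore_genes(expr):
    """z-score each gene across samples; expr maps gene -> list of values."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), stdev(values)
        out[gene] = [(v - mu) / sd for v in values]
    return out


def se_score(z, positive, negative, sample):
    """Welch t-statistic comparing the positively correlated genes with
    the negatively correlated genes within one sample (assumed form)."""
    pos = [z[g][sample] for g in positive]
    neg = [z[g][sample] for g in negative]
    std_err = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    return (mean(pos) - mean(neg)) / std_err


# Toy expression matrix: 5 hypothetical genes measured in 3 samples.
expr = {"A": [5.0, 3.0, 1.0], "B": [4.0, 1.0, 1.0], "C": [9.0, 2.0, 1.0],
        "D": [1.0, 2.0, 6.0], "E": [0.0, 3.0, 3.0]}
z = zscore_genes(expr)
for s in range(3):
    print(s, se_score(z, ["A", "B", "C"], ["D", "E"], s))
```

A positive score indicates that the genes positively correlated with signalling entropy are, on average, more highly expressed in that sample than the negatively correlated genes.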

Consequently, the genes utilised to construct our SE score are both correlated with outcome and with signalling entropy and thus should provide a prognostic indicator representative of signalling promiscuity. Criticism of feature selection for prognostic classifiers based on gene sets ranked by correlation with outcome has stemmed from the considerable discordance of such features between data sets [47, 48]. By using signalling entropy to refine the prognostic gene set we found that this gene set instability was reduced. The genes which were both prognostic and correlated with signalling entropy showed more concordance between discovery and validation sets of METABRIC as compared to the genes which were only prognostic. Moreover, this increase in overlap was significantly higher than would be expected by chance (p < 10e-5, based on re-sampling size matched sets of prognostic genes and assessing overlap). To further confirm this increased robustness, we derived a set of genes for constructing an SE score from the METABRIC validation set, using an identical procedure to that performed on the discovery set. This gene list was slightly shorter than for the discovery set (55 genes, 34 positively correlated and 13 negatively correlated with signalling entropy) but had an overlap of 4 genes, significantly more than would be expected by chance (p = 0.012, based on re-sampling size matched sets of prognostic genes and assessing overlap). We provide the lists of prognostic genes both correlated and uncorrelated with signalling entropy as well as the validation set derived SE score genes in S3 Table.
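The re-sampling test for overlap can be sketched as follows. This is one plausible reading of "re-sampling size matched sets of prognostic genes", offered as an assumption: random gene sets of the same sizes as the observed sets are drawn from the discovery and validation pools of prognostic genes, and the empirical p-value is the fraction of draws whose overlap is at least as large as the one observed. The function name, pools and sizes are hypothetical.

```python
import random


def overlap_pvalue(pool_a, pool_b, size_a, size_b, observed,
                   n_resamples=10000, seed=0):
    """Empirical p-value for an overlap of at least `observed` genes.

    pool_a, pool_b : lists of candidate (prognostic) genes per data set
    size_a, size_b : sizes of the gene sets whose overlap was observed
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_resamples):
        a = set(rng.sample(pool_a, size_a))
        b = set(rng.sample(pool_b, size_b))
        if len(a & b) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Toy pools: 500 hypothetical prognostic genes shared by both data sets.
pool = [f"gene{i}" for i in range(500)]
print(overlap_pvalue(pool, pool, size_a=80, size_b=50, observed=15,
                     n_resamples=2000))
```

The add-one correction means the smallest reportable p-value is 1 / (n_resamples + 1), so the number of re-samples bounds the resolution of the test.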

A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma

Signalling entropy is correlated with, yet prognostically independent of, tumour stage in lung adenocarcinoma; we therefore aimed to derive a score that represented the prognostic power of our measure independently of tumour stage. To achieve this we considered the Director’s Challenge data set of 398 lung adenocarcinomas as a discovery set [42]. We performed a procedure analogous to that described above for breast cancer to identify genes associated with signalling entropy’s prognostic power independently of tumour stage in lung cancer, with the only differences being that we adjusted for tumour stage, rather than ER status and grade, and used 3 year censored data rather than 5 year. This resulted in a small set of 29 genes, 8 of which were negatively correlated with signalling entropy and 21 of which were positively correlated (S4 Table). An SE score was then defined again as the t-statistic evaluating the hypothesis that the positively correlated genes are expressed more highly than the negatively correlated genes. We next compared our SE score to a leading gene expression based prognostic indicator for lung adenocarcinoma, the expression of the gene CADM1, which was recently found to be a superior prognostic indicator to many others in the literature [44]. CADM1 expression performed comparably to the SE score in a meta-analysis; however, it was outperformed by pathological tumour stage (CADM1 expression vs stage: p = 0.03). In contrast the SE score performed comparably to tumour stage (SE score vs stage: p = 0.13, Fig. 5B).

We therefore evaluated whether prognostic models which combined either the SE score or CADM1 expression with stage Ia/b status within the stage I sub group outperformed stage Ia/b status alone. We found that the SE score improved over stage Ia/b alone in a meta-analysis across 765 stage I lung adenocarcinomas (SE score+stage vs stage: p = 0.025), whereas CADM1 expression made no improvement over stage Ia/b (CADM1 expression+stage vs stage: p = 0.13, Fig. 5C). Hence it may be argued that the SE score provides a stronger candidate prognostic tool than CADM1 expression for clinical application.

[22]. Similarly to OncotypeDX, however, this score is based on RT-PCR and thus a direct comparison is difficult. A microarray based approximation of the Kratz et al. score was nevertheless found to perform comparably to signalling entropy both across all samples (SE score vs

The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance

Given the power of signalling entropy as a prognostic factor in both breast and lung cancer we next investigated which genes and pathways were associated with signalling entropy’s prognostic impact, independently of other clinical variables.

For lung adenocarcinoma we considered a list of 158 genes identified as prognostic independently of stage, and correlated with signalling entropy, again independently of stage, in both the Director’s Challenge and TCGA data sets. The two gene lists displayed an overlap of 47 genes (S5 Table displays both gene lists). We performed a gene set enrichment analysis, using a Fisher’s Exact test, comparing each of these gene lists separately against the Molecular Signatures Database [50] (S6 Table shows the top 10 enriched gene sets for both gene lists). The decision to use these gene sets for the enrichment screens, rather than the genes utilised to derive the SE scores, was due to them being derived from multiple data sets and thus more robustly representative of signalling entropy’s prognostic associations. We note that gene set enrichment analysis performed on the genes comprising the SE scores gave broadly similar results (S7 Table).
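An enrichment test of this kind can be sketched with the Python standard library. The sketch assumes a one-sided Fisher's Exact test for over-representation, which is equivalent to the upper tail of the hypergeometric distribution; the text does not state whether a one-sided or two-sided test was used, and the gene universe is not specified, so both are assumptions. The function name and the toy sets are hypothetical.

```python
from math import comb


def enrichment_pvalue(query, gene_set, universe_size):
    """One-sided Fisher's Exact p-value for over-representation.

    query         : set of genes of interest (e.g. a prognostic gene list)
    gene_set      : set of genes in the reference set being tested
    universe_size : number of genes in the background; both sets are
                    assumed to be subsets of this background
    """
    k = len(query & gene_set)          # observed overlap
    n, big_k = len(query), len(gene_set)
    total = comb(universe_size, n)     # ways to draw a list of size n
    # P(overlap >= k) under random draws without replacement.
    tail = sum(comb(big_k, i) * comb(universe_size - big_k, n - i)
               for i in range(k, min(big_k, n) + 1))
    return tail / total


# Toy example: 20-gene universe, labelled by integers.
query = {0, 1, 2, 3, 4}
reference = {0, 1, 2, 10, 11}
print(enrichment_pvalue(query, reference, universe_size=20))
```

In a screen against many reference gene sets, as against the Molecular Signatures Database, the resulting p-values would normally require multiple-testing correction before ranking.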

The strongest enrichment was for genes associated with poor survival in lung cancer, histological grade in breast cancer and cell proliferation, supporting the notion that signalling entropy is a prognostic measure of cell anaplasia. In addition, considerable enrichment was found for genes down regulated by the therapeutic agent salirasib and by EGFR inhibitors, as well as for genes up regulated in cell lines resistant to the chemotherapeutic doxorubicin, supporting the hypothesis that signalling entropy associates with therapeutic resistance.

Examples include genes down-regulated by EZH2, a well known stem cell gene involved in the pathogenesis of several cancers and which plays a documented role in both breast and lung CSCs [51–54]. The set of genes down regulated by knockout of CTNNB1, a critical component of the Wnt signalling pathway posited to be important in CSCs and their therapeutic resistance [55], was also enriched. Targets of BMP2 were among the most enriched gene sets in breast but not lung cancer, which is intriguing given the role of this gene specifically in breast CSCs [56]. Enrichment was also found for many gene sets associated with immune system processes. Thus signalling entropy is prognostically related to genes associated with both CSCs and treatment resistance, across multiple malignancies and independently of clinical variables. This result confirms our initial postulate that signalling entropy is a powerful prognostic measure, related both to cell anaplasia and CSCs as well as treatment resistance.

Discussion

The notion of tumour cell plasticity raises further challenges [57], with recent discoveries suggesting that CSCs may arise from the tumour bulk by simple changes [58]. This calls into question the notion that CSCs only ever occupy a small proportion of the tumour, and paints a picture of cancer cells as malleable entities capable of generating considerable heterogeneity. Recent observations have also demonstrated the importance of characterising such intra-tumour heterogeneity in the prognostic assessment of epithelial cancers [17]. The measurement of both CSC abundance and intra-tumour heterogeneity in a clinically relevant manner, however, presents a challenge [59]. The majority of currently suggested approaches are limited in sample size, and require the time consuming collection of large new data sets (such as multiple biopsies from single tumours) for validation and proof of concept.

Importantly, our measure is applicable to the plethora of publicly available bulk tumour, genome wide expression data, facilitating swift validation of its prognostic impact on large data sets. By considering 5360 primary tumour samples, we have demonstrated that our measure is a powerful prognostic indicator in both breast and lung cancer. In breast cancer our measure is prognostic within the grade 2 stratum and both ER positive and negative subtypes. In lung adenocarcinoma, our measure is prognostic within the stage I stratum, outperforming tumour size.

Moreover, it is associated with, yet prognostically independent of, a number of clinical variables in both breast and lung cancer. We thus used feature selection to derive a small set of genes which capture the prognostic power of signalling entropy independently of other clinical variables, thus representing a more readily applicable quantifier of stemness and intra-tumour heterogeneity.

Arguably the most successful application has been to breast cancer, where OncotypeDX and MammaPrint are currently in clinical trials for guiding the management of ER positive breast cancer [26, 27]. Though powerful, these assays are limited to the ER positive subtype and importantly ignore CSC abundance and intra-tumour heterogeneity. There also exist many more sophisticated prognostic signatures for breast cancer, derived from within the DREAM challenge consortium, several of which have demonstrated improvement over MammaPrint or OncotypeDX [61–64]. The aim of our work was first to introduce a prognostic measure of signalling promiscuity, which by approximating CSC abundance and intra-tumour heterogeneity may provide a basis by which to improve the construction of prognostic models for epithelial cancers, and secondly, to compare it to clinically well established or validated signatures such as MammaPrint and OncotypeDX. A direct comparison of signalling entropy to the prognostic indicators from the DREAM challenge, which have not yet entered the clinical setting, is beyond the scope of this work.

Even so, signalling entropy was found to be more robust than MammaPrint across ER+ and ER- breast cancer. Although signalling entropy was not found to outperform existing prognostic markers in lung adenocarcinoma, by using the SE score, derived by signalling entropy guided feature selection, it was possible to outperform existing state of the art prognostic factors such as CADM1 expression across independent data sets. The nature of signalling entropy as a measure of pathway promiscuity, which correlates with CSCs and associates with intra-tumour heterogeneity [8, 9], led us to postulate that it may associate with the phenotypic plasticity of a tumour that enables subversion of therapeutic response. Here we demonstrated that signalling entropy’s prognostic power in epithelial cancers is indeed related to both treatment resistance and CSC pathways. We thus propose signalling entropy as a powerful and readily applicable tool for assessing the prognostic impact of signalling promiscuity across multiple epithelial cancers. In addition to being a strong prognostic factor which outperforms the leading expression based indicators, our measure may also provide insights into intra-tumour heterogeneity, treatment resistance and CSC mechanisms.

Materials and Methods

Details of data sets used, the interaction network and all statistical methods can be found in the S1 Text.

Signalling Entropy

Briefly, each sample is first integrated with a Protein Interaction Network (PIN) (see S1 Text) to create a sample specific stochastic matrix, $P = (p_{ij})$. By integrating each sample with the PIN, rather than considering a complete network in which every protein pair can directly interact, we benefit both from a reduction in computational complexity and an improved biological relevance from a focus on direct interactions. Integration with the PIN filters out indirect interactions even if strong correlations are present, making our analysis robust to confounding effects. By using each sample to weigh the PIN we are also reducing the noise present in the network by providing it with a sample-specific biological context. The $i$th row of $P$ defines a probability distribution describing the rates of reaction of protein $i$ with each of its neighbours in the PIN. These distributions are constructed by appealing to a simplified version of the mass action principle, namely that the rate of a reaction is proportional to the product of the active masses of the reagents involved. We assume that log normalised gene expression is a rough proxy for protein concentration and thus compute $P$ as follows:

$$p_{ij} = \frac{x_i x_j}{\sum_{k \in N(i)} x_i x_k} = \frac{x_j}{\sum_{k \in N(i)} x_k} \quad \text{if } j \in N(i), \qquad p_{ij} = 0 \quad \text{otherwise},$$

where $x_i$ denotes the log normalised expression of the gene encoding protein $i$ and $N(i)$ denotes the set of neighbours of protein $i$ in the PIN.

We note that from this definition $\sum_j p_{ij} = 1$ for all $i$, i.e., $P$ is row stochastic, and the $i$th row corresponds to the weighted interaction distribution of protein $i$ in the given sample. We note that not all proteins in the PIN have a corresponding probe in the microarray or sequence in the RNA-seq data; consequently the PIN we consider is the maximally connected component of the original PIN after the removal of missing proteins. For each protein $i$ we then define the local entropy of its interaction distribution, $S_i$, which quantifies the promiscuity of its signalling within the sample:

$$S_i = -\sum_{j \in N(i)} p_{ij} \log p_{ij},$$

where the sum runs over the neighbours $N(i)$ of protein $i$ in the PIN. Signalling entropy is a global measure of signalling promiscuity in a given sample and thus is computed from the entire stochastic matrix $p_{ij}$ as the entropy rate, $S_R$, of the stochastic process described by $p_{ij}$:

$$S_R = \sum_i \pi_i S_i,$$

where $\pi_i$ denotes the stationary distribution of the stochastic matrix, satisfying $\sum_i \pi_i p_{ij} = \pi_j$. We note that $\pi$ is therefore the non-degenerate eigenvector of $P$ corresponding to the eigenvalue 1 and that by the Perron Frobenius theorem, the existence of $\pi$ requires that the matrix $P$ be irreducible; this is guaranteed by the fact that the PIN considered is connected and non-bipartite [65]. The maximum entropy rate of a weighted network, $M_R$, depends solely upon its adjacency matrix, $A = (A_{ij})$, and can be calculated as the entropy rate of the stochastic matrix $p_{ij} = A_{ij} v_j / (\lambda v_i)$, where $\lambda$ and $v$ are the dominant eigenvalue and corresponding eigenvector of $A$, respectively [66]. In order to ensure the results presented in this paper are comparable with those of previous studies on signalling entropy, we will present our findings in terms of normalised signalling entropy:

$$\frac{S_R}{M_R}.$$

A closed form expression for signalling entropy is derived and analysed in the S1 Text. R-scripts for the computation of signalling entropy are freely available for download at sourceforge.net/projects/signalentropy.
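The computation described in this section can be sketched in Python for a toy network. This is an illustrative sketch, not the R implementation referred to above. It assumes transition probabilities proportional to neighbour expression (the mass-action weighting), obtains the stationary distribution from detailed balance, which holds because the weights $x_i x_j A_{ij}$ are symmetric, and uses the identity that the maximum entropy rate equals the logarithm of the dominant eigenvalue of the adjacency matrix. The function name, toy network and expression values are hypothetical.

```python
import math


def signalling_entropy(adj, expr, iters=1000):
    """Normalised entropy rate S_R / M_R for one sample.

    adj  : symmetric 0/1 adjacency matrix (list of lists) of a connected,
           non-bipartite network
    expr : positive (log normalised) expression value per node
    """
    n = len(adj)
    # Row sums over neighbours: (A x)_i.
    ax = [sum(adj[i][k] * expr[k] for k in range(n)) for i in range(n)]
    # Mass-action transition matrix: p_ij = x_j / (A x)_i for neighbours j.
    p = [[adj[i][j] * expr[j] / ax[i] for j in range(n)] for i in range(n)]
    # Local entropies S_i = -sum_j p_ij log p_ij.
    local = [-sum(q * math.log(q) for q in row if q > 0) for row in p]
    # Stationary distribution from detailed balance: pi_i ~ x_i * (A x)_i.
    weights = [expr[i] * ax[i] for i in range(n)]
    total = sum(weights)
    rate = sum(weights[i] / total * local[i] for i in range(n))
    # Dominant eigenvalue of A by power iteration; M_R = log(lambda).
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)                 # v is scaled so its largest entry is 1
        v = [wi / lam for wi in w]
    return rate / math.log(lam)


# Toy network: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy_adj = [[0, 1, 1, 0],
           [1, 0, 1, 0],
           [1, 1, 0, 1],
           [0, 0, 1, 0]]
print(signalling_entropy(toy_adj, [1.0, 2.0, 3.0, 4.0]))
```

On a genome-scale PIN a sparse matrix representation would be needed; the dense lists here are only for readability.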

Super-additivity and heterogeneity

Here we show that if signalling entropy is super-additive then the hypothesis is correct. Let us first define some preliminaries. Let $x_i \in \mathbb{R}_{>0}$ be the expression of gene $i$ in cell type $X$, and denote the vector containing all such variables by $x = (x_i)_{i=1}^{n} \in \Omega \subset \mathbb{R}^n$, where $\Omega$ is some bounded domain. In our analysis $x$ will represent the vector of log-normalised gene expression values for a homogeneous sample; as the expression of genes cannot be infinite, we bound $x$ within a finite domain $\Omega$ of biologically admissible expression regimes. Our hypothesis on signalling entropy thus amounts to proving the following proposition:

Proposition. Let $x, y \in \Omega$, then

It is clear that if the claim is true then the proposition must be true. Notice first that if the claim is true then, as it is a strict bound, $\exists\, \epsilon > 0$ such that $S_R > S_R^{(x)} + S_R^{(y)} + \epsilon$. Whence

and thus the proposition is true. Thus if signalling entropy is super-additive over homogeneous cell types, this implies that signalling entropy will on average be elevated in heterogeneous mixtures of cell types. These propositions are examined in detail in S1 Text.
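The on-average statement can be probed empirically on synthetic data. The sketch below is a hypothetical illustration, not the analysis of the S1 Text. It assumes that a mixture of two cell types can be represented by the element-wise mean of their expression vectors, uses randomly generated profiles on a small made-up network, and compares the median entropy rate of pure and mixed samples. It works with the unnormalised entropy rate, since the normalising constant depends only on the network and is shared by all samples, and it uses the closed-form stationary distribution $\pi_i \propto x_i \sum_{k \in N(i)} x_k$, which follows from detailed balance for the assumed weights $p_{ij} = x_j / \sum_{k \in N(i)} x_k$.

```python
import itertools
import math
import random
import statistics

def entropy_rate(A, x):
    """Entropy rate of the walk p_ij = x_j / sum_{k in N(i)} x_k on adjacency A."""
    n = len(A)
    # Total expression of the neighbours of each protein
    strength = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
    # Detailed balance: pi_i is proportional to x_i * strength_i
    pi = [x[i] * strength[i] for i in range(n)]
    total = sum(pi)
    rate = 0.0
    for i in range(n):
        for j in range(n):
            if A[i][j]:
                p = x[j] / strength[i]
                rate -= (pi[i] / total) * p * math.log(p)
    return rate

# Hypothetical network: 12 proteins on a ring, each linked to its two nearest
# neighbours on either side (connected and non-bipartite).
n = 12
A = [[1 if (i - j) % n in (1, 2, n - 1, n - 2) else 0 for j in range(n)]
     for i in range(n)]

# 33 made-up "tissue" profiles, giving 33 * 32 / 2 = 528 pairwise mixtures.
rng = random.Random(0)
tissues = [[rng.uniform(1.0, 10.0) for _ in range(n)] for _ in range(33)]

pure = [entropy_rate(A, t) for t in tissues]
mixed = [entropy_rate(A, [(a + b) / 2.0 for a, b in zip(s, t)])
         for s, t in itertools.combinations(tissues, 2)]

print("pairs:", len(mixed))
print("median pure: ", statistics.median(pure))
print("median mixed:", statistics.median(mixed))
```

Because $p_{ij}$ is unchanged when $x$ is multiplied by a constant, representing the mixture by the sum rather than the mean of the two profiles gives the same value.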

Supporting Information

S1 Text. This document contains supplementary materials and methods and supplementary results to complement the manuscript.

GEO and ArrayExpress accession numbers for the breast cancer and lung adenocarcinoma data sets. Sample counts are provided for ER segregated and grade 2 samples in the case of breast cancer, and also for stage I samples in the case of lung adenocarcinoma. (XLSX)

The genes utilised to construct the signalling entropy prognostic score in breast cancer, derived from the METABRIC discovery set. Genes are separated into those found to positively correlate with signalling entropy and those negatively correlated.

The genes utilised to construct the signalling entropy prognostic score derived from the METABRIC validation set. Genes are separated into those found to positively correlate with signalling entropy and those negatively correlated; the genes which overlap with the set derived from the discovery set are highlighted in yellow. Also presented are genes which are prognostic independently of ER status and grade in both discovery and validation sets of METABRIC (middle table). Genes which are both prognostic and correlated with signalling entropy, independently of ER status and grade, in both discovery and validation sets of METABRIC are presented as the rightmost table.

The genes utilised to construct the signalling entropy prognostic score in lung adenocarcinoma. Genes are separated into those found to positively correlate with signalling entropy and those negatively correlated. (XLSX)

Genes utilised in the gene set enrichment analysis to identify gene sets associated with signalling entropy’s prognostic power in breast and lung cancer. In the case of breast cancer these are prognostic genes which correlate with signalling entropy independently of ER status and grade and whose prognostic power is also independent of these variables, in both METABRIC data sets. In the case of lung cancer, these are prognostic genes which are correlated with signalling entropy independently of tumour stage and whose prognostic power is also independent of stage, in both the TCGA and Director’s Challenge lung adenocarcinoma data sets. Genes are separated into those found to positively correlate with signalling entropy and

Gene set enrichment analysis results displaying the top 10 most significant enriched gene sets associated with signalling entropy’s prognostic power in breast and lung cancer. Tables display results for the gene set enrichment analysis performed on gene lists identified in lung and breast cancer separately, both with and without the intersection of the

S7 Table. Gene set enrichment analysis results displaying the top 10 most significant enriched gene sets associated with the genes utilised to construct the SE score in both breast cancer and lung adenocarcinoma.

and Sotiriou et al. tumour grade signatures. The p-values denote the significance of the Pearson

S2 Fig. Signalling entropy outperforms the Ben-Porath ES cell signature in measuring tumour grade. A) Signalling entropy is associated with histological tumour grade. B) Unlike signalling entropy, the Ben-Porath et al. signature cannot discriminate between grade 1 and grade 2 breast cancers in the METABRIC discovery data set. All p-values are derived from

S3 Fig. Prognostic associations of random gene expression signatures in METABRIC.

in each METABRIC data set; p-values denote the significance of a Cox regression for each random signature as assessed by a Wald test. We see that only KRISHNAN2007DEFEAT is significantly prognostic in both METABRIC data sets. B) Kaplan-Meier plots for 5-year censored survival data are presented for the KRISHNAN2007DEFEAT expression signature in each METABRIC data set, divided into ER+ and ER- samples; p-values denote the significance of a Cox regression for each random signature as assessed by a Wald test. We see that the random signature is not prognostic within ER subtypes.

S4 Fig. Meta-analysis comparison of signalling entropy with OncotypeDX. The plots display the concordance indices for signalling entropy and a microarray-based approximation of OncotypeDX in each data set alongside 95% confidence intervals. The overall concordance indices were derived via meta-analysis using a random effects model. The vertical line denotes concordance index = 0.5; data sets where the confidence interval for the concordance index crosses this line did not reach significance. Meta-analysis across 10 data sets reveals that signalling entropy performs comparably to OncotypeDX across (A) ER positive samples and (B) ER negative samples.

A) Signalling entropy is correlated with the Ben-Porath et al. tumour grade signature; the p-value denotes the significance of the Pearson correlation coefficient. B) Signalling entropy is associated with histological tumour grade; p-values are derived from Wilcoxon tests.

p-values are derived from Wilcoxon tests.

Meta-analysis comparison of signalling entropy with the score of Kratz et al. A) The plots display the concordance indices for signalling entropy and a microarray-based approximation of the Kratz et al. score in each data set alongside 95% confidence intervals. The overall concordance indices were derived via meta-analysis using a random effects model. The vertical line denotes concordance index = 0.5; data sets where the confidence interval for the concordance index crosses this line did not reach significance. Meta-analysis across 6 data sets reveals that signalling entropy performs comparably to the Kratz et al. score across all samples. B) The plots display the concordance indices for signalling entropy and the Kratz et al. score combined with stage Ia/b status for stage I samples in each data set alongside its 95% confidence interval. Meta-analysis across 6 data sets reveals that signalling entropy performs compa-

S9 Fig. Demonstration that the claim $S_R > S_R^{(x)} + S_R^{(y)}$ is correct for all pairwise combi-

S10 Fig. Signalling entropy of homogeneous and mixed tissues. The first box represents the signalling entropy distribution of 33 unmixed tissues, whilst each subsequent labelled box represents the signalling entropy distribution of the labelled tissue mixed with each of the remaining 32 tissues. The red line represents the median of the unmixed samples. We see that for 20/33 tissue types, the median of the mixture is greater than the median of the pure samples, suggesting that on average the signalling entropy of the mixture is greater than the signalling entropy of the pure sample.

S11 Fig. Demonstration that the claim, $\iint_{\Omega} (S_R - \dots)\, dx\, dy > 0$ is correct for samples

Acknowledgments

The authors would like to thank Peter Sollich and Reimer Kuehn for helpful discussions on the theoretical aspects of signalling entropy.

Author Contributions

Analyzed the data: CRSB AET. Contributed reagents/materials/analysis tools: SS CC. Wrote the paper: CRSB AET SS CC.

Topics

breast cancer

Appears in 32 sentences as: Breast Cancer (1) breast cancer (33) breast cancers (2)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. By analysing 3668 breast cancer and 1692 lung adenocarcinoma samples, we further demonstrate that signalling entropy correlates negatively with survival, outperforming leading clinical gene expression based prognostic tools.
    Page 1, “Abstract”
  2. Signalling entropy is found to be a general prognostic measure, valid in different breast cancer clinical subgroups, as well as within stage I lung adenocarcinoma.
    Page 1, “Abstract”
  3. We here compute signalling entropy for a total of 5360 tumour samples, focusing on two highly heterogeneous cancers, non-small cell lung cancer (NSCLC) and breast cancer , which constitute the two leading causes of cancer death worldwide [21].
    Page 3, “Introduction”
  4. In breast cancer, the power of gene eXpression based prognostic indicators, such as OncotypeDX and MammaPrint [24, 25], is highly subtype dependent [26, 27] and a clinical breast cancer prognostic signature, which is independent of estrogen receptor (ER) status is lacking.
    Page 3, “Introduction”
  5. By examining gene eXpression profiles of over 3500 primary breast cancers and 1300 lung adenocarcinomas, we here demonstrate that signalling entropy is prognostic in breast cancer , regardless of ER status, and in lung adenocarcinomas, within the stage I stratum.
    Page 3, “Introduction”
  6. Signalling entropy is prognostic in the major subtypes of breast cancer
    Page 5, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  7. In order to assess the prognostic significance of signalling entropy in breast cancer, we first computed its value for each microarray sample of the Molecular Taxonomy of Breast Cancer International Research Consortium dataset (METABRIC) [30], a total of 1980 samples divided into a discovery and validation sets of equal proportion.
    Page 5, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  8. Using outcome first as a binary phenotype, we observed that patients who died of breast cancer had a higher signalling entropy than patients who were alive at last follow up, a result which was seen in both METABRIC subsets (p < 1e — 7).
    Page 5, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  9. Using a Cox proportional hazards model, on 5 year censored survival data, we ascertained that high signalling entropy is associated with increased risk of death in breast cancer (c-index = 0.6, p < 1.1e — 6).
    Page 5, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  10. described prognostic associations for a number of random gene expression signatures in breast cancer [32].
    Page 5, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  11. To further validate the prognostic impact of signalling entropy we considered eight further independent breast cancer data sets.
    Page 6, “Signalling entropy is prognostic in the major subtypes of breast cancer”

See all papers in March 2015 that mention breast cancer.

See all papers in PLOS Comp. Biol. that mention breast cancer.

Back to top.

adenocarcinoma

Appears in 20 sentences as: adenocarcinoma (20)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. By analysing 3668 breast cancer and 1692 lung adenocarcinoma samples, we further demonstrate that signalling entropy correlates negatively with survival, outperforming leading clinical gene expression based prognostic tools.
    Page 1, “Abstract”
  2. Signalling entropy is found to be a general prognostic measure, valid in different breast cancer clinical subgroups, as well as within stage I lung adenocarcinoma .
    Page 1, “Abstract”
  3. Signalling entropy is prognostic in stage I lung adenocarcinoma
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  4. We next investigated the prognostic power of our measure in lung adenocarcinoma .
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  5. To evaluate the clinical associations of our measure we first computed signalling entropy for each mi-croarray sample in The Director’s Challenge dataset profiling 398 tumours [42], and for the 455 lung adenocarcinoma RNA-seq tumour samples downloaded from The Cancer Genome Atlas (TCGA) database (http://cancergenome.nih.gov/).
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  6. We found that signalling entropy was significantly lower in lung adenocarcinoma patients who were alive at last follow up as opposed to those who had died (p < 0.03).
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  7. Early stage lung adenocarcinoma suffers from a high relapse rate and it is important to establish more robust prognostic assessments in the stage I subgroup for chemotherapeutic treatment stratification [22].
    Page 8, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  8. A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  9. We next investigated whether a similar SE score could be computed for lung adenocarcinoma .
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  10. Signalling entropy is correlated with, yet prognostically independent of tumour stage in lung adenocarcinoma , we therefore aimed to derive a score that represented the prognostic power of our measure independently of tumour stage.
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  11. We next compared our SE score to a leading gene expression based prognostic indicator for lung adenocarcinoma , the expression of the gene CADM 1, which was recently found to be a superior prognostic indicator to many others in the literature [44].
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”

See all papers in March 2015 that mention adenocarcinoma.

See all papers in PLOS Comp. Biol. that mention adenocarcinoma.

Back to top.

gene expression

Appears in 19 sentences as: gene eXpression (8) gene expression (11)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. Here we propose signalling entropy, a measure of signalling pathway promiscuity derived from a sample’s genome-wide gene expression profile, as an estimate of the stemness of a tumour sample.
    Page 1, “Abstract”
  2. By analysing 3668 breast cancer and 1692 lung adenocarcinoma samples, we further demonstrate that signalling entropy correlates negatively with survival, outperforming leading clinical gene expression based prognostic tools.
    Page 1, “Abstract”
  3. Although putative CSCs have been identified by surface marker eXpression for several malignancies, isolated, and demonstrated to be chemotherapeutic resistant [7—11], it remains a significant challenge to obtain a prognostic measure of their abundance from tumour bulk gene eXpression profiles across multiple malignancies.
    Page 2, “Introduction”
  4. Embryonic Stem (ES) cell gene eXpression signatures are clear candidates for such a measure and indeed have been demonstrated to be prognostic in breast and lung cancer [12—15].
    Page 2, “Introduction”
  5. Specifically, we consider signalling entropy which is computed from the integration of a sample’s genome-wide gene eXpression profile with an interactome, and provides an overall measure of the signalling promiscuity in the sample [16].
    Page 2, “Introduction”
  6. Importantly, because signalling entropy can be computed from a bulk tumour gene eXpression profile, it allows us to assess the prognostic significance of our measure in large numbers of clinical specimens.
    Page 3, “Introduction”
  7. In breast cancer, the power of gene eXpression based prognostic indicators, such as OncotypeDX and MammaPrint [24, 25], is highly subtype dependent [26, 27] and a clinical breast cancer prognostic signature, which is independent of estrogen receptor (ER) status is lacking.
    Page 3, “Introduction”
  8. Most importantly, current gene eXpression based prognostic indicators ignore CSC contributions and intra-tumour heterogeneity [17].
    Page 3, “Introduction”
  9. By examining gene eXpression profiles of over 3500 primary breast cancers and 1300 lung adenocarcinomas, we here demonstrate that signalling entropy is prognostic in breast cancer, regardless of ER status, and in lung adenocarcinomas, within the stage I stratum.
    Page 3, “Introduction”
  10. Signalling entropy is derived from the integration of a sample’s gene expression profile with a human protein interactome, and provides a rough proxy for the overall level of signalling promiscuity in the sample.
    Page 3, “Rationale of signalling entropy as a prognostic measure”
  11. We thus derived a condition for point-wise super-additivity of our measure and then considered a data set of gene expression profiles for 33 distinct adult tissues, representing 528 possible pairwise mixtures [29].
    Page 5, “Rationale of signalling entropy as a prognostic measure”

See all papers in March 2015 that mention gene expression.

See all papers in PLOS Comp. Biol. that mention gene expression.

Back to top.

eXpression profiles

Appears in 14 sentences as: eXpression profile (2) expression profile (3) eXpression profiles (5) expression profiles (4)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. Here we propose signalling entropy, a measure of signalling pathway promiscuity derived from a sample’s genome-wide gene expression profile , as an estimate of the stemness of a tumour sample.
    Page 1, “Abstract”
  2. By considering over 500 mixtures of diverse cellular expression profiles , we reveal that signalling entropy also associates with infra-tumour heterogeneity.
    Page 1, “Abstract”
  3. Although putative CSCs have been identified by surface marker eXpression for several malignancies, isolated, and demonstrated to be chemotherapeutic resistant [7—11], it remains a significant challenge to obtain a prognostic measure of their abundance from tumour bulk gene eXpression profiles across multiple malignancies.
    Page 2, “Introduction”
  4. Specifically, we consider signalling entropy which is computed from the integration of a sample’s genome-wide gene eXpression profile with an interactome, and provides an overall measure of the signalling promiscuity in the sample [16].
    Page 2, “Introduction”
  5. We derived a sufficient condition on the eXpression profiles of homogeneous cell populations for signalling entropy to be a measure of intra-sample heterogeneity on average.
    Page 3, “Introduction”
  6. We subsequently verified that this condition is satisfied by considering 33 distinct adult tissue eXpression profiles corresponding to 528 pairwise mixtures.
    Page 3, “Introduction”
  7. Importantly, because signalling entropy can be computed from a bulk tumour gene eXpression profile , it allows us to assess the prognostic significance of our measure in large numbers of clinical specimens.
    Page 3, “Introduction”
  8. By examining gene eXpression profiles of over 3500 primary breast cancers and 1300 lung adenocarcinomas, we here demonstrate that signalling entropy is prognostic in breast cancer, regardless of ER status, and in lung adenocarcinomas, within the stage I stratum.
    Page 3, “Introduction”
  9. Signalling entropy is derived from the integration of a sample’s gene expression profile with a human protein interactome, and provides a rough proxy for the overall level of signalling promiscuity in the sample.
    Page 3, “Rationale of signalling entropy as a prognostic measure”
  10. It is clear that if cell type x has an expression profile that maximises signalling entropy and cell type y does not, then the signalling entropy of the mixture Will be lower than the signalling entropy of x, thus signalling entropy is not a point-wise measure of heterogeneity.
    Page 3, “Rationale of signalling entropy as a prognostic measure”
  11. However, as most biologically realistic cell types have distinct expression profiles , corresponding to the existence of non-overlapping active pathways between cell type pairs [29] , we posited that the signalling entropy of a mixed sample may be higher than that of a homogeneous sample on average.
    Page 5, “Rationale of signalling entropy as a prognostic measure”

See all papers in March 2015 that mention eXpression profiles.

See all papers in PLOS Comp. Biol. that mention eXpression profiles.

Back to top.

gene sets

Appears in 14 sentences as: Gene set (2) gene set (8) gene sets (10)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. We then refined this gene set by fitting a Cox proportional hazards model on 5 year censored data using all the identified genes as covariates and deleting genes which were not significantly prognostic independently of others in the gene set .
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  2. Criticism of feature selection for prognostic classifiers based on gene sets ranked by correlation with outcome has stemmed from the considerable discordance of such features between data sets [47, 48].
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  3. By using signalling entropy to refine the prognostic gene set we found that this gene set instability was reduced.
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  4. To determine which gene sets were enriched among the genes prognostically related to signalling entropy independently of other variables, we considered for breast cancer a list of 320 genes which were prognostic, independent of ER status and grade, and correlated with signalling entropy, again independently of ER status and grade, in both MEATBRIC datasets.
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  5. We performed a gene set enrichment analysis, using a Fisher’s Exact test, comparing each of these gene lists separately against the Molecular Signatures Database [50] (S6 Table shows the top 10 enriched gene sets for both gene lists).
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  6. The decision to use these gene sets for the enrichment screens, rather than the genes utilised to derive the SE scores was due to them being derived from multiple data sets and thus more robustly representative of signalling entropy’s prognostic associations.
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  7. We note that gene set enrichment analysis performed on the genes comprising the SE scores gave broadly similar results (S7 Table).
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  8. Enrichment was also found for gene sets associated with stem cells and certain CSC pathways.
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  9. Targets of BMP2 were among the most enriched gene sets in breast but not lung cancer, which is intriguing given the role of this gene specifically in breast CSCs [56].
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  10. Enrichment was also found for many gene sets associated with immune system processes.
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  11. Genes utilised in the gene set enrichment analysis to identify gene sets associated with signalling entropy’s prognostic power in breast and lung cancer.
    Page 17, “Supporting Information”

See all papers in March 2015 that mention gene sets.

See all papers in PLOS Comp. Biol. that mention gene sets.

Back to top.

stem cells

Appears in 13 sentences as: Stem Cell (1) stem cell (4) Stem Cells (1) stem cells (9)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. The cancer stem cell hypothesis, that a small population of tumour cells are responsible for tumorigenesis and cancer progression, is becoming widely accepted and recent evidence has suggested a prognostic and predictive role for such cells.
    Page 1, “Abstract”
  2. Infra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, is related to cancer stem cells and is also considered a potential prognostic indicator in oncology.
    Page 1, “Abstract”
  3. The measurement of cancer stem cell abundance and infra-tumour heterogeneity in a clinically relevant manner however, currently presents a challenge.
    Page 1, “Abstract”
  4. We find that its prognostic power is driven by genes involved in cancer stem cells and treatment resistance.
    Page 1, “Abstract”
  5. The Cancer Stem Cell (CSC) hypothesis, the idea that a small population of tumour cells have the capacity to seed and grow the tumour, and intra-tumour heterogeneity, the diversity of the cancer cell population Within the tumour of an individual patient, have long been considered the basis of potential prognostic indicators in oncology.
    Page 1, “Author Summary”
  6. Over recent years considerable evidence has arisen supporting the hypothesis that some cancers are hierarchically organised, akin to the organisation of healthy cells, with a small population of Cancer Stem Cells (CSCs) driving a heterogeneous, hierarchical structure [1, 2].
    Page 2, “Introduction”
  7. Indeed, we showed that human embryonic stem cells and induced pluripotent stem cells eXhibited the highest levels of signalling entropy, with adult stem cells (e.g.
    Page 2, “Introduction”
  8. hematopoietic stem cells ) showing significantly lower values, and terminally differentiated cells eXhibiting the lowest entropy values within a lineage [16].
    Page 2, “Introduction”
  9. As shown by us previously, stem cells have a high signalling entropy which decreases during differentiation, a result not forthcoming using other molecular entropy measures [16, 28].
    Page 3, “Rationale of signalling entropy as a prognostic measure”
  10. Thus, given a homogeneous cell population, a high signalling entropy suggests that signalling within each cell is very promiscuous and that the cells may therefore have a plastic stem cell like phenotype.
    Page 3, “Rationale of signalling entropy as a prognostic measure”
  11. The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”

See all papers in March 2015 that mention stem cells.

See all papers in PLOS Comp. Biol. that mention stem cells.

Back to top.

Meta-analysis

Appears in 12 sentences as: Meta-analysis (6) meta-analysis (6)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. Meta-analysis revealed that signalling entropy is prognostic across both ER positive and ER negative samples (ER positive: c-indeX = 0.63, 95% CI = (0.604, 0.657),p = 8.5e — 15, ER negative: c-indeX = 0.57, 95% CI = (0.538, 0.602), p = 0.032, Fig.
    Page 6, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  2. In a meta-analysis over the 10 breast cancer validation sets we found that unlike signalling entropy Mam-maPrint was not significantly prognostic over ER negative samples (Fig.
    Page 6, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  3. Sub-staging by size is currently the standard clinical approach to stratify stage I tumours, however, on meta-analysis we found that this stratification, unlike signalling entropy was not significantly prognostic over the stage I stratum (Fig.
    Page 8, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  4. CADMI expression performed comparably to the SE score in a meta-analysis , however, it was outperformed by pathological tumour stage (CADMI expression vs stage: p = 0.03).
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  5. We found that the SE score improved over stage Ia/b alone in a meta-analysis across 765 stage I lung ade-nocarcinomas (SE score+stage vs stage: p = 0.025), whereas CADMI expression made no improvement over stage Ia/b (CADMI expression+stage vs stage: p = 0.13, Fig.
    Page 11, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  6. Meta-analysis comparison of signalling entropy with OncotypeDX.
    Page 18, “Supporting Information”
  7. The overall concordance indices were derived via meta-analysis using a random effects model.
    Page 18, “Supporting Information”
  8. Meta-analysis across 10 data sets reveals that signalling entropy performs comparably to OncotypeDX across (A) ER positive samples and (B) ER negative samples.
    Page 18, “Supporting Information”
  9. Meta-analysis comparison of signalling entropy with the score of Kratz et al.
    Page 19, “Supporting Information”
  10. The overall concordance indices were derived via meta-analysis using a random effects model.
    Page 19, “Supporting Information”
  11. Meta-analysis across 6 data sets reveals that signalling entropy performs comparably to the Kratz et al.
    Page 19, “Supporting Information”

See all papers in March 2015 that mention Meta-analysis.

See all papers in PLOS Comp. Biol. that mention Meta-analysis.

Back to top.

microarray

Appears in 10 sentences as: microarray (9) microarrays (1)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. In order to assess the prognostic significance of signalling entropy in breast cancer, we first computed its value for each microarray sample of the Molecular Taxonomy of Breast Cancer International Research Consortium dataset (METABRIC) [30], a total of 1980 samples divided into a discovery and validation sets of equal proportion.
    Page 5, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  2. These results are in contrast to the performance of MammaPrint, a microarray based breast cancer prognostic signature currently being assessed in the MINDACT trial [41].
    Page 6, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  3. Due to differences in the normalisation between RT-PCR and microarrays , a direct comparison between our measure and OncotypeDX is difficult to perform.
    Page 6, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  4. However, using a microarray version of OncotypeDX, we found that it performed comparably to signalling entropy across both ER positive (signalling entropy vs. OncotypeDX: p = 0.13) and ER negative samples (signalling entropy vs. OncotypeDX: p = 0.7, S4 Fig).
    Page 6, “Signalling entropy is prognostic in the major subtypes of breast cancer”
  5. It is of particular note that signalling entropy is significantly prognostic if computed from either microarray or RNA-seq data sets; this result attests to the biological relevance of our measure, which is not masked by experimental technique.
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  6. A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  7. However, a microarray based approximation of the Kratz et al.
    Page 12, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  8. We note that not all proteins in the PIN have a corresponding probe in the microarray or sequence in the RNA-seq data; consequently, the PIN we consider is the maximally connected component of the original PIN after the removal of missing proteins.
    Page 15, “Signalling Entropy”
  9. The plots display the concordance indices for signalling entropy and a microarray based approximation of OncotypeDX in each data set alongside 95% confidence intervals.
    Page 18, “Supporting Information”
  10. A) The plots display the concordance indices for signalling entropy and a microarray based approximation of the Kratz et al.
    Page 19, “Supporting Information”

positively correlated

Appears in 9 sentences as: positively correlate (4) positively correlated (5)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. This resulted in a small set of 81 genes, 10 of which were negatively correlated with signalling entropy and 71 of which were positively correlated (S2 Table).
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  2. A Signalling Entropy prognostic score (SE score) was then defined as the t-statistic evaluating the hypothesis that the 71 positively correlated genes are expressed more highly than the 10 negatively correlated genes (after z-score normalising the data).
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  3. This gene list was slightly shorter than for the discovery set (55 genes, 34 positively correlated and 13 negatively correlated with signalling entropy) but had an overlap of 4 genes, significantly more than would be expected by chance (p = 0.012, based on re-sampling size matched sets of prognostic genes and assessing overlap).
    Page 10, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  4. This resulted in a small set of 29 genes, 8 of which were negatively correlated with signalling entropy and 21 of which were positively correlated (S4 Table).
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  5. An SE score was then defined again as the t-statistic evaluating the hypothesis that the positively correlated genes are expressed more highly than the negative.
    Page 10, “A signalling entropy derived prognostic score outperforms microarray based prognostic indicators in lung adenocarcinoma”
  6. Genes are separated into those found to positively correlate with signalling entropy and those negatively correlated.
    Page 17, “Supporting Information”
  7. Genes are separated into those found to positively correlate with signalling entropy and those negatively correlated; the genes which overlap with the set derived from the discovery set are highlighted in yellow.
    Page 17, “Supporting Information”
  8. Genes are separated into those found to positively correlate with signalling entropy and those negatively correlated.
    Page 17, “Supporting Information”
  9. Genes are separated into those found to positively correlate with signalling entropy and
    Page 18, “Supporting Information”
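
As a rough illustration of the SE score described in sentences 2 and 5 above, the following sketch z-scores each gene across samples and then, for each sample, computes a two-sample t-statistic comparing the positively correlated genes with the negatively correlated ones. The quoted text does not say which t-test variant was used; Welch's unequal-variance form is assumed here. The gene names, the toy expression values, and the function names `zscore_rows` and `se_score` are hypothetical.

```python
import math
from statistics import mean, stdev, variance


def zscore_rows(expr):
    """z-score each gene (one row per gene) across samples."""
    out = {}
    for gene, values in expr.items():
        m, s = mean(values), stdev(values)
        out[gene] = [(v - m) / s for v in values]
    return out


def se_score(expr, pos_genes, neg_genes):
    """Per-sample Welch t-statistic: positive genes versus negative genes.

    Assumption: Welch's form of the t-statistic; the source only says
    "the t-statistic". Each gene group needs at least two genes.
    """
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    scores = []
    for j in range(n_samples):
        pos = [z[g][j] for g in pos_genes]
        neg = [z[g][j] for g in neg_genes]
        std_err = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
        scores.append((mean(pos) - mean(neg)) / std_err)
    return scores


# Toy data (hypothetical): 5 genes measured in 3 samples.
expr = {
    "P1": [1.0, 2.0, 3.0],
    "P2": [2.0, 4.0, 7.0],
    "P3": [0.5, 1.0, 2.5],
    "N1": [5.0, 3.0, 1.0],
    "N2": [4.0, 3.5, 1.0],
}
scores = se_score(expr, ["P1", "P2", "P3"], ["N1", "N2"])
print(scores)
```

In this toy input the third sample expresses the positive genes highly and the negative genes weakly, so it receives a positive score, while the first sample shows the opposite pattern and scores negative.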

cell population

Appears in 6 sentences as: cell population (5) cell populations (2)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. Intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, is related to cancer stem cells and is also considered a potential prognostic indicator in oncology.
    Page 1, “Abstract”
  2. The Cancer Stem Cell (CSC) hypothesis, the idea that a small population of tumour cells have the capacity to seed and grow the tumour, and intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, have long been considered the basis of potential prognostic indicators in oncology.
    Page 1, “Author Summary”
  3. These results were derived mostly from cell-lines, which are characterised by relatively homogeneous cell populations, and were further validated in time-course differentiation experiments [16].
    Page 2, “Introduction”
  4. In addition to quantifying stemness of the signalling regime of a homogeneous cell population, signalling entropy, if computed over a heterogeneous cell population, should also quantify the intercellular diversity in pathway activation.
    Page 2, “Introduction”
  5. We derived a sufficient condition on the expression profiles of homogeneous cell populations for signalling entropy to be a measure of intra-sample heterogeneity on average.
    Page 3, “Introduction”
  6. Thus, given a homogeneous cell population, a high signalling entropy suggests that signalling within each cell is very promiscuous and that the cells may therefore have a plastic stem cell like phenotype.
    Page 3, “Rationale of signalling entropy as a prognostic measure”

enrichment analysis

Appears in 6 sentences as: enrichment analysis (6)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. We performed a gene set enrichment analysis, using a Fisher’s Exact test, comparing each of these gene lists separately against the Molecular Signatures Database [50] (S6 Table shows the top 10 enriched gene sets for both gene lists).
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  2. We note that gene set enrichment analysis performed on the genes comprising the SE scores gave broadly similar results (S7 Table).
    Page 13, “The prognostic impact of signalling entropy is associated with genes involved in cancer stem cells and treatment resistance”
  3. Genes utilised in the gene set enrichment analysis to identify gene sets associated with signalling entropy’s prognostic power in breast and lung cancer.
    Page 17, “Supporting Information”
  4. Gene set enrichment analysis results displaying the top 10 most significant enriched gene sets associated with signalling entropy’s prognostic power in breast and lung cancer.
    Page 18, “Supporting Information”
  5. Tables display results for the gene set enrichment analysis performed on gene lists identified in lung and breast cancer separately, both with and without the intersection of the
    Page 18, “Supporting Information”
  6. Gene set enrichment analysis results displaying the top 10 most significant enriched gene sets associated with the genes utilised to construct the SE score in both breast cancer and lung adenocarcinoma.
    Page 18, “Supporting Information”
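
Sentence 1 above describes testing a gene list against annotated gene sets with Fisher's Exact test. A minimal sketch of one such test follows, assuming a one-sided (over-representation) alternative, which is equivalent to the upper tail of the hypergeometric distribution; the source does not state the sidedness or the background universe used. The function name `enrichment_p_value`, the gene identifiers, and the universe size are hypothetical.

```python
from math import comb


def enrichment_p_value(gene_list, gene_set, universe_size):
    """One-sided Fisher's exact test for over-representation.

    Returns P(overlap >= observed) when len(gene_list) genes are drawn at
    random from a universe of `universe_size` genes, of which len(gene_set)
    belong to the annotated set. Both inputs are assumed to lie inside the
    universe.
    """
    genes, annotated = set(gene_list), set(gene_set)
    k = len(genes & annotated)          # observed overlap
    n, K, N = len(genes), len(annotated), universe_size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)


# Toy example (hypothetical identifiers): a universe of 20 genes.
annotated_set = ["g1", "g2", "g3", "g4", "g5"]
p_full = enrichment_p_value(["g1", "g2", "g3", "g4", "g5"], annotated_set, 20)
p_none = enrichment_p_value(["g6", "g7", "g8", "g9", "g10"], annotated_set, 20)
print(p_full, p_none)
```

With complete overlap the p-value is 1/C(20, 5), the probability of drawing exactly the annotated set by chance; with no overlap the tail covers every outcome and the p-value is 1. In practice the p-values obtained across many gene sets would also need correcting for multiple testing.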

confidence interval

Appears in 5 sentences as: confidence interval (3) confidence intervals (2)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. The plots display the concordance indices for signalling entropy and a microarray based approximation of OncotypeDX in each data set alongside 95% confidence intervals.
    Page 18, “Supporting Information”
  2. The vertical line denotes concordance index = 0.5; data sets where the confidence interval for the concordance index crosses this line did not reach significance.
    Page 18, “Supporting Information”
  3. score in each data set alongside 95% confidence intervals.
    Page 19, “Supporting Information”
  4. The vertical line denotes concordance index = 0.5; data sets where the confidence interval for the concordance index crosses this line did not reach significance.
    Page 19, “Supporting Information”
  5. score combined with stage Ia/b status for stage I samples in each data set alongside its 95% confidence interval.
    Page 19, “Supporting Information”
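
The sentences above treat a concordance index of 0.5 as the no-information baseline. As a hedged sketch of why, the following computes a Harrell-style concordance index for right-censored survival data: over all comparable patient pairs, it counts how often the patient with the higher risk score had the earlier observed event. The source does not specify which estimator was used, so this form is an assumption, and the function name and toy values are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell-style concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] is truthy) strictly before time j. The pair is concordant
    when the earlier-failing patient has the higher risk score; tied
    scores count as half. Pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy cohort (hypothetical): four patients, all with observed events.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 1, 1]
c_perfect = concordance_index(times, events, [4.0, 3.0, 2.0, 1.0])
c_inverse = concordance_index(times, events, [1.0, 2.0, 3.0, 4.0])
c_constant = concordance_index(times, events, [1.0, 1.0, 1.0, 1.0])
print(c_perfect, c_inverse, c_constant)
```

A score that ranks patients perfectly gives 1, a perfectly inverted score gives 0, and an uninformative constant score gives 0.5, which is why a confidence interval that includes 0.5 indicates no significant prognostic value.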

feature selection

Appears in 5 sentences as: feature selection (5)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. By using signalling entropy to refine a set of prognostic genes identified by Cox regression, our approach refines the feature selection approach based on correlation with outcome [24].
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  2. Criticism of feature selection for prognostic classifiers based on gene sets ranked by correlation with outcome has stemmed from the considerable discordance of such features between data sets [47, 48].
    Page 8, “Signalling entropy’s prognostic power in breast cancer can be represented by a small number of genes”
  3. We thus used feature selection to derive a small set of genes which capture the prognostic power of signalling entropy independently of other clinical variables, thus representing a more readily applicable quantifier of stemness and intra-tumour heterogeneity.
    Page 14, “Discussion”
  4. In comparing signalling entropy to signatures such as MammaPrint it is worth pointing out that a direct comparison is unfair: signalling entropy does not involve feature selection.
    Page 14, “Discussion”
  5. Although signalling entropy was not found to outperform existing prognostic markers in lung adenocarcinoma, by using the SE score, derived by signalling entropy guided feature selection, it was possible to outperform existing state of the art prognostic factors such as CADM1 expression across independent data sets.
    Page 14, “Discussion”

cancer cell

Appears in 3 sentences as: cancer cell (2) cancer cells (1)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. Intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, is related to cancer stem cells and is also considered a potential prognostic indicator in oncology.
    Page 1, “Abstract”
  2. The Cancer Stem Cell (CSC) hypothesis, the idea that a small population of tumour cells have the capacity to seed and grow the tumour, and intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, have long been considered the basis of potential prognostic indicators in oncology.
    Page 1, “Author Summary”
  3. This calls into question the notion that CSCs only ever occupy a small proportion of the tumour, and paints a picture of cancer cells as malleable entities capable of generating considerable heterogeneity.
    Page 13, “Discussion”

RNA-seq

Appears in 3 sentences as: RNA-seq (3)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. To evaluate the clinical associations of our measure we first computed signalling entropy for each microarray sample in The Director’s Challenge dataset profiling 398 tumours [42], and for the 455 lung adenocarcinoma RNA-seq tumour samples downloaded from The Cancer Genome Atlas (TCGA) database (http://cancergenome.nih.gov/).
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  2. It is of particular note that signalling entropy is significantly prognostic if computed from either microarray or RNA-seq data sets; this result attests to the biological relevance of our measure, which is not masked by experimental technique.
    Page 6, “Signalling entropy is prognostic in stage I lung adenocarcinoma”
  3. We note that not all proteins in the PIN have a corresponding probe in the microarray or sequence in the RNA-seq data; consequently, the PIN we consider is the maximally connected component of the original PIN after the removal of missing proteins.
    Page 15, “Signalling Entropy”
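
Sentence 3 above describes restricting the protein interaction network (PIN) to proteins that were actually measured and then keeping one connected component. The sketch below illustrates that step under the assumption that "maximally connected component" means the largest connected component by node count. The edge list, protein labels, and the function name `largest_component` are hypothetical.

```python
from collections import deque


def largest_component(edges, measured):
    """Largest connected component of a PIN after dropping unmeasured proteins.

    `edges` is an iterable of undirected (protein, protein) pairs and
    `measured` is the set of proteins with expression data. Proteins left
    with no interaction partner are excluded.
    """
    adjacency = {}
    for u, v in edges:
        if u in measured and v in measured and u != v:
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)

    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every protein reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(best):
            best = component
    return best


# Toy network (hypothetical labels): protein "C" has no probe, which splits
# the chain A-B-C-D-E-F into A-B and D-E-F; G-H is a separate small component.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("G", "H")]
measured = {"A", "B", "D", "E", "F", "G", "H"}
component = largest_component(edges, measured)
print(sorted(component))
```

Here removing the unmeasured protein fragments the network, and only the three-protein component D, E, F is retained.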

sample size

Appears in 3 sentences as: sample size (3)
In Intra-Tumour Signalling Entropy Determines Clinical Outcome in Breast and Lung Cancer
  1. Most proposed methodologies require the collection of new data sets and thus are limited in sample size, making them difficult to validate.
    Page 2, “Author Summary”
  2. The clinical assessment of intra-tumour heterogeneity also poses a significant challenge, with current experimental approaches requiring multiple biopsies per tumour leaving them severely limited in sample size [17–19].
    Page 2, “Introduction”
  3. The majority of currently suggested approaches are limited in sample size, and require the time consuming collection of large new data sets (such as multiple biopsies from single tumours) for validation and proof of concept.
    Page 14, “Discussion”
