Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
Vesna Memišević, Nela Zavaljevski, Seesandra V. Rajagopala, Keehwan Kwon, Rembert Pieper, David DeShazer, Jaques Reifman, Anders Wallqvist

Abstract

We recently used yeast two-hybrid (Y2H) screening to identify a small set of novel Burkholderia proteins that were shown to attenuate disease progression in an aerosol infection animal model using the virulent Burkholderia mallei ATCC 23344 strain. Here, we performed an extended analysis of primarily nine B. mallei virulence factors and their interactions with human proteins to map out how the bacteria can influence and alter host processes and pathways. Specifically, we employed topological analyses to assess the connectivity patterns of targeted host proteins, identify modules of pathogen-interacting host proteins linked to processes promoting infec-tivity, and evaluate the effect of crosstalk among the identified host protein modules. Overall, our analysis showed that the targeted host proteins generally had a large number of interacting partners and interacted with other host proteins that were also targeted by B. mallei proteins. We also introduced a novel Host-Pathogen Interaction Alignment (HPIA) algorithm and used it to explore similarities between host-pathogen interactions of B. mallei, Yersinia pestis, and Salmonella enterica. We inferred putative roles of B. mallei proteins based on the roles of their aligned Y. pestis and S. enterica partners and showed that up to 73% of the predicted roles matched existing annotations. A key insight into Burkholderia pathogenicity derived from these analyses of Y2H host-pathogen interactions is the identification of eukaryotic-specific targeted cellular mechanisms, including the ubiquitination degradation system and the use of the focal adhesion pathway as a fulcrum for transmitting mechanical forces and regulatory signals. This provides the mechanisms to modulate and adapt the host-cell environment for the successful establishment of host infections and intracellular spread.

Author Summary

Burkholderia mallei uses secreted virulence factor proteins as a means to execute host-pathogen interactions and promote pathogenesis. While validated virulence factor proteins have been shown to attenuate infection in animal models, their actual roles in modifying and influencing host processes are not well understood. Here, we used host-pathogen protein-protein interactions derived from yeast two-hybrid screens to study nine known B. mallei virulence factors and map out potential virulence mechanisms. From the data, we derived both general and specific insights into Burkholderia host-pathogen infectivity pathways. We showed that B. mallei virulence factors tended to target multifunctional host proteins, proteins that interacted with each other, and host proteins with a large number of interacting partners. We also identified similarities between host-pathogen interactions of B. mallei, Yersi-m'a pestis, and Salmonella enterica using a novel host-pathogen interactions alignment algorithm. Importantly, our data are compatible with a framework in which multiple B. mallei virulence factors broadly influence key host processes related to ubiquitin-medi-ated proteolysis and focal adhesion. This provides B. mallei the means to modulate and adapt the host-cell environment to advance infection.

Introduction

This host-adapted bacterium is equipped with an extensive set of mechanisms for invasion and modulation of eukaryotic host-cell environments. Key mechanisms of B. mallei pathogenicity are encoded in virulence factors (proteins required for virulence) that control and promote pathogenic internalization, survival, and replication within host cells [2, 3]. While a number of B. mallei proteins associated with pathogenicity have been characterized and mapped to adhesion, endosomal escape and evasion of host-cell autophagy, actin-based motility, multi-nucleated giant cell formation, replication, and cell-to-cell spread [3—7] , the identities of their host targets are largely unknown, and the underlying mechanisms by which the bacterial proteins affect these processes are poorly understood.

We employed several bioinformatics approaches to identify and select a small number of putative and known virulence factors, and used yeast two-hybrid (Y2H) assays to identify their interacting protein partners in human and murine hosts. The analysis of these host-B. mallei protein-protein interactions (PPIs) allowed us to identify three novel B. mallei ATCC 23344 virulence factors and show that they attenuated B. mallei virulence in mouse aerosol challenge experiments. Although our PPI data contained extensive interactions between multiple host proteins and B. mallei proteins, we did not fully explore these data to more generally characterize B. mallei virulence mechanisms. Here, we performed a systematic analysis of these interactions to investigate the mechanisms by which B. mallei virulence factors interact with host proteins to establish infection, evade host immune responses, and spread within the host. We evaluated whether the virulence factors target specific (nonrandom) host proteins and processes and whether they jointly affect entry into and survival within the host cells. Furthermore, we evaluated whether we could detect commonalities in Gram-negative bacterial host-pathogen interactions among B. mallei, Yersim'a pestis, and Salmonella enterica virulence factors.

Although the identified interactions represent only a fraction of all possible interactions between host and pathogen proteins (ranging from less than 10 interactions to a few thousand PPIs), they have proved to be a valuable source of information about bacterial pathogenicity mechanisms. Analyses of these host-pathogen PPI datasets showed that virulence-associated pathogen proteins preferentially target host proteins involved in biological processes essential for cell vitality, e.g., signaling, cell cycle, or immune response [9—13]. Additionally, other studies demonstrated that similarities in host-pathogen PPIs can be used to predict novel host proteins that are targeted by bacterial proteins [14—16].

In addition, the analysis revealed specific host processes relevant to B. mallei virulence factors’ pathogenicity, e.g., signaling and communication, protein modification and regulation, and cytoskeleton organization, and suggested that virulence factors preferentially targeted multifunctional host proteins, thereby affecting multiple host cellular processes simultaneously. When we used all of our interaction data, including host interactions with putative but not validated B. mallei virulence factors, we identified additional host processes and molecular pathways that were previously eXperimentally associated with B. mallei pathogenicity [2, 17—21]. Moreover, our evaluation of the relationship between targeted host proteins involved in different processes and pathways supported a previously observed mechanism for bacterial interference with eu-karyotic hosts: virulence factors can focus interference by targeting key host proteins whose effect can propagate through and influence multiple host processes and pathways [2, 17, 18]. Additionally, we introduced a novel Host-Pathogen Interaction Alignment (HPIA) algorithm and used it to eXplore similarities between host-pathogen interactions of B. mallei, Y. pestis [13], and Salmonella enterica subsp. enterica serovar Typhimurium [12]. Using the HPIA algorithm, we identified a statistically significant number of functionally similar host-pathogen interactions between these three PPI datasets. We inferred putative roles for B. mallei proteins based on the role of their aligned Y. pestis and S. enterica partners and showed that up to 73% of the putatively annotated B. mallei protein roles matched their existing annotations.

A key insight from these analyses into Burkhol-deria pathogenicity is the concerted targeting of the ubiquitination degradation system and use of the focal adhesion pathway as a fulcrum for signaling and changing cell morphology. These mechanisms provide B. mallei with the ability to modulate and adapt the host-cell environment to establish intracellular host infections.

Results/Discussion

mallei PPIs by merging human-B. mallei and ortholo-gous murine-B. mallei protein interaction data identified in our previous Y2H screens [8]. The resulting dataset consisted of 1,235 unique interactions between 21 B. mallei and 828 human proteins. Fig. 1 shows these interactions and their Y2H-library origins. It also shows that the majority of the B. mallei proteins interacted with unique host proteins, i.e., 615 (74%) of host proteins interacted with a single B. mallei protein. Importantly, the bulk of the host-B. mallei

mallei PPI O Human protein detected in YZH screens — Host-B. mallet PPI conserved across species 0 Human library * Known virulence factor Q Murine library . Human & murinc libraries

B. mallei-human host-pathogen protein interactions. The set of human-B. mallei protein-protein interactions (PPls) was created by merging human-B. mallei and orthologous murine-B. mallei interaction data. The set consists of 1 ,235 unique interactions (gray and purple lines) between 21 B. mallei (green hexagons) and 828 human proteins (pink, blue, and purple circles). Known virulence factors are also indicated in the graph. proteins), implying that the captured data were largely reflective of host-pathogen virulence mechanisms. We start our analysis by assessing the characteristics of the host proteins targeted by these virulence factors.

Characteristics of host proteins interacting with known B. mal/ei virulence factors

First, we applied functional enrichment analyses based on Gene Ontology (GO) annotation data [23] to assess the characteristics of the human proteins targeted by the nine virulence factors. Table 2 shows that these virulence factors interacted with a statistically significant number of human proteins that were associated with 1) protein ubiquitination and ubiquitin ligase activity, 2) vesicle organization, and 3) protein complexes located in the cytoskeleton, in lysosomes, and in the nuclear lumen. These results were consistent with the experimentally observed pathogen interference with host cytoskeleton organization and ubiquitination levels [2, 3, 19—21, 24].

We mapped the identified host proteins interacting with B. mallei onto a human PPI network [25] consisting of 76,043 physical PPIs among 11,688 proteins. Of the 663 human proteins interacting with the nine B. mallei virulence factors, approximately 75% (498) were present in our human PPI network. This set contained proteins that had, on average, a significantly larger number of interacting partners per protein (19.5 vs. 13.0) than would be expected from a corresponding random selection of proteins from the entire human PPI network (Table 3). Among the highest-interact-ing host proteins targeted by the virulence factors, we found the adapter protein YWHAG (14-3-3 protein gamma) with 376 interactions. This protein, an interacting partner of BimA, has been implicated in the regulation of a large spectrum of signaling pathways [26]. Further topological measures associated with the set of 498 proteins, such as their clustering coefficient (a measure of interactions among nearest neighbors), were not different from the random selection (Table 3). We observed small effects on the length of the shortest path between any two proteins in the set, but it was unclear how to associate these topological parameters with B. mallei virulence.

For the 498-protein set, we found 202 unique proteins that participated in 325 human protein-protein interactions. In comparison with randomly selected proteins, the B. mallei-targeted proteins were engaged in a significantly larger number of these interactions (0.65 vs. 0.28 on a per-protein basis). A further examination of sets of connected human proteins that also interacted with the virulence factors, revealed the presence of a single, large connected component, i.e., a subnetwork in which a path connects any two proteins to each other. This largest connected component was composed of 202 proteins and contained the majority (95%) of the 325 interactions between the human proteins interacting with B. mallei (Table 3). The other 11 connected components consisted of five or fewer proteins, an observation that was not statistically significant from a random selection of proteins (data not shown). We found a fivefold increase, from 0.28 to 1.53 (0.95*325/202), in the number of human PPIs for each protein in the largest connected component that were all targeted by the virulence factors, compared to a random selection of proteins. These results suggest that a property of the B. mallei virulence phenotype is to target well-connected host proteins in a unique set of interconnected host proteins. Next, we used this property to expand on our initial set of GO annotations to better characterize B. mallei infectivity and pathogenesis.

B. mal/ei virulence factors target interactions among host proteins

We hypothesized that interactions among these host proteins are equally important targets as the proteins themselves and could be used to shed light on how virulence factors exert their influence. As detailed in Materials and Methods, we used the largest connected component identified above to detect 93 sets of human PPIs in which, in each set, all human proteins interacted with at least one of the nine known B. mallei virulence factors and had the same GO biological process annotations; we denoted these sets interaction modules.

Overall, the identified biological processes were similar to the ones identified when interactions among host proteins were not taken into account; however, they provided an improved annotation granularity. For example, the interaction modules allowed us to identify a biological process termed “pos-itive regulation of protein ubiquitination” instead of just “protein ubiquitination.” Importantly, the analysis provided evidence of a much larger effort to target intracellular host signaling processes, in particular those related to the immune response. Fig. 2 shows the subset of 116 proteins and 163 interactions from the largest connected component that were part of the 93 identified interaction modules and the location of six interaction modules. Each of the interaction modules constituting ubiquitination and ligase activity, transcriptional regulation, immune response, cy-toskeleton organization, and mRNA processing, consisted of proteins and interactions that were closely grouped together in the largest connected component (Fig. 2A—E). Fig. 2 also shows that some human proteins are a part of multiple interaction modules, suggesting that B. mallei interacts with multifunctional or “moonlighting” host proteins [27]. Multifunctional proteins have been associated with such neurological disorders as Alzheimer’s and Parkinson’s diseases [28] , as well as with bacterial virulence in H elicobacter pylori, Mycobacterium tuberculosis, and Streptococcus pneumonia [29]. Given the multifaceted role of these proteins in enzymatic catalysis, signal transduction, transcriptional regulation, apoptosis, motility, and growth [30, 31], interactions with them suggest an avenue for B. mallei to simultaneously interfere with multiple host-cellular processes to facilitate invasion and survival. In particular, Fig. 2F shows the largest interaction module associated With biological processes linked to multifunctional proteins. This interaction module contained 54 interactions among 44 human proteins associated With various types of regulation (regulation of gene expression, cytokinesis, or apoptosis), signal transduction (GTPase mediated signal transduction and Ianus kinase/signal transduction), and response triggering (immune response and response to stress). Additionally, this module contained host-interacting partners of eight out of the nine B. mallei virulence factors from our set, lacking only BopA. These results suggest that B. mallei virulence factors target multifunctional host proteins to simultaneously interfere With multiple host processes required for normal cellular function.

Putative B. mallei virulence factors improve characterization of B. mallei targets

mallei interaction dataset contained a number of putative virulence factors, we also evaluated the effect of adding these virulence factors into our analysis to characterize host targets. Similarly to the above analyses, we first evaluated the prevalent characteristics of human proteins using GO annotation [23]. The identified molecular annotations largely matched those identified for known virulence factors only, but also included additional GO terms, such as terms related to RNA metabolic processes (S3 Table). Table 3 shows that the analysis of topological properties of host proteins interacting with known and putative virulence factors displayed the same trends observed in the analysis of the interacting partners of known virulence factors. Next, we evaluated the extent to which known and putative virulence factors also targeted connected subsets of host PPIs. We identified 75 statistically significant interaction modules whose GO biological process annotations largely overlapped with the ones identified for interacting partners of known virulence factors only. Although the number of statistically significant interaction modules was smaller than above (an increase in the number of host proteins dilutes the enrichment), the addition of new host proteins increased the size (in terms of proteins and interactions) of previously identified interaction modules (S4 Table). This suggests that with the increase of protein annotation or with the identification of additional host-B. mallei PPIs, we will be able to identify larger and more complete host interaction modules targeted by B. mallei virulence factors.

We identified two statistically significantly enriched host pathways: bacterial invasion of epithelial cells and focal adhesion. Fig. 3 shows that the proteins targeted in the focal adhesion pathway appeared to be coordinated for pathway activation and largely interacted with each other (yellow boxes). The majority of these molecular interactions belonged to a connected sub-pathway located at the beginning of the pathway (the probability of observing such connectivity at random is < 106), and they provided a link between membrane receptors and signaling events that led to reorganization of the actin cytoskeleton.

Human-B.

mal/ei interactions and their effect on the crosstalk between different biological processes

It has been shown that a number of bacterial pathogens, including Yersinia, Salmonella, Shigella, Listeria, and Barkholderia, interfere with host signaling pathways to stimulate the host’s cytoskeleton rearrangement [2, 33]. These changes in signaling lead to changes in the host-cell shape and facilitate bacterial internalization and cell-to-cell spread [33]. Fig. 4 shows host proteins that interacted with known and putative B. mallei Virulence factors that can be directly associated with cytoskeleton organization. The largest statistically significant interaction module, represented by red stars, contained proteins previously identified as bacterial targets vital for host actin cytoskeleton rearrangement, e.g., membrane-associated small GTPases (CDC42 and RALA), Filamin-A (FILA), and Rho GDP-dissociation inhibitor (ARHGDIB) [2, 33]. The remaining cytoskeleton-related host proteins, represented as dark red circles, participated in smaller cytoskeleton organization interaction modules that were, on average, less than two proteins (three interactions or edges) away from the largest module.

Initially, out of all human proteins interacting with the examined B. mallei proteins, we evaluated the relationships among proteins involved in the focal adhesion pathway that participated in or interacted with human proteins associated with cytoskeleton organization. The shaded area in Fig. 4 shows that six of 12 proteins were present in both systems. Fig. 5 shows an extension of this analysis that includes B. mallei interacting host proteins that are components of eight other molecular pathways that shared proteins with the focal adhesion pathway. Fig. 5A (left) shows that among these nine pathways, the number of pathways that shared one or more proteins was low. However, the number of PPIs connecting the proteins from one pathway with proteins from another pathway was markedly higher [Fig. 5A (right)]. The large number of signaling pathways affected via the Proteins involved in cytoskeleton organization process —PPI within the module fir Part of the interaction module - - -PP| outside the module 0 Outside of the interaction module Part of the focal adhesion

_ . pathway Connector proteins involved III other/unknown processes Part ot the focal adhesion pathway Other

Actin cytoskeleton organization as a virulence factor target. Human proteins targeted by B. mallei proteins formed an interaction module that was primarily linked to cytoskeleton organization and focal adhesion. Twenty-five of these proteins were involved in cytoskeleton organization processes; 13 of them (red stars) interacted with each other (forming an interaction module), and the remaining 12 proteins (dark red circles) were on average < 2 nodes (< 3 edges) away from the interaction module. The figure also shows the overlap between the cytoskeleton organization interaction module and the focal adhesion pathway (shaded area), where connecting protein interactions from focal adhesion pathway proteins or other proteins appear as smaller circles and dashed lines. Note that all human proteins shown interacted with one or more B. mallei proteins.

Fig. 5B illustrates the propagation and number of cross-pathway interactions that were mediated Via the focal adhesion pathway and shows the known Virulence factors (Table 1) that can be associated with each pathway. Thus, Virulence factors affected biological processes and molecular pathways associated with multiple interconnecting host processes, providing an eXplanation of how interference With the function of a single protein propagated to and influenced multiple host processes and pathways.

Using multiple host-pathogen interaction networks to predict the role of pathogen proteins

However, detecting specific mechanisms of action for each pathogen protein based on enrichment analysis of large-scale Y2H protein interaction data is not trivial. This partly stems from experimental, biological, and statistical considerations. For example, the Y2H methodology is biased for certain types of interactions in a nonnative environment [34], binding events may or may not be biologically relevant, and statistical testing is hampered by small effect sizes and small statistical power. Conversely, this and previous studies have shown that multiple pathogens tend to target the same host proteins, biological processes, and pathways [2, 9, 10, 18]. Hence, one could potentially use common pathogenic mechanisms to more robustly characterize bacterial proteins and their host targets.

These datasets contained 62 host-pathogen interactions for 21 S. enterica proteins [12] and 223 interactions for 69 Y. pestis proteins [13]. An initial orthol-ogy-based approach to retrieve annotations proved too restrictive and did not generate any novel insights into B. mallei virulence. Instead, we used an alternative network alignment-based methodology optimized for interspecies alignment, i.e., we differentiated between host and pathogen proteins and avoided mapping host proteins to pathogen proteins and vice versa. As detailed in Materials and Methods, we introduced a novel alignment algorithm (HPIA) designed specifically for the alignment of cross-species interactions. We used the HPIA algorithm to identify similarities between host-pathogen interactions using the B. mallei, S. enterica, and Y. pestis PPI datasets, based on a combined similarity measure that included topological similarity, sequence similarity, and functional similarity. Table 5 lists the B. mallei proteins, their aligned protein partners, and inferred function(s) derived from the alignment.

We identified similarities between T3SS proteins involved in bacterial internalization for all three pathogens, including the orthologs BipB-SipB and BipC-SipC. Additionally, the B. mallei PilA protein was aligned to Y. pestis f1mbrial protein FimA6, another cell adhesion protein. Furthermore, although the S. enterica and Y pestis interaction datasets included mainly T3SS proteins, the two B. mallei Vgr proteins associated with bacterial survival and replication via Type 6 Secretion Systems (T6SS) were aligned to S. enterica and Y pestis proteins also known to promote bacterial survival and replication. Thus, while the aligned proteins may have different roles within each pathogen, it is possible that they interact with a similar type of host proteins, causing the alignment algorithm to capture these similarities. Overall, the alignment-based inferred roles for six out of the 11 (55%) annotated B. mallei proteins matched their existing annotation and their corresponding secretion system assignment (Table 5, fourth and second columns, respectively). If we only consider matching functionality and ignore the Vgr association to T6SS, the inferred functions matched the existing annotation in eight of 11 (73%) cases. For the B. mallei virulence factors without known functions in pathogenicity, listed in the lower part of Table 5, the functional mappings from S. enterica and Y. pestis provided an indication of their mechanistic role in virulence. Of special importance were the three novel virulence factors we identified from the Y2H data and experimentally verified in a B. mallei ATCC 23344 aerosol mouse infection model: BMAA1865, BMAAO728, and BMAAO553. The HPIA algorithm identified similarities between BMAA1865, S. enterica protein SopE2, and Y. pestis protein YopE, based on their interactions with host proteins involved in actin-cytoskeleton rearrangement processes. The alignment also identified similarities between B. mallei protein BMAAO728 and S. enterica protein SseL based on their interactions with host proteins involved in ubiquitination. Furthermore, the HPIA algorithm identified similarities between B. mallei protein BMAAO553 and S. enterica protein SseI associated with regulation of the host cytoskeleton and inhibition of cell motility. These results imply that the hypothetical protein BMAA1865 has a role in the host actin-cytoskeleton manipulation, that BMAAO728 has a role in (de)ubiquitination, and that serine/threonine phosphatase BMAAO553 has a role in cytoskeleton regulation. These are the same roles we previously proposed for these three proteins based on the literature review of pathogenic mechanisms [8].

The host-pathogen PPI data linked this protein to multiple processes related to pathogenicity. We were not able to test its pathogenicity in an animal model, because this protein appeared to be essential. However, the alignment results imply that this protein had a role in the regulation of TSSS secretion, as it is mapped to two TSSS regulators that are more likely to be localized in the bacterial cytoplasm than to be translocated into the infected cell: SpiC in S. enterica and YscN in Y. pestis [35—38].

Multiple B. mal/ei virulence factors target eukaryotic-specific host-cell processes

Thus, the specific virulence adaptations retained in the evolution of B. mallei as an obligate mammalian pathogen include targeting the ubiquitination degradation/ signaling system and using the focal adhesion pathway as a fulcrum for transmitting mechanical forces and regulatory signals. This provides the mechanisms to modulate and adapt the host-cell environment for the successful establishment of host infections.

Expanding on the crosstalk analysis shown in Fig. 5B, we created an interconnected host-pathway map. We inferred the connection between two pathways from the number of inter-pathway PPIs where both host proteins are interacting with at least one B. mallei protein. The larger the number of such host protein pairs, the larger the potential influence B. mallei have on the crosstalk between the involved pathways. Fig. 6 shows the extent of this influence between host pathways and the central role the focal adhesion pathway plays in propagating cell signaling and affecting key host cellular processes relevant to B. mallei pathogenesis. The pathways implicated in Fig. 5B directly related to focal adhesion are marked with stars in Fig. 6. These, in turn, are interconnected with a large number of signaling pathways (Fig. 6—grey background) that ultimately control cell cycle, morphology, and growth. A number of known disease processes, marked in red symbols, are also interconnected or directly connected to this signaling network. We hypothesized that this virulence factor host-pathogen network is central in controlling key cellular mechanisms that allow B. mallei to adapt the host cell environment and ensure robust infection.

Summary

The limitation of this approach is that while a host-pathogen protein interaction may occur, as determined via Y2H experimentation, this type of data does not allow us to resolve when, where, or why such interactions are important. Furthermore, even though the strict statistical threshold at a false discovery rate (FDR) g 5% minimizes the chances of identifying random data correlations, it does not test our hypothesis that the pathway is involved in B. mallei virulence. Conversely, the strength of our analysis is that the identified host interactions are dominated by known and validated virulence factors, allowing us to create new hypotheses around the biological interpretation of pathway interaction patterns. Our results showed that host-pathogen PPIs represent a rich source of information about molecular mechanisms of pathogenicity, and that these interactions can be used to identify and characterize host molecular pathways and processes targeted by pathogens. Specifically, our topological analysis of human-B. mallei protein interactions showed that known and putative B. mallei virulence factors tend to target multifunctional host proteins, host proteins that interact with each other, and host proteins with a large number of interacting partners. Additionally, the analysis identified a number of host processes and pathways relevant to B. mallei pathogenicity, many of which have been linked to bacterial pathogenicity in previous eXperimental studies, e.g., signaling and communication, protein modification and regulation, cyto-skeleton organization, and focal adhesion. Furthermore, the topological analysis suggested that B. mallei virulence factors target host molecular processes through interference with their direct and indirect host-interacting partners, implying that the process of pathogenic internalization and intracellular survival requires the modulation of multiple host cellular processes. We further introduced the novel HPIA algorithm that can be used to identify common sets of host-pathogen interactions by aligning (mapping) host-to-host and pathogen-to-pathogen proteins from two interaction datasets. We used the HPIA algorithm to compare human-B. mallei interactions to those of human-Y. pestis and human-S. enterica and identified a statistically significant number of aligned interactions. We also showed that the resulting alignments could be used to predict roles of B. mallei proteins based on the roles of their aligned Y. pestis and S. enterica partners.

Ultimately, the bacterial host-virulence program is derived from a survival strategy developed in the rhizosphere, i.e., in a generally competitive environment containing multiple, diverse species. Using multiple virulence factors to target eukaryotic-specific mechanisms common to eukaryotic rhizosphere species, B. mallei broadly influences key processes in ubiquitination and cell signaling to modulate and adapt the host-cell environment for its benefit.

Materials and Methods

Human-B.

mal/ei protein interaction set

mallei PPIs, we merged human-B. mallei and mu-rine-B. mallei PPI datasets identified in [8]. These datasets contained 586 interactions between 409 human and 21 B. mallei proteins, and 797 interactions between 574 murine and 25 B. mallei proteins; 19 B. mallei proteins appeared in both sets, including nine known B. mallei Virulence factors (Table 1). When creating the merged set, we considered only a subset of murine-B. mallei PPIs in which the B. mallei proteins also interacted with human proteins and, thus, had shown the ability to bind to human proteins. The merging procedure consisted of four steps. In the first step, we identified B. mallei proteins that interacted with both hosts (19 B. mallei proteins). Then, we found human orthologs for each of the 419 (73%) murine proteins that interacted with the B. mallei proteins identified in step 1. In the third step, we assessed whether the human orthologs constituted unique proteins, i.e., whether they were not a part of the eXperimentally detected human-B. mallei interactions. If not, we added this interaction into the ortho-logous human-B. mallei dataset. The resulting orthologous dataset consisted of 649 interactions between 419 human proteins and 19 B. mallei proteins, corresponding to 82% of the murine-B. mallei PPIs. Finally, we merged the experimental human-B. mallei dataset with the ortholo-gous human-B. mallei dataset to create a merged set of human-B. mallei PPIs. The resulting merged dataset consisted of 1,235 unique interactions between 21 B. mallei and 828 human proteins (81 Data and S6 Table). Approximately 72% (890) of these represent interactions among the nine B. mallei known Virulence factors (Table 1) and 663 unique human proteins. All proteins were annotated by their official gene symbols as defined in the HUGO Gene Nomenclature Committee database [39]. We used the National Center for Biotechnology Information HomoloGene database of ho-mologs (http://WWW.ncbi.n1m.nih.gov/homologene) to identify human-murine orthologs [40].

Topological properties of human proteins interacting with B. mal/ei in the human PPI network

We calculated the following topological properties for a set of human proteins interacting with B. mallei: 1) the number of human proteins interacting with B. mallei proteins (Np); 2) the average number of their interacting partners in the human PPI network (D); 3) the clustering coefficient, i.e., the number of interactions among the nearest neighbors (C); the average shortest path between any two proteins in the set (SP); the average number of interacting partners in the human PPI network where both partners interact with B. mallei proteins (Di); and the number of host proteins in the largest connected component (N itECG). All calculations were performed in R using the igraph package [41]. We evaluated whether the observed values for each of the five properties were statistically significant as follows. From the human interactome, we randomly selected the same number of proteins as the number of proteins interacting with

Next, we calculated each of the five topological properties for this random set of proteins, repeating the procedure 103 times. This procedure yielded 103 values for each property, which followed a Normal distribution (Normality was evaluated using the quantile-quantile plots and the Kolmogorov-Smirnov test [42], where we found that there was not enough evidence in the data to suggest that the distributions were not Normal). Then, for each property, we evaluated a relationship between the observed value for proteins interacting with B. mallei and the values obtained for random protein sets using a Z-score. Finally, we computed the p-values corresponding to the resulting Z-scores.

Gene set functional enrichment analyses

As the universe of human proteins, we used all constituent proteins from the human PPI network. As GO terms are specified at multiple levels of detail, we used a complete GO tree annotation, excluding the root and the top two levels of GO terms. GO annotation was obtained from BioMart [43]. For the KEGG enrichment analysis, as the universe of human proteins, we used the human proteins available in KEGGgraph that participated in at least one KEGG pathway [44]. All obtained p-values were assessed using the Benjamini and Hochberg multiple test correction [45]. We retained only annotations that were enriched at an FDR control level of 0.05, i.e., there is a less than 5% chance that the obtained p-values are not statistically significant.

In the standard enrichment analysis, we computed the probability of observing the number of proteins annotated with a given term using the hypergeometric distribution. In the network-based enrichment analysis, we first identified the largest connected component in the human interactome that consisted of human proteins interacting with at least one B. mallei virulence factor (denoted as LCC). Then, we counted the number of proteins 11, in the LCC that were annotated with a G0 biological process term t. Additionally, for each term t, we identified connected (sub)networks of LCC in which all proteins were annotated with t; these we termed interaction modules. We denoted such interaction modules as IMt and the number of proteins in these modules as mt.

First, we evaluated whether the observed interacting proteins and their corresponding biological processes in the LCC were statistically significant compared to an equal number of random proteins. For each GO term t, we counted the number of proteins in the random set that were annotated with t (denoted as rt). Additionally, for each 1‘, we identified interaction modules in which all proteins from the random set were annotated with t. We denoted the number of proteins in such interaction modules as 5,. We repeated this procedure 104 times. Given the obtained values, we defined the probability pt of observing 11, proteins annotated with t as and the probability prp of observing an interaction module IMt, given a random set of host proteins, as

We randomly rewired the human protein interaction network, while preserving the same degree distribution as observed in the original network. Next, we mapped human proteins from the LCC on the rewired network and, for each term 1‘, identified interaction modules in which all proteins were annotated with t. We denoted the number of proteins in such modules as wt. We repeated this procedure 104 times. Finally, we calculated the probability pm of observing an interaction module, IMt, given a random set of host interactions, as IMt interaction modules With pt 3 0.01, pm 3 0.01, and pm 3 0.01 contain a statistically significant number of human proteins and interactions among them, and are statistically significantly enriched in a biological process t.

Human-S. enterica and human-Y.

pestis protein interactions set

enterica serovar Typhimurium dataset consisted of 62 interactions between 21 S. enterica Virulence-associated proteins and 51 human proteins identified in several small-scale experiments [12]. The majority of S. enterica proteins from this set were associated with the bacterial TSSS. The human-Y. pestis PPI dataset consisted of a union of 204 interactions identified by Y2H screens and 23 interactions identified in several small-scale experiments [13]. The combined human-Y. pestis PPI dataset contained 223 unique interactions between 69 Y. pes-tis Virulence-associated proteins and 125 human proteins. The majority of Y. pestis proteins were also associated with the bacterial TSSS. For the basic comparison of the human-B. mallei, human-S. enterica, and human-Y. pestis PPI networks’ characteristics, see S7 Table.

The HPIA algorithm

Although the existing network alignment algorithms can be applied to host-pathogen PPIs, these algorithms are not optimized for interspecies alignment, i.e., they cannot differentiate between different types of proteins, such as host and pathogen proteins, and consequently may map host proteins to pathogen proteins and vice versa. The primary motivation for designing a network alignment algorithm was to be able to use previous data and insights from other host-pathogen interaction studies for interpreting our B. mallei host interaction data. If conserved network motifs and interaction exist, we can use this information to infer/predict more complex roles for proteins than just transferring sequence-based annotation information [47, 56]. Thus, we designed the HPIA algorithm specifically for the alignment of host-pathogen interactions to augment the sparsely annotated B. mallei protein data.

First, due to the non-exhaustive nature of Y2H experimentation, the underlying interaction data are not complete [8]. Second, the selected pathogen species are not identical to each other, i.e., proteins and biological processes have evolved differently between the species. Because of this sparse and diverse nature of the pathogen data at hand, we cannot a priori expect to obtain satisfactory alignments based on simply mapping human proteins to each other. In this sense, a “perfect” alignment is never attainable, and instead we must rely on approximate alignments with desirable properties, such as biological consistency and sequence similarity. Hence, we developed an alternate approach for aligning bipartite graphs for which one set of nodes (pathogen) are less-well characterized than the other nodes (human).

Thus, we can exploit the fact that all three host-pathogen PPI networks contained interactions with human proteins that participated in similar biological processes. In effect, this allowed us to extend the known annotations from the other networks to the previously uncharacterized B. mallei virulence factors. Furthermore, the algorithm guarantees that host proteins will be aligned only to host proteins and that pathogen proteins will be aligned only to pathogen proteins.

Let G1(U1, V1, E1) and G2(U2, V2, E2) be two bipartite graphs (networks), where U1 and V1 are two disjoint sets of nodes in G1, U2 and V2 are two disjoint sets of nodes in G2, and E1 and E2 are sets of edges of G1 and G2 such that every edge in G1 connects a node in U1 to one node in V1, and every edge in G2 connects a node in U2 to one node in V2 (i.e., no two nodes within the same set are adjacent). Without loss of generality, we can assume that |U1| <|U2| and |V1| <|V2| (hence, G1 < G2). The HPIA algorithm is a global network alignment algorithm that uniquely matches each node from U1 to exactly one node in U2, and each node from V1 to exactly one node in V2. Formally, the alignment of G1 to G2 can be represented as a set oftwo ordered pairs {(ul, uz), (1/1, 1/2)}, where u] 6 U1, uZ 6 U2, 1/] 6 V1, and v2 6 V2, and no two ordered pairs share a node. For the host-pathogen interaction networks (G 1 and G2), sets U1 and U2 correspond to pathogen proteins, sets V1 and V2 correspond to host proteins, and sets E1 and E2 correspond to interactions between host and pathogen proteins. Thus, our host-pathogen interaction set corresponds to the host-pathogen network, nodes correspond to proteins, and edges correspond to host-pathogen interactions.

The HPIA algorithm is a seed-and-extend algorithm that consists of the following three steps: 1) preprocessing, 2) identification of local alignment, and 3) identification of global alignment. In this context, we referred to “local alignment” as a smaller area of local network similarity between two networks where not all nodes need to be included, and “global” where all nodes from the smaller network must be aligned to nodes from the larger network [28]. Both mappings are 1-to-1, i.e., one node from one network can be aligned to only one node of the other network. In the first step, the algorithm reads in host-pathogen networks and node annotations, and it calculates similarities between nodes based on either the provided annotation (for node sets with available annotation) or the default topological similarity (for node sets without available annotation). This step also includes handling user-specified seed nodes (nodes that should be mapped to each other). The HPIA algorithm allows a user to provide a set of node pairs (seeds) and/ or to employ the algorithm’s feature to automatically search for seed pairs (this search is based on the sequence similarity or equivalent protein names). The HPIA algorithm can treat seed pairs in two ways: 1) as a suggestion for node alignment, i.e., nodes that should be aligned to each other if other alignment constraints are satisfied (see below), or 2) as a requirement for the alignment, i.e., seed pairs that have to be aligned to each other. The first step also includes initialization of the aligned pairs list as empty.

All three of these sets contain proteins from the host and pathogen sets and, thus, a pair of seed nodes can come from either the host or pathogen sets of proteins. However, if there are seed node candidates from both sets, the HPIA algorithm preferentially selects a pathogen set. Once selected, the seed pair ($1, $2) is added to the list of aligned pairs. Next, the HPIA algorithm expands around the seed nodes by greedily aligning their direct neighbors $1,- and $2j(51,- e N[51] and 521- e N[51] ), based on the given node similarity measure (see below). The HPIA algorithm repeats step two while there exists at least one unaligned pair of host proteins or pathogen proteins adjacent to at least one other unaligned protein. When there are no such pairs left, HPIA proceeds to the third step.

Each pair of nodes is aligned one at a time based on the given node similarity (see below); network connectivity information is not taken into account explicitly (only as a part of the node similarity measure). The results of the alignment are lists of aligned nodes and edges and the following alignment statistics: the total number and percentage of aligned nodes, the number and percentage of aligned pathogen and host nodes, and the number and percentage of aligned edges.

If the “relaxSeeds” option is given, the nodes will be aligned to each other only if other alignment constraints are satisfied as detailed below. If the “relaxSeeds” option is not given, the seed pairs are forced to be aligned to each other. We recommend that the “relaxSeeds” parameter be turned on if the “additionalSeeds” parameter is used.

The implementation of the HPIA algorithm is also presented as pseudocode in SI Text. Fig. 7 shows a high-level description of the HPIA alignment algorithm.

The HPIA algorithm uses one or more topologically and biologically based protein similarity measures to identify conserved interactions. Similarity between two proteins can always be calculated based on at least one of the metrics and, thus, the algorithm always has a metric to match one protein to another. If no node annotation is provided, the HPIA algorithm uses the default topological similarity measure, 8 Dfln 1, 112), to calculate the similarity between nodes 111 6 G1 and n2 6 G2: Where deg(n) denotes the degree of node n and nd(n) denotes the neighborhood density of 11, defined as where N [n] = N(n)U{n} represents the closed neighborhood of node n, i.e., the node n and the set of its adjacent nodes (for a PPI network, this corresponds to a protein and all of its interacting partners). a is a parameter in [0, 1] that controls the contribution of the degree of a node to its similarity function. We empirically selected a = 0.7, as we wanted to weight the number of direct host-pathogen interactions higher than the number of host-pathogen-host or pathogen-host-pathogen interactions. If node annotation is provided, the HPIA algorithm defines a similarity between pathogen proteins u1 6 U1 and uz 6 U2 as where SGDV_p and SGDV_ H denote the graphlet degree vector similarity [57] derived from the host-pathogen interaction network and the host-PPI network, respectively; SE3 represents the BLAST expected value (E-value) similarity [58], defined as 1— E-value for E-values g 1 and 0 otherwise; and SGO denotes the GO term annotation similarity calculated using the Iaccard similarity measure [59]. If a specific type of annotation is not provided, the HPIA algorithm assigns the similarity value 0 to the corresponding similarity parameters, e.g., if BLAST E-values are not provided, the value of SE3 for all pairs of nodes is set to 0. We did not add the graphlet degree vector similarity for the pathogen networks because only a few pathogen-PPI networks are available, none of which are for B. mallei.

For the topological node annotation, we used graphlet degree vectors [57] of all host and pathogen proteins from the host-pathogen PPI networks. Host proteins found in the host PPI network [25] were additionally annotated with another set of graphlet degree vectors calculated based on the host PPI network topology. We used GO annotation [23] downloaded from UNIPROT [60] [the lowest (leaf) level] as the biological node annotation. Additionally, we used BLAST E-values of g 0.01 to define similarities between proteins [58]. Protein sequences were downloaded from UNIPROT and aligned using BLAST pairwise sequence alignment.

To assess the topological quality of the alignment, we used edge correctness (EC), defined as the percentage of edges in G1 that were aligned to edges in G2 [51]. To assess the biological quality of the alignment, we evaluated whether the number of aligned protein pairs that share one or more GO term(s) was statistically significant compared to the number we could eXpect at random using the standard model of sampling without replacement, as described in previous studies [51—53]. We used the same approach to assess the statistical significance of the alignment of two bipartite networks, G1(U1, V1, E1) and G2(U2, V2, E2), with the EC of x% (similar to the implementation described above [51—53]). We aligned each pair of host-pathogen interaction networks 30 times and reported the average and standard deviations of the alignment scores over all runs, as well as the best score (SS Table). We ascertained the robustness of the alignments with respect to the E-value cutoff and observed no significant differences in the results when lowering the cutoff value from 10'2 to 103. To assess the biological quality of the alignment, we evaluated whether the number of aligned protein pairs that share one or more GO term(s) was statistically significant compared to the number we could eXpect from a random alignment. Given that all obtained alignments were of similar biological quality, we further refined our prediction by using the alignments that had the highest EC score, i.e., we used the alignment with the highest EC score to infer the role of B. mallei proteins.

Implementation and availability

The cross-species network alignment algorithm was developed in C++. Executable files and eXamples for the HPIA algorithm are provided at http://www.bhsai.org/downloads/hpia/.

Supporting Information

This file contains a set of human-B. mallei PPIs and a set of human PPIs used in this study. (XLSX)

This file contains the HPIA algorithm pseudo code and supplementary tables. (DOCX) 81 Table. A list of interaction modules statistically significantly enriched in Gene Ontology (GO) biological processes for human proteins interacting with known B. mallei virulence 82 Table. Host proteins associated with distinct biological processes that interacted with known B. mallei Virulence factors. S3 Table. Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways statistically significantly enriched in human proteins interacting with known and/ or putative B. mallei Virulence factors. S4 Table. Subnetworks statistically significantly enriched in Gene Ontology (GO) biological processes for human proteins interacting with known and/ or putative B. mallei Viru-SS Table. Functional annotation of B. mallei proteins inferred from the host-pathogen in-S6 Table. A summary of host-B. mallei interactions used in the study. (DOCX)

Characteristics of host-pathogen networks used in network alignment. (DOCX) SS Table. Functional annotation of B. mallei proteins inferred from the host-pathogen net

Acknowledgments

The opinions and assertions contained herein are the private Views of the authors and are not to be construed as official or as reflecting the Views of the United States (U.S.) Army or of the

Department of Defense. This paper has been approved for public release With unlimited distribution.

Author Contributions

Generated the interaction data: SVR KK RP. Developed the algorithms and performed the calculations: VM. Analyzed the data: VM NZ AW. Wrote the paper: VM AW IR.

Topics

statistically significant

Appears in 20 sentences as: statistical significance (1) statistically significant (15) statistically significantly (5)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. Using the HPIA algorithm, we identified a statistically significant number of functionally similar host-pathogen interactions between these three PPI datasets.
    Page 3, “Introduction”
  2. Table 2 shows that these virulence factors interacted with a statistically significant number of human proteins that were associated with 1) protein ubiquitination and ubiquitin ligase activity, 2) vesicle organization, and 3) protein complexes located in the cytoskeleton, in lysosomes, and in the nuclear lumen.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  3. The other 11 connected components consisted of five or fewer proteins, an observation that was not statistically significant from a random selection of proteins (data not shown).
    Page 7, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  4. We identified 75 statistically significant interaction modules whose GO biological process annotations largely overlapped with the ones identified for interacting partners of known virulence factors only.
    Page 9, “Putative B. mallei virulence factors improve characterization of B. mallei targets”
  5. Although the number of statistically significant interaction modules was smaller than above (an increase in the number of host proteins dilutes the enrichment), the addition of new host proteins increased the size (in terms of proteins and interactions) of previously identified interaction modules (S4 Table).
    Page 9, “Putative B. mallei virulence factors improve characterization of B. mallei targets”
  6. We identified two statistically significantly enriched host pathways: bacterial invasion of epithelial cells and focal adhesion.
    Page 10, “Putative B. mallei virulence factors improve characterization of B. mallei targets”
  7. The largest statistically significant interaction module, represented by red stars, contained proteins previously identified as bacterial targets vital for host actin cytoskeleton rearrangement, e.g., membrane-associated small GTPases (CDC42 and RALA), Filamin-A (FILA), and Rho GDP-dissociation inhibitor (ARHGDIB) [2, 33].
    Page 10, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  8. pestis and human-S. enterica and identified a statistically significant number of aligned interactions.
    Page 17, “Summary”
  9. We evaluated whether the observed values for each of the five properties were statistically significant as follows.
    Page 18, “Topological properties of human proteins interacting with B. mal/ei in the human PPI network”
  10. We retained only annotations that were enriched at an FDR control level of 0.05, i.e., there is a less than 5% chance that the obtained p-values are not statistically significant .
    Page 18, “Gene set functional enrichment analyses”
  11. We evaluated whether the observed IMt interaction modules were statistically significant as follows.
    Page 19, “Gene set functional enrichment analyses”

See all papers in March 2015 that mention statistically significant.

See all papers in PLOS Comp. Biol. that mention statistically significant.

Back to top.

cytoskeleton

Appears in 19 sentences as: cytoskeleton (20)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. In addition, the analysis revealed specific host processes relevant to B. mallei virulence factors’ pathogenicity, e.g., signaling and communication, protein modification and regulation, and cytoskeleton organization, and suggested that virulence factors preferentially targeted multifunctional host proteins, thereby affecting multiple host cellular processes simultaneously.
    Page 3, “Introduction”
  2. Table 2 shows that these virulence factors interacted with a statistically significant number of human proteins that were associated with 1) protein ubiquitination and ubiquitin ligase activity, 2) vesicle organization, and 3) protein complexes located in the cytoskeleton , in lysosomes, and in the nuclear lumen.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  3. These results were consistent with the experimentally observed pathogen interference with host cytoskeleton organization and ubiquitination levels [2, 3, 19—21, 24].
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  4. Table 4 shows that these interaction modules were associated with biological processes related to ligase activity, ubiquitination, protein modification, transcription and translation, immune response, signaling, cytoskeleton organization, development, and mRNA processing.
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  5. The majority of these molecular interactions belonged to a connected sub-pathway located at the beginning of the pathway (the probability of observing such connectivity at random is < 106), and they provided a link between membrane receptors and signaling events that led to reorganization of the actin cytoskeleton .
    Page 10, “Putative B. mallei virulence factors improve characterization of B. mallei targets”
  6. One of the most prominently recurring results across all of our analyses was the link between B. mallei pathogenicity and host cytoskeleton organization.
    Page 10, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  7. It has been shown that a number of bacterial pathogens, including Yersinia, Salmonella, Shigella, Listeria, and Barkholderia, interfere with host signaling pathways to stimulate the host’s cytoskeleton rearrangement [2, 33].
    Page 10, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  8. 4 shows host proteins that interacted with known and putative B. mallei Virulence factors that can be directly associated with cytoskeleton organization.
    Page 10, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  9. The largest statistically significant interaction module, represented by red stars, contained proteins previously identified as bacterial targets vital for host actin cytoskeleton rearrangement, e.g., membrane-associated small GTPases (CDC42 and RALA), Filamin-A (FILA), and Rho GDP-dissociation inhibitor (ARHGDIB) [2, 33].
    Page 10, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  10. The remaining cytoskeleton-related host proteins, represented as dark red circles, participated in smaller cytoskeleton organization interaction modules that were, on average, less than two proteins (three interactions or edges) away from the largest module.
    Page 11, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  11. 4) between the cytoskeleton organization interaction modules were annotated with biological processes different from cytoskeleton organization, as well as the multifunctional nature of some cyto-skeletal reorganization proteins, we examined the occurrence of shared proteins that interacted with multiple pathways, i.e., pathway crosstalk.
    Page 11, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”

See all papers in March 2015 that mention cytoskeleton.

See all papers in PLOS Comp. Biol. that mention cytoskeleton.

Back to top.

PPI network

Appears in 11 sentences as: PPI network (10) PPI networks (2) PPI networks’ (1)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. We mapped the identified host proteins interacting with B. mallei onto a human PPI network [25] consisting of 76,043 physical PPIs among 11,688 proteins.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  2. Of the 663 human proteins interacting with the nine B. mallei virulence factors, approximately 75% (498) were present in our human PPI network .
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  3. This set contained proteins that had, on average, a significantly larger number of interacting partners per protein (19.5 vs. 13.0) than would be expected from a corresponding random selection of proteins from the entire human PPI network (Table 3).
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  4. Topological properties of human proteins interacting with B. mal/ei in the human PPI network
    Page 18, “Topological properties of human proteins interacting with B. mal/ei in the human PPI network”
  5. We calculated the following topological properties for a set of human proteins interacting with B. mallei: 1) the number of human proteins interacting with B. mallei proteins (Np); 2) the average number of their interacting partners in the human PPI network (D); 3) the clustering coefficient, i.e., the number of interactions among the nearest neighbors (C); the average shortest path between any two proteins in the set (SP); the average number of interacting partners in the human PPI network where both partners interact with B. mallei proteins (Di); and the number of host proteins in the largest connected component (N itECG).
    Page 18, “Topological properties of human proteins interacting with B. mal/ei in the human PPI network”
  6. As the universe of human proteins, we used all constituent proteins from the human PPI network .
    Page 18, “Gene set functional enrichment analyses”
  7. pestis PPI networks’ characteristics, see S7 Table.
    Page 19, “Human-S. enterica and human-Y. pestis protein interactions set”
  8. Thus, we can exploit the fact that all three host-pathogen PPI networks contained interactions with human proteins that participated in similar biological processes.
    Page 20, “The HPIA algorithm”
  9. Where deg(n) denotes the degree of node n and nd(n) denotes the neighborhood density of 11, defined as where N [n] = N(n)U{n} represents the closed neighborhood of node n, i.e., the node n and the set of its adjacent nodes (for a PPI network , this corresponds to a protein and all of its interacting partners).
    Page 22, “The HPIA algorithm”
  10. For the topological node annotation, we used graphlet degree vectors [57] of all host and pathogen proteins from the host-pathogen PPI networks .
    Page 23, “The HPIA algorithm”
  11. Host proteins found in the host PPI network [25] were additionally annotated with another set of graphlet degree vectors calculated based on the host PPI network topology.
    Page 23, “The HPIA algorithm”

See all papers in March 2015 that mention PPI network.

See all papers in PLOS Comp. Biol. that mention PPI network.

Back to top.

connected component

Appears in 9 sentences as: connected component (8) connected components (1)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. A further examination of sets of connected human proteins that also interacted with the virulence factors, revealed the presence of a single, large connected component , i.e., a subnetwork in which a path connects any two proteins to each other.
    Page 7, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  2. This largest connected component was composed of 202 proteins and contained the majority (95%) of the 325 interactions between the human proteins interacting with B. mallei (Table 3).
    Page 7, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  3. The other 11 connected components consisted of five or fewer proteins, an observation that was not statistically significant from a random selection of proteins (data not shown).
    Page 7, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  4. We found a fivefold increase, from 0.28 to 1.53 (0.95*325/202), in the number of human PPIs for each protein in the largest connected component that were all targeted by the virulence factors, compared to a random selection of proteins.
    Page 7, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  5. As detailed in Materials and Methods, we used the largest connected component identified above to detect 93 sets of human PPIs in which, in each set, all human proteins interacted with at least one of the nine known B. mallei virulence factors and had the same GO biological process annotations; we denoted these sets interaction modules.
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  6. 2 shows the subset of 116 proteins and 163 interactions from the largest connected component that were part of the 93 identified interaction modules and the location of six interaction modules.
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  7. Each of the interaction modules constituting ubiquitination and ligase activity, transcriptional regulation, immune response, cy-toskeleton organization, and mRNA processing, consisted of proteins and interactions that were closely grouped together in the largest connected component (Fig.
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  8. We calculated the following topological properties for a set of human proteins interacting with B. mallei: 1) the number of human proteins interacting with B. mallei proteins (Np); 2) the average number of their interacting partners in the human PPI network (D); 3) the clustering coefficient, i.e., the number of interactions among the nearest neighbors (C); the average shortest path between any two proteins in the set (SP); the average number of interacting partners in the human PPI network where both partners interact with B. mallei proteins (Di); and the number of host proteins in the largest connected component (N itECG).
    Page 18, “Topological properties of human proteins interacting with B. mal/ei in the human PPI network”
  9. In the network-based enrichment analysis, we first identified the largest connected component in the human interactome that consisted of human proteins interacting with at least one B. mallei virulence factor (denoted as LCC).
    Page 18, “Gene set functional enrichment analyses”

See all papers in March 2015 that mention connected component.

See all papers in PLOS Comp. Biol. that mention connected component.

Back to top.

enrichment analysis

Appears in 8 sentences as: enrichment analyses (3) enrichment analysis (7)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. First, we applied functional enrichment analyses based on Gene Ontology (GO) annotation data [23] to assess the characteristics of the human proteins targeted by the nine virulence factors.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  2. However, detecting specific mechanisms of action for each pathogen protein based on enrichment analysis of large-scale Y2H protein interaction data is not trivial.
    Page 13, “Using multiple host-pathogen interaction networks to predict the role of pathogen proteins”
  3. Gene set functional enrichment analyses
    Page 18, “Gene set functional enrichment analyses”
  4. We performed GO and KEGG enrichment analyses in R using the Bioconductor packages Bio-Mart and KEGGgraph, respectively [43, 44].
    Page 18, “Gene set functional enrichment analyses”
  5. For the KEGG enrichment analysis , as the universe of human proteins, we used the human proteins available in KEGGgraph that participated in at least one KEGG pathway [44].
    Page 18, “Gene set functional enrichment analyses”
  6. We performed two types of enrichment analysis: standard enrichment analysis and network-based enrichment analysis .
    Page 18, “Gene set functional enrichment analyses”
  7. In the standard enrichment analysis , we computed the probability of observing the number of proteins annotated with a given term using the hypergeometric distribution.
    Page 18, “Gene set functional enrichment analyses”
  8. In the network-based enrichment analysis , we first identified the largest connected component in the human interactome that consisted of human proteins interacting with at least one B. mallei virulence factor (denoted as LCC).
    Page 18, “Gene set functional enrichment analyses”

See all papers in March 2015 that mention enrichment analysis.

See all papers in PLOS Comp. Biol. that mention enrichment analysis.

Back to top.

immune response

Appears in 6 sentences as: immune response (5) immune responses (1)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. Here, we performed a systematic analysis of these interactions to investigate the mechanisms by which B. mallei virulence factors interact with host proteins to establish infection, evade host immune responses , and spread within the host.
    Page 2, “Introduction”
  2. Analyses of these host-pathogen PPI datasets showed that virulence-associated pathogen proteins preferentially target host proteins involved in biological processes essential for cell vitality, e.g., signaling, cell cycle, or immune response [9—13].
    Page 3, “Introduction”
  3. Table 4 shows that these interaction modules were associated with biological processes related to ligase activity, ubiquitination, protein modification, transcription and translation, immune response , signaling, cytoskeleton organization, development, and mRNA processing.
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  4. For example, the interaction modules allowed us to identify a biological process termed “pos-itive regulation of protein ubiquitination” instead of just “protein ubiquitination.” Importantly, the analysis provided evidence of a much larger effort to target intracellular host signaling processes, in particular those related to the immune response .
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  5. Each of the interaction modules constituting ubiquitination and ligase activity, transcriptional regulation, immune response , cy-toskeleton organization, and mRNA processing, consisted of proteins and interactions that were closely grouped together in the largest connected component (Fig.
    Page 7, “B. mal/ei virulence factors target interactions among host proteins”
  6. This interaction module contained 54 interactions among 44 human proteins associated With various types of regulation (regulation of gene expression, cytokinesis, or apoptosis), signal transduction (GTPase mediated signal transduction and Ianus kinase/signal transduction), and response triggering ( immune response and response to stress).
    Page 8, “B. mal/ei virulence factors target interactions among host proteins”

See all papers in March 2015 that mention immune response.

See all papers in PLOS Comp. Biol. that mention immune response.

Back to top.

interaction networks

Appears in 5 sentences as: interaction network (2) interaction networks (3)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. Using multiple host-pathogen interaction networks to predict the role of pathogen proteins
    Page 13, “Using multiple host-pathogen interaction networks to predict the role of pathogen proteins”
  2. We randomly rewired the human protein interaction network , while preserving the same degree distribution as observed in the original network.
    Page 19, “Gene set functional enrichment analyses”
  3. For the host-pathogen interaction networks (G 1 and G2), sets U1 and U2 correspond to pathogen proteins, sets V1 and V2 correspond to host proteins, and sets E1 and E2 correspond to interactions between host and pathogen proteins.
    Page 20, “The HPIA algorithm”
  4. If node annotation is provided, the HPIA algorithm defines a similarity between pathogen proteins u1 6 U1 and uz 6 U2 as where SGDV_p and SGDV_ H denote the graphlet degree vector similarity [57] derived from the host-pathogen interaction network and the host-PPI network, respectively; SE3 represents the BLAST expected value (E-value) similarity [58], defined as 1— E-value for E-values g 1 and 0 otherwise; and SGO denotes the GO term annotation similarity calculated using the Iaccard similarity measure [59].
    Page 23, “The HPIA algorithm”
  5. We aligned each pair of host-pathogen interaction networks 30 times and reported the average and standard deviations of the alignment scores over all runs, as well as the best score (SS Table).
    Page 23, “The HPIA algorithm”

See all papers in March 2015 that mention interaction networks.

See all papers in PLOS Comp. Biol. that mention interaction networks.

Back to top.

proteins targeted

Appears in 5 sentences as: proteins targeted (5)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. We start our analysis by assessing the characteristics of the host proteins targeted by these virulence factors.
    Page 4, “Results/Discussion”
  2. First, we applied functional enrichment analyses based on Gene Ontology (GO) annotation data [23] to assess the characteristics of the human proteins targeted by the nine virulence factors.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  3. Among the highest-interact-ing host proteins targeted by the virulence factors, we found the adapter protein YWHAG (14-3-3 protein gamma) with 376 interactions.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  4. 3 shows that the proteins targeted in the focal adhesion pathway appeared to be coordinated for pathway activation and largely interacted with each other (yellow boxes).
    Page 10, “Putative B. mallei virulence factors improve characterization of B. mallei targets”
  5. Human proteins targeted by B. mallei proteins formed an interaction module that was primarily linked to cytoskeleton organization and focal adhesion.
    Page 11, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”

See all papers in March 2015 that mention proteins targeted.

See all papers in PLOS Comp. Biol. that mention proteins targeted.

Back to top.

eukaryotic

Appears in 4 sentences as: eukaryotic (4)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. Burkholderia pathogenicity relies on protein virulence factors to control and promote bacterial internalization, survival, and replication within eukaryotic host cells.
    Page 1, “Abstract”
  2. Burkholderia species need to manipulate many host processes and pathways in order to establish a successful intracellular infection in eukaryotic host organisms.
    Page 2, “Author Summary”
  3. This host-adapted bacterium is equipped with an extensive set of mechanisms for invasion and modulation of eukaryotic host-cell environments.
    Page 2, “Introduction”
  4. Using multiple virulence factors to target eukaryotic-specific mechanisms common to eukaryotic rhizosphere species, B. mallei broadly influences key processes in ubiquitination and cell signaling to modulate and adapt the host-cell environment for its benefit.
    Page 17, “Summary”

See all papers in March 2015 that mention eukaryotic.

See all papers in PLOS Comp. Biol. that mention eukaryotic.

Back to top.

Gene Ontology

Appears in 4 sentences as: Gene Ontology (4)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. First, we applied functional enrichment analyses based on Gene Ontology (GO) annotation data [23] to assess the characteristics of the human proteins targeted by the nine virulence factors.
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  2. A list of interaction modules statistically significantly enriched in Gene Ontology (GO) biological processes for human proteins interacting with known B. mallei virulence 82 Table.
    Page 24, “Supporting Information”
  3. Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways statistically significantly enriched in human proteins interacting with known and/ or putative B. mallei Virulence factors.
    Page 24, “Supporting Information”
  4. Subnetworks statistically significantly enriched in Gene Ontology (GO) biological processes for human proteins interacting with known and/ or putative B. mallei Viru-SS Table.
    Page 24, “Supporting Information”

See all papers in March 2015 that mention Gene Ontology.

See all papers in PLOS Comp. Biol. that mention Gene Ontology.

Back to top.

KEGG

Appears in 4 sentences as: KEGG (5)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. 1 to identify biological pathways targeted by all tested B. mallei proteins using the Kyoto Encyclopedia of Genes and Genomes ( KEGG ) annotation database [32].
    Page 10, “Putative B. mallei virulence factors improve characterization of B. mallei targets”
  2. We performed GO and KEGG enrichment analyses in R using the Bioconductor packages Bio-Mart and KEGGgraph, respectively [43, 44].
    Page 18, “Gene set functional enrichment analyses”
  3. For the KEGG enrichment analysis, as the universe of human proteins, we used the human proteins available in KEGGgraph that participated in at least one KEGG pathway [44].
    Page 18, “Gene set functional enrichment analyses”
  4. Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes ( KEGG ) pathways statistically significantly enriched in human proteins interacting with known and/ or putative B. mallei Virulence factors.
    Page 24, “Supporting Information”

See all papers in March 2015 that mention KEGG.

See all papers in PLOS Comp. Biol. that mention KEGG.

Back to top.

protein-protein interactions

Appears in 4 sentences as: protein-protein interactions (4)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. Here, we used host-pathogen protein-protein interactions derived from yeast two-hybrid screens to study nine known B. mallei virulence factors and map out potential virulence mechanisms.
    Page 2, “Author Summary”
  2. mallei protein-protein interactions (PPIs) allowed us to identify three novel B. mallei ATCC 23344 virulence factors and show that they attenuated B. mallei virulence in mouse aerosol challenge experiments.
    Page 2, “Introduction”
  3. mallei protein-protein interactions (PPls) was created by merging human-B.
    Page 4, “Results/Discussion”
  4. For the 498-protein set, we found 202 unique proteins that participated in 325 human protein-protein interactions .
    Page 7, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”

See all papers in March 2015 that mention protein-protein interactions.

See all papers in PLOS Comp. Biol. that mention protein-protein interactions.

Back to top.

signaling pathways

Appears in 4 sentences as: signaling pathways (4)
In Mining Host-Pathogen Protein Interactions to Characterize Burkholderia mallei Infectivity Mechanisms
  1. This protein, an interacting partner of BimA, has been implicated in the regulation of a large spectrum of signaling pathways [26].
    Page 5, “Characteristics of host proteins interacting with known B. mal/ei virulence factors”
  2. It has been shown that a number of bacterial pathogens, including Yersinia, Salmonella, Shigella, Listeria, and Barkholderia, interfere with host signaling pathways to stimulate the host’s cytoskeleton rearrangement [2, 33].
    Page 10, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  3. The large number of signaling pathways affected via the
    Page 11, “Human-B. mal/ei interactions and their effect on the crosstalk between different biological processes”
  4. These, in turn, are interconnected with a large number of signaling pathways (Fig.
    Page 15, “Multiple B. mal/ei virulence factors target eukaryotic-specific host-cell processes”

See all papers in March 2015 that mention signaling pathways.

See all papers in PLOS Comp. Biol. that mention signaling pathways.

Back to top.