Detection of gene communities in multi-networks reveals cancer drivers
Dr Laura CANTINI
Institut Curie, INSERM U900, PSL Research University, Mines ParisTech, Paris, France
Wednesday, 22 March 2017 - 11:00
- Meeting room 4004, IGBMC
Invited by Functional Genomics and Cancer, Hinrich GRONEMEYER
In recent years, the advent of high-throughput experimental technologies has provided biologists with a flood of molecular data. Interpreting this volume of information requires efficient methodologies. Among them, network analysis has proved very effective at capturing the molecular complexity of human diseases. Thus far, network-based computational methods have focused primarily on the analysis of single biological networks. However, such an approach has turned out to be insufficient to unveil functional regulatory patterns originating from complex interactions across multiple layers of biological relationships. A pressing need in molecular biology is therefore to design network-based methods that allow the combined use of multiple levels of genomic information. Many solutions have been proposed in the last few years. Among them, multiplex networks have played a special role, having recently emerged as one of the major contemporary topics in network theory. Some relevant applications in biology already exist: Li and colleagues studied a multilayer structure composed of 130 co-expression networks, in which each layer represents a different experimental condition. Subsequently, they also constructed two-layer networks composed of a standard co-expression network and an exon co-splicing network. More recently, Bennett and co-workers identified communities in the multiplex network of physical, genetic and co-expression interactions in yeast, using mathematical programming with the Newman-Girvan modularity as the objective function. Following this line, we propose a multi-network-based approach for the identification of candidate driver genes in cancer. We use the term "multi-network" rather than "multiplex" because we do not consider couplings between the layers.
Cancer is a complex disease caused by a progressive accumulation of dysfunctions in neoplastic cells. During the last decade, technological advances have enabled laboratories to monitor these alterations quantitatively. Efficient methodologies have been designed to interpret these data and identify the genes driving neoplastic growth. However, these approaches are classically applied separately to biological measurements that are clearly not independent. For this reason, we consider the identification of cancer driver genes to be well suited to a multi-network analysis. To address this problem, we combined four different gene networks in a single multi-network: (i) Transcription Factor (TF) co-targeting network, (ii) microRNA co-targeting network, (iii) Protein-Protein Interaction (PPI) network and (iv) gene co-expression network. The rationale behind this choice is that the onset of cancer is typically due to a dysregulation of the signaling and/or regulatory network of the cell. These regulatory pathways are tightly controlled at both the transcriptional and the post-transcriptional levels, and their alteration very often involves changes in the expression levels of genes that are at the same time partners in a protein-protein interaction and targeted by the same set of transcription factors and miRNAs. These are exactly the events that are selected and prioritized in the multi-network-based analysis that we propose. Following the construction of the multi-network, we proceed with the identification of communities, that is, groups of nodes that are densely connected to each other but sparsely connected to the other nodes of the network. This is achieved by detecting gene communities within each layer of the multi-network and then identifying multi-network communities via consensus clustering across the four layers.
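The consensus step across layers can be illustrated with a minimal sketch. It assumes that each layer has already been partitioned into communities by some community detection algorithm, and it uses a simple co-membership rule: two genes stay together if they share a community in at least `min_layers` layers. This rule, the function name `consensus_communities`, the `min_layers` parameter and the toy gene labels are all assumptions made for illustration; the abstract does not specify the consensus procedure actually used.

```python
from collections import defaultdict
from itertools import combinations


def consensus_communities(layer_partitions, min_layers):
    """Group genes that share a community in at least `min_layers` layers.

    Hypothetical helper: `layer_partitions` holds one partition per layer,
    each partition being a list of sets of gene identifiers.
    """
    # Count, for every gene pair, the layers in which both genes
    # belong to the same community.
    co_membership = defaultdict(int)
    genes = set()
    for partition in layer_partitions:
        for community in partition:
            genes.update(community)
            for a, b in combinations(sorted(community), 2):
                co_membership[(a, b)] += 1

    # Keep only the pairs supported by enough layers.
    adjacency = {gene: set() for gene in genes}
    for (a, b), count in co_membership.items():
        if count >= min_layers:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Consensus communities are the connected components of the
    # thresholded co-membership graph (depth-first traversal).
    seen, result = set(), []
    for start in sorted(genes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(component)
    return result


# Toy partitions (hypothetical genes A-E), one per layer:
# TF co-targeting, miRNA co-targeting, PPI, co-expression.
layers = [
    [{"A", "B", "C"}, {"D", "E"}],
    [{"A", "B"}, {"C", "D", "E"}],
    [{"A", "B", "C"}, {"D", "E"}],
    [{"A", "B"}, {"C"}, {"D", "E"}],
]
consensus = consensus_communities(layers, min_layers=3)
```

In this toy input, genes A and B share a community in all four layers, as do D and E, whereas C is grouped with A and B in only two layers and is therefore left on its own at `min_layers=3`. Raising or lowering the threshold trades community size against cross-layer support.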
Community detection within a network is well known to be an open and difficult problem. For this reason, we tested several well-known community detection algorithms, all of which can be run in our multi-network analysis package “Gene4x”, described in Fig. 1 and available at https://github.com/lcan88/Gene4x.git. To check whether the multi-network communities are more biologically relevant than the communities obtained from the co-expression network alone, we applied the analysis to human gastric, lung, pancreas and colon cancer datasets and tested the resulting multi-network and co-expression network communities for functional enrichment or for differential expression between tumor and normal tissues. In all four cancer types, the multi-network communities highlighted new, relevant tumor-specific functional enrichments (including chromosomal aberrations, candidate markers and driver genes) not detected by the co-expression network alone, providing evidence of the power of multi-network-based approaches in extracting knowledge from complex, multidimensional molecular data.
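One common way to test a community for functional enrichment is a one-sided hypergeometric (over-representation) test, sketched below. The abstract does not state which enrichment test was used, so this choice is an assumption, and the function name `enrichment_p_value`, its parameters and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, community_size, n_hits):
    """One-sided hypergeometric p-value P(X >= n_hits).

    n_background   : genes in the background set
    n_annotated    : background genes carrying the annotation
    community_size : genes in the tested community
    n_hits         : community genes carrying the annotation
    """
    upper = min(community_size, n_annotated)
    # Sum the probabilities of drawing n_hits or more annotated genes
    # when sampling `community_size` genes without replacement.
    favourable = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, community_size - i)
        for i in range(n_hits, upper + 1)
    )
    return favourable / comb(n_background, community_size)


# Hypothetical counts: 20 background genes, 5 of them annotated,
# and a community of 4 genes of which 3 are annotated.
p_value = enrichment_p_value(20, 5, 4, 3)
```

With these toy counts the p-value equals 155/4845, roughly 0.032. In a real analysis the p-values obtained over many annotation terms and communities would also need a multiple-testing correction.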