Computational analysis of multilevel omics data for the elucidation of molecular mechanisms of cancer
Cancer is a group of diseases that arise from irreversible genomic and epigenomic alterations resulting in unrestrained proliferation of abnormal cells. A detailed understanding of the molecular mechanisms underlying a cancer would aid the identification of most, if not all, of the genes responsible for its progression, as well as the development of molecularly targeted chemotherapy. The challenge of recurrence after treatment shows that our understanding of cancer mechanisms is still poor. As a contribution to overcoming this challenge, we provide an integrative multi-omic analysis of glioblastoma multiforme (GBM), for which large data sets on different classes of genomic and epigenomic alterations have been made available in The Cancer Genome Atlas data portal. The first part of this study involves protein network analysis for the elucidation of GBM tumourigenic molecular mechanisms, identification of driver genes, prioritization of genes in chromosomal regions with copy number alteration, and co-expression and transcriptional analysis. Functional modules were obtained by edge-betweenness clustering of a protein network constructed from genes with mutations of predicted functional impact and from differentially expressed genes. Pathway enrichment analysis was performed on each module to identify statistically overrepresented signaling pathways. Known and novel candidate cancer driver genes were identified in the modules, and functionally relevant genes in chromosomal regions altered by homozygous deletion or high-level amplification were prioritized using the protein network. Co-expressed modules enriched in cancer biological processes and transcription factor targets were identified using network genes that demonstrated high expression variance. Our findings show that GBM's molecular mechanisms are much more complex than those reported in previous studies.
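As an illustration of the per-module pathway enrichment step, the sketch below computes a one-sided hypergeometric overrepresentation p-value for a single module and a single pathway. The hypergeometric test is an assumption here: it is a common choice for this kind of analysis, but the text does not state which test was used. The function name and the gene identifiers are hypothetical placeholders.

```python
from math import comb


def hypergeom_enrichment_p(module_genes, pathway_genes, background_genes):
    """One-sided hypergeometric p-value, P(overlap >= observed).

    Assumption: the background is the set of all genes that could have
    ended up in a module (for example, every gene in the protein network).
    """
    background = set(background_genes)
    module = set(module_genes) & background
    pathway = set(pathway_genes) & background

    N = len(background)        # background size
    K = len(pathway)           # background genes annotated to the pathway
    n = len(module)            # module size
    k = len(module & pathway)  # observed overlap

    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Hypothetical toy data: 20 background genes, a 5-gene pathway, a 4-gene module.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
module = ["g0", "g1", "g2", "g10"]
p_value = hypergeom_enrichment_p(module, pathway, background)
```

In practice one p-value would be computed per module and pathway pair and then corrected for multiple testing; that correction is omitted from this sketch.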
We next identified differentially expressed miRNAs whose target genes in the protein network were also differentially expressed. MiRNAs and target genes were prioritized based on the number of targeted genes and targeting miRNAs, respectively. MiRNAs that correlated with time to progression were selected by an elastic net-penalized Cox regression model for survival analysis. These miRNAs were combined into a signature that independently predicted adjuvant therapy-linked progression-free survival in GBM and its subtypes, and overall survival in GBM. The results show that miRNAs play significant roles in GBM progression and patient survival. Finally, a prognostic mRNA signature that independently predicted progression-free and overall survival was identified. Pathway enrichment analysis was carried out on genes with high expression variance across a cohort to identify those in chemoradioresistance-associated pathways. A support vector machine-based method was then used to identify a set of genes that discriminated between rapidly and slowly progressing GBM patients with a minimal cross-validation error rate of 5%. The prognostic value of the gene set was demonstrated by its ability to predict adjuvant therapy-linked progression-free and overall survival in GBM and its subtypes, and was validated in an independent data set. We have identified a set of genes involved in tumourigenic mechanisms that could potentially be exploited as targets in drug development for the treatment of primary and recurrent GBM. Furthermore, given their demonstrated accuracy in this study, the identified miRNA and mRNA signatures have strong potential to be combined and developed into a robust clinical test for predicting prognosis and treatment response.
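The miRNA and target-gene prioritization described above amounts to counting connections in a miRNA-target map. A minimal sketch follows, assuming the input is a list of (miRNA, target gene) pairs already filtered to differentially expressed entries. The function name and all identifiers are hypothetical placeholders, not results from this study.

```python
from collections import defaultdict


def prioritize(pairs):
    """Rank miRNAs by number of distinct targets and genes by number of distinct targeting miRNAs."""
    targets_of = defaultdict(set)  # miRNA -> set of target genes
    mirnas_of = defaultdict(set)   # gene  -> set of targeting miRNAs
    for mirna, gene in pairs:
        targets_of[mirna].add(gene)
        mirnas_of[gene].add(mirna)

    def rank(mapping):
        # Highest count first; ties broken alphabetically for reproducibility.
        counts = [(name, len(members)) for name, members in mapping.items()]
        return sorted(counts, key=lambda item: (-item[1], item[0]))

    return rank(targets_of), rank(mirnas_of)


# Hypothetical toy input; the duplicate pair is counted only once.
pairs = [
    ("miR-A", "GENE1"),
    ("miR-A", "GENE2"),
    ("miR-B", "GENE1"),
    ("miR-A", "GENE1"),
]
mirna_ranking, gene_ranking = prioritize(pairs)
```

Using sets rather than raw counts keeps repeated database entries for the same pair from inflating a miRNA's or gene's rank.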