Transcription regulation and candidate diagnostic markers of esophageal cancer
Abstract
Esophageal cancer (EC) ranks among the ten most frequent cancers worldwide. Mortality rates for EC closely track incidence rates because the disease is typically diagnosed at a late stage and treatment has poor efficacy. The aim of this study was to improve insight into the putative transcriptional circuitry of EC genes, and thereby to improve knowledge of therapeutic targets, indicate more appropriate lines of treatment, and allow putative candidate diagnostic markers to be determined for early-stage detection of EC.
This thesis reports on the development of a novel comprehensive database (Dragon Database of Genes Implicated in Esophageal Cancer, DDEC), an integrated knowledge database intended as a gateway to esophageal cancer related data. More importantly, it illustrates how the biocurated genes in the database may represent a reliable starting point for uncovering transcriptional regulation, diagnostic markers and the biology related to esophageal cancer. DDEC contains known and novel information for 529 differentially expressed EC genes compiled from scientific publications in PubMed and is freely accessible to academic and non-profit users at http://apps.sanbi.ac.za/ddec/. The novel information provided to DDEC users consists of lists of putative transcription factors that potentially control the 529 manually curated genes. The value of the information accessible through the database was further enhanced by providing precompiled text-mined and data-mined reports about each of these genes. These reports allow easy exploration of associations of EC-implicated genes with other human genes and proteins, metabolites and enzymes, toxins, chemicals with pharmacological effects, disease concepts and human anatomy. This feature can display potential associations that are rarely reported and thus difficult to identify, and it enables the inspection of potentially new ‘association hypotheses’ generated from the precompiled reports.
This study further illustrates how the biocurated esophageal squamous cell carcinoma (ESCC) genes in the database may represent a reliable starting point for exploring beyond current knowledge of the transcriptional circuitry of estrogen-related hormone therapy. The genes were used to develop a method that identified 44 combinations of transcription factors (TFs) that characterize the promoter sequences of estrogen-responsive genes implicated in ESCC. These significantly over-represented combinations of TFs were then used to increase confidence in 47 novel putative estrogen response genes that may also be related to ESCC. Coincidentally, two of the novel putative estrogen response genes were verified by experimental publications that were current at the time of writing (2009).
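
The abstract does not state which statistical test was used to call a TF combination "significantly over-represented". As an illustration only, the sketch below scores TF pairs with a one-sided hypergeometric test, comparing target promoters against a background set. The function names, the restriction to pairs, the 0.05 threshold and the input layout (one set of predicted TF names per promoter) are all assumptions made for this example, not the method of the thesis.

```python
from itertools import combinations
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n promoters are drawn from N, of which K carry the combination."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def overrepresented_pairs(target, background, alpha=0.05):
    """Return TF pairs enriched in target promoters relative to the background.

    target, background: lists of sets; each set holds the TF names predicted
    to bind one promoter (hypothetical input layout).
    Returns (pair, hits_in_target, hits_overall, p_value) tuples, sorted by p-value.
    No multiple-testing correction is applied in this sketch.
    """
    N = len(target) + len(background)  # all promoters considered
    n = len(target)                    # promoters in the target group

    def count(promoters, pair):
        # Number of promoters whose TF set contains both members of the pair.
        return sum(1 for tfs in promoters if set(pair) <= tfs)

    # Only pairs that occur in at least one target promoter can be enriched.
    candidates = set()
    for tfs in target:
        candidates.update(combinations(sorted(tfs), 2))

    results = []
    for pair in candidates:
        k = count(target, pair)
        K = k + count(background, pair)
        p = hypergeom_upper_tail(k, N, K, n)
        if p < alpha:
            results.append((pair, k, K, p))
    return sorted(results, key=lambda r: (r[3], r[0]))
```

In practice many combinations are tested at once, so a correction for multiple testing (for example Bonferroni or a false discovery rate procedure) would normally be applied before reporting any combination as significant.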