A computational characterisation of the relationship between genome structure and disease genes
Kibler, Tracey Deborah
This is a pilot study investigating the relationship between disease gene status and the structure of the human genome, with specific reference to regions of recombination. It compares certain characteristics of a control set of genes, which have no reported association or function in any known disease, with those of a second set of well-curated genes that have a known association with a disease. One of the benefits of recombination is the introduction of new combinations of genetic variation in the genome. Recombination hotspots are regions on the chromosome where higher-than-normal frequencies of breaking and rejoining between homologous chromosomes occur during meiosis. The hotspot regions exhibit both a non-random distribution across the human genome and varying frequencies of breaking and rejoining. The study analyzed a set of features that represent general properties of human genes, namely base composition (percentage GC content), genetic variation (single nucleotide polymorphisms, SNPs), gene length, and positional effect (distance from chromosome end), in both the disease-associated gene set and the control set. These features were linked to recombination hotspots in the human genome and to the frequency of recombination at these hotspots. Descriptive statistics were used to determine differences between the occurrences of these features in disease-associated genes and in the control set, as well as differences in the occurrence of these same features in the subset of genes containing an internal recombination hotspot compared with the genes containing no internal recombination hotspot. The study found that disease-associated genes are generally longer than those in the control set, which is consistent with previous studies. It also found that disease-associated genes are much more likely to contain a recombination hotspot than genes with no disease association.
The study did not, however, find any association between disease gene status and the remaining features, namely GC content, SNP numbers, or the position of a gene on the chromosome. Further analysis of the data suggested that the increased probability of disease-associated genes containing a recombination hotspot is most likely an effect of their greater gene length, and that the presence of a recombination hotspot is not sufficient in its own right to confer disease gene status.
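
The kind of comparison described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the `Gene` record, the function names, the half-open coordinate convention, and the fixed-width length bins are all assumptions made for illustration. "Internal" is interpreted here as a hotspot lying entirely within the gene's boundaries. Grouping genes by length before comparing hotspot rates is one simple way to check whether a difference between disease-associated and control genes persists once gene length is held roughly constant.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are assumed half-open [start, end)."""
    name: str
    start: int
    end: int
    disease: bool  # True for the disease-associated set, False for the control set


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def contains_hotspot(gene: Gene, hotspots: List[Tuple[int, int]]) -> bool:
    """True if any hotspot interval lies entirely inside the gene (assumed definition)."""
    return any(gene.start <= h_start and h_end <= gene.end
               for h_start, h_end in hotspots)


def hotspot_rate_by_length(
    genes: List[Gene],
    hotspots: List[Tuple[int, int]],
    bin_size: int,
) -> Dict[Tuple[int, bool], float]:
    """Fraction of genes with an internal hotspot, keyed by (length bin, disease flag)."""
    counts: Dict[Tuple[int, bool], Tuple[int, int]] = {}
    for gene in genes:
        key = ((gene.end - gene.start) // bin_size, gene.disease)
        total, with_hotspot = counts.get(key, (0, 0))
        counts[key] = (total + 1, with_hotspot + contains_hotspot(gene, hotspots))
    return {key: with_hotspot / total
            for key, (total, with_hotspot) in counts.items()}
```

If the hotspot rates of the two gene sets are similar within each length bin, that pattern would be consistent with gene length, rather than disease status, accounting for the overall difference.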