Integrated Genomics Approach To Uncovering Gene Regulation and Epigenetics Across Eukaryotes

Open Access
Author:
Chang, Gue Su
Graduate Program:
Integrative Biosciences
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
March 15, 2013
Committee Members:
  • Benjamin Franklin Pugh, Dissertation Advisor
  • Ross Cameron Hardison, Committee Member
  • Istvan Albert, Committee Member
  • Yu Zhang, Committee Member
Keywords:
  • genomics
  • eukaryote transcriptional regulation
  • epigenetics
  • bioinformatics
  • high throughput sequencing
Abstract:
How is gene expression controlled in eukaryotic organisms? Recently, the Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regulatory protein-DNA interactions, including transcription-associated factors, epigenetic modifications, and 3D chromosome conformation. This research has uncovered unprecedented detail about eukaryotic gene control, and the resulting high-throughput genomic data pose many challenges in genomics and bioinformatics. This dissertation presents an integrative genome-wide approach to understanding transcriptional regulation in eukaryotes by utilizing massively parallel sequencing and bioinformatics. To achieve this research goal, we performed functional, comparative, and statistical genomics analyses with high-throughput genomic data.

Gene and nucleosome organization in the Dictyostelium genome
Genome-wide mapping of nucleosomes has significantly expanded our understanding of chromatin structure and function in eukaryotic transcriptional initiation and regulation. We present the first high-resolution maps of in vivo nucleosome locations for the social amoeba Dictyostelium discoideum. Its exceptional A/T-richness (78%) enabled us to study the role of extreme nucleotide usage in organizing nucleosomes, genes, and promoters across the genome. Our functional and comparative genomics analysis revealed a variety of functionally distinct polymeric A/T elements in the Dictyostelium genome. These tracts established the boundaries of Dictyostelium genes, associated with nucleosome-free regions and precisely positioned TATA boxes in the promoter. Dictyostelium utilized polymeric A/T elements for nucleosome placement. Moreover, because D. discoideum sits on one of the earliest branches from the last common ancestor of all eukaryotes, its exceptional ability to alternate between unicellular and multicellular forms provided us with an ideal system for understanding which principles govern the evolution of chromatin structure across eukaryotes, with an emphasis on multicellular development. Surprisingly, Dictyostelium chromatin was organized as in higher multicellular eukaryotes. The phylogenetic linkage between NELF homology and the position of the first genic nucleosome across major eukaryotes implies that transcriptional regulation imposed on multicellularity has interacted with eukaryotic chromatin over evolutionary time.

Comprehensive and high-resolution genome-wide response of p53 to UV-damage
TP53 is the most frequently mutated gene in human cancers (50%) and has been intensively studied because of its tumor-suppressive function. p53 transactivates a variety of genes in response to cellular signals and environmental stimuli. This activity requires DNA sequence-specific binding, which is notably degenerate in its sequence requirement. Thus, a major challenge has been the genome-wide characterization of interactions between p53 and its response elements (REs). We employed ChIP-exo as a high-resolution genome-wide mapping assay and comprehensively identified ~2,000 p53-bound REs across the human genome in response to UV-induced stress. Strikingly, a characteristic 6-peak ChIP-exo pattern was associated with p53/RE binding, and a half-site overlapping spatial relationship was commonly found between REs. p53 regulates a subset of genes in each stress response, and how p53 achieves such specificity has been a long-standing question. Our systematic motif analysis revealed a stereotyped spatial arrangement of p53 REs with other nearby stress-response elements, such as AP1, NRF2, and FOXO3. This result may provide a mechanistic basis for p53-mediated response specificity. p53 usually binds to distant REs for target gene activation, which makes identifying its target genes a challenge.
Our ChIP-exo mapping of the transcription preinitiation complex components TFIIB and Pol II provided not only a reliable way to identify transcription factor target genes, but also genome-wide insight into the p53-dependent transactivation mechanism. We identified 154 genes activated by p53 in the UV-stress response (80% novel), which are involved in various functions such as cell growth and death control, and DNA repair. This high-confidence target gene set greatly expanded our understanding of the p53-regulated DNA repair and cell proliferation network. Recent GWAS evidence indicates that sequence variations in non-coding DNA can be a significant risk factor for disease, but cataloguing them remains a challenge. Our comprehensive search for single nucleotide polymorphisms (SNPs) present at p53-bound REs showed a strong association between a SNP at the UV-inducible p53 RE for the POLH gene and xeroderma pigmentosum variant (XPV). This result may exemplify how non-coding regulatory variants contribute to gene expression and human disease.
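
As an illustration of the SNP search described above, the following minimal sketch shows one way to find SNPs that fall inside bound response elements. It is not the dissertation's pipeline: the function name `snps_in_elements`, the tuple layouts, and the toy coordinates are all assumptions made for this example, and intervals are assumed to be 0-based and half-open. Because REs may overlap one another, each SNP is checked against every element on its chromosome rather than against a merged interval set.

```python
from collections import defaultdict


def snps_in_elements(elements, snps):
    """Return (snp_id, chrom, start, end) for every SNP inside an element.

    elements: iterable of (chrom, start, end), 0-based half-open (assumed).
    snps:     iterable of (snp_id, chrom, pos), 0-based position (assumed).
    Overlapping elements are allowed; a SNP is reported once per element
    that contains it.
    """
    # Group element intervals by chromosome for a per-chromosome scan.
    by_chrom = defaultdict(list)
    for chrom, start, end in elements:
        by_chrom[chrom].append((start, end))

    hits = []
    for snp_id, chrom, pos in snps:
        for start, end in by_chrom[chrom]:
            if start <= pos < end:
                hits.append((snp_id, chrom, start, end))
    return hits


# Toy, made-up coordinates for illustration only (not real REs or SNPs).
toy_elements = [("chr1", 100, 120), ("chr1", 110, 130), ("chr2", 500, 520)]
toy_snps = [
    ("snp_a", "chr1", 115),  # inside both overlapping chr1 elements
    ("snp_b", "chr1", 130),  # just past the end of the second element
    ("snp_c", "chr2", 505),
    ("snp_d", "chr3", 10),   # chromosome with no elements
]

for hit in snps_in_elements(toy_elements, toy_snps):
    print(hit)
```

A linear scan per chromosome is adequate for a set on the order of the ~2,000 REs reported here; for much larger element sets, an interval tree or a sorted sweep would be the usual substitute.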