Prediction, Validation and Targeted Interference of Erythroid Regulatory Modules

Open Access
Wang, Hao
Graduate Program:
Biochemistry, Microbiology, and Molecular Biology
Doctor of Philosophy
Document Type:
Date of Defense:
May 26, 2006
Committee Members:
  • Ross Cameron Hardison, Committee Chair
  • Webb Colby Miller, Committee Member
  • Pamela J Mitchell, Committee Member
  • Robert Paulson, Committee Member
  • Anton Nekrutenko, Committee Member
Keywords:
  • regulatory potential
  • gene regulation
  • conserved noncoding sequences
  • transcription factor binding sites
One of the major goals of functional genomics is to comprehensively delineate the network of regulatory modules that controls the level, timing and tissue specificity of gene expression. Such regulatory modules consist of cis-regulatory modules (CRMs), which are the DNA sequences involved in regulation, and the trans-acting factors, mostly proteins, that act at those sites. Many CRMs are clusters of distinctive motifs that, when bound by sequence-specific transcription factors, cause an increase or decrease in the amount of transcription from a target promoter. These modules can be identified experimentally by gene transfer or mutagenesis experiments, and in this thesis I modify and adapt two functional assays for finding CRMs. Work in this thesis shows that this process can be facilitated by bioinformatic approaches based on analysis of multispecies alignments. We use a discriminatory function, called the regulatory potential (RP) score, to find patterns in aligned sequences that resemble those observed in known regulatory elements. Using transcriptome analysis of rescued G1E cells (a murine Gata1-null cell line) and induced murine erythroleukemia (MEL) cells, we identify genes whose expression changes dramatically during erythroid maturation and whose regulation might involve GATA-1 and its binding sites. We couple the RP score with a second filter, conservation of predicted binding sites for the transcription factor GATA-1, to refine the predictions, because most known erythroid CRMs contain binding sites for this essential erythroid transcription factor and its binding specificity has been studied thoroughly. These candidate CRMs were tested in reporter gene assays in transiently transfected K562 cells and, after site-directed integration, in marked murine erythroleukemia cells (MEL_RL5). Selected CRMs were also tested by site-directed mutagenesis of GATA-1 binding sites and by chromatin immunoprecipitation to measure in vivo GATA-1 occupancy.
These tested DNA elements served as an initial set with which to evaluate the power of bioinformatic predictions based on RP scores plus conserved GATA-1 binding sites. The validated elements can be examined for features in common, and the results can be fed back into efforts to improve the prediction procedures. Iterative rounds of bioinformatic analysis are expected to refine the prediction of erythroid cis-regulatory modules and make it more useful and effective.
In previous studies, the orientation of integrants after targeted stable integration in MEL_RL5 cells was determined by genomic blot-hybridization. To develop a higher-throughput approach for functional analysis of predicted CRMs (preCRMs), we mapped the location of the RL5 locus in MEL cells. This mapping also allows future studies of the effects of integrated CRMs on flanking loci.
The beta-globin gene complex has served as a paradigm for studying regulated switches in gene expression during development. Although the complex has been studied extensively, the precise regulatory role has been established for only a few of the known regulatory modules, and the proteins acting at these sites are even less well defined. Reverse-genetic approaches using antisense strategies have been used to test the function of candidate proteins at a selected subset of these sites. We have been constructing cell lines that inducibly express interfering molecules, which can be the starting point for effective approaches to elucidating the complex networks of interactions and pathways in globin gene regulation. Such inducible interfering systems, in which erythroid cis-regulatory modules can be tested, should lead to a more accurate delineation of these DNA sequences.
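The two-filter prediction step described in the abstract (an RP score combined with a conserved predicted GATA-1 binding site) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Candidate` record, the function names, the RP threshold of 0.0, and the toy sequences are all hypothetical, the RP scores are treated as precomputed inputs, and a site counts as "conserved" here only if the common WGATAR consensus (or its reverse complement) starts at the same alignment column in every species.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# WGATAR consensus (W = A/T, R = A/G) and its reverse complement YTATCW.
# The lookahead lets overlapping matches be reported.
GATA_MOTIF = re.compile(r"(?=([AT]GATA[AG]|[CT]TATC[AT]))")


@dataclass
class Candidate:
    name: str
    rp_score: float          # assumed to be computed elsewhere
    aligned: dict[str, str]  # species -> aligned sequence ('-' marks gaps)


def motif_columns(seq: str) -> set[int]:
    """Return alignment columns where an ungapped motif match starts."""
    return {m.start() for m in GATA_MOTIF.finditer(seq.upper())}


def has_conserved_gata_site(aligned: dict[str, str]) -> bool:
    """True if every species has a motif match starting at a shared column."""
    column_sets = [motif_columns(seq) for seq in aligned.values()]
    return bool(column_sets) and bool(set.intersection(*column_sets))


def predict_crms(candidates: list[Candidate], rp_threshold: float = 0.0) -> list[str]:
    """Keep candidates that pass both the RP filter and the motif filter."""
    return [
        c.name
        for c in candidates
        if c.rp_score > rp_threshold and has_conserved_gata_site(c.aligned)
    ]


# Toy data for illustration only; these are not real genomic sequences.
candidates = [
    Candidate("cand1", 0.12, {"mouse": "CCAGATAAGG", "human": "CTAGATAAGC"}),
    Candidate("cand2", 0.20, {"mouse": "CCAGATAAGG", "human": "CCAGCTAAGG"}),
    Candidate("cand3", -0.05, {"mouse": "CCAGATAAGG", "human": "CCAGATAAGG"}),
]
predicted = predict_crms(candidates)
print(predicted)
```

In this toy run only `cand1` passes both filters: `cand2` loses the motif in the second species, and `cand3` falls below the assumed RP threshold. A real analysis would need gap-aware motif matching and a threshold calibrated against known regulatory elements.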