Structural and Functional Roles of RNA Structure in Regulatory Processes in Plants

Open Access
Kwok, Chun Kit
Graduate Program:
Doctor of Philosophy
Document Type:
Date of Defense:
January 14, 2014
Committee Members:
  • Philip C. Bevilacqua, Dissertation Advisor/Co-Advisor
  • Sarah Mary Assmann, Dissertation Advisor/Co-Advisor
  • Scott A Showalter, Committee Member
  • Tae Hee Lee, Committee Member
  • Song Tan, Committee Member
Keywords:
  • RNA
  • RNA structure
  • RNA folding
  • Plants
  • Arabidopsis thaliana
  • G-quadruplex
  • regulation
  • DMS
  • Structure-Seq
Abstract:
The single-stranded nature of ribonucleic acid (RNA) gives this ancient biomolecule the plasticity to fold into diverse secondary and tertiary structures, allowing it to perform diverse biological functions. RNA structure plays critical roles in the regulation of gene expression and cellular processes. Knowledge of RNA structure therefore provides important molecular insights into its function in biological systems. While much is known about how RNA folds in the test tube under non-biological conditions, our current understanding of in vivo RNA structure is very limited. This can be attributed mainly to three obstacles: (1) inability to detect minute amounts of transcripts and their structures in cells; (2) inability to determine the in vivo structures of thousands of different transcripts at once; and (3) insufficient data to support the hypothesis that RNA structure regulates cellular processes in eukaryotes, particularly in plants.
The broad objective of my thesis research is to enable and carry out studies of in vivo RNA structure determination by establishing novel, sensitive, and high-throughput methods at the interface of chemistry and biology that allow probing of in vivo RNA structures in myriad low-abundance transcripts, and to investigate the regulatory roles of these structures in cellular processes. In addition, the chemical properties and biological significance of G-quadruplex structures (GQSs) are investigated. I am conducting my experiments in the model plant species Arabidopsis thaliana, which was chosen because (i) plants are sessile and are likely to have evolved (RNA-based) regulatory mechanisms to cope with stresses; (ii) its life cycle is short (4-6 weeks) and intact plants can be used for in vivo experiments; and (iii) it was the first plant to have its genome fully sequenced, and its genome is small (~125 Mb in A. thaliana versus ~3,000 Mb in human) yet offers complexity similar to that of human (~30,000 genes).
Regarding obstacle 1, I have developed a sensitive method to query the in vivo RNA structure of low-abundance transcripts by integrating chemicals that covalently modify RNA in vivo (dimethyl sulfate (DMS) or an acylation SHAPE reagent) with a general amplification technique (ligation-mediated PCR), i.e. “DMS/SHAPE-LMPCR”. This novel method probes RNA structure with attomole sensitivity, an improvement of five orders of magnitude over conventional methods. I benchmarked and demonstrated the utility of this method by chemically probing and analyzing the structures of high- and low-abundance coding and non-coding RNAs in living A. thaliana seedlings. Importantly, I revealed that in vitro and in vivo DMS/SHAPE chemical modification patterns can be radically different from each other, emphasizing the critical importance of probing RNA structure in vivo. I also constructed the secondary structure of the low-abundance U12 small nuclear RNA (snRNA) in A. thaliana from comparative sequence analysis among 11 plant species, verified the phylogenetic structural model with DMS/SHAPE-LMPCR, and provided strong evidence that the single-stranded Sm-protein binding site in U12 snRNA is indeed bound by Sm proteins in vivo. This universally applicable method opens the door to identifying and exploring the specific structure-function relationships of the multitude of low-abundance RNAs that prevail in living cells.
Regarding obstacle 2, I have helped to establish a high-throughput, genome-wide in vivo RNA structure probing method, “Structure-Seq”, in which in vivo DMS methylation of unprotected As and Cs is identified by next-generation DNA sequencing. Our innovative method promises to vastly expand our understanding of how RNA structure regulates cellular processes in living cells. This method was applied to A. thaliana seedlings and yielded the first in vivo genome-wide RNA structurome at nucleotide resolution for any organism, with structural information across more than 10,000 transcripts in a single experiment. I performed extensive experiments to demonstrate that DMS chemical probing occurred in vivo, confirmed that results from Structure-Seq and conventional gel-based DMS probing were strongly correlated, and verified that the in vivo Structure-Seq structure predictions for rRNAs were in good agreement with evolutionarily derived phylogenetic rRNA structures. Using Structure-Seq, characteristic global RNA structural patterns were revealed and novel regulatory features were uncovered in the RNA processing steps of translation, alternative polyadenylation, and alternative splicing. Moreover, the structural properties of stress-related mRNAs in vivo were identified. Overall, our study suggests that differential folding of RNA structures can regulate a number of cellular processes. The development of Structure-Seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
I also developed a new technology for single-stranded DNA (ssDNA) ligation that improves the yield of ligation products while keeping nucleotide bias low. This technique may be applicable to the DMS/SHAPE-LMPCR and Structure-Seq studies described above. A provisional patent is pending on this work. Existing ssDNA ligation methods were shown to suffer from slow kinetics, poor yield, and severe nucleotide preference. To resolve these issues, I introduced a hybridization-based strategy to allow efficient and low-bias ligation of ssDNAs. With this strategy, DNAs with different ends were found to complete ligation in less than 2 h, with low nucleotide bias and ~95% yield. Furthermore, it was successfully applied to LMPCR.
This technique could potentially be applied in protocols that require ligation of ssDNAs, including cDNA library construction.
Regarding obstacle 3, I have performed three detailed studies to better understand the structure and function of GQSs. A GQS is composed of a guanine-rich sequence that contains the pattern GxLaGxLbGxLcGx, where each G-stretch has length x ≥ 2 and the loops (L) have lengths a, b, and c ≥ 1.
(A) A systematic circular dichroism (CD) and fluorescence study was performed on RNA GQSs with varying G-stretch lengths and loop sequences. CD titration results suggested that as the length of the G-stretch increases (i.e. G2 GQS to G6 GQS), the Hill coefficient/folding cooperativity (n) generally decreases. I demonstrated that the decrease in folding cooperativity was due to the population of intermediates in GQS folding by showing a clear three-state transition in G3 and G4 GQSs. Fluorescence titration results revealed for the first time that RNA GQSs exhibit intrinsic fluorescence.
(B) The intrinsic fluorescence of GQSs was further explored using DNA GQSs with varying loop sequences, loop lengths, and G-stretch lengths. Label-free fluorescence enhancements of up to 16-fold and a shift in the fluorescence emission maximum to the visible portion of the spectrum were reported upon GQS formation. Studies A and B provide a deeper understanding of the folding and intrinsic fluorescence of GQSs. The molecular switch (high n) or rheostat (low n) mode and the intrinsic fluorescence of GQSs could potentially be engineered to develop label-free detection methods and biosensors.
(C) I have investigated the regulatory role of a GQS in planta. A thermostable GQS was identified within the 5′ UTR of Ataxia Telangiectasia-Mutated and Rad3-related (ATR) mRNA in A. thaliana. I found that the ATR GQS was more thermostable than its competing structure at physiological K⁺ and Mg²⁺ concentrations. In-line probing further confirmed the formation of the ATR GQS in vitro in the context of the complete 5′ UTR. Reporter gene assay results suggested that the ATR GQS acts as a translational repressor in living cells. This is the first GQS reported to regulate translation in any plant, and the results raise the possibility of manipulating gene expression by modulating GQS formation.
To summarize, my thesis work consists of cross-disciplinary research between chemistry and biology into the investigation of in vivo RNA structures. My work serves to improve current RNA structural probing methods and advance our understanding of the role of RNA structures in the regulation of cellular processes. In addition, some of these results may be engineered and applied in bio-applications such as nucleic acid ligation and amplification, biosensing, and gene expression manipulation.
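As an illustration of the GQS sequence pattern GxLaGxLbGxLcGx defined above, the following is a minimal Python sketch of a motif scan, not the method used in the thesis. The function name `find_gqs` and the `max_loop` upper bound on loop length are assumptions introduced here (the abstract only states that loops are ≥ 1 nucleotide), and the scan is deliberately loose: it requires four G-stretches of at least `min_g` guanines but does not force the four stretches to have equal length.

```python
import re


def find_gqs(seq, min_g=2, max_loop=7):
    """Return (start, matched_text) for each putative GQS motif in seq.

    min_g    -- minimum G-stretch length (x >= 2 in the abstract's definition)
    max_loop -- ASSUMED upper bound on loop length; the abstract only
                requires loops of at least 1 nucleotide
    """
    # Treat DNA and RNA input alike by writing everything as RNA.
    seq = seq.upper().replace("T", "U")
    g_stretch = "G{%d,}" % min_g
    # Lazy quantifier: prefer the shortest loop that still allows a match.
    loop = "[ACGU]{1,%d}?" % max_loop
    pattern = re.compile(g_stretch + loop + g_stretch + loop + g_stretch + loop + g_stretch)
    # finditer reports non-overlapping matches, scanning left to right.
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the thesis.
    for start, motif in find_gqs("AAGGAGGAGGAGGAA"):
        print(start, motif)
```

Because the matches are non-overlapping, a long G-rich region is reported as a single hit; a scan intended for real transcripts would also need to decide how to handle overlapping candidates and ambiguous bases.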