Functional and Evolutionary Genomics of Plant Small RNAs

Open Access
Author:
Ma, Zhaorong
Graduate Program:
Integrative Biosciences
Degree:
Doctor of Philosophy
Document Type:
Date of Defense:
June 10, 2013
Committee Members:
  • Michael Axtell, Dissertation Advisor
  • Michael Axtell, Committee Chair
  • Claude Walker Depamphilis, Committee Member
  • Naomi S Altman, Committee Member
  • Ross Cameron Hardison, Committee Member
Keywords:
  • genomics
  • small RNAs
  • microRNAs
  • comparative genomics
  • evolutionary genomics
  • targeted enrichment
  • targeted genomic enrichment
Abstract:
In plants, microRNAs (miRNAs) and small interfering RNAs (siRNAs) account for the majority of the small RNA population. They play critical roles in multiple cellular processes through post-transcriptional regulation of RNA targets. Some miRNAs are well conserved among different plant lineages, while others are less conserved. It is not clear whether less-conserved miRNAs have the same functionality as the well-conserved ones. Heterochromatic siRNAs are broadly produced in the Arabidopsis thaliana genome, sometimes from active “hotspot” loci. It is unknown whether individual heterochromatic siRNA hotspots are retained as hotspots between plant species. In Chapter 2, we compare small RNAs in two closely related species (Arabidopsis thaliana and Arabidopsis lyrata) and find that less-conserved miRNAs have high rates of divergence in MIRNA hairpin structures, mature miRNA sequences, and target complementary sites in the other species. The fidelity of miRNA biogenesis from many less-conserved MIRNA hairpins frequently deteriorates in the sister species relative to the species of first discovery. We also observe that heterochromatic siRNA-occupied loci have a slight tendency to be retained as heterochromatic siRNA loci between species, but the most active A. lyrata heterochromatic siRNA hotspots are generally not syntenic to the most active heterochromatic siRNA hotspots of A. thaliana. Altogether, our findings indicate that many MIRNAs and most heterochromatic siRNA hotspots are rapidly changing and evolutionarily transient within the Arabidopsis genus.
Small RNAs, many of which play important regulatory roles, are broadly present in all known plant species. In Chapter 3, we surveyed the small RNA populations of three plants: the tree crop Theobroma cacao, the oil palm Elaeis guineensis Jacq., and the model moss Physcomitrella patens. In Theobroma cacao, we computationally identified 83 conserved miRNAs and 91 miRNA targets using sequence similarity and secondary structure information. In oil palm, we identified 28 miRNA families expressed during flower development by analyzing small RNA-seq data. In Physcomitrella patens, we identified a novel family of trans-acting siRNA (ta-siRNA) loci associated with miR156- and miR529-directed slicing by scanning the genome for ta-siRNA-like small RNA accumulation patterns in different genetic backgrounds. As a whole, these studies demonstrate that many small RNA species are deeply conserved in the plant kingdom, while novel classes of small RNAs can evolve in specific lineages.
Conserved plant miRNAs modulate important biological processes, but little is known about conserved cis-regulatory elements (CREs) surrounding MIRNA genes. In Chapter 4, we developed a solution-based targeted genomic enrichment methodology to capture, enrich, and sequence genomic regions flanking conserved MIRNA genes using a locked nucleic acid (LNA)-modified, biotinylated probe complementary to the mature miRNA sequence. Genomic DNA bound by the probe is captured by streptavidin-coated magnetic beads, amplified, sequenced, and assembled de novo to obtain the genomic DNA sequences flanking the MIRNA locus of interest. We demonstrate the effectiveness of this method in Arabidopsis thaliana, showing its sensitivity and specificity in enriching targeted regions spanning 10-20 kb surrounding known MIR166 and MIR165 loci. Assembly of the sequencing reads successfully recovered all targeted loci. While further optimization for larger, more complex genomes is needed, this method may enable determination of the genomic DNA sequence flanking a known core (such as a conserved mature miRNA) in multiple species that currently lack a full genome assembly.
Altogether, through sequencing data analysis and comparative genomics, these studies contribute to the understanding of the function and evolution of plant small RNAs.