Genome Architecture and Evolution Shaped by Transposable Elements

Open Access
- Author:
- Chen, Di
- Graduate Program:
- Genetics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- December 01, 2020
- Committee Members:
- Mark Shriver, Dissertation Advisor/Co-Advisor
Ross Cameron Hardison, Committee Chair/Co-Chair
Stephen Wade Schaeffer, Committee Member
Robert Paulson, Outside Member
Robert Paulson, Program Head/Chair
- Keywords:
- Transposable Elements
L1
Functional Data Analysis
Genome Evolution
Genetics
- Abstract:
- Transposable Elements (TEs) are important constituents of the human genome and are considered to play a critical role in shaping genome architecture and evolution. In previous in vivo and in vitro studies, the genomic distribution of TEs has been investigated along with some of their functions in gene regulation and various cellular processes. However, to date, there has not been a high-resolution, genome-wide study of TEs in an evolutionary framework, through which the insertion and fixation preferences of the elements can be addressed in detail. Also, the interactions between TE activity and the local genomic landscape have not been fully characterized. The long-term goal of this study is to characterize the transposition dynamics of TEs and to further understand their contribution to the structure, function, and evolution of the human genome. In this dissertation, I focused on one specific TE family, namely the Long Interspersed Element-1 (LINE-1 or L1), which constitutes >17% of the human genome and is still actively transposing in it. I studied the genome-wide insertion and fixation preferences of L1s at high resolution and investigated their interactions with different genomic landscape features such as histone modifications and DNA methylation. Specifically, I analyzed three large datasets of L1s that integrated at different evolutionary time scales: 17,037 de novo L1s (from an L1 insertion cell-line experiment conducted in-house), and 1,212 polymorphic and 1,205 human-specific L1s (from public databases). I also characterized 49 genomic features (proxies for chromatin accessibility, transcriptional activity, replication, recombination, etc.) in the ±50 kb flanks of these elements. These features were contrasted between the three L1 datasets and L1-depleted regions using state-of-the-art Functional Data Analysis (FDA) statistical methods, which treat high-resolution data as mathematical functions.
The results indicate that de novo, polymorphic, and human-specific L1s are surrounded by different genomic features acting at specific locations and scales. This led to an integrative model of L1 transposition, according to which L1s preferentially integrate into open-chromatin regions enriched in non-B DNA motifs, whereas they are fixed in regions largely free of purifying selection, that is, regions depleted of genes and of the most conserved non-coding elements. Intriguingly, the results also suggest that L1 insertions modify the local genomic landscape by extending CpG methylation and increasing mononucleotide microsatellite density. Altogether, the findings in this dissertation substantially improve our understanding of L1 integration and fixation preferences, and suggest a critical role for TE activity in human health and disease.