High-resolution Characterization and Lineage Tracing of Extrachromosomal DNA Mutational Dynamics
Open Access
- Author:
- Mei, Han
- Graduate Program:
- Biochemistry, Microbiology, and Molecular Biology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- May 24, 2022
- Committee Members:
- Wendy Hanna-Rose, Program Head/Chair
Ross Hardison, Major Field Member
Ken Keiler, Major Field Member
Anton Nekrutenko, Chair & Dissertation Advisor
Francesca Chiaromonte, Outside Unit & Field Member
- Keywords:
- Mutation
Adaptation
Plasmid
Evolutionary dynamics
Population genetics
- Abstract:
- Mutations are the source of genetic variation and serve as the raw material on which evolutionary forces act. Our understanding of mutational dynamics has been greatly advanced by interrogating the evolutionary forces that determine the fates of mutations. In early adaptation, mutations are present at extremely low frequencies, which makes them technically challenging to identify using whole-genome sequencing methods. One goal of this dissertation is to identify mutations in early adaptation. To observe early mutational dynamics at high resolution, we selected extrachromosomal DNA (ecDNA), using a bacterial plasmid as its representative. A plasmid has a smaller genome than chromosomal DNA, represents a smaller mutational target, and can be fully sequenced using ultra-sensitive next-generation sequencing applications. The dynamics of mutations on plasmids have been largely overlooked. In theory, mutations occurring on plasmids are subject to the same evolutionary forces as chromosomal mutations. In addition, they are subject to two further evolutionary forces: segregational drift and plasmid interference. Segregational drift describes the process by which the multiple plasmid copies in a parent cell are randomly divided between two daughter cells during cell division. Plasmid interference describes the process by which different variants of the same plasmid genome compete for fixation in individual cell lineages. Plasmid dynamics in individual cell lineages are independent processes. Our limited understanding of the evolutionary dynamics of plasmid mutations stems from the difficulty of tracing these independent dynamics in individual cell lineages. Another goal of this dissertation is to provide a solution to this challenge. Chapter 1 serves as an introduction to this dissertation.
The following four topics are covered: the molecular mechanism of mutation, the evolutionary forces determining the dynamics of chromosomal mutations, the evolutionary forces determining the dynamics of extrachromosomal mutations, and the challenges in studying the evolutionary dynamics of plasmid mutations. In Chapter 2, we monitored the mutational dynamics of a bacterial plasmid in evolution experiments. Using duplex sequencing, we uncovered adaptive mutations at very low frequencies early in adaptation and traced their dynamics over the course of evolution. In addition, we presented a statistical framework that, for the first time, accounted for plasmid heteroplasmy (the coexistence of different variants of the same plasmid in individual cells) in a quantitative model. This model captured the empirical frequency dynamics of the plasmid mutations. Knowing the identities and low frequencies of adaptive mutations on the plasmid, we next aimed to monitor their dynamics and explore the underlying evolutionary forces. Because plasmid mutational dynamics in individual cell lineages are independent and mutually exclusive events, they need to be traced at the level of individual cell lineages. Chapter 3 presents siBar, a method that performs such tracing by tagging both chromosomal and plasmid DNA with random DNA barcodes. The barcodes are short stretches of DNA nucleotides. All plasmid copies within a cell carry the same DNA barcode, as do the plasmids in all descendant cells of that lineage. In this way, the challenge of tracing independent plasmid mutational dynamics in individual cell lineages can be solved. Chapter 4 describes separate work undertaken when COVID-19 emerged as a public health threat. In response to the pandemic, we set out to explore the evolutionary dynamics of the virus using existing sequencing data.
In particular, we focused on the frameshifting region between ORF1a and ORF1b, which is essential to coronavirus replication. We found that the overlap region between ORF1a and ORF1b is phylogenetically conserved in coronavirus genomes. In addition, using existing SARS-CoV-2 sequencing data, we showed exceptional conservation and detected signatures of selection within the frameshifting region. Chapter 5 summarizes my work and highlights the major contributions of this dissertation. In addition, I outline future applications of the siBar method and present the results expected under different hypotheses about what a plasmid mutation must overcome to reach fixation in a population. I also list some outstanding questions that siBar might help answer.
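
The segregational drift described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the dissertation's statistical framework: it assumes a fixed plasmid copy number, exact doubling of every copy before division, no selection, and no new mutation. All function names and parameters are hypothetical and chosen only for illustration.

```python
import random


def divide(mutant_copies, copy_number, rng):
    """Replicate every plasmid copy once, then give a random half to one daughter.

    Assumption: replication exactly doubles each copy, so the dividing cell
    holds 2 * copy_number plasmids, of which 2 * mutant_copies are mutant.
    """
    pool = [1] * (2 * mutant_copies) + [0] * (2 * (copy_number - mutant_copies))
    # Sampling without replacement models random partitioning of the copies;
    # the daughter we follow receives copy_number of them.
    return sum(rng.sample(pool, copy_number))


def follow_lineage(copy_number, start_mutants, max_generations, rng):
    """Track mutant copies along one cell lineage until loss or fixation.

    Returns (final mutant count, generations elapsed).
    """
    mutants = start_mutants
    for generation in range(max_generations):
        if mutants == 0 or mutants == copy_number:
            return mutants, generation
        mutants = divide(mutants, copy_number, rng)
    return mutants, max_generations


def fixation_fraction(copy_number, start_mutants, lineages, seed=0,
                      max_generations=1000):
    """Fraction of independent lineages in which the mutant plasmid fixes."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(lineages):
        final, _ = follow_lineage(copy_number, start_mutants,
                                  max_generations, rng)
        if final == copy_number:
            fixed += 1
    return fixed / lineages


if __name__ == "__main__":
    # One mutant copy among ten plasmids, followed in 2000 independent lineages.
    print(fixation_fraction(copy_number=10, start_mutants=1, lineages=2000))
```

Under these neutral assumptions, random partitioning leaves the expected number of mutant copies unchanged from one generation to the next, so a variant that starts as 1 of 10 copies should reach fixation in roughly one tenth of lineages and be lost from the rest. Each lineage is simulated independently, mirroring the abstract's point that plasmid dynamics in individual cell lineages are independent processes.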