Computational methods for comparative genomics of non-model species: a case study in the parasitic plant family Orobanchaceae

Open Access
- Author:
- Wafula, Eric
- Graduate Program:
- Biology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- October 11, 2019
- Committee Members:
- Claude Walker Depamphilis, Dissertation Advisor/Co-Advisor
James Harold Marden, Committee Chair/Co-Chair
Istvan Albert, Outside Member
Naomi S Altman, Outside Member
Stephen Wade Schaeffer, Program Head/Chair
- Keywords:
- Computational biology
Evolutionary biology
Bioinformatics
Comparative genomics
Next-generation sequencing
de novo assembly
Genome annotation
Gene expression
Whole genome duplication
Gene duplication
Phylogenomics
Ancestral reconstruction
Transcriptomics
Orthogroups
Gene family evolution
Parasitic plants
Orobanchaceae
Striga
- Abstract:
- The rapid development of sequencing technologies, coupled with the continuous drop in the cost of sequencing, has facilitated studies of genomes, transcriptomes, and metagenomes of a variety of organisms at unprecedented resolution. However, sequencing and accurately assembling the genomes of many non-model organisms, especially plants, remains cost-prohibitive because these genomes are often large or complex, which poses challenges to current sequencing technologies and assembly algorithms. Therefore, many researchers now rely on comparative genomic approaches that integrate data from genomes and transcriptomes to gain novel insights into evolutionary history, including the unique features of complex non-model organisms. The genomes of parasitic angiosperms are relatively understudied. Past genome-scale research has focused primarily on understanding the mechanisms of plant parasitism as a means to control weedy species that parasitize crops. Research efforts to understand the evolutionary aspects of parasitic plants have been restricted to the plastome degradation associated with the reduction and loss of photosynthesis. In this dissertation, I present PlantTribes 2, a gene family analysis framework that uses objective classifications of complete protein sequences from genomes for comparative and evolutionary analyses of gene families and transcriptomes at genome scale. Using PlantTribes 2 and the draft genome of Striga asiatica, together with transcriptomes of sister lineages, I present evidence for an ancient polyploidy event shared by parasitic Orobanchaceae and closely related non-parasitic sister lineages. The observed gene family evolutionary dynamics in Striga reveal an association between whole-genome duplication (WGD) and the evolutionary origins of parasitism in Orobanchaceae.
Older genes whose functions are complemented by the host are overrepresented among gene losses, while gene gains often result from the WGD event specific to Orobanchaceae and involve genes whose functions are associated with further adaptations to the parasitic lifestyle. The evolutionary transition from autotrophy to heterotrophy is associated with changes in gene functions common to non-parasitic plants. These findings will provide a focus for future studies into the mechanisms of plant parasitism and potential targets for parasite control.