“Omics of Disease”: Automation, Weaving, and Capturing

Open Access
- Author:
- Aguilar, Morris
- Graduate Program:
- Bioinformatics and Genomics (PhD)
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- October 30, 2023
- Committee Members:
- David Koslicki, Program Head/Chair
Andrew Patterson, Major Field Member
Dajiang Liu, Outside Unit & Field Member
James Broach, Major Field Member
Dokyoon Kim, Special Member
Molly Hall, Chair & Dissertation Advisor
- Keywords:
- Multi-Omic Analysis
Genomics
Metabolomics
Small RNA
Precision Medicine
Bioinformatics
Disease Pathogenesis
Biomarker Discovery
Neurodegenerative Diseases
Computational Biology
Cardiovascular Disease
- Abstract:
- The complex relationship between genetics, metabolites, microvesicle RNAs, and disease progression is central to biomedical research. This dissertation examines several layers of human disease and highlights the need for new methods and tools to decode the underlying mechanisms and advance multi-omic analysis. Its chapters range from developing an automated metabolomics pipeline and analyzing cardiovascular association networks to examining microvesicle RNAs, each revealing a distinct facet of multi-omic analysis and adding to our understanding of human health and illness. While metabolomics analysis has shed light on the metabolic alterations in cardiovascular disease (CVD) and Parkinson's disease (PD), the field still struggles to identify a comprehensive set of metabolites that could serve as disease biomarkers. These metabolic changes offer potential diagnostic and therapeutic targets, emphasizing the importance of metabolic profiling in understanding disease mechanisms. However, automation tools with standardized methodologies are needed before the field can quantify a comprehensive set of metabolites that might be biomarkers for disease. This dissertation documents an attempt to create a semi-automated metabolomics pipeline (SAMP) as an alternative to Chenomx, a commercial software package for computer-assisted manual quantification. With our approach, we putatively identified 79 metabolites previously unreported in the dataset. However, a follow-up concordance analysis between SAMP and Chenomx revealed major inaccuracies in SAMP and supported Chenomx as the superior tool. We discuss SAMP’s shortcomings and provide guidance for future attempts to assemble an automated metabolomics pipeline.
Future automation of the manual task of metabolite quantification will aid the broad application of NMR metabolomics to metabolically profile individuals with a disease and enable its incorporation in future multi-omic analyses. As we gather more data for other -omic domains, the field encounters the challenge of organizing these domains in a manner that aids in identifying biomolecular features that are likely to play a larger role in disease pathogenesis. Investigations that have considered a single -omic domain, such as genome-wide association studies, have successfully identified genetic markers of disease; however, these genetic signatures need to be placed in the context of other biological domains. This added context can help the field generate new hypotheses on how interdomain features function in a given disease. Multi-omic association networks can intertwine genomic, metabolomic, drug treatment, and disease features to capture the multifaceted nature of cardiovascular disease. Such a network simplifies the vast connectivity of -omic domains in disease states, streamlining the identification of multi-omic features in complex diseases. We ran nine regression analyses, which cumulatively yielded 3,254 statistically significant results (total tests: 183,520,500), and these results were used to construct the network. Analysis of the network revealed that dyslipidemia plays a pivotal role in cardiovascular disease, exhibiting connections throughout the genomic, metabolic, and pharmacological domains. Newly identified associations between genomic and metabolite features, particularly those concerning collagen synthesis and cell growth, could potentially explain the structural alterations in the coronary and vascular systems seen in cardiovascular disease.
Utilizing a multi-omic network that integrates genomic, metabolomic, and microvesicle domains can significantly enhance the selection of features for disease risk prediction models, providing a more comprehensive and accurate representation of underlying biological mechanisms and thereby advancing precision medicine. Such an approach is instrumental in understanding the interconnected changes across -omic features, laying a solid foundation for subsequent molecular investigations. Neuronally derived microvesicle small RNAs (smRNAs) are non-coding RNAs that play critical roles in cellular communication and disease manifestation in neurodegenerative diseases such as Parkinson's disease (PD). Encased within microvesicles, these RNAs mediate interactions between cells; however, not all microvesicles found in blood serum have neuronal origins. Our study focused on determining whether smRNA profiles in neuron-specific serum exosomes and microvesicles differ between PD patients and healthy individuals. Using a proven neuronal marker (CD171), we extracted and isolated exosomes and microvesicles from serum samples. CD171-enriched serum exosomes and microvesicles displayed 29 smRNAs with significantly different expression levels between PD patients and controls, with 23 smRNAs upregulated and 6 downregulated in PD cases. Pathway analysis indicated that these smRNAs are involved in regulating cell proliferation and signaling pathways. Univariate logistic regression models identified four smRNAs with an area under the curve (AUC) of at least 0.74, effectively distinguishing PD subjects from controls. Furthermore, a random forest model using the 29-smRNA panel achieved high predictive accuracy, with an AUC of 0.942. Their capacity to predict the PD phenotype demonstrates the potential of microvesicle RNAs as biomarkers for PD.
This sets the stage for future multi-omic studies to incorporate this domain in pursuit of non-invasive early diagnosis and precise therapeutic strategies.