Towards precision public health: Building novel methods for utilizing formal and informal data
Restricted (Penn State Only)
- Author:
- Zhou, Jiayan
- Graduate Program:
- Pathobiology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- December 12, 2022
- Committee Members:
- Kumble Prabhu, Program Head/Chair
Qunhua Li, Outside Field Member
Yifei Huang, Outside Unit Member
Shefali Setia Verma, Special Member
Molly Hall, Chair & Dissertation Advisor
Kumble Prabhu, Major Field Member
- Keywords:
- precision public health
GWAS
EDGE
phenome-wide gene-gene interactions
epistasis
EWAS
environment-environment interactions
knowledge-based filtering
social media
COVID-19
vaccine acceptance index
complex traits
common diseases
- Abstract:
- Precision public health, a concept similar to precision medicine, is an emerging term widely used to describe personalized treatments, interventions, or practices for patients with similar characteristics. Identifying and understanding the features shared by particular patient subgroups relies on gathering and analyzing large amounts of data from different sources. Advancements in data-intensive biology, information technology, and health care have improved the capability to collect and store large-scale biological data for precision medicine. Meanwhile, the explosive growth of health-related data, from biomedical research and electronic health records to surveys and social media, provides complementary ways to study diseases and health issues. However, certain limitations have hindered the clinical utility of these data, including the limited variance explained for non-additive genetic effects by current additively encoded genome-wide association studies (GWAS), a lack of standards for addressing the testing burden for environmental factors involved in interaction analysis, and a limited capability for quantitative estimation in sentiment and other textual analyses. The chapters of this dissertation describe 1) novel approaches and 2) new applications of existing methods that fill some of these gaps. The three aims described in this dissertation are: 1) Elastic Data-driven Genetic Encoding (EDGE), a flexible alternative to the additive encoding method for GWAS; 2) an Integrative Genome-Exposome Method (IGEM) to enable knowledge-driven filtering of exposure-exposure (ExE) and gene-environment (GxE) interactions; and 3) a reliable quantitative method for textual analysis. The methods developed under each aim would assist in finding novel non-additive genetic risks, uncovering biologically relevant exposure-related interactions, and taking public opinion into account, thereby advancing precision public health.