Statistical Models for Recovering Dynamic Gene Networks from Static Data

Chen, Chixiang

Open Access

Author:: Chen, Chixiang
Graduate Program:: Biostatistics
Degree:: Doctor of Philosophy
Document Type:: Dissertation
Date of Defense:: May 06, 2020
Committee Members:: Rongling Wu, Dissertation Advisor/Co-Advisor
Rongling Wu, Committee Chair/Co-Chair
Ming Wang, Committee Chair/Co-Chair
Vernon Michael Chinchilli, Committee Member
Joanna Floros, Outside Member
Runze Li, Committee Member
Xiang Zhan, Committee Member
Arthur Steven Berg, Program Head/Chair
Ming Wang, Dissertation Advisor/Co-Advisor
Keywords:: Genotype-Tissue Expression (GTEx)
Disease progression
Network learning
Ordinary differential equations
Quasi-dynamic network
Abstract:: Because they can reveal temporal changes in complex systems, dynamic networks are increasingly used across a wide range of disciplines, including medical science. Reconstructing such networks requires temporal or perturbed data. However, these types of data are rarely available in genomic studies of medicine, which significantly limits the use of dynamic networks to characterize the biological principles behind human health and disease. In this thesis, I argue that static expression data can be converted into a dynamic representation, allowing quasi-dynamic gene networks to be recovered from non-temporal and non-perturbed data. The basic premise of my argument is that genes constituting a network co-vary over samples, which can be interpreted through the lens of game theory. Originating in economic research, game theory states that each player (gene) strives to maximize its payoff (expression) based on its own strategy and the strategies of other players until the Nash equilibrium is reached. I integrate game theory to dissect the net expression of a gene into its underlying independent and dependent expression components through a system of informative, dynamic, omnidirectional, and personalized networks (idopNetworks). The independent expression is that which occurs when the gene is assumed to be in isolation, whereas the dependent expression is that which results from cumulative regulation by other genes. I modify and apply derivative-free approaches, combined with variable selection, to solve idopNetworks. I code the estimates of pairwise dependent expression of genes into a graph, forming a bidirectional, signed, and weighted gene network that captures all network features.
Because dependent expression components vary dynamically over samples, I convert sample-specific networks into context-specific networks to investigate how the architecture of gene co-regulation evolves with disease progression and differs among exposure factors. The methodologies developed in this thesis comprise two parts. In the first part, I integrate the power equation of the part-whole relationship with evolutionary game theory to derive the idopNetworks model. This model views a gene network as a closed system in which each gene interacts with every other gene as a function of network behavior. In the second part, I integrate the varying coefficient model with evolutionary game theory to develop the idopNetworks model by treating gene networks as an open system. In this system, gene interactions change as a function of factors extrinsic to the system. In both parts, I demonstrate the utility of my models by analyzing the Genotype-Tissue Expression (GTEx) data. Although the GTEx data have been extensively analyzed before, my analysis identifies some previously uncharacterized gene co-regulation mechanisms that mediate tissue specificity. I also derive the asymptotic property of the proposed learning procedure and evaluate its statistical performance through extensive computer simulation. The proposed statistical models provide an impetus to shift genomic medicine from reductionist thinking to a network-based paradigm.
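
As a rough illustration of the "power equation of the part-whole relationship" mentioned in the abstract, the sketch below assumes the form y_j = a_j * E^(b_j), where y_j is the expression of gene j in a sample and E is the total expression of all genes in that sample. The abstract does not state the equation, so this form, the log-log least-squares fit, the function name `fit_power_equation`, and the toy data are all assumptions made for illustration, not the dissertation's actual procedure.

```python
import numpy as np


def fit_power_equation(expr, index):
    """Fit y_j = a_j * index**b_j for every gene (assumed model form).

    expr  : array of shape (n_genes, n_samples), strictly positive values
    index : array of shape (n_samples,), strictly positive "whole" per sample
    Returns (a, b), each of shape (n_genes,).
    """
    log_index = np.log(index)
    log_expr = np.log(expr)
    # Design matrix with an intercept column and log(index) as the predictor.
    design = np.column_stack([np.ones_like(log_index), log_index])
    # One least-squares problem per gene, solved jointly (one column per gene).
    coef, *_ = np.linalg.lstsq(design, log_expr.T, rcond=None)
    return np.exp(coef[0]), coef[1]


# Toy data (hypothetical): 3 genes measured in 4 static samples.
toy_expr = np.array([
    [1.0, 2.0, 3.5, 6.0],
    [4.0, 5.0, 7.0, 9.0],
    [0.5, 1.5, 4.0, 10.0],
])
# The "whole" for each sample is taken here as the sum over genes (assumption).
toy_index = toy_expr.sum(axis=0)

a_hat, b_hat = fit_power_equation(toy_expr, toy_index)
for gene, (a, b) in enumerate(zip(a_hat, b_hat)):
    print(f"gene {gene}: a = {a:.3f}, b = {b:.3f}")
```

Ordering samples by the index E is one way a set of static samples could be given a quasi-dynamic axis; how the dissertation then separates independent from dependent expression along that axis is not specified in the abstract and is not attempted here.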
