A NEW GAUSSIAN GRAPHICAL MODEL-BASED NETWORK INFERENCE METHOD TO INVESTIGATE GENE CO-EXPRESSION RELATIONSHIPS USING SINGLE-CELL RNA-SEQUENCING DATA
Restricted (Penn State Only)
Author:
Tang, Elle
Graduate Program:
Statistics
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
November 01, 2024
Committee Members:
Qunhua Li, Thesis Advisor/Co-Advisor
Lingzhou Xue, Committee Member
Nicole Lazar, Program Head/Chair
Keywords:
Gaussian Graphical Model
single-cell RNA-sequencing
single-cell genomics
gene co-expression networks
Abstract:
The rapid rise of single-cell genomics data has greatly enhanced our understanding of biological mechanisms at the cellular level. Unlike bulk genomics data, which provides an averaged view over tissue samples containing numerous cells, single-cell data resolves measurements to individual cells. However, single-cell data presents analytical challenges, particularly its high dimensionality and the substantial proportion of zeros in gene expression values. These zeros may indicate either a true absence of gene expression in a given cell or a "dropout," in which an expressed gene goes undetected; the two cases are not biologically interchangeable. Gene co-expression networks are valuable tools for investigating relationships between genes using RNA-sequencing data, including single-cell RNA-sequencing data. This thesis develops a novel method based on the Gaussian Graphical Model (GGM) framework, which represents genes as nodes and their co-expression relationships as edges. Our approach integrates a modeling framework tailored to single-cell gene expression data with a flexible technique for jointly estimating networks, accounting for heterogeneity across biological conditions. The method supports meaningful comparisons of gene relationships across biological states, thereby enhancing our ability to extract biological insights from single-cell genomics data.
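
As a rough illustration of the general GGM idea mentioned in the abstract (genes as nodes, edges where two genes remain associated after conditioning on all others), the sketch below reads a network off the inverse covariance (precision) matrix. It is not the method developed in the thesis: it does not model zeros or dropout, performs no joint estimation across conditions, and uses a plain matrix inverse rather than the sparse estimation that high-dimensional single-cell data would require. The function names, gene labels, and toy matrix are all hypothetical.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix implied by a covariance matrix."""
    theta = np.linalg.inv(cov)          # precision matrix
    d = np.sqrt(np.diag(theta))
    pcor = -theta / np.outer(d, d)      # rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def ggm_edges(cov, names, tol=1e-8):
    """List gene pairs whose partial correlation is nonzero (up to tol)."""
    pcor = partial_correlations(cov)
    p = len(names)
    return [(names[i], names[j])
            for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i, j]) > tol]

# Toy example (hypothetical): a chain g1 - g2 - g3 encoded in the precision matrix.
theta = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
cov = np.linalg.inv(theta)              # implied covariance, which is dense
genes = ["g1", "g2", "g3"]
edges = ggm_edges(cov, genes)
print(edges)
```

In this toy case g1 and g3 have a nonzero marginal covariance, yet no edge joins them because their partial correlation given g2 is zero. This is the distinction a GGM draws between direct and indirect co-expression.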