Statistical Methods for the Functional Genomic Analysis of the X Chromosome

Open Access
- Author:
- Sauteraud, Renan
- Graduate Program:
- Biostatistics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 28, 2021
- Committee Members:
- Dajiang Liu, Chair & Dissertation Advisor
David Mauger, Major Field Member
Arthur Berg, Major Field Member
Nancy Olsen, Special Member
Arthur Berg, Program Head/Chair
Laura Carrel, Outside Field Member & Dissertation Advisor
Ziaur Rahman, Outside Unit Member
- Keywords:
- Biostatistics
X-Chromosome inactivation
Mixture models
Sex-biased diseases
- Abstract:
- The X chromosome plays an important role in human development and disease. However, functional genomic and disease association studies of X genes greatly lag behind autosomal gene studies. Several analytical challenges arise from the unique biology of X, including chromosome copy number differences between males and females and X chromosome inactivation (XCI) in females with two copies of the X. Because of XCI, most genes are only expressed from one allele. Yet ~30% of X genes “escape” XCI and are transcribed from both alleles, many in only a proportion of the population. Such inter-individual differences are likely to be disease-relevant, particularly for sex-biased disorders. In the first chapter, we introduce XCIR (X-Chromosome Inactivation for RNA-Seq), a novel statistical method to identify escape genes using bulk RNA-sequencing data. Our approach jointly models the probability of errors common to the study of XCI along with the sample mosaicism. In simulations, we show improvement in power to detect escape genes over existing methods. We further validate the data in a controlled experiment and apply XCIR to publicly available data. Finally, we address limitations specific to expression-based approaches and quantify their impact in the context of XCI and the analysis of X-linked genes. In the second chapter, we apply our novel method to real data in order to understand the functional biology of X-linked genes. Using annotated XCI states, we examined the contribution of X-linked genes to disease heritability in the UK Biobank dataset. We show that escape and variable escape genes explain the largest proportion of X heritability, which is in large part attributable to X genes with Y homology. Finally, we investigated the role of each XCI state in sex-biased diseases and found that while XY homologous gene pairs have a larger overall effect size, enrichment for variable escape genes is significantly increased in female-biased diseases.
These results, for the first time, quantify the importance of variable escape genes for the etiology of sex-biased disease. Our method, available as an R package, is more powerful than alternative approaches and computationally efficient enough to handle large population-scale datasets, allowing the analysis of a broad range of phenotypes.