Computational Methods For Characterizing The Sequence And Chromatin Determinants Of Transcription Factor Binding

Open Access
- Author:
- Srivastava, Divyanshi
- Graduate Program:
- Bioinformatics and Genomics (PhD)
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- May 27, 2021
- Committee Members:
- Frank Pugh, Major Field Member
Shaun Mahony, Chair & Dissertation Advisor
Ross Hardison, Outside Field Member
Daniel Kifer, Outside Unit Member
George Perry, Program Head/Chair
- Keywords:
- Transcription Factors
Gene Regulation
Chromatin
Neural Networks
Neural Networks for Biological Sequences
Interpretable Models For DNA Sequences
Protein-DNA Binding
- Abstract:
- Transcription factor (TF) binding sites are determined by the interplay between a TF’s sequence preferences and the chromatin environment within which the TF is expressed or induced. While a large body of work has characterized TF sequence preferences, the chromatin predeterminants of TF binding remain incompletely understood. In this dissertation, we develop computational approaches that examine the sequence and preexisting chromatin predictors of induced TF binding. We develop Bichrom, an interpretable convolutional neural network that integrates DNA sequence data and preexisting chromatin data to predict genome-wide TF binding. We apply Bichrom to several TFs induced in predetermined chromatin environments. We demonstrate that preexisting chromatin is a differential determinant of induced TF binding: TFs differ greatly in their dependence on preexisting chromatin environments. We leverage Bichrom’s additive architecture to examine the per-site predictors of TF binding. We find that TF binding landscapes are extremely heterogeneous and that distinct subsets of TF binding sites are predicted by distinct combinations of sequence and chromatin features. In the second half of this dissertation, we examine the sequence and preexisting chromatin determinants of paralogous TF binding. We focus on homeodomain-containing HOX TFs, which recognize similar DNA sequence motifs and yet direct independent transcriptional programs. We analyze a subset of HOX TFs that are responsible for motor neuron subtype specification. We demonstrate that HOXC9 differentiates its genome-wide binding preferences by binding a set of previously inaccessible sites. Finally, we apply Bichrom to an expanded set of paralogous HOX TFs. Our results demonstrate that paralogous TFs may differ in their abilities to bind inaccessible chromatin. Taken together, our work provides novel insights into the determinants of genome-wide TF binding.
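
To illustrate why an additive architecture allows per-site predictors to be examined, the following is a minimal sketch and not Bichrom's actual implementation. It assumes that a sequence sub-network and a chromatin sub-network each reduce to one scalar score per site, and that the two scores are summed into a single logit. All function names, weights, and scores here are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def per_site_contributions(seq_score: float, chrom_score: float,
                           w_seq: float = 1.0, w_chrom: float = 1.0) -> dict:
    """Split the logit into its sequence and chromatin terms.

    Assumption: each score is the scalar output of a separate
    (hypothetical) sub-network evaluated at one genomic site.
    """
    return {"sequence": w_seq * seq_score, "chromatin": w_chrom * chrom_score}


def additive_binding_probability(seq_score: float, chrom_score: float,
                                 w_seq: float = 1.0, w_chrom: float = 1.0,
                                 bias: float = 0.0) -> float:
    """Combine the two terms additively and map the sum to a probability."""
    terms = per_site_contributions(seq_score, chrom_score, w_seq, w_chrom)
    return sigmoid(terms["sequence"] + terms["chromatin"] + bias)


# Two made-up sites: one driven mostly by sequence, one mostly by chromatin.
sites = {"site_A": (3.0, -0.5), "site_B": (-0.5, 3.0)}
for name, (seq, chrom) in sites.items():
    terms = per_site_contributions(seq, chrom)
    prob = additive_binding_probability(seq, chrom)
    print(name, terms, round(prob, 3))
```

Because the two terms are simply summed before the logistic function, two sites can receive the same predicted probability while differing in which term dominates. That property is what makes a per-site comparison of sequence and chromatin contributions possible in this kind of design.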