UNDERSTANDING GENE REGULATION AT A FINE-SCALE: ZOOMING INTO 3D CHROMATIN CONFORMATION

Open Access
- Author:
- An, Lin
- Graduate Program:
- Bioinformatics and Genomics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- April 23, 2019
- Committee Members:
- Yu Zhang, Dissertation Advisor/Co-Advisor
Yu Zhang, Committee Chair/Co-Chair
Feng Yue, Committee Member
James Riley Broach, Committee Member
Ross Cameron Hardison, Outside Member
- Keywords:
- Epigenetics
chromatin conformation
TAD
- Abstract:
- Mammalian genomes are spatially organized at multiple levels. Three-dimensional genome organization is essential for gene expression, as it reflects or possibly enables the physical interactions between distal regulatory elements and their target genes. Advances in chromatin conformation capture technologies over the past decade have both expanded and refined our understanding of the multiple levels of chromatin organization. Among current technologies, Hi-C shows the highest potential because of its unbiased genome-wide coverage. It provides exceptional resources, as well as challenges, for understanding chromatin interactions in detail. Observations from Hi-C suggest that chromatin forms frequent local interactions within specific regions, which are called Topologically Associating Domains (TADs). While TADs are often interpreted as the structural units for studying regulatory mechanisms, previous observations show that TADs are hierarchical, with smaller TADs nested within larger ones. Although several TAD calling algorithms have been developed, limited research has been done to reliably identify hierarchical TAD structures and understand their roles in gene regulation. The first tool introduced in this thesis is an optimized nested TAD calling method (OnTAD), which can identify different levels of TADs in a biologically meaningful manner. By incorporating epigenetic and transcriptional information, our analyses bring new insights into the complex system of gene regulation. Investigations of site-to-site interactions (e.g., enhancer-gene pairs) from Hi-C data are often limited by resolution. Because Hi-C measures pairwise interactions, improving resolution requires a quadratic increase in sequencing depth. Inspired by super-resolution imaging techniques, the second tool in this thesis is a deep convolutional neural network (HiCPlus) that imputes high-resolution Hi-C results from low-sequencing-depth Hi-C data. It is able to recover biologically and statistically significant chromatin interactions at improved resolution. We also applied HiCPlus to 20 different tissues/cell types for which only low-resolution experimental data are available, providing the community with a useful resource for investigating chromatin interactions at a fine scale. Beyond improving the resolution of existing Hi-C data, the third tool in this thesis uses one-dimensional epigenetic features to predict three-dimensional chromatin interactions in Hi-C. It can support studies in cell types for which no experimental Hi-C data are available, and it also sheds light on the driving factors of chromatin interaction formation. In the last part of the thesis, I include our effort to generate comprehensive chromatin segmentation maps for cell populations in the mouse hematopoietic lineage. During this process, we observed an unexpected chromatin state enriched with high signal in most of the input features, termed the ‘heterogeneous state’. I present our exploration of the cause of this state and a solution that improves the final segmentation map.