STUDIES OF GENE EXPRESSION EVOLUTION: GENES ON THE INACTIVE X CHROMOSOME AND DUPLICATE GENES

Open Access
Author:
Park, Chungoo
Graduate Program:
Biology
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
July 20, 2010
Committee Members:
  • Kateryna Dmytrivna Makova, Dissertation Advisor
  • Kateryna Dmytrivna Makova, Committee Chair
  • Laura Carrel, Committee Member
  • Francesca Chiaromonte, Committee Member
  • Webb Colby Miller, Committee Member
  • Claude Walker Depamphilis, Committee Member
Keywords:
  • X CHROMOSOME INACTIVATION
  • GENE DUPLICATION
Abstract:
Understanding the determinants of the rate of protein evolution is one of the major goals in molecular evolution. Among the potential variables, expression abundance is one of the most important factors determining protein evolutionary rates, and variation in gene expression appears to contribute to evolutionary divergence and phenotypic diversity among species and individuals. Here we characterize variation in gene expression patterns on the inactive X chromosome and across duplicate genes in mammals. Specifically, this dissertation addresses four questions. First, what genomic signals determine the expression status of genes on the inactive X chromosome? Second, does selection operate differently on genes that escape inactivation vs. genes that are inactivated? Third, do genomic features and motifs predict candidate X-linked mental retardation (XLMR) genes? Fourth, what drives the rapid expression divergence observed between human paralogs? To investigate these questions, we use genome-scale gene expression data and bioinformatic analyses.
We find that (1) the majority of the sequences enriched in the vicinity of inactivated genes lie within L1 repeats (indicating an involvement of L1 repeats in X chromosome inactivation), and these sequences capture most of the genomic signal determining inactivation; we also find unique or overrepresented motifs in boundary regions, which makes them candidates for the boundary elements separating genes with different X inactivation profiles; (2) escape genes experience stronger purifying selection than inactivated genes at both the protein-coding and gene expression levels, an effect that largely results from the importance of the function and dosage of escape genes; (3) sequence motifs overrepresented exclusively in either XLMR or non-XLMR genes effectively capture the genomic signals that distinguish the two groups; and (4) turnover of transcription start sites, structural heterogeneity of coding sequences, and divergence of cis-regulatory regions between duplicate gene copies play a pivotal role in determining the expression divergence of duplicate genes. These results provide valuable insights into the regulation of expression on the inactive X chromosome and into the mechanism of X chromosome inactivation, and will further aid our understanding of long-range control of gene expression on the X chromosome. Moreover, they provide important information for understanding the heterogeneity, complexity, and evolution of the human transcriptome.