Structural and functional characterization of the transcription factor Pdx1 and its interactions with DNA and proteins

Open Access
Bastidas, Monique
Graduate Program:
Doctor of Philosophy
Document Type:
Date of Defense:
May 18, 2015
Committee Members:
  • Scott A Showalter, Dissertation Advisor
  • Philip C. Bevilacqua, Committee Member
  • Christine Dolan Keating, Committee Member
  • David Scott Gilmour, Committee Member
Keywords:
  • protein binding
  • NMR
  • biophysical
  • homeodomain
Abstract:
Homeobox proteins are vital DNA-binding transcription factors that control spatial and temporal gene expression in cells and tissues during early development. These proteins are also required for other biological processes, including gene activation. The DNA-binding homeodomain motif of homeobox proteins binds sequence-specifically to double-stranded DNA containing the core binding site (5´-TAAT-3´), achieving high specificity in vivo. The mechanism by which these transcription factors recognize their cognate sites is not yet well understood. Moreover, the homeodomain motif is often a small domain within a larger polypeptide chain, tethered to intrinsically disordered domains at one or both termini. Intrinsically disordered proteins (IDPs) play a vital role in maintaining cellular functions. The homeobox transcription factor Pdx1 is an IDP that is important for maintaining proper function of the pancreas and β-cells by recognizing A-box elements (TA-rich sequences containing a 5´-TAAT-3´ core site). The overarching goal of this dissertation was to further the understanding of the mechanism by which homeobox proteins recognize their target sites and bind with high specificity, as well as to illuminate the structural and functional role of the disordered C-terminus of Pdx1.
Isothermal titration calorimetry (ITC) showed that the homeodomain (HD) of Pdx1 exhibited varying binding affinities for promoter-derived (islet amyloid polypeptide (IAPP) and insulin) and consensus-derived DNA sequences, indicating nucleotide sequence discrimination in the flanking regions. The calorimetric data demonstrated that Pdx1 preferentially bound sequence-specific insulin elements containing the pentameric DNA sequence 5´-CTAAT-3´ rather than IAPP elements containing a 5´-TTAAT-3´ sequence.
Furthermore, binding studies of the HD with a panel of consensus-derived mutants demonstrated a slight preference for an insulin-like 5´-CTAAT-3´ sequence over an IAPP-like 5´-TTAAT-3´ sequence, based on a combination of a tighter binding constant and a more favorable binding enthalpy. Molecular dynamics simulations further showed that a strictly conserved arginine prefers the pentanucleotide sequence 5´-CTAAT-3´ over 5´-TTAAT-3´ because of a more electronegative binding pocket. To determine the residue(s) involved in sequence specificity upstream from the core site and to understand the role of C-terminal residues in DNA binding, a slightly longer homeodomain construct (HDx) was studied using nuclear magnetic resonance (NMR) and ITC. The construct was much more stable than the HD construct, likely as a result of the increased stability of helix 3 evidenced in the NMR data. By ITC, the binding affinities of HDx for a panel of DNA sequences were in the low nanomolar range and tighter than the HD affinities for the same panel. In an attempt to identify the residues responsible for imparting specificity upstream from the core site, a mutational analysis of HDx was carried out. Neither mutation resulted in decreased specificity; however, one mutation contributed slightly to the binding enthalpy.
Many folded proteins contain regions that are random coil in structure or that form transient secondary structures, which are important for cell signaling, molecular recognition, and transcription. The non-homeodomain regions of Pdx1, which make up a large portion of the protein, are predicted to be disordered based on the amino acid sequence. However, there has been no concrete evidence establishing whether the non-homeodomain regions of Pdx1 are truly unstructured or whether they adopt residual structure. The NMR data for the Pdx1 C-terminus show that this region adopts a random coil structure in solution.
Biologically, the plasticity of intrinsically disordered regions serves as a probe for weak association with multiple binding partners. One such partner for Pdx1 is SPOP, which mediates the interaction between a ubiquitin ligase complex and its substrate. Calorimetric and NMR data demonstrated a direct binding interaction, and NMR identified the SPOP-binding regions of Pdx1.