Sequence-to-function models for efficient optimization of metabolic pathways and genetic circuits

Open Access
Author:
Farasat, Iman
Graduate Program:
Chemical Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
June 01, 2015
Committee Members:
  • Howard M Salis, Dissertation Advisor
  • Howard M Salis, Committee Chair
  • Ali Borhan, Committee Member
  • Timothy Charles Meredith, Committee Member
  • Costas D Maranas, Committee Member
Keywords:
  • metabolic pathway
  • genetic circuit
  • Cas9
  • modeling
  • combinatorial optimization
  • protein expression
  • optimization
Abstract:
The latest advances in metabolic engineering and synthetic biology have yielded engineered organisms that manufacture valuable chemicals, detect variations in the environment, and resist undesirable conditions. In many cases, the addition of rationally designed genetic elements results in low detectable activity that must be optimized for commercial purposes. However, optimizing the genetic code to maximize the performance of these organisms is a time- and labor-intensive process and has remained a challenge for decades, particularly when measurement throughput is limited. Here, we developed a systematic workflow for optimizing multi-component genetic systems. The workflow employs biophysical modeling to build quantitative maps that relate a genetic system’s DNA sequence to the expression rates of its genetic elements and to its final phenotypic activity. We used these maps to optimize three types of genetic systems: metabolic pathways, genetic circuits, and CRISPR/(d)Cas9 systems. For each type of system, we first performed a minimal number of experiments to systematically develop a biophysical sequence-expression-activity map (SEAMAP), which was then used to optimize the genetic system at the DNA sequence level. We showed that designing experiments based on the governing biophysics substantially reduces the time and effort needed to optimize the performance of genetic systems. We also created a protein expression optimization framework, the RBS Library Calculator, to automatically generate an efficient pool of mutants with maximum diversity for building these biophysics-based maps. This framework facilitates altering the expression rate of every protein in a multi-protein genetic system with a minimal number of experiments.
We employed this framework to alter the expression rates of 21 proteins of different classes by up to 100,000-fold: 10 enzymes in three metabolic pathways, T7 RNA polymerase and 4 transcription factors in three analog genetic circuits, and 6 individual proteins in diverse Gram-negative and Gram-positive microbes.