Computational tools for genome-scale synthetic lethality analysis and metabolic modeling of microbial communities

Open Access
Zomorrodi, Ali Reza
Graduate Program:
Chemical Engineering
Doctor of Philosophy
Document Type:
Date of Defense:
May 17, 2012
Committee Members:
  • Costas D Maranas, Dissertation Advisor
  • Costas D Maranas, Committee Chair
  • Ali Borhan, Committee Member
  • Howard M Salis, Committee Member
  • Reka Z Albert, Committee Member
Keywords:
  • Metabolic networks
  • Flux balance analysis
  • Metabolic Engineering
  • Optimization
  • Systems biology
Abstract:
Genome-scale metabolic reconstructions are becoming increasingly available for a wide range of microorganisms. Given the inherent complexity of these reconstructions on the one hand and their growing influence on biological, biotechnological and biomedical research on the other, it is timely to develop new analysis and modeling tools that improve their predictive ability, elucidate and quantify the full range of metabolic capabilities of the underlying microbial system, and provide guidance for metabolic engineering efforts. The central theme of this dissertation is the development and deployment of efficient mathematical modeling approaches, in particular optimization-based algorithms, for the curation, analysis and redesign of metabolic networks of single- and multi-species microbial systems. First, an efficient optimization-based procedure, SL Finder, is introduced for the targeted enumeration of multi-gene (and by extension multi-reaction) synthetic lethals using genome-scale metabolic models. The complete identification of all double and triple gene and reaction synthetic lethals, as well as some quadruple and higher-order ones, using the iAF1260 metabolic model of E. coli uncovered complex patterns of network robustness and gene/reaction utilization and interdependence, thereby providing a bird’s-eye view of the avenues available for redirecting metabolism. Subsequently, a systematic optimization-based protocol is presented for the curation of metabolic models using multi-gene deletion (i.e., synthetic lethal) experiments. Using existing and newly developed curation procedures, 90 distinct modifications to the iMM904 metabolic model of S. cerevisiae, along with several regulatory constraints, were identified and vetted against literature sources. Incorporating the suggested modifications led to substantial improvements in the prediction accuracy of the iMM904 model for essentiality and synthetic lethality data.
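To make the notion of a synthetic lethal concrete, the following is a minimal sketch on a hypothetical six-reaction toy network; the reaction and metabolite names are invented for illustration and do not come from iAF1260 or iMM904. It is not the SL Finder procedure: the abstract describes SL Finder as optimization-based and targeted, whereas this sketch enumerates knockout sets exhaustively. It also replaces the flux balance growth test with a reachability check, which is a valid stand-in only because every toy reaction converts exactly one metabolite into one other.

```python
from itertools import combinations

# Hypothetical toy network (illustrative names only): each reaction converts
# exactly one source metabolite into one product metabolite.
REACTIONS = {
    "R1": ("substrate", "A"),
    "R2": ("substrate", "B"),
    "R3": ("A", "C"),
    "R4": ("B", "C"),
    "R5": ("C", "biomass"),
    "R6": ("A", "biomass"),
}


def can_grow(knocked_out):
    """Return True if 'biomass' is still reachable from 'substrate'.

    Assumption: for single-substrate, single-product reactions, reachability
    is used here in place of a flux balance growth/no-growth test.
    """
    reached = {"substrate"}
    frontier = ["substrate"]
    while frontier:
        metabolite = frontier.pop()
        for name, (source, product) in REACTIONS.items():
            if name in knocked_out or source != metabolite:
                continue
            if product not in reached:
                reached.add(product)
                frontier.append(product)
    return "biomass" in reached


def synthetic_lethals(max_order):
    """Enumerate minimal lethal knockout sets up to max_order by brute force."""
    found = []
    for order in range(1, max_order + 1):
        for combo in combinations(sorted(REACTIONS), order):
            knockouts = frozenset(combo)
            # A set containing an already-found lethal set is not minimal.
            if any(previous <= knockouts for previous in found):
                continue
            if not can_grow(knockouts):
                found.append(knockouts)
    return found


lethals = synthetic_lethals(3)
for knockouts in lethals:
    print(len(knockouts), sorted(knockouts))
```

Exhaustive enumeration of this kind scales combinatorially with the knockout order, which is presumably why a targeted, optimization-based search is preferable for genome-scale models.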
Next, a comprehensive flux balance analysis (FBA) framework, called OptCom, is introduced for the metabolic modeling and analysis of microbial communities. In contrast to earlier FBA approaches, which are based on optimization problems with a single objective function, OptCom relies on a multi-level and multi-objective optimization formulation to properly describe trade-offs between individual and community-level fitness criteria. The applicability of OptCom is demonstrated by modeling three microbial communities of varying complexity to uncover the inter-species interactions, identify the optimality levels of growth for the species involved, and examine the possibility of adding a new member to an existing microbial community. In the next part, the OptForce procedure is employed as a computational strain design tool to identify the minimal set of metabolic interventions leading to overproduction of L-serine in E. coli. The suggested interventions include not only straightforward upregulation of the terminal pathway but also non-intuitive manipulations distant from the target product. Finally, the focus shifts from steady-state flux balance analysis to kinetic and dynamic modeling of metabolic networks using the ensemble modeling (EM) approach. Here, an optimization-based algorithm is proposed to proactively identify gene/enzyme perturbations that maximally reduce the number of models retained in the ensemble after each round of model screening. The applicability of this procedure is demonstrated using a model of the central metabolism of E. coli by successively identifying single, double and triple enzyme perturbations (i.e., knockouts, overexpressions or combinations thereof) that cause maximally divergent flux predictions among the models in the ensemble. Overall, this dissertation presents a wide array of mathematical tools and highlights their utility for model-driven analysis and redesign of metabolic networks.
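The multi-level structure described for OptCom can be illustrated with a generic bilevel program. This is a simplified sketch under stated assumptions, not the formulation given in the dissertation: the summed-biomass community objective, the allocation variables r, the availability limits c, and the sign convention (negative exchange flux means uptake) are all illustrative choices, and cross-feeding between species is omitted.

```latex
% Illustrative bilevel sketch (assumed notation, not taken from the dissertation).
% K species; S^k is the stoichiometric matrix of species k; \ell^k, h^k are flux bounds;
% I is the set of shared metabolites; r^k_i is the uptake of metabolite i allotted to species k.
\begin{align*}
\max_{r,\,v^{1},\dots,v^{K}} \quad
  & \sum_{k=1}^{K} v^{k}_{\text{biomass}}
  && \text{(community-level objective, assumed)} \\
\text{s.t.} \quad
  & v^{k} \in \arg\max_{u}
    \left\{\, u_{\text{biomass}} \;:\;
      S^{k} u = 0,\;
      \ell^{k} \le u \le h^{k},\;
      u_{\text{ex},i} \ge -r^{k}_{i} \;\; \forall i \in I
    \,\right\},
  && k = 1,\dots,K, \\
  & \sum_{k=1}^{K} r^{k}_{i} \le c_{i}, \qquad r^{k}_{i} \ge 0,
  && i \in I.
\end{align*}
```

Each inner problem is an ordinary single-species FBA problem that maximizes that species' own biomass flux, while the outer problem decides how the shared resources are divided. That nesting is what allows individual and community-level fitness criteria to be traded off rather than merged into a single objective.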