Elucidation and Synthetic Design of Biochemical Pathways using novoStoic

Open Access
- Author:
- Kumar, Akhil
- Graduate Program:
- Integrative Biosciences
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- February 17, 2017
- Committee Members:
- Costas D. Maranas, Dissertation Advisor/Co-Advisor
Costas D. Maranas, Committee Chair/Co-Chair
Ross C. Hardison, Committee Member
Andrew Patterson, Committee Member
Reka Albert, Outside Member
- Keywords:
- retrosynthesis
biochemical pathways
metabolic pathways
biochemical
metabolic
database
reaction rules
- Abstract:
- Next-generation pathway design algorithms and tools make it faster and easier to design novel, sophisticated biosynthetic routes. The goal of this dissertation is the development of computational designs for the biosynthesis of xenobiotics, and we discuss and demonstrate solutions to two key challenges. The first challenge is the pace at which metabolic knowledge can be extracted, i.e., whether data can be used directly in the form in which it was published. Data from genome-scale metabolic models (GSMs) and from semi-curated databases such as BRENDA, KEGG, MetaCyc, EcoCyc, and BioCyc are difficult to use directly. The difficulties arise from incompatible representations, duplications, and errors, e.g., a single metabolite annotated with multiple names across different data sources. In many cases, the same metabolite is also annotated with multiple structures. This ambiguity severely slows the pooling of information across data sources. As a consequence, duplicated reaction information can mask gene deletions that would otherwise be (synthetically) lethal. Such ambiguity degrades the quality of predictions about the overall metabolic potential of an organism. In addition, non-standard metabolite names and IDs prevent the direct comparisons needed to identify reactions that appear in multiple data sources. The result is fragmented, disconnected datasets that provide smaller reaction domains for pathway traversal algorithms. The second challenge is the capacity of algorithms and computer-aided design (CAD) tools to conceive novel biosynthetic pathway designs while reconciling various engineering challenges.
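The naming problem described for the first challenge can be illustrated with a minimal sketch. It assumes each source record carries a structure-derived key in addition to its name. The source labels, names, and keys below are hypothetical placeholders, not entries from any of the databases named above, and the grouping shown is not the curation procedure used in the dissertation.

```python
from collections import defaultdict

# Hypothetical records: (source, source-specific name, structure key).
# The structure keys are illustrative placeholders, not real identifiers.
records = [
    ("SourceA", "D-glucose", "STRUCT-0001"),
    ("SourceB", "glc-D", "STRUCT-0001"),
    ("SourceC", "Glucose", "STRUCT-0001"),
    ("SourceA", "pyruvate", "STRUCT-0002"),
    ("SourceB", "Pyruvic acid", "STRUCT-0002"),
]


def pool_by_name(records):
    """Group records by lower-cased name (the naive comparison)."""
    groups = defaultdict(set)
    for source, name, _key in records:
        groups[name.strip().lower()].add(source)
    return dict(groups)


def pool_by_structure(records):
    """Group records by structure key, ignoring source-specific names."""
    groups = defaultdict(set)
    for source, name, key in records:
        groups[key].add((source, name))
    return dict(groups)


by_name = pool_by_name(records)
by_structure = pool_by_structure(records)
print(len(by_name), "groups when matching on names")
print(len(by_structure), "groups when matching on structure keys")
```

Matching on names leaves every record in its own group, so no overlap between sources is detected, whereas matching on a shared structure key pools the records into two metabolites. The sketch presumes one agreed structure per metabolite, which is exactly what the abstract notes is often missing.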
To systematize the engineering calculations needed for designing the biosynthesis of high-value chemicals, existing CAD tools explore the complex biochemical reaction space and enumerate metabolic engineering strategies for the heterologous production of target chemicals from substrates with native or engineered enzymes. Existing CAD tools are, however, limited and approximate in their design elements, i.e., they do not consider all the metabolic engineering paradigms in an integrated fashion. Design elements such as reaction rules, network size, non-linear pathway topology, mass conservation, cofactor balance, thermodynamic feasibility, chassis selection, toxicity, yield, and cost had not been unified into a single scheme in CAD tools before this work. In the first chapter, we present a novel reaction-rule-based pathway design (CAD) tool and demonstrate it with biosynthetic designs for three pharmaceuticals: phenylephrine, epinine, and naproxen. The second chapter presents a novel atom mapping algorithm that relies heavily on prime factorization. In the third chapter, we describe the algorithms we developed for curating biochemical data, i.e., the development of MetRxn. Finally, in the fourth chapter, we present an example of the MetRxn data being leveraged within a metabolic model.
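Of the design elements listed above, mass conservation is the simplest to illustrate. The following is a minimal sketch of an elemental balance check for a single reaction, assuming neutral molecular formulas without parentheses or charges. The function names are hypothetical, and this is not the formulation used by the tool described in the dissertation.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6'.

    Assumption: no parentheses, charges, or isotope labels.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def element_imbalance(substrates, products):
    """Return {element: product atoms minus substrate atoms} for unbalanced elements.

    Each side is a list of (formula, stoichiometric coefficient) pairs.
    An empty dict means the reaction conserves every element.
    """
    net = Counter()
    for formula, coeff in substrates:
        for element, n in parse_formula(formula).items():
            net[element] -= coeff * n
    for formula, coeff in products:
        for element, n in parse_formula(formula).items():
            net[element] += coeff * n
    return {element: n for element, n in net.items() if n != 0}


# glucose + ATP -> glucose 6-phosphate + ADP (neutral formulas)
substrates = [("C6H12O6", 1), ("C10H16N5O13P3", 1)]
products = [("C6H13O9P", 1), ("C10H15N5O10P2", 1)]
balanced = element_imbalance(substrates, products)

# Same reaction with ADP omitted, as might happen in an incomplete record
unbalanced = element_imbalance(substrates, [("C6H13O9P", 1)])

print("balanced reaction imbalance:", balanced)
print("incomplete reaction imbalance:", unbalanced)
```

A pathway-level check would apply the same test to the net stoichiometry of all reactions in a candidate route. Cofactor balance and thermodynamic feasibility need additional data and are outside this sketch.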