
Webinar Only
AI in Chemical Biology: New Frontiers
Wednesday, March 17, 2021, 11:00 AM - 5:00 PM EDT
Artificial Intelligence (AI) has the potential to enhance our understanding of chemical biology – with broad applications that include predicting reaction chemistry, automating lab processes, detangling complex biochemical processes, and hastening drug discovery and development. With rapid developments over the last decade in machine learning, natural language processing, and other aspects of AI, we are on the threshold of a new era in this field. However, appropriately integrating these methodologies into chemical biology research can be daunting. To address these challenges, this one-day symposium will showcase recent advances in chemical biology that were enabled by AI and highlight best practices for employing AI techniques in this field.
Wednesday
March 17, 2021
Introduction and Welcome Remarks
Speaker
Keynote: A Deep Learning Approach to Antibiotic Discovery
Speaker
Session 1
Automating the Efficient Synthesis of Analogs from AI Designs
Speaker
Break
Machine Learning and Automation for Molecular and Materials Design
Speaker
Protein Folding with MELDxMD Guided by Machine Learning Predicted Contacts
Speaker
Molecular dynamics (MD) simulations can model biomolecular structures with atomic-scale resolution. A major bottleneck, however, is the vast configurational space that a protein can sample when folding. To alleviate this burden, we use our MD accelerator, MELD, which leverages external information to narrow the conformational search. MELD-accelerated MD (MELDxMD) complements bioinformatics, AI, and experimental approaches by integrating data from all of them into a technique that delivers high-resolution structures with free energy scoring. Here, we employ MELDxMD with contacts between amino acids (aa) inferred by machine learning to generate accurate protein structures from their aa sequences. Specifically, our protocol takes the contacts predicted by trRosetta, a state-of-the-art public machine learning server, and applies them as distance restraints to quickly guide the MELDxMD simulations to stable states. Importantly, the simulations successfully identify many of these native states as the lowest free energy conformations on the energy landscape, an attribute only accessible to physics-based methods. Furthermore, the simulations allow modeling of proteins that exceed the roughly 100 aa upper size limit of MD folding, as evidenced by accurate conformations of proteins at the maximum single-domain range (150-200 aa). Overall, the work highlights the power of integrating AI with chemical biology techniques to tackle problems previously out of reach for conventional methods.
Comparing Recurrent Neural Networks, Generative Adversarial Networks and Variational Autoencoders for Drug Design
Speaker
Interest in de novo molecule generation for lead compound design in drug discovery has led to the rapid development of new machine learning models. This new generation of models, including recurrent neural networks (RNNs), generative adversarial networks (GANs), and variational autoencoders (VAEs), is capable of learning to generate new molecular structures, not previously seen in the training set, with desired activities or specified drug-like properties. While preliminary results in these areas have been promising, external evaluation of generative models has yet to be performed, leaving their efficacy in lead drug design unknown. We have explored the space of generative models to increase knowledge of their potential uses for both large molecules (peptides) and small molecules. We use external examples, including retroactive virtual screening and real-world parallel drug design alongside expert chemists, to compare and contrast the strengths and limitations of three different generative model architectures. We show that different RNN architectures are flexible and modular and can, at low computational cost, generate hundreds of thousands of compounds that include the same molecules suggested by medicinal chemists, given little information other than a similarity restriction to the lead compound and the desired drug-like properties. Our results suggest that RNN generative models should be tightly integrated into drug design and discovery pipelines, as well as into those of other industries such as animal health.
Networking Break and Virtual Poster Session
Session 2
Visualizing Molecular Machines in Motion using Cryo Electron Microscopy and Deep Learning
Speaker
Macromolecular machines such as the ribosome undergo massive structural changes as they assemble. While we have long appreciated such structural changes exist, experimentally visualizing and analyzing large ensembles of these structures is challenging. Here, I briefly describe cryoDRGN, our newly developed software package to analyze structural heterogeneity in protein complexes visualized by cryo-electron microscopy (cryo-EM). This approach, which uses a purpose-built neural network based on a variational autoencoder, maps individual particle images to a low-dimensional latent space. Direct inspection of this latent space provides insights into the degree of structural heterogeneity in the dataset, provides estimates of particle abundance in each state, and relates the observed states to one another. Users can then visualize structural ensembles of their protein complex by generating three-dimensional structures throughout this latent space using the trained networks. I describe our application of this framework to understand how a bacterial methyltransferase guides assembly of the small ribosomal subunit. In analyzing a series of related cryo-EM datasets, we surprisingly uncovered that this methyltransferase ‘proof-reads’ the assembling ribosomes by preferentially disassembling ribosomes that have been erroneously constructed. This unexpected insight into the bacterial ribosome biogenesis process was enabled by cryoDRGN, and it highlights the utility of our approach.
Antibiotic Discovery by Means of Computers
Speaker
Machines have the potential to outperform humans and revolutionize our world. In this talk, I will describe our efforts to use machines to develop computational approaches for antibiotic discovery, as well as low-cost rapid diagnostics. Computers can already be programmed for superhuman pattern recognition of images and text. In order for machines to discover novel antibiotics, they have to first be trained to sort through the many characteristics of molecules and determine which properties should be retained, suppressed, or enhanced to optimize antimicrobial activity. Said differently, machines need to be able to understand, read, write, and eventually create new molecules. I will discuss how we trained a computer to execute a fitness function following a Darwinian algorithm of evolution to select for molecular structures that interact with bacterial membranes, yielding the first artificial antimicrobials that kill bacteria both in vitro and in relevant animal models. My lab has also developed pattern recognition algorithms to mine the human proteome, identifying throughout the body thousands of antibiotics encoded in proteins with unrelated biological function, and has applied computational tools to successfully reprogram venoms into novel antimicrobials. I will also describe the development of diagnostic biosensors for COVID-19, further substantiating the exciting potential of machine biology. Computer-generated designs and innovations at the intersection between machines and biology may help to replenish our arsenal of effective drugs and generate novel diagnostics, providing much needed solutions to global health problems caused by infectious diseases.
Break
Synthesis in the Chemical Space Age
Speaker
The invention of a medicine typically requires multiple rounds of design and synthesis to achieve a desired balance of properties. The structure of a molecule is intimately linked to its properties, and in turn, the reactions used to make the molecule determine its structure. We have been exploring the relationship between chemical reactions and physicochemical properties. Our aim is to develop an understanding of the impact that a synthetic transformation has on a molecule’s properties, and ultimately, to invent new reactions that offer complementary property profiles to the workhorse reactions in the synthetic chemist’s toolbox. To realize our vision, we have assembled a team of synthetic chemists and computer scientists, and together we develop robotics and algorithms for synthesis. We will detail some of the software and hardware we have developed to execute many synthetic experiments simultaneously. Both classic high-throughput experimentation (HTE) in small glass shell vials and ultraHTE in 1,536-well microtiter plates are routinely applied in our research. New reactions under investigation in our laboratories will be shared, along with the application of new and existing reactions within a chemoinformatic system we have developed to tie synthetic transformations to molecular properties.
AI in Chemical Biology: A Perspective from Industry
Speaker
Artificial Intelligence (AI), in its simplest definition, leverages computers to do what humans do naturally and enables digital transformation that is scalable. Given the plethora of data generated today, AI applied to complex biological processes can further advance drug discovery regardless of modality. The presentation will focus on the industrial setting, where we specifically use human genetic validation as a driver of our drug discovery. It will review how AI can enable problem solving by systematically converting data into predictions and insights through the integration of data, algorithmic design, predictions, and improvements for chemical biology applications.