AI in Chemical Biology: New Frontiers

Webinar Only

Wednesday, March 17, 2021, 11:00 AM - 5:00 PM EDT

Presented By

Chemical Biology Discussion Group

The New York Academy of Sciences

Artificial Intelligence (AI) has the potential to enhance our understanding of chemical biology, with broad applications that include predicting reaction chemistry, automating lab processes, untangling complex biochemical processes, and hastening drug discovery and development. With rapid developments over the last decade in machine learning, natural language processing, and other aspects of AI, we are on the cusp of a new era in this field. However, appropriately integrating these methodologies into chemical biology research can be daunting. To address these challenges, this one-day symposium will showcase recent advances in chemical biology that were enabled by AI and highlight best practices for employing AI techniques in this field.

Registration

Member: $30
Nonmember Academia, Faculty, etc.: $65
Nonmember Corporate, Other: $85
Nonmember Not for Profit: $65
Nonmember Student, Undergrad, Grad, Fellow: $45
Member Student, Post-Doc, Fellow: $15

Keynote Speaker

James Collins, PhD

Massachusetts Institute of Technology

Speakers

Alán Aspuru-Guzik, PhD

University of Toronto

Georgia McGaughey, PhD

Vertex Pharmaceuticals

Timothy Cernak, PhD

University of Michigan

Joseph (Joey) Davis, PhD

Massachusetts Institute of Technology

César de la Fuente, PhD

University of Pennsylvania

Peter Madrid, PhD

SRI International

Scientific Organizing Committee

Nozomi Ando, PhD

Cornell University

César de la Fuente, PhD

University of Pennsylvania

Sara Donnelly, PhD

The New York Academy of Sciences

Sonya Dougal, PhD

The New York Academy of Sciences

Barbara Knappmeyer, PhD

The New York Academy of Sciences

Wednesday

March 17, 2021

11:00 AM

Introduction and Welcome Remarks

Speaker

Barbara Knappmeyer, PhD
New York Academy of Sciences
11:10 AM

Keynote: A Deep Learning Approach to Antibiotic Discovery

Speaker

James Collins, PhD
Massachusetts Institute of Technology

Session 1

Session Chairperson
César de la Fuente, PhD, University of Pennsylvania
11:50 AM

Automating the Efficient Synthesis of Analogs from AI Designs

Speaker

Peter Madrid, PhD
SRI International
12:20 PM

Break

12:30 PM

Machine Learning and Automation for Molecular and Materials Design

Speaker

Alán Aspuru-Guzik, PhD
University of Toronto
1:00 PM

Protein Folding with MELDxMD Guided by Machine Learning Predicted Contacts

Speaker

Roy Nassar, MSc
Stony Brook University

Molecular dynamics (MD) simulations can model biomolecular structures at atomic-scale resolution. A major bottleneck, however, is the vast configurational space that a protein can sample when folding. To alleviate this burden, we use our MD accelerator, MELD, which leverages external information to reduce the conformational search problem. MELD-accelerated MD (MELDxMD) complements bioinformatics, AI, and experimental approaches by integrating data from all of them into a technique that delivers high-resolution structures with free energy scoring. Here, we employ MELDxMD with contacts between amino acids (aa) inferred by machine learning to generate accurate protein structures from their aa sequence. Specifically, our protocol inputs the contacts from trRosetta, a state-of-the-art public machine learning server, as distance restraints to quickly guide the MELDxMD simulations to stable states. Importantly, the simulations successfully identify many of these native states as the lowest free energy conformations on the energy landscape, an attribute only accessible to physics-based methods. Furthermore, the simulations allow modeling of proteins that surpass the upper 100 aa size limit of MD folding, as evidenced by accurate conformations of proteins at the maximum single-domain range (150-200 aa). Overall, the work highlights the power of integrating AI with chemical biology techniques to tackle problems previously intractable for conventional methods.

1:15 PM

Comparing Recurrent Neural Networks, Generative Adversarial Networks and Variational Autoencoders for Drug Design

Speaker

Fabio Urbina, PhD
Collaborations Pharmaceuticals, Inc.

Interest in de novo molecule generation for lead compound design in drug discovery has led to the rapid development of new machine learning models. This new generation of models, including recurrent neural networks (RNNs), generative adversarial networks (GANs), and variational autoencoders (VAEs), is capable of learning to generate new molecular structures, not previously seen in the training set, with desired activities or specified drug-like properties. While preliminary results in these areas have been promising, external evaluation of generative models has yet to be performed, leaving their efficacy in lead drug design unknown. We have explored the space of generative models to increase knowledge of their potential uses for both large molecules (peptides) and small molecules. We use external examples, including retroactive virtual screening and real-world parallel drug design alongside expert chemists, to compare and contrast the strengths and limitations of three different generative model architectures. We show that different RNN architectures are flexible and modular and can computationally cheaply generate hundreds of thousands of compounds, including the same molecules suggested by medicinal chemists, with little information other than restricting similarity to the lead compound and desired drug-like properties. Our results suggest that RNN generative models should be tightly integrated into drug design and discovery pipelines, as well as into those of other industries (e.g., animal health).

1:30 PM

Networking Break and Virtual Poster Session

Session 2

Session Chairperson
Nozomi Ando, PhD, Cornell University
3:00 PM

Visualizing Molecular Machines in Motion using Cryo Electron Microscopy and Deep Learning

Speaker

Joseph (Joey) Davis, PhD
Massachusetts Institute of Technology

Macromolecular machines such as the ribosome undergo massive structural changes as they assemble. While we have long appreciated that such structural changes exist, experimentally visualizing and analyzing large ensembles of these structures is challenging. Here, I briefly describe cryoDRGN, our newly developed software package to analyze structural heterogeneity in protein complexes visualized by cryo-electron microscopy (cryo-EM). This approach, which uses a purpose-built neural network based on a variational autoencoder, maps individual particle images to a low-dimensional latent space. Direct inspection of this latent space provides insights into the degree of structural heterogeneity in the dataset, provides estimates of particle abundance in each state, and relates the observed states to one another. Users can then visualize structural ensembles of their protein complex by generating three-dimensional structures throughout this latent space using the trained networks. I describe our application of this framework to understand how a bacterial methyltransferase guides assembly of the small ribosomal subunit. In analyzing a series of related cryo-EM datasets, we surprisingly uncovered that this methyltransferase ‘proof-reads’ the assembling ribosomes by preferentially disassembling ribosomes that have been erroneously constructed. This unexpected insight into the bacterial ribosome biogenesis process was enabled by cryoDRGN, and it highlights the utility of our approach.

3:30 PM

Antibiotic Discovery by Means of Computers

Speaker

César de la Fuente, PhD
University of Pennsylvania

Machines have the potential to outperform humans and revolutionize our world. In this talk, I will describe our efforts to use machines to develop computational approaches for antibiotic discovery, as well as low-cost rapid diagnostics. Computers can already be programmed for superhuman pattern recognition of images and text. In order for machines to discover novel antibiotics, they have to first be trained to sort through the many characteristics of molecules and determine which properties should be retained, suppressed, or enhanced to optimize antimicrobial activity. Said differently, machines need to be able to understand, read, write, and eventually create new molecules. I will discuss how we trained a computer to execute a fitness function following a Darwinian algorithm of evolution to select for molecular structures that interact with bacterial membranes, yielding the first artificial antimicrobials that kill bacteria both in vitro and in relevant animal models. My lab has also developed pattern recognition algorithms to mine the human proteome, identifying throughout the body thousands of antibiotics encoded in proteins with unrelated biological function, and has applied computational tools to successfully reprogram venoms into novel antimicrobials. I will also describe the development of diagnostic biosensors for COVID-19, further substantiating the exciting potential of machine biology. Computer-generated designs and innovations at the intersection between machines and biology may help to replenish our arsenal of effective drugs and generate novel diagnostics, providing much needed solutions to global health problems caused by infectious diseases.

4:00 PM

Break

4:10 PM

Synthesis in the Chemical Space Age

Speaker

Timothy Cernak, PhD
University of Michigan

The invention of a medicine typically requires multiple rounds of design and synthesis to achieve a desired balance of properties. The structure of a molecule is intimately linked to its properties, and in turn, the reactions used to make the molecule determine its structure. We have been exploring the relationship between chemical reactions and physicochemical properties. Our aim is to develop an understanding of the impact that a synthetic transformation has on a molecule’s properties, and ultimately, to invent new reactions that offer complementary property profiles to the workhorse reactions from the synthetic chemist’s toolbox. To realize our vision, a team of synthetic chemists and computer scientists has been assembled, and together we develop robotics and algorithms for synthesis. We will detail some of the software and hardware we have developed to execute many synthetic experiments simultaneously. Both classic high-throughput experimentation (HTE) in small glass shell vials and ultraHTE in 1,536-well microtiter plates are routinely applied in our research. New reactions under investigation in our laboratories will be shared, along with the application of new and existing reactions within a chemoinformatic system we have developed to tie synthetic transformations to molecular properties.

4:40 PM

AI in Chemical Biology: A Perspective from Industry

Speaker

Georgia McGaughey, PhD
Vertex Pharmaceuticals

Artificial Intelligence (AI), in its simplest definition, leverages computers to do what humans do naturally and enables digital transformation that is scalable. Given the plethora of data generated today, AI applied to complex biological processes can further advance drug discovery regardless of modality. The focus of the presentation will be the industrial setting, where we specifically use human genetic validation as a driver of our drug discovery. This presentation will review how AI can enable problem solving by systematically converting data into predictions and insights through the integration of data, algorithmic design, predictions, and improvements for chemical biology applications.

5:10 PM

Closing Remarks

5:15 PM

Adjourn