Computational Workflow for Accelerated Molecular Design Using Quantum Chemical Simulations and Deep Learning Models

Publication Type
Conference Presentation
Publication Year
2023
Authors
Blanchard, Andrew
Zhang, Pei
Bhowmik, Debsindhu
Mehta, Kshitij
Gounley, John
Reeve, Samuel Temple
Irle, Stephan
Pasini, Massimiliano Lupo
Abstract

The community needs efficient methods for searching the chemical space of molecular compounds to automate and accelerate the design of new functional molecules such as pharmaceuticals. Given the high cost of experimental efforts in both resources and time, computational approaches play a key role in guiding the selection of promising molecules for further investigation. Here, the authors construct a workflow to accelerate design by combining an approximate quantum chemical method, density-functional tight-binding (DFTB), a graph convolutional neural network (GCNN) surrogate model for chemical property prediction, and a masked language model (MLM) for molecule generation. Property data from the DFTB calculations are used to train the surrogate model, and the surrogate model is then used to score candidates generated by the MLM. The surrogate reduces computation time by orders of magnitude compared to the DFTB calculations, enabling a broader search of chemical space. Furthermore, the MLM generates a diverse set of chemical modifications based on pre-training from a large compound library. The authors use the workflow to search for near-infrared photoactive molecules by minimizing the predicted HOMO-LUMO gap as the target property. The results show that the workflow can generate optimized molecules outside of the original training set, which suggests that iterations of the workflow could be useful for searching vast chemical spaces in a wide range of design problems.
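
The loop described in the abstract (label with an expensive method, train a cheap surrogate, generate candidates, rank them with the surrogate, then verify the best few) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`toy_gap`, `train_surrogate`, `propose`, `run_workflow`) is hypothetical, molecules are plain strings over a toy alphabet, `toy_gap` is an arbitrary formula standing in for a DFTB calculation, a nearest-neighbour lookup stands in for the GCNN, and a random single-token substitution stands in for the MLM.

```python
import random

ALPHABET = "CNOS"  # toy token vocabulary (assumption, not real chemistry)

def toy_gap(mol: str) -> float:
    """Stand-in for an expensive DFTB HOMO-LUMO gap calculation (arbitrary toy formula)."""
    return 6.0 - 0.4 * mol.count("N") - 0.2 * mol.count("S") + 0.1 * mol.count("O")

def featurize(mol: str) -> tuple:
    """Token counts; a real surrogate would operate on the molecular graph."""
    return tuple(mol.count(t) for t in ALPHABET)

def train_surrogate(labeled: dict):
    """Stand-in for the GCNN: a 1-nearest-neighbour predictor on token counts."""
    table = [(featurize(m), gap) for m, gap in labeled.items()]

    def predict(mol: str) -> float:
        f = featurize(mol)
        # Return the label of the closest training point (squared distance).
        nearest = min(table, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], f)))
        return nearest[1]

    return predict

def propose(mol: str, rng: random.Random) -> str:
    """Stand-in for the MLM: mask one position and fill it with a sampled token."""
    i = rng.randrange(len(mol))
    return mol[:i] + rng.choice(ALPHABET) + mol[i + 1:]

def run_workflow(seeds, iterations=5, n_candidates=50, top_k=5, seed=0):
    rng = random.Random(seed)
    labeled = {m: toy_gap(m) for m in seeds}  # initial "expensive" labels
    for _ in range(iterations):
        predict = train_surrogate(labeled)  # retrain on all labels so far
        parents = sorted(labeled, key=labeled.get)[:top_k]  # lowest known gaps
        proposals = [propose(rng.choice(parents), rng) for _ in range(n_candidates)]
        # Deduplicate (order-preserving) and skip molecules that are already labeled.
        candidates = [m for m in dict.fromkeys(proposals) if m not in labeled]
        best = sorted(candidates, key=predict)[:top_k]  # cheap surrogate ranking
        labeled.update({m: toy_gap(m) for m in best})  # expensive check on the top few
    return labeled

if __name__ == "__main__":
    result = run_workflow(["CCCCCC", "CCOCCC", "CCCOCC"])
    best_mol = min(result, key=result.get)
    print(best_mol, result[best_mol])
```

The point of the design is that the expensive function is called only on the few candidates the surrogate ranks highest in each iteration, while the surrogate screens the full candidate pool.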

Citation
Date
Volume
1690
Publication Title
Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation
DOI
https://doi.org/10.1007/978-3-031-23606-8_1