Multi-fidelity active learning
Active learning (also known as sequential learning, Bayesian optimization, or optimal experimental design) presents a powerful means to efficiently navigate large, high-dimensional design spaces. Multi-fidelity active learning strategies offer a principled means to fuse data collected at multiple resolutions within a unified learning model. We have recently developed and employed multi-fidelity and multi-objective active learning to fuse high-throughput/low-accuracy computation with low-throughput/high-accuracy experiments and achieve better performance than either screen alone. In one application to π-conjugated oligopeptides capable of self-assembling into supramolecular aggregates with good in-register stacking of the π-cores, the computational screening loop elevated the accuracy of our predictive models by 27% over those trained on the experimental data alone and drove the discovery of 26 novel molecular candidates with assembly performance as good as or better than that of the best-performing previously known molecules. We have also developed software implementations of multi-fidelity Gaussian process regression models that let users specify arbitrarily complex information flows between models (https://github.com/Ferg-Lab/mfGPR).
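To make the idea of fusing fidelities concrete, the following is a minimal, illustrative sketch of a two-fidelity autoregressive Gaussian process, in which the high-fidelity response is modeled as a scaled low-fidelity surrogate plus a learned discrepancy. It is not the mfGPR API and not necessarily the model used in the publications below. The toy functions `lo` and `hi`, the kernel length scale, the jitter, and the least-squares point estimate of the scale factor `rho` are all assumptions chosen for illustration, and predictive uncertainty and acquisition functions are omitted.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel for scalar inputs (length scale is an assumed value)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward, substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp_mean(xs, ys, jitter=1e-6):
    """Return the posterior-mean function of a zero-mean GP fit to (xs, ys)."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = chol_solve(cholesky(K), ys)
    return lambda x: sum(w * rbf(x, xi) for w, xi in zip(alpha, xs))

# Hypothetical toy fidelities: a cheap, biased screen and a costly, accurate one.
def lo(x):
    return math.sin(2.0 * math.pi * x)

def hi(x):
    return 2.0 * lo(x) + 0.5

x_lo = [i / 20 for i in range(21)]   # many cheap low-fidelity samples
x_hi = [0.1, 0.4, 0.6, 0.9]          # few expensive high-fidelity samples
y_hi = [hi(x) for x in x_hi]

# Step 1: surrogate for the low-fidelity data.
mu_lo = fit_gp_mean(x_lo, [lo(x) for x in x_lo])

# Step 2: least-squares point estimate of the scale factor between fidelities.
lo_at_hi = [mu_lo(x) for x in x_hi]
rho = sum(h * l for h, l in zip(y_hi, lo_at_hi)) / sum(l * l for l in lo_at_hi)

# Step 3: second GP on the discrepancy left after scaling the low-fidelity surrogate.
mu_delta = fit_gp_mean(x_hi, [h - rho * l for h, l in zip(y_hi, lo_at_hi)])

def predict_hi(x):
    """Fused high-fidelity prediction: scaled low-fidelity mean plus discrepancy."""
    return rho * mu_lo(x) + mu_delta(x)

print("estimated rho:", rho)
print("prediction at x=0.25:", predict_hi(0.25), "true value:", hi(0.25))
```

In this sketch the dense low-fidelity data fix the shape of the response, so the four high-fidelity points only need to determine the scale factor and a smooth correction. A full treatment would also propagate predictive variances through both levels so that an acquisition function can decide which fidelity to query next.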
We are pursuing the following projects in this theme:
- Integration of asynchronous and simultaneous computational and experimental screening loops within active learning design-build-test cycles
- Deployment of multiple levels of theory (e.g., coarse-grained MD, all-atom MD, DFT) and experiment (e.g., high-throughput optical measurements, low-throughput mechanical characterization) within a single, unified learning campaign
- Integration of multi-fidelity active learning with automated robotics within self-driving labs
Representative Publications
94. K. Shmilovich, S.S. Panda, A. Stouffer, J.D. Tovar, and A.L. Ferguson* “Hybrid computational-experimental data-driven design of self-assembling π-conjugated peptides” Digital Discovery 1 448-462 (2022) [ https://dx.doi.org/10.1039/d1dd00047k ]
92. K. Shmilovich, Y. Yao, J.D. Tovar, H.E. Katz, A. Schleife, and A.L. Ferguson* “Computational discovery of high charge mobility self-assembling π-conjugated peptides” Mol. Syst. Des. Eng. 7 447-459 (2022) [ http://dx.doi.org/10.1039/D2ME00017B ]
→ Selected by editors as MSDE HOT article
91. B. Mohr, K. Shmilovich, I.S. Kleinwächter, D. Schneider, A.L. Ferguson*, and T. Bereau “Data-driven discovery of cardiolipin-selective small molecules by computational active learning” Chem. Sci. 13 4498-4511 (2022) [ http://dx.doi.org/10.1039/D2SC00116K ]
→ Selected for 2022 ChemSci “Pick of the Week” collection
→ Featured in commentary M. Aldeghi and C.W. Coley “A focus on simulation and machine learning as complementary tools for chemical space navigation” Chem. Sci. (2022) [ https://doi.org/10.1039/d2sc90130g ]
68. K. Shmilovich, R.A. Mansbach, H. Sidky, O.E. Dunne, S.S. Panda, J.D. Tovar, and A.L. Ferguson* “Discovery of self-assembling π-conjugated peptides by active learning-directed coarse-grained molecular simulation” J. Phys. Chem. B 124 3873-3891 (2020) [ https://doi.org/10.1021/acs.jpcb.0c00708 ]
→ Invited submission to the “Machine Learning in Physical Chemistry” special issue
→ Selected as ACS Editors’ Choice article (March 30, 2020)
→ Selected for front cover art of JPCB vol. 124, issue 19 (May 14, 2020)