Predicting Chemical Reaction Barriers With Deep Reinforcement Learning
Keywords:
Chemical reaction barriers, deep reinforcement learning, reaction prediction, computational chemistry

Abstract
Estimating the energy barriers of chemical reactions is fundamental to understanding reaction mechanisms, kinetics, and designing new catalysts or synthetic pathways. Traditional methods for identifying transition states and calculating reaction barriers, such as the Nudged Elastic Band (NEB) or string methods, are often computationally expensive and can struggle with complex, high-dimensional potential energy surfaces (PES) [10, 18, 33]. This article explores the application of deep reinforcement learning (DRL) as a novel approach to efficiently and accurately predict chemical reaction barriers. By framing the search for transition states as a sequential decision-making problem, a DRL agent can learn optimal pathways on the PES. We detail the conceptual framework for defining the chemical system as an RL environment, specifying states, actions, and reward functions tailored to guide the agent towards saddle points. The discussion highlights the potential of DRL to navigate intricate chemical landscapes, offering a data-driven, autonomous methodology for barrier estimation that could significantly accelerate chemical discovery and materials design.
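As a purely illustrative example of this framing (not an implementation from the article), the sketch below defines a minimal Gymnasium-style [45] environment on the two-dimensional Müller-Brown model potential [38]: the state is the current geometry, the action is a small Cartesian displacement, and the reward penalizes high energy while encouraging progress from the reactant basin toward the product basin. The class name MullerBrownEnv, the step size, and the reward weights are assumptions chosen for clarity, not values taken from the source.

```python
# Illustrative sketch only: a toy PES-navigation environment in the Gymnasium API.
# The potential parameters are the standard Muller-Brown values; the reward shaping
# (energy penalty plus distance-to-product term) is an assumption for illustration.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

# Muller-Brown potential parameters (standard literature values).
A  = np.array([-200.0, -100.0, -170.0, 15.0])
a  = np.array([-1.0, -1.0, -6.5, 0.7])
b  = np.array([0.0, 0.0, 11.0, 0.6])
c  = np.array([-10.0, -10.0, -6.5, 0.7])
x0 = np.array([1.0, 0.0, -0.5, -1.0])
y0 = np.array([0.0, 0.5, 1.5, 1.0])

def muller_brown(x, y):
    """Potential energy at the point (x, y)."""
    dx, dy = x - x0, y - y0
    return float(np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)))

class MullerBrownEnv(gym.Env):
    """State = current 2-D geometry, action = bounded displacement on the PES."""

    def __init__(self, step_size=0.05, max_steps=200):
        self.step_size = step_size
        self.max_steps = max_steps
        self.observation_space = spaces.Box(low=-2.0, high=2.5, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        self.reactant = np.array([-0.558, 1.442])   # deepest minimum (reactant basin)
        self.product  = np.array([0.623, 0.028])    # another minimum (product basin)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.pos = self.reactant.copy()
        self.steps = 0
        return self.pos.astype(np.float32), {}

    def step(self, action):
        # Take a small step on the surface and evaluate the new energy.
        self.pos = self.pos + self.step_size * np.clip(action, -1.0, 1.0)
        self.steps += 1
        energy = muller_brown(*self.pos)
        dist = np.linalg.norm(self.pos - self.product)
        # Reward: stay low in energy while moving toward the product basin.
        reward = -0.01 * energy - dist
        terminated = bool(dist < 0.1)
        truncated = self.steps >= self.max_steps
        return self.pos.astype(np.float32), reward, terminated, truncated, {}
```

Under these assumptions, an off-policy continuous-control learner such as Soft Actor-Critic [20, 21] could be trained on this environment, and the highest energy visited along a converged low-energy reactant-to-product path would give a rough estimate of the barrier.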
References
Y. Bai, E. Yang, B. Han, Y. Yang, J. Li, Y. Mao, G. Niu and T. Liu, Understanding and improving early stopping for learning with noisy labels, in: Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang and J.W. Vaughan, eds, Vol. 34, Curran Associates, Inc., 2021, pp. 24392–24403, https://dl.acm.org/doi/10.5555/3540261.3542128.
R. Barrett and J. Westermayr, Reinforcement learning for traversing chemical structure space: Optimizing transition states and minimum energy paths of molecules, The Journal of Physical Chemistry Letters 15(1) (2024), 349–356.
C. Beeler, S.G. Subramanian, K. Sprague, C. Bellinger, M. Crowley and I. Tamblyn, Demonstrating ChemGymRL: An interactive framework for reinforcement learning for digital chemistry, in: AI for Accelerated Materials Design – NeurIPS 2023 Workshop, 2023, https://openreview.net/forum?id=cSz69rFRvS.
C. Beeler, U. Yahorau, R. Coles, K. Mills, S. Whitelam and I. Tamblyn, Optimizing thermodynamic trajectories using evolutionary and gradient-based reinforcement learning, Phys. Rev. E 104 (2021), 064128.
C.M. Bishop, Training with noise is equivalent to Tikhonov regularization, Neural Computation 7(1) (1995), 108–116.
P.G. Bolhuis and D.W.H. Swenson, Transition path sampling as Markov chain Monte Carlo of trajectories: Recent algorithms, software, applications, and future outlook, Advanced Theory and Simulations 4(4) (2021), 2000237.
G. Brunner, O. Richter, Y. Wang and R. Wattenhofer, Teaching a machine to read maps with deep reinforcement learning, in: Proceedings of the AAAI Conference on Artificial Intelligence 32(1), 2018.
S. Choi, Prediction of transition state structures of gas-phase chemical reactions via machine learning, Nat Commun 14 (2023).
C. Duan, Y. Du, H. Jia and H.J. Kulik, Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model, Nature Computational Science 3(12) (2023), 1045–1055.
W. E, W. Ren and E. Vanden-Eijnden, Simplified and improved string method for computing the minimum energy paths in barrier-crossing events, The Journal of Chemical Physics 126(16) (2007), 164103.
S. Fujimoto, H. van Hoof and D. Meger, Addressing Function Approximation Error in Actor-Critic Methods, 2018, https://arxiv.org/abs/1802.09477.
A. Goodrow, A.T. Bell and M. Head-Gordon, Transition state-finding strategies for use with the growing string method, The Journal of Chemical Physics 130(24) (2009), 244108.
S. Gow, M. Niranjan, S. Kanza and J.G. Frey, A review of reinforcement learning in chemistry, Digital Discovery 1 (2022), 551–567.
J. Guo, T. Gao, P. Zhang, J. Han and J. Duan, Deep reinforcement learning in finite-horizon to explore the most probable transition pathway, Physica D: Nonlinear Phenomena 458 (2024), 133955.
T. Haarnoja, A. Zhou, P. Abbeel and S. Levine, Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, https://arxiv.org/abs/1801.01290.
T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel and S. Levine, Soft Actor-Critic Algorithms and Applications, 2019, https://arxiv.org/abs/1812.05905.
S. Heinen, G.F. von Rudorff and O.A. von Lilienfeld, Transition state search and geometry relaxation throughout chemical compound space with quantum machine learning, The Journal of Chemical Physics 157(22) (2022), 221102.
G. Henkelman, B.P. Uberuaga and H. Jónsson, A climbing image nudged elastic band method for finding saddle points and minimum energy paths, The Journal of Chemical Physics 113(22) (2000), 9901–9904.
L. Holdijk, Y. Du, F. Hooft, P. Jaini, B. Ensing and M. Welling, Stochastic Optimal Control for Collective Variable Free Sampling of Molecular Transition Paths, 2023, https://arxiv.org/abs/2207.02149.
R. Jackson, W. Zhang and J. Pearson, TSNet: Predicting transition state structures with tensor field networks and transfer learning, Chem. Sci. 12 (2021), 10022–10040.
M. Jafari and P.M. Zimmerman, Reliable and efficient reaction path and transition state finding for surface reactions with the growing string method, Journal of Computational Chemistry 38(10) (2017), 645–658.
H. Jung, R. Covino, A. Arjun et al., Machine-guided path sampling to discover mechanisms of molecular self-organization, Nat Comput Sci 3 (2023), 334–345.
L.P. Kaelbling, M.L. Littman and A.W. Moore, Reinforcement learning: A survey, J. Artif. Int. Res. 4(1) (1996), 237–285, https://dl.acm.org/doi/10.5555/1622737.1622748.
A. Khan and A. Lapkin, Searching for optimal process routes: A reinforcement learning approach, Computers & Chemical Engineering 141 (2020), 107027.
O.-P. Koistinen, F.B. Dagbjartsdóttir, V. Ásgeirsson, A. Vehtari and H. Jónsson, Nudged elastic band calculations accelerated with Gaussian process regression, The Journal of Chemical Physics 147(15) (2017), 152720.
T. Lan and Q. An, Discovering catalytic reaction networks using deep reinforcement learning from first-principles, Journal of the American Chemical Society 143(40) (2021), 16804–16812.
T. Lan, H. Wang and Q. An, Enabling high throughput deep reinforcement learning with first principles to investigate catalytic reaction mechanisms, Nat Commun 15 (2024), 6281.
K.-D. Luong and A. Singh, Application of transformers in cheminformatics, Journal of Chemical Information and Modeling 64(11) (2024), 4392–4409.
P. Maes, Modeling adaptive autonomous agents, Artificial Life 1(1–2) (1993), 135–162.
M.Z. Makoś, N. Verma, E.C. Larson, M. Freindorf and E. Kraka, Generative adversarial networks for transition state geometry prediction, The Journal of Chemical Physics 155(2) (2021), 024116.
T. Mannucci and E.-J. van Kampen, A hierarchical maze navigation algorithm with reinforcement learning and mapping, in: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 2016, pp. 1–8.
A.W. Mills, J.J. Goings, D. Beck, C. Yang and X. Li, Exploring potential energy surfaces using reinforcement machine learning, Journal of Chemical Information and Modeling 62(13) (2022), 3169–3179.
K. Müller and L.D. Brown, Location of saddle points and minimum energy paths by a constrained simplex optimization procedure, Theoret. Chim. Acta 53 (1979), 75–93.
P. Nakkiran, G. Kaplun, Y. Bansal, T. Yang, B. Barak and I. Sutskever, Deep double descent: Where bigger models and more data hurt, in: International Conference on Learning Representations, 2020, https://openreview.net/forum?id=B1g5sA4twr.
D. Osmanković and S. Konjicija, Implementation of Q-learning algorithm for solving maze problem, in: 2011 Proceedings of the 34th International Convention MIPRO, 2011, pp. 1619–1622, https://ieeexplore.ieee.org/document/5967320.
E. Parisotto and R. Salakhutdinov, Neural Map: Structured Memory for Deep Reinforcement Learning, 2017, https://arxiv.org/abs/1702.08360.
G.M. Rotskoff, A.R. Mitchell and E. Vanden-Eijnden, Active importance sampling for variational objectives dominated by rare events: Consequences for optimization and generalization, in: Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, J. Bruna, J. Hesthaven and L. Zdeborová, eds, Proceedings of Machine Learning Research, Vol. 145, PMLR, 2022, pp. 757–780, https://proceedings.mlr.press/v145/rotskoff22a.html.
D. Silver, S. Singh, D. Precup and R.S. Sutton, Reward is enough, Artificial Intelligence 299 (2021), 103535.
R.S. Sutton and A.G. Barto, Reinforcement Learning: An Introduction, A Bradford Book, MIT Press, 1998, ISBN 9780262193986, https://books.google.dk/books?id=CAFR6IBF4xYC.
M. Towers, J.K. Terry, A. Kwiatkowski, J.U. Balis, G.D. Cola, T. Deleu, M. Goulão, A. Kallinteris, A. Kg, M. Krimmel, R. Perez-Vicente, A. Pierré, S. Schulhoff, J.J. Tai, A.T.J. Shen and O.G. Younis, Gymnasium, Zenodo, 2023.
G. Veviurko, W. Böhmer and M. de Weerdt, To the Max: Reinventing Reward in Reinforcement Learning, 2024, https://arxiv.org/abs/2402.01361.
P.R. Vlachas, J. Zavadlav, M. Praprotnik and P. Koumoutsakos, Accelerated simulations of molecular systems through learning of effective dynamics, Journal of Chemical Theory and Computation 18(1) (2022), 538–549.
B. Wander, M. Shuaibi, J.R. Kitchin, Z.W. Ulissi and C.L. Zitnick, CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks, 2024, https://arxiv.org/abs/2405.02078.
M. Wen, E.W.C. Spotte-Smith, S.M. Blau et al., Chemical reaction networks and opportunities for machine learning, Nat Comput Sci 3 (2023), 12–24.
M.A. Wiering and H. van Hasselt, Ensemble algorithms in reinforcement learning, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38(4) (2008), 930–936.
C. Zhang and A.A. Lapkin, Reinforcement learning optimization of reaction routes on the basis of large, hybrid organic chemistry–synthetic biological, reaction network data, React. Chem. Eng. 8 (2023), 2491–2504.
J. Zhang, Y.-K. Lei, Z. Zhang, X. Han, M. Li, L. Yang, Y.I. Yang and Y.Q. Gao, Deep reinforcement learning of transition states, Phys. Chem. Chem. Phys. 23 (2021), 6888–6895.
X. Zhang, Actor-Critic Algorithm for High-dimensional Partial Differential Equations, 2020, https://arxiv.org/abs/2010.03647.
Z. Zhou, X. Li and R.N. Zare, Optimizing chemical reactions with deep reinforcement learning, ACS Central Science 3(12) (2017), 1337–1344.
License
Copyright (c) 2025 Prof. Anna C. Lindström

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their articles published in this journal. All articles are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly cited.