Correcting Differences with Machine Learning

The Δ-ML approach drastically reduces the mean absolute error (MAE) in atomization enthalpies for the PM7 and B3LYP methods with respect to accurate G4MP2 values.

In our recent study, we propose using machine learning (ML) to correct differences between properties calculated with two quantum chemical (QC) methods of different accuracy.

In the Δ-ML approach, an ML model is trained on the differences between a property $$P_t$$ calculated at the target level of theory and a property $$P'_b$$ calculated at the baseline level of theory. This ML model is then used to predict the correction $$\Delta_b^t$$ for out-of-sample molecules.
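In symbols, and assuming the sign convention "target minus baseline" (the text above only says that the model is trained on the differences), the training label and the resulting estimate for a new molecule could be written as:

$$\Delta_b^t = P_t - P'_b, \qquad P_t \approx P'_b + \Delta_b^{t,\mathrm{ML}}$$

Here $$\Delta_b^{t,\mathrm{ML}}$$ is a label introduced only for this note; it stands for the ML prediction of $$\Delta_b^t$$. For an out-of-sample molecule only the baseline value $$P'_b$$ has to be computed with a QC method.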

The Δ-ML approach can be used to correct errors in a property calculated with a less accurate but faster QC method. Once reference data for the training-set molecules have been generated at the more accurate but slower level of theory, training the ML model is relatively fast, and ML corrections for new molecules are computed essentially for free.
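The workflow described above can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the one-dimensional "descriptor", the functions `toy_baseline` and `toy_target` standing in for the cheap and expensive QC methods, and the choice of kernel ridge regression with a Gaussian kernel as the ML model (this post does not specify the model or the molecular representation). It is not the implementation used in the study.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between two descriptor vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [yi] for row, yi in zip(A, y)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def train_delta_model(X, p_target, p_baseline, sigma=1.0, lam=1e-4):
    """Fit kernel ridge regression to the differences target - baseline."""
    deltas = [t - b for t, b in zip(p_target, p_baseline)]
    K = [[gaussian_kernel(xi, xj, sigma) + (lam if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    alphas = solve_linear(K, deltas)
    return X, alphas, sigma


def predict_delta(model, x):
    """Predict the correction for a new descriptor x."""
    X, alphas, sigma = model
    return sum(a * gaussian_kernel(x, xi, sigma) for a, xi in zip(alphas, X))


# Hypothetical stand-ins for the fast baseline and the slow target method.
def toy_baseline(x):
    return x[0] ** 2


def toy_target(x):
    return x[0] ** 2 + 0.5 * math.sin(x[0])


# Training set: both levels of theory are evaluated once.
X_train = [[0.25 * i] for i in range(21)]
model = train_delta_model(
    X_train,
    [toy_target(x) for x in X_train],
    [toy_baseline(x) for x in X_train],
)

# New "molecules": only the baseline is evaluated, the correction is predicted.
X_test = [[0.1 + 0.5 * i] for i in range(10)]
reference = [toy_target(x) for x in X_test]  # used here only to measure errors
baseline = [toy_baseline(x) for x in X_test]
corrected = [b + predict_delta(model, x) for b, x in zip(baseline, X_test)]

mae_baseline = sum(abs(r - b) for r, b in zip(reference, baseline)) / len(X_test)
mae_corrected = sum(abs(r - c) for r, c in zip(reference, corrected)) / len(X_test)
print("MAE baseline: ", mae_baseline)
print("MAE corrected:", mae_corrected)
```

The point of the sketch is the division of labour: the expensive target values enter only through the training labels, while each new prediction costs one baseline evaluation plus a sum over kernel values.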

The Δ-ML approach can also be used to correct differences between different properties. For instance, B3LYP can be used to calculate atomization energies, which ML then corrects to obtain enthalpies of atomization at 298 K with chemical accuracy (relative to the more accurate target-level G4MP2 method).

The concepts behind Δ-ML and the automatic parametrization technique (APT) published earlier differ fundamentally. Δ-ML makes explicit on-top corrections to properties pre-calculated with less accurate QC methods, while the APT improves QC methods implicitly by correcting their parameters, which are then used to calculate the required properties. Thus, one or the other of these approaches may be preferable for the problem at hand.

One comment on “Correcting Differences with Machine Learning”
1. Pavlo Dral says:

This Δ-ML approach was featured and described in detail in Computational Chemistry Highlights (Comp. Chem. Highlights, 2015.04.big)

