OMx Methods Score Well on a Subset of the SAMPL5 Challenge

OMx methods have once again been shown to give results as accurate as those of DFT methods at a substantially lower computational cost. The OM2 method has outperformed the other semi-empirical methods and has essentially the same accuracy as BLYP for the distribution coefficient part of the SAMPL5 challenge.

We* have calculated distribution coefficients between water and cyclohexane for the molecules from the SAMPL5 challenge using MM and QM/MM free energy simulations[1]. The calculations have been done with explicit solvent. The MM part is based on the CHARMM force field, while DFT and a range of semi-empirical methods have been used for the QM part of the simulations. The Zwanzig equation has been used to calculate the MM to QM/MM free energy corrections.
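For readers unfamiliar with the Zwanzig equation, the correction is the exponential average of the energy difference over frames sampled at the MM level: ΔA = -kT ln⟨exp(-(U_QM/MM - U_MM)/kT)⟩_MM. The following is a minimal sketch of that formula only, not the code used in the paper. The function name, the temperature of 300 K, and the input values are assumptions made for illustration.

```python
import math

# Gas constant in kcal/(mol K); energies are assumed to be in kcal/mol.
R_KCAL = 0.0019872041


def zwanzig_correction(delta_u, temperature=300.0):
    """Estimate the MM -> QM/MM free energy correction by exponential averaging.

    delta_u: per-frame energy differences U_QM/MM - U_MM (kcal/mol),
             evaluated on frames sampled from the MM ensemble.
    temperature: assumed simulation temperature in kelvin (hypothetical default).
    """
    if not delta_u:
        raise ValueError("need at least one energy difference")
    beta = 1.0 / (R_KCAL * temperature)
    exponents = [-beta * du for du in delta_u]
    # Log-sum-exp trick: subtract the largest exponent to avoid overflow.
    largest = max(exponents)
    mean_exp = sum(math.exp(x - largest) for x in exponents) / len(exponents)
    return -(largest + math.log(mean_exp)) / beta


# Illustrative values only, not data from the paper.
correction = zwanzig_correction([1.2, 0.8, 1.5, 0.9, 1.1])
```

The exponential average is dominated by the few frames with the lowest energy difference, which is why estimates of this kind can converge poorly when the MM and QM/MM energy surfaces overlap badly.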

As the DFT method we have used BLYP/6-31G*, which we compared with the OMx (OM1, OM2, OM3, OM2-D3) and traditional (MNDO, MNDO/d, MNDOC, AM1, and PM3) semi-empirical methods. OMx methods include orthogonalization and other important corrections, which make them the most robust general-purpose semi-empirical wavefunction-based methods.

The task of computing distribution coefficients is greatly complicated by the possible formation of different solute species. For instance, solutes can aggregate, become protonated, undergo tautomeric transformations, and so on. Trying to include all these effects in the simulations would make them infeasible. Thus, we have had to neglect some of these effects and compare partition coefficients calculated for a single form of each solute with the experimental distribution coefficients.
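A partition coefficient for a single solute form follows from its transfer free energy through the standard relation log P = -ΔG_transfer / (RT ln 10). The sketch below illustrates that conversion. The function name, the temperature of 298.15 K, and the sign convention (ΔG for transfer from water to cyclohexane, so a negative value favors cyclohexane) are assumptions made for this example.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 0.0019872041


def log_p_from_transfer_free_energy(delta_g_transfer, temperature=298.15):
    """Convert a water -> cyclohexane transfer free energy (kcal/mol) to log P.

    Assumed convention: log P = log10([solute]_cyclohexane / [solute]_water),
    so a negative transfer free energy gives a positive log P.
    """
    return -delta_g_transfer / (R_KCAL * temperature * math.log(10.0))


# Illustrative value only: about -1.36 kcal/mol corresponds to log P near 1.
example_log_p = log_p_from_transfer_free_energy(-1.3642)
```

At room temperature, each log unit corresponds to roughly 1.36 kcal/mol of transfer free energy, which is a convenient scale for reading the errors reported below.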

Another complication is technical: since the parameters of some semi-empirical methods are available only for the elements H, C, N, O, and F, we could perform calculations only for a subset of the compounds. In addition, the set has been further reduced because of convergence problems arising from the use of the Zwanzig equation. Thus, all the comparisons below are for a reduced SAMPL5 set containing 22 of the initial 53 molecules.

The most important result is this: using DFT or OMx methods improves upon the pure MM transfer free energies, whereas using traditional semi-empirical methods leads to a slight deterioration of the calculated transfer free energies. Adding dispersion corrections (OM2-D3 vs. OM2) has practically no effect on the RMSE, but slightly decreases the mean signed error of the transfer free energies. OM2 and BLYP give our most accurate transfer free energies, with almost identical RMSEs of 3.5 and 3.6 kcal/mol, respectively. However, OM2 adds a computational cost of only 5 CPU-hours, while BLYP requires a whopping 1451 CPU-hours, roughly 290 times more. This means that you can calculate free energy corrections with OM2 on your laptop.
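The two error measures quoted above capture different things: the RMSE reflects the overall spread of the errors, while the mean signed error shows a systematic shift. A minimal sketch of both, using hypothetical function names and placeholder numbers rather than data from the paper:

```python
import math


def rmse(calculated, experimental):
    """Root-mean-square error between paired calculated and reference values."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def mean_signed_error(calculated, experimental):
    """Average of (calculated - reference); positive means overestimation."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(c - e for c, e in zip(calculated, experimental)) / len(calculated)


# Placeholder transfer free energies in kcal/mol, for illustration only.
calc = [1.0, -2.0, 3.0]
expt = [0.0, 0.0, 0.0]
spread = rmse(calc, expt)
shift = mean_signed_error(calc, expt)
```

A correction such as D3 can move the mean signed error while leaving the RMSE nearly unchanged if it shifts all values in the same direction by a small amount relative to the scatter.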

1. Gerhard König, Frank C. Pickard IV, Jing Huang, Andrew C. Simmonett, Florentina Tofoleanu, Juyong Lee, Pavlo O. Dral, Samarjeet Prasad, Michael Jones, Yihan Shao, Walter Thiel, Bernard R. Brooks, Calculating Distribution Coefficients Based on Multi-Scale Free Energy Simulations: An Evaluation of MM and QM/MM Explicit Solvent Simulations of Water-Cyclohexane Transfer in the SAMPL5 Challenge. J. Comput. Aided Mol. Des. 2016, 30, 989–1006. DOI: 10.1007/s10822-016-9936-x.

* The research has been led by Gerhard König with a humble contribution to the semi-empirical part by the author of this post.
