**We comprehensively analyzed the validity of the NDDO (neglect of diatomic differential overlap) approximation, which forms the basis for most modern semiempirical quantum chemical methods.**

Most modern general-purpose semiempirical quantum chemical (SQC) methods, e.g. MNDO, AM1, PM3, PM6, PM7, and Thiel’s OMx methods (see my post about them), are based on the NDDO (neglect of diatomic differential overlap) approximation. Within NDDO, many integrals, including all three-center (3-c) and four-center (4-c) two-electron (2-e) integrals, are neglected entirely, which formally reduces the scaling of the methods from quartic, O(N^{4}), to quadratic, O(N^{2}). This, together with the use of a valence-only minimal basis set, is a primary reason why the SQC methods are so fast. To compensate for this simplification, parameters and parametric functions are introduced and fitted to experimental and high-level theoretical data. Although the NDDO approximation has been around for decades, research into its validity is rather scarce. In our recent work led by Walter Thiel, we generated and analyzed 15.6 million one-electron (1-e) and 10.3 billion 2-e integrals (truly Big Data!) for a selection of 32 typical molecules.[1]
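To make the counting behind this scaling concrete, here is a minimal sketch that enumerates all 2-e integrals for a toy system by brute force and checks which ones NDDO would keep. The function name `classify_two_electron_integrals` and the `atom_of` layout are illustrative assumptions, not code or data from the paper, and permutational symmetry of the integrals is ignored for simplicity.

```python
from collections import Counter
from itertools import product


def classify_two_electron_integrals(atom_of):
    """Brute-force census of two-electron integrals (mu nu | rho sigma).

    atom_of[i] is the index of the atom that carries basis function i
    (a hypothetical toy layout, not data from the paper).
    Returns (counts, kept): counts[k] is the number of integrals spanning
    k distinct centers, kept is the number that NDDO retains.
    """
    counts = Counter()
    kept = 0
    for mu, nu, rho, sigma in product(range(len(atom_of)), repeat=4):
        a, b, c, d = atom_of[mu], atom_of[nu], atom_of[rho], atom_of[sigma]
        counts[len({a, b, c, d})] += 1
        # NDDO keeps an integral only if both orbital products are one-center:
        # mu and nu on the same atom, and rho and sigma on the same atom.
        if a == b and c == d:
            kept += 1
    return counts, kept


# Toy system: 3 atoms with 2 basis functions each.
counts, kept = classify_two_electron_integrals([0, 0, 1, 1, 2, 2])
print(dict(sorted(counts.items())), kept)
```

For this toy layout the census gives 1296 integrals in total, of which NDDO keeps 144: all 48 one-center integrals and 96 of the 672 two-center ones, while all 576 three-center integrals are dropped. More generally, for a system of M atoms with n functions each, the retained count is M^2 n^4 out of (M n)^4, which is the quartic-to-quadratic reduction mentioned above.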

The SQC methods also use the integrals calculated in the nonorthogonal atomic basis as an approximation to the integrals calculated in the Löwdin basis. Thus, we analyzed the effect of transforming the integrals from the nonorthogonal atomic basis to the Löwdin basis, as well as the effect of neglecting various types of integrals within the Hartree–Fock approximation.
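For readers who want the transformation spelled out, here is the standard Löwdin (symmetric) orthogonalization in matrix form. The notation is my own shorthand for this post, not necessarily that of the paper: S is the overlap matrix of the nonorthogonal atomic basis, H is a 1-e matrix, a tilde marks quantities in the Löwdin basis, and the subscripts A, B, C, D label the atoms on which the basis functions sit.

$$
\tilde{\mathbf{H}} = \mathbf{S}^{-1/2}\,\mathbf{H}\,\mathbf{S}^{-1/2},
\qquad
\widetilde{(\mu\nu|\rho\sigma)} = \sum_{\mu'\nu'\rho'\sigma'}
\left(\mathbf{S}^{-1/2}\right)_{\mu'\mu}
\left(\mathbf{S}^{-1/2}\right)_{\nu'\nu}
(\mu'\nu'|\rho'\sigma')
\left(\mathbf{S}^{-1/2}\right)_{\rho'\rho}
\left(\mathbf{S}^{-1/2}\right)_{\sigma'\sigma}
$$

In this notation, the NDDO assumption for the 2-e part can be written as

$$
\widetilde{(\mu_A\nu_B|\rho_C\sigma_D)} \approx \delta_{AB}\,\delta_{CD}\,(\mu_A\nu_B|\rho_C\sigma_D),
$$

i.e. a transformed integral is replaced by the untransformed one when both orbital products are one-center and set to zero otherwise.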

Some conclusions from this analysis:

- We confirmed that 4-c, many types of 3-c, and some two-center (2-c) 2-e integrals can be safely neglected after the transformation to the Löwdin basis for the minimal, valence-only basis set.
- Some types of integrals (2-c and 3-c 2-e hybrid integrals and 3-c 1-e nuclear attraction integrals) neglected in NDDO still have large values after the transformation to the Löwdin basis even in the minimal basis set.
- Although there is a reasonable correlation between the 1-e integrals calculated in the nonorthogonal atomic basis and in the Löwdin basis (in the minimal basis set), the deviations between them are very large, i.e. orthogonalization effects should be included explicitly in the 1-e part of the Hamiltonian, as in Thiel’s OMx methods. These corrections make the OMx methods the most robust among the SQC methods.

Our analysis also provides an additional reason against using larger basis sets to improve the accuracy of the SQC methods (another reason is that larger basis sets would obviously slow the methods down). We found that the NDDO approximation completely breaks down for larger basis sets.

Finally, we offer possible improvements to the model that future SQC methods may incorporate: they should include, in some way, the sizable integrals identified in our analysis that are currently neglected completely in the NDDO-based SQC methods.

1. Xin Wu, **Pavlo O. Dral**, Axel Koslowski, Walter Thiel, Big Data Analysis of *Ab Initio* Molecular Integrals in the Neglect of Diatomic Differential Overlap Approximation. *J. Comput. Chem.* **2018**, *Early View*. DOI: 10.1002/jcc.25748.
