Introduction
Electron behavior is central to chemistry, governing how bonds form and break. Quantum chemistry (QC), and in particular Density Functional Theory (DFT), describes this behavior in detail but becomes computationally prohibitive for larger molecules, limiting extensive simulations. Machine learning (ML), particularly deep neural networks (DNNs), offers a potential solution by learning the potential energy surface (PES) from QC data. Existing DNN potentials, however, mostly handle neutral molecules and cannot accurately model open-shell systems. This paper addresses that gap by developing a DNN model that predicts properties of open-shell ions, focusing on intensive properties that are independent of system size and are challenging to learn because of non-locality and long-range interactions. The central research question is whether information from different molecular charge states can be fused to improve the accuracy, generalizability, and data efficiency of ML models. Two strategies are explored: multitask learning and data fusion. The success of deep learning with multimodal data suggests that learning different molecular states with a single model can be more efficient. The work's importance lies in its potential to significantly improve the efficiency and accuracy of simulations involving charged species, which are vital to many chemical processes.
Literature Review
The literature review discusses existing DNN-based molecular potentials, dividing them into two groups. The first group, including Behler-Parrinello (BP), ANI, and TensorMol, uses symmetry functions to describe atomic environments, while the second group, including HIP-NN, DTNN, SchNet, and PhysNet, learns atomic representations through message-passing techniques. The ANAKIN-ME (ANI) method, a transferable DNN-based potential, and the AIMNet architecture, which improves ANI's handling of long-range interactions, are highlighted. The challenge of predicting intensive properties independent of system size is discussed, along with existing attempts to use ML for such properties. The paper also notes the higher computational cost of QM calculations for ionized states due to the unrestricted (open-shell) Hamiltonian formalism.
Methodology
The study compares four DNN architectures: ANI, AIMNet, AIMNet-MT (a multitask model), and AIMNet-NSE (the proposed model with Neural Spin Equilibration). ANI, which encodes atomic environments in atomic environment vectors (AEVs) built from symmetry functions, suffers from the curse of dimensionality. AIMNet addresses this by using learnable atomic feature vectors (AFVs), improving performance and enabling multitask learning. AIMNet-MT jointly predicts energies and spin-polarized atomic charges for the neutral, cationic, and anionic states. The novel AIMNet-NSE incorporates a Neural Spin-charge Equilibration (NSE) unit that predicts partial spin-polarized atomic charges together with per-atom weight factors and redistributes the charges so that their sums match the specified total molecular spin charges. This correction feeds back into the atomic feature vectors over several iterative passes, analogous to self-consistent field (SCF) iterations in quantum mechanics. The models are evaluated on the Ions-12 (up to 12 non-hydrogen atoms) and Ions-16 (13-16 non-hydrogen atoms) datasets, with vertical ionization potentials (IP) and electron affinities (EA) computed as energy differences between charge states; the ChEMBL-20 dataset is used for real-world application testing. Performance is compared using root-mean-square errors (RMSEs) on total molecular energies, IP, and EA. The datasets were generated by first obtaining a single 3D conformation for each molecule from its SMILES representation and then sampling many conformations with molecular dynamics driven by a quantum mechanically derived force field (QMDFF); reference QM energies and charges were computed at the PBE0/ma-def2-SVP level. The models were trained with minibatch gradient descent using the Adam optimizer, minimizing a weighted multi-target MSE loss function, and an ensemble of five models (ens5) was used to improve prediction accuracy. Additionally, the effect of the number of iterative passes in the AIMNet models is investigated.
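The NSE redistribution step can be illustrated with a short sketch. This is a minimal illustration under assumptions, not the authors' implementation: the function name neural_spin_equilibration, the tensor shapes, and the toy inputs are invented for clarity. The idea shown is that each raw spin-polarized charge is shifted by a share of the remaining charge deficit proportional to its predicted weight factor, so the corrected charges sum exactly to the requested molecular charge in each spin channel.

```python
import torch

def neural_spin_equilibration(q_hat, f, total_charge):
    """Weighted redistribution of spin-polarized atomic charges (illustrative sketch).

    q_hat:        (n_atoms, 2) raw partial charges predicted per spin channel
    f:            (n_atoms, 2) positive per-atom weight factors predicted by the network
    total_charge: (2,) target total charge for each spin channel

    Returns corrected charges whose per-channel sums equal total_charge.
    """
    deficit = total_charge - q_hat.sum(dim=0)   # charge missing in each spin channel
    weights = f / f.sum(dim=0)                  # normalize weights within each channel
    return q_hat + weights * deficit            # spread the deficit proportionally

# Toy usage: 3 atoms, neutral molecule (each spin channel should sum to zero)
q_hat = torch.tensor([[0.10, 0.05], [-0.20, -0.15], [0.02, 0.08]])
f = torch.rand(3, 2) + 0.1                      # any positive weights work
q = neural_spin_equilibration(q_hat, f, torch.tensor([0.0, 0.0]))
print(q.sum(dim=0))                             # ~tensor([0., 0.]) up to float precision
```

Because the correction is a differentiable renormalization rather than a hard constraint solver, such a step can sit inside the network and be repeated across the iterative, SCF-like passes described above.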
Key Findings
The AIMNet-NSE model significantly outperforms ANI and AIMNet in predicting energies and properties of charged molecules. It achieves RMSEs of around 3-4 kcal/mol for total energies and 0.10-0.15 eV for IP and EA on the Ions-16 dataset. The AIMNet-NSE model's accuracy is consistent across different charge states and molecule sizes. The iterative 'SCF-like' updates in AIMNet and AIMNet-NSE are crucial for accuracy, especially for cations. Data fusion in AIMNet-MT provides a marginal improvement. The model also accurately predicts conceptual DFT quantities (electronegativity, hardness, electrophilicity index, Fukui functions, and philicity indexes) without additional training, achieving R² values between 0.82 and 0.97. This allows for direct prediction of reaction outcomes, demonstrated by a case study on the regioselectivity of electrophilic aromatic substitution reactions. In this case study, a random forest (RF) classifier trained with AIMNet-NSE-derived descriptors (Fukui coefficients, atomic philicity indexes, and the AIM layer of the query atom in the cation-radical form) achieves 90% accuracy on the validation set and 85% on the test set. This performance is comparable to existing methods but with a six-order-of-magnitude speedup due to the elimination of QM calculations. The AIMNet-NSE model provides at least a two-order-of-magnitude speedup compared to existing semi-empirical methods like IPEA-xTB, while maintaining higher accuracy.
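The conceptual DFT quantities reported above follow from standard finite-difference formulas once IP, EA, and the atomic charges of the three charge states are available from a single model evaluation. The sketch below uses textbook definitions; the function name and argument layout are illustrative, and conventions differ across the literature (for example, hardness is sometimes defined with an extra factor of 1/2).

```python
def conceptual_dft_indices(ip, ea, q_neutral, q_cation, q_anion):
    """Finite-difference conceptual-DFT descriptors from IP/EA (eV) and
    condensed atomic charges of the neutral, cation, and anion.
    """
    # IP and EA are themselves vertical energy differences,
    # e.g. IP = E(cation) - E(neutral), EA = E(neutral) - E(anion).
    chi = 0.5 * (ip + ea)                 # Mulliken electronegativity
    eta = ip - ea                         # chemical hardness (factor-free convention)
    omega = chi ** 2 / (2.0 * eta)        # global electrophilicity index
    # Condensed Fukui functions in terms of partial charges (finite differences)
    f_plus = [qn - qa for qn, qa in zip(q_neutral, q_anion)]    # reactivity toward nucleophilic attack
    f_minus = [qc - qn for qc, qn in zip(q_cation, q_neutral)]  # reactivity toward electrophilic attack
    # Condensed (local) philicity indexes: global omega weighted by the Fukui function
    w_plus = [omega * fk for fk in f_plus]
    w_minus = [omega * fk for fk in f_minus]
    return chi, eta, omega, f_plus, f_minus, w_plus, w_minus
```

Descriptors of this kind (Fukui coefficients and atomic philicity indexes) are exactly the features fed to the random forest classifier in the regioselectivity case study, which is why no additional QM calculations are needed at prediction time.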
Discussion
The AIMNet-NSE model addresses the challenge of accurately representing spatially delocalized electronic density and long-range Coulombic interactions in charged molecules. The iterative message-passing mechanism effectively captures complex relationships between atoms. The key success factors are multimodal learning, a shared information-rich representation of atoms across modalities, and the NSE block for charge equilibration. The model serves as a physically consistent charge equilibration scheme, improving the synergy between ML and physics-based models. The ability to derive conceptual DFT quantities directly from the model opens new avenues for reactivity prediction and reaction outcome modeling, offering a high-throughput alternative to computationally expensive QM methods.
Conclusion
The AIMNet-NSE architecture successfully predicts energies, electron affinities, ionization potentials, and conceptual DFT indexes for molecules in various charge states. Its accuracy and computational efficiency make it suitable for high-throughput applications, offering a faster and more accurate alternative to existing methods. Future research could extend the model to a wider range of elements, cover additional reaction types, and incorporate explicit solvent effects.
Limitations
The current model is primarily trained and tested on organic molecules containing specific elements (H, C, N, O, F, Si, P, S, and Cl), and accuracy may decrease for molecules with different compositions or structures. The choice of DFT method (PBE0/ma-def2-SVP) for the reference data also bounds the attainable accuracy, since the model inherits any systematic errors of that level of theory. Transferability to systems outside the training set is ultimately limited by the coverage of the training data.