Machine learning the Hubbard U parameter in DFT+U using Bayesian optimization

M. Yu, S. Yang, et al.

This study presents a machine learning approach that uses Bayesian optimization to find optimal Hubbard *U* parameters for DFT+*U* calculations, yielding band structures comparable to, or better than, those obtained with *U* values from the traditional linear response method. The research was conducted by Maituo Yu, Shuyang Yang, Chunzhi Wu, and Noa Marom.

Introduction

Density functional theory (DFT) is widely used for electronic structure simulations, and semi-local functionals such as PBE are a common choice for high-throughput materials discovery. However, the self-interaction error (SIE) in local and semi-local functionals leads to underestimated band gaps, and in some cases semiconductors are incorrectly predicted to be metals. Hybrid functionals give improved band gaps but are computationally expensive. DFT+*U*, which adds a Hubbard-like correction to mitigate SIE, offers a computationally efficient alternative. Its accuracy depends strongly on the chosen Hubbard *U* parameters, which are often set empirically or determined with computationally intensive methods such as linear response (LR). This paper introduces a machine learning approach that uses Bayesian optimization (BO) to determine *U* parameters with improved accuracy and efficiency.

Literature Review

Several first-principles methods exist for determining the Hubbard *U* parameter. The linear response (LR) method, based on constrained DFT (CDFT), calculates the effective *U* from the difference between the inverse non-interacting and interacting density response functions. This method requires large supercells for convergence, which makes it computationally costly. The unrestricted Hartree-Fock (UHF) approach employs electrostatically embedded finite-sized clusters and likewise must be converged with respect to cluster size. The constrained random-phase approximation (CRPA) is another option, but it is significantly more computationally demanding. The current study proposes Bayesian optimization as an alternative and focuses on materials where semi-local functionals perform poorly, such as transition metal monoxides, europium chalcogenides, and narrow-gap semiconductors.
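For orientation, the linear response estimate is commonly written as follows. The notation is an assumption taken from the standard formulation of the method, not from this summary: α_J is a small potential shift applied to the Hubbard site J, n_I is the occupation of site I, χ is the self-consistent (interacting) response, and χ⁰ is the bare (non-interacting) response.

```latex
\chi_{IJ} = \frac{\partial n_I}{\partial \alpha_J}, \qquad
\chi^{0}_{IJ} = \frac{\partial n^{0}_I}{\partial \alpha_J}, \qquad
U_{\mathrm{eff}} = \left[ \left(\chi^{0}\right)^{-1} - \chi^{-1} \right]_{II}
```

Because the response matrices couple a perturbed site to its periodic images, the perturbed atom has to be isolated in a supercell, which is the source of the cost noted above.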

Methodology

The proposed method uses Bayesian optimization (BO) to maximize an objective function designed to reproduce the band gap and band structure obtained from an accurate hybrid functional (HSE in this case). The objective function, *f(U)*, is formulated as the negative of a weighted sum of two terms: the squared difference between the HSE and PBE+*U* band gaps, and the mean squared error between their band structures. Maximizing *f(U)* therefore minimizes the disagreement with HSE. A Gaussian process serves as the statistical model in the BO algorithm and quantifies the uncertainty associated with each prediction. The upper confidence bound (UCB) acquisition function guides the selection of *U* values to sample in each iteration. The workflow is iterative: PBE+*U* calculations are performed at the suggested *U* values, the posterior probability distribution of the objective function is updated, and the next promising *U* values are selected, until convergence is reached. The computationally expensive HSE calculation is performed only once. Compared with LR, the computational cost is significantly lower, because only a small number of PBE+*U* calculations on unit cells are needed instead of calculations on large supercells or clusters. The method's performance was assessed on a transition metal oxide (NiO), a europium chalcogenide (EuTe), and a narrow-gap semiconductor (InAs), and the results were compared with those obtained using the LR method.
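The loop described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_pbe_u` is a hypothetical stand-in for a real PBE+*U* calculation, the weights `alpha1` and `alpha2`, the RBF kernel, its length scale, the value of `kappa`, and the 0 to 10 eV grid are all illustrative choices.

```python
import math

def toy_pbe_u(u):
    """Hypothetical stand-in for a PBE+U run: returns (gap, band energies) in eV."""
    gap = 0.6 + 0.4 * u
    bands = [-2.0 - 0.1 * u, -1.0 - 0.1 * u, 1.0 + 0.1 * u, 2.0 + 0.1 * u]
    return gap, bands

# Plays the role of the single reference (HSE) calculation; invented data.
REF_GAP, REF_BANDS = toy_pbe_u(6.0)

def objective(u, alpha1=0.25, alpha2=0.75):
    """f(U) = -(alpha1 * gap error^2 + alpha2 * band-structure MSE); weights are assumed."""
    gap, bands = toy_pbe_u(u)
    gap_err = (REF_GAP - gap) ** 2
    band_mse = sum((r - b) ** 2 for r, b in zip(REF_BANDS, bands)) / len(bands)
    return -(alpha1 * gap_err + alpha2 * band_mse)

def rbf(a, b, length=2.0):
    """Squared-exponential kernel with unit variance (an assumed choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve_lower(low, b):
    """Forward substitution for L y = b."""
    y = []
    for i, row in enumerate(low):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def solve_upper_t(low, y):
    """Back substitution for L^T x = y."""
    n = len(low)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_fit(xs, ys, noise=1e-4):
    """Factor the kernel matrix once per iteration; returns (L, K^-1 y)."""
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    low = cholesky(k)
    return low, solve_upper_t(low, solve_lower(low, ys))

def gp_predict(xs, low, alpha, x_new):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    var = max(1.0 - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(var)

def bayes_opt(n_iter=12, kappa=2.0):
    """Maximize objective(U) on a grid using the UCB rule mean + kappa * std."""
    grid = [0.1 * i for i in range(101)]   # candidate U values, 0 to 10 eV
    xs = [grid[0], grid[-1]]               # two initial samples at the edges
    ys = [objective(u) for u in xs]
    for _ in range(n_iter):
        low, alpha = gp_fit(xs, ys)

        def ucb(u):
            mean, std = gp_predict(xs, low, alpha, u)
            return mean + kappa * std

        u_next = max((u for u in grid if u not in xs), key=ucb)
        xs.append(u_next)                  # one new "PBE+U calculation" per step
        ys.append(objective(u_next))
    best_f, best_u = max(zip(ys, xs))
    return best_u, best_f
```

Calling `bayes_opt()` returns the best sampled *U* and its objective value. In a real workflow each call to `objective` would launch one PBE+*U* calculation on the unit cell, while the reference data would come from the single HSE run.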

Key Findings

The BO algorithm effectively determined optimal *U* parameters for the chosen materials. For NiO, one-parameter BO (with *U* applied to the Ni *d* states) yielded *U* = 6.8 eV for Ni *d*, resulting in a band gap of 3.36 eV, correcting the location of the conduction band minimum (CBM), and improving on the PBE result. Two-parameter BO (with *U* applied to both the Ni *d* and O *p* states) yielded even better agreement with HSE. Similarly, for EuTe, BO with *U* applied to the Eu 4*f* orbitals gave a band gap of 0.71 eV, in contrast to the metallic prediction of PBE and in qualitative agreement with HSE. InAs, which required two-parameter BO on the In *sp* and As *p* orbitals, demonstrated that BO can find negative *U* values (-0.5 eV for In *sp* and -7.5 eV for As *p*), and these values reproduced the HSE band gap and band structure. In all cases, the BO method achieved results comparable or superior to those of the LR method while being significantly more computationally efficient (roughly 4.5 to 9 times faster, depending on the material and the number of parameters). The *U* parameters obtained for bulk InAs were also shown to transfer to a slab model.
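To show how an optimized value might be fed back into a calculation, the sketch below formats DFT+*U* input tags in the style of a VASP INCAR file. It is an illustration only: the helper `ldau_tags`, the species order, and the choice to pass the effective *U* through `LDAUU` with `LDAUJ` set to zero are assumptions, and the tag values should be checked against the VASP documentation before use.

```python
# Angular momentum quantum number for each orbital label.
L_QUANTUM = {"p": 1, "d": 2, "f": 3}

def ldau_tags(species, corrections):
    """Build INCAR-style DFT+U lines.

    species:     element symbols in the same order as in the structure file
    corrections: {element: (orbital label, effective U in eV)}
    """
    ldaul, ldauu = [], []
    for element in species:
        if element in corrections:
            orbital, u_eff = corrections[element]
            ldaul.append(str(L_QUANTUM[orbital]))
            ldauu.append(f"{u_eff:.2f}")
        else:
            ldaul.append("-1")      # -1 switches the correction off for this species
            ldauu.append("0.00")
    return [
        "LDAU = .TRUE.",
        "LDAUTYPE = 2",             # simplified (Dudarev) scheme: only U - J matters
        "LDAUL = " + " ".join(ldaul),
        "LDAUU = " + " ".join(ldauu),
        "LDAUJ = " + " ".join("0.00" for _ in species),
    ]

# One-parameter NiO result quoted above, assuming the species order Ni, O.
nio_lines = ldau_tags(["Ni", "O"], {"Ni": ("d", 6.8)})
```

Joining `nio_lines` with newlines gives a fragment that could be appended to an input file. A two-parameter run would simply add an entry for O *p* to `corrections`.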

Discussion

The results demonstrate the effectiveness of Bayesian optimization for determining optimal Hubbard *U* parameters in DFT+*U* calculations. The approach consistently produced band structures comparable to those obtained from hybrid functionals at a significantly lower computational cost. The ability to handle negative *U* values, as shown for InAs, is a significant advantage over the LR method. The success of BO underscores the potential of machine learning techniques to improve the accuracy and efficiency of electronic structure calculations. The findings suggest that PBE+*U* with BO-derived parameters can provide the accuracy of hybrid functionals at the computational cost of semi-local functionals.

Conclusion

This work presents a novel machine learning-based approach that uses Bayesian optimization to determine optimal Hubbard *U* parameters in DFT+*U* calculations. Its comparable or superior accuracy and significantly reduced computational cost relative to the established linear response method demonstrate the efficacy of the approach across several material classes. Future research could explore the applicability of this method to a broader range of materials and investigate more sophisticated machine learning models for further improvement.

Limitations

The accuracy of the BO method depends on the choice of objective function and on the weights assigned to band gap and band structure agreement. Although the study uses a well-established hybrid functional (HSE) as the reference, the optimized *U* values inherit any inaccuracy in that reference data. The generalizability of the method to other types of materials needs further investigation. The study uses the Dudarev formalism as implemented in VASP; other DFT+*U* implementations may yield different results.
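For reference, the Dudarev form of the correction is usually written as below. The symbols are assumptions from the standard formulation rather than from this summary: ρ^σ is the on-site occupation matrix for spin σ, and only the difference *U* − *J* enters, which is why a single effective *U* per orbital channel is the quantity being optimized.

```latex
E_{\mathrm{DFT}+U} = E_{\mathrm{DFT}}
  + \frac{U_{\mathrm{eff}}}{2} \sum_{\sigma}
    \mathrm{Tr}\left[ \rho^{\sigma} - \rho^{\sigma}\rho^{\sigma} \right],
\qquad U_{\mathrm{eff}} = U - J
```

Other DFT+*U* schemes treat *U* and *J* separately or use different projectors for the occupation matrix, so *U* values optimized in one implementation are not guaranteed to carry over to another.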
