Engineering and Technology
Machine learning the Hubbard *U* parameter in DFT+*U* using Bayesian optimization
M. Yu, S. Yang, et al.
The study addresses the challenge that semi-local DFT functionals (e.g., PBE) suffer from self-interaction error, leading to systematically underestimated band gaps and, in some cases, spurious metallicity. While hybrid functionals (e.g., HSE) often rectify these issues, their computational cost limits their use for large systems and high-throughput screening. The DFT+U method offers a more affordable correction, but its accuracy hinges on selecting appropriate, system-dependent Hubbard U parameters. Conventional strategies for determining U include empirical fitting to experiments, which fails when data are unavailable, and several first-principles approaches (linear response, unrestricted Hartree–Fock, cRPA), which can be computationally demanding. The research question is whether a machine-learning-based Bayesian optimization framework can efficiently and robustly determine optimal U parameters that reproduce hybrid-functional band gaps and qualitative band-structure features, thereby achieving hybrid-level accuracy at semi-local cost.
Prior work established DFT+U as a pragmatic correction for correlated materials, but obtaining reliable U values is nontrivial. The linear response (LR) method (Cococcioni and de Gironcoli) computes Ueff from differences between interacting and non-interacting response functions, requiring supercells and convergence with cell size, increasing cost. A modified self-consistent LR exists but remains demanding. The unrestricted Hartree–Fock (UHF) embedded-cluster approach also requires convergence with cluster size. Constrained RPA (cRPA) provides U from many-body screening but is significantly more expensive. Hybrid functionals such as HSE can serve as accurate references for band structures and gaps, although at higher cost than GGA. Prior benchmarking has shown hybrid functionals’ reliability across materials. These contexts motivate a data-efficient optimization strategy that leverages a single, accurate reference (e.g., HSE) to learn U parameters for affordable PBE+U calculations.
The authors propose Bayesian optimization (BO) to select Ueff values that make PBE+U band structures best match a chosen reference (HSE by default). The surrogate objective function f(U) is defined to penalize differences in both band gap and band structure: f(U) = −a1(Eg,HSE − Eg,PBE+U)^2 − a2(ΔBand)^2, where U = [U1, U2, …, Un] are species/orbital-specific Ueff values constrained to [−10, 10] eV. ΔBand is the mean squared error between selected PBE+U and HSE eigenvalues across Nk k-points and No bands; to avoid double counting the gap, both VBM and CBM are shifted to zero in each band structure prior to comparison. Default weights are a1 = 0.25 and a2 = 0.75; sensitivity to these was examined in supplementary information. A Gaussian process (RBF kernel) models f(U), yielding a posterior mean μ(U) and uncertainty σ(U). The upper confidence bound (UCB) acquisition with κ = 1 selects the next U: Un = argmax[μ(U) + κσ(U)]. Iterations continue until a maximum number N is reached (and/or a small change criterion Δf(U) < 0.001), after which the U maximizing f(U) is returned. Computationally, a single HSE calculation generates the reference, while N PBE+U calculations evaluate f(U); BO model updates are negligible compared to DFT costs. VASP with PAW is used throughout; SOC is included for transition metal oxides, Eu chalcogenides, and narrow-gap semiconductors. k-point grids: 8×8×8 for PBE/PBE+U and 6×6×6 for HSE; lattice parameters and cutoffs are in Supplementary Table 1; high-symmetry k-points for band plots are in Supplementary Table 2. For comparison, LR calculations apply small on-site potentials from −0.04 to 0.04 eV in 0.02 eV steps, performing both non-charge-self-consistent and charge-self-consistent calculations to obtain response matrices; a 3×3×3 supercell removes interactions between periodic images. Ueff is then computed from the difference of inverse response matrices.
- Transition metal oxide (NiO): PBE underestimates the gap (0.73 eV) versus HSE (4.26 eV; close to experimental 4.0–4.3 eV). BO with one parameter (U^Ni_d) converges in 13 iterations to 6.8 eV, yielding a corrected CBM and a 3.36 eV gap. LR gives U^Ni_d = 5.4 eV, gap 3.20 eV, with CBM incorrectly located between T and K. Two-parameter BO (U^Ni_d, U^O_p) finds U^Ni_d = 5.9 eV and U^O_p = 9.4 eV after 55 iterations, improving the gap to 3.70 eV and objective from −0.37 to −0.11 eV^2; two-parameter LR yields U^O_p = 8.3 eV and a 3.53 eV gap with correct CBM. Efficiency: On equal cores, LR is ~4.5× slower than BO for one parameter and ~8× slower for two parameters.
- Europium chalcogenide (EuTe): PBE predicts metallic behavior; HSE yields an indirect gap of 1.24 eV (VBM at Γ). BO applying U to Eu 4f selects U = 7.1 eV, opening a 0.71 eV gap and reproducing qualitative HSE features. LR gives Ueff = 5.5 eV and a 0.56 eV gap. Efficiency: LR with a 3×3×3 supercell requires ~9× more CPU time than BO.
- Narrow-gap semiconductor (InAs): PBE shows no gap; HSE gives 0.37 eV. Single-parameter U on either In p or As p does not open a gap. 2D BO finds U^In_sp = −0.5 eV and U^As_p = −7.5 eV, producing a 0.31 eV gap and good agreement with HSE. Negative Ueff values are theoretically permissible when J > U and can be appropriate for delocalized states; LR disallows negative Ueff and yields U^In = 0.7 eV and U^As = 3.3 eV, still producing no gap. BO is thus both more accurate and more efficient for InAs. Transferability of BO-derived U values from bulk to an 11-layer slab of InAs is demonstrated in the Supplementary Discussion.
The results confirm that Bayesian optimization can efficiently determine Hubbard U parameters that reproduce both band gaps and qualitative band-structure features obtained with hybrid functionals. Across diverse classes—strongly correlated oxides, f-electron chalcogenides, and narrow-gap semiconductors—BO-derived U values yield PBE+U band structures comparable to HSE at a fraction of the cost. BO outperforms linear response in multiple respects: it is substantially more computationally efficient (no supercells; one HSE plus a modest number of unit-cell PBE+U runs), more accurate in cases requiring negative Ueff (e.g., InAs), and at least comparable in accuracy otherwise (e.g., NiO, EuTe). The framework’s objective function allows explicit balancing of band-gap matching and band-structure similarity via weights, and the methodology is adaptable to different reference methods beyond HSE. These findings directly address the research goal of obtaining accurate, system-specific U parameters in a principled, data-efficient manner, potentially enabling accurate simulations of larger systems where hybrid functionals are prohibitive. Differences among DFT+U implementations may affect quantitative outcomes, but the BO strategy is general. The demonstrated transferability (e.g., bulk to slab) further indicates practical usability.
The paper introduces a Bayesian optimization framework to determine optimal Hubbard U parameters for DFT+U by maximizing agreement with a hybrid-functional reference in both band gap and band-structure features. Using a Gaussian-process surrogate and UCB acquisition, the method requires one HSE calculation and a limited number of PBE+U evaluations. Demonstrations on NiO, EuTe, and InAs show that PBE+U_BO closely reproduces HSE and is markedly more efficient than linear response; in challenging cases (InAs) it is also more accurate, owing to the ability to explore negative Ueff. The approach can deliver hybrid-level accuracy at semi-local cost, facilitating studies of larger systems such as surfaces and interfaces. Future directions include extending to alternative references (other hybrids or MBPT), exploring multi-objective or multi-fidelity BO, accommodating multiple orbitals per element where code support allows, and further assessing transferability across structures and environments.
- The approach relies on a reference band structure (here HSE), requiring at least one hybrid-functional calculation; for very large unit cells this may still be costly.
- Quantitative results may depend on the specific DFT+U implementation; different codes or formulations can yield different outcomes.
- In VASP, applying U to two orbitals of the same element is not implemented, limiting certain 2D optimizations (e.g., simultaneous f and d on Eu).
- The objective’s weighting parameters (a1, a2) influence optimization; while defaults are provided and sensitivity is discussed in the Supplementary Information, optimal weights may be system-dependent.
Related Publications
Explore these studies to deepen your understanding of the subject.

