Engineering and Technology
A multi-fidelity machine learning approach to high throughput materials screening
C. Fare, P. Fenner, et al.
Dive into a groundbreaking multi-fidelity machine learning approach that revolutionizes high-throughput materials screening by dynamically learning relationships between experimental and computational data. This innovative research by Clyde Fare, Peter Fenner, Matthew Benatan, Alessandro Varsi, and Edward O. Pyzer-Knapp offers a remarkable three-fold reduction in optimization costs.
~3 min • Beginner • English
Introduction
The study addresses the central challenge in materials discovery: navigating vast chemical and materials spaces in which experimentally measurable properties (optoelectronic, structural, catalytic, physico-chemical) depend in complex ways on controllable variables, while experimental synthesis and characterization are costly. Simulations offer cheaper proxies but generally lack sufficient accuracy to replace experiments entirely. The prevalent solution, the computational funnel, applies progressively more accurate and expensive methods (including simulations and staged experiments) to down-select candidates. However, funnels have drawbacks: they require prior knowledge of method accuracies and costs, specification of the total resource budget in advance, and a predetermined allocation of that budget across layers. These issues are exacerbated when machine learning layers are added, since their accuracy is data-dependent and generally unknown for arbitrary inputs. The paper proposes a progressive, budget-aware, multi-fidelity Bayesian optimization framework that dynamically learns relationships among multiple measurement fidelities (computational and experimental) and selects both which candidate and which fidelity to evaluate, without requiring prior knowledge of fidelity accuracies or preallocation of resources. The approach aims to accelerate discovery by optimally trading off information gained from cheaper proxies against targeted high-fidelity measurements.
Literature Review
Prior work demonstrates the utility and limitations of simulation-driven screening and machine learning in materials discovery. Multi-fidelity machine learning has emerged to address data scarcity by jointly modeling multiple fidelities, often via multi-output models (e.g., co-kriging, graph neural networks), improving accuracy over single-fidelity models. For example, Chen et al. showed multi-fidelity graph networks improve band-gap prediction MAE by 22–45% using PBE as a lower-fidelity input. Patra et al. used co-kriging to fuse multiple sources for polymer bandgap prediction, improving performance and generalization over single-fidelity GPs. In optimization, Bayesian optimization (BO) has proven sample-efficient across domains including materials, using acquisition functions such as Expected Improvement (EI) and models such as Gaussian processes (GPs) or Bayesian neural networks. Multi-fidelity BO variants exist: phased strategies that start with low-fidelity exploration and then switch to high fidelity under stopping criteria; epsilon-greedy or LCB-based schemes; and approaches that leverage full low-fidelity datasets a priori to accelerate discovery. However, existing funnels and some MF-BO strategies often require strong prior assumptions about fidelity relations or static staging. This work introduces Targeted Variance Reduction (TVR), which extends arbitrary single-fidelity acquisition functions to a multi-fidelity setting that explicitly accounts for cost and learned cross-fidelity correlations, selecting the next (candidate, fidelity) pair that maximally reduces, per unit cost, the predictive variance at the highest-value target-fidelity location.
Methodology
The methodology combines multi-output Gaussian process modeling with a novel multi-fidelity acquisition policy termed Targeted Variance Reduction (TVR), and evaluates on three mixed simulation–experiment datasets.
Modeling: multi-output Gaussian process (GP)
- Single-fidelity GP: Given training inputs X with targets y and a kernel k(·,·) (Matern 5/2 with ARD), the posterior mean μ and covariance Σ at test inputs X* are μ = K_* K^{-1} y and Σ = K_{**} − K_* K^{-1} K_*^T, where K = k(X, X) (plus observation noise on the diagonal), K_* = k(X*, X), and K_{**} = k(X*, X*); hyperparameters are fit by maximizing the log marginal likelihood.
- Multi-fidelity extension: Represent each data point as a concatenation of a material representation x_i and a fidelity representation f_k. Fidelities are one-hot encoded with the high-fidelity (target) mapped to the zero vector and each lower fidelity to a unit basis vector. This biases learning toward modeling correlations between lower fidelities and the high-fidelity target. The prior covariance becomes a block matrix over (material, fidelity) pairs; on-diagonal blocks capture within-fidelity correlations, off-diagonal blocks capture cross-fidelity correlations. Through ARD on fidelity dimensions, the model learns which lower fidelities correlate with the target: uninformative fidelities receive short length scales in their one-hot dimensions, driving cross-covariances toward zero.
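A minimal sketch of this augmented-input construction is given below. It uses scikit-learn rather than the authors' implementation, and the helper names and kernel configuration are illustrative assumptions, not the paper's code.

```python
# Sketch (assumption: scikit-learn GP, not the authors' implementation).
# Each material representation x is concatenated with a one-hot fidelity
# vector; the target (experimental) fidelity maps to the all-zero vector.
# ARD length scales on the fidelity dimensions let the model learn how
# strongly each proxy correlates with the target.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern, WhiteKernel

def augment(X, fidelity_idx, n_proxies):
    """Append a one-hot fidelity encoding; fidelity_idx 0 = target -> zeros."""
    F = np.zeros((len(X), n_proxies))
    if fidelity_idx > 0:
        F[:, fidelity_idx - 1] = 1.0
    return np.hstack([np.asarray(X, dtype=float), F])

def fit_mf_gp(blocks, n_proxies):
    """blocks: list of (X, y, fidelity_idx) tuples, one per fidelity."""
    Xa = np.vstack([augment(X, k, n_proxies) for X, _, k in blocks])
    ya = np.concatenate([np.asarray(y, dtype=float) for _, y, _ in blocks])
    d = Xa.shape[1]
    kernel = ConstantKernel(1.0) * Matern(length_scale=np.ones(d), nu=2.5) \
             + WhiteKernel(noise_level=1e-3)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gp.fit(Xa, ya)
```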
Acquisition policy: Multi-Fidelity Targeted Variance Reduction (MF-TVR)
- Baseline acquisition on target fidelity: Compute a standard single-fidelity acquisition function over the target fidelity; here Expected Improvement (EI) is used to identify the target-fidelity location with highest acquisition value (most promising candidate at the target fidelity).
- Targeted variance reduction: For each potential (candidate, fidelity) query, estimate the expected reduction in predictive variance at the identified target-fidelity maximizer if that (candidate, fidelity) were observed, using the current multi-output GP posterior. Score each candidate–fidelity pair by variance reduction per unit cost of that fidelity.
- Selection: Choose the (candidate, fidelity) with maximal variance reduction per cost. Evaluate, augment the dataset, retrain/update the GP, and iterate until budget is exhausted. This integrates exploration at cheaper fidelities and exploitation at the target fidelity continuously, without staged phases.
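A sketch of this selection rule follows, under the same assumptions as the modeling sketch above (it reuses the hypothetical augment helper, uses EI as the base acquisition, and derives the variance-reduction term from standard Gaussian conditioning; all names are illustrative).

```python
# Sketch of MF-TVR selection (illustrative, not the authors' code).
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sd, best_y):
    """EI for maximization of the target-fidelity property."""
    sd = np.maximum(sd, 1e-12)
    z = (mu - best_y) / sd
    return (mu - best_y) * norm.cdf(z) + sd * norm.pdf(z)

def tvr_select(gp, X_cand, costs, n_proxies, best_y):
    """Return (candidate index, fidelity index); costs[0] is the target-fidelity cost."""
    # 1) EI at the target fidelity identifies the most promising location.
    Xt = augment(X_cand, 0, n_proxies)
    mu, sd = gp.predict(Xt, return_std=True)
    x_star = Xt[np.argmax(expected_improvement(mu, sd, best_y))]
    # 2) Score every (candidate, fidelity) pair by the expected variance
    #    reduction at x_star per unit cost if that pair were observed.
    best = (None, None, -np.inf)
    for k, cost in enumerate(costs):
        Xk = augment(X_cand, k, n_proxies)
        for i, x in enumerate(Xk):
            _, C = gp.predict(np.vstack([x_star, x]), return_cov=True)
            reduction = C[0, 1] ** 2 / (C[1, 1] + 1e-9)  # Gaussian conditioning
            if reduction / cost > best[2]:
                best = (i, k, reduction / cost)
    return best[0], best[1]
```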
Datasets and representations
- Harvard Organic Photovoltaic (HOPV): 350 molecules; target is experimental power conversion efficiency; lower-fidelity computational analogs via Scharber model built from DFT energy levels using BP86, PBE0, B3LYP, M062X with def2-SVP. Costs: BP86=1.0, PBE0=1.25, B3LYP=1.75, M062X=2.0, Experiment=20.0. Representations: SOAP descriptors reduced to 20D by PCA.
- Alexandria: 946 structures; target experimental polarizability; lower fidelities are Hartree–Fock/6-31G** and B3LYP/aug-cc-PVTZ. Costs: HF=1.0, B3LYP=2.0, Experiment=6.0. Representations: MACCS keys reduced to 20D by PCA.
- Chen (Alchemy subset): 1766 structures; target experimental bandgap (maximize insulators); lower fidelity is PBE (PAW, 520 eV). Costs: PBE=0.5, Experiment=10. Representations: SOAP reduced to 20D by PCA.
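The representation step shared by all three datasets (descriptors reduced to 20 dimensions by PCA) might look like the sketch below; the SOAP or MACCS descriptors themselves are assumed to be precomputed with an external tool, and the function name is illustrative.

```python
# Sketch: reduce a precomputed descriptor matrix (SOAP or MACCS keys)
# to the 20-dimensional representation used by the GP.
import numpy as np
from sklearn.decomposition import PCA

def reduce_descriptors(D, n_components=20):
    """D: (n_materials, n_features) descriptor matrix -> (n_materials, 20)."""
    return PCA(n_components=n_components).fit_transform(np.asarray(D, dtype=float))
```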
Synthetic study
- Target function: Liu's 1D function f(x) = 1.5(x + 2.5)((6x − 2)^2 sin(12x − 4) + 10).
- Lower-fidelity proxies: Generated with controlled Pearson correlation to the target via a principled method, spanning a grid of correlations and relative cost discounts. Performance compared between TVR-EI and optimally provisioned computational funnels in terms of total cost to reach 99th-percentile solutions.
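One simple way to construct such a proxy with a prescribed Pearson correlation to the target on a fixed grid is sketched below; this is not the paper's exact construction, and the target uses the Forrester-style form given above.

```python
# Sketch: build a lower-fidelity proxy with correlation ~rho to the target
# on a fixed grid (one possible construction, not the paper's exact method).
import numpy as np

def target(x):
    return 1.5 * (x + 2.5) * ((6 * x - 2) ** 2 * np.sin(12 * x - 4) + 10)

def make_proxy(x_grid, rho, rng=None):
    rng = rng or np.random.default_rng(0)
    z_f = target(x_grid)
    z_f = (z_f - z_f.mean()) / z_f.std()
    g = rng.standard_normal(len(x_grid))
    g -= g.mean()
    g -= (g @ z_f) / (z_f @ z_f) * z_f   # orthogonalize against the target
    z_g = g / g.std()
    return rho * z_f + np.sqrt(1.0 - rho ** 2) * z_g
```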
Baselines and evaluation
- Baselines: (i) composite computational funnels (ideally provisioned; best case across budgets), (ii) single-fidelity BO (EI on the target fidelity only), and (iii) random search (shown in the figures for reference). Performance metric: normalized regret versus cumulative cost, where zero regret indicates discovery of the global best at the target fidelity. Runs are repeated (n = 15) with different seeds; medians and interquartile ranges are reported. Budget sensitivity is additionally analyzed for over- and under-provisioned funnels (2× and 0.5× the ideal budget).
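A sketch of the reported evaluation protocol, i.e. normalized regret traces aggregated by median and interquartile range over repeated runs, is below; the exact normalization used in the paper may differ.

```python
# Sketch: normalized regret and its aggregation over repeated runs.
import numpy as np

def normalized_regret(best_so_far, y_all):
    """0 means the global best at the target fidelity has been found."""
    y_best, y_worst = y_all.max(), y_all.min()
    return (y_best - best_so_far) / (y_best - y_worst)

def summarize(runs):
    """runs: (n_runs, n_steps) array of regret traces -> median, q25, q75."""
    med = np.median(runs, axis=0)
    q25, q75 = np.percentile(runs, [25, 75], axis=0)
    return med, q25, q75
```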
Key Findings
- Overall efficiency: The proposed TVR-EI method reduces optimization cost by approximately 3× on average versus commonly used approaches. Table 1 reports expense multipliers (budget required relative to TVR-EI at zero regret): the composite funnel averages 2.7× and single-fidelity BO averages 3.0× across the three datasets (Alexandria, HOPV-15, Chen). The authors summarize an average efficiency gain of 2.8× and an average normalized relative regret gain of 20% over competitors.
- Synthetic study (cost–correlation grid): TVR-EI outperforms optimally provisioned funnels when proxies are relatively expensive or relatively accurate; funnels can perform better when proxies are very cheap and low-accuracy. Magnitude differences are asymmetric: worst TVR-EI underperformance corresponds to funnels needing ~20 fewer target samples, whereas best TVR-EI cases correspond to ~100 fewer target samples than funnels.
- Dataset-specific outcomes:
• Chen (experimental bandgap with PBE proxy): Well-correlated, low-cost proxy; both the funnel and TVR-EI reach 99th-percentile insulators with ~10× cost reduction vs random search and ~5× vs single-fidelity BO. TVR-EI performance is comparable to the funnel; TVR-EI performs most of its evaluations at the proxy fidelity while concentrating its budget on a small number of targeted high-fidelity measurements (Table 2: TVR-EI high-fidelity budget share 87.7%, high-fidelity sample share 26.2%).
• HOPV-15 (PCE with multiple DFT proxies): Proxies are relatively expensive and mostly low-correlation with the target. Single-fidelity BO and TVR-EI rapidly find optima, both far outperforming funnels; TVR-EI learns that the proxies are uninformative and allocates most of the budget to experiment. Table 3 (HOPV) shows average budget shares: Experiment 95.3% (avg 55 samples), M06-2X 0.2% (1.2 samples), B3LYP 1.2% (7.9 samples), PBE0 2.5% (19.4 samples), BP86 0.7% (16.3 samples). Correlation analysis shows PBE0 and B3LYP are highly correlated; TVR-EI prefers the cheaper PBE0 over B3LYP.
• Alexandria (polarizability with HF and B3LYP proxies): Intermediate regime with informative proxies and an optimizable target. TVR-EI significantly outperforms both funnel and single-fidelity BO, enhancing BO’s signal with additional proxy information. Table 2 shows TVR-EI spends 24.9% of budget on high-fidelity (6.6% of samples), indicating efficient use of proxies.
- Robustness and provisioning sensitivity: TVR-EI matches or exceeds best-case, ideally provisioned funnels. When funnels are over- or under-provisioned (2× or 0.5× ideal budget), their performance degrades markedly, accentuating TVR-EI’s advantage. TVR-EI automatically avoids uninformative proxies and exploits cheaper correlated proxies, without pre-specified staging or budget splits.
Discussion
The research question is how to more effectively combine experimental and computational (proxy) measurements in materials discovery without relying on rigid, pre-planned funnels that require prior knowledge of fidelity accuracies and fixed budgeting. The findings show that a multi-output GP coupled with Targeted Variance Reduction can dynamically learn cross-fidelity relationships and allocate budget where it maximally reduces uncertainty at promising target-fidelity locations per unit cost. This addresses mis-ordering risks and non-informative steps inherent in funnels by learning correlations on the fly.
Across synthetic and real tasks, TVR-EI adapts to varying cost–correlation regimes: it leverages informative proxies to reduce high-fidelity evaluations, and in regimes with poor proxies or easy targets, it focuses budget on the target fidelity, outperforming funnels and single-fidelity BO. The substantial cost savings (about 3× average) and consistent or superior regret profiles suggest practical gains in high-throughput materials screening. The approach is robust to uninformative fidelities (learning to ignore them) and exploits redundancy (e.g., choosing cheaper among highly correlated proxies), increasing practical efficiency. These properties are particularly valuable when true proxy accuracies are uncertain a priori, when budgets are fluid, or when experimental costs dominate, all common in materials discovery campaigns.
Conclusion
This work introduces a multi-fidelity Bayesian optimization framework, TVR-EI, that integrates experimental and computational fidelities through a multi-output GP and selects candidate–fidelity pairs by maximizing variance reduction at the most promising target-fidelity location per unit cost. Evaluations on three mixed simulation–experiment materials discovery tasks (Alexandria, HOPV-15, Chen) and synthetic functions demonstrate average cost reductions of roughly threefold compared with optimally provisioned computational funnels and single-fidelity BO, with improved or matched regret. TVR-EI dynamically allocates budget, learns cross-fidelity correlations, avoids uninformative proxies, and does not require pre-specified budgets or fidelity accuracy hierarchies.
Potential future work includes: extending to larger-scale settings with scalable GPs or Bayesian neural networks; exploring alternative multi-fidelity acquisition functions and information-theoretic criteria; incorporating richer fidelity representations (e.g., continuous fidelity parameters); integrating domain constraints and multi-objective settings; and deploying in closed-loop experimental platforms to validate real-world throughput gains.
Limitations
- Assumed idealized baselines: Comparisons include a composite, ideally provisioned funnel (best-case across budgets), which is rarely achievable in practice; while this strengthens the comparison baseline, it also means real-world funnel performance may be worse than reported.
- Proxy sensitivity: TVR-EI can exhibit conservative behavior when proxy–target mismatch is high, potentially underutilizing proxies that are somewhat informative. Performance depends on learned correlations; in very low-data regimes, correlation estimation may be uncertain.
- Cost modeling: The method requires cost assignments for each fidelity; mis-specified costs could affect allocation decisions.
- Implementation and scalability: Experiments used IBM’s Bayesian Optimization Accelerator; while GP advances mitigate cubic scaling, very large datasets or high-dimensional representations may necessitate approximate inference or alternative Bayesian models.
- Single-target fidelity assumption: The approach assumes a designated target fidelity; extension to scenarios with multiple competing high-fidelity targets or multi-objective optimization requires further development.