Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts

B. Weng, Z. Song, et al.

This study of oxide perovskite catalysts for the oxygen evolution reaction, by authors including Baicheng Weng and Zhilong Song, reports a simple descriptor that speeds up the design of new catalysts. Four of the five perovskites synthesized on the basis of this descriptor exhibited high intrinsic activities.
Introduction

The study addresses the need for interpretable machine learning in materials discovery, focusing on oxide perovskites (ABO3) as oxygen evolution reaction (OER) catalysts. Traditional ML models are often black boxes and do not yield physically interpretable rules. Symbolic regression (SR) can generate explicit mathematical formulas linking materials parameters to performance. The authors propose that SR can identify a simple, physically meaningful descriptor to accelerate discovery of high-activity perovskite OER catalysts. They synthesize 18 known perovskites to build a consistent, comparable dataset of OER activities for SR analysis, select a descriptor balancing simplicity and accuracy, validate it using independent literature datasets, and then use it to screen, predict, and synthesize new perovskites with improved OER activity.

Literature Review

Prior work has proposed several descriptors for OER activity in perovskite oxides, including reaction free energy and eg occupancy, which have provided valuable trends. However, these often require DFT calculations, which can limit applicability to new materials due to methodological dependencies and uncertainties (e.g., surface spin states for eg). Perovskites are valued for structural flexibility, compositional versatility, and chemical stability, and have been explored for both OER and ORR. There is a longstanding need for simple, physically insightful descriptors that can be applied without heavy computational inputs. The tolerance factor t and octahedral factor μ are established structural parameters used in perovskite studies and ML, motivating their inclusion in SR.

Methodology

Data acquisition: The authors synthesized 18 conventional oxide perovskite catalysts under consistent conditions. For each perovskite, four samples were prepared and each sample was measured three times, yielding 12 measurements per composition. For each measurement, the potential versus the reversible hydrogen electrode (VRHE) was recorded from linear sweep voltammetry (LSV) at five current densities (50 µA cm−2 and 5, 10, 15, and 20 mA cm−2), giving 18 × 4 × 3 × 5 = 1080 data points. Values were normalized by catalyst loading and BET surface area. Seven of these perovskites also appear in prior data (Suntivich et al.) and show the same VRHE trends.
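The size of the dataset follows directly from the measurement grid described above. As a minimal sketch (the variable names are illustrative and not from the paper), the grid can be enumerated to confirm the count:

```python
from itertools import product

N_COMPOSITIONS = 18   # perovskites synthesized
N_SAMPLES = 4         # samples prepared per composition
N_REPEATS = 3         # measurements per sample
# Current densities from the text, in mA cm^-2 (50 uA cm^-2 = 0.05 mA cm^-2).
CURRENT_DENSITIES_MA = (0.05, 5, 10, 15, 20)

# One record per (composition, sample, repeat, current density).
records = list(product(range(N_COMPOSITIONS), range(N_SAMPLES),
                       range(N_REPEATS), CURRENT_DENSITIES_MA))

per_composition = N_SAMPLES * N_REPEATS  # measurements per composition
total_points = len(records)              # VRHE values in the dataset
print(per_composition, total_points)
```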

SR training: The SR target was to link materials parameters to VRHE. Candidate input features included electronic parameters (Nd for B-site TM, electronegativities χA and χB, valence QA) and structural parameters (RA, RB, tolerance factor t, octahedral factor μ). t and μ were defined using ionic radii in the conventional way. Genetic programming-based SR (gplearn) generated and evolved mathematical expressions, evaluated by mean absolute error (MAE) between predicted and experimental VRHE. A grid search over SR hyperparameters produced 8640 mathematical formulas characterized by MAE and complexity; across runs, 43,200,000 candidate expressions were considered. Pareto-optimal formulas balancing accuracy and simplicity were identified.
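For reference, the conventional definitions are t = (RA + RO) / (√2 (RB + RO)) and μ = RB / RO, where RO is the oxide-ion radius. The sketch below computes both, and their ratio, from ionic radii. The radii are Shannon values for SrTiO3 chosen purely for illustration; they are an assumption of this example and are not taken from the paper's dataset.

```python
import math

R_O = 1.40  # Shannon radius of O2- in angstrom (assumed for this example)

def tolerance_factor(r_a, r_b, r_o=R_O):
    """Goldschmidt tolerance factor t = (RA + RO) / (sqrt(2) * (RB + RO))."""
    return (r_a + r_o) / (math.sqrt(2) * (r_b + r_o))

def octahedral_factor(r_b, r_o=R_O):
    """Octahedral factor mu = RB / RO."""
    return r_b / r_o

def mu_over_t(r_a, r_b, r_o=R_O):
    """Descriptor discussed in the text: mu divided by t."""
    return octahedral_factor(r_b, r_o) / tolerance_factor(r_a, r_b, r_o)

# Illustrative radii in angstrom: Sr2+ (12-coordinate) and Ti4+ (6-coordinate).
R_SR, R_TI = 1.44, 0.605

t = tolerance_factor(R_SR, R_TI)
mu = octahedral_factor(R_TI)
descriptor = mu_over_t(R_SR, R_TI)
print(f"t = {t:.3f}, mu = {mu:.3f}, mu/t = {descriptor:.3f}")
```

For mixed A-site or B-site compositions, a composition-weighted average radius is one common choice, but the summary does not say how the authors handled this.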

Descriptor selection and analysis: Nine Pareto-front formulas (A–I) were shortlisted; μ/t was selected as the best compromise between simplicity and predictive accuracy. Linear correlations between μ/t and VRHE were observed at all tested current densities. The descriptor was further validated by reanalyzing literature datasets (including the seminal eg-based volcano study), where μ/t yielded clear linear, monotonic trends with comparable MAE and Pearson correlation.
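The idea of a Pareto front over accuracy and simplicity can be sketched as follows. The candidate formulas, complexity scores, and MAE values here are hypothetical placeholders, not the paper's results; the function only illustrates how non-dominated formulas might be picked out when both complexity and MAE are to be minimized.

```python
def pareto_front(candidates):
    """Return the (name, complexity, mae) tuples that no other candidate dominates.

    A candidate is dominated if another one is no worse in both complexity
    and MAE and strictly better in at least one of them.
    """
    front = []
    for name, cx, mae in candidates:
        dominated = any(
            (cx2 <= cx and mae2 <= mae) and (cx2 < cx or mae2 < mae)
            for _, cx2, mae2 in candidates
        )
        if not dominated:
            front.append((name, cx, mae))
    return sorted(front, key=lambda c: c[1])  # order by increasing complexity

# Hypothetical (formula, complexity, MAE in V) triples for illustration only.
candidates = [
    ("mu/t", 3, 0.030),
    ("mu", 1, 0.060),
    ("mu/t + Q_A", 5, 0.029),
    ("t*chi_B", 3, 0.050),
    ("(mu/t)*N_d - R_A", 7, 0.035),
]

front = pareto_front(candidates)
for name, cx, mae in front:
    print(f"{name:12s} complexity={cx} MAE={mae:.3f}")
```

Choosing a single formula from the front, as the authors did with μ/t, remains a judgment about how much extra complexity a small accuracy gain is worth.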

Experimental methods: Perovskites were synthesized via a modified Pechini method with calcination at 850–1000 °C in air/oxygen. Structural characterization included PXRD and Raman; morphology via TEM/STEM/HRTEM; composition via EDS and ICP-MS; surface area by BET. OER measurements used a glassy carbon rotating disk electrode with Pt counter and Ag/AgCl reference; catalyst inks contained Nafion; measurements were conducted at 5 mV s−1 with 98% iR compensation. Currents were normalized by loading and BET to obtain intrinsic specific and mass activities. Stability was assessed under galvanostatic conditions at 10 mA cm−2.
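The iR compensation and normalization steps can be illustrated with a short sketch. The function names and all numerical inputs below are hypothetical; only the 98% compensation level and the normalization by loading and BET area come from the text, and the authors' exact data-processing pipeline may differ.

```python
def ir_corrected_potential(e_measured_v, current_a, r_solution_ohm,
                           compensation=0.98):
    """Subtract the compensated fraction of the ohmic drop (I * R)."""
    return e_measured_v - compensation * current_a * r_solution_ohm

def mass_activity(current_a, loading_g):
    """Current per gram of catalyst (A g^-1)."""
    return current_a / loading_g

def specific_activity(current_a, loading_g, bet_m2_per_g):
    """Current per unit of BET oxide surface area (A m^-2)."""
    return current_a / (loading_g * bet_m2_per_g)

# Hypothetical inputs: 2 mA at 1.650 V, 10 ohm solution resistance,
# 50 ug of catalyst with a BET area of 5 m^2 g^-1.
e_corr = ir_corrected_potential(1.650, 0.002, 10.0)
i_mass = mass_activity(0.002, 50e-6)
i_spec = specific_activity(0.002, 50e-6, 5.0)
print(e_corr, i_mass, i_spec)
```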

Key Findings
  • A simple structural descriptor μ/t (octahedral factor divided by tolerance factor) shows a strong linear, monotonic correlation with VRHE across multiple current densities, outperforming or matching more complex, DFT-dependent descriptors.
  • Among nine Pareto-front formulas identified by SR, μ/t best balances simplicity and accuracy.
  • Reanalysis of literature data (e.g., Suntivich et al.) using μ/t yields linear trends with MAE ≈ 21.0 meV and Pearson r ≈ 0.928, comparable to the eg volcano correlation (MAE ≈ 20.6 meV, r ≈ 0.923).
  • The approach used a consistent dataset of 1080 normalized VRHE points from 18 perovskites (4 samples × 3 measurements × 5 current densities), enabling robust SR despite small data volume.
  • Guided by μ/t, five new perovskites were synthesized: Cs0.25La0.75Mn0.5Ni0.5O3, Cs0.4La0.6Mn0.25Co0.75O3, Cs0.3La0.7NiO3, SrNi0.75Co0.25O3, and Sr0.25Ba0.75NiO3. Four (Cs0.4La0.6Mn0.25Co0.75O3, Cs0.3La0.7NiO3, SrNi0.75Co0.25O3, Sr0.25Ba0.75NiO3) exhibit among the highest intrinsic OER activities reported for oxide perovskites.
  • Additional validation: Sr0.25Ba0.75NiO3 (the same composition written as Ba0.75Sr0.25NiO3) formed the perovskite structure and showed OER activity exceeding that of BSCF (Ba0.5Sr0.5Co0.8Fe0.2O3), supporting the descriptor’s design utility.
  • Feature importance within SR indicates μ, t, and QA correlate more strongly with activity than RA, Nd, χA, and χB.
  • Perovskites with t > 1 (often deemed structurally less stable) tend to show improved OER activity, aligning with μ/t insights.
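The MAE and Pearson r figures quoted in the findings above are standard statistics of a linear fit. The sketch below shows how they might be computed for a μ/t versus VRHE relationship; the data points are synthetic and chosen only to illustrate the calculation, so the resulting numbers do not reproduce the paper's values.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between measured and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic (mu/t, VRHE in V) pairs, for illustration only.
mu_t = [0.40, 0.42, 0.44, 0.46, 0.48]
v_rhe = [1.52, 1.55, 1.57, 1.61, 1.63]

slope, intercept = linear_fit(mu_t, v_rhe)
predicted = [slope * x + intercept for x in mu_t]
r = pearson_r(mu_t, v_rhe)
mae = mean_absolute_error(v_rhe, predicted)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f} MAE={mae:.4f} V")
```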
Discussion

The μ/t descriptor links OER activity to structural factors of perovskites, suggesting that a reduced octahedral factor and an increased tolerance factor, both indicative of lower structural stability, enhance catalytic activity. This provides an interpretable, DFT-free metric for rapid screening and design. The linear, monotonic relationship simplifies materials optimization relative to typical volcano trends and holds across multiple current densities and independent datasets. The finding that active perovskites often have t > 1 challenges conventional stability criteria, yet such compositions can be synthesized under suitable conditions. Analysis of database trends indicates that most known perovskites (t < 0.95, μ > 0.55) may be less active, consistent with the limited subset of highly active perovskite catalysts reported. Overall, SR with consistent small datasets can yield physically meaningful descriptors that directly guide synthesis and accelerate discovery.
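A screening step based on this reasoning could look like the sketch below, which ranks candidates by μ/t (lower values being associated with higher predicted activity) and flags those with t > 1. The candidate names and their (t, μ) values are hypothetical and serve only to show the ranking logic.

```python
def rank_by_descriptor(candidates):
    """Sort candidates by mu/t, lowest first, and flag those with t > 1.

    `candidates` maps a name to a (t, mu) pair.
    """
    rows = [(name, mu / t, t > 1.0) for name, (t, mu) in candidates.items()]
    return sorted(rows, key=lambda row: row[1])

# Hypothetical (t, mu) values for illustration only.
candidates = {
    "candidate_A": (1.03, 0.40),
    "candidate_B": (0.93, 0.57),  # falls in the t < 0.95, mu > 0.55 region
    "candidate_C": (1.00, 0.43),
    "candidate_D": (0.97, 0.50),
}

ranking = rank_by_descriptor(candidates)
for name, value, t_above_one in ranking:
    print(f"{name}: mu/t = {value:.3f}, t > 1: {t_above_one}")
```

Such a ranking says nothing about whether a composition can actually be synthesized or remains stable during operation, which is the constraint noted under Limitations.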

Conclusion

Symbolic regression identified a simple, interpretable descriptor (μ/t) that quantitatively predicts OER activity of oxide perovskites and enabled rapid discovery of new high-performance catalysts. Five new perovskites were synthesized, four exhibiting top-tier intrinsic activities. The approach demonstrates that small, high-quality datasets combined with SR can uncover useful design rules without relying on DFT, offering a practical path for data-driven materials discovery. Future work should deepen mechanistic understanding of how μ/t mediates activity and stability, expand synthesis of predicted candidates, and explore generalization to related catalytic systems.

Limitations
  • The training dataset, while consistent and carefully measured, is relatively small and limited to 18 conventional perovskites plus new candidates.
  • Although μ/t shows strong correlations, discrepancies exist when aggregating heterogeneous literature data spanning decades and varying conditions.
  • The descriptor suggests optimal activity in regions of structural instability (t > 1); synthesis windows and long-term stability under operating conditions may constrain applicability.
  • A deeper mechanistic explanation linking μ/t, structural stability, and OER pathways is beyond the current scope and remains to be established.