
An active learning framework for the rapid assessment of galvanic corrosion
A. Venkatraman, R. M. Katona, et al.
Aditya Venkatraman and colleagues at Sandia National Laboratories present an active learning framework that calibrates low-cost surrogate models for galvanic corrosion. The framework selects the environmental and geometric parameters that are most informative for predicting cathodic current, reducing computational cost while preserving accuracy.
Introduction
Materials degradation manifests across diverse applications and often involves high-dimensional variations in part geometry, service environment, and material properties. Physics-based models such as finite element methods (FEM) can inform degradation rates and extent, but require boundary conditions and geometric descriptors that are costly to measure across broad environmental and geometric spaces. In galvanic corrosion, dissimilar metals coupled in brine experience currents governed by environmental (temperature, humidity-derived film thickness, salt concentration), chemical, and geometric factors. The galvanic current (anodic or cathodic) provides a useful metric of susceptibility, but its prediction demands experimentally derived polarization curves for each environment and extensive FE simulations. This study addresses the need to accelerate accurate current predictions under varying conditions by developing an active learning (AL) framework that identifies high-value experimental and simulation inputs to efficiently calibrate a machine learning surrogate. The framework trains Gaussian Process and Neural Network surrogates on a FE dataset, fuses their predictions, and employs a staggered acquisition strategy that first selects environmental conditions (temperature, salt concentration) for experimental polarization measurements and then selects geometric variables (water layer thickness, cathode length) for FE simulations, maximizing information gain and reducing training costs while maintaining high predictive fidelity.
Literature Review
Prior work has applied FEM to model metallic corrosion at varying fidelity, spanning transient versus steady-state formulations, 1D/2D geometries, and different couplings between electrodynamics and charge transport. Under steady-state, electroneutral conditions, the Laplace equation has been shown to model galvanic corrosion effectively, especially when informed by experimentally derived cathodic boundary conditions that are parameterized via Butler–Volmer kinetics and depend on temperature and salt concentration. Studies have used rotating disk electrode measurements to obtain polarization curves for galvanic couples and then used FE to compute available cathodic currents. However, acquiring polarization curves for every environmental combination and running extensive FE simulations is costly. Machine learning surrogates trained on simulation data can accelerate predictions by orders of magnitude, but require judicious selection of calibration data. Active learning offers acquisition functions such as predictive uncertainty, expected improvement, and expected information gain to sequentially select informative inputs. Existing AL protocols typically search the entire input domain and identify single points, which is suboptimal when experimental boundary conditions are tied to environmental variables while simulations probe geometric dependencies. There is a need for stratified (staggered) AL that first selects environments for experiments and then geometries for simulations, as well as batch selection methods that balance exploration and exploitation. Additionally, surrogate model performance can vary across domains; fusing multiple model forms (e.g., GP and NN) can improve robustness.
Methodology
Problem setup and data: The target output is the available cathodic current per unit width (Ic/w) for an SS304–AA7050 galvanic couple under varied environmental and geometric parameters. A dataset of 2520 FE simulations (Laplace-based model) calibrated to prior work provides Ic/w across a 4D input space, denoted hypercube C1: temperature 25–45 °C, salt concentration {1, 3, 5} M, water layer (WL) thickness 7 × 10⁻⁶ to 0.05 m, and cathode length 0.01–0.5 m. The dataset was split into 2000 training and 520 test points. An initial batch of 50 training points was selected using a MaxPro (maximum projection) design to achieve space-filling coverage across lower-dimensional projections.
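The space-filling idea behind the initial design can be sketched with the MaxPro criterion, which penalizes pairs of points that nearly coincide in any single coordinate. This is a minimal illustration, assuming continuous inputs scaled to [0, 1]; the function names and the crude random-search selection are hypothetical, and the discrete salt-concentration factor in the study would need the criterion's extension for discrete factors, which is not shown here.

```python
import numpy as np

def maxpro_criterion(X):
    """MaxPro criterion for a design X of shape (n, p) scaled to [0, 1].

    Smaller values indicate better space filling in all projections.
    Assumes no two points share a coordinate exactly (continuous inputs).
    """
    n, p = X.shape
    i, j = np.triu_indices(n, k=1)            # all point pairs with i < j
    sq_diff = (X[i] - X[j]) ** 2              # squared differences, (n_pairs, p)
    inv_prod = 1.0 / np.prod(sq_diff, axis=1)
    return float(np.mean(inv_prod) ** (1.0 / p))

def pick_initial_design(candidates, n_init, n_trials=200, seed=0):
    """Hypothetical random search: keep the subset with the lowest criterion."""
    rng = np.random.default_rng(seed)
    best_idx, best_val = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(len(candidates), size=n_init, replace=False)
        val = maxpro_criterion(candidates[idx])
        if val < best_val:
            best_idx, best_val = idx, val
    return best_idx, best_val

# Toy candidate pool standing in for the scaled FE training inputs.
rng = np.random.default_rng(1)
candidates = rng.random((500, 4))
init_idx, init_val = pick_initial_design(candidates, n_init=50)
```

A dedicated optimizer would normally replace the random search; the sketch only shows what the criterion rewards.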
Surrogate models and fusion: Gaussian Process (GP) regression and Neural Network (NN) ensembles were trained to emulate Ic/w. The GP used an ARD-squared exponential kernel with hyperparameters learned from data, providing probabilistic predictions (mean and variance). The NN surrogate comprised an ensemble of five independently initialized feedforward networks (three hidden layers with 64, 64, and 32 units; ReLU activations), with architecture selected via K-fold cross-validation. Predictions from GP and NN were fused via an unbiased linear combination with input-dependent weights chosen to minimize predictive variance under the constraint that weights sum to one, following information fusion methods.
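The fusion step can be illustrated with the standard minimum-variance unbiased combination of two predictors. This is a sketch under the assumption that the GP and NN errors are independent and that the NN uncertainty is taken from the spread of the ensemble members; the function names are hypothetical and the study's exact fusion rule may differ in detail.

```python
import numpy as np

def ensemble_mean_var(member_preds):
    """Mean and sample variance across ensemble members.

    member_preds has shape (n_members, n_points).
    """
    member_preds = np.asarray(member_preds, dtype=float)
    return member_preds.mean(axis=0), member_preds.var(axis=0, ddof=1)

def fuse(mu_gp, var_gp, mu_nn, var_nn):
    """Pointwise weights that sum to one and minimize the fused variance.

    Assumes independent errors and strictly positive variances.
    """
    mu_gp, var_gp, mu_nn, var_nn = (
        np.asarray(a, dtype=float) for a in (mu_gp, var_gp, mu_nn, var_nn)
    )
    w_gp = var_nn / (var_gp + var_nn)     # the less uncertain model gets more weight
    w_nn = 1.0 - w_gp
    mu = w_gp * mu_gp + w_nn * mu_nn
    var = w_gp**2 * var_gp + w_nn**2 * var_nn
    return mu, var, w_gp

# Two query points with made-up means and variances.
mu, var, w_gp = fuse([1.0, 2.0], [0.1, 0.4], [1.2, 1.0], [0.3, 0.1])
```

Because the weights depend on the local variances, the fused prediction leans on whichever surrogate is more confident at each input, and the fused variance never exceeds the smaller of the two.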
Active learning protocol: Starting from the initial 50-point set, a GP model was trained and used to compute acquisition values for the remaining candidate training points. Two acquisition functions were considered: predictive standard deviation (PS) and expected information gain (EI). Iteratively, the top candidate (or batches via weighted K-means clustering for later steps) was added and the GP retrained until cross-validation error saturated. Ten-fold cross-validation (10-fold CV) errors were used to assess learning progression.
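A pool-based loop with the predictive standard deviation (PS) acquisition might look like the following. It is a simplified sketch: the GP uses an ARD squared-exponential kernel with fixed, hand-picked length scales and unit signal variance (the study learns hyperparameters from data), the target is a toy function rather than FE output, and all names are hypothetical.

```python
import numpy as np

def ard_se_kernel(A, B, lengthscales):
    """ARD squared-exponential kernel with unit signal variance."""
    A = A / lengthscales
    B = B / lengthscales
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(d2, 0.0))

def gp_posterior(X_train, y_train, X_query, lengthscales, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = ard_se_kernel(X_train, X_train, lengthscales)
    K += noise_var * np.eye(len(X_train))
    Ks = ard_se_kernel(X_train, X_query, lengthscales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    var = 1.0 - (v**2).sum(axis=0)
    return Ks.T @ alpha, np.sqrt(np.maximum(var, 0.0))

def active_learning_ps(X_pool, y_pool, init_idx, n_add, lengthscales):
    """Greedily add the pool point with the largest predictive std."""
    labeled = list(init_idx)
    for _ in range(n_add):
        remaining = np.setdiff1d(np.arange(len(X_pool)), labeled)
        _, std = gp_posterior(X_pool[labeled], y_pool[labeled],
                              X_pool[remaining], lengthscales)
        labeled.append(int(remaining[np.argmax(std)]))
    return labeled

# Toy pool standing in for the candidate FE simulations.
rng = np.random.default_rng(0)
X_pool = rng.random((200, 2))
y_pool = np.sin(3.0 * X_pool[:, 0]) + X_pool[:, 1] ** 2
ls = np.array([0.3, 0.3])
labeled = active_learning_ps(X_pool, y_pool, range(5), n_add=20, lengthscales=ls)
```

In practice the loop would stop when the cross-validation error saturates rather than after a fixed number of additions.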
Staggered AL for domain expansion: For out-of-distribution improvement and domain enlargement, an expanded domain C2 was defined: temperature 10–60 °C, salt concentration 0.001–5 M, WL thickness 7 × 10⁻⁶ to 0.1 m, and cathode length 0.01–0.75 m. A staggered GP that marginalizes the influence of the geometric parameters (WL thickness, cathode length) was constructed to produce an acquisition function depending only on temperature and salt concentration, enabling selection of environments for experimental polarization curves. Weighted K-means clustering on acquisition-weighted candidates identified four temperature–concentration cluster centers for experiments. With those environments fixed, the GP acquisition was recomputed over the geometric parameters, and for each environment ten geometric configurations were selected via weighted K-means, yielding a batch of 40 new FE simulations (4 environments × 10 geometries) per AL iteration.
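The two-stage batch selection can be sketched as below. Several simplifications are assumed: inputs are scaled to [0, 1], the acquisition values are a made-up toy array rather than GP output, and marginalization over geometry is approximated by a plain average over geometric candidates instead of a dedicated staggered GP. The `weighted_kmeans` helper is a hypothetical, minimal Lloyd-style implementation.

```python
import numpy as np

def weighted_kmeans(X, weights, k, n_iter=50, seed=0):
    """Lloyd iterations where each center is the weighted mean of its cluster."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(X), size=k, replace=False, p=weights / weights.sum())
    centers = X[start].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():                      # empty clusters keep their center
                centers[j] = np.average(X[mask], axis=0, weights=weights[mask])
    return centers

# Candidate grids: environment = (temperature, salt), geometry = (WL, cathode length).
g_env = np.linspace(0.0, 1.0, 15)
env = np.array([(t, c) for t in g_env for c in g_env])
g_geo = np.linspace(0.0, 1.0, 12)
geo = np.array([(w, l) for w in g_geo for l in g_geo])

# Toy positive acquisition values with shape (n_env, n_geo).
acq = 0.1 + np.linalg.norm(env - 0.5, axis=1)[:, None] + geo[:, 0][None, :]

# Stage 1: average out geometry, then pick four environments for experiments.
env_weights = acq.mean(axis=1)
env_centers = weighted_kmeans(env, env_weights, k=4)

# Stage 2: for each environment, pick ten geometries for FE simulations.
batch = []
for center in env_centers:
    nearest = np.argmin(((env - center) ** 2).sum(axis=1))
    geo_centers = weighted_kmeans(geo, acq[nearest], k=10)
    batch.extend(np.concatenate([center, g]) for g in geo_centers)
batch = np.array(batch)                         # 4 x 10 = 40 rows of (T, c, WL, L)
```

Weighting the clustering by acquisition pulls the centers toward high-value regions, while the clustering itself keeps the selected points spread apart.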
FE simulations: COMSOL Multiphysics v6.1 was used to solve steady-state Laplace equation-based models in a simplified 2D geometry of an SS304L cathode galvanically coupled to AA7050-T7451. The anode length was fixed at 0.01 m; cathode length and WL thickness were varied per AL. The WL lower limit was set by the natural convection boundary layer. Solution conductivities were obtained from OLI Studio Analyzer 9.5. Anodic kinetics (AA7050) from Liu et al. were used as anodic boundary conditions and assumed constant across chloride concentrations and temperatures; cathodic kinetics depended on temperature and salt concentration via experimentally measured polarization curves. Steady-state conditions were assumed, and polarization scans were not dependent on pH. Charge conservation was enforced to ensure equal and opposite anodic and cathodic currents.
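For reference, the steady-state Laplace formulation described above can be written compactly. The notation is an assumption introduced here, not taken from the study: $\phi$ is the electrolyte potential, $\kappa$ the solution conductivity, $\mathbf{n}$ the boundary normal, $f_a$ and $f_c$ the anodic and cathodic polarization relations, and $\Gamma_a$, $\Gamma_c$ the anode and cathode surfaces. Sign conventions depend on how the normal and the electrode potential are defined.

```latex
\begin{aligned}
&\nabla^2 \phi = 0 \quad \text{in the electrolyte}, \qquad \mathbf{i} = -\kappa \nabla \phi, \\
&\mathbf{i}\cdot\mathbf{n} = f_c(\phi;\, T, c_{\mathrm{salt}}) \ \text{on } \Gamma_c, \qquad
 \mathbf{i}\cdot\mathbf{n} = f_a(\phi) \ \text{on } \Gamma_a, \qquad
 \mathbf{i}\cdot\mathbf{n} = 0 \ \text{elsewhere}, \\
&\int_{\Gamma_a} \mathbf{i}\cdot\mathbf{n}\, \mathrm{d}s + \int_{\Gamma_c} \mathbf{i}\cdot\mathbf{n}\, \mathrm{d}s = 0, \qquad
 \frac{I_c}{w} = \left| \int_{\Gamma_c} \mathbf{i}\cdot\mathbf{n}\, \mathrm{d}s \right| .
\end{aligned}
```

The integral constraint is the charge-conservation statement: with insulating boundaries elsewhere, the total anodic and cathodic currents are equal and opposite.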
Experiments for boundary conditions: Polarization curves were measured in a three-electrode cell with SS304L (0.196 cm²) as the working electrode, an Ag/AgCl reference electrode, and a Pt-coated Nb mesh counter electrode. Tests used ACS-grade solutions at controlled temperatures (±0.1 °C) in a 150 mL water-jacketed cell, sparged with lab air to maintain O2 saturation. After 1 h of OCP equilibration, polarization scans were performed (scan rate 0.167) from OCP to −1.4 V vs SCE. These curves informed the FE cathodic boundary conditions at the temperature–concentration combinations selected via AL.
Evaluation: Model performance was assessed via 10-fold CV on the training data and mean absolute error (MAE) on the held-out 520-point test set. Out-of-distribution performance was evaluated on an extrapolation dataset (2885 points) with environments at (25 °C, 0.6 M), (35 °C, 1 M), (35 °C, 3 M), and (35 °C, 5.3 M), and with WL thickness and cathode length extending beyond the C1 limits. After one staggered AL iteration adding 40 new points, the models were retrained and re-evaluated.
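The two evaluation metrics can be sketched in a few lines. The example below is illustrative only: a least-squares linear model stands in for the GP/NN surrogates, the data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def kfold_cv_mae(X, y, fit, predict, k=10, seed=0):
    """Mean validation MAE over k folds; each point is held out exactly once."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        errors.append(mean_absolute_error(y[val], predict(model, X[val])))
    return float(np.mean(errors))

# Stand-in surrogate: linear least squares with an intercept.
def fit_linear(X, y):
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

rng = np.random.default_rng(0)
X = rng.random((100, 2))
y = 2.0 * X[:, 0] - X[:, 1] + 0.5           # noiseless synthetic target
cv_mae = kfold_cv_mae(X, y, fit_linear, predict_linear, k=10)
```

Tracking `cv_mae` after each active learning addition gives the saturation curve used to decide when to stop adding points.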
Key Findings
- Active learning substantially reduced training costs: both PS and EI acquisition functions drove 10-fold CV error to the same saturation value (~5 mA/m) as using all 2000 training points, but with far fewer points. PS required ~450 added points to saturate, EI ~650; both achieved at least 50% reduction in computational cost versus using all simulations. Random selection required more than twice as many points, on average, to reach comparable accuracy.
- Test-set accuracy: the fused surrogate model achieved mean absolute errors of 3.7 mA/m (PS-designed training set) and 3.2 mA/m (EI-designed training set).
- Extrapolation performance (2885-point dataset beyond original geometry ranges): before AL, MAE was 0.019 A/m (PS) and 0.017 A/m (EI). After a single staggered AL iteration adding only 40 new observations (4 environments × 10 geometries), MAE improved to 0.014 A/m (PS) and 0.009 A/m (EI), with EI yielding nearly 50% error reduction (0.017 to 0.009 A/m).
- The staggered AL protocol effectively prioritized points outside the original domain C1, focusing on regions with highest acquisition values, and balanced exploration/exploitation via weighted K-means batch selection.
- Fusing GP and NN predictions improved robustness across the domain. Once trained, the surrogate provides current predictions under new conditions roughly four orders of magnitude faster than running full FE simulations with new experiments.
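The headline percentages follow from simple arithmetic on the reported numbers. The check below assumes that the initial 50 MaxPro points count toward the simulation cost and that the "added points" figures exclude them; the inputs are the approximate values quoted above.

```python
n_total = 2000                      # full training pool
n_init = 50                         # initial MaxPro batch
added = {"PS": 450, "EI": 650}      # approximate points added until saturation

# Fraction of FE simulations avoided relative to using the full pool.
cost_reduction = {k: 1 - (n_init + v) / n_total for k, v in added.items()}

# Extrapolation MAE in A/m before and after one staggered AL iteration.
mae_before = {"PS": 0.019, "EI": 0.017}
mae_after = {"PS": 0.014, "EI": 0.009}
error_reduction = {
    k: (mae_before[k] - mae_after[k]) / mae_before[k] for k in mae_before
}

for k in added:
    print(f"{k}: cost reduction {cost_reduction[k]:.0%}, "
          f"extrapolation MAE reduction {error_reduction[k]:.0%}")
```

Under these assumptions the cost reductions are about 75% (PS) and 65% (EI), both consistent with the stated "at least 50%", and the EI extrapolation error falls by about 47%, that is, from 17 mA/m to 9 mA/m.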
Discussion
The study demonstrates that a stratified, two-step active learning framework can efficiently guide the acquisition of both experimental polarization curves and FE simulations to build accurate surrogates for galvanic current prediction across high-dimensional environmental and geometric spaces. By marginalizing geometric effects to select environments and then optimizing geometric inputs conditioned on those environments, the framework addresses the practical coupling between experimental boundary conditions and simulation parameters. The results show that both PS and EI acquisitions rapidly reduce error within the original domain, while EI is more effective for domain expansion and robust out-of-distribution improvements, consistent with its objective of maximizing expected information gain in a domain-averaged quantity. Fusing GP and NN predictions further enhances robustness. Collectively, these advances directly address the challenge of costly boundary condition measurements and simulation sweeps, providing a principled way to expand into new regimes with minimal additional data while maintaining accuracy. The approach is broadly applicable to materials degradation problems where experimental and simulation responses are controlled by different factor sets and where multi-physics complexity leads to nonuniform surrogate performance.
Conclusion
This work introduces a staggered active learning protocol that efficiently calibrates a low-cost surrogate for galvanic current prediction as a function of environmental and geometric parameters. Combining GP and NN surrogates with information-theoretic acquisition and batch selection, the framework reaches the target accuracy within the original domain with less than half of the training simulations and, with only 40 additional observations, reduces out-of-distribution MAE by nearly 50% in the best case (EI, 0.017 to 0.009 A/m). The EI acquisition function provided the most robust gains for domain expansion. Once established, the surrogate accelerates current prediction by roughly four orders of magnitude relative to full FE-experiment workflows. The protocol generalizes to other corrosion modes and high-dimensional materials degradation problems that require experimental boundary conditions and simulation-driven responses. Future work could involve multiple staggered AL iterations for further error reduction, incorporating additional variables (e.g., alloy composition, couple pair, corrosion potentials), relaxing modeling assumptions (e.g., transient effects, pH dependence), and exploring adaptive, multi-fidelity strategies for even greater efficiency.
Limitations
- Extrapolative accuracy degrades outside the training domain until targeted data are added; only one staggered AL iteration was performed in this study.
- Surrogate model forms may not perform optimally across the entire domain; performance can remain nonuniform despite fusion.
- FE modeling assumptions may limit generalizability: steady-state, electroneutral Laplace formulation in 2D geometry; anodic kinetics assumed constant across chloride concentrations and temperatures; polarization scans assumed independent of pH; WL lower bound set by natural convection boundary layer.
- The study focused on four input variables; other influential factors (e.g., alloy composition variations, additional environmental parameters) were not included.
- Batch selection relied on weighted K-means heuristics; alternative batch acquisition strategies could further optimize exploration–exploitation trade-offs.