Data-driven pitting evolution prediction for corrosion-resistant alloys by time-series analysis

X. Jiang, Y. Yan, et al.

This summary covers work by Xue Jiang, Yu Yan, and Yanjing Su, who use a data-driven approach based on Long Short-Term Memory (LSTM) neural networks to predict the free corrosion potential of cobalt-based alloys and duplex stainless steels. In the study, this method forecasts corrosion behavior over time more accurately than traditional machine learning techniques.

Introduction

Corrosion, and pitting corrosion in particular, is a dynamic and complex process influenced by both material properties (alloy composition, microstructure, hardness) and environmental factors (corrosive solution, temperature, time). Understanding and predicting pitting evolution is crucial for designing durable corrosion-resistant alloys, yet traditional experimental and theoretical approaches struggle to capture the interplay of these factors over time. This paper addresses that limitation with a data-driven approach based on time-series analysis. Machine learning (ML) can handle high-dimensional datasets and complex relationships and has shown promise in materials science. While ML has been applied to predict uniform corrosion rates, its use for predicting the dynamic evolution of pitting corrosion remains less explored. This study exploits the inherent time-series nature of corrosion propagation, using a Long Short-Term Memory (LSTM) neural network to capture temporal dependencies in corrosion data and predict future pitting behavior. Traditional methods such as Support Vector Regression (SVR), k-Nearest Neighbor Regression (KNR), Gradient Boosting Regression (GBR), Random Forest Regression (RFR), and AdaBoost Regression (AdaBR) assume independent and identically distributed data and therefore cannot efficiently model time-dependent processes. The research uses cobalt-based alloys and duplex stainless steels as case studies and compares the predictive performance of LSTM with that of the traditional ML algorithms.

Literature Review

Numerous studies have investigated pitting corrosion mechanisms, often describing pitting either as discrete events at the macroscopic scale or as continuous processes at the microscopic scale. Previous research has applied artificial neural networks, random forests, and other machine learning algorithms to predict uniform corrosion and design new corrosion-resistant materials, and has handled the multi-factor coupling in corrosion problems successfully. However, these traditional methods generally assume independent and identically distributed data points, so they do not account for the temporal dependencies that are crucial for modeling dynamic processes like pitting corrosion. Time-series methods such as LSTM networks are specifically designed to handle sequential data and learn temporal dependencies, which makes them suitable for the present application. Existing studies that use random forest methods to model corrosion rates often focus on specific aspects such as inhibitor effects. This study differs in using time-series analysis to predict pitting evolution as a way to understand and model long-term corrosion behavior.

Methodology

The study employed four alloys: Stellite 6, Stellite 12, Stellite 706, and Zeron 100. These alloys were subjected to long-term immersion tests (150 days) in 3.5 wt.% NaCl solution at 18 °C, and the free corrosion potential (*E*corr) was measured daily. The first 30 days of data were excluded because pitting initiation was highly variable during this period, leaving 120 days of data for training. Traditional machine learning algorithms (SVR, KNR, GBR, RFR, and AdaBR) were applied first, treating immersion time as an independent variable. Hyperparameters were tuned using grid search and five-fold cross-validation, minimizing the mean squared error (MSE). The best-performing model, GBR, was evaluated using MSE and mean relative error (MRE). Subsequently, the data were transformed into time-series sequences using a sliding window (lookback). The LSTM model was implemented using TensorFlow and Keras. Hyperparameters such as the lookback window size (3, 5, 8, or 15 days), the number of hidden layers and units, the batch size, and the dropout rate were optimized to prevent overfitting. To assess the importance of different features (alloy composition, hardness, immersion time) in predicting *E*corr, permutation feature importance and SHAP (SHapley Additive exPlanations) values were computed using the GBR model. Finally, an additional 70-day immersion test was conducted to validate the predictive capabilities of both the GBR and LSTM models on unseen data. Prediction accuracy was assessed by comparing the predicted *E*corr values with the measured values, using MSE and visual inspection of the trends.
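The sliding-window (lookback) transformation described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function name `make_windows` and the sample *E*corr values are assumptions made up for the example. Each input is a run of `lookback` consecutive readings, and the target is the reading that immediately follows it.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], lookback: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a 1-D series into (window, next value) training pairs."""
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    if len(series) <= lookback:
        raise ValueError("series must be longer than lookback")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - lookback):
        # Window covers days start .. start + lookback - 1.
        inputs.append(list(series[start:start + lookback]))
        # Target is the reading on the day right after the window.
        targets.append(series[start + lookback])
    return inputs, targets


# Hypothetical daily E_corr readings in mV (made-up values, not from the study).
ecorr_mv = [-210.0, -205.0, -207.0, -201.0, -198.0, -199.0, -195.0, -192.0]
X, y = make_windows(ecorr_mv, lookback=5)
```

With eight readings and a lookback of 5, this yields three training pairs. A recurrent model such as an LSTM would typically consume these windows after reshaping them to (samples, time steps, features).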

Key Findings

The traditional machine learning models (especially GBR) fit the data well during the training phase (120 days) but failed to accurately predict the evolution of *E*corr over the subsequent 70 days of unseen test data. In contrast, the LSTM model showed superior predictive performance and captured the trends of *E*corr over time. The optimal lookback window for the LSTM model was found to be 5 days. Feature importance analysis revealed that the elements Fe, C, and Si, along with immersion time, were the most significant factors influencing *E*corr predictions for both GBR and LSTM, though the order of importance varied slightly. The permutation feature importance and SHAP analyses consistently showed that immersion time strongly influenced the predictions, indicating the crucial role of temporal dependencies in pitting evolution. The dataset used in the study has been shared publicly to support data reuse and transparency.
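Permutation feature importance, one of the two interpretation methods mentioned above, can be illustrated with a small sketch. It shuffles one feature column at a time and records how much the model's MSE increases. Everything here is an assumption for illustration: `permutation_importance` is a hand-written helper (not the scikit-learn function of the same name), and `toy_predict` is a stand-in model rather than the study's GBR.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[List[Row]], List[float]],
    X: List[Row],
    y: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, predict(X))
    importances: List[float] = []
    for col in range(len(X[0])):
        increases: List[float] = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            X_perm = [
                row[:col] + [value] + row[col + 1:]
                for row, value in zip(X, shuffled)
            ]
            increases.append(mse(y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(rows: List[Row]) -> List[float]:
    """Hypothetical model that uses feature 0 only and ignores feature 1."""
    return [2.0 * row[0] for row in rows]


X_toy = [[float(i), float(i % 3)] for i in range(12)]
y_toy = toy_predict(X_toy)
scores = permutation_importance(toy_predict, X_toy, y_toy)
```

In this toy case, shuffling feature 0 degrades the predictions while shuffling feature 1 changes nothing, so only the first score is positive. A feature such as immersion time would rank highly under the same logic if the model relied on it.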

Discussion

The superior performance of the LSTM model underscores the importance of incorporating time-series dependencies in predicting pitting corrosion evolution. Traditional ML methods, assuming data independence, are inadequate for capturing the dynamic nature of this process. The LSTM's ability to model temporal dependencies makes it particularly effective in this context. The identification of Fe, C, and Si as significant features highlights the role of alloy composition in influencing corrosion resistance. The consistent importance of immersion time emphasizes the cumulative effects of corrosion over time. The findings of this study offer a valuable new approach for predicting long-term corrosion behavior, which is critical for material design and service life estimations. The approach can be expanded to other materials and corrosion environments.
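One common way to use a one-step time-series model for longer horizons is recursive forecasting, where each prediction is appended to the window and fed back as input. The summary does not state whether the study forecast this way, so the sketch below is only an assumption-laden illustration: `recursive_forecast` and `mean_of_window` are hypothetical names, and the window-mean predictor merely stands in for a trained model.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    predict_next: Callable[[List[float]], float],
    lookback: int,
    horizon: int,
) -> List[float]:
    """Roll a one-step predictor forward for `horizon` steps."""
    if len(history) < lookback:
        raise ValueError("history must contain at least `lookback` values")
    window = list(history[-lookback:])
    forecast: List[float] = []
    for _ in range(horizon):
        next_value = predict_next(window)
        forecast.append(next_value)
        # Drop the oldest value and append the new prediction.
        window = window[1:] + [next_value]
    return forecast


def mean_of_window(window: List[float]) -> float:
    """Hypothetical stand-in for a trained one-step model."""
    return sum(window) / len(window)


# Made-up E_corr history in mV, not data from the study.
history = [-200.0, -198.0, -196.0, -194.0, -192.0]
preds = recursive_forecast(history, mean_of_window, lookback=5, horizon=3)
```

Because each step consumes earlier predictions, errors can accumulate over long horizons, which is one reason validation on a separate unseen period is informative.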

Conclusion

This study successfully demonstrated the use of an LSTM neural network to predict pitting corrosion evolution in corrosion-resistant alloys. The LSTM model outperformed traditional ML methods by capturing the time-series dependencies present in the corrosion data. Key influencing factors were identified, including alloy composition (Fe, C, Si) and immersion time. This approach offers a valuable tool for predicting long-term corrosion behavior, potentially leading to improved materials design and service life predictions. Future work could focus on incorporating more comprehensive data, including microscopic observations and electrochemical parameters, to enhance model accuracy and expand the applicability of this method to a wider range of alloys and corrosion conditions.

Limitations

The study focused on a specific set of alloys and corrosion conditions (3.5 wt.% NaCl solution at 18 °C). The generalizability of the LSTM model to other alloys and environments needs further investigation. The model's accuracy could also be improved by incorporating additional features, such as microstructural information obtained through advanced characterization techniques. Although the dataset was shared for public use, further exploration of data preprocessing and feature engineering methods could refine the predictions. Finally, a larger dataset would improve the robustness of the model, although the model already performs well given the limited data available.
