Introduction
Corrosion inhibition research has evolved significantly since the early categorization of inhibitors. The field is shifting towards data-driven discovery, using machine learning for prediction and categorization. This shift is evident in both mechanistic and statistical research approaches. Mechanistic research relies on controlled experiments and computational models to understand inhibition mechanisms, with substantial work on AA2024-T3 corrosion inhibition. High-throughput screening methodologies, inspired by pharmaceutical drug discovery, generate large datasets but often lack mechanistic detail. Computational methods, including the finite element method (FEM), density functional theory (DFT), and molecular dynamics (MD) simulations, provide insight into inhibitor-substrate interactions. Integrating these experimental and computational approaches has enabled machine learning-based quantitative structure-property relationship (QSPR) models, but the lack of high-quality, multidimensional datasets remains a significant challenge. This study addresses that gap by creating a robust, multidimensional, time-dependent electrochemical database for AA2024-T3 exposed to approximately 80 small organic molecules. The data are used to train predictive machine learning models, with the aim of improving the efficiency of inhibitor discovery and selection.
Literature Review
The literature documents a wide range of approaches to corrosion inhibitor research, from empirical methods to sophisticated computational techniques. Studies have explored the inhibition mechanisms of many inorganic compounds (chromates, rare-earths, molybdates, cobalt ions, magnesium-based pigments, lithium salts) and organic compounds (imidazole, triazole/thiazole, quinoline, carbamate, and thiosemicarbazone derivatives). High-throughput screening methods generate large datasets but often lack mechanistic information. Computational modelling using FEM, DFT, and MD simulations offers microscopic insight into inhibitor-substrate interactions. QSPR models built with machine learning aim to correlate molecular descriptors with inhibitor efficiency. However, previous research, such as Winkler et al.'s work on AA2024 and AA7075, highlighted the limitations of relying solely on *in vacuo* DFT data for predictive models. The CORDATA database offers an open-source approach, but data heterogeneity remains a challenge. High-throughput datasets exist for aluminum alloys, but their lack of multidimensional input features limits modeling accuracy. This study therefore emphasizes a multidimensional electrochemical dataset as the basis for improved machine learning applications.
Methodology
AA2024-T3 samples (20 mm × 20 mm, 2 mm thick) were mechanically ground and polished to a mirror finish. Electrochemical experiments were performed in a three-electrode cell containing 300 mL of 0.1 M NaCl solution, with and without 1 mM of each of the ~80 small organic inhibitor candidates. Three techniques were used. Linear Polarization Resistance (LPR) measurements were taken every 10 minutes for 24 hours (±10 mV potential range, 0.5 mV s⁻¹ scan rate). Electrochemical Impedance Spectroscopy (EIS) measurements were performed at 2 and 24 hours (10 kHz-10 mHz frequency range, 10 mV peak-to-peak amplitude). Potentiodynamic Polarization (PDP) curves were recorded after the 24-hour EIS measurement (-250 mV to +250 mV vs. OCP, 0.5 mV s⁻¹ scan rate). All measurements were repeated at least three times. The pH of each electrolyte was measured before and after the electrochemical experiments. Inhibitor performance was assessed using inhibition efficiency (IE) and inhibition power (IP). Molecular descriptors were generated using RDKit, and DFT calculations were performed using Turbomole. Recursive Feature Elimination (RFE) was used for feature selection, and Random Forest (RF) models were trained to predict inhibition performance from four feature sets: structural features only, structural features with DFT data, structural features with average pH, and structural features with both DFT data and pH. Model performance was evaluated using R² and RMSE, and 6-fold cross-validation was performed to assess model robustness.
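The summary names IE and IP without giving their formulas. The sketch below uses definitions that are common in the corrosion literature and may differ in detail from those used in the study: IE as the percentage reduction in corrosion current density relative to the uninhibited blank, and IP as ten times the base-10 logarithm of the blank-to-inhibited ratio. The function and parameter names are illustrative assumptions.

```python
import math


def inhibition_efficiency(blank: float, inhibited: float) -> float:
    """IE in percent (assumed definition): 100 * (1 - i_inhibited / i_blank).

    Both arguments are corrosion current densities in the same units.
    """
    if blank <= 0 or inhibited <= 0:
        raise ValueError("current densities must be positive")
    return 100.0 * (1.0 - inhibited / blank)


def inhibition_power(blank: float, inhibited: float) -> float:
    """IP on a logarithmic scale (assumed definition): 10 * log10(i_blank / i_inhibited)."""
    if blank <= 0 or inhibited <= 0:
        raise ValueError("current densities must be positive")
    return 10.0 * math.log10(blank / inhibited)


if __name__ == "__main__":
    blank = 1.0  # arbitrary units; illustrative value, not measured data
    for inhibited in (0.1, 1.0, 10.0):
        ie = inhibition_efficiency(blank, inhibited)
        ip = inhibition_power(blank, inhibited)
        print(f"i_inh={inhibited:5.1f}  IE={ie:8.1f} %  IP={ip:6.1f}")
```

Under these assumed definitions, a tenfold reduction and a tenfold increase in corrosion current give IP values of equal magnitude and opposite sign, whereas IE is capped at 100 % for inhibitors and unbounded below for accelerators.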
Key Findings
Potentiodynamic polarization, EIS, and LPR measurements revealed a wide range of inhibitor behaviors, with corrosion current densities varying by up to two orders of magnitude. The best inhibitors reduced corrosion current density more than 10-fold. Analysis of corrosion and breakdown potentials indicated that the inhibitors acted as mixed or anodic inhibitors. EIS impedance modulus values spanned more than two orders of magnitude. Time-weighted average polarization resistance (Rp) values from LPR, calculated using trapezoidal integration, provided a comprehensive metric for inhibitor screening. Inhibition power (IP) proved a better metric than inhibition efficiency (IE) for comparing inhibitor performance, providing a more linear and unbiased assessment. For initial screening, time-weighted LPR measurements correlated strongly with the other techniques and are a suitable proxy for protective behavior. Time-dependent measurements varied significantly within the first 6 hours and were more stable thereafter. Inhibitor ranking based on time-weighted LPR (converted to IP) showed that compounds containing N and S heteroatoms consistently performed well, while compounds containing only oxygen often acted as corrosion accelerators. Analysis of electrochemical potentials (corrosion potential, breakdown potential, and their difference, which represents the passive range) showed that inhibitors significantly influenced the corrosion potential, whereas the breakdown potential remained relatively constant. Machine learning analysis using RFE and RF indicated that incorporating DFT parameters (especially HOMO) and average pH values significantly improved prediction accuracy (R²) and reduced RMSE, highlighting the importance of both molecular structure and environmental factors in inhibitor performance. The model with combined descriptors and IP as the target yielded the best results.
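The time-weighted average Rp mentioned above can be sketched as a trapezoidal integral of Rp over time, divided by the total measurement duration. This is a minimal illustration; the function name and the sample values are hypothetical and are not taken from the study's data.

```python
def time_weighted_average(times, values):
    """Trapezoidal time-weighted average of `values` sampled at `times`.

    `times` must be strictly increasing; the two sequences must have
    equal length and contain at least two points.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of the trapezoid between consecutive samples.
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area / (times[-1] - times[0])


if __name__ == "__main__":
    # Hypothetical Rp trace: rises during the first 6 h, then stays flat.
    hours = [0.0, 6.0, 24.0]
    rp = [0.0, 12.0, 12.0]
    print(time_weighted_average(hours, rp))
```

Because each interval is weighted by its duration, the long stable period after the early transient dominates the average, in line with the observation that measurements settle after the first 6 hours.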
Discussion
This research addressed the need for a high-quality, multidimensional dataset for corrosion inhibitor discovery. The findings confirm the importance of time-resolved measurements for accurate assessment of inhibitor performance and the advantage of inhibition power as a performance metric. The study highlighted the complex relationship between molecular structure, environmental factors (pH), and corrosion inhibition. The improved predictive accuracy of machine learning models incorporating DFT data and experimental pH values demonstrates the synergistic value of combining experimental and computational approaches. The results offer insight into the selection of relevant descriptors for developing more accurate QSPR models. The identification of key features such as N and S heteroatoms and pH improves the ability to predict the performance of untested compounds, accelerating the discovery of novel corrosion inhibitors.
Conclusion
This study provides a valuable experimental foundation for the data-driven discovery of corrosion inhibitors. The creation of a comprehensive electrochemical database and the demonstration of improved machine learning models using augmented descriptors (combining structural, DFT, and experimental parameters) contribute significantly to the field. Future research should focus on expanding the dataset, exploring more advanced machine learning techniques, and investigating the use of additional mechanistic descriptors to further refine predictive models. Developing faster, high-resolution electrochemical techniques would also significantly benefit the research field.
Limitations
The study used a limited number of molecules. Although 80 molecules were initially tested, only 59 dissolved fully in the solutions, which could limit the generalizability of the findings. The pH measured was that of the bulk electrolyte. While this information was valuable for the models, the local pH at the electrode surface could differ. The relatively low R² and the high standard deviation of RMSE in the cross-validation suggest that more training data are required to make predictions more robust and less sensitive to outliers.
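For reference, the two evaluation metrics and the fold-to-fold spread discussed above can be computed as in the sketch below. It uses the standard definitions of RMSE and R²; the helper names and the per-fold scores in the example are made up for illustration and are not results from the study.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize_folds(fold_scores):
    """Mean and sample standard deviation of a per-fold score."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


if __name__ == "__main__":
    # Made-up per-fold RMSE values for a 6-fold cross-validation.
    fold_rmse = [2.1, 3.4, 1.8, 4.0, 2.6, 3.1]
    mean_rmse, std_rmse = summarize_folds(fold_rmse)
    print(f"RMSE = {mean_rmse:.2f} +/- {std_rmse:.2f}")
```

A large standard deviation relative to the mean indicates that the score depends strongly on which compounds fall into each fold, which is the sensitivity to outliers noted above.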