Teaching solid mechanics to artificial intelligence—a fast solver for heterogeneous materials

Engineering and Technology


J. R. Mianroodi, N. H. Siboni, et al.

Discover how Jaber Rezaei Mianroodi, Nima H. Siboni, and Dierk Raabe have revolutionized local stress calculations in complex materials with their innovative deep neural network. Achieving up to 8300 times speedup compared to conventional solvers, this research highlights a transformative approach for micromechanics in non-linear materials.

Introduction
The study addresses the challenge of solving mechanical equilibrium equations in materials with complex, heterogeneous microstructures and non-linear constitutive behavior, where conventional numerical solvers (finite element, spectral) become computationally expensive. The research question is whether a deep neural network can serve as a surrogate solver to predict spatially resolved stress fields in heterogeneous solids (elastic and elasto-plastic), with acceptable accuracy and significantly reduced computational cost, and whether it generalizes to microstructures beyond those used in training. The motivation stems from the need for fast, scalable simulations in multi-physics, multi-scale materials modeling where microstructural complexity and non-linearities (e.g., yielding, localization) impose high computational burdens.
Literature Review
Prior work has used machine learning primarily for homogenized property predictions and accelerated materials discovery. Examples include CNNs linking 3D microstructures to homogenized properties (Cecen et al.), multi-fidelity training frameworks for elastic membranes (Aydin et al.), surrogate models for intergranular fracture (Fernández et al.), hierarchical neural hybrids for failure probabilities (Li et al.). A smaller body of work targets spatially resolved fields: cGANs predicting stress/strain from microstructure (Yang et al.), pretrained models for PDEs on unseen domains (Wang et al.), and ML surrogates for microstructure evolution (Pandey et al.). The present work differs by providing a general AI-based solver for local stresses in heterogeneous solids, including non-linear elasto-plastic behavior, focusing on accuracy, generalization, and computational speed compared with a spectral solver (DAMASK).
Methodology
Governing equations: Large-deformation elasto-plastic mechanics as implemented in DAMASK: mechanical equilibrium Div P = 0 with boundary conditions; multiplicative split of the deformation gradient F = Fe Fp; plastic flow via J2 plasticity; von Mises equivalent stress S_M = sqrt(3/2) ||S_dev||; elastic response S = C : E with Green strain E. The non-linear PDEs are solved in DAMASK with spectral methods to generate the reference data.

Constitutive cases: (i) isotropic elasticity; (ii) elastic-perfectly-plastic (zero hardening) isotropic J2 plasticity with rate sensitivity (n = 20, γ0 = 10^-3 s^-1). Material properties assigned per domain: elastic case Y ∈ {80, 90, 100, 110, 120} GPa, ν ∈ {0.1, 0.2, 0.3, 0.4}; elasto-plastic case Y ∈ {60, 80, 100, 120} GPa, ν ∈ {0.1, 0.2, 0.3, 0.4}, Sy ∈ {50, 100, 150, 200} MPa.

Data generation: 1000 two-dimensional Voronoi microstructures (20 domains each) on a 256×256 grid. Loading: elastic case εxy = 0.01 applied in one step; elasto-plastic case εxx = 0.001 applied in 100 increments. DAMASK von Mises stress maps serve as ground truth.

Inputs to the ML model: channels containing spatial maps of Y and ν; for the elasto-plastic case, an additional channel for Sy alongside Y and ν. Input values are scaled to [0, 255]; the output is a scaled von Mises stress map (one channel).

Network: a modified U-Net with contracting/expanding paths and skip connections. Key modifications: a larger convolution kernel (k = 9 instead of the typical k = 3) to better capture derivative-like operations; separable convolutions in place of standard convolutions to learn weighted channel combinations (appropriate because the input channels are not highly correlated); variations in depth Ns (e.g., 4 and 8) and channel count; a final layer that reduces the channels to 1 with a sigmoid activation.

Training: implemented in Keras/TensorFlow; dataset split 950 train / 50 test; loss = mean absolute error; optimizer = Adam (β1 = 0.9, β2 = 0.999, ε = 1e-7); batch size 32; 400–800 epochs. Training reaches MAE ≈ 0.02 without significant overfitting.
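The von Mises equivalent stress that serves as the network's output quantity follows directly from the definition S_M = sqrt(3/2) ||S_dev|| given above. A minimal numpy sketch (the uniaxial test value is a textbook check, not data from the paper):

```python
import numpy as np

def von_mises(S):
    """Von Mises equivalent stress S_M = sqrt(3/2) * ||S_dev||
    for a 3x3 stress tensor S (Frobenius norm of the deviator)."""
    S_dev = S - (np.trace(S) / 3.0) * np.eye(3)
    return np.sqrt(1.5) * np.linalg.norm(S_dev)

# Uniaxial stress of 100 MPa: the equivalent stress is also 100 MPa
# (up to floating-point round-off)
S = np.diag([100.0, 0.0, 0.0])
print(von_mises(S))
```

In the reference data this scalar is evaluated at every grid point of the 256×256 DAMASK solution to produce the stress maps the network learns to reproduce.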
Evaluation includes generalization to geometries not seen in training (e.g., curved inclusions and circular/square hard particles in a soft matrix). Performance benchmarking: a CPU-only, single-core comparison with DAMASK on an AMD EPYC 7702 at 3.34 GHz to estimate floating-point cost; AI inference was benchmarked on the same CPU.
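The separable convolutions used in the modified U-Net factor a standard convolution into a per-channel (depthwise) spatial filter followed by a 1×1 (pointwise) channel mix. A minimal numpy sketch of the operation (loop-based for clarity, 'valid' padding; an illustration of the idea, not the authors' Keras layer):

```python
import numpy as np

def separable_conv2d(x, depthwise, pointwise):
    """x: (H, W, C_in) field; depthwise: (k, k, C_in) per-channel filters;
    pointwise: (C_in, C_out) 1x1 channel-mixing weights."""
    H, W, C_in = x.shape
    k = depthwise.shape[0]
    Ho, Wo = H - k + 1, W - k + 1
    dw = np.zeros((Ho, Wo, C_in))
    for c in range(C_in):            # each input channel is filtered independently
        for i in range(Ho):
            for j in range(Wo):
                dw[i, j, c] = np.sum(x[i:i+k, j:j+k, c] * depthwise[:, :, c])
    return dw @ pointwise            # 1x1 conv = matrix product over channels
```

With k = 9, each depthwise filter has a receptive field large enough to approximate higher-order finite-difference stencils, which is the rationale the authors give for enlarging the kernel beyond the typical k = 3.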
Key Findings
- Accuracy: For heterogeneous elastic media (mechanical contrast up to ~1.5), the DNN achieves 3.8% MAPE; for elasto-plastic media (yield stress contrast up to 4×), it achieves 6.4% MAPE.
- Speed: Elastic case inference ~0.12 s vs DAMASK ~12.13 s (103× speed-up). Elasto-plastic: DAMASK total ~22 min for 100 increments (~13.2 s per increment); DNN ~0.158 s total, yielding ~8300× speed-up; even a single DAMASK increment is ~84× slower than the DNN's one-shot prediction.
- Error distribution: For elastic tests similar to training (Voronoi), most points (91%) have ≤4% relative error; the largest errors localize at domain boundaries and near simulation box borders (attributed to the lack of periodic boundary handling in the CNN). Generalization to unseen topologies (e.g., circular/square inclusions): correct stress partitioning is captured; max relative error ~12%, again concentrated near interfaces; example minima around a circular inclusion: DAMASK 802.5 MPa vs AI 849.4 MPa.
- Network depth effect: The deeper U-Net (Ns=8) better captures the wake of stress fields but amplifies oscillations at property discontinuities (analogous to including higher frequencies in spectral solvers); the shallower network (Ns=4) reduces overshoots.
- Kernel size effect: Small kernels (k=3, 5) fail to learn the correct distributions (higher loss); larger kernels (k=7, 9) train to low loss (~0.02) and produce accurate fields; all reported results used k=9.
- Generalization: The trained model predicts stress distributions for geometries far from the training set (non-Voronoi, curved interfaces) with acceptable accuracy, indicating robust generalization.
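The MAPE figures quoted above follow the standard definition; a small numpy sketch (the arrays are illustrative, not data from the paper):

```python
import numpy as np

def mape(pred, ref):
    """Mean absolute percentage error between predicted and reference fields."""
    return 100.0 * np.mean(np.abs(pred - ref) / np.abs(ref))

ref  = np.array([100.0, 200.0, 400.0])   # reference von Mises stresses (MPa)
pred = np.array([104.0, 192.0, 416.0])   # surrogate predictions (MPa)
print(mape(pred, ref))  # each point is 4% off, so MAPE is ~4.0
```

Applied pixel-wise over the 256×256 stress maps, this is the metric behind the 3.8% (elastic) and 6.4% (elasto-plastic) figures.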
Discussion
The DNN-based surrogate offers a non-iterative, single-pass alternative to conventional iterative solvers for heterogeneous, non-linear solid mechanics, drastically reducing computational cost while maintaining acceptable accuracy for many applications. Errors are higher than those from high-accuracy FEM/spectral solutions in some regions, particularly near interfaces and boundaries; however, for tasks such as topology/microstructure optimization, solver acceleration, and hotspot identification, current accuracy is sufficient. The analysis links U-Net architectural choices to numerical analogs (e.g., depth to derivative order/high-frequency content, kernel size to derivative accuracy), offering guidance for balancing accuracy and oscillation control. Boundary-related errors likely stem from not enforcing periodic boundary conditions in convolutions; incorporating periodic padding and filtering (analogous to spectral filtering) is expected to reduce errors. The surrogate can also augment conventional solvers by providing high-quality initial stress guesses to reduce iterations. Extending the framework to predict full stress/strain tensor fields and to multiple boundary conditions is straightforward conceptually. Incorporating physics (e.g., via physics-informed neural networks) could improve accuracy, stability, and uncertainty quantification.
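The periodic-padding remedy suggested above can be sketched with numpy's 'wrap' mode: each input map is padded with copies of its opposite edges, so that a subsequent 'valid' convolution behaves periodically, consistent with the boundary conditions of the spectral reference solver. A sketch of the idea, not the authors' implementation:

```python
import numpy as np

def periodic_pad(field, halo):
    """Pad a 2D field with `halo` rows/columns copied from the opposite
    edges, so a subsequent 'valid' convolution sees periodic boundaries."""
    return np.pad(field, pad_width=halo, mode="wrap")

f = np.arange(9.0).reshape(3, 3)
g = periodic_pad(f, 1)   # 5x5; e.g. the new top row is the old bottom row
```

For the k = 9 kernels used in the paper, a halo of 4 pixels on each side would keep the output the same size as the input while removing the border artifacts.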
Conclusion
This work demonstrates a U-Net-based deep learning surrogate that predicts spatially resolved stress fields in heterogeneous elastic and elasto-plastic materials with low MAPE (3.8% elastic, 6.4% elasto-plastic) and very large speed-ups (103× and up to 8300×, respectively) compared to a spectral solver (DAMASK). The model generalizes to geometries not present in training and provides insights into how network depth and kernel size affect solution quality and oscillations. Future directions include: enforcing periodicity and introducing filtering to mitigate boundary/oscillation errors; predicting full tensor components; integrating physics via PINNs to enhance accuracy and trustworthiness; extending to broader boundary conditions and load cases; and hybrid strategies where ML provides initial guesses to accelerate iterative solvers.
Limitations
- Errors are elevated near domain interfaces and close to simulation box boundaries due to the lack of periodic boundary handling in the CNN convolutions.
- Deeper networks can amplify oscillations at sharp property discontinuities (Gibbs-like effects), requiring architectural or filtering countermeasures.
- Accuracy, while acceptable for many tasks, may be insufficient where absolute stress precision is critical (e.g., damage/fracture modeling) without further refinement.
- Performance speed-ups depend on solver settings; reported factors may vary with different configurations and optimization of conventional solvers.
- Elasto-plastic predictions were trained at a specific load magnitude and perfect plasticity; generalization across broader loading histories, hardening laws, and instabilities/bifurcations may be more challenging and require more data/physics constraints.
- The approach is data-driven and does not natively encode the governing PDEs; the lack of embedded physics can limit extrapolation reliability without PINN-like constraints.