
A multi-model architecture based on deep learning for aircraft load prediction
C. Sun, H. Li, et al.
This research presents a deep learning-based aircraft load model, trained on more than two million flight records, that reports 97.16% average strain prediction accuracy and 99.49% average goodness-of-fit in calibration. The study, conducted by authors from Peking University and the Aviation Industry Corporation of China, improves strain prediction and load model calibration and reduces the need for ground testing.
Introduction
Aircraft structural health monitoring is essential because load-induced fatigue failures are common and installing reliable load sensors on in-service aircraft is impractical. Current engineering practice identifies the structural load F from flight parameters X via a load model F = f(X), but building this model requires expensive, time-consuming ground tests (wind tunnel and CFD), and the model from a single aircraft is often applied to an entire fleet, which reduces reliability because individual aircraft differ. Classical approaches compute load from measured strains E using a linear relation (F = kE + b), but strain gauges are costly and can fail, drift, or detach, leading to large errors and maintenance burdens. This study proposes a two-phase method that combines the advantages of both approaches: (I) predict strains from readily available flight parameters via deep learning; (II) calibrate strain-load coefficients to obtain the load. The result is a general, low-cost, accurate load model that adapts to each aircraft without relying on long-term strain gauge measurements. The work investigates causality between flight parameters and strains, addresses multi-distribution data caused by varying flight attitudes, mitigates short-term data corruptions, and provides interpretability to support engineering adoption.
Literature Review
Prior structural health monitoring methods rely on strain gauges and ground calibration to relate strains to loads, but gauges suffer from drift, detachment, and radiation effects, and ground tests are expensive and risky. Fleet-level practice often reuses a single aircraft’s load model, which limits how well it generalizes. Earlier work established linear strain–load equations and explored operational load monitoring across military and transport aircraft. The authors’ prior research introduced a deep learning-based dynamic Granger causality method to capture nonlinear, time-varying causal relations in multivariate time series. Interpretability and expert-in-the-loop analysis are emphasized in regulated domains; SHAP and decision-tree-based surrogate explanations have been used to explain complex models. The paper builds on these streams by integrating deep learning causality, multi-model learning under non-i.i.d. conditions, and interpretable feature importance for aviation load prediction.
Methodology
Overview: A two-phase pipeline forms a general load model. Phase I predicts strains from flight parameters using a deep learning multi-model tailored to flight attitudes. Phase II calibrates strain-load coefficients across aircraft using clustering-based optimization to adapt a reference strain–load mapping to each aircraft.
Data: 2,003,159 flight records from 5 aircraft (~400k per aircraft). Each record includes 28 flight parameters (e.g., weight, Mach, altitude, attitudes, angular rates/accelerations, control surface deflections, overloads) and 10 strains (e.g., wing/canard/vertical tail shear and bending, fuselage bending).
Preprocessing: (1) Filter the time series with an 8 Hz stopband cutoff; (2) Remove outliers with an angle-based outlier detector; (3) Remove redundant features using Pearson correlation, multicollinearity analysis, and PCA; (4) Create frequency-domain features by fusing across frequency bands; (5) Extend the inputs by adding products of flight parameters that have physical meaning (e.g., lift ~ normal overload × weight). These steps extend the input set from 28 to 406 features.
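Step (5) can be illustrated with a minimal Python sketch. It assumes the product features are all pairwise products of distinct parameters, which is one construction consistent with the stated count (28 originals + 28·27/2 = 378 products = 406); the summary does not confirm that this is how the authors built the set. The function name `add_pairwise_products` and the random data are hypothetical.

```python
from itertools import combinations

import numpy as np


def add_pairwise_products(X):
    """Append the product of every distinct column pair to X.

    Assumption: all C(n, 2) pairs are used. The source only says that
    products "with physical meaning" were added.
    """
    n_features = X.shape[1]
    products = [X[:, i] * X[:, j] for i, j in combinations(range(n_features), 2)]
    return np.column_stack([X] + products)


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 28))       # 5 synthetic records, 28 flight parameters
X_ext = add_pairwise_products(X)
print(X_ext.shape)                 # (5, 406)
```

In practice a domain expert would likely prune this set to the physically motivated products, such as normal overload × weight as a proxy for lift.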
Causality validation: Adopt deep learning-based Granger causality using LSTM models. A variable X has a causal influence on strain E if adding the history of X to the input reduces the prediction error compared with using the history of E alone (ΔError < 0). This captures nonlinear, dynamic, multivariate relations.
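The ΔError criterion can be sketched on synthetic data. To keep the example short and dependency-free, the sketch below swaps the LSTM for a linear autoregression fitted by least squares, so it illustrates the comparison itself rather than the authors' deep learning models. The helper names (`lagged`, `holdout_mse`, `delta_error`), the lag count, and the train/test split are all assumptions.

```python
import numpy as np


def lagged(series, lags):
    """Columns series[t-1], ..., series[t-lags] for t = lags .. T-1."""
    T = len(series)
    return np.column_stack([series[lags - k:T - k] for k in range(1, lags + 1)])


def holdout_mse(F_train, y_train, F_test, y_test):
    """Fit least squares with an intercept on train, return MSE on test."""
    A_train = np.column_stack([F_train, np.ones(len(y_train))])
    A_test = np.column_stack([F_test, np.ones(len(y_test))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_test @ coef - y_test) ** 2))


def delta_error(x, e, lags=2, train_frac=0.7):
    """Held-out error with (E, X) history minus error with E history alone."""
    target = e[lags:]
    own = lagged(e, lags)
    both = np.column_stack([own, lagged(x, lags)])
    n = int(train_frac * len(target))
    err_own = holdout_mse(own[:n], target[:n], own[n:], target[n:])
    err_both = holdout_mse(both[:n], target[:n], both[n:], target[n:])
    return err_both - err_own


# Synthetic example: the "strain" e is driven by the previous value of x.
rng = np.random.default_rng(0)
T = 2000
x = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = 0.5 * e[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

d_forward = delta_error(x, e)   # clearly negative: x helps predict e
d_reverse = delta_error(e, x)   # near zero: e does not help predict x
print(d_forward, d_reverse)
```

The errors are measured on held-out data because an in-sample least-squares error can never increase when inputs are added, which would make ΔError ≤ 0 trivially.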
Multi-model architecture (Phase I): Because the relations differ across flight attitudes, the data are divided into 36 subsets using a hybrid of maneuver coding (9 categories: turn/circle, pull/push, dive turn, jump turn, split-S, loop, half loop, roll, ground attack) and point-in-the-sky (PITS) bins defined by the thresholds H = 5000 m and Nz = 3.0 g (4 bins), giving 9 × 4 = 36 subsets. Each subset gets its own predictor: a Multi-Layer Perceptron (MLP) is the primary model, with Ridge Regression and LightGBM used for small-sample subsets. Neural architecture search selects the MLP hyperparameters, and a dual-loss objective combining MSE and model uncertainty improves stability and generalization. Transfer learning adapts the models to new data.
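The routing of a record to one of the 36 subsets can be sketched as follows. The maneuver list and the two thresholds come from the text; the bin ordering, the treatment of values exactly at a threshold (`>=`), and the names `pits_bin` and `subset_id` are assumptions made for illustration.

```python
MANEUVERS = [
    "turn/circle", "pull/push", "dive turn", "jump turn", "split-S",
    "loop", "half loop", "roll", "ground attack",
]
H_THRESHOLD_M = 5000.0   # altitude threshold H from the text
NZ_THRESHOLD_G = 3.0     # normal overload threshold Nz from the text


def pits_bin(altitude_m, nz_g):
    """Return 0..3: low/high altitude crossed with low/high normal overload.

    Assumption: values equal to a threshold fall in the "high" bin.
    """
    return 2 * int(altitude_m >= H_THRESHOLD_M) + int(nz_g >= NZ_THRESHOLD_G)


def subset_id(maneuver, altitude_m, nz_g):
    """Return 0..35: index of the per-subset predictor for this record."""
    return 4 * MANEUVERS.index(maneuver) + pits_bin(altitude_m, nz_g)


print(subset_id("roll", altitude_m=6000.0, nz_g=2.0))   # 30
```

A dictionary keyed by `subset_id` could then hold one trained predictor per subset, with a simpler learner stored for subsets that have few samples.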
Strain–load mapping and coefficient calibration (Phase II): A reference aircraft (0) has a strain–load equation obtained from ground tests. For any other aircraft a, predict strains from flight parameters and calibrate a linear relation E⁰ = S_E·Eᵃ + bᵃ between aircraft a’s predicted or measured strains and the reference strains. Strain pairs are constructed efficiently by cross-predicting strains between aircraft models, which reduces the pairing search from O(n²) to O(n). The calibration iterates over the feasible range of the scale factor S_E (SF) while clustering the intercepts b to stabilize the fit: b is modeled as Gaussian, then grouped by distribution-based binning followed by density-based merging (DBSCAN-like). The silhouette coefficient and the global R² guide the choice of the SF that maximizes goodness-of-fit without overfitting to local clusters. The final per-aircraft load model applies the reference strain–load mapping to the calibrated, predicted strains.
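The core of the calibration, fitting E⁰ = S_E·Eᵃ + bᵃ from paired strains and scoring it with R², can be sketched with ordinary least squares. This is a deliberately simplified stand-in: it omits the SF search, the intercept clustering, and the silhouette-based selection described above. The function name `calibrate` and the synthetic scale, intercept, and noise values are hypothetical.

```python
import numpy as np


def calibrate(e_a, e_ref):
    """Fit e_ref ~ s_e * e_a + b by least squares; return (s_e, b, r2)."""
    A = np.column_stack([e_a, np.ones_like(e_a)])
    (s_e, b), *_ = np.linalg.lstsq(A, e_ref, rcond=None)
    residual = e_ref - (s_e * e_a + b)
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((e_ref - e_ref.mean()) ** 2))
    return float(s_e), float(b), 1.0 - ss_res / ss_tot


# Synthetic strain pairs: aircraft a versus the reference aircraft.
rng = np.random.default_rng(0)
e_a = rng.uniform(-500.0, 500.0, size=1000)             # strains of aircraft a
e_ref = 1.08 * e_a + 12.0 + rng.normal(0.0, 5.0, 1000)  # paired reference strains

s_e, b, r2 = calibrate(e_a, e_ref)
print(round(s_e, 3), round(b, 2), round(r2, 4))
```

Once S_E and bᵃ are known, the calibrated strains S_E·Eᵃ + bᵃ can be passed through the reference aircraft's strain–load equation to obtain the load for aircraft a.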
Key Findings
Performance: The Phase I strain prediction achieves 97.16% average accuracy across aircraft and strain types; Phase II coefficient calibration achieves 99.49% average goodness-of-fit.
Findings:
- Finding 1: Deep learning-based Granger causality exists from flight parameters (and their products) to strains; >70% of pairs show classical Granger causality (p<0.05) and >80% show DL-based causality.
- Finding 2: A strain–load equation for any aircraft can be derived by calibrating a reference aircraft’s equation with a correction coefficient, enabling fleet-wide generalization with only one ground test.
- Finding 3: Flight attitude strongly affects strain–parameter relations, yielding multiple data distributions; a single i.i.d. model underperforms. A hybrid maneuver+PITS division into 36 subsets yields the best accuracy (>95%), outperforming maneuver-only or PITS-only splits.
- Finding 4: Short-period data corruptions (spikes, steps, composite) are the main noise sources. Correcting them improves prediction accuracy by ~5%; additional preprocessing (outlier removal, redundancy reduction, frequency-domain features) adds ~2%; adding physically meaningful product features adds ~6%.
- Finding 5 (Interpretability): Key influential flight parameters for load/strain prediction are normal overload, angle of attack, inner aileron deflection, Mach, and barometric altitude. Deep models prioritize overload, AoA, Mach, and altitude more than classical linear models, which emphasize control surface deflections.
Additional: The multi-model architecture consistently matches or exceeds the best of MLP, LightGBM, and Ridge Regression across data regimes; cross-aircraft strain pair construction reduces complexity from O(n²) to O(n).
Discussion
The proposed two-phase framework addresses the lack of reliable onboard load sensors and the cost and risk of per-aircraft ground calibration by combining accurate strain prediction from ubiquitous flight parameters with a low-cost calibration step. Dividing data by flight attitude resolves non-i.i.d. distributions and improves robustness and accuracy. Interpretability via SHAP and a nonredundant multiple tree supports engineering acceptance, highlighting physically meaningful drivers (e.g., normal overload, AoA) and offering rules for calibration. The data-driven approach reveals potentially overlooked physics (e.g., dynamic pressure interactions) and demonstrates how big flight datasets can enhance reliability and generalization through uncertainty-aware training and transfer learning. Overall, the method enables a more general fleet-wide load model with minimal ground testing, improving structural health monitoring and life prediction.
Conclusion
This work presents a deep learning-based, two-phase approach for aircraft load prediction: (1) a multi-model architecture that predicts strains from flight parameters under 36 hybrid-coded flight attitudes, and (2) a clustering-driven coefficient calibration to adapt a reference strain–load mapping to each aircraft. The approach attains 97.16% average strain prediction accuracy and 99.49% goodness-of-fit in calibration using over 2 million real flight records, reducing reliance on expensive strain gauges and extensive ground tests. Interpretability methods identify key features and provide rule-based explanations for calibration, facilitating engineering deployment. Future research will improve adaptability without strain gauges through enhanced transfer/few-shot learning and federated learning, address data scarcity and privacy constraints, and further streamline multi-model training to broaden applicability to other fleets and aviation tasks.
Limitations
The approach still requires one reference ground test and an additional calibration phase, adding process steps. Data availability is constrained by engineering, confidentiality, and privacy considerations. Some attitude subsets suffer from small sample sizes, challenging deep models and necessitating simpler learners. Although interpretability is improved via SHAP and surrogate trees, the internal mechanisms of deep models remain complex. The method’s performance and calibration robustness may vary with new aircraft types and operational envelopes, motivating future work on improved transfer and few-shot/federated learning.