Mass Conservative Time-Series GAN for Synthetic Extreme Flood-Event Generation: Impact on Probabilistic Forecasting Models

Earth Sciences

D. Karimanzira

Discover how Divas Karimanzira's innovative research harnesses the power of Generative Adversarial Networks to revolutionize flood forecasting. By generating synthetic flood events, this study significantly enhances predictive models, demonstrating a remarkable 9.8% improvement in multi-step forecasts. Explore the future of smarter and more reliable flood management!

Introduction
Extreme flood events are difficult to forecast due to their rarity, complexity, and limited observational data, which lead to imbalanced datasets and reduced model accuracy. Traditional numerical and hydrological models, while physically grounded, are computationally intensive, require complex calibration, and may not capture the full variability of extreme events. Time-series GANs offer a data-driven alternative that can generate synthetic time series mimicking real flood dynamics. This study addresses the data scarcity problem by proposing a Mass Conservative Time-Series GAN (MC-TSGAN) that integrates physical constraints (mass conservation, energy balance, hydraulic principles) into TimeGAN to generate realistic synthetic extreme flood events. The goal is to augment real datasets with synthetic events to improve the accuracy and robustness of multistep-ahead probabilistic flood forecasting for the Ahrtal region (Germany).
Literature Review
Prior work shows two main synthetic data approaches: physics-based numerical/hydrological models and machine-learning GANs. Numerical models can simulate unseen events but are computationally heavy and sensitive to calibration and assumptions. Time-series GAN variants have been applied to spatiotemporal data: SINGAN for weather radar generation; TSGAN for sequential patterns; FloodGAN for radar rainfall; TimeGAN/RTSGAN for time-series synthesis, with RTSGAN offering stability and accuracy improvements over TimeGAN. Recent studies indicate GAN-based synthetic data can enhance forecasting models and address data scarcity, with reviews highlighting their utility for extreme event modeling. This paper builds on these by embedding physical constraints into a TimeGAN-like framework to improve realism for flood applications.
Methodology
Case study and data: The focus is the Ahrtal region (Germany) with three gauges. Inputs include historical and forecast precipitation (ECMWF), soil-type information, and high-frequency water levels (10–15 min). Data are normalized and segmented into temporal windows, and multi-site data are integrated to capture spatial dependencies.

Approach: A five-step pipeline: (1) collect data and integrate it spatially; (2) train MC-TSGAN; (3) augment the original dataset with synthetic sequences; (4) train an encoder–decoder LSTM/GRU probabilistic forecasting model on the original versus the augmented data; (5) evaluate both generation quality and forecasting performance.

MC-TSGAN architecture and training: The model is based on TimeGAN/RTSGAN with three components: (i) a generator using RNNs (LSTM/GRU) with WGAN training in the latent space learned by an embedding network; (ii) a CNN-based discriminator that distinguishes real from synthetic sequences; (iii) an embedding network (autoencoder-style) that learns compact temporal representations. To enforce physics, mass-conservative LSTM modifications redefine the gates and cell updates to track and conserve mass between inputs, states, and outputs (normalized gates i^t and o^t, retain gate R^t; total mass m^t split between the cell state c^t and the output h^t). Additional regularization terms enforce the constraints: mass conservation Loss_Q = λ_Q Σ_t |Q_in − Q_out|; energy balance Loss_E = λ_E Σ_t |E_in − E_out|, using meteorological and hydrological variables to compute energy proxies; and hydraulic constraints (e.g., relationships consistent with Manning's equation) added to the loss to reflect flow continuity and channel properties. Training combines adversarial and reconstruction objectives while emphasizing tail behavior to better model extremes. Space-filling sampling across time reduces statistical error and improves coverage of the data manifold.
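The mass-conserving cell update described above can be sketched in a few lines. This is a minimal illustration of the conservation idea, not the paper's implementation: the single output gate, the weight matrix `Wo`, the bias `b_o`, and the per-feature mass accounting are all simplifying assumptions (the full model also has a normalized input gate distributing mass across cells).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mc_lstm_step(x_t, c_prev, Wo, b_o):
    """One mass-conserving recurrent step (simplified sketch).

    Incoming mass x_t joins the stored mass c_prev to give the total
    mass m_t. The output gate o_t releases a fraction of m_t as h_t
    (e.g., discharge) and the retain gate R_t = 1 - o_t keeps the rest
    as the new cell state c_t, so h_t + c_t == c_prev + x_t exactly.
    """
    m_t = c_prev + x_t             # total mass available this step
    o_t = sigmoid(Wo @ x_t + b_o)  # fraction of mass released as output
    h_t = o_t * m_t                # mass leaving the cell
    c_t = (1.0 - o_t) * m_t        # mass retained in storage
    return h_t, c_t

# Conservation check on random inputs
rng = np.random.default_rng(0)
x, c = rng.random(4), rng.random(4)
Wo, b = rng.standard_normal((4, 4)), rng.standard_normal(4)
h, c_new = mc_lstm_step(x, c, Wo, b)
assert np.allclose(h + c_new, x + c)  # mass is conserved
```

Because conservation holds by construction rather than through a penalty, a gate like this complements (rather than replaces) the Loss_Q regularization term applied to the generated sequences.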
Forecast model: A probabilistic encoder–decoder LSTM/GRU model (with a TensorFlow Probability head) ingests historical floods, rainfall, soil moisture, catchment characteristics, and future rainfall forecasts to produce multistep-ahead predictions with uncertainty, trained via negative log-likelihood.

Evaluation: Generation quality is assessed by PCA and t-SNE (structure, clustering, and overlap of synthetic versus real data), t-statistic tests on the means, and discriminative/predictive scores. Forecasting performance is assessed using NSE, KGE, and the probabilistic metrics CRPS, MPIW, and PICP.

Experimental setup: Hyperparameters were optimized via Bayesian optimization: time steps = 24 (hourly), latent dimension = 100, generator with 3 layers, discriminator with 2 layers, learning rate = 0.0002, batch size = 64, epochs = 1000, gradient penalty = 10, L2 regularization = 0.001. Models were implemented in Python with Keras/TensorFlow/PyTorch.
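As a reference for the deterministic skill scores used in the evaluation, NSE and KGE can be computed as below. This is a standard NumPy sketch of the textbook definitions, not code from the study; array names are illustrative.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of model error
    variance to the variance of the observations (1.0 is perfect)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    """Kling-Gupta efficiency combining correlation r, variability
    ratio alpha, and bias ratio beta (1.0 is perfect)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([1.0, 2.0, 3.0, 4.0])
print(round(nse(obs, obs), 3), round(kge(obs, obs), 3))  # 1.0 1.0
```

Both scores equal 1.0 for a perfect forecast, which makes the reported improvements (e.g., NSE 0.829 → 0.838) directly comparable across models.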
Key Findings
- Synthetic data quality: A t-statistic of −1.67 for MC-TSGAN versus a two-tailed critical value of ±1.98 (95% confidence) indicates no significant difference between the means of the generated and original data. Discriminative and predictive scores improved over the baselines: TimeGAN (0.102, 0.088), RTSGAN (0.054, 0.067), MC-TSGAN (0.0490, 0.0519). PCA and t-SNE showed strong overlap and clustering consistency between synthetic and real events, indicating realistic structure and enhanced diversity (lower discriminative score).
- Forecasting accuracy: For 6-hour-ahead predictions, the model trained on augmented data achieved higher accuracy than the model trained on original data only: NSE = 0.838 vs 0.829; KGE = 0.908 vs 0.900. Improvements were more evident at longer horizons, suggesting that synthetic data help the model generalize to extreme scenarios.
- Probabilistic performance (6th hour, augmented vs original): CRPS 0.375 (0.023) vs 0.576 (0.024); MPIW (95%) 0.43 (0.021) vs 0.56 (0.02); PICP (95%) 0.921 vs 0.941. Synthetic data narrowed the prediction intervals and reduced CRPS, with a modest trade-off in coverage (slightly lower PICP).
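For reference, the two interval metrics reported above can be computed as follows. This is a minimal sketch of the standard definitions, not code from the study; the array names and example values are illustrative.

```python
import numpy as np

def picp(obs, lower, upper):
    """Prediction-interval coverage probability: fraction of
    observations falling inside the predicted interval."""
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((obs >= lower) & (obs <= upper)))

def mpiw(lower, upper):
    """Mean prediction-interval width: average interval size,
    a measure of sharpness (smaller is sharper)."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))

obs = np.array([1.0, 2.0, 3.0, 4.0])
lo = np.array([0.5, 1.5, 3.5, 3.5])
hi = np.array([1.5, 2.5, 4.0, 4.5])
print(picp(obs, lo, hi), mpiw(lo, hi))  # 0.75 0.875
```

The pair illustrates the trade-off seen in the results: shrinking the intervals lowers MPIW (sharper forecasts) but risks dropping more observations outside the bounds, lowering PICP.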
Discussion
Embedding physical constraints (mass conservation, energy balance, hydraulic relationships) into a time-series GAN yields synthetic flood sequences that better preserve hydrologic realism and temporal dependencies, as evidenced by PCA/t-SNE overlap, low discriminative/predictive scores, and non-significant t-test differences versus real data. Augmenting training data with these sequences improves deterministic metrics (NSE, KGE) and probabilistic sharpness/calibration (lower CRPS and MPIW) for multistep-ahead forecasting, particularly at longer lead times. These outcomes are consistent with prior studies that found GAN-based augmentation beneficial for time-series forecasting. Nonetheless, a small reduction in PICP highlights the sharpness–coverage trade-off when intervals become narrower. The approach shows promise for enhancing resilience to extremes but still relies on the representativeness of historical data and careful validation to avoid overfitting or bias. The method may struggle with unprecedented events that deviate from historical patterns (e.g., the 2021 Ahrtal event), underscoring the need to further integrate physics and broaden scenario coverage.
Conclusion
The study presents MC-TSGAN, a TimeGAN-based model augmented with mass, energy, and hydraulic constraints to generate physically consistent synthetic extreme flood events. Synthetic augmentation improved multistep-ahead forecasting accuracy (higher NSE/KGE) and probabilistic performance (lower CRPS and MPIW), while maintaining close statistical similarity to real data. Visualization analyses confirmed strong alignment and diversity in synthetic events. Although improvements are demonstrated, especially for longer horizons, the approach requires robust validation and careful handling of the sharpness–coverage trade-off. Future work should enhance spatial–temporal correlation modeling, explore input/architecture sensitivities, expand validation and sensitivity studies, and address limitations in modeling unprecedented extremes.
Limitations
- Potential bias and limited variability: Synthetic data inherit biases from the training set and may underrepresent rare, unprecedented extremes.
- Interpretability: GAN-generated sequences can lack direct physical interpretability, complicating trust and adoption.
- Data dependence: The quality and representativeness of the training data strongly affect synthetic-data realism and downstream forecasting performance.
- Validation needs: Thorough validation under diverse scenarios is required to ensure robustness and avoid overfitting.
- Coverage vs. sharpness trade-off: Narrower intervals from augmentation can reduce coverage (lower PICP).
- Generalization to unprecedented events: Data-driven models may struggle with events outside historical distributions (e.g., the 2021 Ahrtal flood) and with nonstationary drivers (climate and land-use change).