Introduction
Global warming and its associated extreme weather events pose significant threats to society. Climate variability, which is projected to increase nonlinearly, is a key factor governing these events, yet General Circulation Models (GCMs) currently struggle to represent its magnitude accurately. Investigating climate variability across scales and conditions requires long, global time series. However, historical climate data are scarce, particularly for variables beyond near-surface temperature and for the Southern Hemisphere. Existing climate reconstructions often rely on assumptions of linearity and stationarity, limiting their accuracy, while climate reanalyses offer high-resolution data but are computationally expensive. This study leverages recent advances in artificial intelligence, specifically deep learning, to address these limitations: deep-learning algorithms handle nonlinear data effectively and offer a good balance between accuracy and computational cost. While deep learning has shown promise in climate science for gridded data and time series, its application to reconstructing full climate fields from sparse local data remains relatively unexplored. This research introduces a novel approach that uses a simple recurrent neural network (RNN) to reconstruct global climate fields from limited local data, prioritizing efficiency and robustness.
Literature Review
The climate community has developed several methods for reconstructing climate and weather data, including kriging, principal component analysis, and Bayesian algorithms. These methods have yielded high-resolution datasets, often focused on specific regions, and have been improved by optimizing input data locations with meta-heuristic algorithms. However, most existing methods rely on stationarity and linearity assumptions. Climate reanalyses, while providing four-dimensional datasets, are constrained by data availability and are computationally expensive due to their reliance on GCM outputs. Recent progress in applying artificial intelligence (AI) tools to climate science, particularly deep learning, shows promise in extracting features, forecasting time series, and representing physical systems. So far, however, deep learning has been applied to reconstructing gridded data or individual time series rather than full climate fields from sparse data.
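For concreteness, the linear baseline family referenced above can be sketched in a few lines. The snippet below is a minimal principal component regression (PCR) on synthetic data; the array shapes, the number of retained components, and all variable names are illustrative assumptions rather than the configuration of any specific study.

```python
# Minimal PCR sketch: regress sparse "station" anomalies onto the leading
# principal components of a full field, then project back onto the EOFs.
# All shapes and the truncation k are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_time, n_stations, n_grid = 500, 25, 2048
field = rng.standard_normal((n_time, n_grid))          # target anomaly field
station_idx = rng.choice(n_grid, n_stations, replace=False)
stations = field[:, station_idx]                       # sparse station samples

# 1) Truncated EOF/PCA of the full field over a calibration period.
k = 10                                                 # retained components (assumed)
clim = field.mean(axis=0)
u, s, vt = np.linalg.svd(field - clim, full_matrices=False)
pcs = u[:, :k] * s[:k]                                 # principal component series
eofs = vt[:k]                                          # fixed spatial patterns

# 2) Linear regression from station anomalies to the leading PCs.
stations_c = stations - stations.mean(axis=0)
coef, *_ = np.linalg.lstsq(stations_c, pcs, rcond=None)

# 3) Reconstruction: predicted PCs projected back onto the EOF patterns.
recon = stations_c @ coef @ eofs + clim
print("reconstruction RMSE:", np.sqrt(np.mean((recon - field) ** 2)))
```

Because the EOF patterns and regression coefficients are fixed after calibration, such a method inherits exactly the stationarity and linearity assumptions noted above.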
Methodology
This study employs a novel approach using a simple recurrent neural network (RNN) to reconstruct global fields from sparse local data. The workflow involves three stages: training, validation, and testing.

For training, monthly near-surface temperature data from three gridded climate datasets were used: NOAA's 20th Century Reanalysis version 3 (20CRv3), the Max Planck Institute for Meteorology Grand Ensemble (MPI-GE) GCM, and the National Center for Atmospheric Research Community Earth System Model Last Millennium Ensemble (CESM-LME). These datasets differ in origin, approach, and timeframe, allowing analysis of how the training data affect the reconstruction. From each dataset, monthly temperature anomalies (relative to 1951–1980) were calculated at 25 pseudo-station locations, chosen to mimic the realistic distribution of historical meteorological stations, which lie primarily in the Northern Hemisphere; a nearest-neighbor approach extracted the grid-cell temperature for each location. Several deep-learning models (simple RNN, long short-term memory (LSTM), gated recurrent unit (GRU), and one-dimensional convolutional (Conv1D) networks) were tested with different architectures (layer counts, neuron counts, dropout values), yielding 140 models in total.

For validation, 1000 time steps withheld from training were used to evaluate model performance, focusing on the mean squared error (MSE) and the Pearson correlation coefficient (R). The testing phase reconstructed global temperature anomaly fields for 1602–2003 CE using 2 m temperature data from 25 locations extracted from the EKF400v2 paleo-reanalysis. The LSTM architecture, chosen based on the validation results, was compared against independent test datasets (20CRv3 and EKF400v2). A Principal Component Regression (PCR) reconstruction based on the MPI-GE dataset was also created for comparison; unlike the RNN approach, PCR requires a calibration period in which the pseudo-station records overlap with the training data. Reconstruction skill was assessed using anomaly correlations, mean temperature biases, and the ability to reproduce patterns of variability. The RNN reconstructions were also compared with independent products and time intervals, acknowledging that the baseline datasets (EKF400v2 and the Last Millennium Reanalysis version 2, LMRv2) are themselves imperfect reconstructions with their own uncertainties.
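As a rough illustration of this setup, the sketch below builds a one-layer LSTM in Keras that maps the 25 pseudo-station anomaly series onto a flattened global grid and trains it with an MSE loss. Only the 25-station input, the dropout usage, and the MSE metric come from the summary above; the sequence length, grid resolution, layer width, and training data here are placeholder assumptions, not the published configuration.

```python
# Hedged sketch of a station-to-field sequence model (assumed setup).
import numpy as np
import tensorflow as tf

n_stations = 25        # pseudo-station inputs (as in the study)
seq_len = 12           # months of temporal context per sample (assumed)
n_grid = 32 * 64       # flattened global grid, 32 lat x 64 lon (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len, n_stations)),
    tf.keras.layers.LSTM(64, dropout=0.2),  # ~20% dropout for small N, per the findings
    tf.keras.layers.Dense(n_grid),          # one anomaly value per grid cell
])
model.compile(optimizer="adam", loss="mse")  # MSE loss, matching the validation metric

# Synthetic stand-in data, just to show the fit/predict interface.
x = np.random.randn(1980, seq_len, n_stations).astype("float32")
y = np.random.randn(1980, n_grid).astype("float32")
model.fit(x, y, epochs=2, batch_size=32, validation_split=0.1)

field = model.predict(x[:1]).reshape(32, 64)  # back to a lat-lon grid
```

A single recurrent layer feeding a dense output keeps the parameter count, and hence the training cost, low, which is consistent with the study's emphasis on efficiency.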
Key Findings
The study found that single-layer recurrent architectures (GRU and LSTM) generally performed best, even with small training datasets. Overfitting was observed for the small training size (N=1980) and was mitigated by dropout rates of at least 20%; increasing the training sample size to N=20,000 substantially reduced overfitting, allowing lower dropout rates of around 5%. In terms of validation loss, simple models outperformed more complex ones: the best models were one-layer GRU and LSTM networks with 32–64 neurons for smaller training sizes and up to 256 neurons for larger ones, while adding a second layer increased computational cost without significant performance gains. Evaluation on 1000 unseen time steps confirmed that the simpler models, although prone to overfitting, achieved the lowest MSE scores, with 5–10% dropout yielding very good performance; the correlation analysis gave similar results.

A comparison with a PCR reconstruction showed that the LSTM outperformed PCR globally, particularly in the northern extratropics, while PCR performed slightly better in the tropics and parts of the extratropical oceans. Spatial correlation analysis showed that the Southern Hemisphere and tropical regions were more challenging to reconstruct, particularly in boreal summer; correlation skill was higher in boreal winter, which the study attributes to stronger Northern Hemisphere planetary wave interactions and differences in climate drivers between the seasons. Increasing the training sample size improved reconstruction skill in various regions, whereas adding noise to the input data (by using individual EKF400v2 members instead of the ensemble mean) decreased the correlation coefficients. Analysis of temperature biases revealed patterns that vary with the training dataset and with the baseline used for comparison (EKF400v2 or LMRv2).

Case studies of early 19th-century cold seasons demonstrated that the RNN reconstruction produced patterns reasonably similar to EKF400v2 and to a Bayesian cold-season reconstruction, although generally with slightly reduced anomaly magnitudes. Finally, the leading empirical orthogonal functions (EOFs) of the reconstructed 2 m temperature anomalies showed realistic patterns of variability, with the 20CRv3-based reconstruction most closely resembling EKF400v2.
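The gridpoint-wise skill metric behind these comparisons can be made concrete with a short sketch. The function below computes the Pearson correlation between a reconstruction and a baseline at every grid cell across the time axis; the array shapes and the synthetic stand-in data are assumptions for illustration.

```python
# Sketch of a per-gridpoint anomaly correlation map (assumed shapes).
import numpy as np

def anomaly_correlation(recon, baseline):
    """Pearson r at each grid cell, computed across the time axis (axis 0)."""
    ra = recon - recon.mean(axis=0)
    ba = baseline - baseline.mean(axis=0)
    num = (ra * ba).sum(axis=0)
    den = np.sqrt((ra ** 2).sum(axis=0) * (ba ** 2).sum(axis=0))
    return num / den

rng = np.random.default_rng(1)
recon = rng.standard_normal((480, 32, 64))             # 40 years of monthly fields (assumed)
baseline = recon + 0.5 * rng.standard_normal(recon.shape)
r_map = anomaly_correlation(recon, baseline)           # (32, 64) map of Pearson r
print(r_map.shape, float(r_map.mean()))
```

Splitting the time axis by season before calling such a function would reproduce the kind of boreal winter versus summer skill comparison described above.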
Discussion
The findings demonstrate the effectiveness of a simple RNN approach for generating fast, robust climate reconstructions with minimal computational resources. The method successfully reconstructs global temperature anomaly fields from sparse local data, surpassing an established linear method (PCR) in several key regions, especially the northern extratropics. The RNN's superior performance there is attributed to its ability to capture the time-dependent nature of large-scale atmospheric dynamics and its flexibility in handling nonlinear features, whereas its limitations in tropical regions highlight the need to incorporate multiple time scales or additional data to better capture tropical climate processes. While limitations remain, particularly concerning the representation of certain regions and the potential impact of data biases, the methodology shows great potential for a range of applications in climate science. Its computational efficiency makes climate research more accessible and energy-efficient, contributing to the United Nations Sustainable Development Goals.
Conclusion
This study demonstrates the feasibility of using a simple RNN to generate accurate and efficient global temperature reconstructions from limited data. The approach outperforms established linear methods in several key regions and offers computational efficiency and adaptability. Future research should focus on incorporating more data, optimizing station locations, adding seasonal information, and exploring more advanced deep-learning architectures. The method's accessibility could help decentralize climate expertise and promote broader participation in climate research.
Limitations
The study acknowledges several limitations. The pseudo-station data, while designed to be realistic, do not capture the full complexity of real-world station records. The limited number of pseudo-stations could affect the accuracy of the reconstruction, particularly in regions with sparse historical coverage. The reliance on existing climate models and reanalyses for training data introduces potential biases. The study focuses solely on temperature anomalies and may not generalize to other climate variables. Finally, the relatively simple RNN architectures used may not fully capture the complexities of climate dynamics.