Introduction
Advances in artificial intelligence (AI) and machine learning (ML) have led to the development of machine learning weather prediction (MLWP) systems. Traditional numerical weather prediction (NWP) models solve the nonlinear primitive equations, parameterize sub-grid physical processes, and rely heavily on supercomputing and data assimilation. Recent interest has grown in building MLWP models directly from atmospheric data with AI/ML techniques. Early studies showed potential, but coarse spatial resolution limited their performance. A significant breakthrough came in 2022 with the development of global MLWP models, including FourCastNet, SwinRDM, Pangu-Weather, GraphCast, FuXi, and FengWu, each employing a different AI/ML approach. Although these models have shown promise, a homogeneous comparison is lacking. The Central Weather Administration (CWA), responsible for weather forecasting in Taiwan, uses global forecast fields to drive regional models. This study aims to provide a comprehensive comparison of five prominent global MLWP models (Pangu-Weather, FourCastNet v2 (FCN2), GraphCast, FuXi, and FengWu) within the East Asia and Western Pacific region, focusing on their performance in synoptic-scale predictions and typhoon forecasting.
Literature Review
The paper reviews the history of AI and ML, highlighting key milestones and recent advancements driven by increased data availability, computational power, and improved AI tools. It discusses the application of ML in various fields, particularly its suitability for addressing complex challenges in physics and Earth science, including numerical weather prediction. The literature review covers the growing interest in MLWP, summarizing the development and capabilities of several global MLWP models: FourCastNet (and its upgraded version, FCN2), SwinRDM, Pangu-Weather, GraphCast, FuXi, and FengWu. It also notes previous comparative studies, highlighting both the promising results and limitations of existing MLWP models in terms of global metrics, extreme events, and especially tropical cyclone intensity prediction. The review emphasizes the need for a comprehensive and homogeneous comparison of these models, particularly concerning tropical cyclone prediction skills.
Methodology
This study performs a homogeneous comparison of five prominent global MLWP models: Pangu-Weather, FCN2, GraphCast, FuXi, and FengWu. All models were run from identical initial conditions taken from ERA5 reanalysis data, for simulations covering the East Asia and Western Pacific region from June to November 2023. The study used a consistent 13-level configuration and a 6-hour time step across all models, with code provided by the model developers. The evaluation metrics were latitude-weighted Root Mean Square Error (RMSE) and Anomaly Correlation Coefficient (ACC) for several atmospheric variables (500 hPa geopotential height, 850 hPa and 2 m temperatures, and 10 m zonal wind). Typhoon track and intensity predictions were also evaluated for 11 typhoons (excluding three short-lived ones) against CWA's best track data. A multi-model ensemble was created by averaging predictions from the five MLWP models. The study also assessed model bias by analyzing the position and strength of the Western Pacific Subtropical High (WPSH). A case study of Typhoon Haikui compared track and intensity predictions from the five MLWP models and ECMWF's Integrated Forecasting System (IFS), and analyzed the relationship between predicted tracks and the WPSH position. Finally, the study evaluated rainfall predictions from FuXi and GraphCast and compared them against IFS, TWRF (15 km), ERA5, and QPESUMS data. Statistical significance testing (Mann-Whitney U test) was applied to typhoon track and intensity predictions, comparing Pangu-Weather and FengWu against the other models.
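The summary does not give the formulas behind the two scores, so the following is only a minimal sketch. It assumes the convention common in MLWP benchmarks: weights proportional to cos(latitude), normalised to a mean of 1 over the evaluated latitudes, and ACC computed on anomalies relative to a climatology. The function names and the list-of-rows field layout are illustrative, not taken from the study.

```python
import math


def latitude_weights(lats_deg):
    """cos(latitude) weights, normalised so they average to 1 (assumed convention)."""
    cosines = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_cos = sum(cosines) / len(cosines)
    return [c / mean_cos for c in cosines]


def weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE; each field is a list of rows, one row per latitude."""
    weights = latitude_weights(lats_deg)
    total, count = 0.0, 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


def weighted_acc(forecast, truth, climatology, lats_deg):
    """Latitude-weighted correlation of forecast and truth anomalies."""
    weights = latitude_weights(lats_deg)
    num = f_var = t_var = 0.0
    for w, f_row, t_row, c_row in zip(weights, forecast, truth, climatology):
        for f, t, c in zip(f_row, t_row, c_row):
            f_anom, t_anom = f - c, t - c
            num += w * f_anom * t_anom
            f_var += w * f_anom ** 2
            t_var += w * t_anom ** 2
    return num / math.sqrt(f_var * t_var)
```

The weighting matters because grid cells on a regular latitude-longitude grid cover less area toward the poles; without it, high-latitude errors would count as much as errors in the much larger low-latitude cells.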
Key Findings
The results showed FengWu as the best-performing model based on RMSE and ACC scores, followed by FuXi and GraphCast; Pangu-Weather performed the worst. The multi-model ensemble performed comparably to FengWu. In typhoon track prediction, FengWu was the most accurate, particularly at shorter lead times, though it had the largest intensity errors. Pangu-Weather had the largest average track error but performed best for some individual typhoons. IFS generally fell in the middle of the performance range. The multi-model ensemble reduced the range of error for tropical cyclone track predictions. Analysis of the WPSH showed weak biases across all models, with FengWu having the smallest bias. The case study of Typhoon Haikui showed a strong relationship between predicted tracks and the WPSH position, and IFS showed the largest track errors in the early stages of Haikui's prediction. The MLWP models predicted smaller rainfall amounts than IFS, TWRF, and QPESUMS, but their rainfall patterns were generally accurate. FCN2 was the earliest to predict the formation of Typhoon Haikui, followed by the other models within 4-5 days.
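The summary does not state how track error was computed. A common choice, assumed here, is the great-circle distance between the forecast storm centre and the best-track centre at each matching valid time, averaged over the forecast. The sketch below uses the haversine formula; the Earth radius constant, the function names, and the `(lat, lon)` tuple layout are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_track_error_km(forecast_track, best_track):
    """Average centre-position error over matching valid times.

    Both tracks are lists of (lat, lon) tuples in the same time order.
    """
    errors = [
        great_circle_km(f_lat, f_lon, b_lat, b_lon)
        for (f_lat, f_lon), (b_lat, b_lon) in zip(forecast_track, best_track)
    ]
    return sum(errors) / len(errors)
```

Computing this per lead time, rather than over the whole forecast, gives the kind of error-versus-lead-time comparison described above.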
Discussion
The findings show that the capabilities of these MLWP models vary, reflecting the strengths and weaknesses of their different approaches. By comparing the models homogeneously in a specific region and timeframe, the study offers operational forecasters a basis for model selection and points to multi-model ensembles as a way to improve prediction accuracy. The comparison also exposes the limitations of current MLWP models, particularly in capturing the intensity of tropical cyclones and high-resolution rainfall patterns. The high computational efficiency of MLWP models makes them attractive for operational use and ensemble prediction. The study suggests that high-resolution regional models are still needed to refine detailed meteorological information, and that MLWP models may require higher resolution to represent multi-scale processes accurately.
Conclusion
This study provides a comprehensive evaluation of five leading global MLWP models in East Asia and the Western Pacific. FengWu demonstrates superior performance for synoptic-scale predictions and typhoon track forecasting, while the multi-model ensemble provides a robust alternative. However, limitations remain in accurately predicting typhoon intensity and high-resolution rainfall. Future research should focus on increasing resolution, incorporating additional data types (such as satellite imagery), and improving model designs to address these limitations. The rapid advancements in MLWP suggest a future where such models will play an increasingly critical role in operational forecasting and extreme weather prediction.
Limitations
The study's use of ERA5 as the initial condition for all models, although ensuring consistency, may not fully reflect real-world operational scenarios. The comparison focuses on a specific region and time period, limiting the generalizability of the findings. The relatively simple multi-model ensemble approach might not fully exploit the potential benefits of such techniques. The availability of precipitation forecasts varied among the models, restricting a more thorough comparison of rainfall predictions. Finally, the analysis of the interaction between typhoons and the WPSH only considers a first-order influence.
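For reference, the "relatively simple" ensemble described in the methodology averages the predictions of the five models. A minimal sketch of such an average is shown below, assuming equal weights and fields already on a common grid; the summary confirms neither detail, and the function name is illustrative.

```python
def ensemble_mean(fields):
    """Equal-weight, point-by-point average of same-shaped 2-D fields.

    `fields` is a list with one field per model; each field is a list of rows.
    Equal weighting is an assumption, not a detail confirmed by the source.
    """
    n_models = len(fields)
    return [
        [sum(values) / n_models for values in zip(*rows)]
        for rows in zip(*fields)
    ]
```

An equal-weight mean treats every model as equally skilful at every point and lead time, which is the sense in which this approach may leave some of the ensemble's potential benefit unused.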