Introduction
Harmful algal blooms (HABs) pose a significant threat to aquatic ecosystems and human health worldwide and cause substantial economic losses, particularly in the United States, where estimated annual damages exceed $4.6 billion. These blooms, often caused by microscopic algae or phytoplankton (including cyanobacteria), disrupt the environment and can contaminate seafood, resulting in shellfish closures, fish mortalities, and reduced seafood consumption. Accurate early-season prediction of HAB severity is crucial for effective mitigation. Lake Erie, the fourth largest of the Great Lakes by surface area, is particularly vulnerable because it is shallow and receives nutrient runoff from agricultural and urban sources. Conventional prediction models, which rely primarily on nutrient loading (e.g., total phosphorus), often fail to predict extreme HAB events accurately. This study addresses that limitation by incorporating large-scale climate indices into a machine learning model, aiming to improve the accuracy and timeliness of HAB predictions in Lake Erie.
Literature Review
Previous research has focused on nutrient loading as the primary driver of HABs in Lake Erie, with total phosphorus (TP) identified as a key factor. However, linear and non-linear regression models based solely on TP and water discharge have proven insufficient to explain the severity of extreme bloom events, such as the 2011 bloom. This shortfall points to the influence of other factors, including meteorological and large-scale climate conditions. While studies have examined the impact of temperature and nutrients on algal growth and toxin production, the relationship between HABs and large-scale atmospheric patterns has been less explored. Some research has linked HAB events to atmospheric teleconnections, suggesting the potential influence of broader climatic factors. For instance, studies have linked HAB events in southwest Europe to the North Atlantic Oscillation and Arctic Oscillation, and HABs in Florida estuaries to hurricanes and El Niño events. Existing models that incorporate factors such as tributary discharge, soluble reactive phosphorus, and wind stress explain a portion of the interannual variability of Lake Erie hypoxia but fall short of capturing the full complexity of HAB dynamics. The complex interactions between chemical, biological, hydrological, and meteorological processes necessitate sophisticated predictive models.
Methodology
This study employs a machine learning approach using Genetic Algorithms (GAs) to develop predictive models for the Lake Erie HAB severity index (SI), a non-dimensional index representing maximum bloom biomass. Three models were developed: GA-chem (using nutrient loading data from March to May), GA-clim (using large-scale climate indices from November of the previous year to May of the current year), and GA-chem-clim (combining both nutrient loading and climate indices). The GA models iteratively search a vast space of possible mathematical functions to find the best fit between the predictor variables and the observed SI. A jackknife procedure was used to assess model uncertainty and cross-validate the results, with 18 years of data used for training and the remaining year for testing. The climate indices considered include the Pacific-North American (PNA) pattern, the Southern Oscillation Index (SOI) as a measure of the El Niño-Southern Oscillation (ENSO), and the Pacific Decadal Oscillation (PDO). Chemical loading data included total phosphorus (TP), soluble reactive phosphorus (SRP), total Kjeldahl nitrogen (TKN), total nitrogen (TN), chlorides (CL), total suspended solids (TSS), silica dioxide (SD), and sulfate (SL). Large-scale atmospheric circulation patterns from the ECMWF ERA5 reanalysis were also analyzed to investigate the role of climate variability in HAB events, by comparing geopotential height and wind anomalies between years with low and high SI values. Finally, additional experiments with a shorter training period (15 years for training and 4 years for prediction) tested the models' ability to predict years with extreme HAB events.
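The jackknife cross-validation described above can be sketched as follows. This is a minimal illustration, not the study's code: an ordinary least-squares line on a single predictor stands in for the GA-derived function, and the function names (`fit_line`, `jackknife_predictions`) and all data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept.

    Assumption: this simple fit is only a stand-in for the function
    that the GA would search for; it is not the study's model.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Hold out each year in turn, fit on the rest, predict the held-out year."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


# Hypothetical values for illustration only (not the study's data):
# one spring nutrient-load predictor and the observed severity index.
spring_load = [0.8, 1.1, 2.4, 0.9, 1.7, 2.0]
observed_si = [1.5, 2.5, 9.0, 2.0, 5.5, 6.0]

for year, (obs, pred) in enumerate(
    zip(observed_si, jackknife_predictions(spring_load, observed_si))
):
    print(f"held-out year {year}: observed={obs:.1f}, predicted={pred:.2f}")
```

With 19 years of data, each pass of the loop would train on 18 years and test on the remaining one, matching the split described in the text. The spread of the held-out errors gives a rough sense of model uncertainty.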
Key Findings
The GA-chem-clim model, which incorporates both nutrient loading and climate indices, significantly outperformed the GA-chem model (nutrient loading only) and the GA-clim model (climate indices only). It showed a lower root mean square error (RMSE) and a higher correlation with the observed SI. The improvement was particularly notable in years with high SI values, where the GA-chem model struggled. Specifically, the RMSE fell from 2.67 in the GA-chem model to 2.26 in the GA-chem-clim model, while the correlation rose from 0.53 to 0.67. The analysis of large-scale atmospheric circulation patterns revealed distinct differences in geopotential height and wind anomalies between years with mild (SI < 2) and severe (SI > 7) HABs. Years with severe blooms were associated with lower geopotential heights, indicating greater intrusions of Arctic air masses during the preceding winters. This suggests a strong influence of large-scale climate patterns on HAB severity. The inclusion of local meteorological data (temperature and wind speed) showed potential for further model refinement. The model identified total phosphorus (TP) as the dominant predictor of HAB severity, along with other key predictors such as SRP, TSS, and CL. The models were capable of predicting the SI by early June, providing valuable lead time for mitigation efforts.
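For readers who want to reproduce this kind of comparison, the two skill scores quoted above can be computed as in the sketch below. It assumes the correlation is the Pearson coefficient, which the text does not specify. The function names and the sample values are hypothetical and are not the study's data.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed; the source says only 'correlation')."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


# Hypothetical severity-index values for illustration only.
observed_si = [1.0, 3.5, 10.0, 2.0, 6.5]
predicted_si = [1.5, 3.0, 8.0, 2.5, 7.0]

print("RMSE:", rmse(observed_si, predicted_si))
print("Correlation:", pearson_r(observed_si, predicted_si))
```

A lower RMSE together with a higher correlation, as reported for GA-chem-clim relative to GA-chem, indicates predictions that are both closer to the observed SI and better at tracking its year-to-year variation.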
Discussion
The findings demonstrate that large-scale climate indices contribute significantly to the accuracy of HAB prediction in Lake Erie. The superior performance of the GA-chem-clim model highlights the limitations of relying solely on nutrient loading data to predict extreme HAB events. The analysis of atmospheric circulation patterns suggests a mechanism linking large-scale climate variability to HAB development. The ability to predict HAB severity by early June gives policymakers substantial lead time to implement mitigation strategies, such as restrictions on shellfish harvesting and enhanced toxin monitoring in shellfish. Such an early warning system has important implications for public health and the economy. The identified predictors (TP, SRP, TSS, CL) provide insight into the factors driving HABs and can inform targeted management strategies.
Conclusion
This study demonstrates improved prediction of Lake Erie HAB severity using a GA model that incorporates both nutrient loading and large-scale climate indices. The GA-chem-clim model outperforms models that use only nutrient data or only climate data, especially for high-severity years. Prediction by early June allows for timely mitigation. Future research could incorporate additional environmental variables, develop probabilistic models, and apply this approach to other water bodies.
Limitations
The study is limited by the relatively short 19-year dataset available for model training and validation. This could affect the model's generalizability and its ability to predict very rare events accurately. The current model is deterministic and does not account for the inherent stochasticity of HAB development. Incorporating stochastic elements could improve predictions and would allow prediction uncertainty to be quantified. Further refinement may be achieved by including additional factors that influence HAB severity.