Introduction
Deoxynivalenol (DON), a mycotoxin produced by Fusarium species, contaminates oats and poses risks to human and animal health. The European Commission sets maximum limits for DON in oats, and in Sweden, high DON levels in oats have prompted extensive monitoring, incurring significant costs for farmers, collectors, and food safety authorities. Early prediction of high-DON regions could optimize crop protection strategies, support risk-based monitoring, and prevent contaminated oats from entering the food chain. Weather conditions (temperature, humidity, precipitation) significantly affect DON contamination by influencing fungal growth and DON production. Agronomic factors (crop variety, crop rotation, soil type, elevation, geolocation) also play a role. Previous studies that used weather data alone or statistical models had limited predictive power. This study aimed to improve DON prediction by incorporating weather, agronomic, and site-specific data within machine learning algorithms, providing regionally specific risk assessments for different stakeholders in the oat supply chain. The goal was to develop models that accurately forecast DON contamination levels, to understand the impact of individual features, and to explore the collective effect of multiple management practices for DON mitigation.
Literature Review
Prior research on DON prediction in oats largely focused on weather data, neglecting the potential of agronomic factors to enhance model accuracy. Some studies suggested combining weather data with agronomic factors to improve predictions. However, the integration of comprehensive datasets including weather, agronomic, and site-specific variables for early DON forecasting in oats remains limited. Existing studies using machine learning for mycotoxin prediction often lack model explainability, making it challenging to interpret results and inform decision-making. While studies have shown high prediction accuracy using machine learning for other mycotoxins in different grains, the application to DON in oats requires further investigation.
Methodology
Three predictive models, Start of Season (SS), Mid-Season (MS), and Full Season (FS), were developed using a Random Forest (RF) algorithm. The models aimed to provide regional DON contamination predictions (low, medium, high) at different stages of the oat growing season (November to June for SS, November to July for MS, and November to August for FS). The data comprised eight years (2012-2019) of DON contamination records from Lantmännen elevators in Sweden (54,350 records), weather data from the Swedish Meteorological and Hydrological Institute (SMHI), and agronomic and site-specific data (2016-2017) from various sources. Dataset 1 combined weather and crop variety data (2012-2019, excluding 2016); it was split 80/20 for training and internal validation, and the 2016 data were used for external validation. Dataset 2 combined weather, crop variety, agronomic, and site-specific data (2016-2017); it was also split 80/20 for training and internal validation, but had no external validation because the data were limited. Model performance was assessed using five-fold cross-validation, confusion matrices, and classification accuracy. Feature importance was analyzed using SHAP values to understand how each variable contributed to the predictions.
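The validation scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the study's code: the record fields (`year`, `don_class`), the helper names, and the synthetic records are all hypothetical, and no Random Forest is trained here. The sketch only shows how a held-out year, an 80/20 split, five folds, a confusion matrix, and classification accuracy fit together.

```python
import random
from collections import Counter

LABELS = ["low", "medium", "high"]  # DON classes named in the text


def split_holdout_year(records, holdout_year):
    """Separate one year (external validation) from the remaining years."""
    modelling = [r for r in records if r["year"] != holdout_year]
    external = [r for r in records if r["year"] == holdout_year]
    return modelling, external


def train_validation_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it into train / internal validation."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds over n shuffled items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def confusion_matrix(true_labels, predicted_labels, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of all predictions that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Synthetic placeholder records, NOT data from the study.
records = [{"year": 2012 + i % 8, "don_class": LABELS[i % 3]} for i in range(40)]

modelling, external = split_holdout_year(records, holdout_year=2016)
train, internal = train_validation_split(modelling)
folds = list(k_fold_indices(len(train), k=5))

# Hand-written example predictions, standing in for a classifier's output.
true_classes = ["low", "low", "medium", "high", "high"]
predicted_classes = ["low", "medium", "medium", "high", "low"]
matrix = confusion_matrix(true_classes, predicted_classes)
print(matrix)
print(accuracy(matrix))
```

In a real pipeline the hand-written predictions would come from a classifier fitted on `train` and scored on `internal` and `external`. Keeping the held-out year entirely separate from the shuffled split is what makes it an external check.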
Key Findings
The RF models demonstrated high prediction accuracy for DON contamination levels, ranging from 0.72 to 0.96 depending on the model and dataset. The SS model showed promising results even though its predictions are made in June. Weather variables (rainfall, relative humidity, and wind speed at different growth stages) were the most important predictors, but incorporating crop variety, elevation, and other agronomic factors significantly improved model performance (Dataset 1: 0.72 to 0.73 accuracy; Dataset 2: 0.81 to 0.95 accuracy). SHAP analysis revealed the impact of individual features: higher December rainfall increased DON levels, while lower December rainfall and August temperatures decreased DON levels. The varieties Galant and Belinda showed lower DON levels than Kerstin. High elevation and sandy soils contributed to higher DON levels, potentially due to drought stress. While the models performed well in internal validation, external validation using a leave-one-year-out approach showed reduced performance, highlighting potential year-to-year variability in DON contamination.
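To make the idea behind the SHAP analysis concrete, the sketch below computes exact Shapley values by brute force for a toy scoring function. It is an illustration only: the study applied SHAP to Random Forest models, whereas this example enumerates every feature coalition directly and does not use the SHAP library. The function `toy_risk_score`, its coefficients, the feature names, and the baseline and field values are all invented for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every coalition of features.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return model(mixed)

    contributions = {}
    for feature in names:
        others = [f for f in names if f != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {feature})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        contributions[feature] = total
    return contributions


def toy_risk_score(x):
    """Hypothetical DON risk score; coefficients are invented, not fitted."""
    score = 0.02 * x["december_rain_mm"] + 0.05 * x["august_temp_c"]
    # Invented interaction term between elevation and rainfall.
    if x["elevation_m"] > 100 and x["december_rain_mm"] > 60:
        score += 0.5
    return score


# Hypothetical reference point and one hypothetical field to explain.
baseline = {"december_rain_mm": 50.0, "august_temp_c": 16.0, "elevation_m": 50.0}
field = {"december_rain_mm": 80.0, "august_temp_c": 14.0, "elevation_m": 150.0}

phi = shapley_values(toy_risk_score, field, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the score for the explained field and the score at the baseline. That additivity is what allows a per-feature reading such as "higher December rainfall pushed the prediction up". Tree-specific SHAP implementations obtain the same kind of attribution without enumerating every coalition.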
Discussion
This study's findings demonstrate the effectiveness of machine learning for predicting regional DON contamination in Swedish oats. The high accuracy of the developed models provides valuable risk assessments for farmers, crop collectors, and food safety authorities, aiding in proactive management of DON and promoting risk-based testing. The importance of weather variables aligns with previous research. However, this study's inclusion of agronomic and site-specific data significantly enhanced model performance, confirming the need for a holistic approach. The results suggest the possibility of early season predictions (June), enabling timely interventions for DON mitigation. The SHAP analysis provides valuable insights into the individual and collective influence of various factors on DON levels, guiding future research and best practices.
Conclusion
This study successfully developed high-accuracy predictive models for regional DON contamination in Swedish oats using machine learning. The models provide valuable tools for stakeholders, facilitating risk-based management and resource allocation. However, the models' performance varied depending on the years involved, highlighting the need for ongoing data collection to improve predictive accuracy. Future research should investigate the impact of additional factors (fertilization, irrigation, fungicide use, harvest conditions), consider multi-mycotoxin prediction, and explore the use of satellite imagery for data enrichment.
Limitations
Several limitations exist. Because of data limitations, the study could not include all biologically relevant factors, such as fertilization, irrigation, pest control, fungicide use, and harvest conditions. The models showed reduced performance when predicting years not included in the training data, suggesting potential year-to-year variability in DON contamination. The study focused solely on DON and did not consider other mycotoxins. Data on DON contamination were not publicly available due to sensitivity concerns regarding individual farmers.