An agricultural digital twin for mandarins demonstrates the potential for individualized agriculture

S. Kim and S. Heo

This study by Steven Kim and Seong Heo builds an agricultural digital twin using mandarins as a model crop. By aggregating open data from Jeju Island and analyzing it at regional, inter-orchard, and intra-orchard scales, the research demonstrates the potential of digital twins for individualized agricultural practices.

Introduction
The paper addresses how an agricultural digital twin (DT) can support precision and individualized agriculture in open-field mandarins. DTs are virtual counterparts of physical systems, enabling simulation, modeling, and data-driven decision-making by integrating ICT, IoT, remote sensing, GIS, big data analytics, and AI. In agriculture, these technologies facilitate site-specific management by linking longitudinal and geospatial data, shifting from uniform to heterogeneous management. In Korea, a smart farm policy encouraged extending research from greenhouses to open fields; mandarins on Jeju Island were chosen as a model due to widespread cultivation, perennial nature, and clonal propagation that allows repeated, individual-level measurements over time. Despite policy budget cuts limiting new data collection, the authors compiled open datasets via APIs (soil chemistry, fruit quality, weather, and agricultural practices) and geocoded them to build and demonstrate an agricultural DT. The study’s purpose is to integrate multi-source data and evaluate DT-driven analyses at regional, inter-orchard, and intra-orchard scales, assessing their utility for stakeholders and the feasibility of individualized (tree-level) management.
Literature Review
The authors review DT applications in smart farming and related sectors. Prior work includes IoT platforms like SmartFarmNet for automated data collection and API-based analytics; UAV-based intrusion detection DTs for crop protection over 5G; and an orchard-scale DT using 3D LiDAR to continuously monitor tree health, structure, fruit quality, and predict stress, disease, and yield loss with scenario simulation. WebGIS frameworks aggregate geospatial data for regional to global agricultural analytics, and FIWARE-based platforms have supported agricultural DT development with open APIs and real-time processing. In crop management, deep learning (e.g., U-net) has achieved up to 99% performance in aerial tree canopy detection and localization, and DT approaches with satellite imagery have been used to predict forest change trends. Post-harvest DTs (e.g., mangoes) have modeled cold-chain quality evolution using computational fluid dynamics. Collectively, these studies show increasing sophistication and breadth of DT applications across the agricultural value chain.
Methodology
Data sources and collection: Open APIs from the Korean data portal were used to gather multi-source data. Soil data (2020–2022) from the Rural Development Administration (RDA) covered available phosphate, exchangeable K, Ca, and Mg, pH, organic matter, and electrical conductivity; 30,261 agricultural soil records in Jeju were retrieved, and 5,939 orchard soil records were analyzed. The Jeju Free International City Development Center (JDC) randomly selected 30 mandarin orchards (2021 season) and provided orchard-level weather (daily average temperature, relative humidity, and air pressure from on-orchard sensors), self-reported agricultural practices (type, amount, date, units, product names), and fruit quality. In each orchard, 100 trees were randomly selected; from the third week of October to the fourth week of November 2021, three fruits per tree per week (one each from the low, middle, and high positions) were destructively measured for sugar content (°Brix) and fruit size (mm), with tree IDs and timestamps recorded.
Geocoding and data integration: Addresses were geocoded via the Kakao Developers API; datasets were merged in R using packages for JSON/XML parsing and data wrangling. GIS shapefiles from the National Spatial Data Infrastructure Portal were used to visualize spatial distributions (R packages terra, maps, sp, sf).
Analysis scales: Regional-scale visualization used a 1-km grid created in QGIS and kernel density estimation (KDE) of soil components; temporal averages of fruit quality were plotted by orchard coordinates across three periods (late October, early November, late November). Smoothing splines described relationships of fruit quality to soil and weather variables, using distance-weighted soil approximations when an orchard's exact coordinates had no soil record.
Inter-orchard analyses included ridge plots of monthly practice frequencies, a comparative case study of two orchards with the same cultivar, boxplots of quality distributions across 27 orchards (N=39,679 fruits), and spaghetti plots of longitudinal means. Mixed-effects modeling treated orchard as a random effect and time as a fixed effect to quantify variance explained and temporal effects; multiple testing adjustments were applied. AutoML (h2o) with stacked ensembles predicted sugar content and fruit size from the following predictors: week/month, weather (temperature, humidity, air pressure), frequency of five practices (thinning, mulching, spraying, fertilization, pruning), and orchard index; performance was evaluated via R², RMSE, and MAE.
Intra-orchard analyses for a selected orchard (lab) included comparisons by fruit position, mixed-effects modeling for position effects, and hierarchical clustering of 100 trees based on longitudinal sugar and size trends to identify groups for targeted management.
Demonstration: An R Shiny app (Shiny DT) provides interactive panels (map, soil, weather, practice comparisons, distributions, and histories) at orchard and tree levels to showcase DT functionality.
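The distance-weighted soil approximation mentioned in the methodology can be illustrated with a small sketch. The summary does not state which weighting scheme the authors used, so the version below assumes plain inverse-distance weighting with a power of 2 and great-circle (haversine) distances. The function names, the power parameter, and the sample coordinates and pH values are all hypothetical, and the authors worked in R rather than Python.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def idw_estimate(target, samples, power=2.0):
    """Inverse-distance-weighted estimate of a soil value at target (lat, lon).

    samples is a list of (lat, lon, value) tuples. The weighting scheme and
    the default power are assumptions, not taken from the paper.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in samples:
        d = haversine_km(target[0], target[1], lat, lon)
        if d == 0.0:
            # A soil record exists at the exact coordinates: use it directly.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("at least one soil sample is required")
    return weighted_sum / weight_total

# Hypothetical soil pH records near a hypothetical orchard location.
soil_ph = [(33.40, 126.40, 5.0), (33.40, 126.60, 7.0), (33.30, 126.55, 6.2)]
orchard = (33.40, 126.50)
estimate = idw_estimate(orchard, soil_ph)
```

Nearby records dominate the estimate, and raising the power makes the weighting more local. An orchard that coincides with a soil record simply takes that record's value.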
Key Findings
- Regional soil patterns: KDE maps showed higher available phosphate, exchangeable K and Mg, pH, and electrical conductivity in western Jeju; eastern regions (volcanic ash soils) had higher exchangeable Ca and typically low OM and P, though observed OM exceeded recommendations likely due to organic fertilizer inputs. Western regions tended to be more alkaline; policy suggestions included adjusting fertilizer supply regionally.
- Spatiotemporal fruit quality: Average sugar >11.5 °Brix was rare in late October but common by mid- to late November; in late November, southern orchards (<33.4°N) had higher sugar than northern ones. Time and location layers differentially influenced sugar and size.
- Quality–factor relationships: Sugar content exhibited monotonic relations with some soil (Exch. K, Mg, pH, EC) and weather (temperature, humidity) variables, and non-monotonic relations with available phosphate, Exch. Ca, organic matter, and air pressure. Exch. Ca related more strongly to fruit size than sugar, suggesting limiting Ca fertilization to avoid oversized fruit. An inverse relationship between sugar content and fruit size was observed in several factor ranges. Air pressure between 0–5 atm associated with increased sugar and decreased size; >6 atm showed opposite trends.
- Inter-orchard variability: Across 27 orchards (N=39,679 fruits), median sugar content ranged 9.8–12.1 °Brix; median size ranged 43.9–67.6 mm, indicating greater between-orchard variability in size than sugar. Orchards producing the smallest and largest fruits were clearly distinguished; orchards with larger fruit tended to have lower sugar and vice versa.
- Temporal trends: Mixed-effects modeling showed mean sugar content increased significantly over time despite ongoing harvests (p<0.001); relative to week 10-3, increases were +0.309 (10-4), +0.308 (10-5), +0.658 (11-1), +1.047 (11-2), +1.248 (11-3), +1.435 (11-4) °Brix. Mean fruit size decreased after the initial week, with less clear patterning.
- Machine learning predictions (orchard-level): Stacked ensemble AutoML achieved sugar: RMSE 0.97, MAE 0.76, R² 0.43; size: RMSE 3.73 mm, MAE 2.96 mm, R² 0.84. Orchard index was the top predictor, followed by air pressure. Fruit size was substantially more predictable than sugar at the orchard level.
- Intra-orchard variability and position effects: Within the lab orchard, fruit position (high/middle/low) had minimal impact on sugar (high vs middle +0.020 °Brix, p=0.131; middle vs low +0.039 °Brix, p=0.000251) and no significant effect on size, suggesting little value in grading by position.
- Tree-level clustering: Hierarchical clustering grouped 100 trees into four clusters with distinct longitudinal sugar/size patterns; one cluster showed clear sugar increases, while all clusters showed size decreases. Low-sugar tree groups can be targeted with tailored practices (rain-shelter, irrigation control, foliar fertilization, pruning, thinning).
- Variance explained: Models using harvest time and five orchard-level practices explained 19% of sugar variance (R²=0.19); an inter-field mixed-effects model explained 38% (R²=0.38); a tree-level model explained 66% (R²=0.66), indicating substantially improved explanatory power with intra-orchard analysis and supporting individualized management.
- Two-orchard case (same cultivar): The eastern (lab) orchard had lower available phosphate and higher exchangeable Ca (volcanic ash soil traits); the western (Hab) orchard showed higher EC (nutrient excess risk). Practice timings differed. Lab generally had higher sugar (except late October) and larger fruit (except early November). Harvest timing recommendations to maximize prime-grade proportion: lab from late Oct to early Nov; Hab from early to late Nov.
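The temporal-trend estimates above are cumulative differences from the baseline week 10-3, so the week-to-week change is easier to see after differencing consecutive values. The sketch below does only that arithmetic on the reported numbers. The helper name is hypothetical, and reading the labels as month-week (10-4 as the fourth week of October) is an assumption based on the sampling window described in the methodology.

```python
# Reported cumulative sugar-content effects relative to week 10-3, in °Brix.
cumulative = {
    "10-4": 0.309,
    "10-5": 0.308,
    "11-1": 0.658,
    "11-2": 1.047,
    "11-3": 1.248,
    "11-4": 1.435,
}

def weekly_changes(effects, baseline=0.0):
    """Difference each cumulative effect from the previous week's effect.

    Relies on the dict preserving insertion order (chronological here).
    """
    changes = {}
    previous = baseline
    for week, effect in effects.items():
        changes[week] = round(effect - previous, 3)
        previous = effect
    return changes

changes = weekly_changes(cumulative)
largest_gain_week = max(changes, key=changes.get)

for week, delta in changes.items():
    print(f"{week}: {delta:+.3f} °Brix vs previous week")
```

Read this way, sugar content is essentially flat between weeks 10-4 and 10-5, and the largest single-week gain (+0.389 °Brix) falls between weeks 11-1 and 11-2. These are differences of point estimates only and carry no uncertainty information from the fitted model.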
Discussion
The study demonstrates that integrating open, multi-source datasets into an agricultural DT enables monitoring and analysis at regional, inter-orchard, and intra-orchard scales, directly addressing the question of whether DTs can support data-driven and individualized agriculture. Findings reveal that intra-orchard (tree-level) modeling substantially improves explanation and prediction of sugar content compared to orchard-level approaches, validating the DT’s role in micro-precision management. Regional analyses highlight actionable spatial variability in soil properties tied to fruit quality, guiding policy (e.g., targeted fertilizer programs). Inter-orchard results underscore trade-offs between sugar and size and the importance of harvest timing to balance quality attributes. The ML results show that while orchard-level variables can predict fruit size well, sugar content is more idiosyncratic and benefits from tree-level modeling. The DT’s stakeholder-focused visualizations and app interface translate analyses into practical decision support for policymakers (regional planning), distributors (quality tracking), researchers (hypothesis generation and cultivar studies), and farmers (tree-level monitoring and targeted interventions). Overall, the results support a transition from precision to individualized agriculture, where tree-specific management can enhance quality outcomes.
Conclusion
This work builds and demonstrates an agricultural digital twin for open-field mandarins by integrating soil, weather, agricultural practice, and fruit quality data via Open APIs and geospatial processing. Analyses across regional, inter-orchard, and intra-orchard scales, supported by statistical models and machine learning, show that tree-level (intra-orchard) analysis explains substantially more variance in sugar content than orchard-level models, pointing to the feasibility and value of individualized agriculture. The DT prototype, delivered as an interactive R Shiny app, provides multi-level visual analytics and comparisons for stakeholders. Future directions include: increasing data collection frequency (especially soils), incorporating remote sensing and IoT soil sensing for fine-scale soil mapping, developing closed-loop DTs that recommend and evaluate practices automatically, conducting randomized controlled trials to quantify benefits and costs of individualized management, extending to more fruit species and cultivars, and reducing data collection and operational costs to enable broad adoption.
Limitations
The study relies on observational data from a single season and region, limiting causal inference; regional-scale analyses are descriptive. Due to policy-driven budget cuts, new open-field data collection ceased, and soil data are infrequent and relatively sparse (approximately annual). The DT demonstration does not yet automate practice recommendations or quantify their effects on fruit quality, which the authors identify as a main limitation. Environmental factors such as air pressure cannot be experimentally controlled in orchards, complicating interpretation of some associations. The current results focus on mandarins in Jeju and may not generalize to cereals or vegetables, where individual plants are genetically heterogeneous.