Weather Impact on Solar Farm Performance: A Comparative Analysis of Machine Learning Techniques

A. Gopi, P. Sharma, et al.

This study presents an innovative prediction model for forecasting the annual power generation yield and performance ratio of photovoltaic farms. Three AI techniques (RSM, ANN, and ANFIS) are compared, and the results identify ANFIS as the most precise model, providing valuable insights for policymakers and solar energy developers. The research was conducted by Ajith Gopi, Prabhakar Sharma, Kumarasamy Sudhakar, Wai Keng Ngui, Irina Kirpichnikova, and Erdem Cuce.

Introduction
The study addresses the problem of forecasting energy yield and performance ratio (PR) of solar photovoltaic (PV) farms, which is essential for grid management and the economic sustainability of renewable energy systems. Because solar generation is intermittent and weather-driven, accurate prediction of PV output supports load balancing, storage scheduling, and market operations. The paper proposes to model and forecast monthly energy generation and PR of a 2 MWp PV plant using three key environmental inputs—monthly tilted irradiation (MTI), wind speed, and ambient air temperature—and to compare three AI-based methods (RSM, ANN, ANFIS). The significance lies in leveraging AI to capture nonlinear relationships between weather variables and PV performance, using multi-year, real-plant data, and rigorously validating models with multiple statistical indices and Taylor diagrams.
Literature Review
Prior work largely focuses on predicting solar irradiation rather than direct PV electricity output, despite the latter being affected by hardware and weather variables (cell temperature, wind speed, humidity). AI tools such as ANFIS, ANN, SVM, numerical regression, and RSM have been applied to radiation or power prediction, sometimes using weather classification; e.g., SVM-based weather categorization achieved 8.46% error for a 20 MW plant. Some studies report RSM marginally outperforming ANN for radiation prediction; others use ANN for day-ahead PV generation with errors ranging widely. Deep learning approaches (LSTM, autoencoders, GRU) often outperform traditional ANN for short-term power forecasting. Hybrid approaches (e.g., LSTM-CNN) and decomposition techniques improve next-day prediction, though performance can degrade under rough weather. Broader findings include improved accuracy when using more weather parameters and the importance of integrating accurate forecasts into smart grid operations. Research gaps identified: limited comparative studies of multiple AI methods jointly modeling PV plant performance (energy and PR) using multi-year real PV plant data and thorough statistical validation.
Methodology
Study site and data: An operational 2 MWp grid-connected PV plant at Kuzhalmannam, Kerala, India, equipped with a Solar Radiation Resource Assessment (SRRA) station and a SCADA system, provided three years of monthly data (2018–2020). The SCADA system recorded global tilted solar irradiance, wind speed, ambient and module temperatures, and plant outputs; metering followed IEC 61724. Sensors included a cell-based pyranometer, a 3-cup anemometer, and temperature sensors.
Inputs and outputs: Inputs were monthly tilted irradiation (MTI, kWh/m²), wind speed (WS, m/s), and ambient air temperature (AT, °C). Outputs were monthly AC energy generation (kWh) and performance ratio (PR, %). Energy was computed as monthly cumulative AC generation; PR was computed as the ratio of observed AC energy to the energy expected at STC rating, scaled by the incident plane-of-array irradiation.
Data preprocessing: Correlation analysis showed that MTI was strongly correlated with generation (Pearson r ≈ 0.9116). Generation was also correlated with AT (r ≈ 0.5264). PR had a weaker positive correlation with MTI (r ≈ 0.2757) and AT (r ≈ 0.1747), and a negative correlation with WS. Data were inspected for quality prior to modeling.
Models:
- RSM: Polynomial response surfaces (up to cubic) were built linking MTI, WS, and AT to generation and PR. Steps included design matrix formation, ANOVA to identify significant terms (p < 0.05), model fitting, and prediction. For generation, a cubic polynomial with interaction and squared terms was derived; for PR, a cubic model was developed using sqrt(PR) as the response. Model adequacy was assessed via R, R2, RMSE, MAPE, NSCE, KGE, and Theil’s U2.
- ANN: A multilayer feed-forward neural network with 3 input neurons (MTI, WS, AT), one hidden layer of 10 neurons (selected by trial and error to minimize MSE), and 2 output neurons (generation, PR). Data split: 70% training, 15% validation, 15% testing. Training used the Levenberg–Marquardt algorithm (trainlm). Performance was assessed with regression coefficients, error metrics, and uncertainty.
- ANFIS: A first-order Sugeno multiple-input single-output (MISO) neuro-fuzzy system modeled generation and PR separately, each with inputs MTI, WS, and AT. The FIS was generated via grid partitioning; training used hybrid learning (least squares and backpropagation). Data split: 70% training, 30% validation. Membership functions and rule weights were optimized to minimize RMSE.
Model appraisal: Statistical indices used included Pearson’s R, coefficient of determination (R2), Nash–Sutcliffe efficiency (NSCE), root-mean-squared error (RMSE), mean absolute percentage error (MAPE), Kling–Gupta efficiency (KGE), and Theil’s U2 (prediction uncertainty). Taylor diagrams summarized correlation, variability, and centered RMSE to compare the models graphically.
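The PR definition given under "Inputs and outputs" can be illustrated with a short calculation. This is a minimal sketch, not the authors' code: it assumes the usual STC reference irradiance of 1 kW/m² and a rated capacity of 2000 kWp for the 2 MWp plant. The function name, parameter names, and the sample month are hypothetical.

```python
def performance_ratio(e_ac_kwh, h_poa_kwh_m2, p_rated_kwp, g_stc_kw_m2=1.0):
    """Return the performance ratio in percent.

    e_ac_kwh      -- measured monthly AC energy (kWh)
    h_poa_kwh_m2  -- monthly tilted (plane-of-array) irradiation (kWh/m^2)
    p_rated_kwp   -- rated DC capacity at STC (kWp)
    g_stc_kw_m2   -- STC reference irradiance (assumed 1 kW/m^2)
    """
    if h_poa_kwh_m2 <= 0 or p_rated_kwp <= 0 or g_stc_kw_m2 <= 0:
        raise ValueError("irradiation, capacity and reference irradiance must be positive")
    # Energy the plant would deliver if it ran at its STC rating
    # for the equivalent number of peak-sun hours.
    expected_kwh = p_rated_kwp * h_poa_kwh_m2 / g_stc_kw_m2
    return 100.0 * e_ac_kwh / expected_kwh


# Hypothetical month (illustrative numbers, not taken from the study).
pr = performance_ratio(e_ac_kwh=240_000.0, h_poa_kwh_m2=150.0, p_rated_kwp=2000.0)
print(f"PR = {pr:.1f} %")
```

With these made-up inputs the expected energy is 300,000 kWh, so the ratio is 80%.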
Key Findings
- Input–output relationships: MTI is the dominant driver of energy generation (r ≈ 0.91). At lower MTI, generation initially increases with WS and then decreases; at higher MTI the trend reverses subtly. AT generally has a positive effect over the studied range.
- RSM results: For generation, R ≈ 0.9886, R2 ≈ 0.9773, MAPE ≈ 2.24%, RMSE ≈ 6133.93 kWh, NSCE ≈ 0.9774, KGE ≈ 0.9847; Theil’s U2 is reported as ≈ 0.1653 in the text but 0.0775 in the table. For PR, R ≈ 0.9346, R2 ≈ 0.8735, MAPE ≈ 2.05%, RMSE ≈ 1.85, NSCE ≈ 0.8738, KGE ≈ 0.9157, Theil’s U2 ≈ 0.3343.
- ANN results: For generation, R ≈ 0.9679, R2 ≈ 0.9369, MAPE ≈ 3.77%, RMSE ≈ 12070 kWh, NSCE ≈ 0.9128, KGE ≈ 0.9096, Theil’s U2 ≈ 0.325. For PR, R ≈ 0.9663, R2 ≈ 0.9337, MAPE ≈ 1.5%, RMSE ≈ 1.37, NSCE ≈ 0.9317, KGE ≈ 0.9638, Theil’s U2 ≈ 0.245.
- ANFIS results: For generation, R ≈ 0.9950, R2 ≈ 0.9901, MAPE ≈ 2.09%, RMSE ≈ 5492.81 kWh, NSCE ≈ 0.9828, KGE ≈ 0.956, Theil’s U2 ≈ 0.1506. For PR, R ≈ 0.9915, R2 ≈ 0.9830, MAPE ≈ 0.8%, RMSE ≈ 0.6898, NSCE ≈ 0.9837, KGE ≈ 0.9917, Theil’s U2 ≈ 0.1259.
- Comparative performance: ANFIS outperformed ANN and RSM for both generation and PR across correlation, error, efficiency, and uncertainty metrics. Taylor diagrams confirmed that the ANFIS points lie closest to the reference (observations) for both targets.
- Practical implication: Accurate monthly forecasting using only MTI, WS, and AT is feasible; ANFIS provides the most precise PR and energy predictions among the tested methods.
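Most of the indices reported above follow standard definitions and can be reproduced with a few lines of code. The sketch below is illustrative rather than the authors' implementation. It assumes that R2 is the square of Pearson's R (consistent with the reported pairs, e.g. 0.9886² ≈ 0.9773) and that KGE takes the common form 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), with α the ratio of standard deviations and β the ratio of means. Theil's U2 is left out because this summary does not state which formulation the paper used. The function names and sample values are hypothetical.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _std(xs):
    # Population standard deviation.
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def pearson_r(obs, pred):
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred)) / len(obs)
    return cov / (_std(obs) * _std(pred))


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    # Mean absolute percentage error; observations must be non-zero.
    return 100.0 * _mean([abs((o - p) / o) for o, p in zip(obs, pred)])


def nse(obs, pred):
    # Nash-Sutcliffe efficiency: 1 minus SSE over variance of observations.
    mo = _mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


def kge(obs, pred):
    # Kling-Gupta efficiency (assumed formulation, see lead-in).
    r = pearson_r(obs, pred)
    alpha = _std(pred) / _std(obs)
    beta = _mean(pred) / _mean(obs)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def evaluate(obs, pred):
    r = pearson_r(obs, pred)
    return {"R": r, "R2": r * r, "RMSE": rmse(obs, pred),
            "MAPE": mape(obs, pred), "NSCE": nse(obs, pred),
            "KGE": kge(obs, pred)}


# Hypothetical observed and predicted values (not data from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
for name, value in evaluate(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```

A perfect prediction gives R, NSCE, and KGE of 1 and RMSE and MAPE of 0, which makes a convenient sanity check before applying the functions to real model output.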
Discussion
The study aimed to evaluate whether AI models using a minimal set of weather inputs (MTI, WS, AT) can accurately forecast monthly PV energy generation and performance ratio, and to determine which technique performs best. The results show that all three methods (RSM, ANN, ANFIS) achieve strong predictive performance, validating the approach of data-driven modeling for PV plants. ANFIS consistently achieved the highest R and R2, the lowest MAPE and RMSE, and the best NSCE and KGE, indicating superior capability to capture nonlinear interactions among inputs and robust generalization. Taylor diagrams corroborate these findings by showing ANFIS’s closer match in correlation and variability to observations. The strong correlation between MTI and generation quantifies the dominant role of irradiation, while WS and AT contribute secondary but meaningful effects. These findings support the research hypothesis that AI-based models—especially neuro-fuzzy hybrids—can provide reliable monthly forecasts to support grid operations, scheduling, and policy decisions in PV-dominant systems.
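A Taylor diagram, as used in the comparison above, condenses each model into three statistics: the standard deviation of its predictions, their correlation with observations, and the centered RMSE. These are tied together by the identity E'² = σ_obs² + σ_pred² − 2·σ_obs·σ_pred·R, which is what lets all three be read off a single plot. The following is a minimal sketch of those statistics; the function name and sample values are hypothetical and do not come from the study.

```python
import math


def taylor_stats(obs, pred):
    """Return the statistics a Taylor diagram plots for one model."""
    n = len(obs)
    mo = sum(obs) / n
    mp = sum(pred) / n
    # Population standard deviations of observations and predictions.
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred) / n)
    # Pearson correlation between the two series.
    r = sum((o - mo) * (p - mp) for o, p in zip(obs, pred)) / n / (so * sp)
    # Centered RMSE: RMSE after removing each series' mean (bias-free).
    crmse = math.sqrt(
        sum(((p - mp) - (o - mo)) ** 2 for o, p in zip(obs, pred)) / n
    )
    return {"std_obs": so, "std_pred": sp, "r": r, "crmse": crmse}


# Hypothetical observed and predicted values (illustrative only).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
s = taylor_stats(observed, predicted)

# Check the law-of-cosines identity underlying the diagram.
lhs = s["crmse"] ** 2
rhs = s["std_obs"] ** 2 + s["std_pred"] ** 2 - 2 * s["std_obs"] * s["std_pred"] * s["r"]
print(abs(lhs - rhs) < 1e-6)
```

A model whose point sits closest to the reference has a correlation near 1, a standard deviation close to that of the observations, and a small centered RMSE.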
Conclusion
The paper demonstrates that AI techniques can effectively model and forecast monthly energy yield and performance ratio of a utility-scale PV plant using three key environmental variables. Among RSM, ANN, and ANFIS, the ANFIS-based models delivered the highest predictive accuracy and lowest uncertainty for both generation and PR, as evidenced by superior R2, NSCE, KGE, MAPE, RMSE, and Theil’s U2, and by Taylor diagram comparisons. The work contributes: (i) a comparative evaluation of three AI methods on multi-year, real PV plant data; (ii) interpretable RSM formulations; and (iii) a validated ANFIS framework for PV performance mapping. Future research should validate the approach across diverse climate zones and plant designs, explore advanced and hybrid machine learning architectures, incorporate additional predictors (e.g., humidity, cloud cover, soiling), and extend models to predict long-term degradation and lifecycle performance.
Limitations
- Single-site study using a 2 MWp PV plant in Kerala, India; generalizability to other climates, technologies, and operating regimes may be limited without further validation.
- Monthly aggregated inputs/outputs may mask intra-month variability; short-term dynamics (intra-day, daily) were not modeled here.
- Limited predictor set (MTI, WS, AT); other factors such as relative humidity, cloud cover, precipitable water, soiling, shading, and maintenance were not included and could improve accuracy.
- Some reported uncertainties (e.g., Theil’s U2 for RSM generation) show minor inconsistencies between narrative and tabulated values.
- Models are data-driven and dependent on sensor and SCADA data quality; transferability may require re-training or adaptation.