Stock markets affect individual investors worldwide, so understanding market trends is essential for prudent investment. Stock price forecasting is difficult because market data are noisy, non-linear, and chaotic, and because prices respond to many factors, including equities, consumer behavior, and news. Traditional methods struggle with market volatility and shifting patterns. Machine learning (ML) offers a promising alternative because it can identify patterns in complex datasets. Artificial Neural Networks (ANNs) are useful for prediction, while Support Vector Regression (SVR) is a robust supervised learning technique suited to non-linear problems and to minimizing risk. SVR's accuracy, however, depends on its hyperparameters, so it requires an efficient optimization technique. This research addresses the need for accurate stock price forecasting by proposing hybrid models that pair SVR with one of three optimization algorithms, Moth Flame Optimization (MFO), Artificial Bee Colony (ABC), or Genetic Algorithms (GA), to tune the SVR hyperparameters. The Nikkei 225 index from 2013 to 2022 serves as the case study, with a focus on the challenges presented by outdated and inconsistent data. The study aims to improve prediction accuracy and to give investors insights for optimizing returns and mitigating risks.
Literature Review
The review highlights the growing use of machine learning algorithms for stock market prediction. Prior studies have applied deep learning-based non-linear regression, logistic and linear regression tailored to specific sectors (such as media and entertainment), and Naïve Bayes classifiers. Others have evaluated ensemble methods, support vector machines, random forests, and boosted decision trees across different stock exchanges, or have addressed abrupt market changes with Multilayer Perceptrons, Support Vector Machines, and Long Short-Term Memory networks. However, the existing literature lacks comparative analysis across diverse datasets and market conditions, and it rarely integrates external variables such as geopolitical events and news sentiment. This research aims to address these gaps with a robust hybrid model optimized for accuracy and tested on a comprehensive dataset.
Methodology
The study uses Nikkei 225 index data from 2013 to 2022, comprising Open, High, Low, and Close (OHLC) prices and trading volume. The data are cleaned to ensure accuracy and consistency, and min-max normalization, X_scaled = (X - X_min) / (X_max - X_min), scales each feature to the range [0, 1]. Candlestick charts are used for visual analysis of price changes.
Support Vector Regression (SVR) is the core predictive algorithm. Its optimization problem minimizes a loss on the difference between predicted and actual values, with slack variables to absorb outliers and a penalty parameter (C) that controls the trade-off between model complexity and training error. A kernel function maps the data into a higher-dimensional space, where a linear regression function can capture non-linear relationships in the original features.
Three optimization algorithms are used to search for the best SVR hyperparameters: Genetic Algorithms (GA), Artificial Bee Colony (ABC), and Moth Flame Optimization (MFO). GA simulates natural selection to improve the hyperparameter values iteratively. ABC mimics the foraging behavior of honeybees, dividing the search among employed bees, onlooker bees, and scout bees. MFO simulates the flight of moths toward a flame (a candidate optimal solution), following a spiral path that helps the search avoid local optima.
The dataset is split into 80% for training and 20% for testing. Model performance is evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE).
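The normalization formula and the 80/20 split described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's code: the summary does not say whether the split is chronological or whether X_min and X_max are taken from the training portion only, so both choices here are assumptions, and the function names and sample rows are hypothetical.

```python
from typing import List, Sequence, Tuple

Rows = Sequence[Sequence[float]]


def chronological_split(rows: Rows, train_fraction: float = 0.8
                        ) -> Tuple[List[List[float]], List[List[float]]]:
    """Split time-ordered rows into train and test parts without shuffling."""
    cut = int(len(rows) * train_fraction)
    return [list(r) for r in rows[:cut]], [list(r) for r in rows[cut:]]


def fit_min_max(train: Rows) -> List[Tuple[float, float]]:
    """Return (X_min, X_max) per column, computed on the training rows only."""
    return [(min(col), max(col)) for col in zip(*train)]


def apply_min_max(rows: Rows, bounds: Sequence[Tuple[float, float]]
                  ) -> List[List[float]]:
    """Apply X_scaled = (X - X_min) / (X_max - X_min) column by column."""
    scaled = []
    for row in rows:
        scaled_row = []
        for value, (lo, hi) in zip(row, bounds):
            span = hi - lo
            # Guard against a constant column, where the formula divides by zero.
            scaled_row.append((value - lo) / span if span else 0.0)
        scaled.append(scaled_row)
    return scaled


# Made-up Open, High, Low, Close rows for illustration (not Nikkei 225 data).
toy_rows = [
    [100.0, 104.0, 98.0, 102.0],
    [102.0, 108.0, 101.0, 107.0],
    [107.0, 109.0, 103.0, 104.0],
    [104.0, 112.0, 104.0, 111.0],
    [111.0, 118.0, 110.0, 117.0],
]

train_rows, test_rows = chronological_split(toy_rows)  # 4 training rows, 1 test row
bounds = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, bounds)
test_scaled = apply_min_max(test_rows, bounds)
print(train_scaled)
print(test_scaled)
```

With training-only bounds, test values can fall outside [0, 1] when later prices exceed the training range, as the last toy row does. Fitting the bounds on the full dataset would keep every value in [0, 1] but would let information from the test period leak into preprocessing.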
Key Findings
The study compares six models: ARIMA, MLP, SVR, GA-SVR, ABC-SVR, and MFO-SVR. Table III reports the evaluation metrics (RMSE, MAPE, MAE, MSE) for both the training and testing datasets. The MFO-SVR model outperforms the other models on every metric. On the testing dataset, MFO-SVR achieved an RMSE of 230.60, a MAPE of 0.70, an MAE of 197.53, and an MSE of 53175.49. These errors are lower than those of the other five models, indicating better predictive accuracy. Figures 8 and 9 plot the predicted against the actual Nikkei 225 index values for the training and testing phases, respectively, and show close agreement between the two. Table II provides a statistical summary of the dataset, including the mean, standard deviation, minimum, maximum, skewness, and kurtosis of Open, High, Low, Close, and Volume. Table I lists the optimal SVR hyperparameters (kernel, gamma, C, epsilon) found by GA, ABC, and MFO. Of the three optimizers, MFO finds the hyperparameter settings that give the lowest errors.
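The four error metrics reported above follow standard definitions, sketched here in plain Python. The sample values are made up for illustration, and expressing MAPE as a percentage is an assumption about how the reported 0.70 was computed. The final lines check that the reported test RMSE and MSE are mutually consistent, since RMSE is the square root of MSE.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error: the square root of MSE."""
    return math.sqrt(mse(actual, predicted))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values for illustration only (not taken from the study).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print("MSE :", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MAE :", mae(actual, predicted))
print("MAPE:", mape(actual, predicted))

# Consistency check on the reported test metrics: sqrt(53175.49) is about 230.60.
reported_rmse_from_mse = math.sqrt(53175.49)
print("sqrt(reported MSE):", round(reported_rmse_from_mse, 2))
```

Because RMSE and MSE carry the same information, the comparison across models effectively rests on three independent views of the error: squared, absolute, and relative.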
Discussion
The findings indicate that the MFO-SVR hybrid model predicts the Nikkei 225 index more accurately than ARIMA, MLP, and the other SVR-based models. Its lower error values on every evaluation metric suggest that the MFO algorithm tunes the SVR hyperparameters effectively, which in turn improves predictive accuracy. Evaluating several optimization techniques side by side also makes the choice of optimizer better supported than relying on a single method. The ability of MFO-SVR to learn from historical data and adapt to market trends highlights the potential of hybrid models in complex financial forecasting. More accurate price predictions have direct implications for investment strategies, risk management, and algorithmic trading.
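To make the hyperparameter search discussed above more concrete, the following is a simplified sketch of an MFO loop as the algorithm is commonly described: moths spiral around the best positions found so far (the flames), and the number of flames shrinks over time. It is not the study's implementation. The objective toy_validation_error is a made-up stand-in; in the study's setting it would be replaced by the validation error of an SVR trained with the candidate hyperparameters. Searching C and gamma in log10 space, the bounds, the population size, and the iteration count are all assumptions.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def moth_flame_minimize(objective: Callable[[List[float]], float],
                        bounds: Bounds, n_moths: int = 20, n_iters: int = 100,
                        b: float = 1.0, seed: int = 0) -> Tuple[List[float], float]:
    """Simplified Moth Flame Optimization; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    moths = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_moths)]
    flames: List[Tuple[float, List[float]]] = []  # (fitness, position), best first

    for it in range(1, n_iters + 1):
        # Clamp moths to the search box, then score them.
        moths = [[min(max(x, lo), hi) for x, (lo, hi) in zip(m, bounds)]
                 for m in moths]
        scored = [(objective(m), m) for m in moths]
        # Flames are the best n_moths positions seen so far.
        flames = sorted(flames + scored, key=lambda fp: fp[0])[:n_moths]
        # The number of flames shrinks linearly from n_moths to 1.
        n_flames = round(n_moths - it * (n_moths - 1) / n_iters)
        # r moves linearly from -1 toward -2; t is drawn from [r, 1].
        r = -1.0 - it / n_iters
        new_moths = []
        for i, moth in enumerate(moths):
            # Moths beyond the flame count share the last remaining flame.
            flame = flames[min(i, n_flames - 1)][1]
            position = []
            for x, f in zip(moth, flame):
                t = (r - 1.0) * rng.random() + 1.0
                distance = abs(f - x)
                # Logarithmic spiral around the flame.
                position.append(distance * math.exp(b * t)
                                * math.cos(2.0 * math.pi * t) + f)
            new_moths.append(position)
        moths = new_moths

    best_fitness, best_position = flames[0]
    return list(best_position), best_fitness


def toy_validation_error(params: List[float]) -> float:
    """Hypothetical stand-in for SVR validation error, minimal at (2, -1, 0.1)."""
    log10_c, log10_gamma, epsilon = params
    return (log10_c - 2.0) ** 2 + (log10_gamma + 1.0) ** 2 + (epsilon - 0.1) ** 2


# Assumed search box: log10(C), log10(gamma), epsilon.
SEARCH_BOUNDS = [(-2.0, 4.0), (-4.0, 1.0), (0.0, 1.0)]

best_position, best_fitness = moth_flame_minimize(toy_validation_error, SEARCH_BOUNDS)
print("best (log10 C, log10 gamma, epsilon):", best_position)
print("best toy error:", best_fitness)
```

GA and ABC could be dropped into the same role, since each only needs a function that maps a candidate hyperparameter vector to an error value. That shared interface is what makes a like-for-like comparison of GA-SVR, ABC-SVR, and MFO-SVR possible.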
Conclusion
This research developed and validated a hybrid MFO-SVR model for stock price forecasting using Nikkei 225 data. The model's lower errors relative to the other methods demonstrate the effectiveness of combining SVR with MFO. Future research could test the model on other financial indices, investigate alternative optimization techniques or ensemble methods, and incorporate real-time data and external factors to improve adaptability in dynamic market conditions.
Limitations
The model's reliance on historical data may limit its ability to predict abrupt market shifts or unforeseen events, and its performance may vary across market conditions and asset classes. Future research should aim to make the model more robust by incorporating real-time data streams and external factors, and should explore alternative methodologies that improve its generalizability and interpretability.