The role of the mass vaccination programme in combating the COVID-19 pandemic: An LSTM-based analysis of COVID-19 confirmed cases


S. Hansun, V. Charles, et al.

This study by Seng Hansun, Vincent Charles, and Tatiana Gherman examines the effect of mass vaccination programmes on COVID-19 confirmed cases. Using LSTM networks, the authors forecast future cases in the ten countries with the highest vaccination numbers and report predominantly downward projected trends once vaccination-period data are included.

Introduction
The study addresses whether mass COVID-19 vaccination programmes reduce confirmed COVID-19 cases. Against the backdrop of the global pandemic declared by WHO in March 2020 and the subsequent worldwide rollout of vaccines (including initiatives like COVAX), the authors investigate confirmed case trajectories in the ten countries with the highest vaccination numbers (China, India, United States, Brazil, Indonesia, Japan, Pakistan, Vietnam, Mexico, Germany). The paper’s purpose is to predict and analyse confirmed cases before and after vaccination rollout using LSTM networks, assessing whether vaccination correlates with decreased future case trends. The work is motivated by the growing availability of data enabling robust AI forecasting, and the need for evidence-based insights to inform public health decisions.
Literature Review
The paper overviews AI and deep learning applications in COVID-19 forecasting, noting widespread use of RNN-derived models, especially LSTM, to model time series of cases and trends. Prior studies applied LSTM to country-level outbreak forecasting (e.g., Canada, Russia, Peru, Iran) and other COVID-19-related tasks (e.g., mutation-rate prediction). Comparative works have evaluated ARIMA, SVR, random forests, CNN, Bi-LSTM, stacked/encoder-decoder LSTM, and hybrid CNN-LSTM, often finding LSTM-based approaches competitive or superior for complex epidemic dynamics. The authors highlight that, to their knowledge, this is the first study to explicitly compare pre- and post-vaccination scenarios across the top ten vaccinated countries using LSTM to evaluate vaccination programme impact on future trends.
Methodology
Data source: Global COVID-19 cumulative confirmed cases were taken from the Johns Hopkins University CSSE GitHub repository (time_series_covid19_confirmed_global.csv), using a snapshot from February 14, 2022 (last recorded data: February 12, 2022).
Countries analysed: China, India, United States, Brazil, Indonesia, Japan, Pakistan, Vietnam, Mexico, and Germany (the top ten by total vaccinations per BBC/Our World in Data).
Timeframe: January 22, 2020 to February 12, 2022.
Scenarios: Two datasets were constructed per country: (1) 'All Time', containing all available data through February 12, 2022; and (2) 'Before Vaccination', containing data up to February 1, 2021 (the assumed common date from which vaccination impact would begin).
Data splitting and preprocessing: For each country and scenario, the data were split 80:20 into training and test sets. A 14-day lookback window (timestamps) was used to create supervised sequences. MinMaxScaler normalization was applied, and the data were reshaped into 3D arrays as required by Keras LSTM layers. A helper function (create_dataset) generated the input-output sequences.
Model architecture and training: The models were implemented in Keras (TensorFlow backend). A five-layer neural network was used, comprising two LSTM layers, two Dropout layers (to mitigate overfitting), and one Dense output layer. Mean Squared Error (MSE) served as the training loss; the evaluation metrics were MAE, RMSE, and MAPE. A Vanilla RNN model with a similar five-layer architecture served as a benchmark for comparative evaluation. The code and data are publicly available at https://github.com/senghansun/COVID-19-with-LSTM.
Evaluation and analysis: Models were trained separately for each country under both scenarios. Performance on the test sets was assessed via MAE, RMSE, and MAPE. Future one-step-ahead predictions were compared to the last known observed value to infer the directional trend and percentage change, enabling assessment of the post-vaccination impact on projected trends.
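The preprocessing steps described above (80:20 split, min-max scaling, a 14-day lookback window, and reshaping to 3D) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: their actual create_dataset helper lives in the linked repository and may differ. The function names split_train_test, min_max_scale, and to_3d are hypothetical, and fitting the scaling bounds on the training portion only is an assumption, since the summary does not say how the scaler was fitted.

```python
from typing import List, Sequence, Tuple

LOOKBACK = 14  # 14-day lookback window, as stated in the summary


def split_train_test(series: Sequence[float], train_ratio: float = 0.8):
    """Chronological split: the first 80% for training, the rest for testing."""
    cut = int(len(series) * train_ratio)
    return list(series[:cut]), list(series[cut:])


def min_max_scale(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map values linearly so that lo -> 0 and hi -> 1.

    Assumption: lo and hi come from the training data only.
    """
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def create_dataset(
    series: Sequence[float], lookback: int = LOOKBACK
) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (window, next value) pairs for supervised learning."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(list(series[i:i + lookback]))  # the previous `lookback` days
        y.append(series[i + lookback])          # the value to predict
    return X, y


def to_3d(X: List[List[float]]) -> List[List[List[float]]]:
    """Reshape to (samples, time steps, features) with a single feature."""
    return [[[v] for v in window] for window in X]


# Made-up cumulative series of 100 days, purely for illustration.
cumulative = [float(10 * day) for day in range(100)]
train, test = split_train_test(cumulative)
lo, hi = min(train), max(train)
X_train, y_train = create_dataset(min_max_scale(train, lo, hi))
X_train_3d = to_3d(X_train)
print(len(train), len(test), len(X_train_3d), len(X_train_3d[0]))
```

With 80 training days and a 14-day window, the sketch yields 66 training samples, each holding 14 time steps of one feature. In practice the nested lists would be converted to a NumPy array before being passed to a Keras model.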
Key Findings
- Predictive performance: LSTM models achieved good accuracy predicting future cumulative confirmed cases across countries. Average MAPE was 5.977% for 'All Time' and 10.388% for 'Before Vaccination', indicating better accuracy when using the full dataset (more observations).
- Country highlights (LSTM): Particularly strong performance noted for Pakistan, Mexico, and Japan in 'All Time'; and India, Vietnam, and Brazil in 'Before Vaccination'.
- Benchmark comparison: Vanilla RNN yielded average MAPE of 7.772% ('All Time') and 19.305% ('Before Vaccination'), substantially worse than LSTM on MAPE in both scenarios, despite sometimes having lower MAE/RMSE. Thus, LSTM outperformed Vanilla RNN in percentage error terms across both scenarios.
- Future trend projections (LSTM): Most countries showed projected downward trends in cumulative confirmed cases when including vaccination-period data ('All Time'). Exceptions with slight increases were India (+0.243%) and Mexico (+0.079%) in 'All Time'. In 'Before Vaccination', Germany showed a projected increase (+32.697%), whereas under 'All Time' it showed a decrease (−16.584%), illustrating the inferred positive effect of vaccination on controlling case growth.
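To make the reported quantities concrete, the sketch below computes MAE, RMSE, and MAPE in their standard forms, plus the percentage change between a forecast and the last observed value. The function names and all numbers are illustrative assumptions and are not taken from the paper; the percent_change formula is one plausible reading of how the projected changes were derived.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error; penalises large errors more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)


def percent_change(last_observed: float, forecast: float) -> float:
    """Signed change of the forecast relative to the last observed value."""
    return 100.0 * (forecast - last_observed) / last_observed


# Made-up values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(round(mae(actual, predicted), 3))   # 6.667
print(round(rmse(actual, predicted), 3))  # 8.165
print(round(mape(actual, predicted), 3))  # 5.0
print(round(percent_change(1000.0, 990.0), 3))  # -1.0
```

Because MAPE divides each error by the actual value, it is scale-relative, whereas MAE and RMSE are in the units of the data. This is why one model can have lower MAE/RMSE than another and still a higher MAPE, as reported for the Vanilla RNN benchmark.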
Discussion
The findings address the core question by showing that models trained on data including the vaccination period predict predominantly downward future trends in cumulative cases across the top ten vaccinated countries, supporting the conclusion that mass vaccination programmes contribute to decreasing and controlling COVID-19 spread. The superior MAPE of LSTM over Vanilla RNN strengthens confidence in the model-based inferences. The improved accuracy in the 'All Time' scenario likely stems from the larger training window and the inclusion of post-vaccination dynamics. Notable exceptions (e.g., India and Mexico with slight projected increases in 'All Time', Germany’s sharp projected increase in 'Before Vaccination') underscore heterogeneity in epidemic trajectories and the potential roles of policy timing, programme implementation, testing capacity, public compliance, and variant emergence (e.g., Omicron). The results are relevant for policymakers, indicating that vaccination programmes can alter epidemic trajectories favorably, reducing projected growth compared to a counterfactual pre-vaccination scenario.
Conclusion
The study applies LSTM to predict COVID-19 confirmed case trajectories before and after mass vaccination rollout across the ten most vaccinated countries, demonstrating that including the vaccination period yields better predictive performance (MAPE 5.977% vs 10.388%) and predominantly downward projected trends. Comparisons with a Vanilla RNN baseline show LSTM’s advantage on percentage error metrics. Overall, evidence suggests mass vaccination programmes positively affect controlling and decreasing COVID-19 spread. Future research should explore barriers to effective vaccination rollouts (e.g., policy response timing, logistics, public acceptance), incorporate additional performance measures (e.g., directional statistics) to evaluate trend prediction, and investigate model optimization or alternative architectures to further improve forecasting robustness.
Limitations
- Modeling simplicity: A relatively simple five-layer LSTM architecture was used without extensive hyperparameter optimization or exploration of alternative/deeper architectures.
- Evaluation metrics: Only MAE, RMSE, and MAPE were used; these do not directly assess directional accuracy. Directional Statistics could better quantify trend prediction.
- Assumed effective vaccination date: A uniform date (February 1, 2021) was assumed for vaccination impact across all countries, which may not reflect country-specific rollout timings and effectiveness.
- Predictive uncertainty: As future conditions (e.g., new variants, policy changes, behavior) are uncertain, projections should be interpreted cautiously.
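As an illustration of the directional measure mentioned in the limitations, the sketch below computes one common formulation of directional accuracy: the share of time steps at which the predicted move from the previous observed value has the same sign as the actual move. The paper does not specify a formula, so this definition, the function names, and the sample numbers are all assumptions.

```python
from typing import Sequence


def _sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(
    actual: Sequence[float], predicted: Sequence[float]
) -> float:
    """Percentage of steps where the predicted and actual moves share a sign.

    Both moves are measured from the previous *actual* value, which matches
    a one-step-ahead forecasting setup.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) < 2:
        raise ValueError("at least two observations are required")
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if _sign(actual_move) == _sign(predicted_move):
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


# Made-up daily values, for illustration only.
actual = [10.0, 12.0, 11.0, 15.0, 15.0]
predicted = [10.0, 13.0, 12.5, 14.0, 16.0]
print(directional_accuracy(actual, predicted))  # 50.0
```

In this example the forecast gets the direction right at two of four steps even though its absolute errors are small, which is the kind of gap that MAE, RMSE, and MAPE alone would not reveal.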