Computer Science
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
A. E. Ezugwu, O. N. Oyelade, et al.
Machine learning's transformative potential across sectors, from healthcare and education to food security and climate resilience, is mapped in a bibliometric and literature survey of 2,761 documents covering all 54 African countries (1993–2021).
Introduction
The paper frames machine learning (ML) as a core driver of artificial intelligence (AI) advances and outlines its evolution from AI to ML to deep learning (DL). It highlights ML’s learning paradigms (supervised, unsupervised, semi-supervised, reinforcement, and deep reinforcement learning) and the breadth of algorithms used in real-world applications. Within Africa, ML has been increasingly adopted to address regional challenges across healthcare, agriculture, security, language technologies, and energy. The study is motivated by the need to systematically map and quantify ML research activity on the continent, identify leading contributors (authors, institutions, and countries), and reveal collaboration networks and thematic trends to inform policy and future research. The research question centers on “What are the publication patterns, collaborations, topical focuses, and impact trends in ML research associated with African institutions over the past three decades?” The paper contributes a comprehensive bibliometric analysis and an accompanying literature review focused on African ML research, aiming to guide future collaboration and agenda-setting.
Literature Review
The paper surveys multi-disciplinary ML research across African countries and institutions, illustrating applications and advances in: (1) healthcare and medicine (e.g., diagnosis of breast cancer, diabetic retinopathy, tuberculosis, COVID-19 image analysis, AI-assisted clinical decision support); (2) agriculture and food security (e.g., crop disease detection, yield prediction, soil nutrient mapping, precision agriculture with drones and IoT sensors); (3) security and surveillance (e.g., intrusion detection, spam classification, anti-poaching via UAVs, crime monitoring); (4) natural language processing (e.g., machine translation for local languages, Arabic diacritization, ASR improvements); (5) energy and sustainability (e.g., microgrid energy management, solar radiation prediction, air pollution modeling); (6) climate and environment (e.g., vegetation health forecasting, climate variable prediction, biome reconstructions, conflict risk projection); (7) geosciences and remote sensing (e.g., lithological mapping, irrigated area mapping using RF on GEE); (8) finance and e-commerce (e.g., credit scoring, panic buying detection, logistics); and (9) entrepreneurship and innovation (e.g., localized language tech, neonatal asphyxia detection from cries). A focused review of quantum-based machine learning research—prominently led from South Africa (University of KwaZulu-Natal)—covers quantum classifiers, quantum kernels, distance-based quantum classification, quantum ensemble methods, and regression on quantum computers, emphasizing potential computational speedups and theoretical advances. The review synthesizes major ML application areas into domains such as predictive analytics and decision-making, cybersecurity, IoT/smart cities, traffic prediction, healthcare, NLP, image/speech/pattern recognition, sustainable agriculture, pollution control, climate systems, and soil analysis, with numerous case studies from African contexts.
Methodology
- Data source: Web of Science Core Collection, Science Citation Index Expanded (SCI-EXPANDED).
- Extraction date: 10 October 2022.
- Time window: 1991–2021 (results reported for 1993–2021).
- Query strategy: TOPIC search (title, abstract, author keywords, and Keywords Plus) using a comprehensive set of ML-related terms and variants, including common misspellings and concatenated forms (e.g., "machine learning", "machine learned", "machine learningalgorithm"). The country filter encompassed all 54 African countries (e.g., Egypt, South Africa, Nigeria, Tunisia, Morocco).
- Screening: A PRISMA-like flow was summarized: 2,770 total documents identified (2,477 articles), screened with exclusions, resulting in 2,477 articles included. Records matched only through Keywords Plus were considered potentially irrelevant, so the ‘front page’ filter (title, abstract, author keywords) was applied to improve relevance.
- Data handling: Full records and annual citation counts were exported and processed in Microsoft Excel (functions such as Concatenate, Counta, Sort, Rank, and Vlookup). Affiliations were checked and standardized; UK components were grouped as United Kingdom.
- Publication indicators: TP (total articles), IP (single-country or single-institution articles), CP (internationally or inter-institutionally collaborative articles), FP (first-author articles), RP (corresponding-author articles), and SP (single-author articles).
- Citation indicators: Cyear (citations in a given year, e.g., C2021) and TCyear (total citations to the end of a given year, e.g., TC2021). Derived impact indicators (CPP2021) were calculated for each publication category (e.g., TP-CPP2021, IP-CPP2021, CP-CPP2021, FP-CPP2021, RP-CPP2021, SP-CPP2021).
- Journal and corpus characteristics: Journal impact factors were based on JCR 2021. Language and document types were noted to characterize the corpus, with articles analyzed in depth.
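The indicators described above can be illustrated with a small sketch. This is not the authors' workflow (they report using Microsoft Excel); it is a minimal Python illustration under stated assumptions: the `Record` fields, the two-term `ML_TERMS` list, and the function names are hypothetical, and CPP2021 is assumed to mean total citations to the end of 2021 divided by the number of articles in a category.

```python
from dataclasses import dataclass

# Hypothetical subset of search terms; the paper's full query list is longer.
ML_TERMS = ("machine learning", "machine learned")


@dataclass
class Record:
    title: str
    abstract: str
    author_keywords: list          # list of str
    keywords_plus: list            # list of str
    author_countries: list         # one list of countries per author, byline order
    corresponding_countries: list  # countries of the corresponding author(s)
    tc2021: int                    # total citations to the end of 2021


def on_front_page(rec, terms=ML_TERMS):
    """True if a search term appears in the title, abstract, or author keywords."""
    text = " ".join([rec.title, rec.abstract, *rec.author_keywords]).lower()
    return any(term in text for term in terms)


def categories(rec, country):
    """Publication categories that `rec` counts toward for `country`."""
    involved = {c for author in rec.author_countries for c in author}
    if country not in involved:
        return set()
    cats = {"TP", "IP" if len(involved) == 1 else "CP"}
    if country in rec.author_countries[0]:
        cats.add("FP")
    if country in rec.corresponding_countries:
        cats.add("RP")
    if len(rec.author_countries) == 1:
        cats.add("SP")
    return cats


def cpp2021(records, country):
    """Assumed definition: TC2021 summed over a category / articles in it."""
    totals = {}  # category -> (article count, citation sum)
    for rec in records:
        if not on_front_page(rec):
            continue  # no front-page match (e.g., Keywords Plus only)
        for cat in categories(rec, country):
            count, cites = totals.get(cat, (0, 0))
            totals[cat] = (count + 1, cites + rec.tc2021)
    return {cat: cites / count for cat, (count, cites) in totals.items()}


# Toy records, invented for illustration only.
demo = [
    Record("Machine learning for crop disease", "", [], [],
           [["Egypt"], ["USA"]], ["Egypt"], 20),
    Record("Yield prediction", "A machine learning study.", [], [],
           [["Egypt"]], ["Egypt"], 10),
    Record("Soil survey", "", [], ["machine learning"],
           [["Egypt"]], ["Egypt"], 100),  # dropped by the front-page filter
]
print(cpp2021(demo, "Egypt"))
```

In this toy data the third record matches only through Keywords Plus, so it is excluded before any indicator is counted. The same classification could be applied per institution rather than per country by swapping the country lists for institution lists.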
Key Findings
- Corpus and growth: 2,761 ML-related documents (1993–2021), with articles constituting 89% (2,468). Publications surged from fewer than 10 annually before 2010 to 1,035 in 2021. The highest CPP2021 occurred in 2013 (54), influenced by a highly cited energy management article.
- Document characteristics: Of 11 document types, reviews (235) had the highest CPP2021 (18), 1.6 times that of articles. Articles were predominantly in English; only three non-English articles were identified.
- Subject categories and journals: 2,468 articles appeared in 903 journals spanning 159 Web of Science categories. Top categories (share of articles): Electrical and Electronic Engineering (20%), Information Systems/Computer Science (18%), Artificial Intelligence/Computer Science (14%), Telecommunications (12%). ‘Interdisciplinary Applications/CS’ and ‘Remote Sensing’ had the highest CPP2021 (15). Most productive journal: IEEE Access (192 articles; CPP2021=11). Expert Systems with Applications had the highest CPP2021 (30) among top outlets.
- Collaboration patterns: 74% (1,819) of articles were internationally collaborative, involving 146 countries (43 African, 103 non-African); 26% (649) were single-country articles from 16 African countries. International collaboration was associated with slightly higher citation rates (CPc-CPP2021=12 vs IPc-CPP2021=10).
- Country performance (Africa): Egypt led across all six publication indicators (TP=777; 31% of articles; IPc=186; CPc=591; FP=345; RP=449; SP=21). South Africa followed (TP=562; 23%). Ten African countries had no ML-related articles in SCI-EXPANDED. Among non-African collaborators, the USA (CPc=431; CPP2021=15), Saudi Arabia (338; 9.3), UK (295; 14), China (252; 14), and France (211; 10) were most prominent.
- Institutional performance: Cairo University (Egypt) had the highest TP (142) and CPi (inter-institutionally collaborative articles; 127). University of KwaZulu-Natal (South Africa) led in IPi (single-institution articles; 19), FP (48), and RP (64), and showed the greatest CPP2021 values across several indicators (e.g., TP-CPP2021=24; CPi-CPP2021=27; FP-CPP2021=27; RP-CPP2021=34). Several South African and Egyptian institutions appeared among the top 20.
- Highly cited works: The article with the highest TC2021 was Wright & Ziegler (2017) on ‘ranger’ (TC2021=683). The article with the highest C2021 was the XAI survey by Adadi & Berrada (2018) (C2021=435; TC2021=675). Five of the top 10 most-cited articles were by Egyptian authors; South Africa contributed two; Nigeria, Kenya, and Morocco also featured.
- Topical trends: Most frequent topics included classification (TP≈841), deep learning (≈430), feature extraction (≈375), and random forest (≈190), all showing sharp growth post-2015.
- Additional observations: The five highest-impact journals in the corpus, each with IF2021 above 60 (World Psychiatry, Nature, Nature Energy, Nature Reviews Disease Primers, Science), collectively carried six African-linked articles.
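The percentages and totals reported in the findings above can be cross-checked with simple arithmetic. The following is a minimal Python sketch that uses only the figures quoted in this summary; the variable names and the `share` helper are illustrative assumptions, and rounding to the nearest whole percent is assumed to match how the shares were reported.

```python
# Figures exactly as reported in the findings above.
documents = 2761
articles = 2468
collaborative = 1819          # internationally collaborative articles
single_country = 649          # single-country articles
partner_countries = {"African": 43, "non-African": 103, "total": 146}
egypt = {"TP": 777, "IPc": 186, "CPc": 591}
south_africa_tp = 562


def share(part, whole):
    """Percentage share, rounded to the nearest whole percent."""
    return round(100 * part / whole)


shares = {
    "articles among documents": share(articles, documents),
    "collaborative articles": share(collaborative, articles),
    "single-country articles": share(single_country, articles),
    "Egypt": share(egypt["TP"], articles),
    "South Africa": share(south_africa_tp, articles),
}

# Additive checks: the parts should sum to the reported totals.
sums_ok = (
    collaborative + single_country == articles
    and egypt["IPc"] + egypt["CPc"] == egypt["TP"]
    and partner_countries["African"] + partner_countries["non-African"]
    == partner_countries["total"]
)

print(shares)
print("sums consistent:", sums_ok)
```

Under these assumptions the rounded shares come out to 89%, 74%, 26%, 31%, and 23%, matching the reported percentages, and the collaborative and single-country counts add up to the 2,468 articles.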
Discussion
The analysis confirms a rapid and recent expansion of ML research across Africa, with substantial growth after 2017 and a peak in 2021. The dominance of engineering and computer science categories reflects the technical maturation of the field, while the presence of remote sensing and environmental sciences underscores African priorities in earth observation, climate, and agriculture. Egypt and South Africa lead in productivity and authorship roles, suggesting stronger research infrastructure and collaboration networks in these countries. International collaborations are prevalent and modestly boost citation impact, with the USA, Saudi Arabia, and the UK among key partners—indicative of both North–South and Middle East–Africa research ties. Institutionally, Cairo University and University of KwaZulu-Natal anchor production and leadership metrics. The most cited works span methodological advances (e.g., random forests implementation), foundational surveys (XAI), and domain applications (energy, medical imaging, flux prediction), indicating both methodological and applied impact. Topical trend analysis reveals that classification and deep learning dominate, aligning with global ML patterns, while random forests and feature extraction remain relevant across applied domains. Collectively, these findings map the current landscape, highlight leaders and collaboration hubs, and identify growth areas to inform policy, investment, and targeted capacity building. They also reveal gaps—many countries lack single-country outputs—pointing to the need for national capacity and infrastructure strengthening to improve autonomy and impact.
Conclusion
The study provides a 30-year bibliometric overview of ML research in Africa, documenting exponential growth since 2010, concentration in engineering and computer science outlets, and strong international collaborations. Egypt and South Africa are the leading contributors; Cairo University and University of KwaZulu-Natal are prominent institutions. Highly cited contributions include both methodological (e.g., random forests, XAI) and applied works (energy systems, medical image analysis, climate/fluxes). The most common research themes are classification, deep learning, feature extraction, and random forests. The paper’s literature review synthesizes major ML application areas relevant to African needs (health, agriculture, security, NLP, energy, climate, pollution, soil) and highlights noteworthy progress in quantum-enhanced ML led by South African teams. Future directions suggested include: (i) advancing neuro-symbolic integration (merging ML with formal reasoning), (ii) deepening DL–DRL integration with clustering and metaheuristic optimization, (iii) expanding ML for smart cities, transport, surveillance, and multilingual NLP across African languages, and (iv) investing in computational infrastructure, training, and research ecosystems (e.g., GPUs, cloud, reliable power, regional hubs) to enable sustained, high-impact ML research and deployment.
Limitations
Data were sourced solely from SCI-EXPANDED and restricted to TOPIC fields (title, abstract, author keywords, Keywords Plus) for 1991–2021, with results analyzed to end of 2021. A ‘front page’ filter was used to reduce irrelevant records from Keywords Plus, and only articles were analyzed in depth. Affiliations were standardized (e.g., UK components regrouped), which may affect institutional/country attribution. Citation counts (TC2021, C2021) reflect the database status up to 2021, potentially undercounting recent impacts. The analysis excludes outputs outside SCI-EXPANDED and largely reflects English-language publishing, which may limit generalizability.