Indicators for tracking programmes to strengthen health research capacity in lower- and middle-income countries: a qualitative synthesis

D. C. Cole, A. Boyd, et al.

This study examines evaluation indicators for health research capacity strengthening programs in low- and middle-income countries. Conducted by Donald C Cole, Alan Boyd, Garry Aslanyan, and Imelda Bates, it documents considerable variability in evaluation designs and highlights the need for improved measurement and better integration of indicators to strengthen the evaluation of health RCS programs.

Introduction
Health research is increasingly recognized as essential for informing practice and policy, yet substantial gaps in research production persist in many low- and middle-income countries (LMICs). Various profiles, resources, and proposals have emerged for health research capacity strengthening (RCS), defined as the process of individual and institutional development that leads to higher levels of skill and a greater ability to conduct useful research. Despite accumulated experience, the heterogeneity and complexity of RCS initiatives have hindered systematic assessments of their effectiveness. Funders and initiatives (e.g., UKCDS, ESSENCE) have emphasized explicit theories of change and the use of indicators across activities, outputs, and outcomes within evaluation frameworks. This study investigates funder-held evaluations of health RCS to describe evaluation designs, the nature and quality of the indicators used, and linkages among activities, outputs, and outcomes, aiming to inform rigorous evaluation design and indicator selection for tracking progress and impacts.
Literature Review
The paper situates its work within prior efforts to conceptualize and evaluate research capacity strengthening. It references profiles of LMIC capacity for equity-oriented research, resources for RCS, and frameworks such as the ESSENCE Planning, Monitoring and Evaluation framework. It notes the widespread use of indicators in health programs, the SMART criteria for indicator development, and research impact frameworks that include policy and practice change. The authors highlight calls for explicit theories of change to link activities to outcomes and identify gaps in prior evaluations, including limited systematic assessment of effectiveness and equity considerations.
Methodology
Design: Qualitative synthesis of evaluation reports, with stakeholder engagement. Ethics approval obtained from the University of Toronto Health Sciences Research Ethics Board (#26837).

Report identification: The team consulted ESSENCE member funding agencies and, via snowballing, other LMIC research funders. Of 31 agencies contacted, 11 provided reports. From these, 54 reports of relevant health RCS evaluations (English, post-2000, publicly available) were identified. Using maximum variety sampling, the authors purposively selected 18 reports covering 12 distinct evaluations to maximize diversity in RCS type, funders, countries, and evaluation approaches.

Quality appraisal: Drawing on OECD DAC standards, the authors appraised each evaluation using questions about clarity of purpose, methodological description (including analysis), and whether indicators were explicit and justified. Particular attention was given to design, indicator measurement and collection, and bias. Two reviewers independently appraised each evaluation, providing brief justifications.

Indicator extraction and analysis: A systematic framework analysis was conducted. Text relating to indicators and their context (including narrative descriptions implying indicators) was extracted and coded using categories from the ESSENCE PME matrix (individual, institutional, national/regional/network levels), with new categories added as needed. To ensure consistency, each researcher coded at least three reports from at least two funders and two evaluations; each report was independently coded by two researchers, with discrepancies resolved through discussion or a third reviewer. Extraction proceeded until no new insights emerged.

Synthesis: The team reviewed extracted material, created additional categories as needed, and attempted to identify links from aims to indicators and from activities to outputs and outcomes (theory-of-change pathways). Because clear within-evaluation linkages were rare, the authors assembled cross-evaluation examples to illustrate potential pathways of change. Interim findings were iteratively discussed with the ESSENCE steering committee to focus analysis and validate interpretations.
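To illustrate the dual-coding step in concrete terms, the sketch below shows one way the extracted text and the two reviewers' framework-level codes could be recorded and screened for disagreement. It is a minimal illustration only, assuming the ESSENCE PME levels named above as the coding categories; the class, field names, and example extracts are hypothetical, not structures taken from the study.

```python
from dataclasses import dataclass

# Coding categories based on the ESSENCE PME framework levels described above;
# further categories would be added as they emerge during analysis.
LEVELS = {"individual", "institutional", "national/regional/network"}

@dataclass
class IndicatorExtract:
    """One piece of text extracted from an evaluation report (hypothetical structure)."""
    report_id: str       # which evaluation report the text came from
    text: str            # explicit indicator, or narrative implying one
    coder_a_level: str   # framework level assigned by the first reviewer
    coder_b_level: str   # framework level assigned by the second reviewer

    def needs_adjudication(self) -> bool:
        """True when the two independent coders disagree, or a code is not a known
        level; such extracts would go to discussion or a third reviewer."""
        return (self.coder_a_level != self.coder_b_level
                or bool({self.coder_a_level, self.coder_b_level} - LEVELS))

# Example usage with made-up extracts
extracts = [
    IndicatorExtract("R01", "Number of PhD fellowships awarded", "individual", "individual"),
    IndicatorExtract("R02", "Quality-assurance SOPs adopted by host institutions",
                     "institutional", "national/regional/network"),
]
for e in extracts:
    if e.needs_adjudication():
        print(f"{e.report_id}: '{e.text}' needs adjudication")
```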
Key Findings
- Corpus: 12 evaluations (from 18 reports) spanning diverse RCS initiatives, durations, contexts, and evaluation approaches. Reports came from 11 of 31 contacted agencies; 54 reports initially identified.
- Evaluation design quality: All evaluations stated purposes; most used mixed methods and existing data, often supplemented with site visits and/or interviews. Complexity varied by initiative scope and evaluation timing (mid-term vs. final). Several evaluations lacked a clear monitoring and evaluation framework and had short review timeframes. Most lacked baseline data; one considered but did not use a control. Constraints limited assessment of change, attribution, and effectiveness.
- Data collection and validity: Surveys/questionnaires were crafted specifically for each evaluation; only one report described a formal pilot test of a survey instrument. About half explicitly discussed potential biases (e.g., response bias, recall, classification, low response rates) and used triangulation/site visits for validation.
- Indicators: Coverage and justification varied widely. Many indicators were specific, attainable, realistic, and timely; measurability was more challenging. Some evaluations employed bibliometric indicators (e.g., publication counts, citation rates, impact factors, norm-referencing). A few linked indicators to intervention logic frameworks.
- Individual-level indicators: Common indicators included numbers and quality of trainings (e.g., PhD/MSc, fellowships), balance of training content (methods, process, advocacy), development of research skills, mentoring ratios, and trainee satisfaction. Equity-related disaggregation (gender, nationality, income level, discipline, award level) appeared in some evaluations; broader equity dimensions (e.g., socio-economic status, minority groups) were rare. Job outcomes highlighted contextual constraints (e.g., limited postdoctoral career structures in LMICs).
- Institutional-level indicators: Links from individual awards to institutional strengthening were reported (e.g., mentoring capacity, supervisor numbers, research support services, ICT improvements). Indicators covered infrastructure and management (labs, libraries, IT; SOPs, QA, governance, strategic planning, financial reporting, evaluation capacity, gender analysis), organizational learning, proposal leadership, and collaboration quality (local ownership, regional partnerships, visibility). Missed opportunities included limited sharing of donated equipment and techniques.
- National/regional/network-level indicators: Activities with policy makers and networks included stakeholder engagement, communication strategies, tailored dissemination tools, and capacities of research users to acquire/appraise/use evidence (rarely measured). Indicators addressed ministry commitment to research, national research councils and priority-setting, legal frameworks, trans-disciplinary platforms, network sustainability (reduced dependence on individuals), and network functioning (rules, perceived fairness). Many networks lacked readily available output data; formal M&E grounded in program logic was urged.
- Pathways of change: Within-evaluation linkage from activities to outputs and outcomes was limited. Across evaluations, the authors constructed illustrative pathways with corresponding indicators at individual (e.g., training leading to research careers and collaborations), institutional (e.g., infrastructure and governance improvements leading to accreditation and funding), and national/network levels (e.g., engagement leading to policy impact and harmonized regional activities); see the sketch after this list.
- Equity and coverage gaps: Indicator data disaggregated by equity categories were rare despite global equity priorities. Important constructs such as ongoing relationships with knowledge users were often missing. Nomenclature and scale at the upper levels were inconsistent, complicating cross-case comparisons.
- Indicator quality and contribution assessment: Few indicators met full SMART criteria; comments on indicator quality were rare. Limited investment in evaluation and divided responsibilities for data collection likely contributed. Most evaluations were retrospective with little prospective planning, limited articulation of theories of change, assumptions, or confounders, hindering causal inference and contribution assessment.
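To show how such a pathway and its indicators might be written down explicitly, here is a minimal sketch of an individual-level results chain, loosely following the cross-evaluation example above (training leading to research careers and collaborations). The data structures and the specific indicator wordings are illustrative assumptions, not items drawn from the evaluated reports.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Stage:
    """One step in a results chain, with indicators that could track it."""
    description: str
    indicators: List[str] = field(default_factory=list)

@dataclass
class Pathway:
    """An activities -> outputs -> outcomes pathway at one capacity level."""
    level: str   # individual, institutional, or national/regional/network
    activity: Stage
    output: Stage
    outcome: Stage

# Hypothetical individual-level pathway assembled for illustration only
individual_pathway = Pathway(
    level="individual",
    activity=Stage("Postgraduate research training and mentoring",
                   ["number of PhD/MSc trainees supported", "mentor-to-trainee ratio"]),
    output=Stage("Trainees completing with demonstrable research skills",
                 ["completion rate", "trainee-led publications"]),
    outcome=Stage("Graduates pursuing research careers and collaborations",
                  ["proportion in research posts", "new collaborative grants secured"]),
)

for stage in (individual_pathway.activity, individual_pathway.output, individual_pathway.outcome):
    print(f"{stage.description}: {', '.join(stage.indicators)}")
```

Keeping each stage's indicators alongside its description makes gaps in the chain, such as an outcome with no measurable indicator, easy to spot.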
Discussion
The study addressed its aims by cataloguing indicators used in health RCS evaluations and assessing design quality and linkage across impact pathways. Findings show that while a broad array of indicators exists across individual, institutional, and national/network levels, their measurement properties, equity disaggregation, and systematic linkage to theories of change are often inadequate. This undermines robust assessments of RCS effectiveness and impact and limits accountability and learning for stakeholders. The diversity of contexts and designs reflects the complexity of RCS initiatives, yet common weaknesses emerged: absence of baselines, limited piloting/validation of tools, sparse discussion of bias, and insufficiently specified monitoring frameworks. At the indicator level, coverage favored activities and outputs, with fewer indicators capturing relationships with research users, policy influence, and system-level change. Inconsistent terminology and scale at upper levels further complicate comparisons. To improve relevance and rigor, the authors argue for prospective, theory-driven evaluation designs, stakeholder engagement in indicator selection, systematic attention to indicator quality (valid standards, multiple data sources, triangulation), and equity-focused disaggregation. Enhanced evaluation investments and clearer frameworks can enable better contribution assessment, ultimately strengthening the evidence base for RCS strategies and justifying investments.
Conclusion
This synthesis consolidates knowledge on evaluation designs and indicators applicable to diverse health RCS initiatives. It demonstrates how indicators can be organized along potential pathways of change across individual, institutional, and national/network levels, while highlighting gaps in indicator quality, equity coverage, and theory-driven linkage. The authors call for more rigorous, prospective evaluation designs, clearer frameworks grounded in theories of change, and improved measurement to generate robust evidence on the outcomes and impacts of health RCS and to better justify investments.
Limitations
- Not all contacted funders provided reports, potentially limiting comprehensiveness.
- The analysis covered a limited number of evaluations due to labor-intensive extraction and analysis, though saturation was reached with common themes emerging.
- Allocation of narrative extracts and indicators to framework categories occasionally required adjudication.
- Most evaluations represented a single time point; only two tracked RCS longitudinally.
- Few evaluations captured the interplay of multiple health development efforts and RCS initiatives contributing to system emergence, which longer-term case studies may better elucidate.