Autonomous artificial intelligence increases real-world specialist clinic productivity in a cluster-randomized trial

M. D. Abramoff, N. Whitestone, et al.

This cluster-randomized clinical trial found that an autonomous AI system for diabetic eye exams significantly increased specialist clinic productivity. The study, whose authors include Michael D. Abramoff and Nathan Congdon, reports an increase of about 40% in completed care encounters per hour per specialist among patients with diabetes, pointing to the potential of autonomous AI to expand healthcare delivery capacity.

Introduction
The study addresses the global problem of limited access to essential health services, which particularly affects racial/ethnic minorities, low socioeconomic groups, and rural populations. Clinic productivity in healthcare has been declining despite productivity gains in other sectors. The authors hypothesize that autonomous AI, in which the system makes diagnostic decisions without human oversight, can improve clinic productivity, measured as completed care encounters per hour per specialist. With autonomous AI systems recently FDA-authorized and reimbursed, the study's purpose is to test, in a preregistered randomized clinical trial, whether deploying such AI for diabetic eye disease (DED) screening increases specialist clinic productivity in real-world practice.
Literature Review
Background literature cited highlights: (1) widening healthcare productivity gaps contrasted with historical total factor productivity growth in other sectors; (2) evidence that healthcare labor productivity may be declining in ambulatory services; (3) mixed impacts of information technology such as electronic medical records on healthcare productivity; and (4) emergence of FDA-authorized autonomous AI systems validated for safety, efficacy, and lack of demographic bias for diabetic eye disease detection. Prior pivotal trials established AI accuracy and regulatory approval, but real-world productivity impacts had not been rigorously tested before this study.
Methodology
Design: B-PRODUCTIVE was a preregistered, prospective, double-masked, cluster-randomized clinical trial conducted at the Deep Eye Care Foundation (DECF) in Rangpur, Bangladesh, from March 20 to July 31, 2022. Clusters were specialist clinic days, randomized monthly via concealed allocation. Medical staff controlling access, specialists, and patient participants were masked to group assignment.
Setting and participants: All three retina specialists at DECF participated (100% male; mean 5.17 years in practice). Patient participants were adults (≥22 years) with diabetes meeting AI eligibility: best-corrected visual acuity ≥6/18 in the better eye; no prior DED diagnosis or retinal treatment/surgery; no contraindications to dilated fundus imaging; and able to consent. Exclusions included symptoms suggestive of active DED, prior DR/DME diagnosis, prior retina procedures, and inability to consent.
Intervention and control workflows: All consenting AI-eligible patients underwent the autonomous AI diagnostic workflow (LumineticsCore, formerly IDx-DR) with pharmacologic dilation and standardized fundus imaging (Topcon NW400). AI outputs were: DED present (refer), DED absent (retest in 12 months), or insufficient image quality. In the intervention arm, AI output determined next steps: patients with DED absent completed their encounter without a specialist; those with DED present or insufficient quality saw the specialist. In the control arm, all participants proceeded to specialist evaluation regardless of AI output. Non-consenting eligible patients followed usual care without AI-driven decisions.
Autonomous AI system: LumineticsCore diagnoses referable DED (ETDRS level ≥35 and/or clinically significant or center-involved macular edema). The system is fully autonomous and previously validated for safety, efficacy, explainability, and lack of demographic bias; it received FDA De Novo authorization in 2018 and is reimbursed in the U.S.
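The intervention and control workflows described above amount to a small routing rule. The following is a minimal sketch of that rule, not the trial's software: the names `Arm`, `AIOutput`, and `sees_specialist` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class Arm(Enum):
    """Randomized assignment of a clinic day (the cluster)."""
    INTERVENTION = "intervention"
    CONTROL = "control"


class AIOutput(Enum):
    """The three AI outputs listed in the summary."""
    DED_PRESENT = "DED present (refer)"
    DED_ABSENT = "DED absent (retest in 12 months)"
    INSUFFICIENT_QUALITY = "insufficient image quality"


def sees_specialist(arm: Arm, ai_output: AIOutput) -> bool:
    """Return True if the patient proceeds to specialist evaluation."""
    if arm is Arm.CONTROL:
        # Control days: everyone sees the specialist regardless of AI output.
        return True
    # Intervention days: only DED-absent patients finish without a specialist.
    return ai_output is not AIOutput.DED_ABSENT


if __name__ == "__main__":
    for arm in Arm:
        for output in AIOutput:
            routed = sees_specialist(arm, output)
            print(f"{arm.value:12} | {output.value:34} | specialist: {routed}")
```

The only branch that removes a specialist visit is an intervention day combined with a DED-absent result, which is the mechanism the trial tests.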
Outcomes: The primary outcome was clinic productivity among diabetes patients, defined as completed care encounters per hour per specialist (λ). In the intervention arm, λAI counted AI-only completed encounters (DED absent) plus specialist-involved encounters; in the control arm, λc counted completed specialist encounters. All diabetes patients presenting to the specialty clinic on study days were included in the productivity denominator. Secondary outcomes included productivity for all patients (with and without diabetes), complexity-adjusted specialist productivity (λcα) using a summed eye-level complexity score (International Clinical DR/DME scales), patient satisfaction, number of DED treatments scheduled per day, and AI diagnostic performance versus a level-4 human grading reference standard.
Sample size and power: Assuming ICC=0.15, cluster size=8 per day, control mean 1.34 visits/hour, alpha=0.05 (two-sided), and 80% power, a total of 924 completed encounters (462 per arm) would detect a 0.34 visits/hour difference (~25% increase) in productivity.
Statistical analysis: The primary outcome was compared by two-sided Student's t-test with 95% CIs. Secondary outcomes were analyzed with the Wilcoxon rank-sum test where non-normal. Robustness was assessed via linear regression with generalized estimating equations accounting for clustering by clinic day (autoregressive structure), evaluating potential confounders (age, sex, education, income, complexity, day of week, AI output). The sensitivity analysis included variables with p<0.10 in univariate models. SAS 9.4 was used.
Operational details: 105 clinic days were randomized: 51 intervention days and 54 control days. The clinic averaged 54.5 patients/day. Among 2109 patients with diabetes presenting, 993 were AI-eligible, and all consented and completed AI (494 intervention; 499 control). Implementation and operator training were performed remotely; imaging typically took ~10 minutes; AI results were returned within ~60 seconds.
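The productivity metric and the clustering assumptions above can be made concrete with a short sketch. It rests on two assumptions not confirmed by the summary: that λ is computed as completed encounters divided by total specialist-hours, and that the clustering adjustment follows the standard design-effect formula 1 + (m − 1) × ICC. The function names are hypothetical, and the encounter counts in the demo are invented purely for illustration.

```python
def encounters_per_hour_per_specialist(completed_encounters: float,
                                       clinic_hours: float,
                                       n_specialists: int) -> float:
    """Productivity (lambda): completed encounters per specialist-hour.

    Assumption: the denominator is clinic hours times number of specialists.
    """
    if clinic_hours <= 0 or n_specialists <= 0:
        raise ValueError("clinic_hours and n_specialists must be positive")
    return completed_encounters / (clinic_hours * n_specialists)


def design_effect(cluster_size: float, icc: float) -> float:
    """Standard variance inflation factor for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(total_n: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Invented inputs: 12 completed encounters, 4 clinic hours, 3 specialists.
    print(encounters_per_hour_per_specialist(12, 4, 3))

    # Inputs from the summary: ICC = 0.15, cluster size = 8, 924 encounters.
    print(round(design_effect(8, 0.15), 2))
    print(round(effective_sample_size(924, 8, 0.15), 1))
```

Under these assumptions the design effect is about 2.05, so the 924 planned encounters carry roughly the information of 450 independent ones. The summary does not give the standard deviation used for the power calculation, so the sample size itself is not recomputed here.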
Key Findings
- Primary productivity (diabetes patients): Intervention λAI 1.59 encounters/hour/specialist (95% CI: 1.37–1.80) vs control λc 1.14 (95% CI: 1.02–1.25); increase 0.45/hour (39.5%); p<0.001.
- Regression analyses: Intervention membership significantly associated with higher productivity (univariate β=0.449, SE 0.120, p<0.001). Sensitivity analysis with adjustment for age, sex, day of week, and AI output: β=0.461 (SE 0.118), p<0.001. Day of week showed some associations but minimal impact on the main effect.
- Secondary productivity (all patients): Intervention 4.05 (95% CI: 3.66–4.43) vs control 3.36 (95% CI: 3.08–3.63); p=0.004.
- Complexity-adjusted specialist productivity (diabetes patients): Intervention 3.15 vs control 1.19; increase by a factor of 2.65.
- Patient flow: In intervention, 331/494 (67.0%) completed care via AI-only (DED absent), freeing specialist capacity for other patients.
- Patient satisfaction: 100% satisfied/very satisfied with appointment waiting time; interaction satisfaction 100% control (499/499) vs 99.8% intervention (493/494). Among AI-only completes, 100% satisfied with time to receive results and receiving results from AI.
- Specialist perceptions: All agreed/strongly agreed AI saved time and enabled focus on appropriate patients.
- Treatments and complexity: No difference in DED treatments scheduled/day (control 0.70 [95% CI: 0.47–0.93] vs intervention 0.61 [0.38–0.83], p=0.532). Overall complexity scores similar across all participants (p=0.288), but among those requiring specialist exam post-AI, complexity higher in intervention (mean 2.80 ± 3.19) vs control (1.06 ± 2.36), p<0.0001.
- AI diagnostic performance vs level-4 human reference: Sensitivity 93.9% (95% CI: 90.5–97.2), specificity 84.0% (95% CI: 81.4–86.7).
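Several of the derived figures above follow directly from the reported point estimates. The sketch below recomputes them as a consistency check; the function name `recompute_key_findings` is hypothetical, and because the inputs are the rounded values from the summary, agreement is expected only to the reported precision.

```python
def recompute_key_findings() -> dict:
    """Recompute derived quantities from the rounded values in the summary."""
    lambda_ai, lambda_c = 1.59, 1.14         # primary productivity, per arm
    adjusted_ai, adjusted_c = 3.15, 1.19     # complexity-adjusted productivity
    ai_only, intervention_n = 331, 494       # AI-only completed encounters
    satisfied_intervention = 493             # satisfied with interaction

    absolute_increase = lambda_ai - lambda_c
    return {
        # Encounters per hour per specialist gained on intervention days.
        "absolute_increase": round(absolute_increase, 2),
        # Gain relative to the control arm, in percent.
        "relative_increase_pct": round(100 * absolute_increase / lambda_c, 1),
        # Ratio of complexity-adjusted productivity between arms.
        "complexity_adjusted_factor": round(adjusted_ai / adjusted_c, 2),
        # Share of intervention participants who finished without a specialist.
        "ai_only_pct": round(100 * ai_only / intervention_n, 1),
        # Interaction satisfaction in the intervention arm, in percent.
        "intervention_satisfaction_pct": round(
            100 * satisfied_intervention / intervention_n, 1),
    }


if __name__ == "__main__":
    for name, value in recompute_key_findings().items():
        print(f"{name}: {value}")
```

The recomputed values match the reported 0.45/hour, 39.5%, factor of 2.65, 67.0%, and 99.8%.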
Discussion
The trial confirms the hypothesis that autonomous AI improves real-world specialist clinic productivity in an overloaded, unscheduled clinic context. By completing encounters for lower-complexity, DED-absent patients at the point of care, AI freed specialist time for more complex cases, as reflected by higher complexity among specialist-seen patients on intervention days. This reallocation increased throughput (≈40% among diabetes patients; significant gains across all patients) without reducing care quality, supported by high AI diagnostic performance and near-universal patient and provider satisfaction. The findings address the healthcare productivity gap, suggesting that autonomous AI can deliver productivity gains where traditional IT has sometimes failed to do so, and may help expand access and reduce disparities, particularly in low- and middle-income settings with scarce specialists. The masked, cluster-randomized design in a saturated-queue environment minimized scheduling and behavioral biases, strengthening the causal inference that AI use drove the productivity gains.
Conclusion
In the B-PRODUCTIVE cluster-randomized trial, an FDA-authorized autonomous AI system for diabetic eye exams increased specialist clinic productivity by approximately 40% among diabetes patients and significantly for all patients, with complexity-adjusted specialist productivity improving by a factor of 2.65. AI-enabled workflows allowed specialists to focus on complex cases, maintained high diagnostic performance, and achieved high patient and provider satisfaction. Autonomous AI has strong potential to improve access and mitigate health disparities by increasing the productivity of overstretched health systems. Future research should evaluate generalizability across multiple health systems and countries, scheduled clinic environments, additional diseases and AI systems, and long-term impacts on outcomes and costs.
Limitations
Single health system in a low-income country; only three specialist physicians; AI targeted a single condition (referable DED) and validated for patients without symptoms or prior DED history; conducted in a saturated-queue, unscheduled clinic context—results may not generalize to scheduled clinics; potential unmasking via perceived complexity could bias against AI; broader applicability to other conditions, settings, and AI systems requires caution and further studies.