Medicine and Health
AI-based mobile application to fight antibiotic resistance
M. Pascucci, G. Royer, et al.
Discover how an innovative, fully offline AI-based smartphone application is transforming antibiogram analysis: it captures plate images, guides the analysis, and delivers interpreted results with high accuracy. Developed by an expert team including Marco Pascucci and Guilhem Royer, the tool aims to improve antibiotic susceptibility testing in resource-limited settings.
Introduction
The study addresses the need for accessible, reliable antibiotic susceptibility testing (AST) in the face of rising antimicrobial resistance (AMR), especially in low- and middle-income countries where AST access is limited. Disk diffusion (Kirby-Bauer) is widely used but criticized for being labor-intensive, subject to inter-operator variability, and requiring expert interpretative reading to account for complex resistance mechanisms and class-level inferences. Commercial automated readers exist but are costly and require infrastructure unsuited to resource-limited settings. The research aims to develop and evaluate a fully offline, smartphone-based application that can acquire images of disk diffusion plates, automatically measure inhibition zones, identify antibiotic disks, and provide interpreted susceptibility categories using an embedded expert system, thereby improving reliability, reducing variability, and enabling broader access to AST.
Literature Review
Prior work includes commercial automated zone readers (e.g., SIRscan, Osiris, WASPLab with expert systems), which improve standardization and turnaround time but depend on dedicated hardware and infrastructure, limiting use in resource-constrained settings. Several image processing approaches have been proposed for automatic diameter measurement via radial profile analysis, texture/directional filtering, and computerized image analysis. AntibiogramJ is a functional desktop software requiring prior image transfer. Earlier methods for reading printed disk acronyms used image moment invariants or ORB descriptors. Expert systems based on EUCAST rules are established in microbiology and included in commercial systems. However, an offline, end-to-end smartphone solution integrating image processing, ML-based disk identification, and an expert system was lacking.
Methodology
System design: The application runs entirely offline on a smartphone and comprises three components: (1) an image processing (IP) module for analysis of AST images; (2) an expert system (ES) implementing EUCAST-based rules for coherence checks and interpretation; (3) a graphical user interface (GUI) guiding users and allowing verification/correction of measurements.
- Image acquisition: Photos are captured with the phone camera using simple, low-cost setup guidelines (e.g., cardboard stand ensuring camera parallelism, black background). The app uses gyroscope/accelerometer to enforce device orientation/stability and an on-screen frame to center the plate. No perspective correction is applied; instead, acquisition setup minimizes distortion. A method is provided to assess camera optical distortions.
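The orientation/stability gate described above can be sketched with a simple accelerometer test. This is a minimal illustration, not the app's actual code; the function names and the 3° tolerance are assumptions. When the phone is held parallel to the plate, gravity falls almost entirely on the device's z-axis, so the tilt angle can be read off the gravity vector:

```python
import math

def tilt_angle_deg(ax: float, ay: float, az: float) -> float:
    """Angle (degrees) between the device z-axis and the gravity vector.

    (ax, ay, az) is an accelerometer reading; near-zero tilt means the
    camera plane is roughly parallel to the plate on the stand.
    """
    g = math.sqrt(ax * ax + ay * ay + az * az)
    return math.degrees(math.acos(abs(az) / g))

def camera_is_level(ax: float, ay: float, az: float,
                    max_tilt_deg: float = 3.0) -> bool:
    # Hypothetical acceptance rule: allow capture only below a small tilt.
    return tilt_angle_deg(ax, ay, az) <= max_tilt_deg
```

A UI would poll this check continuously and enable the shutter button only while it returns `True`.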
- Image processing library: Implemented in C++ using OpenCV and TensorFlow, with a Python wrapper for prototyping and batch processing. Steps: (a) Plate cropping via GrabCut assuming approximate centering; (b) detection of plate dominant color and shape (round/square) to infer medium type (MH vs blood-enriched MH-F); (c) grayscale conversion; (d) detection of antibiotic disks (white circular cellulose disks, typically 6 mm) and estimation of scale (pixels to mm) from average disk radius; (e) antibiotic label recognition using a CNN trained on 18,000 disk images across two manufacturers (65 labels). An ensemble of 10 models with an output entropy threshold reduces out-of-distribution errors; achieved 99.97% label accuracy and robust performance on poorly printed or slightly out-of-focus disks; (f) inhibition zone diameter measurement using a new algorithm, SWITCH (Spatial Weighted Intensity Threshold CHangepoint): local k-means clustering (k=2) around each disk classifies inhibition vs bacteria pixels; a radial profile I(r) is computed (considering all pixels at distance r from disk center up to nearest neighbor disk), and a changepoint in I(r) segments the inhibition boundary. This method is robust to illumination variability, texture, hazy borders, overlaps, plate borders, and non-circular zones.
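The changepoint step at the heart of the diameter measurement can be illustrated in miniature: given a 1-D radial intensity profile I(r) around a disk, find the split that best separates the dark inhibition zone from the brighter bacterial lawn. The sketch below is a generic two-segment least-squares changepoint, a simplified stand-in for SWITCH (which additionally weights pixels spatially and clusters them locally with k-means); the scale parameters are illustrative:

```python
def radial_changepoint(profile):
    """Index at which a 1-D radial intensity profile I(r) changes level.

    Tries every split point and keeps the one minimising the total
    within-segment squared error; in an AST image this split marks the
    inhibition-zone boundary around a disk.
    """
    n = len(profile)
    best_k, best_cost = 1, float("inf")
    for k in range(1, n):
        left, right = profile[:k], profile[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        cost = (sum((x - mean_l) ** 2 for x in left)
                + sum((x - mean_r) ** 2 for x in right))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

def zone_diameter_mm(profile, mm_per_px, r0_px):
    """Convert the changepoint radius to a diameter in mm.

    r0_px is the pixel radius of the first profile sample (the disk edge);
    mm_per_px would come from the 6 mm disk-radius calibration.
    """
    return 2 * (r0_px + radial_changepoint(profile)) * mm_per_px
```

On a synthetic profile that jumps from a dark plateau to a bright one, the split lands exactly at the jump, which is the behaviour the real algorithm needs to be robust to hazy borders and texture.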
- Susceptibility categorization and interpretation: Measured diameters are compared to EUCAST clinical breakpoints stored offline in the ES knowledge base (maintained/updated yearly by i2a). The ES performs coherence checks (e.g., species-antibiotic consistency, expected intrinsic resistances) and extrapolates results across antibiotic classes, issuing clinical comments and alerts for key resistance mechanisms.
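A toy version of the diameter-to-category step may help fix ideas. The breakpoint numbers below are placeholders for illustration only; the real values live in the app's offline EUCAST knowledge base and depend on species, medium, and disk load:

```python
# Hypothetical diameter breakpoints in mm: (S if >= s_cut, R if < r_cut).
# Real values come from the EUCAST tables embedded in the expert system.
BREAKPOINTS = {
    "ciprofloxacin": (25, 22),
    "amoxicillin": (15, 15),
}

def categorize(antibiotic: str, diameter_mm: float) -> str:
    """Map a measured inhibition-zone diameter to S/I/R:
    S at or above the susceptible cutoff, R below the resistant cutoff,
    I in between."""
    s_cut, r_cut = BREAKPOINTS[antibiotic]
    if diameter_mm >= s_cut:
        return "S"
    if diameter_mm < r_cut:
        return "R"
    return "I"
```

The expert system then layers coherence checks and class-level extrapolations on top of these raw categories.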
- Resistance mechanism shapes: Two ML classifiers were trained to detect characteristic inhibition zone shapes indicating synergy (e.g., ESBL) and induction (macrolide-inducible clindamycin resistance). Accuracies reached 98% (ESBL) and 99.7% (induction) on limited training data. Given overfitting and clinical risk concerns, automatic deployment in the app was deferred; instead, the GUI prompts users to confirm presence/absence of such shapes when relevant, with illustrated examples.
- Datasets for evaluation: Three AST sets were used in fully automatic mode (no manual corrections): A1 (570 plates, 8,168 disks total; MH, mostly square plates) and A2 (75 plates, 649 disks; MH-F blood agar; mixed plate shapes) prepared during routine hospital workflow in Créteil, France; A3 (8 plates, 98 disks; MH; circular plates) prepared in MSF Amman Hospital using ATCC QC strains. In A1 and A2, disks from i2a were placed with a dispenser gun; in A3, Liofilchem disks were placed manually with tweezers. Incubation conditions followed EUCAST recommendations. Multiple technicians (up to 8 in A2/A3) provided manual measurements for benchmarking; SIRscan automatic readings served as control for A1/A2; manual measurements served as control for A3.
- Performance measurement: Agreement in susceptibility categorization (S/I/R) between the app’s automatic pipeline and control was computed, along with error rates (very major, major, minor) and Cohen’s kappa. Inter-operator variability was assessed comparing manual ruler measurements vs app-assisted diameter adjustments. Computational performance was measured on different devices and image resolutions.
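The agreement metrics above can be computed as in the following sketch, which uses the usual AST error convention (very major = the app reports S where the control reports R, major = the reverse, minor = any disagreement involving I); it is an illustration of the metrics, not the study's analysis code:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa between two raters' S/I/R category lists."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_exp = sum(ca[c] * cb[c] for c in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

def error_rates(app, control):
    """(very major, major, minor) rates for app vs control categories."""
    n = len(control)
    vm = sum(x == "S" and y == "R" for x, y in zip(app, control))
    mj = sum(x == "R" and y == "S" for x, y in zip(app, control))
    mn = sum(x != y and "I" in (x, y) for x, y in zip(app, control))
    return vm / n, mj / n, mn / n
```

Agreement is then simply one minus the sum of the three error rates.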
Key Findings
- Overall feasibility and accuracy: The fully automatic pipeline achieved overall agreement in susceptibility categorization of 90% versus a hospital-standard automatic system (SIRscan) and 98% versus manual measurement (gold standard).
- Dataset-specific results (Table 2):
  - A1 (MH, control SIRscan): overall agreement 90% across 570 images and 7,334 antibiotics; very major 54 (0.7%), major 428 (5.8%), minor 270 (3.7%); Cohen's kappa 0.77. Standard subset (561 images) agreement 90% (kappa 0.77). Problematic subset (9 images) agreement 58% (kappa 0.12).
  - A2 (blood MH-F, control SIRscan): overall agreement 91% across 75 images and 534 antibiotics; very major 4 (0.7%), major 36 (6.7%), minor 6 (1.1%); kappa 0.71. Standard subset (73 images) agreement 95% (kappa 0.83). Problematic subset (2 images) agreement 12% (kappa 0.01).
  - A3 (MH, control manual): agreement 98% across 64 images and 976 antibiotics; very major 3 (0.4%), major 7 (0.9%), minor 27 (3.5%); kappa 0.96. Against the average of eight manual raters (subset of 776 antibiotics): 95% agreement (very major 1.03%, minor 1.03%).
- Inter-operator variability: App-assisted measurements (adjusting a circle on the smartphone) reduced inter-operator diameter variability compared to manual ruler measurements; manual readings tended to be slightly larger than automatic/app-assisted ones.
- Robustness and image variability: The pipeline handled variability in contrast, illumination, and plate/disk arrangements; problematic images constituted a small fraction (1.5% in A1; 2.6% in A2; 0% in A3) but could cause major discrepancies.
- Computational efficiency and device range: End-to-end reading time for a 12 MP image with 16 disks was <1 s on a 2.3 GHz Intel Core i5 PC, ~1.5 s on a high-end smartphone (Pixel 3, 2018), and ~6.6 s on a low-end smartphone (Samsung A10). Consistent results were obtained even when downscaling images to ~1 MP. The app was tested on Google Pixel 3A, Honor 6x, and Samsung A10; Samsung A10 is recommended as an affordable device.
- ML components: Disk label CNN achieved 99.97% accuracy on 18,000 images (65 labels) using an ensemble with entropy thresholding to mitigate out-of-distribution errors. Prototype ML detectors for resistance mechanism shapes achieved 99.7% (induction) and 98% (ESBL synergy) accuracy on limited datasets but were not deployed for automatic decisions.
- Expert system: The embedded ES performed coherence checks, extrapolated class-level interpretations, and provided clinical comments offline based on EUCAST expert rules updated by i2a.
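The entropy-based rejection used with the disk-label ensemble can be sketched as follows: average the ensemble's softmax outputs and refuse to commit to a label when the averaged distribution is too flat, since high entropy signals an out-of-distribution disk that should be deferred to the user rather than mislabelled. The 0.5-nat threshold is an illustrative assumption, not the app's calibrated value:

```python
import math

def mean_softmax(prob_lists):
    """Average the softmax outputs of an ensemble of classifiers."""
    n = len(prob_lists)
    return [sum(p[i] for p in prob_lists) / n
            for i in range(len(prob_lists[0]))]

def predict_with_rejection(prob_lists, max_entropy=0.5):
    """Predicted label index, or None when the averaged distribution's
    entropy exceeds the threshold (flag the disk as unrecognised)."""
    probs = mean_softmax(prob_lists)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    if entropy > max_entropy:
        return None
    return max(range(len(probs)), key=probs.__getitem__)
```

In the app, a rejected disk would be surfaced in the GUI for the user to label manually instead of receiving a silently wrong prediction.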
Discussion
The findings demonstrate that a fully offline, smartphone-based system can perform disk diffusion AST analysis with accuracy comparable to established automatic readers and to manual gold-standard measurements, while reducing inter-operator variability through an intuitive, assisted measurement interface. By integrating a robust image processing pipeline (including the SWITCH algorithm for diameter detection) and a continually updated expert system, the app addresses the key shortcomings of manual disk diffusion—subjectivity, labor intensity, and interpretative complexity—without the cost and infrastructure barriers of commercial systems. Performance was maintained across variable image qualities and media types (standard MH and blood-enriched MH-F), and processing times were practical on low-end devices, supporting suitability for resource-limited laboratories. Compared with existing systems, the app’s offline operation, minimal hardware requirements, and guided workflow are significant advantages for adoption in LMIC contexts. While ML models for detecting specific resistance mechanism morphologies showed promising accuracies, the decision to prompt user confirmation rather than fully automate these detections balances innovation with clinical safety. Overall, the work supports broader AST access and contributes a pathway for scalable AMR surveillance through standardized, interpretable results generated at the point of care.
Conclusion
This work introduces and validates an offline smartphone application that acquires AST images, automatically identifies antibiotic disks, measures inhibition zone diameters using the SWITCH algorithm, and interprets susceptibility with an embedded expert system. The system achieves high agreement with both a hospital-standard automated reader and manual gold-standard measurements, reduces inter-operator variability, and operates efficiently on low-cost smartphones, making it well suited for resource-limited settings. Future directions include: (1) clinical investigations in MSF hospitals to quantify patient impact; (2) planned open-source release of the application (Antibiogo) following CE marking; (3) support for selective reporting to enhance antibiotic stewardship; and (4) optional contribution of de-identified results to global AMR surveillance (e.g., WHO GLASS/WHONET). Further work may also expand device compatibility, broaden training for disk label recognition across more manufacturers, and rigorously validate ML-based detection of resistance mechanism morphologies before integration.
Limitations
- Image acquisition variability: Smartphone cameras and non-standardized setups introduce variability in contrast, illumination, and reflections; although guidelines and UI checks mitigate this, a small subset of images remained problematic and led to major discrepancies.
- Hardware and generalizability: Only a few smartphone models were tested; broader validation across devices, cameras, and operating conditions is needed. The system does not include perspective correction by design, relying instead on user setup.
- Resistance mechanism detection: ML models for synergy/induction were trained on relatively small datasets, raising overfitting concerns; consequently, automatic deployment was withheld, and user confirmation is required, limiting fully automated interpretation of such mechanisms.
- Manual control variability: Comparisons rely on SIRscan as control for A1/A2 and manual measurements for A3; manual ruler measurements are themselves variable and may bias diameter differences.
- Manufacturer variability: Disk label recognition was trained on disks from two manufacturers (65 labels); although entropy thresholding and ensembling reduce errors, out-of-distribution disks from other manufacturers may require further training.
- Not a replacement for high-end systems: The app is not intended to match the full standardization and throughput of dedicated commercial hardware, which may limit applicability in high-volume labs.