Screening COVID-19 by Swaasa AI platform using cough sounds: a cross-sectional study

Medicine and Health

P. Pentakota, G. Rudraraju, et al.

The Swaasa AI platform screens for COVID-19 using cough sounds and symptom data. In clinical validation it achieved 75.54% accuracy with 95.45% sensitivity, making it a cost-effective tool for remote, preliminary assessment and triage ahead of confirmatory testing. Developed by a team including Padmalatha Pentakota and Gowrisree Rudraraju.

Introduction
The study addresses the need for rapid, cost-effective, and scalable pre-screening tools for COVID-19 to curb transmission, given that standard viral and serological tests can be expensive, time-consuming, require technical expertise, and may struggle with emerging variants. Cough is a prevalent COVID-19 symptom and carries physiologic and acoustic information that may differentiate respiratory conditions. Prior work has shown the potential of AI-based cough analysis for various diseases and indicated promise for COVID-19 detection, but most efforts relied on crowdsourced datasets with limited clinical validation, hampering deployment. The research question is whether an AI platform analyzing short cough recordings can accurately screen and prioritize individuals likely to have COVID-19 in real-world clinical settings. The purpose is to develop, clinically validate, and pilot an AI model (Swaasa) that leverages a combined CNN and tabular-feature ANN to detect a distinct COVID-19 cough signature and function as a rapid point-of-care triage tool.
Literature Review
Recent studies have explored ML models for COVID-19 detection from respiratory sounds and coughs, reporting high AUCs using CNNs, LSTMs, and ensemble methods, often on crowdsourced data. Examples include: MFCC-based pretrained CNNs achieving AUCs around 94–98% and high sensitivity/specificity; ResNet50 distinguishing COVID-19-positive from healthy coughs (AUC ~98%) and LSTMs distinguishing positive vs. negative coughs (AUC ~94%); ensemble-based MCDM approaches with AUC ~95%. Other works combined cough and breathing/voice signals, reporting overall AUCs above 80%. However, these efforts frequently relied on open, crowdsourced datasets, with limited technical and clinical validation, raising questions about generalizability and real-world performance. The present study aims to overcome these limitations by using clinically acquired data from COVID-19 positive patients, healthy subjects, and patients with other respiratory diseases, and by conducting derivation, validation, and pilot deployment phases.
Methodology
Study design: Cross-sectional development, clinical validation, and pilot deployment of the Swaasa AI platform as a point-of-care screening tool for COVID-19 using 10-second cough recordings.

Sample size estimation: Using n = Z^2*P(1−P)/d^2 with Z = 1.96 (95% confidence), an anticipated proportion P = 0.75, and an absolute error d = 2.5%, a total of 1152 subjects was estimated to validate 90% sensitivity. Data from two clinical trial studies were pooled, yielding 1052 participants (~62% controls). Balanced COVID-19/non-COVID-19 data were used for training to avoid bias.

Data collection: Conducted at Andhra Medical College, Visakhapatnam, India, under IEC approval and CTRI registrations (CTRI/2021/09/036489 and CTRI/2021/07/035096). Adults (≥18 years) able to consent were enrolled; children, ventilated patients, certain asymptomatic individuals, and pregnant women were excluded. Demographics, vitals, SGRQ Part I, and COVID-19 symptom data were collected. Trained personnel captured cough recordings on a smartphone (Android or iPhone), instructing participants to cough 2–3 times within a 10-second recording while seated in a quiet area, holding the device 4–8 inches from the mouth at roughly 90°. Safety measures included mask use and device sanitation after each recording. A noise-reduction algorithm and a cough/non-cough classifier filtered and validated cough events. All participants underwent RT-PCR as the reference standard.

Model development (derivation phase): From 252 RT-PCR-confirmed COVID-19-positive subjects, 803 cough events were extracted using the moving-window standard deviation of the signal; a cough/non-cough classifier removed non-cough events. Additional non-COVID data (various respiratory diseases and healthy controls) from prior studies were included, giving 1946 cough events in total across the COVID-likely yes/no classes. The 209 features spanned the time and frequency domains: MFCCs, spectral features (centroid, roll-off, etc.), chroma, contrast, tonnetz, zero-crossing rate, energy, skewness, and kurtosis, plus age and gender. Correlation-based feature selection reduced the set to 170 features. Two parallel models were trained: (1) a CNN on MFCC spectrogram images and (2) a feedforward ANN on the tabular features. Their final layers were merged to generate the COVID-likely prediction, with an option to output "inconclusive" when uncertain. Training used K-fold cross-validation.

Clinical validation: The trained model was prospectively evaluated on 233 subjects from isolation wards and testing centers, each receiving both Swaasa screening and RT-PCR. Outputs were compared by a statistician; one subject's result was inconclusive and was excluded from the performance analysis.

External validation (pilot phase): The deployed model was tested on presumptive COVID-19 cases at a peripheral healthcare center to assess screening performance ahead of clinical diagnosis, comparing predictions with RT-PCR. Metrics included sensitivity, specificity, PPV, NPV, and accuracy.

Explainability: LIME was used to visualize local feature contributions on spectrograms; green areas indicate contributions toward a class, red areas against it. Disease-specific frequency distributions were examined to assess distinct cough signatures.

Statistics: Accuracy, sensitivity, specificity, PPV, NPV, and ROC/AUC were computed with 95% CIs using the Clopper–Pearson method. Confusion matrices were generated for each phase. Illustrative sketches of the sample-size arithmetic, cough-event extraction, feature assembly, merged model, LIME overlay, and interval estimation follow below.
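To make the sample-size arithmetic explicit, the formula can be evaluated directly. A minimal check in Python (not code from the study), with Z = 1.96 for 95% confidence:

# n = Z^2 * P * (1 - P) / d^2
Z = 1.96    # z-value for a 95% confidence interval
P = 0.75    # anticipated proportion
d = 0.025   # absolute error margin (2.5%)

n = Z**2 * P * (1 - P) / d**2
print(round(n))  # 1152 (raw value ~1152.48)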
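The paper reports extracting cough events via the moving-window standard deviation of the signal but does not publish parameters. The sketch below is one plausible realization; the window length win_ms and threshold multiplier k are illustrative assumptions.

import numpy as np

def extract_cough_events(signal, sr, win_ms=50, k=3.0):
    # Split the waveform into non-overlapping windows and compute each
    # window's standard deviation as a simple energy proxy.
    win = int(sr * win_ms / 1000)
    n = len(signal) // win
    stds = signal[: n * win].reshape(n, win).std(axis=1)
    # Flag windows well above the baseline; k * median is an assumed rule.
    active = np.flatnonzero(stds > k * np.median(stds))
    # Return (start, end) sample indices of candidate cough windows;
    # a cough/non-cough classifier would filter these downstream.
    return [(i * win, (i + 1) * win) for i in active]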
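A sketch of how a time/frequency feature vector of this kind might be assembled; librosa and scipy are assumed tooling, and the parameter choices and mean/standard-deviation summarization are illustrative rather than the authors' implementation (the study's full vector has 209 entries, with age and gender appended downstream).

import numpy as np
import librosa
from scipy.stats import skew, kurtosis

def cough_features(path):
    y, sr = librosa.load(path, sr=None, duration=10.0)  # 10-second recording
    frame_feats = [
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13),
        librosa.feature.spectral_centroid(y=y, sr=sr),
        librosa.feature.spectral_rolloff(y=y, sr=sr),
        librosa.feature.chroma_stft(y=y, sr=sr),
        librosa.feature.spectral_contrast(y=y, sr=sr),
        librosa.feature.tonnetz(y=y, sr=sr),
        librosa.feature.zero_crossing_rate(y),
        librosa.feature.rms(y=y),  # frame energy
    ]
    # Collapse each frame-level feature to per-coefficient mean and std,
    # then append whole-signal distribution shape (skewness, kurtosis).
    stats = [np.ravel(s) for f in frame_feats for s in (f.mean(axis=1), f.std(axis=1))]
    return np.concatenate(stats + [np.array([skew(y), kurtosis(y)])])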
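A minimal sketch of the merged two-branch architecture in Keras, assuming a 64x64 single-channel MFCC spectrogram and the 170 selected tabular features as inputs; layer counts and widths are illustrative, not the published configuration.

from tensorflow.keras import layers, Model

# CNN branch over the MFCC spectrogram image (input shape is assumed)
img_in = layers.Input(shape=(64, 64, 1), name="mfcc_spectrogram")
x = layers.Conv2D(16, 3, activation="relu")(img_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)

# Feedforward ANN branch over the 170 selected tabular features
tab_in = layers.Input(shape=(170,), name="tabular_features")
t = layers.Dense(64, activation="relu")(tab_in)
t = layers.Dense(32, activation="relu")(t)

# Merge the final layers of both branches for the COVID-likely output;
# an "inconclusive" label could be produced downstream by thresholding the
# sigmoid score (an assumed mechanism; the paper only states the option exists).
merged = layers.concatenate([x, t])
out = layers.Dense(1, activation="sigmoid", name="covid_likely")(merged)

model = Model(inputs=[img_in, tab_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])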
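The LIME overlays described above can be generated with the lime package's image explainer. In this sketch, spectrogram_rgb and predict_fn are stand-ins for a rendered spectrogram and the trained model, included only so the snippet runs end to end.

import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

rng = np.random.default_rng(0)
spectrogram_rgb = rng.random((64, 64, 3))  # stand-in for a rendered spectrogram

def predict_fn(images):
    # Stand-in for the model: returns [P(non-COVID), P(COVID)] per image.
    p = rng.random(len(images))
    return np.column_stack([1 - p, p])

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    spectrogram_rgb, predict_fn, top_labels=2, hide_color=0, num_samples=200
)
# Green regions push the prediction toward the class, red regions away from it.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=False, num_features=10
)
overlay = mark_boundaries(img, mask)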
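The exact (Clopper–Pearson) intervals can be reproduced with statsmodels, where method="beta" selects this estimator; the sketch below applies it to the pilot-phase PPV of 58 true positives among 82 flagged cases.

from statsmodels.stats.proportion import proportion_confint

# Exact binomial (Clopper-Pearson) 95% interval; the paper applies the
# same method to sensitivity, specificity, PPV, NPV, and accuracy.
low, high = proportion_confint(count=58, nobs=82, alpha=0.05, method="beta")
print(f"PPV {58/82:.2%}, 95% CI {low:.2%} to {high:.2%}")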
Key Findings
Derivation phase (clinical data):
- Dataset: 252 COVID-19-positive subjects (803 cough records) plus non-COVID data from prior studies; a total of 1213 records were used in the classifiers.
- Performance: accuracy 96%, sensitivity 95.8%, specificity 95.6%, AUC 0.95 (best-fold AUC up to 0.965). The ROC curve indicated strong discrimination.

Crowdsourced test during derivation:
- Performance: accuracy 86%, sensitivity 85%, specificity 88%, AUC 0.855.

Clinical validation phase:
- Sample: 234 enrolled; 22 RT-PCR positive, 211 RT-PCR negative; one inconclusive result excluded.
- Confusion matrix summary: accuracy 75.54%; sensitivity 95.45% (95% CI 75.16–99.88); specificity 73.46% (95% CI 69.96–79.29); AUC 0.75. A PPV of 27.27% and an NPV of 99.36% reflected the low disease prevalence in the cohort (a consistency check on these figures follows this list).

Pilot (external validation) phase:
- Sample: 183 presumptive cases screened; the model flagged 82 as COVID-likely, of which 58 were true positives.
- Performance: positive predictive value of 70.73% in real-world screening at a tertiary care hospital.

Cough signature insights:
- LIME and feature analyses indicate that COVID-19 coughs exhibit higher spectral content and more dominant high-frequency components than several other respiratory conditions; disease-specific differences in frequency distribution support cough-based discrimination.
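The clinical-validation percentages are mutually consistent and can be back-calculated from the cohort counts. The cell counts below (TP = 21, FN = 1, FP = 56, TN = 155) are inferred from the published metrics rather than quoted from the paper's confusion matrix, so treat this as a consistency check.

# Confusion-matrix cells inferred from the reported percentages
TP, FN = 21, 1      # 22 RT-PCR positives: sensitivity 21/22 = 95.45%
TN, FP = 155, 56    # 211 RT-PCR negatives: specificity 155/211 = 73.46%

sensitivity = TP / (TP + FN)                 # 0.9545
specificity = TN / (TN + FP)                 # 0.7346
ppv = TP / (TP + FP)                         # 21/77 = 0.2727
npv = TN / (TN + FN)                         # 155/156 = 0.9936
accuracy = (TP + TN) / (TP + TN + FP + FN)   # 176/233 = 0.7554

# Pilot phase: 58 true positives among 82 flagged cases
pilot_ppv = 58 / 82                          # 0.7073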
Discussion
The study demonstrates that a combined CNN (spectrogram-based) and tabular-feature ANN can identify a distinct acoustic signature in COVID-19 coughs and function as a rapid, noninvasive screening tool. High sensitivity in clinical validation (95.45%) suggests suitability for triaging and prioritizing individuals for confirmatory testing, potentially reducing time and resource burdens compared with traditional pre-screening. Compared with prior work largely based on crowdsourced data, this model was developed and tested with clinically acquired recordings across COVID-19 positive patients, healthy controls, and those with other respiratory diseases, enhancing real-world relevance. The merged-model strategy outperformed single-model approaches reported elsewhere. Although validation accuracy and PPV vary with prevalence and setting, pilot deployment showed promising PPV (70.73%) in a real clinical workflow, supporting utility for prescreening and triage ahead of molecular diagnostics. Explainability via LIME supports biological plausibility by highlighting disease-specific spectral patterns in cough sounds.
Conclusion
Swaasa, an AI platform analyzing brief cough recordings, was developed, clinically validated, and piloted as a rapid point-of-care screening tool for COVID-19. The model achieved high sensitivity in clinical validation and demonstrated strong PPV in pilot deployment, indicating practical value for triage and prioritization before confirmatory testing. The approach is cost-effective, noninvasive, and scalable, requiring only a smartphone and internet connectivity. Future work should include larger, more diverse, and multi-center cohorts to enhance generalizability, refine performance across populations and variants, and further validate robustness for global deployment.
Limitations
- Generalizability: participants were consenting adults (≥18 years); children, ventilated patients, certain asymptomatic individuals, and pregnant women were excluded, limiting applicability to these groups.
- Setting and data: recordings were collected in a single clinical setting using smartphones with noise reduction; performance may vary across environments, devices, and background-noise conditions.
- Prevalence dependence: the validation cohort had low disease prevalence, contributing to a low PPV in that phase; performance metrics (particularly PPV and NPV) will vary with prevalence.
- Further validation needed: the authors note the need for larger, more diverse populations and broader external validation to improve global applicability.