Discovery and systematic assessment of early biomarkers that predict progression to severe COVID-19 disease

K. Hufnagel, A. Fathi, et al.

This study by Katrin Hufnagel and colleagues identifies plasma protein biomarkers that may predict, early in the disease course, which COVID-19 patients will progress to severe illness. Using antibody microarrays and machine learning, the authors derived multi-marker panels that could support timely patient intervention.

Introduction
The study addresses the need to predict early which COVID-19 patients will progress to severe or critical illness. Despite known risk factors such as age, obesity, and comorbidities, outcomes vary widely and mechanisms leading to adverse outcomes are not fully understood. Prior work typically evaluated single or few biomarkers with limited clinical accuracy. The research objective was to discover and systematically assess plasma protein biomarkers measurable early after symptom onset that predict progression to severe/critical COVID-19, enabling early risk stratification and intervention.
Literature Review
The authors note multiple reports of altered cytokines, chemokines, and other proteins in severe COVID-19 and their proposed use as prognostic markers or therapeutic targets. However, prior studies often focused on single biomarkers with insufficient accuracy for clinical use. The paper contextualizes findings with literature on S100A8/A9 (calprotectin) correlating with severity and mortality, FGF2 association with severe disease and lung upregulation, AREG expression in monocyte subsets linked to severe disease, IL-2 elevation with high viral load and hyperinflammatory states, and the IGF1/IGF1R pathway in immune regulation and ARDS. It also references OX2G as an immune checkpoint with lower abundance in severe disease and prior reports on CSF1, ALCAM, and T cell markers (e.g., CD28) in severe COVID-19.
Methodology
Study design: Whole blood was collected from patients with RT-PCR-confirmed SARS-CoV-2 infection at two centers, with ethics approval (Hamburg PV7298; Heidelberg 1-548/2020) and informed consent. Cohort 1 (UKE Hamburg) comprised 53 plasma samples from 16 patients, grouped by days after symptom onset into acute (≤9 days), intermediate (10–21 days), and late (>21 days) phases. Cohort 2 (Heidelberg) comprised 94 acute-phase plasma samples (≤10 days); at sampling, patients were neither hospitalized nor in intensive care and had received no COVID-19-specific medication. They were later grouped by outcome into 47 severe/critical (CS) and 47 mild/moderate (MM) cases, matched by age and sex. Disease severity followed WHO/IDSA/RKI definitions.
Assay platform: Antibody microarrays (Sciomics). Cohort 1: 53 sicO2 arrays targeting 51 proteins via 517 antibodies. Cohort 2: arrays targeting 89 proteins via 1425 antibodies; all proteins assayed in cohort 1 were also included in cohort 2. The design was dual-color and reference-based, with a common pooled reference per cohort. Array surfaces were blocked (sicBloC), followed by competitive incubation, washes with PBST/PBS, and drying with nitrogen.
Data acquisition and preprocessing: Arrays were scanned on a Tecan Powerscanner and segmented with GenePix Pro 6.0. Median intensities were normalized using an invariant Loess method. Linear models for microarray data (limma v3.42.2) incorporated age (≥60 vs <60), sex, and comorbidity (≥1 vs none) as covariates. For cohort 1, the main factor combined severity and phase; for cohort 2, it was severity only (all samples acute). Group differences are reported as log2 fold-change (logFC) estimates, tested with empirical Bayes moderated two-sided t-tests and adjusted for multiple testing by the Benjamini–Hochberg FDR procedure. A protein was called differentially abundant when |logFC| > 0.5 and adjusted p < 0.05.
Machine learning: Preselected markers comprised those that were discriminative by logFC and adjusted p-value in both cohorts, plus the top 10 biomarkers ranked by preliminary linear SVM coefficients, yielding 21 antibodies against 20 proteins.
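The differential-abundance rule stated in the statistical analysis (|logFC| > 0.5 and Benjamini–Hochberg adjusted p < 0.05) can be sketched in a few lines. This is a minimal illustration only: it does not reproduce limma's moderated t-tests, the function names are hypothetical, and the input values are synthetic rather than taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_abundant(results, lfc_cut=0.5, alpha=0.05):
    """results: list of (name, logFC, raw_p). Return names passing both cuts."""
    adj = benjamini_hochberg([p for _, _, p in results])
    return [name for (name, lfc, _), q in zip(results, adj)
            if abs(lfc) > lfc_cut and q < alpha]


# Synthetic illustration only; these are not values from the study.
toy = [("marker_A", 1.2, 0.0004), ("marker_B", -0.8, 0.011),
       ("marker_C", 0.3, 0.002), ("marker_D", 0.9, 0.20)]
hits = differentially_abundant(toy)
print(hits)
```

In this toy input, marker_C fails the fold-change cut despite a small p-value, and marker_D fails the adjusted p-value cut despite a large fold change, which is why both conditions are applied together.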
All 2-, 3-, and 4-marker combinations were evaluated with a linear SVM (scikit-learn SVC, C=1) using leave-one-out (LOO) cross-validated ROC AUC, with class probabilities as the input to the ROC computation. Robustness across LOO folds was assessed by the coefficient of variation (CV) of the linear coefficients (generally 2–6%; 12.6% for S100A8/A9+CRP).
Validation: CRP was measured in the clinical laboratory (Siemens Advia Chemistry XPT). S100A8/A9 was measured by ELISA (R&D Systems), with plasma diluted 1:100 in PBS + 1% BSA. Correlations between array and ELISA readouts were Pearson r=0.905 (S100A8/A9) and r=0.955 (CRP). For ELISA S100A8/A9, a cutoff of 1.5 µg/mL yielded 83% specificity and 89% sensitivity.
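To give a sense of the panel search described above, the sketch below counts the 2-, 3-, and 4-marker panels that can be formed from 21 candidates and runs a leave-one-out loop over a few toy panels. Several things here are assumptions: that panels were formed over the 21 antibodies rather than the 20 proteins, the helper names, and the synthetic data. The scorer is a nearest-centroid stand-in chosen to keep the example dependency-free; it is not the linear SVM (scikit-learn SVC, C=1) that the study used.

```python
from itertools import combinations
from math import comb


def centroid_score(train_x, train_y, x):
    """Stand-in scorer (NOT an SVM): squared distance to the class-0 centroid
    minus squared distance to the class-1 centroid. Higher means more class 1."""
    def centroid(label):
        pts = [r for r, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(pts) for col in zip(*pts)]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return sqdist(x, centroid(0)) - sqdist(x, centroid(1))


def loo_scores(rows, labels, scorer=centroid_score):
    """Leave-one-out: score each sample with a model built from all the others."""
    scores = []
    for i in range(len(rows)):
        train_x = rows[:i] + rows[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        scores.append(scorer(train_x, train_y, rows[i]))
    return scores


def panel_rows(data, panel):
    """data maps marker name -> per-sample values; return one row per sample."""
    return [list(vals) for vals in zip(*(data[m] for m in panel))]


# Size of the search space, assuming all 21 preselected antibodies are combined.
n_candidates = 21
panel_counts = {k: comb(n_candidates, k) for k in (2, 3, 4)}
total_panels = sum(panel_counts.values())

# Synthetic toy data (three hypothetical markers, six samples), not study data.
toy = {"m1": [0.1, 0.3, 0.2, 1.1, 0.9, 1.3],
       "m2": [1.0, 0.8, 1.2, 0.2, 0.1, 0.3],
       "m3": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4]}
labels = [0, 0, 0, 1, 1, 1]
results = {panel: loo_scores(panel_rows(toy, panel), labels)
           for panel in combinations(sorted(toy), 2)}
print(panel_counts, total_panels)
```

Under the stated assumption this gives 210 pairs, 1,330 triples, and 5,985 quadruples, or 7,525 panels in total. Each held-out score comes from a model that never saw that sample, and the resulting per-sample scores are what a ROC analysis would then consume.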
Key Findings
- Discovery cohorts: Cohort 1 (n=53 samples, 16 patients) identified 58 proteins with altered abundance between future CS and MM cases across phases. Cohort 2 (n=94 acute-phase samples; 47 CS vs 47 MM) identified 51 differentially abundant proteins (46 higher in CS; 5 higher in MM). Proteins higher in CS included CRP, S100A8/A9, FGF2, and SLAF1; FINC (fibronectin), TSP1, MMP12, IL5, and S100A were less abundant in CS.
- Eleven biomarkers were consistently associated with severity across both cohorts: S100A8/A9 (calprotectin), FGF2, SLAF1, CD47, CXCR5, IL32, IL13R2, CD81, AREG, TNFR16, and IL2.
- Individual marker performance (array data, cohort 2): CRP AUC 0.837 (59.6% specificity at 90% sensitivity); S100A8/A9 AUC 0.827 (48.9% specificity at 90% sensitivity). ELISA slightly outperformed arrays for these markers (S100A8/A9 AUC 0.886; CRP AUC 0.866) and correlated strongly with array readouts (r=0.905 and r=0.955, respectively).
- Multi-marker panels improved accuracy:
  - S100A8/A9 + TSP1: AUC 0.872 (array).
  - S100A8/A9 + TSP1 + ERBB2: AUC 0.898.
  - S100A8/A9 + TSP1 + IFNλ (IFN1L): AUC 0.913.
  - Four-marker panel S100A8/A9 + FINC + IFN1L + TSP1: AUC 0.928. At 90% sensitivity, specificities for the highlighted four-marker panels were approximately 78.7%, 80.9%, 78.7%, and 83.0%.
- Directionality for key contributors: FINC and TSP1 were higher in MM than in CS (negative acute-phase behavior), which aided discrimination; S100A8/A9 and CRP were higher in CS.
- Early-phase sampling (≤10 days from symptom onset) allowed prediction before severe symptoms or hospitalization.
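The two performance measures quoted above, ROC AUC and specificity at 90% sensitivity, can be computed from per-sample scores as in the following sketch. The function names are hypothetical, the scores are synthetic, and the thresholding convention (a sample is called positive when its score is at or above the threshold) is an assumption, since the source does not state one.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(scores, labels, target=0.90):
    """Highest specificity among thresholds whose sensitivity is >= target.
    A sample is called positive when score >= threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        sens = sum(s >= t for s in pos) / len(pos)
        spec = sum(s < t for s in neg) / len(neg)
        if sens >= target:
            best = max(best, spec)
    return best


# Synthetic scores for 10 positive (CS-like) and 10 negative (MM-like) samples.
pos_scores = [0.9, 0.8, 0.85, 0.7, 0.75, 0.6, 0.65, 0.95, 0.55, 0.2]
neg_scores = [0.1, 0.15, 0.3, 0.35, 0.4, 0.25, 0.5, 0.45, 0.58, 0.62]
scores = pos_scores + neg_scores
labels = [1] * 10 + [0] * 10

auc = roc_auc(scores, labels)
spec90 = specificity_at_sensitivity(scores, labels)
print(round(auc, 3), round(spec90, 3))
```

With 47 samples per group, specificity can only move in steps of 1/47. The reported figures are consistent with whole-sample counts under that assumption, for example 28/47 ≈ 59.6% and 39/47 ≈ 83.0%.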
Discussion
The study demonstrates that plasma protein profiles measured early in SARS-CoV-2 infection can predict progression to severe/critical disease. Multiplex antibody microarrays coupled with linear modeling and machine learning identified robust biomarker combinations that outperform single markers, addressing the limitations of prior single-biomarker approaches. The findings are biologically plausible: elevated S100A8/A9 reflects neutrophil-driven inflammation; FGF2 and AREG elevations align with severe disease pathophysiology; IL-2 increases are consistent with hyperinflammatory states; and IGF1/IGF1R pathway involvement has been implicated in ARDS. Conversely, lower levels of FINC and TSP1 in severe disease mirror negative acute-phase behavior and previous associations with poor outcomes in other conditions. Importantly, results translated well to routine immunoassay platforms (ELISA), supporting clinical feasibility. Implementing these biomarker panels could enable early risk stratification to guide timely therapeutics (e.g., antivirals, neutralizing antibodies) and optimize resource allocation in clinical and public health settings.
Conclusion
This work identifies and systematically validates early plasma protein biomarkers predictive of severe COVID-19. Eleven markers were consistent across two cohorts, and machine learning-derived panels, particularly a four-protein panel comprising S100A8/A9, FINC, IFNλ, and TSP1, achieved high accuracy (AUC up to 0.928) and outperformed single markers. Orthogonal validation by ELISA and clinical laboratory measurement supports assay transferability and potential clinical implementation. Future research should prospectively validate these panels in larger, diverse populations and evaluate their integration into clinical decision pathways, including testing across viral variants and assessing the impact of early targeted interventions on outcomes.
Limitations
- Cohort 1 had a small sample size (53 samples from 16 patients) with only five severe/critical cases, limiting power and potentially contributing to variability.
- Differences in cohort selection and protein panels between cohorts (e.g., FINC and TSP1 were not measured in cohort 1) constrain cross-cohort comparability for some markers.
- Minor performance differences between array-based and ELISA measurements indicate potential platform-specific effects, although correlations were high and findings were transferable.