Distinguishing features of long COVID identified through immune profiling

J. Klein, J. Wood, et al.

This study explored the distinct biological characteristics of long COVID, revealing significant immune differences and identifying potential biomarkers for future research. The research team included Jon Klein and Akiko Iwasaki.
Introduction

The study addresses the question: what immunological features distinguish individuals with long COVID from recovered or uninfected controls, and can these features serve as biomarkers? Post-acute infection syndromes (PAIS) have long been recognized, yet their biology remains poorly understood. SARS-CoV-2 has caused millions of deaths and substantial morbidity; even after mild acute COVID-19, persistent somatic symptoms are common and can impair quality of life. Long COVID manifests with fatigue, post-exertional malaise, cognitive impairment, and autonomic dysfunction. Hypothesized mechanisms include persistent viral antigens, autoimmunity, dysbiosis, reactivation of latent viruses (for example EBV), and chronic inflammation leading to tissue damage. This study aims to profile immune cell populations, humoral responses, soluble mediators, and autoantibodies to identify biological signatures associated with long COVID and to evaluate their predictive utility using machine learning.

Literature Review

The authors situate their work within evidence that PAIS arises after diverse infections and that long COVID occurs in a substantial subset of individuals following SARS-CoV-2 infection. Prior studies documented immune dysregulation in acute COVID-19, risks of long-term sequelae, and proposed mechanisms for long COVID including viral persistence, autoimmunity, microbiome alterations, and herpesvirus reactivation (notably EBV). Reports have also suggested GPCR-directed autoantibodies in some long COVID cohorts and observed EBV reactivation during acute COVID-19 predicting later symptoms. The present study builds on this literature by systematically comparing immune cell phenotypes, antibody repertoires (to SARS-CoV-2 and other viruses), exoproteome autoantibodies, and circulating mediators well over a year post-infection, integrating findings with machine learning classification.

Methodology

Design: Cross-sectional, multi-cohort study (Mount Sinai–Yale long COVID, MY-LC) with 275 participants initially enrolled across five groups: (1) healthcare workers infected before vaccination (HCW); (2) healthy uninfected vaccinated controls (HC); (3) previously infected vaccinated controls without persistent symptoms (convalescent controls, CC); (4) individuals with persistent symptoms after acute infection (long COVID, LC); and (5) an external long COVID cohort (EXT-LC). After exclusions, 268 remained for analysis. Most LC and CC had mild, non-hospitalized acute COVID-19; samples were collected on average more than one year after infection. Systematic multidimensional immunophenotyping and machine learning were performed primarily on HC, CC, and LC groups, with selected external validation in EXT-LC.

Clinical and survey measures: Demographics and medical histories were extracted; symptom surveys were used to derive a long COVID propensity score (LCPS) via parsimonious logistic regression (LC vs others), with AUC assessed by bootstrap. Symptom clustering used agglomerative hierarchical clustering on binary symptoms.

Flow cytometry and cell phenotyping: Peripheral blood mononuclear cells (PBMCs) were analyzed for myeloid and lymphoid subsets, including non-conventional monocytes (CD14lowCD16high), cDC1, plasmacytoid DCs, cDC2, B cell subsets (activated CD86high HLA-DRhigh; double-negative IgD−CD27−CD24−CD38−), and T cell subsets (naive, central memory, effector memory, exhaustion markers PD-1 and TIM-3). Absolute counts and relative frequencies were assessed. T cell function was measured after PMA/ionomycin restimulation with intracellular cytokine staining for IL-2, IL-4, IL-6, IFN-γ, IL-17, TNF, and granzyme B; double-positive IL-4/IL-6 cells were quantified.

Serology for SARS-CoV-2: ELISA measured anti-S1, anti-spike (S), anti-RBD IgG (in vaccinated), and anti-N IgG (in unvaccinated LC vs historical unvaccinated previously infected controls). Linear epitope profiling used protein-based immunome-wide association study (PIWAS) and peptide IgG binding analysis along spike to map enriched motifs and regions; structural mapping used PDB 6VXX.

Hormones and soluble mediators: Multiplex plasma assays quantified cortisol, ACTH (MY-LC only), complement C4b, chemokines (CCL19, CCL20, CCL4), galectin-1, APRIL, LH, IL-5, and others. Sample collection times were recorded and included in models.

Autoantibodies: Rapid extracellular antigen profiling (REAP) assayed IgG reactivity against >6,000 extracellular/secreted human proteins. Reactivity counts per individual, GPCR-directed autoantibodies, and category-based aggregations (Gene Ontology curated lists) were compared.

Antibodies to other viruses: REAP assessed 225 viral surface proteins; SERA (random bacterial display, validated epitope panels for 45 pathogens) measured epitope-level responses and seropositivity, including EBV, HSV-1, VZV. EBV-specific analyses included serostatus restriction, IgM assessment, and viraemia checks; linear motif PVXF[ND]K on EBV gp42 was mapped to structure (PDB 5T1D). Correlations with T cell phenotypes were tested.

Statistics and modeling: Group differences used nonparametric tests (Kruskal–Wallis with multiple-testing corrections, Dunn’s tests, Wilcoxon rank-sum with Benjamini–Hochberg). Linear models accounted for confounders (age, sex, BMI, vaccination at blood draw, sample time, cohort) to test associations with LC status. PERMANOVA assessed multivariate differences in immune populations. Machine learning included PCA, k-NN classification, principal component regression, and LASSO feature selection. A Gale–Shapley matching algorithm matched each LC participant to a control based on age, sex, days since onset, and vaccination status to reduce confounding. Model performance was reported via AUC, pseudo-R², and external validation (EXT-LC).
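
The Gale–Shapley matching step mentioned above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Participant` fields, the `distance` weights (10 years of age, 100 days since onset, and a flat penalty of 1 for a sex or vaccination mismatch), and all example values are assumptions made purely for illustration. The sketch only shows the general idea of cases "proposing" to their closest controls until a stable one-to-one pairing is reached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    pid: str
    age: float
    sex: str
    days_since_onset: float
    vaccinated: bool


def distance(a, b):
    """Hypothetical dissimilarity (assumed weights); smaller means a closer match."""
    return (
        abs(a.age - b.age) / 10.0
        + abs(a.days_since_onset - b.days_since_onset) / 100.0
        + (0.0 if a.sex == b.sex else 1.0)
        + (0.0 if a.vaccinated == b.vaccinated else 1.0)
    )


def gale_shapley_match(cases, controls):
    """Stable one-to-one matching: cases propose, controls keep the closest proposer.

    Requires at least as many controls as cases so that every case is matched.
    Returns a dict mapping case pid -> control pid.
    """
    if len(controls) < len(cases):
        raise ValueError("need at least as many controls as cases")
    by_pid = {c.pid: c for c in cases}
    # Each case ranks all controls from closest to farthest.
    prefs = {c.pid: sorted(controls, key=lambda k: distance(c, k)) for c in cases}
    next_choice = {c.pid: 0 for c in cases}  # index of the next control to propose to
    held = {}                                # control pid -> case pid currently held
    free = [c.pid for c in cases]
    while free:
        case_id = free.pop()
        control = prefs[case_id][next_choice[case_id]]
        next_choice[case_id] += 1
        rival_id = held.get(control.pid)
        if rival_id is None:
            held[control.pid] = case_id      # control was unmatched: accept
        elif distance(by_pid[case_id], control) < distance(by_pid[rival_id], control):
            held[control.pid] = case_id      # new proposer is closer: swap
            free.append(rival_id)
        else:
            free.append(case_id)             # rejected: try the next control later
    return {case_id: control_id for control_id, case_id in held.items()}


# Invented example participants (not study data).
cases = [
    Participant("LC1", 45, "F", 400, True),
    Participant("LC2", 30, "M", 420, False),
]
controls = [
    Participant("CC1", 44, "F", 410, True),
    Participant("CC2", 31, "M", 430, False),
    Participant("CC3", 60, "F", 200, True),
]
matches = gale_shapley_match(cases, controls)
print(matches)
```

In this toy example each case ends up paired with the control closest in age, sex, time since onset, and vaccination status, and the unmatched control is simply left out. How the study weighted or ranked these covariates is not stated in this summary, so the distance function here should be read as a placeholder.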

Key Findings

Cohort and clinical: After exclusions, 268 participants were analyzed. LC participants had higher symptom burden and reduced quality of life. The LC propensity score (LCPS) showed strong diagnostic performance (AUC 0.95; 95% CI 0.91–0.98). Common LC symptoms included fatigue (87%), brain fog (78%), memory difficulty (62%), and confusion (55%); POTS was prevalent (38% received diagnostic evaluation); employment was negatively affected in about half.

Immune cell differences: LC had significantly higher circulating non-conventional monocytes (CD14lowCD16high) with elevated HLA-DR expression; linear models confirmed LC status association. cDC1 levels were significantly lower in LC; age and LC status associated with cDC1 levels. Activated B cells (CD86high HLA-DRhigh) and double-negative B cells (IgD−CD27−CD24−CD38−) were increased in LC (both percentage and absolute counts). CD4+ central memory T cells were reduced in LC (median 27% LC vs 33% CC and 32% HC); absolute CD4+ counts and exhausted CD4+ T cell counts were increased. Naive CD4+ and CD8+ T cells did not differ.

T cell function: Upon PMA/ionomycin stimulation, LC showed higher intracellular cytokines: CD4+ IL-2 (17% LC vs 14% CC vs 13% HC) and IL-4 (11% vs 7% vs 8%); CD8+ IL-2 (4% vs 2% vs 2%) and IL-6 (1.2% vs 0.6% vs 0.6%). IL-4/IL-6 double-positive T cells were uniquely elevated (CD4+ 0.3% LC vs 0.2% CC/HC; CD8+ 0.5% LC vs 0.2% CC/HC). IFN-γ, IL-17 (CD4+) and TNF, granzyme B (CD8+) did not differ significantly. PERMANOVA indicated LC status and age significantly predicted immune cell population differences.

SARS-CoV-2 humoral responses: In vaccinated participants, anti-S1 IgG was significantly higher in LC; total anti-S and anti-RBD IgG were elevated but not significantly different vs CC. Unvaccinated LC had higher anti-N IgG vs historical unvaccinated previously infected controls. Linear models adjusting for demographics and vaccination confirmed LC state positively predicted anti-spike humoral responses. Peptide analyses showed LC-enriched binding to spike residues 556–572 (1.3×), 572–586, 625–638, and 682–690 (furin cleavage site); CC had higher responses to S2 peptides 1149–1161 (1.5×) and 1256–1266 (2.1×). Motifs enriched in LC included KFLPFQQ (P=0.023), RDPQTLE (P=0.00058), and LDK[WY]F (P=0.0034); prevalence of KFLPFQQ, RDPQTLE, LDK[WY]F, and DISGI reactivities was higher in LC. Adjusted models associated LC with KFLPFQQ, RDPQTLE, and DISGI reactivities; LDK[WY]F was elevated in both CC and LC.

Hormones and soluble mediators: Groups differed in plasma mediators: cortisol (P<0.0001), C4b (P=0.0001), CCL19 (P=0.00058), galectin-1 (P=0.0015), CCL20 (P=0.0032), CCL4 (P=0.0092), APRIL (P=0.013), LH (P=0.022), and IL-5 (decreased; P=0.024). LC had higher C4b, CCL19, CCL20, galectin-1, CCL4, APRIL, LH, and lower IL-5. Cortisol strongly correlated with LCPS and was significantly lower in LC in both MY-LC and EXT-LC; ACTH did not differ. After adjusting for age, sex, BMI, sample time, and cohort, LC status remained significantly associated with lower cortisol.

Autoantibodies (exoproteome): REAP revealed diverse private autoantibodies but no increase in the number of autoantibody reactivities per participant in LC vs controls, no correlation with LC clusters or double-negative B cells, and no category-level or GPCR autoantibody enrichment. No individual autoantibody reactivity was more frequent in LC or controls.

Antibodies to other viruses: LC had elevated REAP scores to EBV antigens (gp23, gp42) and VZV gE; initially lower HSV-1 gL/gD reactivity reflected lower HSV-1 seroprevalence rather than reduced titers among seropositives. SERA showed no difference in EBV seroprevalence and no EBV IgM elevation or viraemia, supporting recent EBV reactivation rather than acute infection. Among EBV-seropositive individuals, LC had higher reactivity to EBV p23 (P=0.00095) and gp42 (P=0.0039), validated by ELISA (R=0.73). The EBV gp42 motif PVXF[ND]K was enriched in LC (P=0.0031) and maps to a surface-exposed site. In LC, gp42 motif reactivity correlated with IL-4/IL-6 double-positive CD4+ T cells (R=0.26; P=0.013), and EBV p23 reactivity correlated with CD4+ TEMRA cells (R=0.26; P=0.018).

Machine learning and biomarkers: After matching LC and controls, PCA separated groups; k-NN classification AUC was 0.94 (95% CI 0.84–1.00). Principal component regression indicated flow cytometry (pseudo-R² 59%) and plasma proteomics/hormones (pseudo-R² 74%) were most informative; a LASSO model achieved pseudo-R² 82%. Features positively associated with LC included serum galectin-1 and EBV IgG epitopes; negatively associated features included serum cortisol, PD-1+ CD4+ central memory T cells, and cDC1 cells. External validation reproduced decreased cortisol; galectin-1 and EBV gp42 predicted LC only in MY-LC. Serum cortisol alone achieved AUC 0.96 (95% CI 0.93–0.99).
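
To make the AUC figures and their confidence intervals above more concrete, here is a minimal Python sketch of how an AUC for a single marker and a percentile bootstrap interval might be computed. The cortisol values are invented for illustration and are not study data. The resampling scheme (each group resampled separately, percentile interval) is an assumption, since this summary does not specify the authors' exact bootstrap procedure.

```python
import random


def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half. This pairwise definition equals the area under the ROC curve.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling each group with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        pos = rng.choices(scores_pos, k=len(scores_pos))
        neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc(pos, neg))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Invented cortisol values in arbitrary units (NOT study data).
lc_cortisol = [6.1, 7.4, 5.2, 8.0, 6.8, 9.5]
control_cortisol = [10.2, 9.1, 12.5, 11.0, 8.7, 13.3]

# Lower cortisol is the LC-like direction, so negate the values:
# a higher score then means "more LC-like".
lc_scores = [-x for x in lc_cortisol]
control_scores = [-x for x in control_cortisol]

point = auc(lc_scores, control_scores)
lo, hi = bootstrap_auc_ci(lc_scores, control_scores)
print(f"AUC = {point:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

With only six values per group the interval in this toy example is wide, which echoes the sample-size caveat for machine learning analyses: a high point estimate is more convincing when the bootstrap interval around it is narrow.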

Discussion

The study demonstrates that individuals with long COVID exhibit persistent and distinct immunological alterations over a year post-infection. Increased non-conventional monocytes, activated and double-negative B cells, and skewed T cell cytokine responses (increased IL-2/IL-4/IL-6 production and increased IL-4/IL-6 double-positive T cells) point to ongoing immune activation and a Th2/IL-6-influenced milieu. Decreased cDC1 and central memory CD4+ T cells suggest altered antigen presentation and T cell memory dynamics. Elevated SARS-CoV-2-specific antibody levels and enrichment for spike epitopes in LC are consistent with persistent antigen exposure or immune stimulation. Elevated IgG responses to EBV (gp23, gp42) and VZV, together with absent EBV IgM and lack of viraemia, support herpesvirus reactivation as a common feature in LC, potentially linked to the observed T cell cytokine profiles. In contrast, exoproteome autoantibodies were not globally increased, arguing against a predominant autoantibody-driven mechanism in this cohort. The most striking soluble mediator finding—lower cortisol in LC across two cohorts independent of ACTH—implicates hypothalamic-pituitary-adrenal axis dysregulation or altered cortisol metabolism as a contributor to symptoms, aligning with reports of hypocortisolism after other coronavirus infections. Integrative machine learning confirmed that a combination of immune cell phenotypes, viral antibody responses, and soluble mediators, particularly cortisol, reliably distinguishes LC from controls and offers candidate biomarkers. Together, the results support a multifactorial pathobiology involving persistent antigens, latent herpesvirus reactivation, and chronic inflammation rather than broad autoantibody pathology.

Conclusion

This study identifies robust immunological features that distinguish long COVID from recovered and uninfected controls: elevated non-conventional monocytes, reduced cDC1 and CD4+ central memory T cells, heightened IL-4/IL-6-producing T cells, exaggerated SARS-CoV-2 humoral responses with specific spike epitope targeting, increased antibodies to EBV and VZV antigens, and notably reduced serum cortisol. Machine learning models integrating these data accurately classify LC, with cortisol emerging as a particularly strong predictor. These findings provide direction for biomarker development and hypotheses for LC pathogenesis, including roles for persistent viral antigens, herpesvirus reactivation, and chronic inflammation. Future research should longitudinally validate biomarkers, investigate tissue-resident immune responses and viral persistence, elucidate mechanisms underlying cortisol dysregulation, and assess therapeutic strategies targeting herpesvirus reactivation, inflammation, and HPA axis function.

Limitations

Key limitations include: (1) convenience sampling and differing recruitment strategies between cases and controls; (2) sample size smaller than typical for robust machine learning training; (3) focus on peripheral blood immune factors without analysis of tissue-specific immune responses; (4) autoantibody profiling limited to the exoproteome, not assessing intracellular or non-protein antigens; and (5) cross-sectional design precluding causal inference and temporal dynamics of immune changes. Differences between internal and external LC cohorts may also affect generalizability of specific predictive features beyond cortisol.
