Pathobiological signatures of dysbiotic lung injury in pediatric patients undergoing stem cell transplantation

M. S. Zinter, C. C. Dvorak, et al.

This study by Matt S. Zinter and colleagues examines lung injury after hematopoietic cell transplantation (HCT). Analyzing bronchoalveolar lavage (BAL) fluid from 229 pediatric patients, it identifies microbial and host-response patterns, including commensal depletion and viral and fungal enrichment, that are associated with higher mortality. The findings point toward more personalized diagnostics and treatment strategies for patients undergoing HCT.

Introduction
Pulmonary injury is a frequent and often fatal complication after pediatric hematopoietic cell transplantation (HCT), arising from infection, alloreactive inflammation and cytotoxic treatment effects. Despite clinical use of bronchoalveolar lavage (BAL) to investigate pathogens, many infections remain undetected and the interplay between the lung microbiome, host immunity and epithelium in post-HCT lung disease is poorly understood. The study aims to define pathobiological signatures of dysbiotic lung injury by integrating metatranscriptomic profiling of BAL microbiota with host lung gene expression to identify clinically relevant subtypes associated with outcomes and to improve diagnostics and potential therapeutic strategies.
Literature Review
Prior work demonstrates that the lower airways harbor a dynamic microbiome with roles in health and disease. Conventional diagnostics often miss pathogens in immunocompromised hosts due to prior antimicrobials and limited test panels. The authors previously showed in pre-HCT pediatric cohorts that pulmonary microbial depletion and pathogen enrichment predicted poor lung function and fatal post-HCT lung disease. Additional literature links higher pulmonary microbial mass to neutrophilic inflammation (e.g., cystic fibrosis, COPD) and intestinal microbiome dysbiosis after HCT to worse outcomes, including GVHD and pulmonary disease. Commensal organisms may constrain pathogen expansion through immune modulation and nutrient competition. Antibiotic exposure is known to deplete commensals, alter pulmonary microbiota, and potentially permit fungal and viral expansion.
Methodology
Design and cohorts: Prospective multicenter study of pediatric HCT recipients across 32 children's hospitals in the US, Canada, and Australia (derivation cohort; 2014–2022). A geographically distinct validation cohort of pediatric HCT recipients from Utrecht, the Netherlands (2005–2016) was used for external validation. Inclusion: history of allogeneic (both cohorts) or autologous (derivation only) HCT undergoing clinically indicated BAL for pulmonary disease evaluation. Exclusion: limitations of care at time of BAL. Ethics: IRB approvals at UCSF (14-13546, 16-18908) and Utrecht (05/143, 11/063); informed consent obtained.
BAL collection and processing: BAL performed per local protocols by pediatric pulmonologists; 3–6×10 ml saline instilled into radiographically diseased lung areas; clinical aliquots processed locally; research aliquots frozen and shipped to UCSF. RNA extraction from 200 µl BAL using bead-beating and column purification with DNase; ERCC spike-in (25 pg) added as internal control.
Sequencing: Metatranscriptomic libraries prepared via miniaturized Ultra II RNA protocol; 19 PCR cycles; size-selected (300–700 nt); sequenced on Illumina NovaSeq 6000 (paired-end 125 nt; ~40M read pairs/sample). Negative controls (water) and HeLa controls included per batch to characterize contamination.
Bioinformatics – human reads: Alignment to hg38 (STAR); non-coding/mitochondrial/ribosomal transcripts excluded; 19,032 protein-coding genes retained. Batch effects assessed via PCA.
Bioinformatics – microbial taxonomy and function: Human-subtracted files processed using CZID v7.1 with multiple rounds of human read subtraction (STAR, Bowtie2, GSNAP) and quality/complexity filtering; alignment to NCBI nt/nr; stringent filters on read counts, alignment length and percent identity (≥95% bacteria, ≥90% eukaryotes, ≥80% viruses). Low-biomass samples (<100 pg) required contigs ≥250 nt with ≥10 transcripts and ≥80% identity. High-quality microbial reads averaged 1.6% of all reads.
Absolute quantification and contamination adjustment: ERCC spike-in used to back-calculate total sample RNA mass and convert microbial reads to absolute mass (pg). Samples <10 pg total input discarded. Batch-specific contamination profiles derived from water/HeLa controls; mean + 2 SD of contaminant taxa subtracted from patient samples. ERCC-transformed, contamination-adjusted masses used downstream.
Microbial functional profiling and AMR: FMAP v0.15/DIAMOND quantified UniRef90 proteins and KEGG pathways (levels 1–3). CZID AMR Gene Pipeline (RGI/CARD/WILDCARD) identified AMR transcripts; filtered for coverage and contaminant AMR hits; AMR normalized to ERCC and to bacterial mass.
Unsupervised clustering: Multi-Omics Factor Analysis (MOFA) to reduce dimensionality across microbial phyla/genera/species/KEGG pathways present in >15% of samples, plus total microbial mass, richness, and alpha diversity (Shannon, Simpson). Variance-stabilizing transformation applied; subsequent UMAP and hierarchical clustering (Euclidean) determined 4 optimal clusters (silhouette/elbow/gap statistics).
Clinical data and statistics: Demographics, transplant features, immune status (ANC/ALC), antimicrobials in prior 7 days, and outcomes collected. Antibacterial Exposure Score (AES) computed as the sum, over the prior 7 days, of exposure days × agent-specific broadness weight (range ~4–49.75). Associations tested with Kruskal–Wallis/Dunn’s, chi-squared, Wilcoxon, edgeR negative binomial GLMs, Spearman correlation, Cox regression, and mediation analysis (linear structural equation modeling with Poisson/logistic models, 500 bootstraps). Multiple testing controlled by FDR.
Pathogen calling: Adapted CZID pathogen list for respiratory/immunocompromised context. Viruses: any detection above background considered present. Bacteria: thresholds required mass ≥10 pg, dominance ≥20%, and cohort z-score ≥+2. Fungi: mass ≥10 pg and z-score ≥+2 (no dominance criterion). Parasites assessed via metagenomics only.
Host analyses: Differential gene expression (edgeR; genes present in >25% samples; FDR ≤0.05). Gene set variation analysis for REACTOME pathways. Cell-type deconvolution and imputed cell-type-specific expression via CIBERSORTx using a lung single-cell atlas reference; T/B cell receptor repertoire inference via ImReP.
Classifier and external validation: Random forest (10,000 trees; randomForestSRC) trained on taxonomic and host gene expression data with 1.5× weighting for clusters 3–4; performance via out-of-bag AUC and confusion matrix; variable importance by permutation VIMP. Classifier applied to Utrecht cohort; survival stratified by predicted clusters.
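The absolute quantification and contamination adjustment steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes that read counts scale linearly with RNA mass, that the sample standard deviation is used for the "mean + 2 SD" contaminant ceiling, and that adjusted masses are floored at zero. The function names are hypothetical; only the 25 pg spike-in and the mean + 2 SD rule come from the summary.

```python
from statistics import mean, stdev

ERCC_SPIKE_IN_PG = 25.0  # spike-in mass stated in the Methodology


def reads_to_mass_pg(reads, ercc_reads, spike_in_pg=ERCC_SPIKE_IN_PG):
    """Convert a read count to picograms of RNA.

    Assumption: reads are proportional to input mass, so the known
    spike-in mass calibrates a pg-per-read factor for the sample.
    """
    if ercc_reads <= 0:
        raise ValueError("ERCC reads must be positive to calibrate mass")
    return reads * spike_in_pg / ercc_reads


def contaminant_ceiling(control_masses_pg):
    """Mean + 2 SD of one taxon's mass across a batch's water/HeLa controls."""
    if not control_masses_pg:
        return 0.0
    if len(control_masses_pg) == 1:
        return control_masses_pg[0]  # SD is undefined for a single control
    return mean(control_masses_pg) + 2 * stdev(control_masses_pg)


def adjust_for_contamination(sample_mass_pg, control_masses_pg):
    """Subtract the contaminant ceiling; flooring at zero is an assumption."""
    return max(0.0, sample_mass_pg - contaminant_ceiling(control_masses_pg))


# Illustrative numbers only (not study data).
taxon_mass = reads_to_mass_pg(reads=4000, ercc_reads=10000)
adjusted = adjust_for_contamination(taxon_mass, [1.0, 2.0, 3.0])
print(taxon_mass, adjusted)
```

The same conversion applied to all non-ERCC reads gives the total sample RNA mass, which is what the <10 pg exclusion rule would be checked against.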
Key Findings
Cohort and outcomes: 229 pediatric HCT recipients contributed 278 clinically indicated BALs across 32 hospitals (2014–2022). Symptoms occurred median 93 days post-HCT (IQR 23–278); BAL at median 112 days (IQR 36–329). Lymphopenia was prevalent (median ALC 420/µL, IQR 156–1,035). A substantial proportion required intensive care and prolonged ventilation; in-hospital mortality was high.
Four BAL clusters: Unsupervised clustering of BAL microbial features defined four clusters with distinct biology, host responses and outcomes. Clusters 3 and 4 were associated with greater pre-BAL respiratory support (P=0.004), renal injury and GVHD (P=0.001 and P=0.019), more ICU care and prolonged ventilation (P=0.001) and significantly higher in-hospital mortality compared to clusters 1–2 (log-rank P=0.005). Among patients on respiratory support at BAL, mortality ranged 22–30% (clusters 1–2) vs 50–60% (clusters 3–4; log-rank P=0.007). The association remained significant in multivariable Cox regression adjusting for age, sex, ANC, ALC and GVHD (P=0.023).
Microbiome taxonomy and diversity: Cluster 1 showed moderate mass/richness, high diversity, low viral burden; cluster 2 had high bacterial mass and richness with moderate diversity; cluster 3 showed depletion of oropharyngeal commensals and enrichment of RNA viruses and Ascomycota; cluster 4 exhibited profound commensal depletion with enrichment of Staphylococcus and RNA viruses (Pisuviricota). Nonsurvivors had depletion of commensals, higher viral and fungal RNA, reduced richness (P=0.025) and lower Simpson diversity (P=0.006), consistent with clusters 3–4.
Microbial function and AMR: KEGG pathway transcription mirrored taxonomy. Cluster 2 had higher carbohydrate/lipid/amino acid metabolism and glycan biosynthesis (e.g., peptidoglycan, LPS). Clusters 3–4 had markedly reduced microbial metabolic activity. Absolute AMR transcript mass was highest in cluster 2 and lowest in cluster 4, but AMR normalized to bacterial mass was highest in cluster 4 (Extended Data Fig. 2).
Pathogen detection: Clinical testing identified 173 pathogens in 116/278 samples (41.7%). Metagenomic sequencing (conservative thresholds) identified 360 pathogens in 196/278 samples (70.5%). Combined, 429 pathogens were found in 209/278 samples (75.2%), reclassifying 90 cases of idiopathic pneumonia syndrome as lower respiratory tract infection. In nonsurvivors, clinical testing detected pathogens in 49% (22/45) vs 80% (36/45) by metagenomics (P=0.002).
Viral detection: Community respiratory viruses (CRVs) were detected clinically in 18% vs 28% by sequencing; herpesviruses in 13% vs 16% by sequencing (enrichment in clusters 3–4; Dunn’s P=0.018 and 0.021 vs cluster 1). Additional respiratory-transmitted viruses not on standard panels (e.g., BK/WU/KI polyomaviruses, bocavirus, parvovirus B19, LCMV, non-vaccine rubella) were found in 26 BALs from 23 patients and associated with 39% in-hospital mortality. Torquetenovirus was detected in 20% and enriched in clusters 2–4 (P<0.001).
Bacteria: High-prevalence taxa included S. pneumoniae (34%), M. catarrhalis (21%), H. influenzae (21%), S. aureus (16%), P. aeruginosa (14%). Applying mass/dominance/z-score thresholds identified potentially pathogenic bacteria in 76 samples, including occult taxa (e.g., Bacillus cereus, Citrobacter freundii, Chlamydia pneumoniae, Klebsiella aerogenes, Salmonella enterica, Ureaplasma parvum).
Fungi: Clinical 9% vs sequencing 30% (83/278) above thresholds; sequencing uniquely identified Pneumocystis and Cryptococcus.
Parasites: Not detected clinically; sequencing identified Toxoplasma (n=4) and Acanthamoeba (n=3) predominantly in clusters 3–4 with >50% mortality (4/7).
Antimicrobial exposure: AES varied by cluster (lowest cluster 2; highest clusters 3–4; P=0.005). Higher AES associated with reduced microbial richness (rho = -0.14, P=0.018), depletion of major bacterial phyla including oropharyngeal commensals, and enrichment of Ascomycota (FDR<0.05). AMR transcripts decreased with higher AES in absolute terms (P<0.001) but increased when normalized to bacterial mass (P<0.001). AES was higher among nonsurvivors (median 352, IQR 210–507) vs survivors (median 175, IQR 75–336; P<0.001). Mediation analysis suggested that antibiotic-induced depletion of commensals (Actinomyces, Fusobacterium, Gemella, Haemophilus, Neisseria, Rothia, Schaalia, Streptococcus) statistically mediated the AES–mortality association (P<0.001), though mediation attenuated after adjusting for oxygen support, ANC and ALC.
Host response: 18,158 DEGs across clusters. REACTOME enrichment: cluster 1 with antigen-presenting cell activation; cluster 2 with neutrophil/innate activation, bacterial processing and airway inflammation; cluster 3 with collagen deposition and fibroproliferation; cluster 4 with antiviral response and cellular injury. Between survivors and nonsurvivors (1,253 DEGs), nonsurvivors showed downregulated innate/APC signals and upregulated fibroproliferation and epithelial injury genes (e.g., COL1A1, COL3A1, CXCL5, IL13, MMP7, SFTPA1, SFTPC, TIMP3; FDR<0.05).
Cell deconvolution: Cluster 1 enriched for monocytes/macrophages; cluster 2 for neutrophils; cluster 3 for CD4+ T cells; cluster 4 for CD8+ T cells.
TCR repertoire: Cluster 4 had highest TCRα clonotypes/diversity; cluster 1 the lowest; BAL TCR metrics did not correlate with blood ALC for αβ, but γδ TCR and BCR subtypes were higher with higher ALC.
Dynamics and validation: On repeat BALs (n=34 patients; median 79 days apart), patients tended to transition away from low-risk cluster 1 to higher-risk clusters (P<0.001). A random forest cluster classifier achieved out-of-bag AUC 0.923, with host gene expression more informative than taxonomy. Applied to the Utrecht cohort (n=57), 1-year non-relapse mortality stratified by predicted clusters (9.5% cluster 1, 36% cluster 2, 52% clusters 3–4; log-rank P=0.009), supporting external validity.
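The Antibacterial Exposure Score behind these exposure findings is defined in the Methodology as exposure days multiplied by an agent-specific broadness weight, summed over the 7 days before BAL. A minimal Python sketch of that arithmetic follows. The agent names and weights are placeholders, not the study's weight table; they are simply picked from the ~4–49.75 range quoted in the Methodology, which is read here as the range of per-agent weights.

```python
def antibacterial_exposure_score(exposure_days, broadness_weights, window_days=7):
    """Sum of (days exposed x broadness weight) over the look-back window.

    exposure_days: agent name -> days of exposure within the window.
    broadness_weights: agent name -> weight (hypothetical values below).
    """
    score = 0.0
    for agent, days in exposure_days.items():
        if not 0 <= days <= window_days:
            raise ValueError(f"{agent}: days must be between 0 and {window_days}")
        if agent not in broadness_weights:
            raise KeyError(f"no broadness weight for {agent}")
        score += days * broadness_weights[agent]
    return score


# Placeholder agents and weights for illustration only.
weights = {"narrow_agent": 4.0, "broad_agent": 49.75}
exposure = {"narrow_agent": 7, "broad_agent": 3}
print(antibacterial_exposure_score(exposure, weights))
```

Under this reading, a patient on several broad-spectrum agents for the full week accumulates a score in the hundreds, which is the scale of the medians reported for survivors and nonsurvivors.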
Discussion
Integrated metatranscriptomic profiling of BAL in pediatric HCT recipients identified four biologically distinct pulmonary microenvironments that strongly associate with clinical severity and mortality. The data support a paradigm where balanced commensal biodiversity coincides with lower viral/fungal burden and favorable host immune tone, whereas depletion of commensals is accompanied by viral and fungal enrichment, staphylococcal predominance, maladaptive lymphocytic activation, epithelial injury and fibroproliferation, portending high mortality. Antibiotic exposure correlated with commensal loss, decreased richness, and expansion of Ascomycota and respiratory RNA viruses, with statistical evidence that commensal depletion mediates part of the exposure–outcome relationship. Host transcriptomics distinguished cluster-specific immune states (macrophage-dominant, neutrophilic inflammation, fibroproliferation, antiviral/injury), suggesting that precision phenotyping could guide tailored immunomodulation and antifibrotic strategies. Metagenomic pathogen detection, contextualized by absolute mass, dominance and cohort outlier status (z-score), nearly doubled diagnostic yield over standard testing, reclassifying many idiopathic cases and identifying occult or off-panel pathogens. The validated classifier and external cohort replication underscore generalizability and potential clinical utility of integrating lung microbiome and host-response profiling in post-HCT pulmonary complications.
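The contextual thresholds mentioned above (absolute mass, dominance and cohort z-score) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's implementation: dominance is taken to mean the taxon's share of total bacterial mass in the sample, the z-score is computed on untransformed masses with the population standard deviation, and all function names are hypothetical. The cut-offs (≥10 pg, ≥20%, z ≥ +2) are those given in the Methodology.

```python
from statistics import mean, pstdev

MIN_MASS_PG = 10.0    # mass threshold for bacteria and fungi
MIN_DOMINANCE = 0.20  # bacterial dominance threshold
MIN_Z = 2.0           # cohort z-score threshold


def cohort_z_score(value, cohort_values):
    """Z-score of one sample's taxon mass against the cohort (raw scale assumed)."""
    sd = pstdev(cohort_values)
    if sd == 0:
        return 0.0  # no spread in the cohort, so nothing is an outlier
    return (value - mean(cohort_values)) / sd


def is_bacterial_pathogen(mass_pg, total_bacterial_mass_pg, cohort_masses_pg):
    """Bacteria must pass the mass, dominance and z-score criteria together."""
    if total_bacterial_mass_pg <= 0:
        return False
    dominance = mass_pg / total_bacterial_mass_pg
    z = cohort_z_score(mass_pg, cohort_masses_pg)
    return mass_pg >= MIN_MASS_PG and dominance >= MIN_DOMINANCE and z >= MIN_Z


def is_fungal_pathogen(mass_pg, cohort_masses_pg):
    """Fungi use the mass and z-score criteria only (no dominance criterion)."""
    z = cohort_z_score(mass_pg, cohort_masses_pg)
    return mass_pg >= MIN_MASS_PG and z >= MIN_Z


# Illustrative cohort: the taxon is absent in nine samples, abundant in one.
cohort = [0.0] * 9 + [50.0]
print(is_bacterial_pathogen(50.0, 100.0, cohort))   # dominant outlier
print(is_bacterial_pathogen(50.0, 1000.0, cohort))  # outlier but only 5% of bacteria
```

Requiring all three criteria at once is what separates a dominant outlier organism from a commensal that is merely present, which is the distinction the high-prevalence taxa in the Key Findings depend on.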
Conclusion
This multicenter study presents the largest integrated analysis of the pulmonary microbiome and host transcriptome in pediatric HCT recipients with lung disease, defining four BAL phenotypes that stratify risk and elucidate key pathobiological axes—commensal depletion, pathogen enrichment (viruses/fungi/staphylococci), neutrophilic and lymphocytic inflammation, and fibroproliferation—linked to mortality. Metagenomic sequencing with absolute quantification and contextual thresholds markedly increases pathogen detection. Findings advocate for precision pulmonary phenotyping to inform diagnostics (rapid clinical metagenomics), antimicrobial stewardship, consideration of microbiome-restorative approaches, targeted immunomodulation by phenotype, and antifibrotic therapy trials for fibroproliferative signatures. Future work should include prospective interventional trials leveraging real-time BAL multi-omics to guide individualized therapy, mechanistic studies of microbe–host interactions in the post-HCT lung, and evaluation of microbiome-preserving or -restoring therapies.
Limitations
The observational design limits causal inference between exposures (e.g., antibiotics), microbiome changes and outcomes. Patients and centers were clinically heterogeneous: protocols for care, microbiology testing and BAL collection were not standardized, and bronchoscope controls were not obtained. Healthy pediatric controls were unavailable. Lack of histopathology prevented definitive attribution of detected organisms to disease causation. Clinical microbiology varied by site and may have missed pathogens. Some affiliations suggest contributors who are not explicitly named in the main author list, which does not affect the results. Overall, potential confounding by illness severity (e.g., greater antibiotic use in sicker patients) complicates the mediation analyses.