Mapping disease regulatory circuits at cell-type resolution from single-cell multiomics data

X. Chen, Y. Wang, et al.

MAGICAL, a framework developed by researchers including Xi Chen and Vance G. Fowler Jr., maps disease-associated regulatory circuits from single-cell multiomics data. Applied to patient samples, it revealed epigenetic biomarkers that distinguish methicillin-resistant from methicillin-susceptible *Staphylococcus aureus* infections, offering insight into sepsis.

Introduction
Gene expression is controlled by regulatory circuits formed by transcription factors, proximal and distal chromatin regulatory domains, and their three-dimensional interactions with target promoters. Disease can dysregulate these circuits in a cell-type-specific manner that is not apparent in bulk data, necessitating cell-type-resolved multiomic approaches to map regulatory changes. While single-cell RNA-seq and ATAC-seq have improved identification of differentially expressed genes and accessible chromatin sites, computational methods to infer disease-associated regulatory interactions at cell-type resolution remain limited. To address this, the authors developed MAGICAL, a hierarchical Bayesian framework that jointly models coordinated variation in chromatin accessibility and gene expression across conditions to infer disease-modulated TF–peak–gene circuits without requiring cell-level RNA/ATAC pairing. The study applies MAGICAL to COVID-19 and Staphylococcus aureus bloodstream infection, including distinguishing host responses to methicillin-resistant versus methicillin-susceptible strains, to demonstrate accurate circuit inference and diagnostic utility based on circuit genes.
Literature Review
Prior work has shown that distal enhancers can regulate gene expression through long-range 3D chromatin looping, and that disease-associated variants often map to regulatory elements linked to target genes. Single-cell multiomics technologies (joint or paired scRNA-seq and scATAC-seq) enable cell-type-resolved profiling, but integrative computational methods to infer TF–cis–gene relationships under contrasting conditions are still emerging. Recent approaches such as TRIPOD and FigR focus on single-condition peak–gene linkage and do not explicitly model condition-driven regulation within cell types. Conventional differential expression analyses may fail to capture regulatory mechanisms or provide robust biomarkers in challenging contrasts (e.g., MRSA vs MSSA). The field also emphasizes incorporating priors such as TF motif maps and TAD boundaries and validating inferred interactions using chromatin interaction datasets (Hi-C/HiChIP) and independent omic datasets.
Methodology
MAGICAL constructs candidate circuits within each cell type using differentially accessible sites (DAS) and differentially expressed genes (DEG) between contrast conditions. TFs are linked to DAS via motif mapping (chromVARmotifs), and peaks are paired to genes within the same topologically associating domain (TAD) using TAD boundaries called from GM12878 Hi-C. The framework explicitly models: (1) chromatin accessibility as a linear combination of TF–peak binding confidence and hidden TF activity plus noise, and (2) gene expression as a linear combination of TF–peak binding, TF activity, and peak–gene looping confidence plus noise. Hidden TF activities are learned at the sample-by-cell-type level and shared across RNA and ATAC cells from the same sample, enabling use of true multiome or sample-paired datasets without cell-level pairing. Gaussian priors are assumed for the TF–peak binding, TF activity, and peak–gene looping variables, and noise is modeled as zero-mean Gaussian with inverse-gamma priors on the variances. Gibbs sampling iteratively estimates posterior distributions of TF–peak binding, TF activity, and peak–gene looping, updating binary linkage states by mapping sampled values to probabilities and sampling over many iterations. High-confidence circuits are selected based on posterior probabilities (top 10%).
Benchmarking: MAGICAL’s peak–gene loop inference was compared against TRIPOD and FigR on public multiome datasets using validation by enrichment of curated 3D chromatin interactions (4DGenome; GM12878 H3K27ac HiChIP). To assess the role of priors, MAGICAL was also run with a naive 500 kb distance prior in place of TADs.
COVID-19 analysis: MAGICAL was applied to a public PBMC scRNA-seq/scATAC-seq discovery dataset, stratified by mild and severe disease, focusing on CD8 TEM, CD14 monocytes, and NK cells. Circuit chromatin sites were validated using a newly generated independent PBMC scATAC-seq dataset (six mild COVID-19 and three controls) and published severe COVID-19 T-cell chromatin changes; circuit genes were validated against genes from independent single-cell COVID-19 studies.
S. aureus analysis: The authors generated paired scRNA-seq and scATAC-seq PBMC datasets from adults with MRSA (n=10), MSSA (n=11), and controls (n=23). scRNA-seq processing used Seurat with quality control, batch correction, and pseudobulk DE refinement; scATAC-seq processing used ArchR with QC, clustering, peak calling (MACS2), and differential accessibility with refinement via pseudobulk linear modeling. MAGICAL identified circuits for three contrasts (MRSA vs control, MSSA vs control, MRSA vs MSSA) across 13 cell types surpassing cell count thresholds. Validation included enrichment of circuit peak–gene links in cell-type-matched pcHi-C datasets (B, CD4 T, CD8 T, CD14 Mono), TF ChIP-seq overlap (Cistrome), enhancer overlap (ENCODE), and inflammatory GWAS locus enrichment (GREGOR).
Predictive modeling: Circuit genes shared between MRSA and MSSA were filtered by pseudobulk AUROC and used to train support vector machine models to predict S. aureus infection in independent adult whole-blood and pediatric PBMC microarray datasets. For MRSA vs MSSA, circuit genes differing between strains were similarly filtered and used to train models tested on three pediatric PBMC datasets. Performance was compared to DEG-based feature sets.
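The two linear layers described in the Methodology can be illustrated with a small forward simulation. This is a sketch under stated assumptions, not the authors' implementation: the function name `simulate_circuit`, the argument layout, and the single shared noise level are hypothetical, and the published model's exact parameterization (for example, its priors and binary linkage states) is not reproduced here.

```python
import random


def simulate_circuit(binding, looping, tf_activity, noise_sd=0.1, seed=0):
    """Forward-simulate accessibility and expression for one cell type.

    Hypothetical layout (an assumption for this sketch):
      binding[p][m]     -- confidence that TF m binds peak p
      looping[g][p]     -- confidence that peak p loops to gene g
      tf_activity[m][s] -- hidden activity of TF m in sample s
    Returns (atac, rna) as peak-by-sample and gene-by-sample lists.
    """
    rng = random.Random(seed)
    n_peaks = len(binding)
    n_tfs = len(tf_activity)
    n_samples = len(tf_activity[0])

    # Layer 1: regulated signal at each peak is a linear combination of
    # TF-peak binding and hidden TF activity.
    peak_signal = [
        [sum(binding[p][m] * tf_activity[m][s] for m in range(n_tfs))
         for s in range(n_samples)]
        for p in range(n_peaks)
    ]
    # Observed accessibility = regulated signal + zero-mean Gaussian noise.
    atac = [[x + rng.gauss(0.0, noise_sd) for x in row] for row in peak_signal]

    # Layer 2: expression is the looping-weighted sum of the regulated
    # peak signals, again plus zero-mean Gaussian noise.
    rna = [
        [sum(looping[g][p] * peak_signal[p][s] for p in range(n_peaks))
         + rng.gauss(0.0, noise_sd)
         for s in range(n_samples)]
        for g in range(len(looping))
    ]
    return atac, rna


if __name__ == "__main__":
    binding = [[1.0, 0.0], [0.0, 2.0]]      # 2 peaks x 2 TFs
    looping = [[0.5, 0.5]]                  # 1 gene x 2 peaks
    tf_activity = [[1.0, 2.0], [3.0, 4.0]]  # 2 TFs x 2 samples
    atac, rna = simulate_circuit(binding, looping, tf_activity, noise_sd=0.0)
    print(atac, rna)
```

Inference runs in the opposite direction: given observed accessibility and expression, the Gibbs sampler estimates the binding, activity, and looping terms that this simulation takes as inputs.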
Key Findings
• MAGICAL outperformed TRIPOD and FigR for peak–gene loop inference on benchmark multiome datasets, showing significantly higher enrichment of validated chromatin interactions (P<0.0001, two-sided Fisher’s exact test), even when run without TAD priors (P<0.001).
• COVID-19: In CD8 TEM, CD14 monocytes, and NK cells, MAGICAL identified 1,489 high-confidence circuits (1,404 sites; 391 genes) across mild and severe groups. Circuit site precision against independent datasets exceeded original DAS and simple proximity/TAD-based baselines by ~50% (P<0.05 to <0.001), and circuit gene precision exceeded DEG by ~30% (P<0.05).
• S. aureus: Across MRSA vs control, MSSA vs control, and MRSA vs MSSA, MAGICAL identified 1,513 circuits (1,179 sites; 371 genes). Circuits were highly cell-type specific, with strongest representation in CD14 monocytes. Circuit peak–gene interactions showed significant enrichment in matched cell-type pcHi-C data (P<0.01) and lower enrichment in mismatched cell types.
• TFs and distal regulation: In CD14 monocytes, AP-1 complex TFs (FOS/JUN family) were most enriched at sites with increased accessibility during infection; circuit genes were enriched for cytokine signaling pathways (adjusted P=2.4×10^-10). Circuit sites were enriched 15–25 kb from TSS and overlapped ENCODE enhancer-like regions (~50%), aligning with known median enhancer distances (~24 kb). Circuit sites were enriched for inflammatory GWAS loci (P<0.005), including a distal site looping to HLA-DRB1 in a top S. aureus-associated GWAS region.
• Epi-driven genes: Circuit genes in CD14 monocytes were significantly enriched for experimentally epigenetically driven genes (adjusted P<0.005), whereas other DEG sets showed no such enrichment.
• Diagnostic performance: Using 117 circuit genes common to MRSA and MSSA (filtered by discovery pseudobulk AUROC>0.7), S. aureus infection was predicted in independent datasets with AUROCs 0.93–0.98. For antibiotic sensitivity, 32 circuit genes distinguishing MRSA vs MSSA achieved AUROCs 0.67–0.75 across three pediatric datasets, with significant score differences between groups (P=9.2×10^-3), while DEG-based models failed (AUCs ~0.5; non-significant).
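The pseudobulk AUROC filter mentioned under diagnostic performance can be sketched as follows. This is an illustration only: the function names, the gene labels, and the choice to accept genes that separate the groups in either direction are assumptions, and the summary does not say how the authors handled direction or ties. The 0.7 threshold is the one quoted above.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum identity: the fraction of (case, control)
    pairs in which the case scores higher, counting ties as half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def filter_genes(pseudobulk, labels, threshold=0.7):
    """Keep genes whose per-sample pseudobulk expression separates the
    two groups with AUROC above the threshold.

    Assumption: a gene qualifies whether it is up- or down-regulated in
    cases, so the test uses max(AUROC, 1 - AUROC).
    """
    kept = []
    for gene, values in pseudobulk.items():
        a = auroc(values, labels)
        if max(a, 1.0 - a) > threshold:
            kept.append(gene)
    return kept


if __name__ == "__main__":
    # Hypothetical pseudobulk values for four samples (two controls, two cases).
    labels = [0, 0, 1, 1]
    pseudobulk = {
        "GENE_A": [0.1, 0.2, 0.9, 1.0],  # higher in cases
        "GENE_B": [0.5, 0.4, 0.5, 0.4],  # uninformative
        "GENE_C": [1.0, 0.9, 0.2, 0.1],  # lower in cases
    }
    print(filter_genes(pseudobulk, labels))
```

In the study, genes passing this kind of filter were then used as features for support vector machine models; that training step is not shown here.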
Discussion
The study demonstrates that integrating scRNA-seq and scATAC-seq via a hierarchical Bayesian framework enables accurate, cell-type-specific inference of disease-modulated regulatory circuits, particularly involving distal regulatory elements that are not captured by nearest-site assumptions or expression-only analyses. By explicitly modeling TF–peak binding, hidden TF activity, and peak–gene looping while accounting for modality-specific noise, MAGICAL improves the accuracy of both chromatin site and gene associations to disease conditions. The approach generalizes across diseases and cohorts: in COVID-19, MAGICAL-selected circuit sites and genes validated better than standard DAS/DEG; in S. aureus infection, circuits localized primarily to CD14 monocytes and highlighted AP-1-mediated inflammatory regulation, with strong support from pcHi-C, ChIP-seq, enhancer, and GWAS enrichments. Importantly, circuit genes provided robust, generalizable diagnostic features, distinguishing infection status across adult and pediatric cohorts and resolving antibiotic resistance phenotypes where DEG-based models failed, indicating that multiomic regulatory context yields more reliable biomarkers than expression alone.
Conclusion
MAGICAL provides a principled framework to map disease-associated regulatory circuits at cell-type resolution from single-cell multiomics, accurately resolving distal regulatory interactions and yielding robust biomarkers. It outperforms existing methods for peak–gene linkage, validates across independent datasets, and enables diagnostic prediction of infection state and antibiotic sensitivity. Future work should extend MAGICAL to directly model and discover cell-type-specific regulatory changes across closely related cell states, improve circuit inference when cell types are poorly defined or candidate features are sparse, and incorporate additional priors or experimental modalities to further enhance distal interaction resolution.
Limitations
• MAGICAL relies on sufficient numbers of DAS and DEG within each cell type; in less distinct cell types or conditions with few candidates, circuit inference power decreases.
• The method analyzes each cell type independently and does not directly model cell-type specificity across types for disease circuit identification.
• TAD boundaries were taken from the GM12878 lymphoblastoid cell line due to limited PBMC Hi-C data; although TADs are generally conserved, mismatches may affect peak–gene pairing in some contexts.
• Experimental aspects: sample sizes were not predetermined statistically, experiments were not randomized, and investigators were not blinded, which may introduce bias in validation datasets.
• Validation of inferred peak–gene links relies on available chromatin interaction datasets that may be incomplete or cell-type mismatched, potentially underestimating true positives.
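To make the TAD limitation concrete, the sketch below contrasts pairing peaks and genes within the same TAD against the naive 500 kb distance window used in the benchmarking comparison. Everything here is hypothetical: the function names, coordinates, and gene and peak labels are invented for illustration, and the code assumes a single chromosome with boundaries given as sorted coordinates.

```python
from bisect import bisect_right


def tad_index(boundaries, pos):
    """Index of the TAD containing pos, given sorted boundary coordinates
    on one chromosome (assumption: boundaries partition the chromosome)."""
    return bisect_right(boundaries, pos)


def pair_by_tad(peaks, genes, boundaries):
    """Candidate peak-gene pairs where the peak and the gene's TSS fall
    inside the same TAD."""
    return [
        (peak, gene)
        for peak, peak_pos in peaks.items()
        for gene, tss in genes.items()
        if tad_index(boundaries, peak_pos) == tad_index(boundaries, tss)
    ]


def pair_by_distance(peaks, genes, window=500_000):
    """Candidate peak-gene pairs where the peak lies within a fixed
    distance window of the gene's TSS, ignoring TAD structure."""
    return [
        (peak, gene)
        for peak, peak_pos in peaks.items()
        for gene, tss in genes.items()
        if abs(peak_pos - tss) <= window
    ]


if __name__ == "__main__":
    boundaries = [1_000_000, 2_000_000]             # hypothetical TAD boundaries
    peaks = {"peak1": 900_000, "peak2": 1_200_000}  # hypothetical peak centers
    genes = {"GENE_X": 1_100_000, "GENE_Y": 1_900_000}
    print(pair_by_tad(peaks, genes, boundaries))
    print(pair_by_distance(peaks, genes))
```

In this toy case the distance window links peak1 to GENE_X across a TAD boundary and misses the 700 kb intra-TAD link from peak2 to GENE_Y. A boundary shifted by a cell-line mismatch would change the TAD-based candidate set in a similar way.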