Chromatin accessibility during human first-trimester neurodevelopment



C. C. A. Mannens, L. Hu, et al.

A high-resolution multiomic atlas of chromatin accessibility and gene expression in the early developing human brain, by Camiel C. A. Mannens, Lijuan Hu, Peter Lönnerberg and colleagues. The study examines the roles of transcription factors and regulatory elements in early brain development and reveals vulnerabilities in GABAergic neurons linked to major depressive disorder.
Introduction

The developing human brain undergoes sequential patterning, specification and differentiation events that generate over a thousand distinct neural and non-neural cell types. Single-cell RNA sequencing has delineated trajectories and regional diversity during development, but the regulatory DNA landscape that underpins these processes is dynamic, cell-type specific and often transient. This presents challenges for interpreting GWAS loci, which largely fall in non-coding regions and are context dependent. Prior single-cell chromatin accessibility maps in humans have emphasized the second-trimester cortex or model systems. The purpose of this study is to chart chromatin accessibility and paired gene expression at single-cell resolution across the entire human brain during the first trimester, when patterning occurs and many neural lineages acquire core transcriptional identities.

Literature Review

Previous work established single-cell gene expression atlases of the developing human brain and adult brain, revealing differentiation trajectories and regional specialization (e.g., Braun et al., 2023; Siletti et al., 2023; Zeisel et al., 2018; Saunders et al., 2018; La Manno et al., 2021). Single-cell epigenomic studies mapped regulatory landscapes primarily in the developing cerebral cortex (Ziffra et al., 2021; Trevino et al., 2021), in whole embryos (Domcke et al., 2020), and in organoids or iPSC-derived systems (Fleck et al., 2023). GWAS signals for neurodevelopmental and psychiatric disorders often lie in regulatory DNA (Maurano et al., 2012), with multiple studies implicating fetal neurodevelopment in disease risk (Schork et al., 2019; Finucane et al., 2018). These studies collectively motivate a whole-brain, first-trimester chromatin accessibility atlas to contextualize regulatory programs and disease-associated non-coding variants.

Methodology

Samples and assays: Human fetal brain tissue (6–13 post-conception weeks) from multiple hospitals and resources was dissected into major regions (telencephalon, diencephalon, mesencephalon, metencephalon/pons/medulla, and cerebellum). Single-nucleus ATAC-seq (10x Genomics) and single-cell Multiome ATAC + Gene Expression were performed. After quality control (TSS enrichment ≥5, fragment thresholds, and doublet removal, among other filters), 526,094 nuclei were retained, including 166,785 with paired RNA from Multiome.

Preprocessing and clustering: A Chromograph pipeline implemented stratified peak calling. Initial rough clustering used 20-kb genomic bins, followed by consensus peak calling (MACS2 on pseudobulks per precluster). Iterative LSI decomposition was applied, with batch correction by Harmony, kNN graph construction, and Louvain community detection. A split-and-pool strategy subdivided major classes (radial glia/glioblast, OPC, neuron, fibroblast, vascular, immune), ultimately yielding 135 clusters.
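The initial 20-kb binning step can be illustrated with a minimal sketch. This is not the Chromograph implementation: the tuple layout of `fragments`, the function name `bin_fragments`, and the choice to assign each fragment to the bin containing its midpoint are assumptions made for illustration only.

```python
from collections import Counter

BIN_SIZE = 20_000  # 20-kb genomic bins, as stated in the text


def bin_fragments(fragments, bin_size=BIN_SIZE):
    """Count fragments per (barcode, chromosome, bin index).

    `fragments` is an iterable of (chrom, start, end, barcode) tuples,
    a hypothetical simplification of a fragments file. Assigning each
    fragment to the bin that contains its midpoint is an assumption.
    """
    counts = Counter()
    for chrom, start, end, barcode in fragments:
        midpoint = (start + end) // 2
        counts[(barcode, chrom, midpoint // bin_size)] += 1
    return counts


# Toy input: three fragments from two nuclei.
example = [
    ("chr1", 100, 300, "AAAC"),
    ("chr1", 19_900, 20_300, "AAAC"),
    ("chr1", 45_000, 45_200, "TTTG"),
]
bin_counts = bin_fragments(example)
```

The resulting sparse bin-by-nucleus counts are the kind of matrix a rough first clustering could operate on before peaks are called per precluster.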

Peak annotation and cCRE–gene linkage: Peaks were annotated by genomic context (intergenic, promoter, intron, exon, TTS). A modified Cicero/SKGGM approach combined chromatin co-accessibility with gene expression to infer cCRE–gene links, yielding 106,991 enhancer–gene interactions across 16,267 genes and 59,069 accessible regions. Gene activity scores were computed by integrating peak-by-nucleus matrices with region–TSS covariance and normalized.

Topic modeling and motif analysis: pycisTopic (LDA) identified 175 topics of covarying accessible regions, interpreted as regulatory programs. GREAT provided functional annotation. HOMER motif enrichment was run on marker peaks and topics, with motif reduction to arche-motifs; motif–expression concordance filtered likely TFs per cluster.

CNN enhancer model: Accessible regions enriched in five neuronal clades (midbrain GABAergic neurons, hindbrain glutamatergic neurons, telencephalic glutamatergic neurons, cerebellar granule neurons, and Purkinje neurons) were used to train a CNN (4 convolutional layers + 2 dense layers; PyTorch). Inputs were one-hot encoded 401-bp sequences. The model was trained with the Adam optimizer and a cross-entropy loss with label smoothing, and performance was assessed by ROC AUC. DeepExplainer and TF-MoDISco were used to compute nucleotide contribution scores and to identify the sequence motifs (seqlets) driving predictions.
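The input encoding and loss described above can be sketched in plain Python. This is an illustration under stated assumptions, not the study's PyTorch code: the smoothing strength `eps=0.1`, the row order A, C, G, T, and the all-zero encoding of unknown bases are hypothetical choices. The smoothed target follows the common definition of (1 - eps) on the true class plus eps / n_classes on every class.

```python
import math

ALPHABET = "ACGT"  # row order is an assumption
SEQ_LEN = 401      # input width stated in the text


def one_hot(seq):
    """Return a 4 x len(seq) matrix; bases outside ACGT (e.g. N) stay all-zero."""
    seq = seq.upper()
    return [[1.0 if base == letter else 0.0 for base in seq] for letter in ALPHABET]


def smoothed_targets(label, n_classes, eps=0.1):
    """Label-smoothed target distribution over n_classes."""
    return [(1.0 - eps if k == label else 0.0) + eps / n_classes
            for k in range(n_classes)]


def cross_entropy(logits, targets):
    """Cross-entropy between softmax(logits) and a target distribution."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, targets))


seq = "ACGTN" * 80 + "A"              # toy 401-bp sequence
x = one_hot(seq)                      # 4 rows of 401 values
y = smoothed_targets(label=2, n_classes=5)
loss = cross_entropy([0.0] * 5, y)    # uniform logits give a loss of log(5)
```

With five clades, an untrained model that outputs uniform scores has a loss of log(5), which is a useful baseline when reading training curves.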

Lineage dynamics: For the Purkinje lineage (71,947 nuclei), pseudotime was inferred with pySlingshot. DELAY (deep learning-based GRN inference leveraging pseudotime-lagged causality) was used to build a TF regulatory network (148 TFs). BoolODE converted the Boolean network to ODEs to simulate TF dynamics and validate trajectories. cCRE accessibility dynamics for ESRRB-linked regions were modeled, and CNN-derived nucleotide importance identified TF-binding contributions (e.g., TFAP2B in early enhancers, LHX5 in late enhancers).

GWAS enrichment: Stratified LD score regression (LDSC) assessed heritability enrichment of 11 psychiatric phenotypes and 325 UK Biobank traits in cluster-specific accessible regions, conditioning on accessible regions from other developmental stages and adult tissues. MAGMA mapped cCREs to genes and also treated individual accessible regions as features to identify MDD-associated genes and elements. Significant elements were examined for TF motif enrichment and CNN-predicted binding sites.

Quality control and statistics: Quality control comprised TSS enrichment, fragment distributions, doublet detection (adapted DoubletFinder), and sex inference from Y-chromosomal reads. Thresholds for high-quality nuclei were set on fragments and TSS fraction and, for Multiome, on RNA UMIs and percent unspliced. Statistical analyses included Fisher’s exact tests (Benjamini–Hochberg corrected), linear regression with age covariates, topic stability (coherence and log-likelihood), and multiple testing correction (FDR/Bonferroni) for GWAS enrichments.
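For readers unfamiliar with the Benjamini–Hochberg correction mentioned above, the standard step-up procedure can be sketched as follows. The function name and the toy p-values are illustrative and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]   # made-up p-values
qvals = benjamini_hochberg(pvals)   # approximately [0.02, 0.04, 0.04, 0.02]
```

A test is called significant at a false discovery rate of q when its adjusted value is at most q.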

Key Findings

Atlas and cCRE–gene links: 526,094 nuclei across 6–13 PCW yielded 135 clusters spanning neural and non-neural lineages. Using multiomic integration, 106,991 enhancer–gene (cCRE–gene) links were identified for 16,267 genes and 59,069 accessible regions.

Accessibility dynamics: Along neuronal differentiation, the number of accessible regions increased significantly, by ~10% (P<0.05). With age, the number of accessible regions increased across all classes except radial glia (P<0.001; coefficient 3,206; s.e. 634; t=5.06; 6 d.f.; linear regression). Newly acquired accessible regions were enriched for NFI-binding sites, suggesting a maturation role for NFI factors. In the oligodendroglial lineage, no significant increase was observed, consistent with heterochromatinization during differentiation.
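The reported regression statistics are internally consistent, since the t statistic of a regression coefficient is the coefficient divided by its standard error. A short check, using only the two values quoted above (the variable names are illustrative):

```python
coefficient = 3206.0     # regression coefficient reported in the text
standard_error = 634.0   # its reported standard error

# t statistic of a regression coefficient = estimate / standard error
t_statistic = coefficient / standard_error
print(round(t_statistic, 2))  # prints 5.06
```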

Specificity: Gene expression profiles were generally more specific than aggregate accessibility scores, but individual accessible regions were more cell-type specific than marker genes: 1,361 marker genes versus 120,183 marker regions. Most VISTA CNS enhancers overlapped with accessible regions (96%).

Regulatory topics and motifs: 175 cis-topics captured coherent regulatory programs; promoter-proximal regions clustered as constitutively open, and specific topics highlighted pan-neuronal and glial CTCF-binding sets. Topic 4 was enriched for motifs associated with interneuron identity (MEIS, HD/2, E-box/CAGATGG, NFI), and Topic 25 for motifs associated with oligodendrocyte differentiation (SOX/4 arche-motif). CTCF-related topics underscored the role of chromatin organization in neurodevelopment.

CNN predictions and TF syntax: A CNN trained on enhancer sequences classified five neuronal clades with mean ROC AUC 0.92. Key motifs included MEIS1 and ATOH1 in hindbrain glutamatergic and cerebellar granule neurons; LHX2 and BHLHE22 in telencephalic glutamatergic neurons; GATA2 (midbrain-only) and OTX2/DMBX1, TFAP2B, and LHX1/5 in GABAergic and Purkinje neurons.
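A mean ROC AUC for a multi-class classifier is commonly obtained by averaging one-vs-rest AUCs over the classes. The sketch below assumes that convention, which the text does not spell out, and uses made-up scores. The AUC itself is computed as the probability that a positive example outscores a negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """ROC AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_one_vs_rest_auc(true_classes, class_scores):
    """Average one-vs-rest AUC; class_scores[i][k] is example i's score for class k."""
    n_classes = len(class_scores[0])
    aucs = []
    for k in range(n_classes):
        labels = [1 if c == k else 0 for c in true_classes]
        scores = [row[k] for row in class_scores]
        aucs.append(roc_auc(labels, scores))
    return sum(aucs) / n_classes


# Toy example with three classes and four sequences (made-up scores).
true_classes = [0, 1, 2, 0]
class_scores = [
    [0.80, 0.10, 0.10],
    [0.20, 0.70, 0.10],
    [0.30, 0.30, 0.40],
    [0.25, 0.50, 0.25],
]
mean_auc = mean_one_vs_rest_auc(true_classes, class_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of a clade from the rest.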

Purkinje lineage regulation: A DELAY-inferred GRN (148 TFs) and BoolODE simulations revealed sequential TF activation culminating in ESRRB, a Purkinje-specific TF. Nine ESRRB-linked cCREs split into early and late groups: early cCREs harbored TFAP2B sites, late cCREs LHX5 sites. ESRRB activation follows a two-step mechanism: TFAP2B poising via early enhancers, then LHX5 induction via late enhancers, followed by genome-wide increase in ESRRB motif accessibility.

GWAS enrichment and MDD vulnerability: LDSC identified significant psychiatric trait enrichments in specific neuronal subtypes: schizophrenia in MGE-derived cortical interneurons and SATB2+ telencephalic excitatory neurons; ADHD in immature GABAergic neurons and cerebellar Purkinje neuroblasts; anorexia nervosa in LGE/CGE-derived interneurons; autism spectrum disorder in hindbrain neuroblasts; and insomnia in TAL2+ midbrain GABAergic neurons. Major depressive disorder (MDD) showed the strongest enrichments, in multiple midbrain-derived GABAergic neuron groups (validated in an independent cohort; one-sided; Benjamini–Hochberg α = 3.37×10^-10). MAGMA identified 25 MDD-associated genes, including NEGR1, BTN3A2, LRFN5, SCN8A, and histone genes at the BTN3A2 locus. MDD-associated accessible regions were enriched for MEIS2, OTX2, and GATA2 motifs; of 114 significant regions, 29 contained CNN-predicted OTX2 sites and 46 contained GATA2 sites. One SNP (rs114155007) directly overlapped an OTX2-binding site. These findings suggest that broadly expressed genes contribute to MDD when dysregulated within midbrain GABAergic neurons during early development.

Discussion

This work provides a comprehensive single-cell multiomic atlas of chromatin accessibility during the first trimester across the whole human brain, linking cCREs to gene expression and decomposing regulatory programs into sequence-level syntax. The age- and differentiation-associated increase in accessible regions, particularly those enriched for NFI motifs, underscores shared maturation mechanisms in neural progeny. Although gene expression better demarcates cell identity at the gene level, highly specific accessible regions and cis-topic programs deliver precise regulatory signatures that complement expression-based annotations.

The CNN-based enhancer model decodes TF motif logic that differentiates neuronal subtypes and, coupled with lineage analyses, reveals temporally ordered enhancer activation. The Purkinje lineage case study elucidates a two-step ESRRB activation via TFAP2B (early) and LHX5 (late) enhancers, followed by ESRRB-dependent chromatin remodeling, illustrating how this resource enables mechanistic dissection from TFs to cCREs to target gene activation in vivo.

Integrating GWAS with developmental cCRE maps highlights context-specific vulnerabilities: significant enrichments align with known disease biology (e.g., schizophrenia with cortical lineages and ADHD with cerebellar and immature GABAergic states) and reveal strong associations of MDD with midbrain-derived GABAergic neurons—pointing to early developmental perturbations in specific midbrain circuits. The result emphasizes that non-coding disease risk is realized within discrete cell types and developmental windows, and suggests that many associated genes are broadly expressed but mediate disease through cell type- and time-specific regulatory disruption.

Overall, the atlas advances understanding of gene regulatory mechanisms guiding early human neurodevelopment and provides a foundation for interpreting disease-associated variants in a precise developmental and cell-type context.

Conclusion

The study delivers a high-resolution, first-trimester human brain atlas of chromatin accessibility paired with gene expression, identifying over 100,000 cCRE–gene links and regulatory topics that define lineages and regions. A CNN framework captures enhancer grammar distinguishing neuronal subtypes and, together with pseudotime GRN inference, reveals a two-step enhancer logic for ESRRB activation in the Purkinje lineage. Developmental cCRE maps contextualize psychiatric GWAS signals, highlighting midbrain-derived GABAergic neurons as a key vulnerability for MDD risk variants. This resource enables multi-scale analyses—from developmental trajectories to nucleotide-resolution enhancer logic—and will facilitate mechanistic studies of human neurodevelopment and disease. Future work should extend atlases across later fetal stages and postnatal/adult periods, integrate additional epigenomic modalities, and functionally validate predicted enhancers and variant effects.

Limitations

Key limitations include: (1) clinical sample timing variability (gestational age inferred by expert annotation rather than exact conception date); (2) incomplete regional coverage per specimen due to tissue availability/damage, necessitating pooling across donors; (3) variable intervals between collection and processing across sources; (4) sample size constrained by scarcity of early human fetal tissue, with no power calculations; (5) exploratory, non-blinded study design; and (6) GWAS enrichments interpreted within an early developmental window only, so later stages and adult tissues may reveal additional or distinct vulnerabilities.
