SPACEL: deep learning-based characterization of spatial transcriptome architectures

H. Xu, S. Wang, et al.

SPACEL is a deep learning toolkit for spatial transcriptomics data analysis. The authors report that it outperforms 19 existing methods in cell type deconvolution, spatial domain identification, and 3D tissue reconstruction. It was developed by a team from the University of Science and Technology of China and the Hefei Comprehensive National Science Center.

Introduction

Spatial transcriptomics (ST) enables transcriptome-wide measurements with spatial coordinates, using either image-based (e.g., seqFISH, osmFISH, MERFISH) or sequencing-based platforms (e.g., 10X Visium, Slide-seq, Stereo-seq). While many analytical methods impute missing transcripts and deconvolute spot-level cell-type mixtures, a pressing challenge remains: integrated analysis across multiple slices to identify coherent spatial domains and to reconstruct accurate 3D tissue architectures. Existing single-slice domain methods (BayesSpace, SpaGCN, STAGATE, stLearn) and multi-slice methods (STACI, PRECAST, STAligner) primarily rely on gene expression inputs and have limitations in robustness to batch effects and in 3D alignment, especially with partial slice overlap. PASTE and STAligner address multi-slice alignment but can misalign global structures due to assumptions (e.g., complete overlap, reliance on selected landmarks). The study introduces SPACEL—a toolkit with Spoint (cell-type deconvolution), Splane (multi-slice spatial domain identification from cell-type compositions with adversarial learning), and Scube (automated 3D alignment)—to provide robust multi-slice analysis and accurate 3D reconstruction across technologies and tissues.

Literature Review

Prior work has focused on spatial domain identification and multi-slice integration. Single-slice methods include BayesSpace (Bayesian MCMC to subspot resolution then clustering), SpaGCN (graph convolutional network integrating expression and spatial location), STAGATE (adaptive graph attention autoencoder learning neighborhood similarity), and stLearn (Louvain clustering with spatial nearest neighbors). Multi-slice approaches include STACI (over-parameterization), PRECAST (projecting batch effects), and STAligner (triplet-loss with landmark domains). For 3D alignment, PASTE integrates slices assuming complete 2D overlap and balances expression and spatial distances via a hyperparameter, and STAligner builds MNNs from expression for landmark-based alignment; both can misalign overall structures and underuse per-spot correspondences, especially with partial overlap. The literature suggests cell-type distribution is more robust than gene expression for identifying uniform domains across slices, motivating SPACEL’s design choice to use cell-type compositions with adversarial learning to mitigate batch effects.

Methodology

SPACEL comprises three modules (Spoint, Splane, and Scube), complemented by a Gaussian process regression step for continuous 3D expression mapping.

1) Spoint (cell-type deconvolution): Training data are constructed by simulating pseudo-spots from scRNA-seq, assuming spot cell counts Ns ~ N(μs=10, δs=5) and number of cell types Nt ~ N(μt=μs/2, δt=δs/2); sampled cells are aggregated and downsampled to match real ST distributions. Spoint uses a variational autoencoder (VAE) to embed simulated (xs) and real (xr) spot expression into latent representations zs and zr. A predictor network E (three 512-node hidden layers, ReLU, Softmax output) estimates cell-type proportions, with a maximum mean discrepancy (MMD) loss aligning simulated and real latent features. A reconstructor R (three 512-node hidden layers) encourages recovery of the VAE latent z from E's outputs. The objectives combine cosine similarity and Kullback–Leibler (KL) divergence terms for proportion prediction (Lossc) and latent recovery (LossR). Training uses He initialization and Adam optimization and iterates until convergence (|ΔLossc|<0.001).

2) Splane (multi-slice spatial domain identification): An undirected k-NN adjacency graph is built (default k=6 for Visium; default k=25 for single-cell-resolution data). Inputs are cell-type compositions Q per spot/cell. A graph convolutional network (GCN) with Chebyshev polynomial filters (order K=2) propagates features through 5 layers H0–H4; Lossc minimizes the deviation between input H0(Q) and output H4(Q). A neighborhood smoothness loss (Losss) minimizes differences between the latent H2 of a spot and that of its neighbors. An adversarial discriminator D (two 64-node hidden layers) predicts slice labels from H2 (slice-prediction loss Lossp); adversarial training against D enforces slice-invariant shared latent features. The total loss is Loss = αc Lossc + αs Losss − αp Lossp (defaults αc=1, αs=1, αp=0.5), optimized with RMSProp and Xavier initialization. Latent H2 features are clustered with K-means, and the Davies–Bouldin score (DBS) guides convergence (ΔLoss<0.0001 and ΔDBS<0).

3) Scube (3D alignment of consecutive slices): Mutual nearest neighbor (MNN) graphs are constructed between adjacent slices from Splane-derived spatial domain coordinates.
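The mutual-nearest-neighbor idea behind this step can be illustrated with a minimal sketch. It is not Scube's implementation: the function names, the use of plain Euclidean distance on 2D coordinates, and the choice k=1 are assumptions made for illustration, and the sketch ignores domain labels.

```python
import math

def k_nearest(src, dst, k):
    """For each point in src, return the index set of its k nearest points in dst."""
    nearest = []
    for p in src:
        order = sorted(range(len(dst)), key=lambda j: math.dist(p, dst[j]))
        nearest.append(set(order[:k]))
    return nearest

def mutual_nearest_neighbors(coords_a, coords_b, k=1):
    """Return sorted (i, j) pairs where b_j is a k-nearest neighbor of a_i and vice versa."""
    a_to_b = k_nearest(coords_a, coords_b, k)
    b_to_a = k_nearest(coords_b, coords_a, k)
    return sorted(
        (i, j)
        for i, neighbors in enumerate(a_to_b)
        for j in neighbors
        if i in b_to_a[j]
    )

# Toy coordinates for two adjacent slices (hypothetical values).
slice_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
slice_b = [(0.1, 0.0), (5.2, 5.0), (9.0, 9.0)]
pairs = mutual_nearest_neighbors(slice_a, slice_b, k=1)
print(pairs)  # [(0, 0), (2, 1)]
```

In this toy case the second spot of slice A and the third spot of slice B remain unpaired, which is the kind of situation that arises when two slices only partially overlap.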
Rigid-body transformations (mirroring, rotation, translation) of each slice are optimized by differential evolution to maximize an alignment objective function (AOF) that rewards domain-consistent correspondences and penalizes mismatches; partial overlaps are handled through a penalty term and overlap-proportion weighting. Slices are recentered at initialization. After the pairwise transformations are optimized, the slices are stacked to build a 3D architecture.

4) Gaussian process regression (GPR) in 3D: An alpha-shape-based 3D manifold is built and smoothed; a Gaussian process (GP) with anisotropic RBF covariance models expression y at coordinates x, with hyperparameters (mean μ(x)=mean(y), process variance δ, length-scale l=A·var(y)) tuned by grid search and Adam. Predictions at 500,000 uniformly sampled 3D points yield continuous 3D gene distributions; a Bayes factor (BF) compares the optimized GP against an infinite-length-scale model to score spatial variation.

Benchmarking and evaluation: Deconvolution methods were compared on 32 simulated datasets using the Pearson correlation coefficient (PCC), structural similarity index (SSIM), root mean square error (RMSE), Jensen–Shannon divergence (JSD), and an accuracy score (AS). Spatial domains were evaluated on DLPFC Visium (12 slices) using the Jaccard Index (JI), Adjusted Rand Index (ARI), and Shifting Distance (SD). Identification of spatially variable genes (SVGs) was assessed by overlap with layer differentially expressed genes (DEGs; fold-change >0.5, P<0.01) and by ROC/AUC. Batch-effect removal in multi-batch tumor slices was assessed by Average Silhouette Width. Scube alignment was compared with STAligner and PASTE on simulated STARmap slices (varied crop ratios) using SSIM/PCC against ground truth, and on real MERFISH and Stereo-seq datasets using adjacent-slice SSIM/PCC.
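The pseudo-spot simulation described for Spoint can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the second parameter of each normal distribution is treated as a standard deviation, draws are rounded and clipped to valid ranges, every chosen cell type contributes at least one cell, and the downsampling step is omitted. The reference data structure (`cells_by_type`), the `max_cells` cap, and all function names are hypothetical.

```python
import random

def draw_count(rng, mean, sd, upper):
    """Draw a rounded normal variate and clip it to the range [1, upper]."""
    return max(1, min(upper, round(rng.gauss(mean, sd))))

def simulate_pseudo_spot(cells_by_type, rng, mu_s=10, sd_s=5, max_cells=50):
    """Return (summed expression, true cell-type proportions) for one pseudo-spot."""
    # Ns ~ N(mu_s, sd_s): number of cells in the spot.
    n_cells = draw_count(rng, mu_s, sd_s, max_cells)
    # Nt ~ N(mu_s / 2, sd_s / 2): number of distinct cell types in the spot.
    n_types = draw_count(rng, mu_s / 2, sd_s / 2, min(n_cells, len(cells_by_type)))
    chosen = rng.sample(sorted(cells_by_type), n_types)
    # Assumption: one cell per chosen type, the rest drawn uniformly among chosen types.
    labels = chosen + [rng.choice(chosen) for _ in range(n_cells - n_types)]
    n_genes = len(next(iter(cells_by_type.values()))[0])
    expression = [0] * n_genes
    for label in labels:
        cell = rng.choice(cells_by_type[label])  # sample a reference cell of this type
        for gene, count in enumerate(cell):
            expression[gene] += count  # aggregate counts across sampled cells
    proportions = {t: labels.count(t) / n_cells for t in cells_by_type}
    return expression, proportions

# Toy scRNA-seq reference: cell type -> list of per-cell count vectors (3 genes).
reference = {
    "neuron": [[5, 0, 1], [4, 1, 0]],
    "astrocyte": [[0, 6, 1]],
    "microglia": [[1, 0, 7]],
}
rng = random.Random(0)
expression, proportions = simulate_pseudo_spot(reference, rng)
print(expression, proportions)
```

Each simulated spot pairs an aggregated expression vector with known proportions, which is what lets a predictor be trained with supervision before being applied to real spots.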

Key Findings

- Spoint deconvolution: On 32 simulated datasets, Spoint achieved the best performance across metrics with highest averages PCC/SSIM = 0.73/0.69 and lowest RMSE/JSD = 0.05/0.41, and the highest accuracy score (AS = 0.93) compared to other methods (AS range 0.24–0.82). On human DLPFC Visium (12 slices) using cortical layer annotations as ground truth, Spoint attained higher accuracy scores (average AS = 0.60) than alternatives (0.30–0.48) and lower Wilcoxon P-values for layer-specific enrichment (average P = 0.01 vs 0.05–0.64). In three pseudo-spot datasets derived from single-cell-resolution ST (Stereo-seq, MERFISH), Spoint consistently achieved top accuracy across cell types; relative error of summed proportions was among the lowest (often comparable to SpatialDWLS).
- Splane spatial domains (DLPFC): Joint analysis outperformed per-slice analysis (Splane-single): across 12 slices, average JI/median ARI = 0.61/0.61 vs 0.53/0.57; average SD = 53 µm vs 262 µm. Versus multi- and single-slice state-of-the-art tools (STAligner, PRECAST, STACI, STAGATE, SpaGCN, BayesSpace, stLearn), Splane had the highest accuracy (average JI/median ARI = 0.61/0.61) and lowest error (average SD = 53 µm), while others achieved JI/ARI = 0.41–0.56/0.36–0.54 and SD = 75–548 µm.
- SVG identification: Using layer DEGs as ground truth, multi-slice methods identified more SVGs than single-slice methods; Splane identified 1,714 of 1,917 SVGs (PRECAST 1,671; STAligner 1,669; STACI 1,527). ROC-AUC: Splane 0.90, PRECAST 0.89, STAligner 0.88, others <0.83.
- Input choice for Splane: Using Spoint-predicted cell-type proportions as input yielded higher JI across all 12 DLPFC slices than using highly variable genes or their PCA; using deconvolution inputs from alternative methods reduced performance and in some cases failed to identify specific layers (e.g., Layer 4).
- Cancer datasets (11 breast cancer Visium slices across 3 batches): Splane identified 10 domains grouped as tumor (D0–D3), intermediate (D4–D6), and immune (D7–D9), with reduced batch effects via adversarial learning. Tumor domains showed expected CNV patterns (chr 1q and 8q gains, 1p losses) both within representative slices and on average across all 11 slices; intermediate/immune domains showed fewer or no CNVs. Immune domain validation: H&E-marked lymphocyte spots (slices S1, S2, S5, S6) enriched in D9/D8/D7 (44%, 8%, 6%) vs <1% in tumor/intermediate domains; CD3 IF in S10 enriched in D9/D8/D7 (72%, 70%, 63%) vs <47% in other domains. Splane uniquely delineated accurate tumor boundaries consistent with CNV profiles and H&E morphology and showed strong correlation between tumor cell proportion and CNV scores within predicted tumor domains.
- Scube 3D alignment: On simulated STARmap slices with crop ratio 0.25, Scube achieved average SSIM/PCC = 0.96/0.97 vs STAligner 0.76/0.77 and PASTE 0.72/0.75; across crop ratios 0.10–0.25, Scube consistently outperformed both methods in SSIM and PCC. On MERFISH MOp (33 slices), Splane yielded best domain accuracy (average JI/median ARI = 0.44/0.43) versus STACI (0.43/0.38), STAligner (0.33/0.25), PRECAST (0.23/0.16). Scube produced higher adjacent-slice SSIM/PCC (0.71/0.76) than STAligner (0.61/0.65) and PASTE (0.57/0.61), while STAligner and PASTE generated visibly twisted 3D architectures. In mouse embryo Stereo-seq data, Scube better reconstructed structures (brain, liver, paws) and achieved higher SSIM/PCC (0.62/0.66) than STAligner (0.57/0.61) and PASTE (0.24/0.25).
- Integrated workflow on mouse whole brain (75 Spatial Transcriptomics slices): Splane outperformed multi-slice competitors (average JI/median ARI 0.42/0.56 vs STAligner 0.36/0.40, PRECAST 0.35/0.38, STACI 0.31/0.40). Scube’s 3D reconstruction showed higher SSIM/PCC (0.83/0.85) than PASTE (0.82/0.84) and STAligner (0.79/0.81).
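For readers unfamiliar with two of the metrics reported above, the sketch below shows how a Jaccard Index between a predicted domain and an annotated layer, and a Pearson correlation coefficient between two value vectors, can be computed. It uses the standard textbook definitions; how the study matches domains to layers and averages across slices is not specified here, and the spot identifiers and values are made up for illustration.

```python
import math

def jaccard_index(predicted, annotated):
    """Size of the intersection over size of the union of two sets of spot IDs."""
    predicted, annotated = set(predicted), set(annotated)
    union = predicted | annotated
    return len(predicted & annotated) / len(union) if union else 0.0

def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical spots assigned to one predicted domain and one annotated layer.
predicted_domain = {"s1", "s2", "s3", "s4"}
annotated_layer = {"s2", "s3", "s4", "s5"}
ji = jaccard_index(predicted_domain, annotated_layer)  # 3 shared / 5 total = 0.6

# Hypothetical values at matched positions in two adjacent aligned slices.
pcc = pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear, so 1.0
print(ji, pcc)
```

Both metrics range up to 1, with higher values indicating closer agreement, which is how the comparisons in the list above should be read.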

Discussion

The study addresses the need for robust multi-slice spatial analysis and accurate 3D reconstruction by introducing SPACEL’s three modules. Splane’s use of cell-type compositions rather than raw gene expression, together with adversarial learning in a GCN, effectively mitigates batch effects across slices and enhances identification of common domain features, improving spatial-domain accuracy and SVG discovery. This design aligns with evidence that cell-type distributions are more stable across technical and biological heterogeneity. Spatial domains established by Splane provide reliable landmarks for Scube, which formulates alignment as a global optimization over rigid transformations with MNN-based correspondences and overlap-aware penalties. Compared to PASTE’s hyperparameter-sensitive trade-off between expression and distance and STAligner’s reliance on selected landmarks and partial correspondences, Scube leverages per-spot correspondences across adjacent slices, better preserving global structure and handling partial overlaps, yielding superior SSIM/PCC and visibly coherent 3D stacks across simulated and real datasets. The integrated GPR enables continuous 3D gene expression mapping and quantitative assessment of spatial variation via Bayes Factors, facilitating hypothesis generation about spatial gene dynamics. Applications to cancer highlight biologically meaningful domain delineations, consistent CNV patterns in tumor regions, and clear tumor boundaries, underscoring clinical relevance for tumor microenvironment mapping and immuno-oncology. Overall, SPACEL advances the field by coupling accurate deconvolution, multi-slice domain inference, and robust 3D alignment in a unified, cross-platform workflow.

Conclusion

SPACEL provides an integrated, deep learning-based framework for spatial transcriptomics analysis: Spoint accurately deconvolves cell-type compositions, Splane robustly identifies spatial domains across multiple slices with reduced batch effects, and Scube aligns consecutive slices to construct coherent 3D tissue architectures. Across simulated and diverse real datasets and technologies, SPACEL outperforms state-of-the-art alternatives in deconvolution accuracy, spatial-domain identification, SVG recovery, and 3D alignment quality. The inclusion of a GPR module further enables continuous 3D gene expression mapping and detection of spatially varying genes. Future work includes enabling nonlinear alignment in Scube, reducing retraining costs via transfer learning for incremental data addition, and expanding compatibility and efficiency across large-scale datasets and platforms.

Limitations

- For seq-based ST datasets (e.g., 10X Visium), Splane requires cell-type composition inputs, necessitating a preceding deconvolution step (e.g., Spoint), which adds dependency and potential error propagation.
- Current Scube implementation supports only rigid (linear) alignment and lacks automated nonlinear alignment capabilities, limiting performance where deformations are nonrigid.
- Model retraining is required when new slices are added; absence of incremental/transfer learning may impact computational efficiency for large-scale or continuously growing datasets.
- Sensitivity to choice and quality of reference scRNA-seq data in deconvolution may affect downstream domain identification.
