Data-driven fine-grained region discovery in the mouse brain with transformers

Interdisciplinary Studies

A. J. Lee, A. Dubuc, et al.

Discover a scalable self-supervised workflow for detecting fine-grained tissue domains from multimillion-cell spatial transcriptomics data using CellTransformer, an encoder-decoder model that learns hierarchical tissue features and enables GPU-accelerated clustering on MERFISH and Slide-seqV2 whole-brain datasets. The research was conducted by Alex J. Lee, Alma Dubuc, Michael Kunst, Shenqin Yao, Nicholas Lusk, Lydia Ng, Hongkui Zeng, Bosiljka Tasic, and Reza Abbasi-Asl.
Abstract
Spatial transcriptomics offers unique opportunities to define the spatial organization of tissues and organs, such as the mouse brain. We address a key bottleneck in the analysis of organ-scale spatial transcriptomic data by establishing a workflow for self-supervised spatial domain detection that is scalable to multimillion-cell datasets. This workflow uses a self-supervised framework for learning latent representations of tissue spatial domains or niches. We use an encoder-decoder architecture, which we named CellTransformer, to hierarchically learn higher-order tissue features from lower-level cellular and molecular statistical patterns. Coupling our representation learning workflow with minibatched GPU-accelerated clustering algorithms allows us to scale to multimillion-cell MERFISH datasets where other methods cannot. CellTransformer is effective at integrating cells across tissue sections, identifying domains highly similar to ones in existing ontologies such as the Allen Mouse Brain Common Coordinate Framework (CCF) while allowing discovery of hundreds of uncataloged areas with minimal loss of domain spatial coherence. CellTransformer domains recapitulate previous neuroanatomical studies of areas in the subiculum and superior colliculus and characterize putatively uncataloged subregions in subcortical areas, which currently lack subregion annotation. CellTransformer is also capable of domain discovery in whole-brain Slide-seqV2 datasets. Our workflows enable complex multi-animal analyses, achieving nearly perfect consistency of up to 100 spatial domains in a dataset of four individual mice with nine million cells across more than 200 tissue sections. CellTransformer advances the state of the art for spatial transcriptomics by providing a performant solution for the detection of fine-grained tissue domains from spatial transcriptomics data.
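The abstract says that learned latent representations are clustered with minibatched algorithms, but it does not name the algorithm or give implementation details. As a rough illustration of the general idea only, the sketch below runs mini-batch k-means on synthetic two-dimensional "embeddings". Everything in it is an assumption: the function names (`minibatch_kmeans`, `nearest`, `assign`), the parameters, and the toy data are hypothetical, and it runs in pure Python on the CPU rather than on a GPU.

```python
import random


def nearest(point, centers):
    """Return the index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((x - c) ** 2 for x, c in zip(point, centers[j])),
    )


def minibatch_kmeans(points, k, batch_size=32, n_iters=200, seed=0):
    """Mini-batch k-means: update centers from small random batches
    instead of the full dataset, so memory use stays bounded."""
    rng = random.Random(seed)
    # Initialise centers from k randomly chosen points.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # how many points each center has absorbed so far
    for _ in range(n_iters):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then apply the updates.
        assignments = [nearest(p, centers) for p in batch]
        for p, j in zip(batch, assignments):
            counts[j] += 1
            lr = 1.0 / counts[j]  # per-center learning rate (running mean)
            centers[j] = [(1 - lr) * c + lr * x for c, x in zip(centers[j], p)]
    return centers


def assign(points, centers):
    """Label every point with the index of its nearest center."""
    return [nearest(p, centers) for p in points]


# Toy stand-in for learned embeddings: two well-separated groups of 100 points.
data_rng = random.Random(42)
group_a = [[data_rng.uniform(-0.5, 0.5), data_rng.uniform(-0.5, 0.5)] for _ in range(100)]
group_b = [[100 + data_rng.uniform(-0.5, 0.5), 100 + data_rng.uniform(-0.5, 0.5)] for _ in range(100)]
embeddings = group_a + group_b

centers = minibatch_kmeans(embeddings, k=2)
labels = assign(embeddings, centers)
```

Because each update touches only one batch, the cost per step is independent of the total number of cells, which is the property that makes this family of methods attractive for multimillion-cell datasets. A GPU implementation would vectorize the distance computation rather than loop over points.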
Publisher
Nature Communications
Published On
Oct 07, 2025
Authors
Alex J. Lee, Alma Dubuc, Michael Kunst, Shenqin Yao, Nicholas Lusk, Lydia Ng, Hongkui Zeng, Bosiljka Tasic, Reza Abbasi-Asl
Tags
spatial transcriptomics
self-supervised learning
CellTransformer
MERFISH
Slide-seqV2
tissue domain discovery
multimillion-cell scaling