Feature selection is crucial for accurate clustering of single-cell data. Existing methods perform inconsistently across datasets and ignore the information contained in gene-gene correlations. DUBStepR, a feature selection algorithm, combines gene-gene correlations with a Density Index (DI) to select informative genes. It outperforms existing methods across clustering benchmarks and robustly deconvolves T and NK cell heterogeneity in rheumatoid arthritis patient data. DUBStepR scales to large datasets and can be applied to other data types such as single-cell ATAC-seq.
Publisher
NATURE COMMUNICATIONS
Published On
Oct 06, 2021
Authors
Bobby Ranjan, Wenjie Sun, Jinyu Park, Kunal Mishra, Florian Schmidt, Ronald Xie, Fatemeh Alipour, Vipul Singhal, Ignasius Joanito, Mohammad Amin Honardoost, Jacy Mei Yun Yong, Ee Tzun Koh, Khai Pang Leong, Nirmala Arul Rayan, Michelle Gek Liang Lim, Shyam Prabhakar
Tags
feature selection
single-cell data
gene-gene correlations
DUBStepR
clustering
rheumatoid arthritis
scalability