
Biology
Self-supervised machine learning for live cell imagery segmentation
M. C. Robitaille, J. M. Byers, et al.
Discover a groundbreaking self-supervised learning approach for segmenting cells in live-cell microscopy images, developed by Michael C. Robitaille, Jeff M. Byers, Joseph A. Christodoulides, and Marc P. Raphael. This innovative method eliminates the need for human-annotated labels, significantly reduces bias and variability, and is poised to enhance cell biology research!
~3 min • Beginner • English
Introduction
The study addresses the challenge of robust, reproducible, and accessible cell segmentation in time-lapse live-cell microscopy. Traditional supervised learning approaches, particularly deep neural networks, depend on large curated label sets and extensive pre-processing, which introduce labor, subjectivity, and potential bias; these models also often generalize poorly to diverse cell types, imaging modalities, and experimental setups. The authors propose leveraging a ubiquitous property of live-cell time-lapse data—cellular motion—to remove the need for manual labels and parameter tuning. They hypothesize that optical flow-derived motion between consecutive frames can self-label pixels as cell or background, enabling a self-training pipeline that adapts to the user’s own data across modalities and conditions, thereby improving accessibility, reproducibility, and performance relative to generalist supervised models.
Literature Review
The paper reviews recent efforts in supervised ML for bioimage analysis, highlighting the dominance of deep learning methods (e.g., U-Net architectures) that require extensive labeled datasets such as COCO, EVICAN, CellPose, and LIVEcell. Despite these efforts, supervised models often fail to generalize beyond training distributions due to the diversity of cell types, imaging modalities, and experimental conditions, leading to frequent retraining. The authors also discuss biases introduced during human annotation and pre-processing, transparency issues due to opaque model weights and overfitting risks, and community concerns around reproducibility and best practices in ML for life sciences. Self-supervised learning is posited as an alternative that can exploit inherent data structure for labeling, with motion in time-lapse imagery as a robust signal independent of modality or cell type.
Methodology
Core idea: Use optical flow (OF), specifically multi-resolution Farneback Displacement (FD), between consecutive frames (t−1, t) to self-label pixels for training a cell/background classifier, then apply the trained classifier to segment all pixels at time t. The pipeline runs recursively for each frame pair, continuously adapting to phenotype and background changes.
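In code terms, the recursion is just a loop over consecutive frames. The one-liner below is a minimal sketch under assumed names: movie is a list of 8-bit grayscale frames, and segment_frame_pair is a hypothetical per-pair routine sketched after the step list that follows.

    # Recursive application over a time-lapse movie: each consecutive frame
    # pair yields a fresh self-labeled training set and a segmentation of the
    # frame at time t. `segment_frame_pair` is sketched after the step list.
    masks = [segment_frame_pair(movie[t - 1], movie[t])
             for t in range(1, len(movie))]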
Steps (a code sketch of one full cycle follows this list):
1) Dynamic self-labeling with FD:
- Compute FD vector field between frames t−1 and t.
- Use FD magnitude to auto-label pixels: highest displacements → 'cell'; lowest displacements → 'background'; intermediate remain unlabeled.
- A liberal low FD threshold is used for background labeling; for cell labeling a high FD threshold selects high-confidence cell pixels.
- For very low-contrast data, iteratively reduce the FD threshold for 'cell' until the entropy feature distribution of cell pixels is separable from that of background.
2) Static feature extraction for supervised classification:
- From the single image at time t, compute static feature vectors (on local neighborhoods) for the self-labeled pixels: image gradient and entropy (modular to allow more features).
3) Classifier training and application:
- Train a Naïve Bayes classifier on the self-labeled pixels using the static features.
- Apply the classifier pixel-wise to the entire image at time t to produce a binary semantic segmentation (cell vs background).
- Repeat the self-label, train, and segment cycle for each subsequent frame pair to adapt over time.
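A minimal sketch of one cycle, assuming OpenCV's stock Farneback flow as a stand-in for the authors' multi-resolution FD; the percentile thresholds, entropy neighborhood, and flow parameters are illustrative placeholders rather than the published values, and the iterative threshold reduction for low-contrast data is omitted.

    # One self-label / train / segment cycle for the frame pair (t-1, t).
    # Assumptions: stock Farneback flow stands in for multi-resolution FD;
    # percentile cut-offs and the entropy neighborhood are illustrative.
    import cv2
    import numpy as np
    from skimage.filters import sobel
    from skimage.filters.rank import entropy
    from skimage.morphology import disk
    from sklearn.naive_bayes import GaussianNB

    def segment_frame_pair(prev_img, curr_img):
        """Return a boolean cell/background mask for curr_img (8-bit grayscale)."""
        # 1) Displacement field between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(
            prev_img, curr_img, None, pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.1, flags=0)
        mag = np.linalg.norm(flow, axis=2)

        # 2) Self-labels from FD magnitude: highest displacements -> cell,
        #    lowest -> background; intermediate pixels stay unlabeled.
        cell_seed = mag > np.percentile(mag, 99)  # high-confidence cell pixels
        bg_seed = mag < np.percentile(mag, 50)    # liberal background threshold

        # 3) Static features from the single image at time t: local entropy
        #    and image gradient (the paper's two modular features).
        feats = np.dstack([entropy(curr_img, disk(5)).astype(np.float32),
                           sobel(curr_img.astype(np.float32))])
        X = feats.reshape(-1, feats.shape[-1])

        # 4) Train Naive Bayes on the self-labeled pixels, then classify
        #    every pixel of the image at time t.
        X_train = np.vstack([X[cell_seed.ravel()], X[bg_seed.ravel()]])
        y_train = np.r_[np.ones(cell_seed.sum()), np.zeros(bg_seed.sum())]
        clf = GaussianNB().fit(X_train, y_train)
        return clf.predict(X).reshape(curr_img.shape).astype(bool)

Because the paper treats features as modular, adding a feature only means stacking another plane into feats; the self-labeling and classification stages are unchanged.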
Automations and post-processing (sketched in code after this list):
- Size filtering: A fast standalone FD pass (without self-tuning or model building) estimates average cell size and automatically excludes much smaller objects (debris), exploiting the fact that debris typically lacks motion.
- Hole filling: Because FD detects widespread intracellular motion, a fixed morphological blur (circular structuring element, radius 5 pixels) robustly fills holes (e.g., unlabeled nuclei in fluorescence or lamellipodia in phase contrast). Cell area was invariant across a range of radii tested.
- Optional FD visualization: FD vectors can be plotted to quantify intracellular flow dynamics.
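The sketch below covers both automations plus the optional flow plot, reading the paper's fixed 'morphological blur' as a radius-5 binary closing; the debris cut-off (half the mean object area) is an assumed stand-in for the authors' automated FD-based size estimate.

    # Post-processing: hole filling with a radius-5 circular element (as in
    # the paper) and size filtering with an assumed half-mean-area cut-off.
    import numpy as np
    import matplotlib.pyplot as plt
    from skimage.measure import label, regionprops
    from skimage.morphology import binary_closing, disk, remove_small_objects

    def postprocess(mask):
        """Fill intracellular holes, then drop objects much smaller than a cell."""
        mask = binary_closing(mask, disk(5))  # fixed radius-5 structuring element
        areas = [r.area for r in regionprops(label(mask))]
        if areas:  # debris cut-off: an illustrative assumption
            mask = remove_small_objects(mask, min_size=int(np.mean(areas) / 2))
        return mask

    def plot_flow(curr_img, flow, step=16):
        """Optional FD visualization: quiver plot of the displacement field."""
        ys, xs = np.mgrid[0:curr_img.shape[0]:step, 0:curr_img.shape[1]:step]
        plt.imshow(curr_img, cmap='gray')
        plt.quiver(xs, ys, flow[ys, xs, 0], flow[ys, xs, 1],
                   color='red', angles='xy')
        plt.show()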
Evaluation:
- Data: Multiple datasets spanning modalities (phase contrast, transmitted light, DIC, IRM, fluorescence), magnifications (10×–100×), and cell types (Hs27 fibroblasts, Dictyostelium, MDA-MB-231, A549). Ground truths were manual segmentations per dataset.
- Comparator: CellPose generalist supervised model (trained on ~70,000 manually annotated objects) with automatic or tuned size parameter as needed.
- Metric: Pixel-wise F1-score computed from TP, FP, and FN counts relative to ground truth (transcribed in code below).
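The metric translates directly into code; the snippet below is that transcription for boolean prediction and ground-truth masks.

    # Pixel-wise F1 against a manual ground truth: F1 = 2TP / (2TP + FP + FN).
    import numpy as np

    def pixel_f1(pred, truth):
        tp = np.sum(pred & truth)
        fp = np.sum(pred & ~truth)
        fn = np.sum(~pred & truth)
        return 2 * tp / (2 * tp + fp + fn)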
Computational aspects:
- Implemented and validated on laptops; ~7 s to process a pair of 1216×1920 8-bit images.
- Classifier selection: Naïve Bayes chosen over alternatives (random forest, SVM, kNN) for robustness and favorable bias-variance properties under the feature-independence assumption.
Additional experimental details:
- Cell culture and imaging conditions provided (media, substrates, microscopes, time increments), with modality-specific parameters listed in Supplementary materials.
- Reproducibility: The SSL method is inherently blinded and reproducible since it does not depend on curated datasets or user-tuned parameters.
Key Findings
- Across six validation datasets, the SSL approach achieved robust segmentation with F1-scores of ~0.670–0.964 and outperformed the CellPose generalist supervised model in four datasets; in the two high-magnification single-cell datasets, performance was statistically equivalent.
Per-dataset results (F1, number of objects #O; CellPose trained on ~70k labels, SSL on 0 labels):
1) 10× phase / Hs27 fibroblasts (multi-cell):
- CellPose: F1 = 0.480 (#O = 23)
- SSL: F1 = 0.670 (#O = 23)
2) 10× transmitted light / Dictyostelium (multi-cell):
- CellPose: F1 = 0.681 (#O = 63)
- SSL: F1 = 0.711 (#O = 63)
3) 10× phase / MDA-MB-231 (multi-cell):
- CellPose: F1 = 0.201 (#O = 7)
- SSL: F1 = 0.759 (#O = 7)
4) 20× DIC / MDA-MB-231 (multi-cell):
- CellPose: F1 = 0.682 (#O = 32)
- SSL: F1 = 0.886 (#O = 32)
5) 40× IRM / Hs27 (single-cell):
- CellPose: F1 = 0.959 (#O = 1)
- SSL: F1 = 0.964 (#O = 1)
6) 100× fluorescence (GFP-actin) / A549 (single-cell):
- CellPose: F1 = 0.963 (#O = 1)
- SSL: F1 = 0.957 (#O = 1)
- SSL provided complete automation: no manual labels (#L = 0, versus ~70,000 for CellPose), no parameter tuning, and automated size filtering and hole filling.
- Generality: Worked across five imaging modalities (phase, transmitted light, DIC, IRM, fluorescence), multiple cell types, magnifications, and phenotypes (rounded vs spread cells).
- Practicality: Ran efficiently on standard laptops (~7 s per 1216×1920 image pair).
- Additional utility: OF vectors quantified intracellular actin flow in fluorescence data.
Discussion
The findings support that leveraging motion via optical flow enables a self-supervised, fully automated segmentation pipeline that generalizes across diverse live-cell imaging conditions without curated labels or user-set parameters. By retraining on each frame pair, the model adapts to temporal changes in cell phenotype and background illumination, mitigating the brittleness seen in supervised models when applied out of distribution. This automation reduces bias from human annotation and pre-processing, improving reproducibility and accessibility for typical biology labs lacking extensive ML expertise or computational resources. Comparisons with CellPose, a strong supervised generalist model, demonstrate that large curated libraries do not guarantee robust transfer to new datasets, especially at lower magnifications and multi-cell scenes. The SSL approach sidesteps label bias and training data mismatch by learning directly from the data to be analyzed, achieving competitive or superior segmentation quality while also automating common preprocessing steps like size filtering and hole filling. The modular feature framework (currently entropy and gradient) allows future incorporation of additional features. Overall, the results align with calls for reproducible ML in life sciences by reducing dependencies on opaque training pipelines and extensive hyperparameter tuning.
Conclusion
This work introduces a fully automated self-supervised segmentation algorithm for live-cell time-lapse imagery that uses Farneback optical flow to self-label training data, then trains a Naïve Bayes classifier on static features to produce semantic segmentations. It generalizes across cell types and imaging modalities without curated labels or user parameters, outperforms a leading supervised generalist model (CellPose) in most tested multi-cell datasets, and matches performance in high-magnification single-cell cases, all while running efficiently on standard hardware. The approach reduces labor, biases, and variability inherent in supervised pre-processing, enhancing accessibility and reproducibility in bioimage analysis. Future work will extend the modular framework with additional features and add instance segmentation via de-clumping (e.g., watershed) to separate touching cells, as well as robust drift compensation modules if needed.
Limitations
- Applicable only to live-cell time-lapse imagery where motion provides supervision; not suitable for static images.
- Requires stable acquisition conditions; lateral stage drift or focus drift can confound the assumption of a stationary background. While modern microscopes are typically stable and software alignment (e.g., ImageJ) can mitigate drift, instability can degrade performance.
- Currently performs semantic segmentation only; touching cells are not separated. De-clumping/instance segmentation (e.g., watershed) is proposed as future work.
- Relies on adequate motion signals; extremely low-contrast or low-dynamics scenarios may require iterative FD threshold adjustment for sufficient training pixels.