Introduction
Morphological profiling, which combines optical microscopy with machine vision, has proven successful in high-throughput phenotyping. A key question is how much information images contain for identifying genetic differences. While fluorescent microscopy is widely used, the potential of label-free bright-field (BF) microscopy remains underexplored. BF microscopy offers several advantages: cells are imaged in their natural state without staining, a single image carries information from multiple organelles, and there are fewer artifacts than with other techniques. Previous studies demonstrated that BF image-based profiling can discriminate between different cell lines and between infected and non-infected cells. This study investigates whether BF images can also distinguish the effects of single-gene perturbations, focusing on the ubiquitin-proteasome system. CRISPR-Cas9 was used to generate mutants of eight non-lethal genes with functional redundancy, so that the knockouts were expected to cause only minimal morphological effects. The goal was to determine whether machine learning could identify these mutants from BF images.
Literature Review
Several studies have explored the use of machine learning and computer vision in high-content screening for phenotypic profiling (Grys et al., 2017; Boutros et al., 2015; Mattiazzi et al., 2016; Fetz et al., 2016). Morphological profiling using fluorescent microscopy targeting specific organelles has been extensively explored (Kraus et al., 2017; Fenistein et al., 2008; Arora et al., 2011), demonstrating its effectiveness in various applications (Caicedo et al., 2016; Bougen-Zhukov et al., 2017). However, the use of label-free BF microscopy for this purpose is less common. Previous research has shown that BF image-based profiling can discriminate cell lines (Meng et al., 2019) and infected/non-infected macrophages (Adiga et al., 2012), suggesting its potential. Studies have also shown that BF images contain information about organelle localization and morphology (Ali et al., 2012; Rychtarikova & Stys, 2017; Ounkomol et al., 2018). However, the ability of BF imaging to distinguish single-gene mutants remained to be tested.
Methodology
The study used CRISPR-Cas9 to generate single-gene knockout mutants of eight genes in the ubiquitin-proteasome system in HEK293Ta cells. These genes were chosen for their functional redundancy, so that the cells remained viable and showed minimal overt morphological changes. High-throughput imaging on an IN Cell Analyzer 6000 acquired BF and fluorescent images (Hoechst 33342 nuclear staining) of more than 670 cells for each mutant and for the wild type. A nuclei detector based on Faster R-CNN with a ResNet-101 backbone identified individual cells in the fluorescent images, and the corresponding cellular regions were cropped from the BF images. A total of 296 texture features were extracted from each crop using the LPX296 feature extractor. The data were preprocessed by standardizing the features and removing outliers. Logistic regression with L1 regularization was the primary machine learning model for discriminating each mutant from the wild type, and model performance was evaluated as the AUC under tenfold cross-validation. Additional models (SVM, random forest, k-nearest neighbor), with and without PCA, were tested for comparison. The effects of different image preprocessing methods (blur, edge enhancement, sharpen) were also assessed. Hierarchical clustering of feature profiles (the regression coefficients) was used to analyze relationships between mutants.
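The classification step described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the study's code: scikit-learn is an assumed tool, the random feature matrix stands in for the LPX296 output, the shift applied to ten features is an arbitrary stand-in for a mutant effect, and the regularization strength `C=0.1` is an assumption, since the summary does not report it. Outlier removal is omitted because the method used is not specified.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_cells, n_features = 670, 296  # cell count and feature count taken from the text

# Synthetic stand-in for texture features (NOT real data).
wild_type = rng.normal(size=(n_cells, n_features))
mutant = rng.normal(size=(n_cells, n_features))
mutant[:, :10] += 0.4  # hypothetical subtle shift in ten features

X = np.vstack([wild_type, mutant])
y = np.concatenate([np.zeros(n_cells, dtype=int), np.ones(n_cells, dtype=int)])

# Standardize features, then fit L1-regularized logistic regression.
# Putting the scaler inside the pipeline means it is refit on each
# training fold, so no information leaks from the held-out fold.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),  # C is assumed
)

# Tenfold cross-validation scored by the area under the ROC curve.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
mean_auc = float(auc_scores.mean())
print(f"mean AUC over {len(auc_scores)} folds: {mean_auc:.3f}")
```

The L1 penalty drives many coefficients to exactly zero, which is one plausible reason to choose it when the coefficients are later interpreted as a feature profile.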
Key Findings
The logistic regression model discriminated single-gene mutant cells from wild-type cells with a mean AUC of 0.773 across all mutants; AUC values for individual mutants ranged from 0.59 to 0.87. Analysis of the features contributing to discrimination showed that they relate to the morphology of intracellular structures, such as clumps, which likely represent organelles. Mutants of functionally close genes (paralogs, or genes whose proteins interact in the proteasome complex) showed similar feature profiles. For example, the paralog pairs PSME1/PSME2 and UBQLN1/UBQLN2 and the non-paralog pair PSMB5/PSMA7 each clustered together based on their feature profiles. This suggests that the morphological changes captured by BF imaging reflect functional and/or physical relationships between genes. The study also evaluated the impact of training data size. Additional experiments confirmed that the model applies to independent datasets, demonstrating reproducibility. Finally, in a comparison with other machine learning models, logistic regression performed better.
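To illustrate how feature profiles can be grouped, here is a small sketch of hierarchical clustering applied to coefficient vectors. It assumes SciPy, average linkage, and correlation distance, none of which the summary specifies. The profiles are synthetic and the names `mutant_A` to `mutant_D` are hypothetical placeholders, not the genes in the study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
n_features = 296

# Two hypothetical shared profiles, standing in for two functional groups.
base_1 = rng.normal(size=n_features)
base_2 = rng.normal(size=n_features)

# Each row mimics one mutant's regression-coefficient vector (synthetic).
names = ["mutant_A", "mutant_B", "mutant_C", "mutant_D"]
profiles = np.vstack([
    base_1 + 0.3 * rng.normal(size=n_features),
    base_1 + 0.3 * rng.normal(size=n_features),
    base_2 + 0.3 * rng.normal(size=n_features),
    base_2 + 0.3 * rng.normal(size=n_features),
])

# Agglomerative clustering; linkage method and metric are assumptions.
tree = linkage(profiles, method="average", metric="correlation")

# Cut the tree into two flat clusters and map each mutant to its cluster id.
labels = fcluster(tree, t=2, criterion="maxclust")
clusters = dict(zip(names, labels.tolist()))
print(clusters)
```

In this toy setup, mutants built from the same base profile land in the same cluster, which mirrors the kind of pairing the study reports for functionally close genes.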
Discussion
This study demonstrated that BF microscopy coupled with machine learning can discriminate single-gene mutants from wild-type cells, which shows the potential for identifying genetically modified cells without labeling. This has implications for high-throughput screening in biomedical research. The features most important for discrimination relate to the morphology of intracellular structures, suggesting that the approach captures subtle cellular changes resulting from single-gene mutations. The observation that functionally related genes show similar mutant profiles strengthens the biological relevance of the findings. Because the approach relies on a relatively simple machine learning model (logistic regression) and a moderate dataset size, it is readily accessible to researchers.
Conclusion
This study shows that single-gene mutant cells can be effectively discriminated from wild-type cells using BF images and machine learning. The method is label-free, cost-effective, and potentially high-throughput. Future work could validate the findings with different cell types and genetic perturbations, and explore other label-free imaging modalities such as DIC and phase contrast. Deep learning approaches could also be investigated as a way to improve the accuracy and efficiency of genotype classification.
Limitations
The study acknowledges the possibility of off-target mutations introduced by CRISPR-Cas9, although pooling datasets from multiple clones mitigated this risk. Whole-genome sequencing would be needed for a comprehensive off-target mutation analysis. The analysis was limited to genes in the ubiquitin-proteasome system; extending the study to other cellular pathways would further test the generality of the approach. Finally, the number of extracted features could have been reduced with additional feature selection techniques.