Introduction
The accurate and timely diagnosis of metastasis in sentinel lymph node (SLN) biopsies during breast cancer surgery is critical for guiding treatment decisions. Frozen section analysis is commonly used for its speed, but the quality of frozen sections is inferior to formalin-fixed paraffin-embedded (FFPE) sections, making accurate diagnosis challenging for pathologists, particularly under time pressure. Convolutional neural networks (CNNs), a powerful tool in computer vision, show promise in improving diagnostic accuracy. However, training robust CNN models requires large, high-quality labeled datasets, which are difficult and expensive to obtain in the context of frozen section pathology. This research addresses this challenge by exploring the use of transfer learning. Transfer learning leverages pre-trained models from large public datasets to improve the performance of models trained on smaller, domain-specific datasets. This approach can significantly reduce the amount of labeled data needed for training, making it more feasible for applications like frozen section analysis where data acquisition is resource-intensive. The study aims to assess whether transfer learning from a publicly available dataset of FFPE tissue (CAMELYON16) can enhance the performance of a CNN model trained on a smaller dataset of frozen sections for classifying the presence or absence of tumor metastases. The success of this approach could significantly improve the speed and accuracy of intraoperative diagnosis.
Literature Review
Deep learning, particularly CNNs, has shown remarkable success in various computer vision tasks, surpassing traditional algorithms. Supervised learning methods, which require large labeled datasets, generally perform well given sufficient data. However, obtaining and labeling large datasets is often challenging due to cost and expertise limitations. Transfer learning emerges as a powerful solution. It utilizes pre-trained models from other domains, such as ImageNet, a large-scale dataset of natural images, or CAMELYON16, a dataset specifically for tumor classification in digital pathology using FFPE tissue. These pre-trained models provide a foundation of learned features which can accelerate training and improve performance on target datasets. Previous studies have applied deep learning to pathology tasks such as mitosis detection and breast cancer metastasis detection in FFPE samples, frequently leveraging pre-trained models from ImageNet. This study uniquely explores the potential of transfer learning from CAMELYON16, a dataset more closely related to the task of tumor classification in pathology slides, to a frozen section dataset, addressing the limitations of existing approaches.
Methodology
This retrospective study used two datasets: 297 whole slide images (WSIs) of SLNs from Asan Medical Center (AMC) and 228 WSIs from Seoul National University Bundang Hospital (SNUBH), the latter used for external validation. The AMC dataset was split into training (157 WSIs), validation (40 WSIs), and test (100 WSIs) sets; the SNUBH dataset served as an independent test set. Three CNN models were trained: one from scratch (random initialization), one initialized with weights pre-trained on ImageNet, and one initialized with weights pre-trained on CAMELYON16. The AMC training set was further subsampled at different ratios (2%, 4%, 8%, 20%, 40%, and 100%) to investigate the impact of dataset size on model performance. The Inception v3 architecture was used for all models, trained with identical hyperparameters. Patch extraction applied Otsu thresholding to the hue and saturation (H and S) color channels to identify tissue regions, from which 448×448 patches were sampled. Patches were labeled tumor or non-tumor based on the percentage of tumor area they contained. Stain normalization was applied to reduce color variation arising from differences in staining and scanning conditions. Slide-level classification used confidence maps generated by interpolating patch-level tumor confidence scores over the entire WSI; the maximum confidence score was used to predict the presence or absence of tumor at the slide level. Performance metrics included sensitivity, specificity, accuracy, and area under the ROC curve (AUC), with AUCs compared statistically using the Hanley & McNeil method. Grad-CAM was used to visualize the model's attention.
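The paper does not publish its training code; the following is a minimal sketch, assuming a TensorFlow/Keras implementation, of the three weight-initialization strategies compared (scratch, ImageNet, and CAMELYON16) and of the maximum-confidence rule used for slide-level prediction. The checkpoint path camelyon16_inception_v3.h5 is a hypothetical placeholder for CAMELYON16-pretrained weights, not an artifact from the study.

```python
# Minimal sketch of the three initialization strategies and slide-level scoring.
# Assumes TensorFlow/Keras; names and the CAMELYON16 checkpoint path are illustrative.
import numpy as np
import tensorflow as tf

def build_patch_classifier(init: str = "scratch") -> tf.keras.Model:
    """Inception v3 binary patch classifier with one of three initializations:
    'scratch' (random), 'imagenet', or 'camelyon16' (hypothetical checkpoint)."""
    weights = "imagenet" if init == "imagenet" else None
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights,
        input_shape=(448, 448, 3), pooling="avg")
    out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    model = tf.keras.Model(base.input, out)
    if init == "camelyon16":
        # Hypothetical weights from a model previously trained on CAMELYON16 patches.
        model.load_weights("camelyon16_inception_v3.h5")
    model.compile(
        # Learning rate chosen within the 5e-4 to 5e-5 range reported in the study.
        optimizer=tf.keras.optimizers.Adam(5e-4),
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC()])
    return model

def slide_level_score(patch_confidences: np.ndarray) -> float:
    """Slide-level tumor score = maximum patch-level tumor confidence."""
    return float(np.max(patch_confidences))
```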
Key Findings
The results demonstrate that transfer learning significantly improved model performance. Specifically, CAMELYON16-based models consistently outperformed scratch-based models across all training dataset ratios, achieving significantly higher AUCs for both patch-level (ranging from 0.843 to 0.944) and slide-level (ranging from 0.814 to 0.886) classifications in the AMC dataset. Even with limited training data (up to 40% of the training set), the CAMELYON16-based models showed higher AUCs than ImageNet-based models for patch-level classification. When the entire training set was used, both ImageNet- and CAMELYON16-based models exhibited comparable performance. In external validation using the SNUBH dataset, CAMELYON16-based models showed higher AUCs than ImageNet-based and scratch-based models, indicating good generalization ability. The analysis of Grad-CAM visualizations showed that the CAMELYON16-based model, even when trained with only 4% of the training dataset, exhibited significantly higher confidence in correctly identifying tumor patches compared to the other models. Experiments with varying learning rates confirmed that a learning rate in the range of 5e-4 to 5e-5 yielded the best results for training all model types.
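The paper reports Grad-CAM visualizations but not the code used to produce them. Below is a minimal Grad-CAM sketch under the same TensorFlow/Keras assumption as above, using the final mixed convolutional block ("mixed10") of Keras' Inception v3 as the target layer; the function name and layer choice are illustrative, not taken from the study.

```python
# Minimal Grad-CAM sketch for a sigmoid tumor classifier (illustrative only).
import numpy as np
import tensorflow as tf

def grad_cam(model: tf.keras.Model, patch: np.ndarray,
             conv_layer: str = "mixed10") -> np.ndarray:
    """Return a [0, 1] heat map of the regions driving the tumor score."""
    grad_model = tf.keras.Model(
        model.input,
        [model.get_layer(conv_layer).output, model.output])
    x = tf.convert_to_tensor(patch[np.newaxis, ...], dtype=tf.float32)
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(x)
        tumor_score = preds[:, 0]                  # sigmoid tumor confidence
    grads = tape.gradient(tumor_score, conv_out)   # d(score) / d(feature map)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # global-average-pool the gradients
    cam = tf.reduce_sum(weights[:, tf.newaxis, tf.newaxis, :] * conv_out, axis=-1)
    cam = tf.nn.relu(cam)[0]                       # keep only positive evidence
    cam = cam / (tf.reduce_max(cam) + 1e-8)        # normalize to [0, 1]
    return cam.numpy()
```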
Discussion
This study's findings strongly support the use of transfer learning to address the challenge of limited datasets in training CNNs for frozen section analysis. The superior performance of CAMELYON16-based models compared to ImageNet-based models, especially when using smaller training datasets, highlights the benefit of leveraging a pre-trained model from a closely related domain. The domain-specificity of CAMELYON16 (FFPE tissue) made it more suitable as a transfer learning source for frozen sections compared to the general-purpose ImageNet dataset. The consistent improvement in AUC across various training set sizes suggests that transfer learning can effectively mitigate the limitations of small datasets. The successful external validation further reinforces the robustness and generalizability of the approach. The findings offer a promising avenue to improve the efficiency and accuracy of intraoperative tumor diagnosis.
Conclusion
This study demonstrates that transfer learning from the CAMELYON16 dataset substantially improves the performance of CNN models for tumor classification in frozen section SLN biopsies, particularly when training data is limited. The use of CAMELYON16 as a pre-trained model source offers a significant advantage over ImageNet or scratch-based methods, especially with smaller datasets. Further research could explore alternative transfer learning strategies, refine slide-level classification methods, and investigate the impact of different patch selection criteria on model performance. The development of more robust and efficient diagnostic tools holds the potential to transform intraoperative decision-making in breast cancer surgery.
Limitations
The study has some limitations. The criterion used to select tumor patches (tumor area > 80%) may have biased the results. Additionally, using the maximum confidence score for slide-level classification is a relatively simple approach, and more sophisticated aggregation methods might improve accuracy. The difference in resolution (microns per pixel, MPP) between the AMC and SNUBH datasets could also have influenced the results, although resizing augmentation was used to mitigate this.
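As an illustration of the resolution-matching idea behind that mitigation (not the authors' code), the sketch below rescales a patch so that its effective MPP matches a target value; OpenCV is assumed.

```python
# Hypothetical helper: rescale a patch so its physical resolution matches target_mpp.
import cv2
import numpy as np

def match_mpp(patch: np.ndarray, source_mpp: float, target_mpp: float) -> np.ndarray:
    """Resize a patch scanned at source_mpp so it appears as if scanned at target_mpp."""
    scale = source_mpp / target_mpp   # e.g. 0.50 um/px -> 0.25 um/px doubles the size
    new_size = (round(patch.shape[1] * scale), round(patch.shape[0] * scale))
    return cv2.resize(patch, new_size, interpolation=cv2.INTER_LINEAR)
```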