Introduction
Breast cancer treatment increasingly utilizes neoadjuvant therapy (systemic therapy before surgery) to improve outcomes. However, response varies significantly, and current prediction methods are limited. Existing predictors often rely on single-platform profiling and fail to capture the complexity of the tumor ecosystem, which is increasingly recognized as a major determinant of treatment response. This study aimed to develop improved predictive models by integrating multi-omic data from the tumor ecosystem, encompassing malignant cells and their microenvironment (stromal, vascular, and immune cells). The hypothesis was that considering the tumor as a complex ecosystem would lead to more accurate prediction models for response to neoadjuvant therapy.
Literature Review
Previous studies have explored predictors of response to neoadjuvant therapy using clinical, molecular, and digital pathology data. However, these studies often suffered from limitations such as small sample sizes, heterogeneous treatment regimens, and the use of single-platform profiling. These limitations hindered the development of robust and widely applicable predictive models. The current study aimed to overcome these limitations by using a comprehensive multi-omic approach and a larger, more homogeneous patient cohort.
Methodology
This prospective, multi-center study enrolled 180 women with locally advanced breast cancer undergoing neoadjuvant therapy. Paired pre-treatment and post-surgery biopsies were collected from 168 patients. Comprehensive profiling comprised digital pathology analysis, genomic profiling (shallow whole-genome sequencing and whole-exome sequencing), transcriptomic profiling (RNA sequencing), and immune cell quantification from histology. Response to therapy was assessed after surgery using the residual cancer burden (RCB) classification. A machine learning framework integrated the multi-omic features to predict pathological complete response (pCR) and residual disease. The framework combined feature selection, dimensionality reduction, and an ensemble of three algorithms: logistic regression with elastic net regularization, a support vector machine, and a random forest. Model performance was evaluated using five-fold cross-validation and then validated on an independent external cohort of 75 patients from the ARTemis clinical trial and the PRECISION programme.
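The modelling steps described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes scikit-learn, uses synthetic data in place of patient features, and the specific choices (univariate feature selection, PCA, soft voting, and all hyperparameters) are placeholders picked only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a patients-by-features matrix (NOT study data).
# y = 1 plays the role of pCR, y = 0 the role of residual disease.
X, y = make_classification(
    n_samples=160, n_features=30, n_informative=8,
    weights=[0.75, 0.25], random_state=0,
)


def with_preprocessing(model):
    """Wrap a classifier with scaling, feature selection and dimensionality
    reduction. Keeping these steps inside the pipeline means they are refit
    on the training part of every fold. All settings are assumptions."""
    return make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=15),
        PCA(n_components=5, random_state=0),
        model,
    )


# The three algorithm families named in the text (hyperparameters assumed).
elastic_net = with_preprocessing(
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                       C=1.0, max_iter=5000, random_state=0)
)
svm = with_preprocessing(SVC(kernel="rbf", probability=True, random_state=0))
forest = with_preprocessing(
    RandomForestClassifier(n_estimators=200, random_state=0)
)

# Soft voting averages the predicted probabilities of the three models;
# this combination rule is an assumption of the sketch.
ensemble = VotingClassifier(
    estimators=[("enet", elastic_net), ("svm", svm), ("rf", forest)],
    voting="soft",
)

# Five-fold cross-validation, scored by area under the ROC curve.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_per_fold = cross_val_score(ensemble, X, y, cv=cv, scoring="roc_auc")
mean_auc = float(auc_per_fold.mean())
print("AUC per fold:", auc_per_fold.round(3), "mean:", round(mean_auc, 3))
```

Placing the preprocessing inside each pipeline keeps held-out folds from influencing feature selection, which is one common way to avoid optimistic cross-validation estimates. External validation would then amount to fitting the ensemble on the full training cohort and scoring a separate cohort once.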
Key Findings
The study found several key associations between pre-treatment features and treatment response. Clinical features such as tumor grade and estrogen receptor (ER) status were associated with pCR, although response remained heterogeneous within these clinical groups. Genomic features, including TP53 mutations, tumor mutation burden, and copy number alterations, were also associated with response: tumors achieving pCR showed higher mutation burdens and greater chromosomal instability. Analysis of mutational signatures revealed that tumors with pCR had a greater contribution from non-KLOS signatures and higher HRD scores. Transcriptomic analysis identified genes associated with proliferation and immune activation as key determinants of response. Specifically, higher lymphocytic density and immune cytolytic activity were strongly associated with pCR, whereas features of T cell dysfunction and immune evasion were associated with poor response. The machine learning models that integrated clinical, genomic, transcriptomic, and digital pathology data significantly outperformed models based solely on clinical variables. The fully integrated model achieved an AUC of 0.87 in the external validation cohort, supporting its robustness and predictive power. The model identified several key features as significant predictors of response, including age, lymphocyte density, and expression of PGR, ESR1, and ERBB2.
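For readers less familiar with the metric, the AUC reported above can be read as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The short sketch below computes it directly from that definition. The labels and scores are made-up illustrative values, not data from the study, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. pCR), 0 for the negative class.
    scores: predicted scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 4 responders and 4 non-responders.
labels = [1, 1, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(auc)  # 11 of the 16 responder/non-responder pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.87 means roughly 87% of responder/non-responder pairs are ordered correctly by the model.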
Discussion
The findings of this study demonstrate the importance of considering the tumor ecosystem as a whole when predicting treatment response in breast cancer. Integrating multi-omic data and employing machine learning significantly improved the accuracy of response prediction compared to relying solely on clinical variables. The identification of specific genomic, transcriptomic, and immune features associated with response provides valuable insights into the mechanisms driving treatment success or failure. The observed independence of response from proliferation in HER2+ tumors treated with both chemotherapy and HER2-targeted therapy warrants further investigation. The high accuracy of the predictive model in the external validation cohort suggests its potential clinical utility in guiding treatment decisions and selecting patients for novel therapies.
Conclusion
This study developed and validated a multi-omic machine learning model for predicting response to neoadjuvant therapy in breast cancer. The model's superior performance compared to existing clinical predictors underscores the value of integrating multi-omic data. Future research should focus on refining the model, testing it in different breast cancer subtypes and treatment settings, and investigating the biological mechanisms underlying the identified predictors in order to improve therapeutic strategies.
Limitations
While the study demonstrated excellent performance in an external validation cohort, some limitations exist. The sample size, although larger than many previous studies, could be further increased to enhance generalizability. The study was primarily focused on patients receiving a specific regimen of neoadjuvant therapy, and the model’s performance may differ with other treatment protocols. Further research is required to determine the clinical utility and cost-effectiveness of implementing the model in routine practice.