Abstract
Accurately identifying somatic mutations is crucial for precision oncology and for calculating tumor mutational burden (TMB), a key predictor of immunotherapy response. Tumor-only variant calling, which lacks matched normal tissue, struggles to distinguish somatic from germline variants, and this biases TMB estimates. This study applies machine learning (TabNet, XGBoost, LightGBM) to classify variants as somatic or germline, using features available from tumor-only data and labels derived from matched-normal data. All three models achieved state-of-the-art performance (AUC > 94% on TCGA data, > 85% on melanoma data). Concordance between matched-normal and tumor-only TMB improved significantly (R² from 0.006 to 0.71-0.76), with LightGBM performing best. Importantly, XGBoost and LightGBM eliminated the racial bias in tumor-only TMB estimates observed in previous studies.
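To make the TMB concordance figure concrete, here is a minimal Python sketch of how such a comparison could be computed. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the 0.5 probability threshold, and the use of squared Pearson correlation for R² are all hypothetical choices, and the study may define each of these differently. TMB is taken in its common form, somatic mutations per megabase of covered sequence.

```python
def count_predicted_somatic(probabilities, threshold=0.5):
    """Count variants whose predicted somatic probability meets the threshold.

    `probabilities` stands in for the per-variant output of any classifier
    (for example a gradient-boosted model); the threshold is an assumption.
    """
    return sum(1 for p in probabilities if p >= threshold)


def tmb(somatic_count, covered_megabases):
    """Tumor mutational burden as somatic mutations per megabase."""
    if covered_megabases <= 0:
        raise ValueError("covered_megabases must be positive")
    return somatic_count / covered_megabases


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: concordance is summarized this way; other definitions of
    R^2 (for example 1 - SS_res / SS_tot) can give different values.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when a sequence is constant")
    return (sxy * sxy) / (sxx * syy)
```

In use, one would compute a tumor-only TMB per sample from the variants a classifier predicts as somatic, compute a matched-normal TMB per sample from the reference somatic calls, and pass the two lists to `r_squared` to summarize their agreement across samples.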
Publisher
npj Precision Oncology
Published On
Jan 07, 2023
Authors
R. Tyler McLaughlin, Maansi Asthana, Marc Di Meo, Michele Ceccarelli, Howard J. Jacob, David L. Masica
Tags
somatic mutations
tumor mutational burden
machine learning
XGBoost
LightGBM
precision oncology
racial bias