This paper introduces StratoMod, an interpretable machine-learning classifier that predicts germline variant calling errors from genomic context. Using genomic context features, StratoMod accurately predicts recall for HiFi or Illumina sequencing data, letting researchers quantify how factors such as homopolymers and difficult-to-map regions affect variant calling accuracy. The model also predicts which clinically relevant variants are likely to be missed, a significant improvement over existing pipelines, which only filter out variants likely to be false positives. StratoMod thereby supports precise risk-reward analyses when designing variant calling pipelines.
Publisher
Communications Biology
Published On
Oct 13, 2024
Authors
Nathan Dwarshuis, Peter Tonner, Nathan D. Olson, Fritz J. Sedlazeck, Justin Wagner, Justin M. Zook
Tags
machine learning
germline variant calling
predictive modeling
genomic context
clinically relevant variants
sequencing data