This study introduces a novel modeling technique for real-time prediction of crash occurrence and crash severity. It addresses four challenges: data that are not independent and identically distributed (non-IID), large model sizes, missing data, and the trade-off between sensitivity and false alarm rate. A deployable framework is developed that uses real-time traffic and weather data and combines spatial ensemble modeling with local model regularization (weight decay, label smoothing, knowledge distillation) and post-calibration. The framework predicts crashes and their severity levels (fatal, severe, minor, and property damage only (PDO)) with high sensitivity and low false alarm rates. Deployment strategies and sustainability aspects are also discussed.
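To make the regularization terms and evaluation metrics named above concrete, the following is a minimal sketch in plain Python, not the authors' implementation. All function names, the smoothing factor `eps`, the mixing weight `alpha`, and the `temperature` are illustrative assumptions; the distillation term uses the common formulation of cross-entropy against temperature-softened teacher probabilities.

```python
import math

def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: blend a one-hot target with a uniform distribution.
    eps is an assumed value, not one reported by the study."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def distillation_loss(student_logits, teacher_logits, one_hot,
                      alpha=0.5, temperature=2.0, eps=0.1):
    """Weighted sum of a label-smoothed hard loss and a soft loss against
    temperature-softened teacher probabilities (alpha, temperature assumed)."""
    hard = cross_entropy(smooth_labels(one_hot, eps), softmax(student_logits))
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # The temperature**2 factor keeps the soft-loss gradient scale comparable.
    return (1.0 - alpha) * hard + alpha * temperature ** 2 * soft

def sensitivity_and_far(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); false alarm rate = FP / (FP + TN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return sensitivity, false_alarm_rate
```

In this sketch, a four-class target such as the severity levels would be passed as a one-hot list, and `sensitivity_and_far` would be applied to binary crash versus no-crash labels. Raising the decision threshold of a classifier typically lowers both quantities at once, which is the trade-off the framework aims to manage.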