Predicting the ground state electron density using Kohn-Sham Density Functional Theory (KS-DFT) simulations is crucial for understanding material properties. However, KS-DFT's cubic scaling with system size hinders data generation for machine learning (ML) models. This paper addresses this by using transfer learning to leverage multi-scale training data and comprehensive configuration sampling via thermalization. Bayesian neural networks enable uncertainty quantification. The resulting models are less reliant on heuristics, offering confident predictions for diverse bulk systems, including those with defects, at multi-million-atom scales, using modest computational resources.