Finding the ground states of spin glasses is a central challenge, both for understanding disordered magnets and for solving hard combinatorial optimization problems. This paper introduces DIRAC, a deep reinforcement learning framework that is trained on small-scale spin glass instances and can then be applied to arbitrarily large ones. DIRAC scales better and is more accurate than existing methods such as simulated annealing and parallel tempering, and it can also enhance the performance of those methods through a gauge transformation technique that bridges physics and artificial intelligence. The framework advances the understanding of low-temperature spin glass phases and offers a promising approach to a range of hard combinatorial optimization problems.
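
To make the gauge transformation concrete, the following is a minimal sketch of the standard gauge symmetry of an Ising-type spin glass, not the authors' implementation. It assumes the usual Hamiltonian H = -Σ J_ij s_i s_j with spins s_i ∈ {-1, +1}. The toy ring instance and the helper names (`energy`, `gauge_transform`, `random_instance`) are hypothetical and chosen only for illustration. Flipping a subset of spins while flipping the sign of every coupling that touches exactly one flipped spin leaves the energy unchanged, so any configuration can be mapped to the all-up configuration of an equivalent instance.

```python
import random

def energy(couplings, spins):
    """Ising spin glass energy H = -sum over bonds of J_ij * s_i * s_j."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())

def gauge_transform(couplings, spins, gauge):
    """Apply s_i -> t_i * s_i and J_ij -> t_i * J_ij * t_j, with t_i in {-1, +1}."""
    new_spins = [t * s for t, s in zip(gauge, spins)]
    new_couplings = {
        (i, j): gauge[i] * J * gauge[j] for (i, j), J in couplings.items()
    }
    return new_couplings, new_spins

def random_instance(n, rng):
    """Hypothetical toy instance: a ring of n spins with Gaussian couplings."""
    couplings = {(i, (i + 1) % n): rng.gauss(0.0, 1.0) for i in range(n)}
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    return couplings, spins

rng = random.Random(0)
couplings, spins = random_instance(8, rng)

# Choosing the gauge t_i = s_i maps the current configuration to all spins up,
# because s_i * s_i = 1 for every site.
gauged_couplings, gauged_spins = gauge_transform(couplings, spins, gauge=spins)

e_before = energy(couplings, spins)
e_after = energy(gauged_couplings, gauged_spins)
print(gauged_spins)
print(abs(e_before - e_after) < 1e-12)
```

The invariance follows from t_i² = 1: each bond term becomes (t_i J_ij t_j)(t_i s_i)(t_j s_j) = J_ij s_i s_j. How DIRAC uses this symmetry inside its reinforcement learning loop is not specified here, so the sketch stops at the symmetry itself.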