Computer Science
M+: Extending MemoryLLM with Scalable Long-Term Memory
Y. Wang, D. Krotov, et al.
Large language models often lose information from the distant past. MemoryLLM compresses past context into a latent memory pool of 1B parameters, but it struggles to retain knowledge beyond roughly 20k tokens. This paper presents M+, which augments MemoryLLM with a long-term memory mechanism and a co-trained retriever that dynamically fetches relevant information during generation, extending knowledge retention from under 20k to over 160k tokens with similar GPU memory overhead. Research conducted by Yu Wang, Dmitry Krotov, Yuanzhe Hu, Yifan Gao, Wangchunshu Zhou, Julian McAuley, Dan Gutfreund, Rogerio Feris, and Zexue He.
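The general idea of pairing a bounded memory with a retrievable long-term store can be illustrated with a toy sketch. This is not the authors' implementation: the class `TwoTierMemory`, its methods, the first-in-first-out eviction rule, and the use of plain cosine similarity in place of a co-trained retriever are all assumptions made for illustration only.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TwoTierMemory:
    """Hypothetical two-tier memory: a bounded short-term pool plus an unbounded long-term store."""

    def __init__(self, short_capacity):
        self.short_capacity = short_capacity
        self.short = deque()   # bounded pool, always available to the model
        self.long = []         # entries evicted from the pool, fetched on demand

    def write(self, vector):
        # Assumption: the oldest entries are evicted first and moved to long-term storage.
        self.short.append(vector)
        while len(self.short) > self.short_capacity:
            self.long.append(self.short.popleft())

    def retrieve(self, query, k):
        # Assumption: fixed cosine similarity stands in for a learned retriever.
        ranked = sorted(self.long, key=lambda v: cosine(query, v), reverse=True)
        return ranked[:k]

    def context(self, query, k):
        # Memory visible at generation time: top-k retrieved entries plus the short-term pool.
        return self.retrieve(query, k) + list(self.short)


memory = TwoTierMemory(short_capacity=2)
for vec in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]):
    memory.write(vec)

retrieved = memory.retrieve([0.9, 0.1], k=1)
visible = memory.context([0.9, 0.1], k=1)
```

In this sketch the short-term pool stays at a fixed size no matter how much is written, and only the `k` retrieved entries are added at generation time, which is one way a design of this kind could keep accelerator memory roughly constant while the long-term store grows.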

