Computer Science

M+: Extending MemoryLLM with Scalable Long-Term Memory

Y. Wang, D. Krotov, et al.

Large language models often struggle to retain information from the distant past of their context. MemoryLLM compresses past context into a 1B-parameter latent memory, but it struggles to retain information beyond roughly 20k tokens. This paper presents M+, which augments MemoryLLM with a long-term memory and a co-trained retriever that dynamically fetches relevant information during generation, extending knowledge retention from under 20k to over 160k tokens with similar GPU memory overhead. Research conducted by Yu Wang, Dmitry Krotov, Yuanzhe Hu, Yifan Gao, Wangchunshu Zhou, Julian McAuley, Dan Gutfreund, Rogerio Feris, and Zexue He.
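The summary above does not spell out how the long-term memory and retriever interact, so the following is only a toy sketch of the general idea, not the paper's implementation. It assumes that entries evicted from a bounded short-term pool are moved to a long-term store rather than discarded, and it substitutes a fixed cosine-similarity lookup for the co-trained retriever. The class `ToyMemory`, its method names, and the two-dimensional key vectors are all hypothetical.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Hypothetical two-tier memory: a bounded short-term pool plus a long-term store."""

    def __init__(self, short_capacity):
        self.short_capacity = short_capacity
        self.short_term = deque()  # most recent (key, payload) entries
        self.long_term = []        # entries evicted from the short-term pool

    def write(self, key, payload):
        """Add an entry; the oldest short-term entries overflow into long-term memory."""
        self.short_term.append((key, payload))
        while len(self.short_term) > self.short_capacity:
            # Assumption: evicted entries are kept in long-term memory, not dropped.
            self.long_term.append(self.short_term.popleft())

    def retrieve(self, query, k):
        """Return the k long-term payloads whose keys are most similar to the query.

        Stand-in for a learned retriever: plain cosine similarity, not a trained model.
        """
        ranked = sorted(self.long_term, key=lambda e: cosine(query, e[0]), reverse=True)
        return [payload for _, payload in ranked[:k]]


mem = ToyMemory(short_capacity=2)
mem.write([1.0, 0.0], "fact A")
mem.write([0.0, 1.0], "fact B")
mem.write([0.9, 0.1], "fact C")  # pushes "fact A" into long-term memory
mem.write([0.1, 0.9], "fact D")  # pushes "fact B" into long-term memory

print(mem.retrieve([1.0, 0.0], k=1))  # ['fact A']
```

In this toy version, the cost of a lookup grows with the size of the long-term store; a real system would need a retriever that scales, which is presumably part of what co-training addresses.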

