HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing
Zifan He, Yingqi Cao, Zongyue Qin, Neha Prakriya, Yizhou Sun, and Jason Cong
Hierarchical Memory Transformer (HMT) imitates the hierarchy of human memory to improve long-context processing: it preserves tokens from early segments, passes memory embeddings forward along the sequence, and recalls relevant history through memory-augmented segment-level recurrence. This improves language modeling, question answering, and summarization while using far fewer parameters and much less inference memory than comparable long-context models.
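
The recurrence can be pictured as a loop over fixed-length segments: each segment first recalls from a bounded cache of past summary embeddings via cross-attention, is processed by the backbone together with the recalled embedding, and then deposits its own summary back into the cache. The PyTorch sketch below illustrates this loop under simplifying assumptions; HMTStyleMemory, query_proj, summary_proj, and the mean-pooled query and summary are illustrative stand-ins, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HMTStyleMemory(nn.Module):
    """Illustrative sketch of memory-augmented segment-level recurrence.

    Not the paper's code: the recall query, the segment summary, and the
    backbone interface are simplified stand-ins for the described mechanism.
    """

    def __init__(self, d_model: int, cache_size: int = 32):
        super().__init__()
        self.d_model = d_model
        self.cache_size = cache_size           # bounded cache of past segment summaries
        self.query_proj = nn.Linear(d_model, d_model)
        self.summary_proj = nn.Linear(d_model, d_model)

    def recall(self, segment: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        """Cross-attend a query from the current segment over cached memories."""
        # Query: mean of the segment's leading tokens (a simplification).
        head = segment[: max(1, segment.size(0) // 2)]
        q = self.query_proj(head.mean(dim=0, keepdim=True))        # (1, d)
        attn = F.softmax(q @ memory.T / self.d_model ** 0.5, dim=-1)  # (1, M)
        return attn @ memory                                       # (1, d) recalled embedding

    def forward(self, segments: list[torch.Tensor], backbone) -> list[torch.Tensor]:
        memory = torch.zeros(0, self.d_model)  # empty cache of memory embeddings
        outputs = []
        for seg in segments:                   # seg: (seg_len, d_model)
            if memory.size(0) > 0:
                recalled = self.recall(seg, memory)
                seg = torch.cat([recalled, seg], dim=0)  # prepend recalled memory
            hidden = backbone(seg)             # any callable returning (len, d_model)
            outputs.append(hidden)
            # Summarize this segment into one embedding, append it to the cache,
            # and evict the oldest entry once the cache is full.
            summary = self.summary_proj(hidden.mean(dim=0, keepdim=True))
            memory = torch.cat([memory, summary], dim=0)[-self.cache_size:]
        return outputs

# Toy usage: an identity "backbone" over random segment embeddings.
hmt = HMTStyleMemory(d_model=64)
segments = [torch.randn(16, 64) for _ in range(4)]
outs = hmt(segments, backbone=nn.Identity())
```

Because the cache holds at most cache_size summary embeddings, per-segment state stays constant as the context grows, which is consistent with the reduced inference memory the abstract reports; the actual model also carries preserved tokens from earlier segments between steps, which this sketch omits.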