This paper investigates functional specialization in transformer-based language models (such as BERT) and how it corresponds to human brain activity during language processing. By analyzing the "transformations" (the computations performed by attention heads) within the Transformer architecture, the researchers demonstrate that these transformations account for a significant portion of the variance in brain activity across the cortical language network. They further show a structured relationship between specific attention heads, the computations they perform, and activity in particular brain regions. These findings suggest that the models and the human brain share computational principles.
Publisher
Nature Communications
Published On
Jun 29, 2024
Authors
Sreejan Kumar, Theodore R. Sumers, Takateru Yamakoshi, Ariel Goldstein, Uri Hasson, Kenneth A. Norman, Thomas L. Griffiths, Robert D. Hawkins, Samuel A. Nastase
Tags
transformer models
BERT
brain activity
language processing
computational principles
attention heads
cortical language network