Shared functional specialization in transformer-based language models and the human brain

S. Kumar, T. R. Sumers, et al.

Discover how transformer-based language models such as BERT align with human brain activity during language processing. This research by Sreejan Kumar and colleagues reveals significant correlations between the models' internal computations and activity in specific brain regions, suggesting shared computational principles that bridge machine learning and neuroscience.

Abstract
When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations ("embeddings") generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct these computations into the functionally-specialized "transformations" that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that the transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized "attention heads" differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional cortical space.
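To make the workflow described in the abstract concrete, here is a minimal sketch, not the authors' released code: it captures the per-head attention outputs ("transformations") from BERT using forward hooks in the HuggingFace transformers library, then fits a ridge-regression encoding model to predict brain responses. The fMRI array (`bold_responses`), the example sentence, the train/test split, and all parameter values are hypothetical placeholders chosen only to keep the example runnable.

```python
# Minimal sketch of an encoding-model workflow in the spirit of the paper.
# The fMRI data below are random placeholders; real analyses would align
# per-word features to BOLD time series and cross-validate over stories.
import numpy as np
import torch
from transformers import BertTokenizer, BertModel
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Capture the output of each layer's self-attention block: the concatenation
# of attention-weighted value vectors across heads (the per-head
# "transformations" before the output projection).
captured = {}
def make_hook(layer_idx):
    def hook(module, inputs, outputs):
        captured[layer_idx] = outputs[0].detach()  # (batch, seq, hidden)
    return hook

for l, layer in enumerate(model.encoder.layer):
    layer.attention.self.register_forward_hook(make_hook(l))

text = "After a while he remembered the story his grandmother used to tell."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    model(**inputs)

# Stack features for every token: (seq_len, n_layers * hidden_size).
features = torch.cat(
    [captured[l].squeeze(0) for l in sorted(captured)], dim=-1
).numpy()

# Hypothetical fMRI responses (e.g., voxels or parcels) per token, random
# here just so the sketch runs end to end.
n_voxels = 100
bold_responses = np.random.randn(features.shape[0], n_voxels)

X_train, X_test, y_train, y_test = train_test_split(
    features, bold_responses, test_size=0.2, random_state=0
)
encoder = Ridge(alpha=10.0).fit(X_train, y_train)
print("Encoding-model R^2 (toy data):", encoder.score(X_test, y_test))
```

Splitting the captured feature matrix by head (each layer's hidden dimension is the concatenation of its attention heads) and fitting a separate encoding model per head is, in spirit, how head-wise predictions for specific cortical regions could be compared.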
Publisher
Nature Communications
Published On
Jun 29, 2024
Authors
Sreejan Kumar, Theodore R. Sumers, Takateru Yamakoshi, Ariel Goldstein, Uri Hasson, Kenneth A. Norman, Thomas L. Griffiths, Robert D. Hawkins, Samuel A. Nastase
Tags
transformer models
BERT
brain activity
language processing
computational principles
attention heads
cortical language network