Computational models of category-selective brain regions enable high-throughput tests of selectivity



N. A. R. Murty, P. Bashivan, et al.

N. Apurva Ratan Murty, Pouya Bashivan, and colleagues present artificial neural network-based encoding models that accurately predict brain responses to images and use them to test domain-specific theories of human cognition. The approach sharpens our understanding of how we perceive faces, places, and bodies, and opens the way for further work in cognitive neuroscience.

Introduction
The discovery of cortical regions specialized for perceiving faces, places, and bodies has fueled debates about the structure, evolution, and development of the human mind. These regions, the fusiform face area (FFA), parahippocampal place area (PPA), and extrastriate body area (EBA), are linked to social interaction and navigation. However, the intuitive definitions of these categories are not quantitatively precise and are not image-computable, which hinders rigorous testing of their selectivity. Existing evidence, while substantial, remains vulnerable to refutation because a vast space of images has never been tested. This study addresses these limitations by developing image-computable encoding models that accurately predict neural responses to novel images, and then using these models to conduct strong tests of the hypothesized category selectivity of the FFA, PPA, and EBA.
Literature Review
Extensive research supports the hypothesized category selectivities of the FFA, PPA, and EBA, but because most possible images have never been tested, these hypotheses remain open to refutation. The study leverages advances in deep convolutional artificial neural networks (ANNs), which approach human-level performance in object recognition and whose internal representations mirror the hierarchical organization of the visual cortex. Used as computational models of visual processing, these ANNs predict fMRI responses more accurately than descriptive models and experts do. The study combines existing knowledge about category-selective brain regions with ANN modeling techniques to test the hypothesized category selectivity rigorously.
Methodology
The study involved four fMRI participants. First, the FFA, PPA, and EBA were localized in each participant using a standard dynamic localizer. Event-related fMRI responses were then recorded in these functionally defined regions of interest (fROIs) to a diverse set of 185 naturalistic stimuli, each presented at least 20 times. Sixty ANN-based models of the ventral stream were screened for their ability to predict the observed responses using a regression-based model-to-brain alignment approach: a linear mapping was fitted between a selected layer of the model and the activation of each brain region, trained on a subset of stimuli and tested on held-out stimuli. The ResNet50 model, trained on a broad set of images, emerged as the best predictor. Its predictive accuracy was evaluated with several measures, including the correlation between predicted and observed responses for individual images, within and across categories, voxel-wise responses, and population-level representational dissimilarity matrices (RDMs). Behavioral experiments compared the model's predictions with those of novice participants and experts, who rated each image's similarity to the hypothesized preferred categories. To test category selectivity, the models screened ~3.5 million images from three databases (VGGFace, ImageNet, Places2) and synthesized new images using a generative adversarial network (GAN) to identify stimuli predicted to maximally activate each fROI. Finally, a variant of Randomized Input Sampling for Explanation (RISE) was used to identify the stimulus features driving neural responses.
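The linear model-to-brain mapping described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the choice of ridge regression, the penalty `alpha`, the five-fold split, and the synthetic `features` and `responses` arrays are all assumptions made here for demonstration. Only the count of 185 stimuli comes from the text.

```python
import numpy as np

def fit_ridge(features, responses, alpha=1.0):
    """Closed-form ridge regression from ANN features to one region's response."""
    mean_x = features.mean(axis=0)
    mean_y = responses.mean()
    xc = features - mean_x
    yc = responses - mean_y
    # Solve (X^T X + alpha * I) w = X^T y on centered data.
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])
    weights = np.linalg.solve(gram, xc.T @ yc)
    intercept = mean_y - mean_x @ weights
    return weights, intercept

def cross_validated_predictions(features, responses, n_folds=5, alpha=1.0, seed=0):
    """Predict every stimulus from a mapping trained only on the other folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(responses))
    predictions = np.empty(len(responses))
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        weights, intercept = fit_ridge(features[train_idx], responses[train_idx], alpha)
        predictions[test_idx] = features[test_idx] @ weights + intercept
    return predictions

def pearson_r(a, b):
    """Correlation between predicted and observed responses."""
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-in data (assumption): 185 stimuli, 20 hypothetical layer features.
rng = np.random.default_rng(0)
n_images, n_features = 185, 20
features = rng.normal(size=(n_images, n_features))
true_weights = rng.normal(size=n_features)
responses = features @ true_weights + 0.5 * rng.normal(size=n_images)

predicted = cross_validated_predictions(features, responses)
accuracy = pearson_r(predicted, responses)
print(f"held-out prediction accuracy (Pearson r): {accuracy:.2f}")
```

In practice the features would come from a chosen layer of a network such as ResNet50 and the responses from a measured fROI. The key design point is that accuracy is always scored on stimuli the mapping never saw during fitting.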
Key Findings
The ResNet50 model accurately predicted responses to novel images in the FFA, PPA, and EBA, outperforming simpler models, descriptive models, and human experts. These predictions generalized across participants. High-throughput screening and image synthesis revealed that all top-predicted images for each region were unambiguous exemplars of the hypothesized preferred category. Analysis using RISE identified key features driving responses in each region (e.g., eyes and noses for FFA, hands and torsos for EBA, and perspective cues for PPA). The models successfully identified these features in complex natural images.
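The idea behind the RISE analysis can be illustrated with a simplified sketch: mask random parts of an image, record the predicted response to each masked image, and average the masks weighted by those responses. The study used its own variant of RISE; the version below is a reduced assumption-laden one (blocky grid masks instead of the smoothly upsampled, randomly shifted masks of the original method), and `toy_predict` is a hypothetical stand-in for a fitted encoding model.

```python
import numpy as np

def rise_saliency(image, predict, n_masks=1000, grid=4, keep_prob=0.5, seed=0):
    """Estimate per-pixel importance from responses to randomly masked images.

    Assumes a 2-D image whose height and width are divisible by `grid`.
    """
    rng = np.random.default_rng(seed)
    height, width = image.shape
    cell_h, cell_w = height // grid, width // grid
    saliency = np.zeros((height, width))
    for _ in range(n_masks):
        # Keep each coarse grid cell with probability keep_prob.
        coarse = (rng.random((grid, grid)) < keep_prob).astype(float)
        # Expand the coarse grid to image resolution (nearest-neighbor blocks).
        mask = np.kron(coarse, np.ones((cell_h, cell_w)))
        # Weight the mask by the predicted response to the masked image.
        saliency += predict(image * mask) * mask
    return saliency / (n_masks * keep_prob)

def toy_predict(img):
    """Hypothetical model: responds only to the patch at rows 8-15, cols 16-23."""
    return float(img[8:16, 16:24].mean())

image = np.ones((32, 32))
saliency_map = rise_saliency(image, toy_predict)
row, col = np.unravel_index(np.argmax(saliency_map), saliency_map.shape)
print("most important pixel:", (int(row), int(col)))
```

Because the toy model depends only on one patch, the saliency map peaks inside that patch. With a real encoding model, the same procedure would highlight whichever image regions most increase the predicted response.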
Discussion
The high accuracy and generalizability of the ANN-based models provide strong support for the hypothesized category selectivity of the FFA, PPA, and EBA. These models offer an image-computable and fine-grained account of category selectivity, going beyond simple word-based descriptions. Their superior performance compared to human experts demonstrates that the models capture knowledge not explicitly represented in human intuitions. The study highlights the complementary nature of word-based and ANN-based models, with words serving as pointers to the code executing the models.
Conclusion
This research provides highly accurate, image-computable encoding models of category-selective brain regions. These models robustly validated the category selectivity hypotheses for FFA, PPA, and EBA, identifying key features driving each region's activity. Future work should focus on improving model accuracy (e.g., using models incorporating biological network properties, expanding training data diversity) and extending the models to predict responses to more abstract stimuli. Further research should aim to create comprehensive models of the entire visual pathway to better understand the sequential computations involved in visual information processing.
Limitations
The study focused on a limited set of brain regions and stimuli. The ANN models, while powerful, generalize poorly to out-of-domain samples (e.g., line drawings). Because the findings rest on fMRI data, they do not necessarily generalize to other neuroimaging techniques. The noise-ceiling estimates could also be refined to give more precise measures of accuracy.