Psychology
Audiovisual adaptation is expressed in spatial and decisional codes
M. Aller, A. Mihalik, et al.
The study investigates how the adult human brain adapts to spatially misaligned audiovisual signals and whether recalibration arises in perceptual (spatial) encoding, decisional processes, or both. Spatial coding differs across modalities (retinotopic place code in vision vs. hemifield-based coding in audition), creating challenges for maintaining aligned spatial maps. Behaviourally, exposure to synchronous but spatially disparate audiovisual stimuli induces the ventriloquist aftereffect across multiple timescales, but the underlying neural mechanisms and representational codes remain unclear. Prior neuroimaging often conflated spatial and decisional processes by using localization tasks that map each location to a unique response. Here, using a spatial classification task and model-based analyses of fMRI and EEG, the authors aim to dissociate changes in spatial representations from choice-related (decisional uncertainty) codes during audiovisual recalibration.
Prior work shows robust cross-sensory plasticity such as the ventriloquist aftereffect over milliseconds to days. Early suggestions of frequency-selective recalibration pointed to auditory cortices, while findings of hybrid reference frames implicated parietal cortex or inferior colliculus. Auditory spatial coding in primates is often modeled as hemifield coding in cortex, potentially transforming to place-like codes in higher areas (e.g., VIP). Previous human neuroimaging decoded auditory space in auditory and frontoparietal regions but could not disentangle spatial from decisional components due to task designs. The present study builds on these findings by explicitly modeling and comparing spatial (hemifield) and decisional uncertainty codes and their recalibration.
Design: Within-subjects study across 13 days per participant: pre-screening (1), psychophysics (4), fMRI (4), EEG (4). Each experiment comprised (i) auditory pre-adaptation, (ii) audiovisual adaptation with leftward (VA) or rightward (AV) spatial disparity (±15°), and (iii) auditory post-adaptation (postVA or postAV). During pre/post phases, participants heard 50 ms white-noise bursts spatialized via HRTFs at 7 azimuths (±12°, ±5°, ±2°, 0°). On 22% of trials (response trials), they performed a left/right spatial classification, cued 500 ms after sound onset by fixation dimming; the remainder were non-response trials to minimize motor confounds in fMRI. Adaptation phases presented synchronous AV pairs with V shifted ±15° relative to A (visual at −5°, 0°, 5°), organized in mini-blocks, with a non-spatial visual detection task (10% dimmer trials) to induce implicit recalibration.
Participants: Psychophysics N=15 (10 female; mean age 22.1). A subset of N=5 (4 female; mean age 22.2; one author) completed fMRI and EEG. All had normal hearing and vision and no neurological/psychiatric history; the study was ethics-approved.
Stimuli and setup: Auditory stimuli were HRTF-convolved white noise (75 dB SPL); visual stimuli were dot clouds (15 dots, 50 ms). Presentation via Psychtoolbox. fMRI at 3T (Philips Achieva); 64-channel EEG (1000 Hz). Eye tracking verified fixation in psychophysics; high-quality eye tracking was not feasible in-scanner.
Behavioural analysis: Cumulative Gaussian psychometric functions (beta-binomial) were fit to percent perceived right; AIC-based random-effects model comparison pitted static against recalibration PSE models, and PSEs were compared across phases.
fMRI acquisition/analysis: EPI TR=2800 ms, TE=40 ms, 3×3×3 mm; GLM in SPM12; multivariate spatial noise normalization; ROIs: Heschl's gyrus (HG), higher auditory cortex (hA, incl. planum temporale), intraparietal sulcus (IPS), inferior parietal lobule (IPL), frontal eye fields (FEF).
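The behavioural analysis can be illustrated with a minimal sketch: fit a cumulative Gaussian to the proportion of "right" responses at each azimuth and read off the PSE, then compare PSEs across post-adaptation phases. This uses a plain least-squares fit on noiseless toy data for clarity; the study itself used beta-binomial likelihoods and random-effects model comparison.

```python
# Hedged sketch: cumulative Gaussian psychometric fit and PSE shift.
# Toy data and plain least-squares stand in for the beta-binomial fits.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

azimuths = np.array([-12., -5., -2., 0., 2., 5., 12.])  # stimulus locations (deg)

def psychometric(x, pse, sigma):
    # Probability of responding "right" as a function of azimuth.
    return norm.cdf(x, loc=pse, scale=sigma)

def fit_pse(p_right):
    """Fit PSE and slope to per-azimuth proportions of 'right' responses."""
    (pse, sigma), _ = curve_fit(psychometric, azimuths, p_right, p0=[0.0, 5.0])
    return pse, sigma

# Toy proportions: PSE shifted positive after VA, negative after AV adaptation.
post_va = psychometric(azimuths, 1.2, 4.0)
post_av = psychometric(azimuths, -1.1, 4.0)
pse_va, _ = fit_pse(post_va)
pse_av, _ = fit_pse(post_av)
print(f"PSE shift (postVA - postAV): {pse_va - pse_av:.1f} deg")
```

The postVA minus postAV difference recovered here is analogous to the ~2.3° mean behavioural shift reported in the study.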
Decoding: SVR was trained on pre-adaptation patterns to predict location; performance was evaluated via a spatial encoding index (Fisher z-transformed correlation between decoded and true locations) and a recalibration index (difference in fraction decoded right: postAV − postVA). Neurometric functions were fit to percent decoded right. Representational similarity analysis (Mahalanobis-distance RDMs) with MDS projection along azimuth.
Model-based analyses: Pattern component modelling (PCM) compared a spatial hemifield model, a decisional uncertainty model (a non-linear function of distance to the decisional boundary), and combined models; factorial versions allowed recalibration in the spatial and/or decisional components (shift of ±2.3°, the mean behavioural PSE shift). Linear mixed-effects models assessed regional mean BOLD responses using model predictors, with and without recalibration factors.
EEG analyses: Preprocessing (0.1–45 Hz band-pass, re-referencing, artifact rejection, downsampling to 200 Hz). Time-resolved SVR decoding in sliding 50 ms windows from −100 to 500 ms. Cluster-based bootstrap statistics tested spatial encoding and recalibration indices. PCM across four windows (50–150, 150–250, 250–350, 350–450 ms) compared spatial, decisional, and combined models, with and without recalibration.
EEG–fMRI fusion: PCMs used fMRI ROI second-moment matrices (HG, hA, IPS, IPL, FEF) as predictors for the EEG representational structure across time windows.
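The decoding pipeline's two summary statistics can be sketched as follows. A linear least-squares decoder stands in for the SVR, and the voxel patterns are synthetic, with an assumed representational shift of ±2.3° inserted by hand; only the index definitions mirror the paper.

```python
# Hedged sketch: train a decoder on pre-adaptation patterns, then compute
# (i) the spatial encoding index (Fisher z of predicted-vs-true correlation)
# and (ii) the recalibration index (fraction decoded right: postAV - postVA).
import numpy as np

rng = np.random.default_rng(0)
azimuths = np.array([-12., -5., -2., 0., 2., 5., 12.])

def make_patterns(shift=0.0, n_rep=20, n_vox=50):
    """Toy voxel patterns whose first feature tracks (azimuth + shift)."""
    locs = np.repeat(azimuths, n_rep)
    X = rng.normal(0, 1, (locs.size, n_vox))
    X[:, 0] += 0.3 * (locs + shift)          # weak spatial signal
    return X, locs

X_pre, y_pre = make_patterns()
w, *_ = np.linalg.lstsq(np.c_[X_pre, np.ones(len(y_pre))], y_pre, rcond=None)

def decode(X):
    return np.c_[X, np.ones(len(X))] @ w

# Spatial encoding index: Fisher z-transformed correlation.
z = np.arctanh(np.corrcoef(decode(X_pre), y_pre)[0, 1])

# Recalibration index: difference in fraction decoded right of 0 deg.
X_av, _ = make_patterns(shift=+2.3)   # postAV: representations shifted right
X_va, _ = make_patterns(shift=-2.3)   # postVA: representations shifted left
recal = np.mean(decode(X_av) > 0) - np.mean(decode(X_va) > 0)
print(f"encoding z = {z:.2f}, recalibration index = {recal:.2f}")
```

A region with genuine spatial information yields z > 0, and a representational shift in opposite directions after the two adaptation regimes yields a positive recalibration index, as in the fMRI results below.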
- Behaviour: Strong ventriloquist aftereffect across experiments. Recalibration models with shifting PSEs were strongly preferred over static models (protected exceedance probability >0.90). PSEs were significantly more positive after VA than AV adaptation: psychophysics t(14)=11.4, p<0.0001; fMRI t(4)=9.4, p=0.010; EEG t(4)=9.2, p=0.011. Eye movements did not differ between postVA and postAV.
- fMRI decoding: Significant spatial encoding indices in all ROIs (FDR-corrected): HG t(4)=2.47, p=0.028; hA t(4)=6.58, p=0.020; IPS t(4)=4.04, p=0.020; IPL t(4)=4.27, p=0.020; FEF t(4)=2.84, p=0.023. Recalibration index (postAV−postVA fraction decoded right) was significantly >0 in HG (t=3.55, p=0.042), hA (t=3.33, p=0.010), IPL (t=4.31, p=0.026), FEF (t=2.49, p=0.047), with a trend in IPS (t=1.81, p=0.086). Neurometric functions shifted in the expected directions; model comparison favored recalibration (AIC evidence: HG 10.6, hA 15.5, IPS 12.7, IPL 8.0, FEF 5.6).
- Representational geometry (RSA/MDS): hA, IPS, and IPL RDMs reflected physical spatial order strongly (Spearman Rs≈0.91–1.0), HG/FEF less so. MDS showed consistent leftward shift after VA and rightward after AV, with more complex structures in IPL/FEF.
- Model-based fMRI (regional mean BOLD, LME): hA showed linear increase along azimuth consistent with spatial coding; IPS/IPL/FEF showed inverted U-shaped profiles centered at the decisional boundary that shifted with recalibration, consistent with decisional uncertainty coding. Bayesian comparisons supported spatial model in hA and decisional model in IPS (pre-adaptation); with recalibration, evidence favored decisional components in IPS/IPL/FEF and spatial components in hA.
- Model-based fMRI (PCM on fine-scale patterns): Spatial model dominated in HG; combined spatial+decisional model best in hA, IPS, IPL, FEF, with decisional component increasingly dominant along the hierarchy. Recalibration was best captured by spatial shifts in HG/hA and decisional shifts in IPS/IPL/FEF; combined SR+DR models often outperformed single-component models.
- EEG decoding: Significant spatial decoding from ~110–500 ms, peaking at ~355 ms. The recalibration index was significantly positive at 185–285 ms (p=0.019) and 335–470 ms (p=0.005). Within the N100 window (70–130 ms), spatial decoding was above chance (mean 0.125±0.059; t(4)=2.12, p=0.0406) and the recalibration index was positive (mean 5.963±1.743; t(4)=3.42, p=0.0019).
- Model-based EEG (PCM): 50–150 ms primarily spatial coding (with modest added evidence for combined model); from 150–250 ms onward, combined spatial+decisional models were superior (Log BF ≥6.1). Recalibration from 150–250 ms mainly via spatial coding; from 250 ms onward via joint spatial+decisional coding.
- EEG–fMRI fusion (PCM): hA patterns best explained EEG representational structure at 150–250 ms (exceeding others by Log BF ≥5.0); IPL best from 250 ms onward (Log BF ≥3.2), consistent with a temporal shift from spatial to decisional dominance.
Overall: Audiovisual recalibration engages both spatial and decisional codes, with opposite gradients across the cortical hierarchy and distinct temporal dynamics: early spatial coding in auditory cortices (planum temporale), later decisional-uncertainty coding in frontoparietal areas.
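The two model predictors contrasted throughout these results can be made concrete with a small sketch: a hemifield (spatial) code that varies monotonically with azimuth, and a decisional-uncertainty code that peaks at the left/right classification boundary and tracks the PSE shift under recalibration. The sigmoid and exponential forms and their scale parameters are illustrative assumptions, not the paper's exact parameterization.

```python
# Hedged sketch of the two model predictors: hemifield (spatial) code vs.
# decisional uncertainty peaking at the classification boundary.
# Functional forms and scales are assumptions for illustration only.
import numpy as np

azimuths = np.array([-12., -5., -2., 0., 2., 5., 12.])

def hemifield(az, shift=0.0):
    # Monotonic response increasing toward one hemifield (sigmoid in azimuth).
    return 1 / (1 + np.exp(-(az - shift) / 4.0))

def decisional_uncertainty(az, boundary=0.0):
    # Uncertainty is maximal at the decision boundary, decaying with distance,
    # giving the inverted-U BOLD profile seen in IPS/IPL/FEF.
    return np.exp(-np.abs(az - boundary) / 4.0)

pre = decisional_uncertainty(azimuths)            # inverted U centered at 0 deg
post = decisional_uncertainty(azimuths, 2.3)      # boundary tracks the PSE shift
spatial_pre = hemifield(azimuths)
spatial_post = hemifield(azimuths, 2.3)           # spatial recalibration

print("uncertainty peak pre :", azimuths[np.argmax(pre)])
print("uncertainty peak post:", azimuths[np.argmax(post)])
```

Shifting the boundary by the ±2.3° behavioural PSE shift moves the peak of the uncertainty profile, while shifting the hemifield sigmoid displaces the spatial code; the factorial PCM variants in the paper test which of these shifts (or both) best explains the post-adaptation patterns.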
The findings directly address how the brain maintains aligned audiovisual spatial maps: by flexibly adapting both spatial representations and decisional uncertainty, rather than via a unitary mechanism. Decoding and representational analyses showed pervasive recalibration across auditory and frontoparietal regions, but model-based comparisons revealed distinct codes: early auditory regions (hA/planum temporale) primarily adjust spatial encoding, while frontoparietal cortices (IPS, IPL, FEF) primarily adjust decisional uncertainty relative to a classification boundary. EEG temporal dynamics mirrored this hierarchy: early post-stimulus activity (N1 and 150–250 ms) reflected spatial recalibration, whereas later activity (>250 ms) reflected decisional components, with EEG–fMRI fusion linking hA to earlier windows and IPL to later windows. This demonstrates that decoding alone is insufficient to infer representational content; explicit model-based representational analyses are needed to separate perceptual from decisional processes. The results suggest that top-down influences from frontoparietal areas contribute to decisional-uncertainty coding in auditory cortices at later stages, and they support the notion that recalibration mechanisms operate at multiple processing levels and timescales.
Audiovisual adaptation relies on both spatial and decisional coding that are expressed with opposite gradients across the cortical hierarchy and evolve over different time courses. Early activity in higher-order auditory cortex (planum temporale) encodes a flexible, continuous auditory space that shifts toward visual inputs, whereas later frontoparietal activity primarily reflects decisional uncertainty consistent with these shifts. By combining psychophysics with model-based fMRI/EEG representational analyses and EEG–fMRI fusion, the study delineates where and when recalibration is implemented in the human brain. Future work should test how task demands (e.g., explicit spatial judgments), recalibration duration (rapid vs. long-term), and broader spatial ranges affect the balance between spatial and decisional recalibration, and use methods with higher temporal/spatial resolution to further disentangle multiplexed codes and causal interactions across regions.
- The spatial range tested (−12° to 12°) makes hemifield and place-code model predictions nearly indistinguishable for pattern similarity, limiting definitive conclusions about precise spatial coding formats across all regions.
- fMRI BOLD sluggishness may mix neural codes present at different latencies within regional patterns, potentially contributing to apparent multiplexing of spatial and decisional codes.
- The adaptation task was non-spatial to promote implicit recalibration; results may differ under explicit spatial tasks or different timescales (rapid vs. cumulative recalibration), which were not directly compared here.
- Eye tracking during fMRI was limited due to scanner constraints, though behavioural and design controls minimize confounds.
- Small fMRI/EEG sample size (N=5) may limit generalizability, although effects were consistent and corroborated across modalities.