Psychology
Neural integration underlying naturalistic prediction flexibly adapts to varying sensory input rate
T. J. Baumgarten, B. Maniscalco, et al.
The study addresses how the brain predicts future sensory input when the rate of incoming information varies widely in natural environments. It focuses on sensory history integration (SHI)—the accumulation and selective integration of past sensory information needed to extract temporal regularities and form predictions. Two competing hypotheses are tested: (1) a temporal bottleneck, where SHI operates over fixed-duration windows, implying fewer items are integrated at faster input rates; and (2) an informational bottleneck, where SHI integrates a fixed amount of information, requiring flexible time scaling as input rate changes. The work examines which mechanism underlies predictive computation for naturalistic temporal regularities using MEG while manipulating presentation speed of autocorrelated tone sequences.
Prior work shows that natural stimuli possess long-range temporal correlations and redundancy that can be exploited for prediction. Behavioral studies demonstrate robust prediction across varied rates (e.g., speech, vision), and neuroimaging indicates flexible temporal scaling of responses in speech comprehension. Evidence for intrinsic temporal constraints includes biophysical limits and stable-frequency neural oscillations linked to perception. Conversely, flexibility is supported by findings that single neurons process features at multiple timescales, fMRI responses scale with input speed, and recurrent neural networks can recognize time-warped inputs. However, the relevance of these findings to neural mechanisms of predictive computation, especially how SHI adapts to input rate, remained unclear prior to this study.
Participants: 26 right-handed adults with normal hearing were recruited; after exclusions for performance or artifacts, 20 participants remained (11 female; mean age 25 years, range 19–34). The study was IRB-approved, and all participants gave informed consent. Seven participants completed a behavioral prescreening session.
Stimuli and design: Auditory sequences comprised 34 concatenated pure tones without gaps. Pitch fluctuations followed naturalistic temporal autocorrelations with 1/f^β spectra (β = 0.5, 0.99, 1.5), discretized to 25 semitone-spaced values spanning 220–880 Hz. Nine unique 33-tone sequences (3 β × 3 predicted final pitch bins) were generated using circulant embedding. All sequences converged to the same penultimate tone (P33 = 440 Hz). For each sequence, an optimal theoretically predicted final tone pitch (p34) was computed from the preceding tones. The actually presented final tone (P34) was drawn from six pitches ±4, ±8, ±12 semitones relative to 440 Hz. Presentation speeds: 150 ms/tone (fast), 300 ms (medium), 600 ms (slow). Each of the 27 distinct sequences (9 sequences × 3 durations) was presented once per block across 12 blocks (324 trials/participant).
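The pitch grid and trial count above follow from simple arithmetic. The sketch below is illustrative only: it assumes that "semitone-spaced" means equal-tempered steps with a frequency ratio of 2^(1/12), and the function names are hypothetical, not taken from the study.

```python
# Semitone grid and trial-count arithmetic for the design described above.
# Assumption: equal-tempered semitones, i.e. a ratio of 2**(1/12) per step.

def semitone_grid(f_low_hz: float, n_values: int) -> list[float]:
    """Return n_values frequencies, one semitone apart, starting at f_low_hz."""
    return [f_low_hz * 2 ** (step / 12) for step in range(n_values)]

def shift_semitones(f_ref_hz: float, semitones: int) -> float:
    """Frequency reached by shifting f_ref_hz by a signed number of semitones."""
    return f_ref_hz * 2 ** (semitones / 12)

# 25 values, 24 semitone steps: two octaves above 220 Hz.
grid = semitone_grid(220.0, 25)

# Six candidate final tones (P34) relative to the 440 Hz penultimate tone.
final_tones = [shift_semitones(440.0, s) for s in (-12, -8, -4, 4, 8, 12)]

# 9 sequences x 3 tone durations, each presented once in each of 12 blocks.
n_trials = 9 * 3 * 12
```

Under this assumption the grid ends at 880 Hz, and the most extreme final tones coincide with the edges of the pitch range.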
Task: After each sequence, participants rated the likelihood that the presented final tone fit the preceding sequence (1–5 scale), then rated sequence trend strength (autocorrelation level; 1–3). Feedback was provided for trend strength ratings only. Eye fixation was maintained during sequences.
MEG acquisition and preprocessing: Whole-head MEG (275-channel CTF; 272 usable sensors) at 600 Hz sampling. Bandstop filters at 58–62 Hz and 118–122 Hz; no high-pass filter to retain low-frequency activity. Demeaning/detrending applied. Independent component analysis removed ocular, cardiac, breathing, and motion artifacts. Head position was monitored across blocks.
Neural analyses: Predictive activity was probed by regressing non-baseline-corrected MEG during the penultimate tone (P33, identical pitch across trials) on the theoretically predicted p34, using 50-ms nonoverlapping windows. Group-level one-sample t-tests on regression coefficients with cluster-based permutation correction identified predictive processing clusters.
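The windowed regression described above can be sketched for a single sensor as follows. This is a minimal illustration rather than the authors' pipeline: it assumes ordinary least squares with one regressor, uses toy data, and omits the group-level t-tests and cluster-based permutation correction. All function names are hypothetical.

```python
# Sketch: regress windowed sensor activity during P33 on the predicted pitch p34.
# From the text: 600 Hz sampling, 50 ms nonoverlapping windows.
# Assumptions: one sensor, ordinary least squares, toy data.

def window_means(samples, fs_hz=600, win_ms=50):
    """Average a single-trial time course within nonoverlapping windows."""
    n = fs_hz * win_ms // 1000  # samples per window (30 at 600 Hz)
    return [sum(samples[i:i + n]) / n for i in range(0, len(samples) - n + 1, n)]

def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def windowed_slopes(trials, predicted_pitch, fs_hz=600, win_ms=50):
    """One regression coefficient per time window, estimated across trials."""
    per_trial = [window_means(t, fs_hz, win_ms) for t in trials]
    n_windows = len(per_trial[0])
    return [ols_slope(predicted_pitch, [w[j] for w in per_trial])
            for j in range(n_windows)]

# Toy example: activity scales linearly with p34 (arbitrary units).
predicted_pitch = [-1.0, 0.0, 1.0]
trials = [[0.5 * p] * 180 for p in predicted_pitch]  # 300 ms tone at 600 Hz
slopes = windowed_slopes(trials, predicted_pitch)    # one slope per 50 ms window
```

Because the pitch of P33 is identical on every trial, a nonzero slope in this kind of regression cannot be attributed to the current stimulus, which is the logic the analysis relies on.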
Event-related analyses contrasted time-resolved fields between sequences converging on low vs high p34 in predictive clusters and in an early sensory spatial filter emphasizing M100 auditory responses (75–125 ms) localized to primary auditory cortex.
Sensory History Integration (SHI): For tones i = 16–32 in the second half of sequences, sensor/time-window MEG activity was regressed on the current tone pitch and the k preceding tone pitches, testing models with k from 0 to 15 (i.e., integrating 1–16 tones). Six-fold cross-validation selected the value of k, denoted k', that best predicted held-out data. Significance was assessed against a shuffled null distribution that destroyed the temporal order of past tones in training folds but preserved current tone pitch and test-set order; shuffling was repeated 100 times to build subject-level nulls, and group-level cluster permutation statistics were computed.
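The model-selection step can be illustrated with a small NumPy sketch for one sensor and time window. Several details are assumptions made for illustration: folds are formed over trials, held-out fit is scored by summed squared error, and the data are synthetic. The shuffled-null comparison is not shown, and the function names are hypothetical.

```python
import numpy as np

def lagged_design(pitch, k, first=15, last=31):
    """Design matrix for tones 16-32 (0-based 15..31).

    Columns: intercept, current pitch, then the k preceding pitches.
    """
    rows = [[1.0] + [pitch[i - lag] for lag in range(k + 1)]
            for i in range(first, last + 1)]
    return np.array(rows)

def select_k(pitches, responses, k_max=15, n_folds=6):
    """Return the k with the lowest cross-validated squared error, and all errors."""
    n_trials = pitches.shape[0]
    folds = np.array_split(np.arange(n_trials), n_folds)  # assumed: folds over trials
    y = responses[:, 15:32]
    errors = []
    for k in range(k_max + 1):
        X = np.stack([lagged_design(p, k) for p in pitches])  # (trials, 17, k + 2)
        sse = 0.0
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(n_trials), test_idx)
            w, *_ = np.linalg.lstsq(X[train_idx].reshape(-1, k + 2),
                                    y[train_idx].ravel(), rcond=None)
            pred = X[test_idx].reshape(-1, k + 2) @ w
            sse += float(np.sum((y[test_idx].ravel() - pred) ** 2))
        errors.append(sse)
    return int(np.argmin(errors)), errors

# Synthetic data: the response depends on the current tone and 3 preceding tones.
rng = np.random.default_rng(0)
pitches = rng.standard_normal((36, 33))
responses = np.zeros_like(pitches)
for lag, weight in enumerate([1.0, 0.8, 0.6, 0.5]):
    responses[:, 3:] += weight * pitches[:, 3 - lag:33 - lag]
responses += 0.1 * rng.standard_normal(responses.shape)

k_prime, errors = select_k(pitches, responses)
```

In this toy setting, models with fewer than three preceding tones fit held-out data much worse, so the selected value should be at least 3; cross-validation noise can occasionally favor a slightly larger k.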
Rate dependence tests: To adjudicate between fixed-duration vs fixed-information SHI, per-sensor k' values (averaged across subjects) for the three tone durations defined a vector in 3D k'-space. Vector norm (magnitude) estimated overall SHI length; vector angle to the “information line” (k'150 = k'300 = k'600) and to the “duration line” (k'150 = 2×k'300 = 4×k'600) quantified proximity to each hypothesis. Comparisons against shuffled nulls used nonparametric permutation tests; data were also projected onto the 2D plane spanned by the two hypothesis lines for visualization/control analyses. Analyses were run both within predictive processing clusters and across the full sensor array with cluster-based correction.
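The vector geometry above can be written out directly. In this sketch the two hypothesis lines are represented by direction vectors in (k'150, k'300, k'600) order: (1, 1, 1) for the information line and (4, 2, 1) for the duration line, the latter following from k'150 = 2×k'300 = 4×k'600. The example k' values are made up for illustration, and the comparison against shuffled nulls is not shown.

```python
import math

# Direction vectors in (k'150, k'300, k'600) order.
INFO_LINE = (1.0, 1.0, 1.0)      # same number of tones at every rate
DURATION_LINE = (4.0, 2.0, 1.0)  # same time window at every rate

def norm(v):
    """Euclidean length of the k' vector (overall SHI length)."""
    return math.sqrt(sum(x * x for x in v))

def angle_to_line(v, direction):
    """Angle in radians between vector v and a line through the origin."""
    cos = sum(a * b for a, b in zip(v, direction)) / (norm(v) * norm(direction))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error

# Hypothetical sensor integrating about six tones at every presentation rate.
k_vec = (6.0, 6.5, 5.5)
length = norm(k_vec)
angle_info = angle_to_line(k_vec, INFO_LINE)
angle_duration = angle_to_line(k_vec, DURATION_LINE)
```

For scale, the two hypothesis lines are themselves about 0.49 rad apart under this parameterization, so an angle of a few hundredths of a radian to one line places a vector clearly nearer to that hypothesis.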
Brain–behavior correlation: For each subject, a behavioral index of SHI was the F-statistic of the interaction between theoretically predicted and presented final tone (p34 × P34) from repeated-measures ANOVAs. Neural SHI was summarized as k' averaged across tone durations for sensors/time windows common to all durations (0–150 ms). Spearman correlations between neural k' and behavioral indices were computed within predictive clusters (FDR-corrected) and sensor-wise across the entire array with cluster-corrected permutation testing.
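The across-subject correlation can be sketched as a Spearman rank correlation, computed here as the Pearson correlation of average ranks. The subject values below are invented for illustration and are not data from the study; FDR and cluster corrections are not shown, and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for j in range(start, end + 1):
            ranks[order[j]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented per-subject values: neural k' (averaged over durations) and the
# behavioral F-statistic of the p34 x P34 interaction.
neural_k = [4.0, 6.5, 5.0, 8.0, 7.0]
behavior_f = [9.1, 3.2, 7.4, 1.5, 2.8]
rho = spearman_rho(neural_k, behavior_f)
```

A rank-based measure fits here because F-statistics are skewed and only the ordering of subjects matters for the reported relationship.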
Statistics: Cluster-based nonparametric permutation tests corrected for multiple comparisons in sensor space for prediction, SHI (k', vector norm, vector angles), and correlation analyses, using appropriate one-tailed or two-tailed thresholds as defined in the study.
Behavior:
- Participants successfully used sequence history to judge the likelihood of the final tone across a four-fold range of presentation speeds. Significant main effects and interaction between theoretically predicted final tone (p34) and presented final tone (P34) were observed, replicable across the first and second halves of the experiment, indicating stable performance.
Neural prediction signals:
- MEG activity during the penultimate tone (P33, identical 440 Hz across trials) carried information about the theoretically predicted final tone (p34), isolating predictive processing from instantaneous sensory effects.
- In the 300 ms condition, significant bilateral predictive processing sensor clusters were identified:
  • 0–50 ms: left cluster 41 sensors, p = 0.005, d_cluster = 5.2; right cluster 27 sensors, p = 0.024, d_cluster = 2.9.
  • 50–100 ms: left 38 sensors, p = 0.016, d_cluster = 3.8; right 29 sensors, p = 0.024, d_cluster = 2.8.
  • 100–150 ms: left 34 sensors, p = 0.014, d_cluster = 3.8.
  • 150–200 ms: left 43 sensors, p = 0.002, d_cluster = 5.3.
  • 200–250 ms: left 42 sensors, p = 0.002, d_cluster = 6.0.
  • 250–300 ms: left 34 sensors, p = 0.017, d_cluster = 3.3.
- Time-locked ERFs in predictive clusters showed increasing divergence between sequences converging on low vs high p34 toward the sequence end; no such differences were found in the early sensory filter emphasizing M100 responses, indicating predictive information resides in slow, arrhythmic activity rather than early sensory responses.
Sensory history integration (SHI):
- SHI was significant and widespread across sensors and time windows (0–150 ms), with k' values (number of integrated previous tones) significantly exceeding shuffled nulls.
- Within predictive processing clusters (50–100 ms window), vector norms of k' across durations were significantly larger than shuffled (mean norm ≈ 10.67 ± 0.43 SD, p < 0.001), confirming robust SHI.
- Critically, vectors were significantly closer to the information line (k'150 ≈ k'300 ≈ k'600) than expected by chance in the left cluster at 50–100 ms (mean angle 0.055 ± 0.033 radians, p = 0.014), with a trend at 100–150 ms (mean angle 0.061 ± 0.027 radians, p = 0.065). No significant proximity to the duration line was detected in predictive clusters. These results indicate that predictive SHI integrates a stable number of tones across speeds (informational bottleneck).
- Data-driven whole-array analysis corroborated this: a right central-lateral cluster (0–50 ms; 4 sensors) showed smaller angles to the information line than shuffled (mean angle 0.017 ± 0.008, p = 0.042), overlapping completely with the right predictive cluster. Conversely, an anterior-central cluster (0–100 ms; 19 sensors at 0–50 ms, p = 0.008; 15 sensors at 50–100 ms, p = 0.011) was significantly closer to the duration line, providing spatially distinct evidence for temporally fixed integration (temporal bottleneck) that minimally overlapped with predictive clusters.
Brain–behavior correlations:
- Across subjects, neural SHI length (k') in right predictive clusters negatively correlated with behavioral history dependence (F-statistic of p34 × P34 interaction):
  • 0–50 ms: Spearman ρ ≈ -0.55, p = 0.014 (FDR-corrected).
  • 50–100 ms: Spearman ρ ≈ -0.56, p = 0.012 (FDR-corrected).
- Whole-array analysis identified a right hemisphere cluster (50–100 ms; 14 sensors) with negative correlations (average ρ ≈ -0.61, p = 0.002), partially overlapping predictive clusters.
Overall, predictive neural computation operates over a fixed amount of information, flexibly scaling in time with input rate, while separate frontal regions exhibit fixed-duration integration.
The findings directly address how the brain maintains effective prediction across varying sensory input rates. By isolating predictive activity during an identical penultimate tone and quantifying SHI, the study shows that predictive neural processes integrate a constant number of recent items, independent of presentation speed, consistent with an informational bottleneck. This flexible temporal scaling enables robust prediction under naturalistic rate variability. Spatially distinct frontal regions exhibited fixed-duration integration (temporal bottleneck), suggesting parallel SHI modes: one flexibly supporting prediction and another possibly serving temporary storage or other functions not directly predictive. The results link predictive computation to known cortical hierarchies of temporal receptive windows and intrinsic timescales, demonstrating that predictive signals emerge from slow, arrhythmic activity with extended integration. Correlations between neural SHI length and behavioral history dependence indicate that individual differences in neural integration windows are behaviorally meaningful.
This work demonstrates that neural activity integrates sensory history both over fixed amounts of information and fixed durations, with flexible, information-limited integration primarily underpinning prediction of upcoming input. Predictive SHI maintains a stable number of integrated items across a four-fold change in presentation speed, enabling robust predictions in dynamic, naturalistic environments. The study introduces a paradigm that disentangles predictive from instantaneous sensory processing for naturalistic temporal regularities and connects predictive computation to cortical timescale hierarchies. Future research should determine the limits of flexible scaling at extreme rates, identify anatomical generators via higher-precision modalities (e.g., intracranial recordings), and model single-trial behavior to clarify computational strategies (e.g., linear vs nonlinear heuristics) used by participants.
- Range of input rates: Flexible integration windows were tested within a limited range (150–600 ms per tone) and may break down at extreme presentation speeds.
- Source localization: Non-baseline-corrected accumulating activity reduced precision for source estimation; anatomical loci of predictive and SHI effects remain to be determined, motivating intracranial studies.
- Computational strategy: While neural SHI follows a weighted linear sum of past items, the exact behavioral prediction strategy was not fully modeled; future model-fitting/comparison on single-trial behavior is needed.
- Distinguishing adaptation: Although multiple analyses argue against adaptation as a sole explanation, a definitive dissociation would benefit from specific manipulations (e.g., controlling item frequency, omission responses).