Psychology
Opposing neural processing modes alternate rhythmically during sustained auditory attention
F. H. Kasten, Q. Busson, et al.
Sustaining attention is vital for everyday activities yet is prone to lapses that can impair cognition and lead to errors. Such lapses are prevalent across healthy individuals and are characteristic of several neurological and psychiatric disorders. A proposed account posits dynamic allocation of resources between processing external sensory input and internal processes (e.g., memory consolidation or mind-wandering), yielding modes of "external" versus "internal" attention with distinct neural signatures. Alpha (α) oscillations (~10 Hz) have been linked to inhibition/suppression of sensory processing and attentional selection, whereas neural entrainment—phase alignment of neural activity to rhythmic stimuli—indexes active sensory processing. Prior work frequently examined these markers separately, and trial-based paradigms are ill-suited to capture slow fluctuations in sustained attention. In macaque auditory cortex, bouts of α activity (internal attention) alternated rhythmically with periods of strong entrainment to rhythmic sounds (external attention) at ~0.06 Hz (~16 s), modulating neuronal firing and behavior. It remains unclear whether humans exhibit analogous rhythmic alternation of attentional modes during continuous, ecologically relevant auditory processing like speech, whether such alternations show temporal regularity, and which networks are involved. Here, using human EEG during sustained attention to rhythmic speech, we tested the hypothesis that neural entrainment to speech and spontaneous α-oscillations show slow rhythmic fluctuations near ~0.06–0.07 Hz (range 0.02–0.2 Hz) and are anti-phased, reflecting alternating external and internal attentional modes.
Research has linked α-oscillations to suppression of sensory input and attentional (de-)selection, consistent with internal attention states. Neural entrainment to rhythmic auditory input reflects active processing and supports intelligibility. However, relations between α and entrainment have been sparsely examined, often in trial-based or spatial attention paradigms that may obscure slow dynamics in sustained attention. A key study in macaque auditory cortex demonstrated rhythmic alternation (~0.06 Hz) between entrainment-dominant and α-dominant processing, with α modulating neuronal firing and reducing responses to stimuli during internal attention. Human studies have shown entrainment to speech rhythms and associations between α power and attentional lapses, but whether slow, regular anti-phased dynamics between α and entrainment occur during sustained speech attention was unknown. The network-level substrates (e.g., dorsal attention network, default mode network, speech/auditory networks) underlying such alternations in humans also remained to be established.
Participants: Twenty-three healthy adults (mean age 22.4 ± 1.6 years; 15 females) gave informed consent under an ethics-approved protocol (CPP Ouest II Angers, protocol CPP 21.01.22.71950/2021-A00131-40).
Experimental design and task: Across six 5-min blocks, participants listened to continuous streams of rhythmic, monosyllabic French words presented at 3 Hz. Blocks alternated between eyes-open (fixation cross) and eyes-closed conditions (3 blocks each; order randomized). Participants detected off-rhythm targets (words shifted by ±80 ms relative to the 3 Hz rhythm) via button press.
Stimuli and apparatus: A corpus of 474 monosyllabic words was recorded to a 2 Hz metronome to align p-centers and time-compressed to 3 Hz using PSOLA (Praat). The words were spectrally degraded with 16-channel noise-vocoding but remained intelligible. Words were concatenated into 5-min streams (≈900 words). Each block contained 25 targets (50% early, 50% late; 150 total across the experiment). Audio was presented via ER-2 pneumatic in-ear headphones in a dim booth; stimuli were generated in MATLAB/Psychtoolbox and played through an RME Fireface UCX soundcard.
EEG acquisition and preprocessing: EEG (64-channel BioSemi ActiveTwo, 10–10 system) was recorded at 2048 Hz. Data were re-referenced to the common average, resampled to 256 Hz, bandpass filtered at 1–40 Hz (two-pass 4th-order zero-phase Butterworth), and cleaned via ICA (blink, movement, cardiac, and muscle artifacts removed).
Time-resolved measures: EEG was epoched into overlapping 2 s segments centered on each word’s p-center (±1 s), each comprising seven p-centers. Each segment was Fourier transformed (Hanning taper, 2 s zero-padding) to obtain complex spectra. Within sliding windows of 15 adjacent segments (~5 s; step of 1 segment), inter-trial coherence (ITC) at 3 Hz indexed neural entrainment; α-power (8–12 Hz) was averaged across segments within the same window to match the smoothing. This produced time series of ITC and α-power per channel.
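The time-resolved measures described above can be sketched roughly as follows. This is a minimal illustration for one channel, not the authors' pipeline: the function name `sliding_itc_and_alpha`, the reading of "2 s zero-padding" as 2 s of appended zeros, and the synthetic demo signals are all assumptions.

```python
import numpy as np

def sliding_itc_and_alpha(segments, fs, stim_freq=3.0, alpha_band=(8.0, 12.0),
                          win=15, pad_s=2.0):
    """Sliding-window ITC at stim_freq and mean alpha power for one channel.

    segments: array (n_segments, n_samples), one 2 s epoch per word.
    Returns two arrays of length n_segments - win + 1.
    """
    segments = np.asarray(segments, dtype=float)
    n_seg, n_samp = segments.shape
    # Assumption: zero-padding means appending pad_s seconds of zeros.
    n_fft = n_samp + int(round(pad_s * fs))
    spec = np.fft.rfft(segments * np.hanning(n_samp), n=n_fft, axis=1)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

    # Unit-length phase vectors at the stimulation frequency.
    k = int(np.argmin(np.abs(freqs - stim_freq)))
    coef = spec[:, k]
    unit = coef / np.maximum(np.abs(coef), 1e-12)

    # Per-segment alpha power, averaged over the band.
    band = (freqs >= alpha_band[0]) & (freqs <= alpha_band[1])
    alpha_seg = np.mean(np.abs(spec[:, band]) ** 2, axis=1)

    # Moving average over `win` adjacent segments (step of one segment).
    kernel = np.ones(win) / win
    itc = np.abs(np.convolve(unit, kernel, mode="valid"))
    alpha = np.convolve(alpha_seg, kernel, mode="valid")
    return itc, alpha

# Synthetic demo: phase-locked versus random-phase 3 Hz activity.
rng = np.random.default_rng(0)
fs, n_seg = 256, 60
t = np.arange(2 * fs) / fs
noise = lambda: 0.1 * rng.standard_normal((n_seg, t.size))
locked = np.sin(2 * np.pi * 3 * t) + noise()
jitter = rng.uniform(0, 2 * np.pi, (n_seg, 1))
random_phase = np.sin(2 * np.pi * 3 * t + jitter) + noise()

itc_locked, _ = sliding_itc_and_alpha(locked, fs)
itc_random, _ = sliding_itc_and_alpha(random_phase, fs)
print("mean ITC, phase-locked:", itc_locked.mean())
print("mean ITC, random phase:", itc_random.mean())
```

Averaging α-power with the same 15-segment kernel gives both time series the same temporal smoothing, which is the stated purpose of the matched window.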
Slow fluctuation analysis: ITC and α-power time series were z-scored, segmented into 100 s windows with 90% overlap (60 segments across the three blocks per condition), and Fourier transformed (Hanning taper, 400 s zero-padding) to resolve frequencies below 0.2 Hz. Spectra were corrected for the aperiodic 1/f component using FOOOF to identify rhythmic peaks. For both α and ITC, the individual peak frequency closest to 0.07 Hz (within 0.04–0.1 Hz) was selected. Phases of the α and ITC envelopes at each participant’s α envelope peak frequency were extracted per segment; phase differences were averaged per subject and condition.
Coupling statistics: Sensor-level coupling between α and ITC envelope phases was assessed with Rayleigh tests for non-uniformity across subjects at each channel, followed by cluster-based permutation (10,000 shuffles) to identify significant channel clusters. The phase relation was assessed with circular one-sample tests (circ_mtest) against 0 and evaluated for anti-phase (CIs covering ±π).
Eye-closure effect: The above analyses were run separately for the eyes-open and eyes-closed conditions.
Behavioral controls: Distributions of inter-target, inter-hit, and inter-miss intervals were examined (converted to rates in Hz) to assess their correspondence with ~0.07 Hz. Pearson correlations tested relationships between individuals’ envelope peak frequencies and inter-event rates.
Source analysis: Dynamic imaging of coherent sources (DICS beamforming) was applied at individual envelope peak frequencies to project complex coefficients from each 100 s segment onto a cortical surface grid (20,484 sources) using a standard BEM head model and MNI electrode positions. A neural activity index (power divided by a noise estimate) was computed. Sources were parcellated into 360 HCP-MMP1 ROIs; within-ROI phases for the α and ITC envelopes were averaged across sources and segments.
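The windowed phase extraction and the Rayleigh test described above can be illustrated with a short sketch. Several things here are assumptions rather than details from the study: the envelope sampling rate of 3 Hz (one value per word), the reading of "400 s zero-padding" as padding each window to a total of 400 s, the helper names, and the simulated participants. The Rayleigh p-value uses a standard closed-form approximation; cluster-based permutation and FOOOF peak selection are not shown.

```python
import numpy as np

def windowed_phase(x, fs, freq, win_s=100.0, overlap=0.9, pad_s=400.0):
    """Phase of x at `freq` in overlapping, Hanning-tapered windows."""
    x = np.asarray(x, dtype=float)
    n_win = int(round(win_s * fs))
    step = max(1, int(round(n_win * (1.0 - overlap))))
    n_fft = int(round(pad_s * fs))  # assumption: pad to pad_s in total
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))
    taper = np.hanning(n_win)
    phases = []
    for start in range(0, len(x) - n_win + 1, step):
        seg = x[start:start + n_win]
        seg = (seg - seg.mean()) * taper
        phases.append(np.angle(np.fft.rfft(seg, n=n_fft)[k]))
    return np.array(phases)

def circ_mean(angles):
    """Circular mean direction in radians, in (-pi, pi]."""
    return float(np.angle(np.mean(np.exp(1j * np.asarray(angles)))))

def rayleigh_test(angles):
    """Rayleigh test for non-uniformity; returns (z, approximate p)."""
    angles = np.asarray(angles, dtype=float)
    n = angles.size
    r = np.abs(np.mean(np.exp(1j * angles)))  # mean resultant length
    z = n * r ** 2
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n ** 2 - (n * r) ** 2)) - (1 + 2 * n))
    return float(z), float(min(p, 1.0))

# Simulated data: 23 hypothetical participants with anti-phased envelopes.
rng = np.random.default_rng(1)
fs_env, f_slow = 3.0, 0.07
t = np.arange(int(300 * fs_env)) / fs_env
subject_means = []
for _ in range(23):
    phi = rng.uniform(0, 2 * np.pi)
    alpha_env = np.cos(2 * np.pi * f_slow * t + phi) + 0.5 * rng.standard_normal(t.size)
    itc_env = -np.cos(2 * np.pi * f_slow * t + phi) + 0.5 * rng.standard_normal(t.size)
    diff = windowed_phase(alpha_env, fs_env, f_slow) - windowed_phase(itc_env, fs_env, f_slow)
    subject_means.append(circ_mean(diff))  # average per participant

z, p = rayleigh_test(subject_means)
print("group mean phase difference (rad):", circ_mean(subject_means))
print("Rayleigh z:", z, "p:", p)
```

The absolute phase in each window depends on where the window starts, but the α-minus-ITC difference does not, which is why the differences rather than the raw phases are averaged.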
Coupling between ROI-wise α and ITC phases across subjects was tested for all ROI pairs with permutation-based cluster statistics (alpha level of 0.01, chosen for sparsity), identifying clusters that link α in one set of ROIs to ITC in another.
Target-related phase analysis: α and ITC envelopes were bandpass filtered around each participant’s α envelope peak frequency (±0.02 Hz) using a causal 6th-order one-pass Butterworth filter. Envelopes were epoched from −14 to +7 s around targets. Instantaneous phase (Hilbert transform) was averaged over trials separately for hits and misses, and time-resolved phase differences (hits − misses) were computed from −14 to −2.5 s (to avoid smearing from post-target responses). Rayleigh tests assessed clustering; circular tests evaluated deviation from 0 and anti-phase.
Additional controls: Coupling between 3 Hz ITC and its harmonics within the α band (9, 12 Hz) was tested to exclude harmonic-driven artifacts.
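The target-related phase analysis can be sketched as below, under several assumptions: the envelope is sampled at 3 Hz, the function name `pretarget_phase` and the synthetic hit and miss times are invented for illustration, and the filter order convention is a guess. SciPy's `butter` doubles the order for bandpass designs, so `N=3` is used here to obtain a 6th-order bandpass; the summary does not say which convention the study followed.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def pretarget_phase(envelope, fs, peak_freq, target_times,
                    half_bw=0.02, order=3, t_pre=14.0, t_post=7.0):
    """Trial-averaged instantaneous phase around targets.

    Returns (times, mean_phase); times are in seconds relative to the target.
    """
    # Causal, one-pass bandpass around the individual peak frequency.
    sos = butter(order, [peak_freq - half_bw, peak_freq + half_bw],
                 btype="bandpass", fs=fs, output="sos")
    filtered = sosfilt(sos, np.asarray(envelope, dtype=float))
    phase = np.angle(hilbert(filtered))

    n_pre, n_post = int(round(t_pre * fs)), int(round(t_post * fs))
    epochs = []
    for tt in target_times:
        c = int(round(tt * fs))
        if c - n_pre >= 0 and c + n_post < phase.size:
            epochs.append(phase[c - n_pre:c + n_post + 1])
    if not epochs:
        raise ValueError("no target leaves room for a full epoch")
    # Circular mean across trials at each time point.
    mean_phase = np.angle(np.mean(np.exp(1j * np.array(epochs)), axis=0))
    times = np.arange(-n_pre, n_post + 1) / fs
    return times, mean_phase

# Synthetic demo: hits at envelope peaks, misses at envelope troughs.
fs_env, f_slow = 3.0, 0.07
t = np.arange(int(600 * fs_env)) / fs_env
env = np.cos(2 * np.pi * f_slow * t)
hits = [k / f_slow for k in range(8, 35)]
misses = [(k + 0.5) / f_slow for k in range(8, 35)]

times, ph_hit = pretarget_phase(env, fs_env, f_slow, hits)
_, ph_miss = pretarget_phase(env, fs_env, f_slow, misses)
diff = np.angle(np.exp(1j * (ph_hit - ph_miss)))  # wrapped hits - misses
pre = times <= -2.5  # analysis window ends 2.5 s before the target
print("mean |phase difference| before targets (rad):", np.abs(diff[pre]).mean())
```

A causal filter delays the signal, but the delay is identical for hit and miss epochs, so it cancels in the hits-minus-misses difference while keeping post-target activity from leaking backward in time.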
- Behavior: Hit rates were similar across conditions (eyes-open: 41.22% ± 13.80; eyes-closed: 42.96% ± 15.56; t22 = −0.94, p = 0.35). False alarms were low (eyes-open: 0.91% ± 1.06; eyes-closed: 0.83% ± 1.00; t22 = 1.11, p = 0.27). Reaction times did not differ (eyes-open: 777 ± 122 ms; eyes-closed: 738 ± 96 ms; t22 = 1.62, p = 0.12).
- Slow rhythmic fluctuations: Both α-power and speech-entrained ITC envelopes exhibited prominent peaks around ~0.07 Hz (group: Mα = 0.0713 Hz ± 0.0126; MITC = 0.0710 Hz ± 0.0116). Individual α and ITC peak frequencies did not differ (t22 = 0.08, p = 0.93).
- Anti-phasic coupling (eyes-open): Significant coupling between α and ITC envelope phases was found in a fronto-central sensor cluster (cluster-based Rayleigh: Pcluster = 0.037). The mean phase difference was near anti-phase (Mangle = −3.04 rad), and its 99% CI included ±π. A more distributed cluster emerged when each channel's α envelope was tested against entrainment in the frontal cluster (Pcluster = 0.042), with Mangle = 3.13 rad (p < 0.01) and a 99% CI that included ±π. Fourteen of 23 participants showed significant coupling within the identified cluster; for 15 of 23, the mean phase difference differed from 0.
- State dependence (eyes-closed): The ~0.07 Hz peaks were less pronounced; no significant coupling was detected (Pcluster > 0.68), and the phase relation did not differ from 0 (Mangle = 2.22 rad; p > 0.05).
- Control for stimulus timing: Distributions of inter-target, inter-hit, and inter-miss rates showed no bias around 0.07 Hz. Individual α envelope frequencies were uncorrelated with inter-target (r = 0.04, p = 0.84), inter-hit (r = −0.01, p = 0.95), and inter-miss rates (r = 0.14, p = 0.53). ITC envelope frequencies similarly showed no significant correlations (targets: r = 0.35, p = 0.10; hits: r = −0.17, p = 0.43; misses: r = 0.27, p = 0.20).
- Source-level coupling: Three significant clusters linked α-power fluctuations to entrainment fluctuations across ROIs (Pcluster1 = 0.0026; Pcluster2 = 0.004; Pcluster3 = 0.006), including connections from parietal (α) to right temporal (entrainment), right frontal (α) to left temporal (entrainment), and parietal (α) to left frontal (entrainment). Entrainment-related regions aligned with auditory/language networks; α-related regions included superior parietal, posterior cingulate, right IFG, among others associated with dorsal/ventral attention and default mode networks.
- Behavioral relevance: Pre-target α and ITC envelope phases differed between hits and misses. For α, phase differences (hits − misses) were clustered (Rayleigh p = 0.009, z = 4.61) with mean angle −2.87 rad (CI99 = 2.42, −1.88), indicating anti-phase. For ITC, clustering was weaker but significant (Rayleigh p = 0.049, z = 2.07) with mean angle 2.44 rad (CI95 = 1.52, −2.93), consistent with anti-phase.
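The confidence intervals reported above have a lower bound that is numerically larger than the upper bound because they wrap through ±π on the circle. The short check below makes that explicit. It assumes each interval runs counter-clockwise from its first to its second value, which is an interpretation of the reported numbers, and `arc_contains` is a hypothetical helper name.

```python
import math

def arc_contains(lo, hi, angle):
    """True if `angle` lies on the counter-clockwise arc from lo to hi (radians)."""
    two_pi = 2.0 * math.pi
    span = (hi - lo) % two_pi       # arc length from lo to hi
    offset = (angle - lo) % two_pi  # distance from lo to angle along the arc
    return offset <= span

# Intervals and mean angles as reported in the behavioral-relevance result.
reported = {
    "alpha (CI99)": {"ci": (2.42, -1.88), "mean": -2.87},
    "ITC (CI95)": {"ci": (1.52, -2.93), "mean": 2.44},
}

for name, r in reported.items():
    lo, hi = r["ci"]
    print(name,
          "| contains pi:", arc_contains(lo, hi, math.pi),
          "| contains 0:", arc_contains(lo, hi, 0.0),
          "| contains mean:", arc_contains(lo, hi, r["mean"]))
```

Under this reading both intervals cover π and exclude 0, which matches the anti-phase interpretation given in the text.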
The study demonstrates that during sustained attention to rhythmic speech, neural signatures of external (entrainment) and internal (α-oscillations) attention alternate rhythmically at ultra-slow frequencies (~0.07 Hz), closely mirroring findings in non-human primates. The robust anti-phasic coupling indicates opposing processing modes: periods of heightened entrainment correspond to reduced α-power (external attention), whereas elevated α-power aligns with diminished entrainment (internal attention). These dynamics were behaviorally relevant, as pre-target phases of α and entrainment distinguished detected from missed targets, supporting the functional role of these modes in shaping perceptual outcomes. Source analyses suggest that the observed alternations arise from interactions between distinct cortical networks: an auditory–language network exhibiting entrainment fluctuations and a fronto-parietal/default mode network expressing α-power fluctuations, consistent with top-down regulation of sensory processing via inhibitory α mechanisms. The absence of significant anti-phase coupling with eyes closed may reflect dominance of visual α overshadowing auditory/parietal α, or a qualitative shift in processing mode when visual input is blocked. Control analyses argue against stimulus- or target-timing as drivers of the ultra-slow rhythms. Instead, the data support intrinsically generated oscillatory dynamics of attention at multiple timescales, extending known theta/α rhythmic sampling into the infra-slow range. The rhythmic alternation could serve to conserve resources by interleaving periods of focused external processing with internally oriented states. The alignment of timescales with known interoceptive rhythms (respiratory, cardiac, gastric) raises the possibility of cross-system coupling influencing attention. 
These insights bridge oscillatory mechanisms of attention with large-scale network dynamics and highlight the importance of continuous, rhythmically structured tasks to reveal slow attentional rhythms.
Human EEG reveals slow, periodic alternation between opposing neural processing modes during sustained auditory attention: neural entrainment to speech and α-oscillations fluctuate in anti-phase at ~0.07 Hz, predict performance, and arise from interactions between auditory–language regions and fronto-parietal/default mode networks. These findings indicate a conserved, intrinsic rhythm of attention that periodically shifts resources between external and internal processing. Implications include improved prediction of attentional lapses and novel state-dependent interventions (e.g., tACS targeting α or synchronizing with stimulus rhythms). Future work should test causality of network interactions, examine generalization across sensory modalities and tasks, resolve why effects diminish with eye closure, explore links to interoceptive rhythms, and determine optimal strategies to stabilize attention without sacrificing performance.
- Generalizability: The task involved rhythmic, noise-vocoded speech at a fixed 3 Hz rate; findings may not generalize to non-rhythmic stimuli or to other sensory modalities without further testing.
- Eye-closure effects: Mechanisms underlying the disappearance of anti-phase coupling with eyes closed are unresolved (potential overshadowing by visual α vs. qualitative state change).
- Causality: The study is correlational; causal roles of α oscillations and network interactions in regulating entrainment require interventional tests (e.g., brain stimulation).
- Source modeling: Standard head models and electrode positions were used; although ROI parcellation mitigates errors, source localization is less precise than with head models built from individual MRIs.
- Sample size and variability: N = 23; individual variability in peak frequencies exists. Larger samples could refine estimates of rhythm frequency and coupling strength.
- Temporal specificity: Analyses focused on a predefined ultra-slow range and individual peaks near 0.07 Hz; other relevant slow timescales may exist.