Introduction
The involvement of the motor system in speech perception is a topic of ongoing debate. While temporal lobe pathways are primarily responsible for speech perception, evidence suggests that frontal lobe motor areas, typically involved in speech production, are also active during speech perception. However, the exact role of these motor areas remains unclear: some researchers suggest a minor supporting role, while others propose an essential one. Several hypotheses have been offered to explain motor area activation during speech perception. One holds that articulatory motor plans help solve the "lack of invariance" problem by using representations of motor plans to constrain the interpretation of incoming information. Alternatively, some suggest that motor activity reflects domain-general processes such as attention or decision-making. This study aims to distinguish between these competing hypotheses using a behavioral task combined with EEG to investigate four key stimulus features that may influence motor activity during speech perception: 1) whether the stimulus is speech or a non-speech sound; 2) whether the speech is lexical or sublexical; 3) whether the speech is auditory-only or audiovisual; and 4) whether the signal is easy or hard to perceive. The task was designed to temporally separate perception from decision-making, and it used large stimulus sets to prevent rehearsal or preparation of perceptuo-motor templates before stimulus presentation. This approach helps isolate motor processing specific to speech perception from general cognitive processes, which previous studies have not always separated effectively.
Literature Review
The dominant explanation for motor involvement in speech perception emphasizes articulatory motor plans. On this view, the motor system helps decipher ambiguous speech by using representations of motor plans to restrict the interpretation of incoming information. This model predicts that motor activity during perception will be specific to speech sounds and influenced by task context and stimulus characteristics. Studies of noisy or degraded speech often support this account, emphasizing the role of representations in motor areas such as primary motor cortex, premotor cortex, and Broca's area. However, alternative theories suggest that motor activity reflects domain-general processes such as attention or decision-making rather than speech-specific processing. Prior research on this question has yielded inconsistent results, partly because studies often examined isolated syllables or phonemes instead of whole words and conflated perception with decision-making. Few studies directly addressed speech specificity, leaving the role of motor areas in real-world speech comprehension, which typically involves whole words, ambiguous. The role of visual speech in motor engagement is also unclear, as is the interaction between ambiguity and signal content (auditory words/phonemes vs. audiovisual words vs. non-speech). Most previous studies have not adequately distinguished motor modeling from domain-general processes because they used tasks with small stimulus sets, which permit rehearsal or preparation of motor templates, and because they conflated motor engagement with attention and decision-making.
Methodology
This study employed a four-alternative forced-choice (4AFC) task combined with electroencephalography (EEG) to examine motor activity during speech perception. Participants (24 healthy right-handed adults) were presented with four types of stimuli: auditory-only words (AudWords), audiovisual words (AVWords), auditory-only phonemes (Phonemes), and non-speech environmental sounds (EnvSounds). An adaptive staircase procedure controlled difficulty (signal-to-noise ratio) across stimulus types, maintaining performance at approximately 80% correct for the easy level and 50% correct for the hard level. EEG data were collected using a 128-channel HydroCel Geodesic sensor net, digitized at 500 Hz, and preprocessed to remove artifacts. Independent component analysis (ICA) was used for source separation. The analysis focused on sensorimotor μ/beta (8-30 Hz) power suppression, a known indicator of increased motor activity. EEG activity was time-locked to stimulus onset, and the response period was separated from the perception window (at least 1000 ms between stimulus offset and the onset of the response options) to minimize contamination of perceptual processing by motor response preparation. K-means clustering on dipole locations identified left and right hemisphere sensorimotor regions for analysis. Time-frequency analysis examined μ/beta activity across all conditions to determine the effects of stimulus type and accuracy (collapsed across difficulty) on motor engagement. The effects of difficulty (using only correct trials) on μ/beta activity were also analyzed. Both analyses used linear mixed-effects models with subject and independent component (IC) as random effects. One-sample tests compared activity against baseline for each condition.
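As an illustration of how an adaptive staircase can hold performance near a chosen accuracy, the sketch below simulates a weighted up-down rule on signal-to-noise ratio. The summary does not say which staircase rule the study used, so the rule itself, the step sizes, and the simulated listener (the hypothetical `p_correct` function with its midpoint and slope) are all assumptions made for illustration only.

```python
import math
import random


def step_up_size(step_down, target):
    """Step-up size (dB) that makes a weighted up-down rule settle at `target`.

    At equilibrium the expected change is zero:
    target * step_down == (1 - target) * step_up.
    """
    return step_down * target / (1.0 - target)


def update_snr(snr, correct, target, step_down=0.5):
    """Lower the SNR (harder) after a correct trial, raise it (easier) after an error."""
    if correct:
        return snr - step_down
    return snr + step_up_size(step_down, target)


def p_correct(snr, midpoint=-5.0, slope=2.0, chance=0.25):
    """Hypothetical listener: logistic psychometric function with a 4AFC chance floor."""
    return chance + (1.0 - chance) / (1.0 + math.exp(-(snr - midpoint) / slope))


def run_staircase(target, n_trials=2000, start_snr=0.0, seed=0):
    """Simulate one staircase and return the proportion of correct trials."""
    rng = random.Random(seed)
    snr, n_correct = start_snr, 0
    for _ in range(n_trials):
        correct = rng.random() < p_correct(snr)
        n_correct += correct
        snr = update_snr(snr, correct, target)
    return n_correct / n_trials


if __name__ == "__main__":
    # Targets taken from the summary: about 80% (easy) and 50% (hard) correct.
    for label, target in (("easy", 0.8), ("hard", 0.5)):
        print(label, round(run_staircase(target), 3))
```

With a target of 0.8 the step up is four times the step down, so one error offsets four correct responses. With a target of 0.5 the two steps are equal.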
Key Findings
Behavioral results confirmed that the adaptive staircase procedure effectively controlled difficulty across stimulus types. EEG analysis revealed significant left-lateralized sensorimotor μ/beta suppression (increased motor activity) during speech perception that was absent for environmental sounds. For audiovisual words and phonemes, greater μ/beta suppression was associated with correct trials, suggesting that motor activity aids accurate perception of these stimuli. In contrast, for auditory-only words, greater μ/beta suppression was associated with incorrect trials, suggesting that motor processing may be detrimental to accurate perception of these stimuli. These differences reflect a significant interaction between stimulus type and accuracy. Analysis of correct trials only revealed greater μ/beta suppression (motor activity) for audiovisual words and phonemes than for auditory-only words, regardless of difficulty level. Environmental sounds showed minimal μ/beta modulation. One-sample tests confirmed no significant μ/beta suppression during the perception of environmental sounds, while significant modulation was present for all other stimulus types.
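To make "suppression relative to baseline" concrete, the sketch below expresses band power as a decibel change from a pre-stimulus baseline and computes a one-sample t statistic against zero. The summary does not state the units or which one-sample test was used, so the dB scaling and the t-test are assumptions, and the power values are made up for illustration rather than taken from the study.

```python
import math
import statistics


def db_change(power, baseline_power):
    """Band power relative to baseline in dB; negative values indicate suppression."""
    return 10.0 * math.log10(power / baseline_power)


def one_sample_t(values, popmean=0.0):
    """One-sample t statistic testing whether the mean of `values` differs from `popmean`."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return (mean - popmean) / sem


if __name__ == "__main__":
    # Made-up (task power, baseline power) pairs, one per hypothetical component.
    pairs = [(0.6, 1.0), (0.8, 1.0), (0.7, 1.1), (0.9, 1.0)]
    changes = [db_change(p, b) for p, b in pairs]
    print([round(c, 2) for c in changes])
    print(round(one_sample_t(changes), 2))
```

A clearly negative t statistic would correspond to power reliably below baseline, which is the pattern the summary reports for speech stimuli but not for environmental sounds.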
Discussion
The findings support the hypothesis that motor involvement in speech perception is left-lateralized and specific to speech stimuli. The results indicate that the motor system is dynamically engaged to aid perception, but that its contribution varies with stimulus type. The lack of μ/beta modulation for environmental sounds suggests that motor activity is not simply a reflection of domain-general processes such as attention. The contrasting patterns for auditory-only words (enhanced suppression on incorrect trials) versus audiovisual words and phonemes (enhanced suppression on correct trials) suggest that the role of motor processing depends on lexicality and modality. The beneficial effect of motor engagement for phonemes and audiovisual words may reflect the use of internal phoneme or sublexical models to decode speech input. Visual speech information may facilitate access to these models, making motor processing more effective for audiovisual speech. The negative association between motor activity and accurate perception of auditory-only words suggests that relying on motor mechanisms is inefficient or counterproductive for lexical information that the ventral stream already processes efficiently. The absence of a clear relationship between ambiguity (easy vs. hard) and motor engagement may reflect a non-linear relationship, or it may indicate that effects of ambiguity reported in previous studies were confounded by their limited stimulus sets.
Conclusion
This study demonstrates that motor processing is selectively involved in speech perception and is not merely a byproduct of domain-general processes. Engagement of left hemisphere motor regions is associated with more accurate perception of phonemes and audiovisual words but not of auditory-only words. This suggests a flexible interactive network in which dorsal and ventral streams are engaged differentially, affecting perceptual accuracy depending on stimulus type. Future research should test the causal relationship between motor activity and perception and use techniques with higher spatial resolution to elucidate the exact mechanisms involved.
Limitations
The study's correlational nature prevents establishing causal relationships between motor activity and perception. The spatial resolution of EEG source localization may limit precise identification of the specific motor areas involved. Furthermore, the variability of the stimuli might have influenced the analysis of the temporal characteristics of μ/beta suppression. Future studies using fMRI, TMS, or multivariate pattern analysis could address these limitations.