Meta-learning of human motor adaptation via the dorsal premotor cortex
T. Sugiyama, S. Uehara, et al.
Meta-learning lets the brain learn how to learn. This study shows that the dorsal premotor cortex (PMd), not the dorsolateral prefrontal cortex (DLPFC), is crucial for regulating the rate and retention of motor adaptation to maximize rewards: PMd-targeted TMS impaired meta-learning and memory retention. Research conducted by Taisei Sugiyama, Shintaro Uehara, and Jun Izawa.
Introduction
The study investigates whether meta-learning of motor adaptation, previously shown to share computational principles with meta-learning in decision making, depends on prefrontal cortex structures (e.g., DLPFC) or on the dorsal premotor cortex (PMd). Prior work in decision making attributes meta-reinforcement learning to prefrontal networks, while motor adaptation engages a hierarchical motor stream (DLPFC → PMd → M1). Two hypotheses were tested: (1) DLPFC acts as the meta-learning site and sends control signals to premotor/motor areas; (2) PMd itself integrates reward and motor adaptation signals to implement meta-learning without requiring top-down signals from DLPFC. Establishing the neural locus matters for understanding and enhancing motor learning in practical domains such as rehabilitation.
Literature Review
- Meta-learning in decision-making is theorized to involve prefrontal networks including orbitofrontal cortex and anterior cingulate cortex, implementing hierarchical reinforcement learning (Wang et al., Silvetti et al., Hattori et al.).
- Motor adaptation exhibits meta-learning that flexibly regulates learning rate and memory retention based on learning–outcome structure (LOS) (Sugiyama et al., 2023).
- Hierarchical motor information processing across DLPFC, PMd, and M1 suggests more abstract control rostrally, with PMd positioned to integrate cognitive and motor signals.
- PMd is implicated in both motor adaptation (error sensitivity and retention) and reward processing, with cerebellar–premotor interactions modulating trial-by-trial learning; neurostimulation of PMd alters adaptation (Tzvi et al., Vyas et al.).
- DLPFC is associated with metacognition and explicit strategy regulation in visuomotor tasks, offering an alternative mechanism for meta-learning via explicit cognitive control (Liew et al.).
- These strands motivate contrasting PMd-centered versus DLPFC-centered meta-learning networks for motor adaptation.
Methodology
Design: Within-subject TMS experiment crossing two factors: TMS location (PMd vs. DLPFC) and Learning–Outcome Structure (LOS: Promote vs. Suppress), yielding four conditions (PMd-Promote, PMd-Suppress, DLPFC-Promote, DLPFC-Suppress). Participants completed up to four sessions in randomized order, with at least 3 days between sessions.
Participants: N=44 right-handed healthy adults (22 males; age 18–26, mean 21.5). Final datasets per condition: PMd-Promote (PMdProm) N=21, PMd-Suppress (PMdSupp) N=20, DLPFC-Promote (PFCProm) N=25, DLPFC-Suppress (PFCSupp) N=20.
Apparatus: Horizontal-plane custom manipulandum; visual stimuli presented via screen reflected in a mirror occluding the hand/forearm. Target positions sampled from seven directions (−15°, −10°, −5°, 0°, 5°, 10°, 15°).
Task structure: Two phases repeated across four blocks.
- Meta-learning training: Six cycles per block; each cycle had 10 Null trials (washout) and five ER pairs. In each pair, the E trial applied a +5° visuomotor rotation with a task-error clamp, and the subsequent R trial omitted the cursor to measure the aftereffect y and deliver a monetary score. The score was capped at a maximum of 0, and 1 point = 1 JPY. The score functions implement the LOS:
• Promote: score = S*y + s0 with [S, s0] = [1, −5]; a larger aftereffect reduces punishment (less negative score), encouraging more learning.
• Suppress: score = S*y + s0 with [S, s0] = [−1, 0]; a larger aftereffect increases punishment, discouraging learning.
- Probe (no score, no TMS): Learning phase with step perturbation (+7°) for 10 trials following 5 no-rotation trials; Retention phase with 15 error-clamp trials (cursor fixed to target) to assess memory decay independent of error.
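The two score rules in the training phase can be written out as a short sketch. The names `score` and `LOS_PARAMS` are illustrative and not taken from the authors' code; the sketch assumes the aftereffect y is expressed in degrees and applies the stated cap at 0.

```python
# Illustrative sketch of the Promote/Suppress score rules (not the authors' code).
# Assumption: y is the aftereffect measured on the R trial, in degrees.

LOS_PARAMS = {
    "promote": (1.0, -5.0),   # [S, s0] = [1, -5]
    "suppress": (-1.0, 0.0),  # [S, s0] = [-1, 0]
}


def score(y: float, los: str) -> float:
    """Return S*y + s0 for the given LOS, capped at the maximum score of 0."""
    S, s0 = LOS_PARAMS[los]
    return min(0.0, S * y + s0)


if __name__ == "__main__":
    # Compare the two rules for a few hypothetical aftereffects.
    for y in (0.0, 2.5, 5.0):
        print(y, score(y, "promote"), score(y, "suppress"))
```

Under this sketch, an aftereffect equal to the full 5° rotation reaches the maximum score of 0 in Promote and costs 5 points in Suppress, which is the opposing incentive the two structures are meant to create.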
Block schedule: Block 1 included Probe–Training–Probe, with score removed in R trials of Block 1 to measure baseline. Blocks 2–4 had Training then Probe; each Probe preceded by 30 Null trials and followed by a 1-minute break.
TMS: Single-pulse TMS delivered at go-signal onset in E trials (and every odd Null trial to avoid cueing rotation). Intensity: 90% resting motor threshold (FDI muscle). PMd site: 2 cm anterior, 1 cm medial to FDI hotspot. DLPFC site: F3 (10–20 EEG system). Left hemisphere stimulation (contralateral to right hand).
Instructions: Participants informed of potential manipulations and TMS, instructed to reach to targets accurately and ignore stimulation; monetary loss feedback explained. Post-task questionnaires verified no alternative aiming and lack of side effects.
Analysis: Linear mixed-effects models (R 4.3.2; lme4, lmerTest) estimated the slopes of learning/retention curves and their change over blocks (Δslope), with random intercepts for participant and LOS. Learning rate was approximated by the initial slope in the Probe’s learning phase, and retention rate by the slope of decay in the retention phase. Three contrast codings were used: (1) condition-wise Δslope; (2) meta-learning effects (Promote − Suppress) within each location; (3) the difference in meta-learning effects between locations (PMd − DLPFC). Score improvement was analyzed via block-wise average scores with random effects for LOS and participant. Bias corrections were applied to reach direction at 5 cm; the significance threshold was P<0.05, two-sided.
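As a rough illustration of the slope-based measures, the sketch below fits an ordinary least-squares line to a curve and takes the change in slope between two blocks. This is a deliberate simplification: the study used linear mixed-effects models in R, whereas this sketch ignores random effects entirely. The function names, the first-versus-last-block definition of the change, and the data values are all hypothetical.

```python
# Simplified sketch of a slope and its change across blocks.
# Assumption: plain least squares on one curve, with no random effects,
# so this is NOT the mixed-effects model used in the study.

def ols_slope(y):
    """Least-squares slope of y against trial index 0..n-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def delta_slope(first_block, last_block):
    """Change in fitted slope from the first block to the last block."""
    return ols_slope(last_block) - ols_slope(first_block)


# Hypothetical reach directions (degrees) over five trials in two blocks.
block1 = [0.0, 0.5, 1.0, 1.5, 2.0]
block4 = [0.0, 1.0, 2.0, 3.0, 4.0]

if __name__ == "__main__":
    print(ols_slope(block1), ols_slope(block4), delta_slope(block1, block4))
```

A positive change in a learning curve's slope would correspond to faster adaptation in later blocks; for a retention curve, a less negative slope would correspond to slower decay.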
Key Findings
- Meta-learning training (ER trial learning curves):
• Condition-wise Δslope: PMdProm µ=0.05 (95% CI [0.02, 0.08], P=0.001); PFCProm µ=0.16 ([0.13, 0.19], P<1e−25); PMdSupp µ=0.009 ([−0.02, 0.04], P=0.61); PFCSupp µ=−0.03 ([−0.06, 0.002], P=0.07).
• Meta-learning effects (Promote − Suppress): PMd µ=0.044 ([0.006, 0.081], P=0.02); DLPFC µ=0.193 ([0.155, 0.231], P<1e−22).
• Location contrast (PMd − DLPFC): µ=−0.15 ([−0.20, −0.10], P<1e−7), indicating stronger meta-learning under DLPFC TMS and attenuated meta-learning under PMd TMS.
- Score performance over blocks:
• Improvement (less negative scores) under DLPFC TMS: PFCProm µ=0.19 ([0.06, 0.33], P=0.004); PFCSupp µ=0.22 ([0.07, 0.37], P=0.004).
• PMd conditions: PMdProm µ=−0.03 ([−0.17, 0.11], P=0.67); PMdSupp µ=0.13 ([−0.01, 0.28], P=0.08). Consistent with attenuated meta-learning under PMd TMS.
- Probe—Learning phase (error sensitivity):
• Condition-wise Δslope: PFCProm µ=0.50 ([0.09, 0.92], P=0.02); PMdProm µ=0.32 ([−0.13, 0.76], P=0.16); PMdSupp µ=0.41 ([−0.04, 0.85], P=0.08); PFCSupp µ=0.30 ([−0.15, 0.75], P=0.19).
• Meta-learning effects (Promote − Suppress): PMd µ=−0.09 ([−0.62, 0.43], P=0.73); DLPFC µ=0.20 ([−0.30, 0.71], P=0.43). Location contrast NS (µ=−0.29, [−1.03, 0.44], P=0.43).
- Probe—Retention phase (retention rate of motor memory):
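• Condition-wise Δslope: PFCProm µ=0.03 ([0.02, 0.05], P<1e−5); PMdProm µ=−0.02 ([−0.03, −0.008], P=0.003); PMdSupp µ=−0.03 ([−0.04, −0.01], P=0.0006); PFCSupp not significant (the listed µ≈0.30 and P=0.19 match the learning-phase PFCSupp values and are inconsistent with the DLPFC Promote − Suppress effect of 0.034, so the retention estimate is left unstated).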
• Meta-learning effects (Promote − Suppress): DLPFC µ=0.034 ([0.017, 0.051], P<1e−5); PMd µ=0.003 ([−0.014, 0.020], P=0.70).
• Location contrast (PMd − DLPFC): µ=−0.031 ([−0.055, −0.007], P=0.01), showing PMd TMS significantly attenuates meta-learning of retention.
Overall: TMS to PMd, but not DLPFC, impairs meta-learning in motor adaptation, particularly by attenuating meta-learning of memory retention. Motor adaptation per se remained intact; differences are not attributable to general TMS artifacts given active-site controls.
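The three contrast codings can be cross-checked with simple arithmetic on the condition-wise Δslope estimates reported for the training phase. The sketch below recomputes the Promote − Suppress effect at each site and the PMd − DLPFC contrast from those rounded point estimates. It is not the mixed-model contrast itself, so small rounding differences from the reported values (0.044, 0.193, −0.15) are expected, and the names used are illustrative.

```python
# Cross-check of the contrast arithmetic using the reported training-phase
# condition-wise delta-slope point estimates (rounded values from the summary).
# Assumption: simple differences of point estimates, not model-based contrasts.

delta_slope_est = {
    ("PMd", "Promote"): 0.05,
    ("PMd", "Suppress"): 0.009,
    ("DLPFC", "Promote"): 0.16,
    ("DLPFC", "Suppress"): -0.03,
}


def meta_learning_effect(site):
    """Promote minus Suppress delta-slope within one stimulation site."""
    return delta_slope_est[(site, "Promote")] - delta_slope_est[(site, "Suppress")]


def location_contrast():
    """Difference of meta-learning effects between sites (PMd minus DLPFC)."""
    return meta_learning_effect("PMd") - meta_learning_effect("DLPFC")


if __name__ == "__main__":
    print(round(meta_learning_effect("PMd"), 3))    # 0.041
    print(round(meta_learning_effect("DLPFC"), 3))  # 0.19
    print(round(location_contrast(), 3))            # -0.149
```

The recomputed values agree with the reported contrasts to within rounding, with the negative location contrast reflecting the weaker Promote − Suppress separation under PMd TMS.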
Discussion
Findings support a PMd-centric meta-learning network for motor adaptation, distinct from prefrontal meta-learning networks in decision making. PMd integrates reward and motor learning signals to regulate adaptation parameters according to LOS, with TMS disrupting goal-directed planning that upregulates learning in Promote and downregulates it in Suppress. Probe analyses indicate selective attenuation of meta-learning for retention rate under PMd TMS, with inconclusive effects on error sensitivity. This selectivity aligns with proposed dissociations: cerebellum mediates rapid error-driven updates, while cortical regions (including PMd/M1) store and regulate motor memories over longer timescales. The results argue that PMd acts beyond basic motor planning, serving as an information hub connecting metacognitive, reward, and motor systems to optimize long-term outcomes of adaptation. The explicit versus implicit nature of higher-layer meta-learning remains open; however, the PMd effects suggest at least a partially implicit process regulating retention.
Conclusion
The study demonstrates that meta-learning of motor adaptation relies on the dorsal premotor cortex rather than the dorsolateral prefrontal cortex. PMd TMS attenuates meta-learning, especially for retention of motor memory, revealing a motor-specific meta-learning network that integrates reinforcement learning with motor adaptation. Contributions include identifying PMd as central to regulating adaptation parameters based on learning–outcome structure and distinguishing motor meta-learning from prefrontal-driven meta-learning in decision making. Future work should delineate the roles of cerebellum and cortical areas in meta-learning of error sensitivity versus retention, clarify the explicit/implicit components of meta-learning, and explore translational applications in rehabilitation (e.g., stroke) where optimizing retention may improve functional outcomes.
Limitations
- No sham TMS condition; instead, active-site controls (PMd vs. DLPFC) were used to mitigate nonspecific TMS artifacts and sensory/placebo effects, which may not capture all confounds.
- Selectivity for retention over error sensitivity is suggested but not conclusively established due to non-significant effects in the learning phase.
- Linear slope approximations may oversimplify non-linear learning/retention dynamics; reach direction measures include small biases despite correction.
- Within-subject design with session randomization reduces variability, but dropout and order/carryover effects—though analyzed—could still influence estimates.
- DLPFC targeting via F3 may not optimally engage specific subregions implicated in metacognition (e.g., anterior PFC), potentially underestimating DLPFC contributions.