
A myoelectric digital twin for fast and realistic modelling in deep learning
K. Maksymenko, A. K. Clarke, et al.
The Myoelectric Digital Twin, developed by Kostiantyn Maksymenko and colleagues, is a model that simulates EMG signals so that large, annotated datasets can be generated for deep learning in human-machine interfaces.
Introduction
The study addresses the challenge of enabling robust, generalisable EMG-based human–machine interfaces without extensive subject-specific calibration. While traditional machine learning can classify simple hand gestures, it requires per-subject training and recalibration, limiting scalability. Deep learning promises improved performance but needs large, high-quality, annotated datasets across subjects, times, and electrode configurations—data that are difficult or impossible to obtain experimentally, especially with accurate ground truth of physiological variables (e.g., motor neuron spike times, fibre parameters, individual muscle forces). The authors propose a Myoelectric Digital Twin: a realistic and computationally efficient EMG simulator to generate large, perfectly annotated datasets for training deep learning models, potentially enabling new decoding tasks (e.g., force estimation, decomposition) and reducing or eliminating the need for experimental training data.
Literature Review
Previous EMG data augmentation relied on black-box generative methods that capture some signal features without physiological interpretability, limiting their utility for tasks requiring ground-truth physiological labels. Analytical biophysical models with simplified geometries (e.g., multilayer cylinders) capture broad EMG characteristics but cannot replicate subject-specific anatomy or electrode configurations. More realistic numerical models, which solve the quasi-static Maxwell/Poisson equations in generic volume conductors, are accurate but computationally prohibitive, often requiring hours per simulation for tens of thousands of fibres and a few electrodes. Reciprocity-based speed-ups have been explored but remain too slow for large-scale data augmentation. Thus, prior work either lacked anatomical realism and physiological labelling or was too slow to generate the volume and diversity of data required for deep learning.
Methodology
The authors develop a fast, anatomically realistic EMG simulation framework based on a reformulation of the forward volume-conductor problem under quasi-static assumptions. Key elements:
- Volume conductor modelling: Realistic anatomy (muscles, bones, fat, skin, electrodes) is discretised into a tetrahedral mesh. Conductivity is anisotropic in muscles and isotropic in other tissues. The forward problem solves ∇·(σ∇φ) = −ι (φ: electric potential, σ: conductivity tensor, ι: current source density) with Neumann boundary conditions using FEM (Galerkin, piecewise affine basis).
- Basis sources and adjoint method: Instead of solving the forward problem for every fibre and time sample, unit point sources are placed at mesh vertices in muscle subdomains (basis sources). The electrode potentials for arbitrary point sources are computed via barycentric interpolation of vertex solutions. Using the adjoint method, only n_e linear systems are solved, one per electrode (A K^T = S^T), to obtain K = S A^{-1}; because n_e ≪ number of basis sources, this drastically reduces computation. Basis solutions V_basis = K B are then reused across simulations.
- Single-fibre EMG: The transmembrane current per unit length is proportional to the second spatial derivative of the intracellular action potential (IAP), with propagation at velocity v from the neuromuscular junction and end-of-fibre effects modelled via windowing (e.g., Tukey). The EMG at electrodes is the spatial integral along the fibre of the point-source transfer function weighted by the current density, efficiently approximated by discretising the fibre path and using precomputed V_point via barycentric interpolation from V_basis.
- Motor unit (MU) modelling: Fibres are uniformly generated in a unit circle; MU centres and circular territories are defined, and fibres are probabilistically assigned to MUs based on MU density. This ensures uniform fibre distribution and allows rapid regeneration of MU distributions without recomputing fibre transfer functions. The 2D distribution is morphed to each muscle cross-section and mapped to 3D fibre trajectories.
- Recruitment and rate coding: MU recruitment follows the size principle with MU-specific thresholds; firing rates increase with excitation using linear/nonlinear rate coding. Inter-discharge intervals include variability around the mean.
- Software architecture: The pipeline separates steps to maximise reuse of precomputed data (e.g., changing fibre properties or MU distributions does not require recomputing the forward solver). Implementation uses Python with FEniCS for assembling/solving linear systems and CGAL for meshing.
- Computational complexity: The naive product complexity over vertices, electrodes, fibres, and time is decomposed into a sum over smaller products per pipeline module, enabling precomputation and rapid resimulation when only subsets of parameters change.
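The adjoint step in the list above can be illustrated with a small dense example. This is a minimal sketch, not the authors' FEniCS implementation: the matrix A is a random symmetric positive definite stand-in for the FEM system matrix (a real Neumann problem would also need a reference potential), and the sizes, the index choices, and the names `electrode_idx`, `source_idx` and `w` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_v = 200   # mesh vertices (toy size; the paper reports millions)
n_e = 4     # electrodes
n_b = 50    # basis sources (unit point sources at muscle vertices)

# Assumed stand-in for the FEM system matrix: symmetric positive definite.
M = rng.standard_normal((n_v, n_v))
A = M @ M.T + n_v * np.eye(n_v)

# S reads out the potential at the electrode vertices (n_e x n_v).
electrode_idx = rng.choice(n_v, size=n_e, replace=False)
S = np.zeros((n_e, n_v))
S[np.arange(n_e), electrode_idx] = 1.0

# B holds one unit source per column (n_v x n_b).
source_idx = rng.choice(n_v, size=n_b, replace=False)
B = np.zeros((n_v, n_b))
B[source_idx, np.arange(n_b)] = 1.0

# Direct approach: one linear solve per basis source (n_b right-hand sides).
V_direct = S @ np.linalg.solve(A, B)

# Adjoint approach: one linear solve per electrode (n_e right-hand sides).
# A K^T = S^T gives K = S A^{-1} because A is symmetric.
K = np.linalg.solve(A, S.T).T
V_basis = K @ B

# Both routes give the same electrode potentials for every basis source.
assert np.allclose(V_direct, V_basis)

# Barycentric interpolation: a point source inside a tetrahedron whose four
# vertices are basis sources 0..3, with barycentric weights w (sum to 1).
w = np.array([0.1, 0.2, 0.3, 0.4])
V_point = V_basis[:, :4] @ w   # potential at each electrode, shape (n_e,)
```

With n_e = 4 and n_b = 50 the adjoint route needs 4 solves instead of 50; the saving grows with the ratio of basis sources to electrodes.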
Key Findings
- Numerical-analytical agreement: In a four-layer cylindrical model, the numerical solution closely matched the analytical solution, with a normalised mean square error of 3% at 1 mm and 5% at 11 mm fibre depth. Waveforms were nearly indistinguishable.
- Realism of simulated signals: Single-fibre simulations in anatomically accurate forearm models reproduced known features in experimental sEMG: NMJ-related cancellation (differential channels straddling the NMJ), propagating components along fibre direction, and end-of-fibre effects at tendons.
- Single-muscle activation: Simulating Brachioradialis with 50,000 fibres across 200 MUs and 8 circular bipolar electrodes showed amplitude increases with excitation due to recruitment and rate coding; electrode proximity to active muscle produced higher amplitudes, reflecting volume-conductor effects.
- Multi-muscle task reproduction: Simulations of wrist flexion/extension with small constant abductor activation reproduced the qualitative spatio-temporal patterns observed experimentally across electrodes. Per-electrode RMS features matched experimental patterns reasonably well without parameter optimisation (better for extension than for flexion).
- Frequency-domain similarity: By running hundreds of simulations varying parameters within realistic ranges, parameter sets were found that minimised spectral differences with experimental signals, demonstrating feasibility of simple inverse modelling enabled by high simulation speed.
- Computational performance: For a 1-minute, 100% MVC Brachioradialis simulation (2.1M vertices, 13M tetrahedra; 16 rectangular + 16 circular electrodes; 50,000 fibres; 200 MUs; 2 kHz sampling):
• General forward solution (adjoint/basis): 7 min (~13 s per electrode)
• Fibre basis point evaluation: 2 min
• Fibre EMG responses: 30 s
• MUAP assembling: 0.8 s
• Raw sEMG assembling: 2.6 s
Subsequent simulations that reuse the volume conductor and fibre geometry (changing only fibre/MU/recruitment parameters) took ~33.4 s in total.
Compared to state-of-the-art numerical models requiring hours per simulation for ~15.5k fibres and 5 electrodes, the proposed approach operates in minutes or seconds, enabling large-scale data generation.
- Deep learning augmentation benefit: Pretraining a GRU network using 320 simulated MUAP templates (64 sets × 5 MUAPs, 130-channel HD-sEMG) improved decomposition of experimental HD-sEMG into motor neuron spike trains. The median (IQR) Rate of Agreement (ROA) increased from 82.4% (71.6–100.0) with random initialisation to 93.8% (84.8–100.0) with simulation pretraining. The Hodges–Lehmann median difference was 8.1 percentage points (95% CI 3.4–13.3; Wilcoxon signed-rank Z=4.0, p=0.00006). Improvements held in the female (median diff 9.1; p=0.00694) and male (median diff 5.7; p=0.00064) subsets. Variance in accuracy decreased with pretraining. Of 39 MUs, 22 improved, 1 worsened, and 16 were unchanged (often already at 100% ROA).
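To make the ROA metric concrete, the sketch below uses the definition common in the decomposition literature: matched discharges divided by matched discharges plus those found in only one of the two spike trains. The summary does not state the matching tolerance or procedure used in the study, so the `tol` parameter, the greedy matching, and the example spike times are assumptions for illustration only.

```python
import numpy as np

def rate_of_agreement(pred, ref, tol=1):
    """ROA (%) between two spike trains given as sample indices.

    A predicted spike matches a reference spike if they lie within `tol`
    samples of each other; each reference spike is matched at most once.
    """
    pred = np.sort(np.asarray(pred))
    ref = np.sort(np.asarray(ref))
    used = np.zeros(len(ref), dtype=bool)
    matched = 0
    for p in pred:
        # Reference spikes within tolerance that are not yet matched.
        candidates = np.flatnonzero((np.abs(ref - p) <= tol) & ~used)
        if candidates.size:
            used[candidates[0]] = True
            matched += 1
    only_pred = len(pred) - matched   # spikes found only by the decoder
    only_ref = len(ref) - matched     # reference spikes the decoder missed
    total = matched + only_pred + only_ref
    if total == 0:
        return float("nan")           # undefined for two empty trains
    return 100.0 * matched / total

# Hypothetical spike times (in samples): four matches, one extra, one missed.
reference = [100, 300, 500, 700, 900]
predicted = [101, 300, 499, 650, 900]
roa = rate_of_agreement(predicted, reference)
print(f"ROA = {roa:.1f}%")
```

In this toy case four discharges agree, one predicted discharge has no counterpart and one reference discharge is missed, so the ROA is 4 / (4 + 1 + 1), about 66.7%.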
Discussion
The Myoelectric Digital Twin addresses the core bottleneck in EMG-based AI—lack of large, annotated, high-quality datasets—by enabling fast, anatomically realistic, and perfectly labelled simulations. By reformulating the forward problem with basis sources and the adjoint method, the model decouples dependencies so that expensive volume-conductor computations are amortised and reused, making it practical to generate diverse datasets across anatomies, electrode montages, and physiological parameters. The simulator reproduces key physiological features in time and frequency domains and matches analytical solutions with low error, supporting validity. Critically, pretraining with simulated MUAPs significantly improved neural network decomposition of experimental HD-sEMG, demonstrating that synthetic data can yield tangible performance gains on real data. This establishes simulation-driven data augmentation as a viable strategy for EMG deep learning, potentially enabling models that generalise across subjects and tasks with minimal calibration. The framework’s access to ground-truth hidden variables (e.g., MU discharges, forces) further unlocks training of algorithms otherwise impossible with experimental data alone (e.g., direct force estimation, latent neural activity decoding).
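The amortisation argument can be checked against the stage timings reported under Key Findings. The short calculation below is a sketch under two assumptions that the summary implies but does not spell out: that stage times simply add up, and that the forward solution and the fibre basis point evaluation are the two stages reused between runs. The stage names and the helper `total_time` are illustrative.

```python
# Stage timings in seconds, as reported for the Brachioradialis simulation.
stages = {
    "forward solution (adjoint/basis)": 7 * 60,
    "fibre basis point evaluation": 2 * 60,
    "fibre EMG responses": 30.0,
    "MUAP assembling": 0.8,
    "raw sEMG assembling": 2.6,
}
n_electrodes = 16 + 16  # rectangular + circular

# Assumed reusable stages: they depend only on anatomy, electrodes and
# fibre geometry, not on fibre/MU/recruitment parameters.
reusable = {"forward solution (adjoint/basis)", "fibre basis point evaluation"}

per_electrode = stages["forward solution (adjoint/basis)"] / n_electrodes
first_run = sum(stages.values())
rerun = sum(t for name, t in stages.items() if name not in reusable)

def total_time(n_sims):
    """Seconds for n_sims runs that share one volume conductor and fibre geometry."""
    return first_run + (n_sims - 1) * rerun

print(f"forward solve per electrode: {per_electrode:.1f} s")
print(f"first run: {first_run:.1f} s, each rerun: {rerun:.1f} s")
print(f"100 runs with reuse: {total_time(100) / 60:.1f} min")
print(f"100 runs without reuse: {100 * first_run / 60:.1f} min")
```

The rerun cost recovers the ~33.4 s figure quoted above, and the per-electrode cost recovers the ~13 s figure. Under these assumptions, 100 parameter variations take roughly an hour with reuse instead of about 16 hours without it.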
Conclusion
This work introduces a computationally efficient, anatomically realistic EMG simulator—a Myoelectric Digital Twin—capable of generating large, richly annotated datasets for deep learning. The method achieves minutes-to-seconds simulation times via a basis-source and adjoint reformulation of the FEM forward problem, while preserving physiological realism and matching analytical solutions with low error. Simulations reproduce experimental signal characteristics across tasks and electrode configurations. Importantly, simulation-based pretraining improves the accuracy and stability of deep learning decomposition of HD-sEMG into motor neuron activity. Future research directions include: integrating advanced noise and artefact models; coupling with biomechanical musculoskeletal models for movement-to-force/%MVC estimation; modelling non-stationary volume-conductor and fibre geometry; systematic inverse modelling to personalise parameters; and broader application to tasks like force estimation, denoising, fatigue detection, and gesture classification. These advances collectively move toward digital twins that can augment or replace experimental data for training AI-driven EMG interfaces.
Limitations
Current limitations include: the absence of advanced noise and artefact modelling; the lack of integrated biomechanical musculoskeletal modelling for automatic force/%MVC generation; no modelling of non-stationary changes in volume conductor properties or fibre geometry; potential biases introduced by synthetic data that remain to be fully characterised; and proprietary code (access upon request), which may limit external validation and adoption. In addition, model parameters were not optimised for individual subjects in the demonstrations, leading to imperfect matches in some comparisons (e.g., flexion RMS).