Generalisable 3D printing error detection and correction via multi-head neural networks

Engineering and Technology

D. A. J. Brion and S. W. Pattinson

Discover how Douglas A. J. Brion and Sebastian W. Pattinson have developed CAXTON, a cutting-edge system utilizing a multi-head neural network to automatically detect and correct errors in material extrusion. This innovative approach leverages a massive dataset of 1.2 million images to enhance the quality of additive manufacturing processes!

Introduction
The study addresses the challenge that material extrusion (fused deposition modeling) is prone to diverse errors that limit its adoption for end-use products. Traditional monitoring requires human oversight, which cannot provide continuous, real-time correction and does not scale across printers. Prior automated sensing and computer vision methods often target specific errors, are expensive, or lack generalisability across geometries, materials, and printers. The research question is whether a multi-head deep neural network trained on automatically labeled images, coupled with a real-time control loop, can detect and correct multiple error modalities across varied printers, materials, toolpaths, and even extrusion methods. The purpose is to enable robust, scalable, low-cost, real-time error detection and correction that generalises beyond a single part or setup, thereby improving reliability, reducing waste, and aiding parameter discovery for new materials.
Literature Review
The paper reviews prior monitoring approaches for extrusion AM: current, inertial, and acoustic sensors have detected certain large-scale error modalities but are costly and often not data-rich enough for online feedback. Frame-mounted camera systems with traditional vision have detected diverse defects but struggle to view deposition at the nozzle due to occlusion and often require pausing, limiting real-time correction. Multi-camera/IR and 3D reconstruction approaches can detect dimensional inaccuracies but are expensive, sensitive to lighting/surface properties, slow, and require calibration. Nozzle-mounted cameras have enabled some real-time corrections (e.g., over/under extrusion) but handcrafted features rarely generalise across parts and setups. Machine learning work has shown promise in error detection, yet typically on a single part, often limited to one error modality, sometimes requiring a prior successful print as reference, and without demonstrated real-time correction across unseen geometries. Limited prior correction studies required many prints of the same object or corrected only one parameter with delay, leaving generalisation unclear.
Methodology
System overview: CAXTON (collaborative autonomous extrusion network) automates data collection and labelling using inexpensive, nozzle-focused webcams and Raspberry Pi gateways on a fleet of eight Creality CR-20 Pro printers. STL models are downloaded from Thingiverse, sliced with randomly sampled parameters (rotation, scale, infill density/pattern, wall count), and toolpaths are split to a maximum move of 2.5 mm to reduce firmware response time. During printing, images are captured at 2.5 Hz (every 0.4 s). Each image is timestamped and automatically labeled with actual/target hotend and bed temperatures, flow rate (%), lateral speed (%), and Z offset (mm) via firmware queries or G-code settings. Nozzle-tip coordinates are recorded for automated cropping.

Data curation: Suboptimal parameter sampling can cause complete failures; such images are manually removed, leaving 1,166,552 of 1,272,273 images. To handle mechanical response delays after parameter updates, experiments showed that changes become visible within ~6 s, so the 15 images following each update are removed, yielding 1,072,500. Outliers from execution glitches and sensor errors are filtered, leaving 991,103, and very dark images (mean RGB < 10) are removed, resulting in 946,283 labeled images (74.4% of the original). Continuous parameter values are binned into three classes per parameter (low, good, high) based on experience with PLA, giving 3^4 = 81 possible class combinations.

Data augmentation: Each image receives a random rotation of up to ±10°, a perspective transform with probability 0.1, a nozzle-centered crop to 320×320 using the saved nozzle coordinates, a random square crop covering 90-100% of the area resized to 224×224, a horizontal flip with probability 0.5, color jitter of ±10% in brightness, contrast, hue, and saturation, and per-channel normalization. This simulates variability in geometry, camera position, material, and lighting to improve generalisation.

Model architecture: A multi-head deep residual attention network uses a single shared Attention-56 backbone (three attention modules and six residual blocks) with four separate fully connected heads, one each for flow rate, lateral speed, Z offset, and hotend temperature. Each head outputs three logits (low/good/high). The per-head cross-entropy losses are summed to train the shared backbone, which lets the network learn parameter interdependencies and reduces inference cost compared with four separate networks. Attention masks provide robustness to noisy labels and interpretability via attention visualization.
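To make the shared-backbone, multi-head arrangement concrete, here is a minimal PyTorch sketch. It substitutes a standard torchvision ResNet for the paper's Attention-56 backbone, and the class, function, and parameter names are illustrative assumptions rather than the authors' implementation.

```python
# A minimal sketch of the shared-backbone / four-head idea, assuming a standard
# torchvision ResNet as a stand-in for the paper's Attention-56 backbone.
import torch
import torch.nn as nn
from torchvision import models

PARAMETERS = ["flow_rate", "lateral_speed", "z_offset", "hotend_temperature"]

class MultiHeadExtrusionNet(nn.Module):
    def __init__(self, num_classes_per_head: int = 3):
        super().__init__()
        backbone = models.resnet18(weights=None)   # stand-in for Attention-56
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()                # expose the shared feature vector
        self.backbone = backbone
        # One fully connected head per process parameter, each emitting
        # three logits (low / good / high).
        self.heads = nn.ModuleDict(
            {name: nn.Linear(feat_dim, num_classes_per_head) for name in PARAMETERS}
        )

    def forward(self, x: torch.Tensor) -> dict:
        features = self.backbone(x)
        return {name: head(features) for name, head in self.heads.items()}

def multi_head_loss(outputs: dict, targets: dict) -> torch.Tensor:
    """Sum the per-head cross-entropy losses so gradients from all four
    parameters jointly update the shared backbone."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(outputs[name], targets[name]) for name in PARAMETERS)

# Example forward/loss pass on a batch of 224x224 nozzle crops:
model = MultiHeadExtrusionNet()
images = torch.randn(8, 3, 224, 224)
labels = {name: torch.randint(0, 3, (8,)) for name in PARAMETERS}
loss = multi_head_loss(model(images), labels)
loss.backward()
```

Summing the per-head losses is what lets a single backbone learn features shared across all four parameters, which underlies the reported gain over single-head baselines.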
Training strategy: Three-stage training with transfer learning is used; each stage is trained with three random seeds and the best seed is carried forward.
- Stage 1 (single-layer, 100% infill subset): best seed training accuracy 98.1%, validation 96.6%.
- Stage 2 (full dataset): best seed training accuracy 91.1%, validation 85.4%.
- Stage 3 (balanced subset across the 81 combinations, backbone frozen, heads retrained): training accuracy 89.2%, validation 90.2%.
Final test-set accuracy is 84.3% overall; per-parameter accuracies are flow rate 87.1%, lateral speed 86.4%, Z offset 85.5%, and hotend temperature 78.3%. As baselines, single-stage ResNets achieved 80.4–82.5% overall, and multi-head context improved single-parameter flow prediction (ResNet18: 77.5% single-head vs 82.1% with multi-head context after 50 epochs).

Online correction/control loop: During printing, nozzle-region images are cropped (320×320), resized (224×224), normalized, and passed through the network for inference. For each parameter, the last L predictions are buffered; if the most frequent prediction (the mode) occurs with relative frequency ≥ θ_mode, it is accepted. If the accepted mode is low or high, a proportional update is computed by linearly interpolating the mode frequency from the range [θ_mode, 1] onto [l_min, 1] and scaling the maximum update A+ (increase) or A− (decrease) by the result; a sketch of this rule appears at the end of this section. The parameters (θ_mode, L, l_min, A+, A−) were tuned experimentally:
- Flow rate: 0.70, 10, 0.20, +40, −50
- Lateral speed: 0.80, 10, 0.25, +40, −50
- Z offset: 0.65, 12, 0.25, +0.16, −0.16
- Hotend temperature: 0.85, 15, 0.50, +10, −12
During online correction, toolpaths are split to a maximum move of 1 mm to reduce response time. G-code updates are executed via the Raspberry Pi, and the server resumes predictions only after firmware acknowledgements to prevent oscillations. The camera setup requires a one-time selection of the nozzle coordinates; performance is best when roughly ±5 extrusion widths around the nozzle are visible.

Hardware/software: Printers are Creality CR-20 Pro units with Logitech C270 webcams; additional tests used a Lulzbot Taz 6 (0.6 mm nozzle, 2.85 mm filament) and a direct ink writing (DIW) syringe-based modification of an Ender 3 Pro. Firmware is Marlin 1.1.9/2.0 with OctoPrint and a custom plugin. Training ran on Nvidia RTX 5000 GPUs with a PyTorch-based implementation. The dataset DOI and code repositories are provided by the authors.
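As a rough illustration of the feedback rule described above (mode thresholding followed by a proportional update), the sketch below uses the quoted flow-rate settings. The function name and buffer handling are illustrative assumptions, not the authors' control code.

```python
# A rough sketch of the thresholded, proportional update rule, shown with the
# flow-rate settings quoted above (theta_mode=0.70, L=10, l_min=0.20, +40/-50).
from collections import Counter, deque

def proportional_update(predictions, theta_mode: float, l_min: float,
                        a_plus: float, a_minus: float):
    """Return a parameter update (in the parameter's own units), or None if the
    prediction buffer does not agree strongly enough or the mode is 'good'."""
    mode, count = Counter(predictions).most_common(1)[0]
    freq = count / len(predictions)
    if freq < theta_mode or mode == "good":
        return None  # no confident error detected -> leave the parameter alone
    # Linearly map the mode frequency from [theta_mode, 1] onto [l_min, 1] ...
    scale = l_min + (freq - theta_mode) * (1.0 - l_min) / (1.0 - theta_mode)
    # ... then scale the maximum increase (A+) or decrease (A-).
    return scale * a_plus if mode == "low" else scale * a_minus

# Flow-rate example: 8 of the last 10 predictions say "low" (under-extrusion).
buffer = deque(["low"] * 8 + ["good"] * 2, maxlen=10)
print(proportional_update(buffer, 0.70, 0.20, +40.0, -50.0))  # ~ +18.7 (% flow)
```

In the system described above, an accepted update is then sent to the printer as a G-code/firmware command, and further predictions are held until the change is acknowledged, which prevents the oscillations mentioned earlier.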
Key Findings
- Large, automatically labeled dataset: 1.2 million images from 192 printed parts, cleaned to 946,283 labeled nozzle-view images with labels for flow rate, lateral speed, Z offset, and hotend temperature.
- Classification performance on the held-out test set: 84.3% overall across the four parameters; per parameter, flow rate 87.1%, lateral speed 86.4%, Z offset 85.5%, hotend temperature 78.3%.
- Multi-head context improved single-parameter prediction: a ResNet18 predicting only flow rate achieved 77.5%, versus 82.1% when trained with all four parameter heads.
- Real-time online correction: rapid correction of manually induced single-parameter errors (flow, speed, Z offset, hotend temperature) on unseen setup components (e.g., a new 0.4 mm nozzle geometry and camera angle) in PLA single-layer tests.
- Generalisation across materials: simultaneous multi-parameter optimisation and recovery on unseen thermoplastics (e.g., TPU, ABS-X, PVA HT+, carbon-fiber-filled filament) with varying initial conditions; correction aided bed adhesion and prevented early failures.
- Cross-printer generalisation: successful correction on an unseen Lulzbot Taz 6 (0.6 mm nozzle, 2.85 mm filament), including rescue of a bishop geometry whose uncontrolled print failed.
- Cross-process generalisation: effective control of direct ink writing (PDMS with a 0.21 mm nozzle; mayonnaise and ketchup with a 0.84 mm nozzle), primarily by adjusting flow and Z offset; the controller handled pressure dynamics and layer interactions, though long DIW prints could overshoot due to pressure build-up.
- End-to-end benefits: saved prints mid-process (rook example) and improved surface finish and completion rates from poor initial parameters (spanner set example); toolpath splitting and proportional updates delivered order-of-magnitude faster correction than prior real-time methods (a qualitative comparative claim).
- Interpretability: attention maps and gradient-based saliency (Guided Backpropagation, Grad-CAM) indicate the model focuses on the most recent extrusion and uses similar features across parameters to detect under/over-extrusion, supporting robustness and rapid response.
Discussion
The findings demonstrate that a multi-head residual attention CNN trained on automatically labeled, nozzle-view images can infer multiple interdependent process parameters and drive a closed-loop controller to correct errors in real time. This addresses the central challenge of generalisable monitoring: the system works across diverse geometries, printers, materials, toolpaths, and even extrusion modalities (thermoplastics and DIW), without handcrafted, setup-specific features or preprinted references. The model’s simultaneous prediction of four parameters captures their interplay, enabling creative corrective strategies (e.g., compensating high Z offset by increasing flow) akin to expert human reasoning but with continuous operation and instant actuation. The online control loop design (mode thresholding, proportional updates, toolpath splitting, firmware acknowledgement gating) stabilizes feedback and reduces response time, enabling practical correction during printing. Visual explanations suggest the network learns universal extrusion cues (e.g., extrudate shape and recent deposition), supporting transferability. Overall, the approach enhances reliability and reduces waste by recovering failing prints and accelerating parameter discovery for new materials with low-cost, easily deployable hardware.
Conclusion
This work introduces CAXTON, an end-to-end, low-cost, and scalable framework that autonomously collects and labels large-scale nozzle-view image data, trains a multi-head residual attention network to predict key extrusion parameters, and closes the loop for real-time error correction. The system generalises across printers, geometries, materials, and extrusion methods, and can propose multiple corrective solutions by leveraging learned parameter interactions. Contributions include: (1) automated fleet-scale dataset generation (∼1.2M images) labeled by deviation from optimal parameters; (2) a multi-head attention architecture and multistage training yielding strong accuracy and interpretability; (3) a stabilised, fast-response control loop with proportional updates and toolpath splitting; and (4) demonstrations of real-time correction and parameter discovery across diverse settings. Future work should expand datasets (more printers/materials, balanced and finer-grained labels), integrate global and local sensing (e.g., additional cameras/IR), address mechanical/electrical fault modalities with complementary controls, refine feedback parameter tuning to reduce oscillations, and extend to challenging processes such as metal AM with transfer learning for specific environments.
Limitations
- Dataset bias and class imbalance: particularly for Z offset, low values are underrepresented because of hardware constraints (risk of nozzle-bed contact), potentially reducing performance for small Z changes; more granular, balanced data are needed.
- Sensitivity to feedback tuning: corrections can oscillate if previously poor regions remain in view or due to sequences of mispredictions; the choices of θ_mode, L, l_min, A+, A−, sampling rate, and toolpath split length affect stability and response.
- Unaddressed failure modes: mechanical faults (skipped steps, belt slip), some electrical issues, and large-scale defects (cracking, warping, severe bed detachment) are not fully detectable or correctable with local nozzle-view imaging alone.
- DIW dynamics: for viscous inks (e.g., PDMS), small nozzles and pressure delays can cause overshoot during long prints; timely pressure compensation remains challenging.
- Environmental variability: although augmentations improve robustness, lighting, focus, and camera placement still matter; hotend temperature has slower dynamics and the lowest classification accuracy (78.3%).
- Generalisability scope: while performance is strong across the tested setups, broader validation on more printers, materials, and settings would further substantiate generalisation.