Automatic design of stigmergy-based behaviours for robot swarms
M. Salman, D. G. Ramos, et al.
Discover Habanero, an innovative automatic approach for designing collective behaviors in robot swarms, developed by Muhammad Salman, David Garzón Ramos, and Mauro Birattari. This method showcases the remarkable capabilities of autonomous design in creating effective behaviors that can match or even outperform human-created ones.
Introduction
Stigmergy enables agents to coordinate indirectly by leaving cues in the environment that stimulate or inhibit peers’ actions. In social insects, pheromones serve as a distributed external memory that encodes colony state and enables coordination without direct communication or global awareness. Swarm robotics adopts similar principles: collective behaviour emerges from local interactions among robots and with the environment. Implementing pheromone-based stigmergy in physical robot swarms faces two challenges: (i) hardware to lay and sense artificial pheromones without costly external infrastructure, and (ii) the design of effective pheromone-based coordination strategies, which is non-trivial and typically addressed via manual, trial-and-error design. Prior manual approaches are time-consuming, not easily repeatable, and often not generalisable across platforms or missions. This work addresses the design challenge by proposing Habanero, an automatic off-line design method (within the AutoMoDe family) that assembles predefined modules into probabilistic finite-state machines and tunes their parameters via simulation-based optimisation. The research question is whether automatic design can reliably produce effective, pheromone-based collective behaviours that cross the reality gap and match or surpass manual and neuroevolutionary baselines across different stigmergy-reliant missions.
Literature Review
Prior implementations of stigmergy in robot swarms often rely on external infrastructures: RFID-tag floors to store virtual pheromones; projection/display systems to render virtual trails; or augmented-reality frameworks immersing robots in a virtual pheromone environment. These solutions enable complex coordination but are costly and constrained to prepared environments. Physical deposition approaches using alcohol or wax trails eliminate external infrastructure but raise safety and practicality issues. A safer alternative uses UV light to activate photochromic pigments on the floor, creating trails that evaporate (fade) in tens of seconds; however, this still requires environment preparation. Design-wise, stigmergy-based behaviours have been predominantly handcrafted via trial-and-error for specific missions, with one notable simulation-only exception using deep reinforcement learning for collision avoidance via a virtual pheromone. Beyond stigmergy, automatic off-line design frameworks (AutoMoDe) constrain control software to modular architectures, which has been shown to improve transfer to reality compared to neuroevolution of unrestricted neural controllers, by reducing overfitting to simulation idiosyncrasies. Habanero builds on this line of work, adding modules to lay/sense pheromones and leveraging Iterated F-race for mixed-variable optimisation.
Methodology
Habanero is an AutoMoDe instance specialised for swarms that lay and detect pheromone trails. It automatically assembles predefined low-level behaviours and transition conditions into probabilistic finite-state machines (PFSMs), and tunes their parameters via Iterated F-race to maximise mission-specific objective functions using simulations prior to deployment.
Target platform and hardware: e-puck robots extended with (i) Overo Gumstix Linux board; (ii) UV-light module (nine downward-facing UV LEDs) to deposit artificial pheromone on a photochromic floor; (iii) omnidirectional camera to detect pheromone trails; and (iv) ground and proximity sensors. The photochromic floor turns magenta under UV exposure and fades back to white in ~50 s, emulating pheromone evaporation.
Reference model: RM 4.1 defines sensor inputs (proximity, ground colour, camera detections) and actuator outputs (wheel speeds; pheromone deposition mode: none/thin/thick). Control step duration is 100 ms.
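The fade time and the control step together determine how long a mark can influence other robots. The sketch below is illustrative only: it assumes a hard visibility cutoff at exactly 50 s (the source gives only "~50 s") and binary visibility. The names `PheromoneCell` and `fade_steps` are hypothetical and not part of argos3-phormica or RM 4.1.

```python
# Hypothetical model of one floor cell; not the simulator's actual implementation.
CONTROL_STEP_S = 0.1   # control step duration from the reference model (100 ms)
FADE_TIME_S = 50.0     # ASSUMPTION: hard cutoff at 50 s; the source says "~50 s"


def fade_steps(fade_time_s=FADE_TIME_S, step_s=CONTROL_STEP_S):
    """Number of control steps a mark stays visible under the hard-cutoff assumption."""
    return round(fade_time_s / step_s)


class PheromoneCell:
    """One floor cell that remembers the last control step at which it was UV-exposed."""

    def __init__(self):
        self.last_marked_step = None

    def mark(self, step):
        """Record UV exposure at the given control step."""
        self.last_marked_step = step

    def is_visible(self, step):
        """Binary visibility: True until fade_steps() control steps have elapsed."""
        if self.last_marked_step is None:
            return False
        return step - self.last_marked_step < fade_steps()
```

Under these assumptions a mark persists for about 500 control steps, which is long compared with the reaction time of a robot but short compared with the 180 s missions.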
Software modules: Seven low-level behaviours—Exploration, Stop, Go-to-Colour, Avoid-Colour, Go-to-Pheromone, Avoid-Pheromone, Waggle—and six transition conditions—White-Floor, Gray-Floor, Black-Floor, Colour-Detected, Pheromone-Detected, Fixed-Probability. Parameters include: pheromone deposition mode (none/thin/thick), behaviour-specific parameters (e.g., colour c, field of view fov, exploration rotation duration τ), and transition probabilities β.
Design space and optimisation: The PFSMs have up to 4 states and up to 4 outgoing transitions per state. The mixed-variable design space includes 105 parameters (categorical, integer, real). Iterated F-race (as implemented in irace) selects and refines candidate PFSMs via racing on stochastic simulations in ARGoS3 with a custom library (argos3-phormica) for pheromone simulation and cross-compilation to real robots. Constraining designs to modular PFSMs reduces variance and mitigates overfitting, improving reality-gap crossing compared to neuroevolution of unrestricted neural controllers.
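The structure being optimised can be pictured with a minimal sketch of a probabilistic finite-state machine. The class names (`PFSM`, `State`, `Transition`) and the rule for resolving several enabled transitions (first match wins) are assumptions made for illustration; they are not taken from the Habanero code base. Only the module names, the deposition modes, and the 4-state / 4-transition limits come from the description above.

```python
import random
from dataclasses import dataclass, field

MAX_STATES = 4        # limit stated in the design space
MAX_TRANSITIONS = 4   # outgoing transitions per state


@dataclass
class Transition:
    condition: str   # e.g. "Pheromone-Detected" or "Fixed-Probability"
    beta: float      # probability of firing when the condition holds
    target: int      # index of the destination state


@dataclass
class State:
    behaviour: str                  # e.g. "Exploration", "Go-to-Pheromone"
    deposition: str = "none"        # "none", "thin" or "thick"
    transitions: list = field(default_factory=list)


class PFSM:
    """Hypothetical PFSM executor; conflict resolution here is an assumption."""

    def __init__(self, states, rng=None):
        if not 1 <= len(states) <= MAX_STATES:
            raise ValueError("a PFSM needs between 1 and 4 states")
        for state in states:
            if len(state.transitions) > MAX_TRANSITIONS:
                raise ValueError("at most 4 outgoing transitions per state")
            for t in state.transitions:
                if not 0 <= t.target < len(states):
                    raise ValueError("transition target out of range")
        self.states = states
        self.current = 0
        self.rng = rng or random.Random()

    def step(self, active_conditions):
        """Advance one control step and return (behaviour, deposition) to execute."""
        for t in self.states[self.current].transitions:
            # ASSUMPTION: "Fixed-Probability" needs no sensor input to be enabled.
            holds = t.condition == "Fixed-Probability" or t.condition in active_conditions
            if holds and self.rng.random() < t.beta:
                self.current = t.target
                break
        state = self.states[self.current]
        return state.behaviour, state.deposition
```

A two-state machine that explores until it sees pheromone and then follows it while laying a thin trail would be `PFSM([State("Exploration", "none", [Transition("Pheromone-Detected", 1.0, 1)]), State("Go-to-Pheromone", "thin")])`. In the design process, the choice of behaviours, conditions, and values such as `beta` is what the optimiser searches over.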
Comparators: (i) EvoPheromone, a neuroevolutionary baseline adapted from EvoStick: a single-layer, fully connected feed-forward ANN with 61 inputs and 7 outputs (427 real-valued synaptic weights), tuned via elitism and mutation. (ii) Human-Designers: 10 human designers manually assembled PFSMs using the same module set and tools. (iii) Random-Walk: simple baseline (move straight, random turns on obstacle).
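The weight count of the neuroevolutionary baseline follows directly from its shape: 61 inputs times 7 outputs gives 427 weights. The sketch below checks that arithmetic and shows a single-layer forward pass. The logistic activation and the absence of a separate bias term are assumptions made for illustration; the source states only the layer sizes and the weight count.

```python
import math

N_INPUTS, N_OUTPUTS = 61, 7   # sizes stated for EvoPheromone


def n_weights(n_in=N_INPUTS, n_out=N_OUTPUTS):
    """Weights of a fully connected single layer with no separate bias terms."""
    return n_in * n_out


def forward(weights, inputs):
    """One forward pass; weights is a list of N_OUTPUTS rows of N_INPUTS values.

    ASSUMPTION: logistic activation, chosen only to make the sketch concrete.
    """
    if len(weights) != N_OUTPUTS or any(len(row) != N_INPUTS for row in weights):
        raise ValueError("weights must be a 7 x 61 matrix")
    if len(inputs) != N_INPUTS:
        raise ValueError("expected 61 inputs")
    return [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
        for row in weights
    ]
```

Every one of these 427 real values is free to vary, in contrast to the modular PFSMs, whose behaviour is constrained by the predefined modules.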
Missions and objectives (N=8 robots, T=180 s):
- AGGREGATION: minimise sum over time of the average inter-robot distance.
- DECISION MAKING: maximise time-weighted presence in designated regions (2 points for blue, 1 for green), with lights switched off at a time drawn uniformly between 70 and 90 s.
- RENDEZVOUS POINT: robots start left of a wall with a narrow gate; maximise K_in − K_out at mission end for the green region; lights switched off at a time between 70 and 90 s.
- STOP: a blue stop signal appears at a random time (70–90 s); minimise movement after the signal and avoid stopping before.
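As an illustration of how such an objective can be computed from tracked positions, the AGGREGATION cost can be sketched as follows. The sketch assumes that "average inter-robot distance" means the mean over all unordered robot pairs and that positions are sampled at discrete time steps; the function names and the sampling scheme are hypothetical, not taken from the study.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(positions):
    """Mean Euclidean distance over all unordered pairs of (x, y) positions.

    ASSUMPTION: this is what "average inter-robot distance" refers to.
    """
    pairs = list(combinations(positions, 2))
    if not pairs:
        raise ValueError("need at least two robots")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def aggregation_cost(trajectory):
    """Sum over time steps of the mean pairwise distance (lower is better).

    trajectory: list of snapshots, each a list of (x, y) robot positions.
    """
    return sum(mean_pairwise_distance(snapshot) for snapshot in trajectory)
```

Because the cost accumulates over the whole mission, a swarm that clusters early scores better than one that reaches the same final cluster late.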
Protocol: For Habanero and EvoPheromone, 10 independent design runs per mission (simulation budget: 100,000 simulations per run) produced 10 controllers per method and mission. For Human-Designers, 10 participants produced one controller per mission (4 h per mission). Each controller was evaluated once in simulation and once on real robots; initial positions were varied consistently across methods. Controllers were automatically cross-compiled and deployed without manual modification. A tracking system (overhead camera with markers) computed performance metrics; robots did not receive external information.
Statistics: Per-mission performance is shown with notched boxplots; method comparisons use Wilcoxon paired rank-sum tests (95% confidence). Aggregated across missions, a Friedman rank sum test (nonparametric, block design with mission as block) compares methods on real-robot performance; lower average rank indicates better performance.
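The ranking step behind the aggregated comparison can be sketched in a few lines: methods are ranked within each block (here, a mission) and the ranks are averaged across blocks. This sketch covers only the rank computation, not the Friedman test statistic or its confidence intervals, and the function names are hypothetical. Any scores passed to it in examples are placeholders, not results from the study.

```python
def ranks_within_block(scores, higher_is_better=True):
    """Map method name -> rank within one block (1 = best), averaging ranks over ties."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of methods tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1   # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = tied_rank
        i = j + 1
    return ranks


def average_ranks(blocks, higher_is_better=True):
    """Average each method's rank across blocks; lower average rank is better."""
    totals = {}
    for scores in blocks:
        for method, rank in ranks_within_block(scores, higher_is_better).items():
            totals[method] = totals.get(method, 0.0) + rank
    return {method: total / len(blocks) for method, total in totals.items()}
```

Since AGGREGATION is minimised while the other objectives reward higher values, a real analysis would need to orient each mission's scores consistently (for example, by ranking that mission's block with `higher_is_better=False`) before averaging.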
Key Findings
Across all four missions, Habanero produced effective stigmergy-based collective behaviours that leveraged artificial pheromones in mission-appropriate ways and transferred better to reality than alternatives.
Per-mission highlights:
- AGGREGATION: In simulation, Habanero, EvoPheromone, and Human-Designers performed similarly. On real robots, Habanero significantly outperformed all others. Habanero’s robots intermittently laid short pheromone marks to avoid saturation and formed clusters around marked points. EvoPheromone’s wall-avoidance failed in reality, and Human-Designers’ continuous deposition led to local traps and multiple clusters.
- DECISION MAKING: In simulation, Habanero and Human-Designers outperformed EvoPheromone. On real robots, Habanero significantly outperformed Human-Designers; both beat EvoPheromone and Random-Walk. Habanero consistently selected blue and used pheromone to remain after cues were removed, though some spillover at region boundaries reduced real-world scores.
- RENDEZVOUS POINT: Simulation showed similar performance across methods. In reality, Habanero significantly outperformed all others; EvoPheromone performed significantly worse than every other method, including Random-Walk. Habanero used a random walk to find the gate, then used pheromone to keep robots in the green region. EvoPheromone relied on wall-following that failed on hardware, leaving robots stuck and unable to cross the gate.
- STOP: Habanero and Human-Designers performed similarly and significantly better than EvoPheromone in both simulation and reality. Habanero’s robots, upon detecting the stop signal, stopped or waggled while laying pheromone to broadcast the stop; peers stopped upon detecting the signal or pheromone. EvoPheromone exploited timing to stop against walls, failing to reliably react to the actual signal.
Aggregated result: A Friedman rank sum test on real-robot performance across missions showed Habanero ranked best with at least 95% confidence. Human-Designers ranked significantly better than EvoPheromone and Random-Walk.
Behavioural analysis: Although using the same module set, Habanero and Human-Designers combined modules differently. Habanero used Exploration less, relied more on pheromone-reactive modules, used wall-colour responses less, and used Waggle more.
Discussion
The study shows that automatic off-line design can generate robust, pheromone-based collective behaviours for robot swarms, addressing a long-standing challenge in stigmergic coordination design. By constraining the design to modular PFSMs and optimising via Iterated F-race, Habanero produced mission-specific interaction strategies that generalised from simulation to real robots and often surpassed manual design and neuroevolutionary baselines. The behaviours exhibited emergent collective capabilities compensating for individual limitations: spatial organisation (self-aggregation and region congregation), external memory (persistent marking of chosen regions after cue removal), and communication (mission-specific semantics of pheromone trails, e.g., “come here” vs “stop here”). These properties were not hard-coded but emerged from the automatic combination of modules per mission.
The results address the research question by demonstrating that automatic design is viable and effective for stigmergy-based coordination and can cross the reality gap better than an unconstrained neuroevolutionary approach. However, causal attribution of specific module choices to performance remains unresolved, as the current setup does not disentangle the effects of individual design decisions.
Conclusion
This work introduces Habanero, an automatic, modular off-line design method that assembles and tunes pheromone-aware behaviours for robot swarms. Across four stigmergy-reliant missions with real e-puck robots on a photochromic floor, Habanero produced controllers that matched or outperformed manual designs and a neuroevolutionary baseline, and it ranked best overall in aggregated real-robot performance. The method yielded mission-tailored strategies that enabled emergent spatial organisation, memory, and communication through pheromones, without mission-specific module coding.
Future research should explore: (i) automatic tuning and exploitation of pheromone intensity and decay (beyond binary detection); (ii) automatic selection and integration of direct and indirect communication modalities; (iii) development of more universally applicable, safe, and easy-to-integrate pheromone hardware, enabling broader deployment and standardised benchmarks.
Limitations
Technological limitations include reliance on a prepared photochromic floor, restricting experiments to indoor, preconditioned environments; no universally applicable on-board pheromone laying/sensing solution currently exists. Methodologically, the missions—while varied—do not establish performance on more complex tasks requiring fine-grained modulation of pheromone deposition/response or multiple pheromone types. The study setup does not allow causal analysis of how specific module combinations influence performance. Sensing of pheromone was binary in these experiments; the capability to modulate and sense graded intensities and decay remains untested. EvoPheromone’s poor transfer underscores sensitivity to reality-gap issues in unconstrained neuroevolution.

