
Trust within human-machine collectives depends on the perceived consensus about cooperative norms
K. Makovi, A. Sargsyan, et al.
How is trust built in human-machine collectives? Drawing on studies with 7,917 participants, Kinga Makovi, Anahit Sargsyan, Wendi Li, Jean-François Bonnefon, and Talal Rahwan show that the trust people extend to cooperative norm-followers depends on the perceived consensus about those norms.
Introduction
The paper investigates how humans navigate cooperation and trust in mixed human-bot collectives, such as those emerging on platforms like Wikipedia, Twitter, Reddit, YouTube, Twitch, and Discord, where bots can help or hinder humans and vice versa. In both laboratory and field settings, such collectives struggle to define appropriate norms for interactions involving bots. The authors adapt a stylized two-stage interaction paradigm (a third-party punishment game followed by a trust game) to include both human and bot agents as Beneficiaries, Helpers, and Punishers, with humans serving as Trustors. A methodological concern is that bots do not care about money, the lab currency. Study 1 addresses this by showing that participants believe bots behave as if they had preferences for money-like currencies (e.g., likes, avoiding bans) because of their programming, justifying the use of monetary incentives as proxies. The theoretical focus is on signaling: humans may help bots to signal trustworthiness to other humans. Across five studies, the authors test how identity configurations (human vs. bot) affect helping, punishment, and the trust earned by these behaviors, and whether perceived consensus about cooperative norms (helping and third-party punishment) explains, and can modify, trust allocations.
Literature Review
The study builds on several strands of literature: (1) cooperation, exploitation, and the challenges of stabilizing norms in human societies and human-machine collectives; (2) evidence that online communities include human-bot interactions governed by ambiguous norms, complicating moderation (e.g., Wikipedia, Twitter); (3) experimental paradigms linking prosociality (sharing/helping) and third-party punishment to reputational benefits and trustworthiness signaling (e.g., Jordan et al., 2016, Nature, on punishment as a costly signal); and (4) norm theory emphasizing the roles of injunctive norms, empirical expectations, and second-order normative beliefs/metanorms in shaping behavior and cooperation. The paper situates its contribution in showing that in mixed human-bot settings, trust gains from prosocial behaviors are attenuated, and that perceived consensus about norms helps explain when and why trust accrues to norm-followers. It also connects to research on signaling theory and the reputational benefits of altruistic punishment, and to methodological discussions about lab currencies as proxies for varied platform-specific currencies (money, likes, bans).
Methodology
Design: Five preregistered online studies (some amendments noted) implemented in Qualtrics with MTurk samples via CloudResearch. Roles include Beneficiary, Helper, Punisher (human or bot), and Trustor (human only). The core procedure uses a third-party punishment game (Helpers decide to share or not with Beneficiaries; Punishers can pay a cost to punish non-sharing Helpers) followed by a trust game (Trustors decide how much to send to either Helpers or Punishers after observing their prior behavior; sent resources are tripled and may be returned). Identities of agents (human vs bot) are signaled textually (“MTurk worker” vs “Bot”) and with images. Key measures include rates of sharing, rates of punishment, Trustor sending amounts conditional on observed behavior, and trust-gains defined as differences in sending between norm-following vs not (Helper trust-gain: send after share minus send after no-share; Punisher trust-gain: send after punish minus send after no-punish).
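The trust-gain measure defined above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code, and the sending proportions below are invented:

```python
# Hypothetical sketch of the paper's trust-gain measure: the difference
# in Trustors' average sending (share of endowment sent) after observing
# norm-following vs. norm-violating behavior, in percentage points.

def trust_gain(send_after_follow, send_after_violate):
    """Mean send rate after a norm-following act minus mean send rate
    after a violation, expressed in percentage points."""
    mean = lambda xs: sum(xs) / len(xs)
    return 100 * (mean(send_after_follow) - mean(send_after_violate))

# Invented sending proportions for Trustors who saw the Helper share
# vs. Trustors who saw the Helper keep everything.
after_share = [0.70, 0.65, 0.60]
after_no_share = [0.20, 0.10, 0.15]

print(round(trust_gain(after_share, after_no_share)))  # 50
```

A +50 pp trust-gain on this invented data is in the same range as the human-only baseline (+54 pp) reported in the findings below.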
- Study 1 (N=299): Validity check. Participants learned the game context and rated agreement with statements about bots’ and humans’ desires for currencies (money, Twitter likes, avoiding Wikipedia bans) and whether bots behave as if they had such preferences. No experimental conditions.
- Study 2 (N analyzed = 3761 with adequate comprehension): Behavioral experiment with monetary incentives. Humans can be any role, bots can be Beneficiary/Helper/Punisher. Research questions focus on whether helping/punishing behaviors and resulting trust-gains change when bots occupy specific roles (Beneficiary B1–B4; Helper H1–H3; Punisher P1). Welch’s t-tests used; normality violations noted but large samples justify t-tests.
- Study 3 (N=2514): Hypothetical trust decisions, incentivized belief elicitation. Participants (only Trustor role) estimate (i) empirical expectations (proportion who helped/punished), (ii) injunctive norms (whether one should help/punish), (iii) perceived injunctive norm-consensus (estimated proportion sharing that belief), and (iv) hypothetical trust allocations. Multiple OLS regressions relate perceived norm-consensus to trust-gains, controlling for own norm, empirical expectations, condition fixed effects, and demographics.
- Study 4 (N=458; within-person): Same Trustors from Study 2 (paired with Helpers) re-invited ~360 days later, same condition. Before decisions, a norm-consensus message states that an overwhelming majority (93%) in a recent similar study said Helpers should share in that specific condition. Measures effect on trust-gains relative to their own Study 2 baseline. Analytical sample includes participants who believed the manipulation; robustness also shown with full sample.
- Study 5 (N=2077; between-person): New sample with some returnees; Trustors randomized to receive a norm-consensus message (“majority believe Helpers should share” in the specific condition) or no message, then make incentivized trust decisions. Analytical sample focuses on those who believed the manipulation (78% overall), with robustness checks including all participants and excluding returnees.
Procedures: All studies included consent, rules comprehension checks (eight questions, thresholds to proceed varied by study), decision tasks or belief elicitation, manipulation checks (identity recall; belief in consensus signal for Studies 4–5), and demographics. Bots’ decisions, when needed, were randomized. Qualitative justifications were collected post-decision and coded by four trained assistants; 3833 responses double-coded and reconciled.
Recruitment and samples: U.S.-based MTurk workers with quality filters (CloudResearch Approved in Studies 1,3,5), age ≥18, ≥100 HITs, ≥95% approval, IP and geolocation checks, ballot-box stuffing prevention, and ID de-duplication across studies as applicable. Payment varied by study; bonuses used; Study 4 included additional lottery bonuses.
Analyses: Descriptive statistics, Welch’s t-tests for group comparisons, OLS regressions for Study 3 linking perceived norm-consensus to trust-gains with controls and fixed effects. Robustness checks excluded participants failing comprehension/manipulation checks, and tested results in restricted and full samples; additional analyses compared distributions across Studies 2 and 3, meta-analyzed Studies 4–5 (Morris & DeShon approach). Ethical approval obtained (NYUAD IRB #062-2019).
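As an illustration of the main group-comparison test, here is a minimal from-scratch Welch's t-statistic (the unequal-variance t-test the authors report). The data are invented and this is not the study's analysis code:

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented sending amounts for two Trustor groups
t, df = welch_t([1, 2, 3, 4], [2, 4, 6, 8])
print(round(t, 2))  # -1.73
```

In practice one would use `scipy.stats.ttest_ind(a, b, equal_var=False)`, which implements the same test and also returns a p-value.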
Key Findings
Study 1 (perceived preferences): Participants disagreed that bots feel desire/need for currencies but agreed bots behave as if they had such preferences due to programming.
- Bots’ desire ratings (0–100 scale): Money μ=25.7 (95% CI [22.2, 29.2]), Likes μ=32.8 [29.1, 36.4], Avoiding bans μ=29.9 [26.4, 33.5]; all significantly below midpoint (p<0.001).
- Humans’ desire ratings: Money μ=83.1 [81.1, 85.1], Likes μ=83.5 [81.2, 85.9], Avoiding bans μ=81.9 [79.5, 84.3]; all significantly above midpoint (p<0.001).
- Bots behave as if they want: Money μ=64.6 [61.6, 67.5]; Likes μ=68.4 [65.1, 71.8]; Avoiding bans μ=64.0 [60.5, 67.5]; all p<0.001 vs midpoint.
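The "significantly below/above midpoint" claims correspond to the scale midpoint (50 on the 0–100 scale) lying outside the reported 95% confidence interval, which is equivalent to a two-sided test at p<0.05. A trivial check, using one of the reported intervals:

```python
def excludes_midpoint(ci_low, ci_high, midpoint=50):
    """True if the scale midpoint falls outside the 95% CI, i.e. the
    rating differs from the midpoint at the two-sided 5% level."""
    return not (ci_low <= midpoint <= ci_high)

# Reported "bots desire money" rating: mean 25.7, 95% CI [22.2, 29.2]
print(excludes_midpoint(22.2, 29.2))  # True
```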
Study 2 (behavioral differences and trust-gains):
- B1: Helpers share less with bot Beneficiaries (59%) than with human Beneficiaries (86%); p<0.001.
- B2: Trust-gain from sharing is smaller when Beneficiary is a bot: +36 pp vs +54 pp for human Beneficiaries (p<0.001). Driven by greater leniency toward not sharing with bots: trust in non-sharers is 27% (bot Beneficiary) vs 14% (human Beneficiary), p<0.001. Trust in sharers: 63% (bot Beneficiary) vs 68% (human), p=0.083.
- B3: Non-sharers punished less when Beneficiary is a bot (27%) vs human (48%), p<0.001.
- B4: Punishing non-sharers who denied bots yields smaller trust-gain (+5 pp) than punishing non-sharers who denied humans (+21 pp), p<0.001.
- H1: When Helpers are bots sharing with humans, their trust-gain is lower (+45 pp) than humans sharing with humans (+54 pp), p=0.006.
- H2: Punishment rates for bot non-sharers (42%) vs human non-sharers (48%) not credibly different, p=0.173.
- H3: Trust-gain from punishing bot non-sharers (+19 pp) vs punishing human non-sharers (+21 pp) not credibly different, p=0.639.
- P1: Trust-gain for bot Punishers punishing human non-sharers (+19 pp) vs human Punishers (+21 pp) not credibly different, p=0.678.
Qualitative data in Study 2: With bot Beneficiaries, fewer participants cited signaling to Trustors or personal moral principles; in human-only conditions, 47% referenced higher-level principles vs 21% with bot Beneficiaries. Trustors attributed more signaling value to Helpers’ behavior when Helpers were people (41%) vs bots (31%).
Study 3 (perceived norm-consensus and trust-gains): Participants underestimated consensus about what should be done relative to actual behavior, yet believed that norms favor helping and punishing across conditions; support was strongest for helping in human-only settings, and some respondents also endorsed punishing those who denied bots.
- OLS: Perceived consensus that Helpers should share predicts greater trust-gain for sharers over non-sharers (β=0.257, 95% CI [0.133, 0.380], p<0.001), controlling for own norm, empirical expectations, condition fixed effects, and demographics.
- OLS: Perceived consensus that Punishers should punish predicts greater trust-gain for punishers over non-punishers (β=0.109, 95% CI [0.004, 0.215], p=0.042) with analogous controls.
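The reported coefficients come from multiple OLS regressions with controls and fixed effects. As a toy illustration of the bivariate building block, here is the closed-form OLS slope on invented data (not the study's):

```python
def ols_slope(x, y):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

# Invented data: perceived norm-consensus (x) vs. trust-gain (y)
print(round(ols_slope([0, 1, 2, 3], [0, 0.3, 0.5, 0.8]), 2))  # 0.26
```

A full replication would instead fit the multivariate model, e.g. with `statsmodels`' `OLS`, including the controls listed above.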
Study 4 (within-person norm-consensus manipulation): Making a strong norm-consensus salient (93% should share) increased trust-gain for bot Helpers who shared with humans from +44 pp (Study 2 baseline) to +55 pp; p=0.003.
Study 5 (between-person norm-consensus manipulation): Making consensus salient (majority should share) increased trust-gain for people who shared with bots from +37 pp to +46 pp; p=0.009.
Meta-analytic synthesis of Studies 4–5: Suggests generally positive effects of consensus information for bot Helpers (wide interval including zero) and clear positive effects for human Helpers interacting with bot Beneficiaries, with caveats about homogeneity across designs.
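Morris & DeShon (2002) give corrections for pooling within-person and between-person designs; underneath sits a generic fixed-effect inverse-variance combination, sketched below. The two effect sizes match the reported +11 pp (Study 4) and +9 pp (Study 5) changes, but the variances are invented:

```python
def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and its variance."""
    weights = [1 / v for v in variances]
    est = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return est, 1 / sum(weights)

# Effects: +11 pp (Study 4) and +9 pp (Study 5); variances invented
est, var = pooled_effect([11, 9], [4.0, 3.0])
print(round(est, 2))  # 9.86
```

The pooled estimate lands between the two study effects, pulled toward the more precise (lower-variance) one.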
Overall: Actors earn trust by helping and by punishing non-helpers, but trust-gains are attenuated when bots are involved (especially when Beneficiary is a bot). Perceived consensus about cooperative norms explains variation in trust-gains and, when made salient, can reduce differential treatment and boost trust in norm-followers.
Discussion
The studies address the central question of how cooperative norms and their perceived consensus influence trust in mixed human-bot collectives. Results show that sharing and third-party punishment signal trustworthiness as in human-only societies, but their signaling potency declines when bots are involved—particularly when the intended beneficiary is a bot. This attenuation appears linked to uncertainty about whether helping and punishing norms are consensual in human-bot contexts. Correlational evidence (Study 3) demonstrates that greater perceived injunctive norm-consensus is associated with larger trust-gains for norm-followers across varied identity configurations. Causal manipulations (Studies 4–5) show that informing participants about high consensus on helping can increase trust-gains and reduce asymmetric treatment across human-bot pairings. These findings suggest that individuals import human-human cooperative norms into mixed settings but adjust trust allocations based on perceived consensus, implying that norm communication strategies can facilitate trust formation and cooperation in human-machine ecosystems.
Conclusion
The paper contributes evidence that trust in mixed human-bot collectives is shaped by cooperative norms and, critically, by the perceived consensus surrounding those norms. While helping and punishing generally increase trust, their effects are reduced when bots are part of the interaction—especially when bots are beneficiaries or helpers. Perceived consensus about helping and punishing predicts the magnitude of trust-gains, and experimentally making consensus salient can boost trust in norm-followers and reduce differential treatment of bots versus humans. These insights indicate that cooperative norms from human-only societies can guide behavior in human-machine collectives and that explicitly communicating emerging consensus may accelerate norm stabilization and trust establishment. Future research should test the generality across platforms and currencies of cooperation (e.g., likes, reputation, bans), examine cultural and contextual moderators, and explore additional mechanisms (e.g., signaling vs. altruism) and design levers (e.g., bot transparency, capability/role design) for fostering cooperation.
Limitations
- Sample representativeness: MTurk workers skew younger, more educated, and more tech-savvy than the general population; results may not generalize to other demographics.
- Stylized laboratory setting: The controlled third-party punishment and trust games may not capture idiosyncrasies of specific platforms; external validity may vary by context.
- Currency proxy: Money was used as a proxy for platform-specific currencies (likes, bans); effects may depend on the operative currency in particular ecosystems.
- Cultural and contextual variation: Social norms about humans and machines vary by culture and context, limiting prediction from a single set of studies.
- Deception and belief in manipulation: Study 4 employed deception and trust in the norm-consensus manipulation varied across conditions; analyses conditioned on believing the manipulation; effects attenuate when including all participants.
- Normality violations: Although large samples justify t-tests, formal normality assumptions were not met.
- Repeat participation: Some participants appeared across studies (especially Study 5), though robustness checks suggest substantively similar results when excluding returnees.
- Hypothetical decisions in Study 3: While distributions were similar to incentivized Study 2, hypothetical nature remains a caveat.