
Modularity and composite diversity affect the collective gathering of information online

N. Pescetelli, A. Rutherford, et al.

This study by Niccolò Pescetelli, Alex Rutherford, and Iyad Rahwan examines how group composition and structure affect performance in online information gathering. It finds that diversity within groups can improve forecasts of geopolitical events, and that modular crowds (several small groups aggregated) outperform a single large group.

Introduction

The study addresses how the composition and structure (size and modularity) of online groups influence their ability to gather information and form accurate beliefs about uncertain, time-critical geopolitical events. Motivated by concerns about filter bubbles, algorithmic personalization, and homophily leading to correlated information exposure, the authors test whether sorting individuals along a high-dimensional trait space affects the independence and breadth of information accessed online, and how this interacts with group size. They hypothesize that online interactions and enforced consensus can improve accuracy, and that composite diversity and modularity may enhance aggregate performance, especially in environments where judgments are correlated and rational debate opportunities are limited.

Literature Review

Prior work suggests that personalization and homophily can create clusters of users with correlated information exposure online, potentially undermining diversified knowledge acquisition. Psychological and organizational research links diversity to improved information independence, complex thinking, creativity, and resilience to biases, though often studying single dimensions (e.g., age, skill, race). The authors extend this by considering composite diversity across many traits that matter in digital ecosystems. Work on collective intelligence and social learning indicates that smaller, modular groups can reduce correlated errors and can outperform single large crowds in estimation tasks; however, the generalization to complex, real-world forecasting tasks is less understood. The paper situates itself within this literature by experimentally manipulating composite diversity and modularity to assess causal impacts on forecasting performance.

Methodology

Participants (N=193) completed a pretest survey covering 29 trait dimensions (demographic, professional, political, geographic/relational, and cognitive measures including CRT). Using DBSCAN clustering on the resulting multidimensional profiling space, participants were segmented into core (~50%), inner (~25%), and outer (~25%) segments. Core participants were randomly assigned to: (a) low diversity (paired with the inner segment, i.e., more similar) or high diversity (paired with the outer segment, i.e., more dissimilar) conditions; and (b) small groups (~5 people) or large groups (~25 people). Group modularity was manipulated orthogonally: the non-modular condition aggregated one large interacting group (M=1) per diversity treatment; the modular condition aggregated across multiple independent small groups (M>1). The diversity measure was based on Euclidean distance in the profiling space and correlated strongly with standard deviance (r=0.92, p<0.001).

During the test phase, core participants solved eight binary geopolitical forecasting problems drawn from IARPA’s Hybrid Forecasting Competition (unresolved at the time). For each item, participants provided: (1) an initial private forecast (no search), (2) a revised private forecast after timed online browsing, and (3) a final private forecast following real-time group chat plus a required group consensus forecast.

Performance was scored with Brier scores (0 best, 2 worst). Aggregate forecasts used median aggregation: for modular conditions, medians were computed within groups and then aggregated across groups; for non-modular conditions, a single group median was used. Analyses employed multilevel GLMMs with a Gaussian log link (robustness checks with probit/logit). Preregistered main analyses examined: (i) forecast type effects on individual errors and (ii) effects of composite diversity and modularity on aggregated errors.

Exploratory analyses assessed individual-level effects of diversity and group size, within-group disagreement (SD of forecasts), performance variability (SD of average individual performance), and consensus convergence dynamics via coded chat data. Preregistration available at AsPredicted (https://aspredicted.org/9m6df.pdf); data and code on OSF (osf.io/wb538).
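The scoring and aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the two-category form of the Brier score is assumed because it matches the stated 0 to 2 range, and the across-group step in the modular condition is assumed to be a median as well.

```python
from statistics import median


def brier_score(forecast: float, outcome: int) -> float:
    """Two-category Brier score for a binary event (0 is best, 2 is worst).

    forecast: probability assigned to the event occurring, in [0, 1].
    outcome:  1 if the event occurred, 0 otherwise.
    """
    # Squared error summed over both categories (event, no event).
    return (forecast - outcome) ** 2 + ((1 - forecast) - (1 - outcome)) ** 2


def aggregate_non_modular(group: list[float]) -> float:
    """Single large group (M=1): one median over all members' forecasts."""
    return median(group)


def aggregate_modular(groups: list[list[float]]) -> float:
    """Several independent small groups (M>1).

    Medians are taken within each group first; combining the group
    medians with another median is an assumption of this sketch.
    """
    return median(median(g) for g in groups)


# Toy forecasts (made up for illustration): three groups of three people.
toy_groups = [[0.1, 0.2, 0.9], [0.3, 0.4, 0.5], [0.6, 0.7, 0.8]]
toy_crowd = [f for g in toy_groups for f in g]

modular_forecast = aggregate_modular(toy_groups)
crowd_forecast = aggregate_non_modular(toy_crowd)
```

On these toy numbers the modular aggregate is 0.4 and the single-crowd aggregate is 0.5, which shows that the two schemes can disagree even when they pool exactly the same individual forecasts.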

Key Findings

Online browsing aligned beliefs proportionally to trait similarity: Euclidean distance in profiling space inversely related to forecast correlations after browsing (initial: r=0.12, p=0.38; revised: r=-0.39, p=0.006; final: r≈-0.056, p<0.001), indicating belief coupling emerged through online search rather than prior to it.
Forecast type effects (individual level): Consensus forecasts had lower errors than initial, revised, and final private forecasts, indicating benefits of social interaction and enforced consensus (e.g., Table 1A: Initial β=0.62, p=5.81e−12; Revised β=0.70, p=7.73e−15; Final β=0.24, p=0.017, all relative to consensus baseline).
Individual-level manipulation effects (exploratory): Higher composite diversity marginally improved individual accuracy (β=-0.37, p=0.066). Group size alone (small vs. large) had no significant main effect (β=-0.20, p=0.319). There was a significant interaction: the benefit of diversity was larger in large groups than small groups (Diverse×Small β=0.83, p=0.004).
Aggregate performance (preregistered): Both composite diversity and modularity improved aggregated accuracy (Table 1D). Diverse groups outperformed homogeneous groups (β=-0.56, p=0.016); modular (multiple small groups aggregated) outperformed non-modular (single large group) (β=-0.82, p=0.0019). There was a significant interaction (β=0.93, p=0.014), indicating diversity’s benefit was greater in large groups than in small groups.
Forecast type effects (aggregate, exploratory): Consensus errors were lower than initial (β=0.68, p=0.003) and revised (β=0.60, p=0.009) aggregates; no advantage over final at aggregate level (β=-0.13, p=0.666).
Disagreement and dynamics: Opinion disagreement (within-group SD) increased after browsing (Revised β=5.06, p<0.001) and decreased after social interaction (Final β=-4.41, p<0.001). Diversity×Group size interaction indicated diversity had a smaller effect on disagreement in large groups than in small groups (β=7.11, p=0.006). Residual disagreement persisted in final private forecasts.
Performance variability: Variability across members’ average accuracy increased notably in small diverse groups after browsing, less so in small homogeneous and large groups, indicating uneven benefits/harm of online search within some groups.
Consensus reaching: Diverse groups reached consensus faster (β=-0.31, p=0.01), and small groups were faster than large groups (β=-0.46, p<0.001), with a positive interaction indicating the diversity-related speed advantage diminished in smaller groups.
Overall, composite diversity and modularity improved forecasting, with diversity’s benefit scaling with group size; enforced consensus added accuracy beyond mere exposure to social information.
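The first finding above relates pairwise distance in the profiling space to pairwise forecast agreement. One way such a relationship could be computed is sketched below, under stated assumptions: agreement between two participants is taken to be the Pearson correlation of their forecasts across questions, and the function names and toy data are hypothetical. The paper's exact procedure may differ.

```python
from itertools import combinations
from math import dist, sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance_vs_agreement(traits: dict, forecasts: dict) -> float:
    """Correlate trait distance with forecast agreement over all pairs.

    traits:    participant id -> trait vector (profiling space)
    forecasts: participant id -> forecasts, one per question
    """
    distances, agreements = [], []
    for a, b in combinations(sorted(traits), 2):
        # Euclidean distance between the two trait profiles.
        distances.append(dist(traits[a], traits[b]))
        # Agreement: correlation of the pair's forecasts (an assumption).
        agreements.append(pearson(forecasts[a], forecasts[b]))
    return pearson(distances, agreements)


# Toy data (made up): p1 and p2 are similar and agree; p3 is distant.
toy_traits = {"p1": [0, 0], "p2": [0, 1], "p3": [5, 5]}
toy_forecasts = {
    "p1": [0.1, 0.5, 0.9],
    "p2": [0.2, 0.5, 0.8],
    "p3": [0.9, 0.5, 0.2],
}
r = distance_vs_agreement(toy_traits, toy_forecasts)
```

A negative `r`, as in this toy case, corresponds to the reported pattern in which more distant participants hold less correlated forecasts.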

Discussion

The findings directly address how online group composition and structure shape information gathering and forecasting under uncertainty and time pressure. Trait similarity did not predict belief alignment before search, but it did after browsing, suggesting that digital ecosystems map user traits onto information access, creating correlated beliefs. Social interaction and enforced consensus improved accuracy, supporting the value of structured deliberation. At the aggregate level, modularity (independent small groups) and composite diversity yielded better forecasts; however, diversity’s positive effect was stronger in large groups. Exploratory analyses suggest mechanisms: browsing increased disagreement and within-group performance variability (especially in small diverse groups), while social deliberation reduced disagreement and accelerated consensus, particularly in diverse or smaller groups. The results imply that in environments with interdependent judgments, structuring collectives into modular subgroups and increasing composite diversity can mitigate correlated errors and enhance performance. They also raise concerns that personalization may skew collective outcomes by coupling beliefs among similar users, highlighting the importance of considering digital context in collective intelligence.

Conclusion

This study provides causal evidence that composite trait diversity and modularity improve collective forecasting in complex, time-critical online information environments. Diversity’s benefits scale with group size, and aggregating multiple independent small-group deliberations outperforms aggregating a single large crowd. Enforced consensus further enhances accuracy beyond mere social exposure. The work emphasizes the role of digital personalization in shaping information access and belief correlation. Future research should replicate these findings, refine diversity measures and ground them in theory, investigate alternative aggregation and network structures, examine how specific platform algorithms mediate information access, and explore ethical and practical interventions (e.g., increasing modularity) to improve online collective decision-making.

Limitations

Some analyses were exploratory (e.g., individual-level effects, disagreement/variability, consensus dynamics); exact analytic plans for these were not preregistered.
The direction of the interaction between diversity and group size/modularity was not hypothesized a priori; replication is needed.
The experiment was conducted once (pilot discarded due to a software bug); generalizability may be limited.
Specific forecasting problems were not preregistered; most events were rare, which may influence Brier score dynamics.
The composite diversity metric, while data-driven and robust to an alternative measure, may not capture all theoretically relevant dimensions; potential confounds from unmeasured traits remain.
Findings about personalization’s role in belief coupling post-browsing were not preregistered and should be interpreted cautiously.
Sample drawn from a specific population; external validity to other platforms, tasks, and cultures is uncertain.
