Impact of the quality and diversity of reference products on creative activities in online communities

K. Sato, K. Yang, et al.

We examined how exposure to others’ work affects creativity by statistically analyzing three large online communities (Cities: Skylines, SCP-wiki, Archive of Our Own). Findings show a “just right” level of quality diversity boosts output, while high content diversity and extremely high reference quality can harm product quality. Research conducted by Keisuke Sato, Kunhao Yang, and Kazuhiro Ueda.

Introduction
The study addresses a central question in creativity research: what kinds of external information exposure foster higher-quality creative output? While prior work shows that environmental factors and exposure to others’ ideas can both inspire and constrain creativity, the specific characteristics of referenced products that help or hinder subsequent creation remain unclear. Focusing on the diversity and quality of products that creators refer to in online communities, the authors propose two hypotheses: (1) greater diversity of referenced products increases the quality of generated products; and (2) higher quality of referenced products increases the quality of generated products. Using large-scale data from three active online creative communities (video game modding and novel writing), the study seeks quantitative evidence on how reference product characteristics relate to the quality of subsequent creations.
Literature Review
Prior research highlights multiple influences on creativity, including individual abilities (e.g., divergent thinking, executive and metacognitive abilities), knowledge breadth and depth, experience, personality, and memory structure, as well as environmental factors such as physical space and ambient conditions. Exposure to others’ ideas has been emphasized as a source of cognitive stimulation and inspiration, exemplified by brainstorming and online collaborative platforms. However, exposure can also induce fixation, constraining novelty. Studies have yielded mixed results: diversity in knowledge, words encountered, identities, and team composition is often beneficial, while exposure to common or low-novelty examples can foster fixation. In online innovation settings, the number and originality of viewed prior products and their evaluations can affect idea generation. The present work builds on this literature to clarify how the diversity (quality diversity and content diversity) and average quality of referenced products shape the quality of new creations.
Methodology
Data were collected from three online communities:
- Steam Community’s Cities: Skylines modding community: 170,032 mods by 35,579 developers (as of Sept 13, 2022) were scraped, including developer data, Current Subscribers, publication date, and tags (122 types). Developer metadata included Cities: Skylines playtime, number of games purchased on Steam, and the list of favorited Cities: Skylines mods (availability depended on user privacy settings).
- SCP-wiki (original horror/sci-fi stories): 4,653 stories by 3,405 participants (Jan 2009–Dec 2018) were collected via API, including voters, Rating (the number of +1 votes minus the number of −1 votes), initial draft text, and publication date.
- Archive of Our Own (AO3; fanfiction): 102,964 novels by 6,031 randomly selected authors were scraped, with Kudos, original work status, authorship, and publication date, along with 573,012 bookmarked novels (with bookmark time, Kudos, original work status, authorship, and publication date).
Quality metrics (log10 scale): Steam quality = log10(1 + Current Subscribers); SCP quality = log10(1 + Rating); AO3 quality = log10(1 + Kudos).
Reference sets: inferred from behavioral records (Steam Favorites, SCP Votes, AO3 Bookmarks), which were included only if recorded before the creation of each target work.
Diversity metrics: Quality diversity = standard deviation of the quality values within the reference set. Content diversity: for Steam and AO3, Shannon entropy of tag/original-work distributions; for SCP, average pairwise Euclidean distance between sentence-BERT (768-dim) embeddings of referenced stories.
Reference quality: average of the quality values within the reference set.
Modeling: polynomial regression with orthogonal polynomials (QR decomposition) and L1 regularization; the polynomial degree was selected by BIC for each analysis. Dependent variable: logged quality of the generated product.
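The quality, diversity, and reference-quality metrics defined above can be illustrated with a minimal Python sketch. This is not the authors' code. The function names and input shapes are hypothetical, and two details the summary leaves unstated are assumed here: the population form of the standard deviation, and base-2 logarithms with tags pooled across the whole reference set for the Shannon entropy.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def quality(count):
    """Log-scaled quality, e.g. log10(1 + Current Subscribers)."""
    return math.log10(1 + count)


def quality_diversity(ref_counts):
    """Standard deviation of quality within one reference set.

    Assumption: population SD; the summary does not say which form was used.
    """
    return pstdev([quality(c) for c in ref_counts])


def reference_quality(ref_counts):
    """Average quality of the items in one reference set."""
    return mean([quality(c) for c in ref_counts])


def content_diversity_tags(ref_tags):
    """Shannon entropy of the tag distribution pooled over a reference set.

    ref_tags holds one list of tags per referenced item.
    Assumption: base-2 logarithm; the summary does not state the base.
    """
    counts = Counter(tag for tags in ref_tags for tag in tags)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def content_diversity_embeddings(embeddings):
    """Average pairwise Euclidean distance between embedding vectors."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two referenced items")
    return mean([math.dist(a, b) for a, b in pairs])
```

For example, `quality_diversity([120, 4500, 30])` would give the quality diversity of a developer whose three favorited mods have those subscriber counts (made-up numbers). The embedding function accepts vectors of any length, so 768-dimensional sentence-BERT vectors could be passed in directly.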
Controls (per dataset): Steam—log10(playtime), log10(number of games purchased), log10(order of mod among developer’s mods); SCP—log10(number of content revisions), time point of publication, days since author’s first participation, number of previous participations; AO3—log10(popularity of the original work, proxied by number of fanfics), log10(order of target novel among author’s works). Analyses were run separately for (a) quality diversity, (b) content diversity, and (c) average reference quality. Observation counts and fit statistics are reported per model in Tables 1–3.
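The modeling step (orthogonal polynomials, L1 regularization, degree chosen by BIC) can be sketched for a single predictor as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline. It builds the orthonormal basis with modified Gram-Schmidt rather than a library QR routine, and it uses the closed-form soft-thresholding solution that holds only when the design columns are orthonormal. It omits the control variables, leaves the intercept unpenalized, and uses the Gaussian form of BIC with the number of nonzero coefficients as the parameter count. The penalty `lam` and all function names are hypothetical.

```python
import math


def orthonormal_poly_basis(x, degree):
    """Orthonormalize the columns 1, x, ..., x**degree (modified Gram-Schmidt).

    The result matches the Q factor of a QR decomposition of the same
    columns up to column signs. It requires more distinct x values than
    the degree, otherwise a column norm becomes zero.
    """
    basis = []
    for power in range(degree + 1):
        col = [xi ** power for xi in x]
        for q in basis:
            proj = sum(c * qi for c, qi in zip(col, q))
            col = [c - proj * qi for c, qi in zip(col, q)]
        norm = math.sqrt(sum(c * c for c in col))
        basis.append([c / norm for c in col])
    return basis


def soft_threshold(z, lam):
    """Shrink z toward zero by lam (lasso solution for one orthonormal column)."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def fit_orthogonal_lasso(x, y, degree, lam=0.0):
    """Return (coefficients, residual sum of squares) on the orthonormal basis."""
    basis = orthonormal_poly_basis(x, degree)
    z = [sum(qi * yi for qi, yi in zip(q, y)) for q in basis]
    # Intercept column is left unpenalized (an assumption of this sketch).
    coefs = [z[0]] + [soft_threshold(zj, lam) for zj in z[1:]]
    fitted = [sum(b * q[i] for b, q in zip(coefs, basis)) for i in range(len(x))]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return coefs, rss


def bic(rss, n, k):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + k*ln(n)."""
    return n * math.log(rss / n) + k * math.log(n)


def select_degree(x, y, max_degree=4, lam=0.0):
    """Fit degrees 1..max_degree and return (best degree, BIC per degree)."""
    n = len(x)
    scores = {}
    for degree in range(1, max_degree + 1):
        coefs, rss = fit_orthogonal_lasso(x, y, degree, lam)
        k = sum(1 for b in coefs if b != 0.0)
        scores[degree] = bic(rss, n, k)
    return min(scores, key=scores.get), scores
```

Calling `select_degree(x, y)` with `x` as a diversity or reference-quality measure and `y` as the logged quality of the generated product returns the degree with the lowest BIC. A full analysis would add the control variables, which are not orthogonal to the polynomial terms, so an iterative L1 solver would be needed in place of soft-thresholding.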
Key Findings
- Quality diversity of reference products: Across all three datasets, the relationship between the quality diversity of references and generated product quality was an inverted U-shape (concave quadratic). Generated product quality was maximized at intermediate levels of quality diversity; too little or too much quality diversity was associated with lower quality. Representative model fits: Steam (R²=0.245; N=13,623; BIC=3.543×10³), SCP (R²=0.320; N=2,907; BIC=2.060×10³), AO3 (R²=0.082; N=65,619; BIC=1.249×10³). Coefficients indicated negative quadratic terms, confirming concavity.
- Content diversity of reference products: Content diversity generally had a negative association with generated product quality. Steam displayed a convex quadratic, with a threshold: when content diversity > 2.833, quality increased with diversity; otherwise, increasing diversity reduced quality. SCP and AO3 showed linear negative relationships. Representative fits: Steam (R²=0.235; N=13,247; BIC=3.469×10⁴), SCP (R²=0.304; N=2,907; BIC=2.269×10³), AO3 (R²=0.075; N=70,665; BIC=1.352×10⁵).
- Average quality of reference products: When the mean reference quality was extremely high, the quality of generated products tended to decrease. Thresholds beyond which higher reference quality predicted lower generated quality were: Steam > 5.159, SCP > 2.534, AO3 > 4.680 (on the respective log scales). Model forms: Steam quartic (R²=0.189; N=16,113; BIC=4.349×10⁴), SCP quadratic (R²=0.309; N=2,964; BIC=2.245×10³), AO3 seventh-degree (R²=0.164; N=70,200; BIC=1.273×10⁵).
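As a worked illustration of how a threshold such as those above follows from a quadratic fit, consider a model written on the raw predictor scale. The symbols below are assumptions for illustration: the summary does not report the coefficients, and the authors fit orthogonal polynomials, so their coefficients would first need converting to this form.

```latex
\[
\hat{y}(x) = \beta_0 + \beta_1 x + \beta_2 x^2,
\qquad
\frac{d\hat{y}}{dx} = \beta_1 + 2\beta_2 x = 0
\;\Longrightarrow\;
x^{*} = -\frac{\beta_1}{2\beta_2}
\]
```

If \(\beta_2 < 0\), the curve is an inverted U and \(x^{*}\) is the level at which predicted quality peaks, as reported for quality diversity. If \(\beta_2 > 0\), the curve is convex and \(x^{*}\) is the point beyond which predicted quality starts rising, as with the Steam content-diversity threshold. For the higher-degree fits (quartic, seventh-degree), the analogous thresholds are real roots of the derivative polynomial and generally have to be found numerically.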
Discussion
The findings challenge the initial hypotheses that greater diversity and higher average quality of referenced products uniformly enhance creativity. Instead, intermediate levels of quality diversity are most beneficial, while excessive diversity can hinder performance—likely due to cognitive load limiting effective memory exploration or heightened perceived constraints when differentiating from numerous diverse exemplars. Content diversity showed predominantly negative effects, suggesting that diversity constrained within the same domain/genre may not equate to the kind of broad, cross-domain diversity that benefits creativity. For average reference quality, extremely high-quality exemplars appear to reduce subsequent product quality, consistent with fixation effects where memorable, high-quality examples bias ideation toward derivative outputs. Overall, these results indicate that curated exposure—neither too diverse nor too elite—may better support high-quality creative production in online communities.
Conclusion
This study provides large-scale, cross-community evidence that the characteristics of referenced products systematically relate to the quality of subsequent creative outputs. The main contributions are: (1) demonstrating an inverted U-shaped relationship between quality diversity of references and output quality; (2) showing predominantly negative effects of content diversity on output quality (with a limited high-diversity region in Steam where effects turn positive); and (3) revealing that extremely high average reference quality can impair output quality. Future research should: investigate cognitive mechanisms (e.g., fixation) using text embeddings to measure similarity shifts post-reference; examine the content diversity of generated products as an outcome; incorporate broader information environments (e.g., web browsing); extend to other creative domains beyond mods and novels; and evaluate community-level factors (audience considerations, feedback, social ties) that may moderate these relationships.
Limitations
- Causality not established: observational analyses cannot rule out alternative explanations (e.g., inherently skilled creators may choose different reference patterns).
- Reference behavior measurement noise: Favorites/Votes/Bookmarks may reflect varied user intentions (e.g., logging viewed items or differentiation materials) and unequal influence levels across references.
- Incomplete information environment: analyses capture only within-platform references and not broader external information exposure (e.g., web browsing, offline sources).
- Domain scope: results are based on mod development and novel writing; effects may differ in other creative domains.
- Cognitive mechanisms not directly tested: the study did not experimentally probe mechanisms like fixation or memory processes underlying the observed relationships.