Consensus on community guidelines: an experimental study on the legitimacy of content removal in social media

Sociology


J. C. Aguerri, F. Miró-Llinares, et al.

Discover how the public perceives the legitimacy of removing violent and hateful communication from social media. This compelling study reveals a strong consensus among participants regarding the seriousness of such content, especially content targeting vulnerable groups. Conducted by Jesús C. Aguerri, Fernando Miró-Llinares, and Ana B. Gómez-Bellvís, the research sheds light on an urgent societal issue.

Introduction
The paper investigates whether the general public perceives social media community guidelines restricting violent and hateful communication (VHC) as legitimate. With social media serving as major venues for public and political discourse, platforms increasingly regulate harmful speech through private rules that can exceed state legal thresholds, raising debates about legitimacy and freedom of expression. The study adopts a legitimacy perspective grounded in perceptions of moral credibility and social norms: if users view rules and sanctions as legitimate, compliance should increase. The research questions focus on the extent of consensus regarding content seriousness and support for content removal under platform guidelines, and on which factors (e.g., harmfulness, wrongfulness, frequency) drive perceived seriousness and support for moderation. The authors specifically examine VHC (including but not limited to hate speech) and test four hypotheses: consensus varies between abstract and concrete presentations (H1), behaviors implying physical harm are rated more serious (H2), wrongfulness outweighs harmfulness in determining seriousness (H3), and seriousness is linked to support for removal (H4).
Literature Review
The background reviews how discursive harms have migrated online and how social media function as relational places enabling distinctive crime phenomena that rely on communication alone. Regulating VHC is complex given cross-jurisdictional differences and varying protections for offensive speech. Hate speech is defined across sources by discriminatory targeting based on protected characteristics and forms a subset of broader VHC. The paper surveys public and private regulatory frameworks: states impose differing legal limits, while platforms, incentivized to maintain safe and engaging environments, have developed their own community standards enforced via a mix of algorithms, human moderators, and user flagging. Platform guidelines (e.g., Meta's and Twitter's) share core protected values (authenticity, safety, privacy, dignity) and similar rule structures around violence, harassment, and hateful conduct, with Meta generally more restrictive. Prior work indicates users support proactive moderation but often criticize opaque, inconsistent enforcement and perceived infringements on free speech. Research on community views of crime suggests high consensus around seriousness and proportional punishment, but agreement with abstract norms can mask disagreements that surface when concrete cases are presented. Prior studies identify harmfulness, wrongfulness, and perceived frequency as key determinants of perceived seriousness, with some evidence that wrongfulness is usually weighted more heavily. These strands motivate the current hypotheses about consensus sensitivity to examples (H1), the role of physical harm (H2), the relative weight of wrongfulness vs. harmfulness (H3), and the connection between seriousness and support for removal (H4).
Methodology
Design: A between-groups experiment with three independent samples tested perceptions of VHC under different presentation formats: Group 1 received only the community-standard description; Group 2 received the description plus an example; Group 3 received only an example.

Sample: N=918 Spain-resident participants recruited via non-probability snowball sampling, split into n1=302, n2=301, and n3=315. To mitigate bias, 10 initial seeds were selected across age/sex strata and asked to disseminate the survey within similar networks. Groups showed no significant differences in age, political identity, or legal literacy; slight differences existed in education and gender composition.

Stimuli: Thirteen discursive behaviors were drawn from Twitter's Rules to represent a common baseline across Twitter, Facebook, and Instagram. Where possible, examples came from platform documentation; otherwise, the authors created clear, serious examples. Behaviors spanned threats of physical harm, glorification of violence, promotion of terrorism/violent extremism, abusive insults, wishing harm, hateful content (including stereotypes, slurs, and symbols), election misinformation, and harmful synthetic/manipulated media.

Measures: For each behavior, participants rated on 0–10 scales: wrongfulness (moral reproach), harmfulness (risk/harm), seriousness, and agreement that the content should be removed (consequences). Perceived frequency used a 0–5 Likert scale. Demographics included age, gender, education level, formal legal studies, and political ideology.

Analyses: Variances were compared across groups (Levene's test) to test consensus effects (H1). Ceiling effects were examined, and behaviors were grouped by references to physical harm and to vulnerable groups, compared via independent-samples t-tests (H2). Associations between seriousness and support for removal were assessed via Pearson correlation and OLS regression (H4). Determinants of seriousness were modeled via OLS with seriousness as the dependent variable and wrongfulness, harmfulness, perceived frequency, presentation type, and demographics as predictors (H3). A second OLS modeled consequences with seriousness, perceived incidence, presentation type, and demographics as predictors. Key coefficients and R2 values are reported from Table 4.
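As a rough illustration, the following is a minimal Python sketch of this analysis pipeline, assuming a long-format dataset with one row per participant-behavior rating. The file and column names (vhc_ratings.csv, seriousness, group, physical_harm, etc.) are hypothetical stand-ins, not the authors' actual variables.

```python
# Minimal sketch of the reported analysis pipeline using SciPy and statsmodels.
# All file/column names below are hypothetical stand-ins for the study's data.
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Hypothetical long-format file: one row per participant x behavior rating
df = pd.read_csv("vhc_ratings.csv")

# H1: compare rating variances across the three presentation groups (Levene's test)
g1 = df.loc[df["group"] == "standard_only", "seriousness"]
g2 = df.loc[df["group"] == "standard_plus_example", "seriousness"]
g3 = df.loc[df["group"] == "example_only", "seriousness"]
print(stats.levene(g1, g2, g3))

# H2: independent-samples t-test, behaviors referencing physical harm vs. not
# (equal_var=False, i.e. Welch's test, is a conservative assumption here)
harm = df.loc[df["physical_harm"] == 1, "seriousness"]
no_harm = df.loc[df["physical_harm"] == 0, "seriousness"]
print(stats.ttest_ind(harm, no_harm, equal_var=False))

# H4: Pearson correlation between seriousness and support for removal
print(stats.pearsonr(df["seriousness"], df["consequences"]))

# H3: OLS with seriousness as the dependent variable
model_seriousness = smf.ols(
    "seriousness ~ wrongfulness + harmfulness + frequency + C(group) "
    "+ age + C(gender) + education + ideology + C(legal_studies)",
    data=df,
).fit()
print(model_seriousness.summary())
```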
Key Findings
- Consensus decreases with concrete examples (H1): Including examples increased variance in ratings of seriousness and support for removal relative to description-only presentation. Levene's tests showed significant variance heterogeneity across groups (seriousness: F=15.883, p<0.001; consequences: F=14.286, p<0.001).
- High overall seriousness (ceiling effect): All behaviors were rated very serious, limiting fine-grained ranking.
- Physical-harm references increase perceived seriousness (H2): Behaviors referencing physical harm had a higher mean seriousness (9.28) than those without (9.08), p<0.001.
- Targeting vulnerable groups increases perceived seriousness: VHC against protected/vulnerable groups averaged 9.34 vs. 8.99 for non-group-specific cases, p<0.001.
- Seriousness predicts support for removal (H4): Seriousness correlated with agreement on removal (Pearson r=0.605, p<0.001). An OLS model with consequences as the dependent variable showed a substantial, significant effect of seriousness alongside perceived incidence (coef≈0.74, p<0.001), explaining 38.8% of the variance (Table 4: R2=0.388).
- Determinants of seriousness (H3 rejected): In the OLS predicting seriousness (R2=0.636), harmfulness (coef≈0.42, p<0.001) and wrongfulness (coef≈0.41, p<0.001) were both significant, with harmfulness slightly stronger, contradicting prior findings that moral wrongfulness dominates. Perceived incidence had a small positive effect (≈0.03, p<0.01). Presentation with examples reduced seriousness (standard+example coef≈-0.18; example-only coef≈-0.23; both p<0.001). Demographic effects were small: being male slightly decreased seriousness (≈-0.10, p<0.001); higher education slightly increased it (≈0.03, p<0.01); more right-leaning ideology slightly decreased it (≈-0.06, p<0.001); age had a negligible positive effect; formal legal studies were not significant.
- Predictors of support for removal: In the consequences model, beyond seriousness, perceived incidence was strongly positive (≈0.74, p<0.001); being male was associated with lower support (≈-0.60, p<0.001); right-leaning ideology (≈-0.06, p<0.001) and higher education (≈-0.09, p<0.001) slightly reduced support; example-only presentation reduced support (≈-0.23, p<0.001). A sketch of this model follows the list.
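To make the second Table 4 model concrete, here is a hedged sketch of how the consequences regression could be specified, reusing the hypothetical DataFrame and column names from the earlier Methodology sketch; none of these identifiers come from the paper itself.

```python
# Sketch of the second OLS from Table 4: support for removal ("consequences")
# regressed on seriousness, perceived incidence, presentation type, and
# demographics. Variable names are hypothetical stand-ins.
import statsmodels.formula.api as smf

model_consequences = smf.ols(
    "consequences ~ seriousness + frequency + C(group) "
    "+ age + C(gender) + education + ideology + C(legal_studies)",
    data=df,  # same long-format DataFrame as in the earlier sketch
).fit()
print(model_consequences.summary())
print(model_consequences.rsquared)  # the paper reports R2 = 0.388 for this model
```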
Discussion
The findings show broad public consensus that VHC prohibited by platform rules is highly serious and should be removed, supporting the legitimacy of community guidelines. However, consensus weakens when rules are applied to concrete examples, aligning with literature that abstract agreement can mask case-level disagreements. This helps explain why users may endorse moderation in principle yet sometimes perceive specific enforcement actions as unfair or restrictive of free speech. Seriousness ratings were especially elevated for content implying physical harm and for VHC targeting protected groups, suggesting strong social concern about potential real-world harms and reinforcing the societal salience of combating hate speech online. Contrary to some prior criminological work, perceived harmfulness slightly outweighs moral wrongfulness in predicting seriousness of VHC, indicating that users prioritize the risk and consequences of online speech over purely moral evaluations. Seriousness, in turn, is a significant driver of support for removal, though models explain only a moderate proportion of variance, implying additional unmeasured factors (e.g., experiences with moderation, trust in platforms, political context) also shape attitudes. Demographic effects were small but consistent: men and more right-leaning respondents showed slightly lower seriousness and support for removal. Overall, the results suggest that platform rules align with many users’ norms and risk perceptions, offering a degree of social legitimacy for content regulation while highlighting the importance of fair, transparent case-level enforcement to maintain perceived legitimacy.
Conclusion
This study contributes experimental evidence that users perceive VHC restricted by social media guidelines as highly serious and generally support its removal. Consensus is robust but diminishes when abstract rules are instantiated with concrete examples, underscoring the need for transparent, well-justified enforcement in specific cases. Perceived harmfulness is the strongest predictor of seriousness for VHC, exceeding moral wrongfulness and reflecting public concern about real-world risks, particularly for content targeting protected groups or implying physical harm. Seriousness is positively associated with support for removal, but substantial unexplained variance suggests additional factors influence attitudes toward moderation. Future research should examine enforcement procedures and transparency as dimensions of legitimacy, and should explore individual experiences with moderation, trust in platforms, cross-cultural generalizability, and potential moderators such as political identity, platform type, or prior exposure to VHC. Methodologically, addressing ceiling effects and employing more diverse, probability-based samples could refine estimates and improve external validity.
Limitations
- Sampling: Non-probability snowball sampling of Spain-resident participants limits generalizability; recruitment may over-represent certain internet-active subpopulations.
- Group composition: Slight between-group differences (notably gender and education) could confound presentation effects; women were over-represented overall and may slightly elevate seriousness ratings.
- Ceiling effects: Very high seriousness ratings across behaviors limited fine-grained ranking and may compress variability.
- Measurement context: Hypothetical ratings of examples/descriptions may not fully capture reactions to real-world posts, and enforcement perceptions were not directly manipulated.
- Explanatory scope: Models for support of removal explained a moderate share of variance, indicating unmeasured factors influence attitudes.
- Platform scope: Behaviors derived from Twitter's Rules as a common baseline may not cover the full nuance of other platforms' policies or enforcement practices.