
Computer Science
Diverse misinformation: impacts of human biases on detection of deepfakes on networks
J. Lovato, J. St-onge, et al.
This research by Juniper Lovato and colleagues examines human biases in identifying deepfakes, finding that people classify videos more accurately when the personas match their own demographics. A companion mathematical model suggests that demographically diverse social groups may help shield one another from misinformation.
Introduction
The study investigates how human demographic biases shape susceptibility to and detection of deepfakes, a salient form of misinformation. The authors define “diverse misinformation” as the interplay between human biases and the demographics represented in misleading content. They pose four research questions: (Q1) the role of priming (whether knowing to look for deepfakes improves detection), (Q2) the impact of prior knowledge and social media use on accuracy, and (Q3–Q4) whether homophily (matching perceived demographics between viewer and video persona) or heterophily influences detection accuracy across age, gender, and race. Deepfakes are chosen because their veracity is objectively binary, their personas’ demographic attributes can be perceived by viewers, and they represent a real-world, harmful phenomenon. Understanding these biases is important for characterizing how misinformation spreads and is corrected within social networks and for evaluating the feasibility of a “self-correcting crowd.”
Literature Review
Prior work shows that homophily bias and political alignment influence misclassification of political misinformation, with greater belief in misinformation when sources align politically. Broader misinformation harms span politics, health, economic markets, journalism trust, crisis communication, and hate speech. Automated deepfake detection exists but faces accuracy and bias issues and can be computationally expensive; hybrid human–machine systems often outperform either alone. Early deepfakes were visually imperfect, but advances (e.g., GANs, face mapping) have increased realism, complicating detection. Ethical concerns include evidentiary standards in law, consent and attribution, bias in detection tools, and degradation of the epistemic environment. Literature on biases such as own-race bias (ORB), gender and age biases in face processing, and homophily in networks suggests demographic factors can affect detection performance and network structure, influencing how misinformation propagates and is corrected. Network modeling work (e.g., SIS-like rumor models, community structure effects, mixed-membership block models) highlights how demographics and topology modulate spread and correction dynamics.
Methodology
Empirical survey: IRB-approved online observational study with N=2016 U.S.-based participants recruited via Qualtrics (two phases; main data April–May 2020). Participants were not primed to look for deepfakes; instead, they were told the study concerned communication styles (tone, expressions, body language) to mimic real-world, unprimed social media viewing. Each participant watched two ~10-second video clips randomly sampled from the Facebook Deepfake Detection Challenge (DFDC) Preview Dataset after filtering to single-person clips. Videos included balanced real/deepfake content (Table 4: 50% real, 50% deepfake) and some had augmentations/distractors (e.g., frame rate, audio changes, blur, rotation, filters). After viewing and answering communication-style questions, participants were debriefed with a definition of deepfakes and asked to judge each video as real or fake. In total, 4032 video observations (2 per participant) were collected. Demographics of participants are summarized in Table 3 (e.g., gender, age brackets, race/ethnicity). Data processing converted Likert items and education levels to ordinal scales for analysis.
Analysis: Accuracy was computed as (TP+TN)/(TP+FP+TN+FN). The Matthews Correlation Coefficient (MCC) was used to evaluate performance of participant subgroups treated as classifiers. To compare subgroup MCCs across matched/mismatched demographics (homophily vs heterophily), the authors bootstrapped pairs of confusion matrices (10,000 samples) to estimate differences and credibility intervals, rejecting null differences at thresholds ≥95% credibility. They also ran a Bayesian logistic regression on matching demographics (age, gender, race) to model the probability of correct classification.
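As a sketch of this evaluation pipeline, the snippet below computes accuracy and MCC from confusion-matrix counts and estimates a bootstrap credibility for the difference between two subgroups' MCCs, in the spirit of the authors' 10,000-sample procedure. All function and variable names are illustrative, not taken from the paper's code.

```python
import math
import random

def accuracy(tp, fp, tn, fn):
    """Fraction of correct judgments: (TP + TN) / (TP + FP + TN + FN)."""
    return (tp + tn) / (tp + fp + tn + fn)

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; 0 for a degenerate classifier."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0

def counts(obs):
    """Confusion-matrix counts from (truth, guess) pairs, 1 = deepfake."""
    tp = sum(1 for t, g in obs if t == 1 and g == 1)
    fp = sum(1 for t, g in obs if t == 0 and g == 1)
    tn = sum(1 for t, g in obs if t == 0 and g == 0)
    fn = sum(1 for t, g in obs if t == 1 and g == 0)
    return tp, fp, tn, fn

def mcc_diff_credibility(obs_a, obs_b, n_boot=10_000, seed=0):
    """Fraction of bootstrap resamples in which subgroup A's MCC exceeds
    subgroup B's, read as the credibility that A outperforms B."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_boot):
        sample_a = rng.choices(obs_a, k=len(obs_a))
        sample_b = rng.choices(obs_b, k=len(obs_b))
        wins += mcc(*counts(sample_a)) > mcc(*counts(sample_b))
    return wins / n_boot
```

Under the paper's decision rule, a subgroup difference would be reported only when this credibility reaches the 0.95 threshold.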
Mathematical model: To extrapolate population-level effects, the authors formulated an idealized network model inspired by mixed-membership stochastic block models with two demographic classes of equal size, power-law degree distribution, and tunable in-group density (homophily). Misinformation spreads via duped-to-susceptible contacts at demographic-specific rates λ; correction occurs when susceptible neighbors fact-check and revert duped individuals at rate γ. Ordinary differential equations track duped fractions by demographic and degree class with neighbor-state probabilities closing the system. The model explores invasion thresholds, steady states under heterogeneous susceptibility, and how network structure (assortativity Q, degree heterogeneity) shapes spread and “herd correction,” including scenarios with multiple independent misinformation streams targeting different demographics.
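A minimal caricature of these dynamics, assuming just two demographic groups, no degree structure, and a single homophily parameter q (the probability that a contact is within one's own group), can be integrated with a forward-Euler loop. The symbols here (lam, gamma, q) are illustrative simplifications of the paper's full degree-resolved equations, not the authors' implementation.

```python
def simulate(lam, gamma=0.1, q=0.5, d0=(0.01, 0.01), dt=0.01, steps=20_000):
    """Forward-Euler integration of a two-group duped/susceptible model.

    lam   -- (lam_a, lam_b): demographic-specific duping rates
    gamma -- rate at which susceptible contacts fact-check duped ones
    q     -- homophily: probability a contact is within one's own group
    Returns the final duped fractions (d_a, d_b).
    """
    da, db = d0
    for _ in range(steps):
        pa = q * da + (1 - q) * db      # duped fraction among a's contacts
        pb = q * db + (1 - q) * da
        dda = lam[0] * (1 - da) * pa - gamma * da * (1 - pa)
        ddb = lam[1] * (1 - db) * pb - gamma * db * (1 - pb)
        da += dt * dda
        db += dt * ddb
    return da, db
```

Linearizing near d ≈ 0 gives an invasion threshold around λ ≈ γ. With one vulnerable group (lam_a > gamma > lam_b), lowering q (more cross-group ties) pulls the vulnerable group's steady state down, a toy version of the "herd correction" effect the full model formalizes.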
Key Findings
- Overall detection performance: Unprimed participants achieved 51% accuracy across the 4032 video judgments, with an overall MCC of 0.0334, barely above chance; the evidence that performance exceeded random guessing reached 94% credibility, just short of the 95% threshold.
- Q1 Priming: Comparison with prior work on primed participants suggests substantial gains from priming and from human–machine teaming. Reported benchmark accuracies: non-primed humans (this study) 51%; primed humans 66%; machine alone 65%; primed human with a machine helper 73%.
- Q2 Prior knowledge/social media use: Weak evidence that familiarity and usage improve accuracy. Frequent social media users had MCC=0.0396 vs infrequent users MCC=−0.001 (83% credibility). Participants familiar with deepfakes had MCC=0.079 vs unfamiliar MCC=−0.0175 (94% credibility). Neither comparison met the ≥95% credibility threshold.
- Q3–Q4 Homophily/heterophily effects (≥95% credibility unless noted):
• White participants showed a homophily bias: they were more accurate on videos of white personas than on personas of color (White viewer on homophilic videos MCC=0.0518 vs heterophilic videos MCC=−0.0498; credibility 0.99 and 0.97, respectively).
• By gender, when viewing male personas, male participants outperformed female participants (Male persona/Male viewer MCC=0.0827 vs Male persona/Female viewer MCC=0.0567; credibility 0.97–0.98).
• By race, participants of color were more accurate than white participants on videos of personas of color (POC persona/POC viewer MCC=0.0858 vs POC persona/White viewer MCC=−0.0544; credibility 0.99).
• By age, participants aged 18–29 were more accurate on young personas than older viewers were, and were also more accurate than 30–49-year-old participants even on 30–49 personas (e.g., Age 30–49 persona/Age 18–29 viewer MCC=0.1168 vs Age 30–49 persona/Age 30–49 viewer MCC=−0.0037; credibility ~0.96).
- Modeling insights: Heterogeneous susceptibility and network structure produce steady states of misinformation below saturation. Highly susceptible individuals in assortative echo chambers face the highest risk; diverse neighborhoods provide “herd correction,” especially for high-degree nodes. Protection scales with reduced homophily (lower Q) and increased degree heterogeneity (friendship paradox effects).
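The friendship-paradox lever mentioned in the last bullet can be illustrated directly: sampling a random friend is degree-biased sampling, so the average friend has at least the average degree. The helper below is a hypothetical illustration using only the standard library, not code from the paper.

```python
import statistics

def degree_vs_friend_degree(edges):
    """For an undirected edge list, return (mean degree of a node,
    mean degree of a random friend, i.e. a uniform edge endpoint).
    The second is always >= the first: high-degree nodes sit on many
    edges, so they dominate neighborhoods."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    mean_deg = statistics.mean(deg.values())
    mean_friend_deg = statistics.mean(deg[n] for e in edges for n in e)
    return mean_deg, mean_friend_deg
```

In a star graph the gap is large (the hub is everyone's friend), while in a regular ring it vanishes; the more heterogeneous the degrees, the more a typical exposure runs through high-degree nodes, which is why giving those nodes diverse neighborhoods has an outsized protective effect.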
Discussion
Findings address the four research questions as follows: (Q1) Non-primed observers detect deepfakes at near-chance levels, aligning with the notion that in-the-wild detection is difficult without explicit cues or training. (Q2) Prior awareness of deepfakes and frequent social media use show only weak associations with improved accuracy, suggesting limited protective effects from general exposure alone. (Q3–Q4) Demographic matching significantly shapes performance: strong evidence of homophily effects across race, gender, and age indicates viewers are generally better at classifying videos that match their own perceived demographic attributes. These results extend own-race bias phenomena to deepfake detection and highlight that no single demographic excels across all persona types.
At the population level, the mathematical model shows how demographic-specific susceptibilities and network structure can lead to heterogeneous steady states of misinformation. Critically, diverse social neighborhoods can enable “herd correction,” where differing strengths across demographic groups allow friends to correct each other’s blind spots, generalizing the self-correcting crowd concept. Conversely, echo chambers comprising highly susceptible individuals amplify risk and hinder correction. These insights suggest interventions should foster cross-demographic connectivity and leverage complementary strengths, potentially augmented by human–machine hybrid detection systems.
Conclusion
This work provides an in-the-wild style assessment of human deepfake detection under minimal priming, demonstrating overall near-chance accuracy and strong demographic effects consistent with homophily biases. By integrating an idealized network model, the study links individual-level biases to population-level dynamics, highlighting conditions under which diverse social ties enable “herd correction” and mitigate spread.
Future research directions include: experimentally varying priming and training; evaluating non-primed human performance with machine assistance; randomized controlled trials to identify causal mechanisms behind homophily advantages; field studies on real social networks with varied diversity; and testing editorial or platform-level interventions (e.g., cross-demographic exposure, education on deepfakes) to reduce susceptibility and enhance correction.
Limitations
- Observational design without experimental manipulation of priming; participants were unprimed during viewing, which limits causal inference about priming effects (comparisons rely on external studies).
- Potential algorithmic and dataset biases (DFDC content quality and augmentations) could differentially affect detectability across demographics; the study focuses on viewer biases but cannot fully disentangle content-generation biases.
- Self-reported participant demographics and perceived demographics of video personas may introduce misclassification and measurement error.
- Confounding variables (e.g., varying social media usage across age groups) may influence subgroup comparisons despite bootstrapping and regression analyses.
- U.S.-based Qualtrics panel may limit generalizability to other populations and platforms.
- Some statistical findings (e.g., effects of prior knowledge and social media usage) did not reach ≥95% credibility.
- The mathematical model is stylized with simplifying assumptions (two demographic classes, independent misinformation streams) and serves as a qualitative exploration rather than a calibrated predictive model.