Political Science
Sordid genealogies: a conjectural history of Cambridge Analytica’s eugenic roots
M. Wintrob
This paper by Michael Wintrob traces the link between Cambridge Analytica’s psychometric tactics and the history of eugenics, and considers what that history implies for data manipulation in electoral politics.
Introduction
The paper investigates how Cambridge Analytica’s claimed psychometric methods for microtargeting voters in the 2016 US election grew out of intellectual traditions intertwined with eugenics. Framed through questions of translation and fidelity (from the Septuagint myth to machine translation), it asks how meanings, measures, and models are transferred across contexts—from statistical techniques and lexical taxonomies to political manipulation. The purpose is to trace a conjectural genealogy linking the lexical hypothesis, trait psychology, and factor analysis to eugenic thought and to show how these methods and assumptions informed Cambridge Analytica’s data practices and the broader politics of Trumpism. This matters for understanding the authority claims of data-driven psychology, their ethical stakes, and their deployment in democratic processes.
Literature Review
The article situates its argument within several bodies of work:
- Histories of eugenics and statistics (e.g., Galton, Pearson, Spearman, Fisher; Kevles; MacKenzie; Porter) showing the co-production of statistical measurement and eugenic ideology.
- Trait psychology and the lexical hypothesis (Galton; Klages; Allport & Odbert; Cattell), including critiques that personality taxonomies are artifacts of methods (e.g., Borsboom; Francis et al.).
- Big Five personality research and applications (McCrae & Costa; Norman; Goldberg) and methodological debates (e.g., PCA as formative rather than reflective).
- The Cambridge Analytica/MyPersonality/Kosinski-Stillwell literature demonstrating prediction from Facebook Likes, and the subsequent controversies.
- STS perspectives on translation, reproducibility, and the symmetry principle regarding truth and falsity (Collins; Duster), used to frame the analysis of how pseudoscientific claims travel.
- Alt-right and race realism sources (e.g., Jared Taylor; Breitbart/Bannon) linking contemporary rhetoric to earlier eugenic discourses.
Methodology
The paper uses conceptual-historical and genealogical analysis, drawing on primary and secondary sources. The author traces lines of descent from eugenics to psychometrics through close reading and synthesis of writings by Galton, Klages, Spearman, Burt, Fisher, and Cattell, alongside mid- to late-20th-century personality psychology (Allport & Odbert, McCrae & Costa, Big Five). The paper integrates case material on Cambridge Analytica’s data acquisition and modeling claims (e.g., MyPersonality, Kogan’s app, Facebook data, ad targeting) and connects these to the methodological and ideological legacies of lexical trait models and factor analysis. It is an interpretive, conjectural genealogy rather than an empirical causal test: it situates techniques within their sociopolitical contexts and assesses their epistemic and ethical implications.
Key Findings
- Cambridge Analytica’s psychometric approach draws on a lineage from eugenics-linked statistical and lexical methods: Galton’s biometry and lexical hypothesis, Klages’s characterology, Spearman’s factor analysis and g (with eugenic aims), Burt’s hereditarian interpretations, and Cattell’s 16-factor model and Beyondism.
- The lexical hypothesis and trait taxonomies were shaped in contexts suffused with eugenic ideology and class/racial hierarchies; factor-analytic methods helped reify latent traits as if they were natural kinds.
- Big Five traits, like earlier taxonomies, can be understood as artifacts of method and language choices; PCA and factor analysis risk the fallacy of misplaced concreteness when constructs are treated as real causes rather than statistical summaries.
- Cambridge Analytica’s data pipeline built on earlier work. The MyPersonality app had 4M users, about a third of whom shared their profiles, and Kosinski & Stillwell showed that Likes could predict intimate traits. Kogan’s app paid ~250,000 US users and leveraged their friends’ Like data (a ~300× expansion), yielding data that Cambridge Analytica claimed covered 87 million users. Reported inference accuracies from Likes included ethnicity (95%), gender (93%), sexual orientation (88%), politics (85%), religion (82%), and relationship status (67%), along with the claim that 30 Likes were enough to outperform a spouse’s predictions.
- Cambridge Analytica (CA) and the Trump campaign reportedly deployed 40,000–50,000 ad variants per day, iteratively optimized via feedback to microtarget personality profiles (e.g., high neuroticism, low openness/agreeableness), though the effectiveness of CA’s psychometrics is disputed.
- There are strong ideological continuities between eugenic thought and contemporary alt-right “race realism,” with actors like Bannon/Breitbart and Jared Taylor echoing themes found in earlier eugenicists and in Cattell’s later writings.
- The performative authority of complex statistical and computational methods can cloak pseudoscientific claims, enabling their translation into political strategy and voter manipulation irrespective of empirical validity.
- Even if Cambridge Analytica exaggerated its capabilities, the infrastructures of data harvesting, psychometric classification, and algorithmic targeting persist in successor firms and broader digital advertising ecosystems, posing ongoing risks to democratic processes.
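The finding that PCA and factor analysis produce statistical summaries rather than evidence of real causes can be illustrated with a small sketch. It is not taken from the paper: the data are simulated noise, and the function names `covariance_matrix` and `top_component_share` are hypothetical. It assumes 200 respondents answering 8 questionnaire items with no latent trait built in, and estimates the variance share of the first principal component by power iteration.

```python
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def top_component_share(cov, iterations=500):
    """Share of total variance carried by the first principal component.

    Uses power iteration to approximate the largest eigenvalue of `cov`.
    """
    p = len(cov)
    v = [1.0 + 0.01 * i for i in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v approximates the top eigenvalue.
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    top_eigenvalue = sum(v[i] * cv[i] for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    return top_eigenvalue / total_variance


# Assumption: simulated, independent item responses with no underlying trait.
random.seed(0)
noise = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
cov = covariance_matrix(noise)
share = top_component_share(cov)
print(f"first component share: {share:.3f} (equal share would be {1 / 8:.3f})")
```

Even on pure noise the first component takes more than an equal share of the variance, so the method always returns a ranked set of "dimensions". Deciding how many to keep and what to call them is a choice made by the analyst, which is the sense in which the paper describes trait taxonomies as artifacts of method.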
Discussion
By reconstructing the intellectual genealogy of psychometric trait models and their eugenic entanglements, the paper reframes Cambridge Analytica’s methods as a continuation of older projects to measure, sort, and control populations under the guise of scientific objectivity. It answers the research question by showing how statistical techniques (lexical trait extraction, factor analysis, PCA) both emerged from and reinforced hierarchical ideologies that can be repurposed in modern computational politics. The analysis highlights that translating online behavior into fixed personality traits is not neutral: it reflects methodological choices and historical legacies, and those choices gain power when embedded in large-scale data infrastructures. The findings matter for democracy because even contested or overstated psychometric efficacy can legitimize manipulative practices and catalyze class and racial resentments aligned with race-realist politics. Recognizing these genealogies helps interrogate claims of objectivity in data-driven persuasion and motivates scrutiny of how such models are weaponized in electoral contexts.
Conclusion
The paper contributes a conjectural history linking Cambridge Analytica’s psychometric targeting to eugenics-rooted traditions in statistics and personality psychology. It shows that lexical trait lists and factor-analytic taxonomies—far from being neutral scientific tools—were historically co-produced with ideologies of hierarchy, later adapted to computational advertising and political microtargeting. While Cambridge Analytica’s specific impacts remain debated, the broader techniques and infrastructures enabling behavioral profiling and targeted persuasion continue to evolve beyond CA’s demise. The author cautions that such methods, once weaponized, can manipulate identities and choices at scale. Future inquiry should critically examine contemporary data analytics firms, validate or debunk claimed predictive powers, assess ethical and legal frameworks for data consent and targeting, and develop safeguards that address the translation of dubious constructs into powerful instruments of governance and political influence.
Limitations
The analysis is interpretive and genealogical rather than an empirical test of causal effects; it does not quantify Cambridge Analytica’s actual impact on electoral outcomes. The author acknowledges disputes about CA’s effectiveness and notes that not all Big Five researchers share eugenic beliefs. The argument relies on historical linkages and conceptual critique, which may not generalize to all uses of trait models. Some reported performance claims (e.g., prediction from Likes) are cited from prior studies and public reports and may be context-dependent. The conjectural nature of the genealogy limits definitive claims about direct influence pathways.