AI assessment changes human behavior

Psychology

J. Goergen, E. de Bellis, et al.

AI is replacing human decision-makers in assessments. Research conducted by Jonas Goergen, Emanuel de Bellis, and Anne-Kathrin Klesse shows people alter their self-presentation under AI (vs. human) assessment—emphasizing analytical traits and downplaying intuition and emotion. Across twelve studies (N = 13,342), they document the “AI assessment effect” and trace it to the belief that AI prioritizes analytical characteristics, with implications for psychology and organizations using AI assessment.
Introduction

Organizations increasingly rely on AI-based tools to assess people in high-stakes contexts such as hiring and admissions. The research asks whether informing people that an AI (versus a human) will assess them changes their behavior during assessment and, if so, why. The core hypothesis is that disclosure of AI assessment leads people to strategically present themselves as more analytical and less intuitive/emotional because they believe AI prioritizes analytical traits. This question is important due to transparency regulations that make AI use salient, the potential distortion of assessments if people adapt to perceived AI criteria, and the possibility that lay beliefs about AI’s focus may be inaccurate given advances in affective and intuitive capabilities. Prior work has mostly examined organizational outcomes and perceptions of AI assessment; this research instead investigates behavioral changes induced by AI assessment disclosure.

Literature Review

The paper builds on impression management and self-presentation theory, accountability, and faking research, which show that candidates strategically adapt their behavior when they know the evaluative standards. It integrates work on lay beliefs about AI (algorithm aversion, perceived lack of emotion and intuition, emphasis on rationality and data-driven decision-making) and dual-process accounts of cognition (analytical vs. intuitive). Prior literature on AI assessment largely focuses on efficiency gains, discrimination reduction, trust and fairness perceptions, and interface design effects on faking. People commonly construe AI assessment as objective and focused on quantifiable, analytical traits—a construal that can feel dehumanizing and can sideline qualitative dimensions. This background motivates the proposed mechanism: an analytical priority lay belief driving behavioral shifts under AI assessment.

Methodology

Across 12 preregistered studies (N = 13,342) recruited via Prolific, Upwork, and a representative US sample, the authors varied paradigms: a field pilot, controlled experiments, between- and within-subjects designs, incentive-aligned behavioral tasks, ranking-based self-presentation, and real job/application contexts. The key manipulation informed participants that their performance or application would be assessed by an AI/algorithm or by a human; one study added disclosure of a joint AI + human final decision. Dependent variables were self-reported task approach (analytical vs. intuitive; affect vs. cognition scales) and a behavioral self-presentation task in which participants ranked eight attributes (four analytical, four intuitive). Measures included an abbreviated situational thinking style scale (analytical α ≈ 0.87–0.89; intuitive α ≈ 0.93), an analytical priority lay belief scale (α ≈ 0.93), and exploratory constructs (creativity, risk-taking, effort investment, ethical/social considerations). Studies 5a/5b directly manipulated lay beliefs using consider-the-opposite and AI-intuitive prompts that required brief written reflections. Most studies sampled via Prolific; the pilot ran on Upwork; one representative US sample was recruited via Prolific with age, gender, and ethnicity quotas. Studies received IRB approval or exemption, with voluntary adult participation and consent constraints in the exploratory/pilot field settings. Analyses comprised t-tests, ANOVAs, Wilcoxon/Kruskal–Wallis rank tests for the ranking tasks, mediation models (assessor → lay belief → task approach), and moderated regressions with Johnson–Neyman analyses, using bootstrapped CIs (≥1,000 resamples). Attention checks and preregistered exclusions were applied. Data and code are hosted on OSF.
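The mediation pipeline described above (assessor → analytical priority lay belief → task approach, with percentile-bootstrap CIs) can be sketched in Python. This is a minimal illustration on simulated data: the variable names, effect sizes, and noise levels are invented for the sketch and are not the authors' code or estimates.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Simulated data (illustrative): 0 = human assessor, 1 = AI assessor
assessor = rng.integers(0, 2, n).astype(float)
# Mediator: analytical priority lay belief, assumed higher under AI
belief = 4.0 + 0.8 * assessor + rng.normal(0, 1, n)
# Outcome: analytical task approach, assumed driven mainly by the belief
analytical = 3.0 + 0.5 * belief + 0.1 * assessor + rng.normal(0, 1, n)

def ols(predictors, y):
    """OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def indirect_effect(x, m, y):
    a = ols([x], m)[1]      # path a: assessor -> belief
    b = ols([x, m], y)[2]   # path b: belief -> approach, controlling assessor
    return a * b

# Percentile bootstrap (the paper reports CIs with >= 1,000 resamples)
boot = np.array([
    indirect_effect(assessor[idx], belief[idx], analytical[idx])
    for idx in (rng.integers(0, n, n) for _ in range(1000))
])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect ~ {indirect_effect(assessor, belief, analytical):.2f}, "
      f"95% CI [{lo:.2f}, {hi:.2f}]")
```

A CI excluding zero, as in the paper's reported indirect effects, is what supports the lay-belief mechanism.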

Key Findings

Pilot (Upwork event planner, N = 278): Applicants invited to an AI-assessed position described themselves as more analytical than those invited to a human-assessed position (M_AI = 4.41 vs. M_human = 4.02 on a 1–7 analytical–intuitive scale; t(274) = 2.04, P = 0.042, d = 0.25).

Exploratory company survey (N = 1,421): Greater perceived algorithm involvement correlated with stronger self-reported behavioral adjustment (r(1,407) = 0.12, P < 0.001).

Study 1 (college admission, N = 513): AI assessment increased the analytical task approach and decreased the intuitive approach (analytical: M_AI = 6.00 vs. M_human = 5.59; t(511) = 4.95, P < 0.001, d = 0.44; intuitive: M_AI = 4.20 vs. M_human = 4.82; t(511) = −4.93, P < 0.001, d = −0.43). Mediation analysis supported the analytical priority lay belief as the mechanism (indirect b_analytical = 0.30, CI [0.17, 0.44]; b_intuitive = −0.35, CI [−0.52, −0.18]). Four replications (N = 1,424) consistently supported the effect (d = 0.22–0.72).

Study 2 (representative US sample, N = 1,166): The AI assessment effect was robust to covariates (b = −0.48, SE = 0.09, t(1,136) = −5.04, P < 0.001). Moderation: stronger among younger participants (Johnson–Neyman point = 62 years); marginal patterns for fear of negative evaluation and agreeableness; stronger for unemployed than employed participants and weaker for retired participants.

Study 3 (two-stage mixed design; stage 1 N = 1,485; stage 2 N = 1,138): In the stage 1 between-subjects comparison, the analytical approach was higher under AI than under human assessment (M = 5.85 vs. 5.60; t(1,135) = 3.49, P < 0.001, d = 0.26). Within subjects, switching assessors produced larger changes than repeating the same assessor, and AI led to a higher analytical approach than human assessment (M = 6.00 vs. 5.52; t(389) = 8.10, P < 0.001, d = 0.41). Compared with no assessment, AI substantially increased analytical self-presentation. Selection-threshold illustration: with a threshold of 6, 27% of participants would be selected only under AI assessment.

Study 4 (incentive-aligned ranking, N = 1,789): Participants ranked analytical attributes higher under AI assessment (median rank sum: AI = 16 vs. human = 18; Wilcoxon z = −4.46, P < 0.001, r = 0.11).

Study 5a (N = 2,321): AI exceeded human assessment in analytical self-presentation; a consider-the-opposite intervention reduced analytical priority beliefs and attenuated the AI effect (Mdn_AI = 18 vs. Mdn_human = 20; z = −4.23, P < 0.001; AI vs. AI-opposite: z = −5.15, P < 0.001; AI-opposite comparable to human: z = −1.25, P = 0.210).

Study 5b (N = 2,370): A stronger manipulation highlighting AI's emotional/intuitive capacity further reduced the beliefs; the AI-intuitive condition produced less analytical self-presentation than the AI condition and even less than the human condition (AI vs. human: z = −2.61, P = 0.009; AI vs. AI-intuitive: z = −4.45, P < 0.001; AI-intuitive vs. human: z = −2.30, P = 0.021).

Study 6 (joint assessment, N = 2,330): Disclosing a joint AI + human decision yielded analytical self-presentation between the AI-only and human-only conditions (AI vs. human: z = −5.97, P < 0.001; joint vs. human: z = −2.54, P = 0.011; AI vs. joint: z = −3.38, P < 0.001).

Overall, across self-report and behavioral measures, disclosure of AI assessment reliably shifts self-presentation toward analytical traits, driven by the analytical priority lay belief; challenging that belief attenuates or even reverses the effect.
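The selection-threshold point from Study 3—with a fixed cutoff, some applicants pass only because AI assessment shifted their self-presentation upward—can be illustrated with a toy simulation. The score distributions and the size of the shift below are invented for illustration and are not taken from the study's data.

```python
import random

random.seed(1)
n = 10_000
threshold = 6.0  # hypothetical selection cutoff on a 1-7 analytical scale

# Hypothetical applicants: baseline analytical self-presentation under human
# assessment, plus an upward shift under AI assessment (both values invented)
human_scores = [random.gauss(5.5, 1.0) for _ in range(n)]
ai_scores = [s + 0.5 + random.gauss(0, 0.2) for s in human_scores]

# Applicants below the cutoff under human assessment but at/above it under AI
selected_only_under_ai = sum(
    h < threshold <= a for h, a in zip(human_scores, ai_scores)
)
print(f"{selected_only_under_ai / n:.0%} of applicants pass the cutoff "
      "only when assessed by AI")
```

Even a modest shift in self-presentation moves a sizable fraction of applicants across a fixed threshold, which is the mechanism behind the altered selection pools the authors describe.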

Discussion

Findings across diverse paradigms demonstrate that simply disclosing that an AI/algorithm will assess performance leads people to emphasize analytical characteristics and downplay intuition and emotions. Mediation analyses and belief interventions show the effect is driven by a lay belief that AI preferentially rewards analytical traits. This addresses the core hypothesis and highlights the psychological mechanisms underlying responses to AI in assessments. Theoretical contributions include extending the psychology of AI by documenting behavioral adaptation under AI assessment and adding an AI-specific driver to the impression management and faking literatures. Practically, these shifts can alter which candidates are selected (e.g., fixed thresholds yield different selection pools), can threaten validity and reliability by obscuring authentic qualities, and interact with transparency regulations. Remedies include correcting or contextualizing lay beliefs, designing disclosures that mitigate biased self-presentation, and considering hybrid (AI + human) processes to reduce the effect.

Conclusion

The paper establishes the AI assessment effect: disclosure of AI assessment systematically shifts self-presentation toward analytical traits, with convergent evidence across self-report and behavioral measures and causal support for a lay-belief mechanism. Contributions span theory (psychology of AI, impression management under AI) and practice (risks to assessment validity, selection outcomes). Future research should examine other self-presentation dimensions (creativity, ethics, risk-taking, effort), diverse decision contexts (public services, admissions), long-term consequences for career trajectories and well-being, and whether different AI types or interfaces elicit distinct behavioral responses. Policymakers and organizations should consider how transparency and capability disclosures shape candidate behavior and explore interventions to align assessments with authentic performance.

Limitations

The field pilot lacked guaranteed random assignment and was subject to platform noise and extraneous factors. Mediation evidence in Study 4 had low internal validity for the lay-belief measure. Many studies rely on self-reports of task approach, which may be subject to demand effects despite attention checks. Although bootstrapped CIs were used, normality was not formally tested. Contexts centered on hiring and admissions; generalizability to other domains and populations remains to be tested. Attrition in the two-stage designs introduced potential selection biases (writing-task dropouts in Studies 5a/5b). Transparency details beyond assessor identity were not manipulated in most studies; real-world disclosures may vary.
