Revealing Stereotypes: Evidence from Immigrants in Schools

A. Alesina, M. Carlana, et al.

This study tests whether making teachers aware of their own implicit stereotypes changes how they grade immigrant students relative to native students. Alberto Alesina, Michela Carlana, Eliana La Ferrara, and Paolo Pinotti find that revealing teachers' implicit bias scores shortly before grading raises the grades they assign to immigrant students.

Introduction
The paper examines whether making individuals aware of their implicit stereotypes changes their behavior, focusing on teachers' grading of immigrant versus native students in Italian middle schools. In a context of rising immigration and critical academic tracking at the end of middle school, the authors ask: (i) Do teachers display discriminatory grading against immigrant students relative to standardized, blind assessments? (ii) Are such grading differences related to teachers' implicit stereotypes measured by an IAT? (iii) Does revealing teachers' own implicit bias affect their subsequent grading? The study is motivated by concerns that stereotypes can cause biased judgments and self-fulfilling effects on disadvantaged students' effort and aspirations, with long-run consequences for educational and labor market trajectories.
Literature Review
The study contributes to economics and social psychology literatures on discrimination and implicit bias. It builds on work measuring implicit attitudes via IATs and their predictive validity for real-world outcomes (Greenwald et al., 2009; Nosek et al., 2007; Burns et al., 2016), and on recent economic studies linking managers' or teachers' implicit bias to minority outcomes (Glover et al., 2017; Carlana, 2018; Reuben et al., 2014). Prior education research documents teacher biases against minorities and gender and the consequences for expectations and performance (Rosenthal and Jacobson, 1968; Jussim and Harber, 2005; Burgess and Greaves, 2013; Hanna and Linden, 2012; Lavy, 2008; Lavy and Sand, 2018), but typically cannot isolate stereotypes from unobserved student traits. Psychology and medicine research has studied defensive reactions to implicit bias feedback (O'Brien et al., 2010; Howell et al., 2015; Sukhera et al., 2018), yet lacked evidence on behavioral effects of revealing such information. The paper also relates to interventions aimed at reducing prejudice through information or contact (Grigorieff et al., 2017; Hopkins et al., 2018; Burns et al., 2016; Paluck et al., 2018).
Methodology
Setting and data: The study focuses on Italian middle schools (grades 6–8). Administrative data include teacher-assigned grades and standardized, blindly-graded INVALSI test scores (math and reading) for grade 8 cohorts graduating 2011/12–2015/16. The research team surveyed teachers in 102 schools (five Northern Italian cities) from Oct 2016–Mar 2017; 65 schools completed surveys before mid-term grading (January) and form the main sample. About 80% of math and literature teachers completed the survey (N≈1,384). For the experiment, 533 teachers (262 math, 271 literature) and 6,031 grade-8 students (5,141 with math grades, 5,138 with literature grades) in 2016/17 are analyzed.
Implicit Association Test (IAT): Teachers took a seven-block IAT measuring automatic associations between immigrant/native-sounding names and positive/negative school-related adjectives. The scoring followed the improved algorithm (Greenwald et al., 2003), with order randomized. The primary IAT metric is the difference in response times between compatible (e.g., native+good) and incompatible (e.g., immigrant+good) pairings; higher scores indicate stronger negative associations toward immigrants.
Observational analysis (bias in grading): To assess grading bias, the authors compare teacher-assigned end-of-year grades with INVALSI standardized test scores taken at the same time (grade 8). Models include teacher fixed effects (and robustness with class fixed effects), polynomials in INVALSI scores, and student/teacher controls (student gender, immigrant generation, mother's education; teacher gender, birthplace, age, STEM degree; plus interactions with immigrant status). They also include the classroom behavior grade as a robustness check.
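The IAT metric described above can be illustrated with a minimal Python sketch. It is a simplified version of the D-score idea in Greenwald et al. (2003): the mean latency gap between incompatible and compatible pairings, divided by the standard deviation of all trials pooled across both blocks. The function name, the 10,000 ms cutoff argument, and the latencies are illustrative assumptions, not the authors' code or data; the full algorithm also handles error trials and scores practice and test blocks separately, which this sketch omits.

```python
from statistics import mean, stdev


def iat_d_score(compatible_ms, incompatible_ms, max_latency_ms=10_000):
    """Simplified IAT D-score (hypothetical helper, not the study's code).

    Positive values mean slower responses on incompatible pairings,
    i.e. a stronger automatic association in the stereotyped direction.
    """
    # Drop implausibly slow trials (assumed cutoff of 10,000 ms).
    comp = [t for t in compatible_ms if t <= max_latency_ms]
    incomp = [t for t in incompatible_ms if t <= max_latency_ms]
    # Standard deviation over all retained trials from both blocks together.
    pooled_sd = stdev(comp + incomp)
    return (mean(incomp) - mean(comp)) / pooled_sd


# Made-up latencies in milliseconds, for illustration only.
compatible = [620, 700, 655, 710, 690, 640]
incompatible = [800, 760, 845, 790, 815, 770]
d = iat_d_score(compatible, incompatible)
print(f"D-score: {d:.2f}")
```

Dividing by the pooled standard deviation puts teachers with different overall response speeds on a common scale, which is why scores such as the 0.35 threshold reported in the findings can be compared across people.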
Experimental design (bias revelation): The intervention randomly varied the timing of IAT feedback emails at the school level: treated schools received feedback in the last week of January 2017 (just before mid-term grading); control schools received feedback after grading (early February). Over 80% of teachers opted to receive feedback. Emails reported the teacher's IAT score and its qualitative category (slight/moderate/strong). Outcomes are mid-term grades (January 2017); because standardized test scores are unavailable at mid-term, grades cannot be conditioned on blind assessments in the experiment. The primary estimands are intention-to-treat effects (ITT: Early Feedback indicator) and local average treatment effects (LATE: instrumenting actual receipt of the email with random assignment). Standard errors are clustered at the school level.
Heterogeneity: Treatment effects are examined by teachers' explicit attitudes (agreement that immigrants and natives should have equal job access, WVS-style question), students' immigrant generation, and region of origin (Eastern Europe baseline vs. Africa, Latin America, Asia).
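The relationship between the ITT and LATE estimands above can be sketched with a simple Wald estimator, which divides the ITT by the difference in take-up between the assigned groups. This is a minimal illustration on made-up data: the function name and all numbers are assumptions, and the sketch ignores the covariates, fixed effects, and school-level clustering used in the study.

```python
def mean(values):
    return sum(values) / len(values)


def wald_estimates(outcome, assigned, received):
    """Return (ITT, first stage, LATE) for a binary instrument.

    assigned: 1 if the school was randomized to early feedback, else 0.
    received: 1 if the teacher actually got the email before grading, else 0.
    """
    treated = [i for i, z in enumerate(assigned) if z == 1]
    control = [i for i, z in enumerate(assigned) if z == 0]
    # ITT: difference in mean outcomes by random assignment.
    itt = mean([outcome[i] for i in treated]) - mean([outcome[i] for i in control])
    # First stage: difference in take-up by random assignment.
    first_stage = mean([received[i] for i in treated]) - mean([received[i] for i in control])
    # Wald estimator: ITT scaled by the share of compliers.
    return itt, first_stage, itt / first_stage


# Hypothetical data: 10 teachers assigned to early feedback, 8 of whom opt in;
# 10 control teachers, none of whom see feedback before grading.
assigned = [1] * 10 + [0] * 10
received = [1] * 8 + [0] * 2 + [0] * 10
outcome = [6.5] * 8 + [6.0] * 2 + [6.0] * 10  # made-up grades

itt, first_stage, late = wald_estimates(outcome, assigned, received)
print(f"ITT={itt:.2f}, first stage={first_stage:.2f}, LATE={late:.2f}")
```

Because no control teacher sees feedback before grading, the first stage here equals the opt-in rate among treated teachers, and the LATE is the ITT scaled up by that rate.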
Key Findings
- Implicit bias prevalence: Mean IAT score among teachers is 0.47 (vs. 0.41 among Italians taking the online IAT). 67% exhibit moderate to severe implicit bias (score > 0.35) and 91% exhibit some bias against immigrants.
- Correlates of IAT: Female and Northern-born teachers are slightly less biased; IAT correlates with explicit egalitarian views (teachers supporting equal job access have weaker implicit bias). IAT is uncorrelated with prior cohorts' immigrant-native INVALSI gaps or variability, suggesting stereotypes do not reflect statistical discrimination from past objective performance.
- Baseline grading gap: Conditional on quintiles of standardized test scores, immigrants receive lower teacher grades across the distribution: about −0.13 in math and −0.20 in literature (comparable to the advantage of having a mother with a university degree: +0.15 math, +0.21 literature).
- Link between IAT and grading (end-of-year, observational): Math teachers' higher IAT scores predict lower grades for immigrants relative to natives with the same standardized scores. A 1 SD increase in the IAT (≈0.26) is associated with −0.033 grade for immigrants in math, about half the residual immigrant-native grade gap after conditioning on INVALSI. Results are robust to student and teacher controls, interactions, behavior grade, and class fixed effects. For literature teachers, immigrants' grades do not vary with teachers' IAT on average.
- Heterogeneity by immigrant generation (observational): In literature, teachers with stronger stereotypes give relatively higher grades to first-generation immigrants and lower to second-generation (consistent with leniency due to language hurdles), while in math the IAT effect does not differ by generation.
- Experimental effects (mid-term grades, ITT): Early feedback increases immigrants' grades and slightly reduces natives' grades.
  • Math: Early Feedback × Immigrant ≈ +0.392 to +0.439; Early Feedback main effect on natives ≈ −0.153 to −0.176.
  • Literature: Early Feedback × Immigrant ≈ +0.288 to +0.312; Early Feedback main effect on natives ≈ −0.147 to −0.160.
- Experimental effects (LATE using take-up >80%): Email × Immigrant increases immigrant grades by ≈ +0.501 to +0.554 in math and ≈ +0.366 to +0.403 in literature; natives' grades decrease ≈ −0.194 to −0.234.
- Failing probability (grade < 6, ITT/LATE): In math, Early Feedback × Immigrant reduces failure by about 9.4–10.7 percentage points (LATE: −11.9 to −13.3 pp); no significant effects on natives' failure or on literature failures.
- Heterogeneity of experimental effects: Effects are driven by teachers who do not express explicit anti-immigrant views (those endorsing equal job access show larger positive treatment effects). By origin, math teachers increase grades more for immigrants from Eastern Europe (baseline) and Latin America than for those from Africa or Asia.
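As a back-of-the-envelope check on the ITT and LATE coefficients listed above, the ratio ITT/LATE should roughly recover the take-up rate. The sketch below pairs the lower and upper ends of each reported range, which is an assumption about which specifications correspond to each other. Because the reported estimates come from regressions with controls, the ratio is only approximate.

```python
# Early Feedback x Immigrant (ITT) and Email x Immigrant (LATE) ranges,
# copied from the findings above; the endpoint pairing is an assumption.
reported = {
    "math": {"itt": (0.392, 0.439), "late": (0.501, 0.554)},
    "literature": {"itt": (0.288, 0.312), "late": (0.366, 0.403)},
}

# Implied take-up = ITT / LATE for each paired endpoint.
implied_take_up = {
    subject: tuple(itt / late for itt, late in zip(est["itt"], est["late"]))
    for subject, est in reported.items()
}

for subject, (low, high) in implied_take_up.items():
    print(f"{subject}: implied take-up {low:.2f} and {high:.2f}")
```

Under these assumptions the implied ratios all fall just below 0.80, broadly in line with the opt-in rate of roughly 80% described in the methodology.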
Discussion
The findings establish that implicit stereotypes, as measured by an IAT, are linked to discriminatory grading in math: teachers with stronger anti-immigrant implicit associations assign lower grades to immigrant students with the same standardized performance as natives. In literature, a mismatch between standardized tests and teachers' evaluative criteria and/or lower expectations for non-native speakers may mask a direct link between IAT and grading on average, though heterogeneity by immigrant generation suggests language-related adjustments. Revealing teachers' own IAT scores prior to grading causally increases grades assigned to immigrants (and slightly lowers natives'), especially in math and near the pass/fail threshold. The effect is strongest among teachers who did not report explicit anti-immigrant views, indicating that awareness of implicit bias provides new information that changes behavior. The results imply that awareness interventions can reduce discriminatory outcomes; however, they may also induce compensatory behavior among teachers whose stereotypes did not previously translate into biased grading, particularly in subjects where IAT does not predict grading differences.
Conclusion
The paper documents that immigrant students receive lower teacher-assigned grades than natives with the same standardized test performance, and links this gap in math to teachers' implicit stereotypes. A simple, scalable intervention—revealing teachers' own IAT scores just before grading—raises immigrants' grades (and reduces failure in math), particularly among teachers without explicit anti-immigrant attitudes. These results inform debates on implicit bias training and the use of IAT feedback in educational and organizational settings. Future research should explore the persistence of behavior change, potential spillovers to other outcomes (e.g., tracking recommendations, student effort and aspirations), optimal ways to frame feedback to avoid defensive reactions, subject-specific mechanisms (e.g., language proficiency in literature), and generalizability across contexts and minority groups.
Limitations
- Standardized test alignment: INVALSI may imperfectly capture skills valued in literature (e.g., language proficiency, open-ended performance), complicating identification of subjective bias in that subject.
- Mid-term data constraints: The experiment’s outcomes are mid-term grades without concurrent standardized scores; while randomization ensures internal validity, direct standardization of grades to test performance at mid-term is not possible.
- Potential unobservables: Although extensive controls and fixed effects are used, residual unobserved student characteristics (e.g., classroom behavior dynamics, non-cognitive skills) could influence teacher grades; the behavior grade is included cautiously given endogeneity.
- External validity: The sample covers 65 schools in five Northern Italian cities and two subjects; effects may differ in other regions, school systems, grades, or subjects.
- Behavioral reactions: Revealing implicit bias may prompt adjustments even among teachers whose stereotypes did not previously affect grading, raising questions about positive discrimination and optimal policy design.