
The effect of language on performance: do gendered languages fail women in maths?
T. Kricheli-Katz and T. Regev
Tamar Kricheli-Katz and Tali Regev test whether the grammatical form used to address women in a gendered language affects their performance in mathematics. In a randomized experiment with Hebrew speakers, addressing women in the feminine rather than the masculine narrowed the gender gap in maths scores by about one-third, which suggests that language use can shape educational outcomes.
Introduction
Languages differ in whether they grammatically encode gender (e.g., French, Spanish, German, Hebrew). In such gendered languages, masculine forms are often used generically for mixed-gender groups and even to address women. Prior work links gendered languages with greater gender inequality across labor markets, credit, leadership, household division of labor, and education, and experimental studies show that gendered language can shape attitudes and motivations. Yet establishing causal effects of language is challenging because languages are embedded in cultures. Cross-country correlations cannot fully rule out cultural confounds. Focusing on mathematics achievement, the authors note cross-country variation by language type: in PISA 2015 data, countries with gendered languages exhibit a larger average gender gap in mathematics (mean gap 6.15) than countries with gender-neutral or genderless languages (mean gap 1.9; t-test p = 0.039). To move beyond correlational evidence, the study exploits Hebrew’s common practice of using masculine generics. The authors experimentally manipulate whether test-takers are addressed in the feminine or masculine forms and assess effects on maths performance. They hypothesize that masculine address may trigger alienation and stereotype threat for women, reducing effort and performance, while feminine address could mitigate these effects. They also posit that effects should be stronger among participants with higher Hebrew proficiency (e.g., earlier age of acquisition).
Literature Review
Prior studies associate gendered grammatical systems with gender inequality in economic and social outcomes and show that gendered language usage influences cognition and attitudes. For example, answering surveys in gendered languages (French/Spanish) increased reported sexist attitudes versus English; addressing women in masculine forms lowered reported task value and intrinsic goal orientation compared to gender-neutral forms. Experimental work indicates masculine linguistic forms evoke male-biased mental representations, while gender-neutral pronouns can reduce traditional gender role biases and improve attitudes toward women and LGBT individuals. Despite these findings, evidence directly linking gendered language use to performance outcomes (rather than attitudes) has been limited, motivating the present causal experimental approach.
Methodology
Main experiment: A large, random, representative sample of Hebrew-speaking adults in Israel (N = 963; 491 women, 472 men; 18% born outside Israel) was recruited by a survey company (Dialogue). Participants were randomly assigned to one of two conditions in which all instructions and prompts addressed them in either the feminine or the masculine grammatical form (implemented by varying the Hebrew verb form for “answer”). In Hebrew, the masculine is commonly used as a generic; women sometimes encounter both forms in online contexts, whereas men are rarely addressed in the feminine.
Task: An SAT-type quantitative reasoning test of six questions taken from past official Israeli university entrance exams (Israeli National Center for Testing and Evaluation). No time limit was imposed.
Outcomes: The primary outcome was the maths score (number correct; missing answers coded as incorrect in the main analysis). The secondary outcome was time spent on the maths test (minutes), used as a proxy for effort and motivation.
Post-test measures: An Implicit Association Test (science vs arts by gender) and an explicit attitudes questionnaire (e.g., agreement that “science is for men” and “arts and humanities are for women”), followed by demographics.
Language proficiency: Immigrants were retained; age at immigration served as a proxy for Hebrew proficiency, with the hypothesis that the effect of address form diminishes with later language acquisition.
Randomization counts reported: 490 participants addressed in the masculine and 473 in the feminine. Analyses include OLS regressions with controls (e.g., age, higher education, income, immigrant status, political affiliation fixed effects) and interactions with immigration age. Some analyses focus on native speakers (N = 759) and subsets who completed all questions for time analyses.
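The scoring rule described for the main experiment (number correct, with missing answers coded as incorrect) can be made concrete with a small sketch. The function name, the answer key, the sample responses, and the `missing_as_incorrect` switch are hypothetical and introduced only for illustration; they are not taken from the authors' published code. Setting the switch to False shows one possible sensitivity check that scores only the answered items.

```python
from typing import Optional, Sequence


def maths_score(
    answers: Sequence[Optional[str]],
    key: Sequence[str],
    missing_as_incorrect: bool = True,
) -> float:
    """Return the share of correct answers on the test.

    A missing answer is represented by None. With missing_as_incorrect=True
    (the rule described for the main analysis) it counts as wrong; with False
    (a hypothetical sensitivity check) it is dropped from the denominator.
    """
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    correct = sum(
        1 for given, right in zip(answers, key)
        if given is not None and given == right
    )
    if missing_as_incorrect:
        denominator = len(key)
    else:
        denominator = sum(1 for given in answers if given is not None)
    return correct / denominator if denominator else 0.0


# Hypothetical six-item answer key and one participant's responses.
KEY = ["b", "d", "a", "c", "a", "b"]
RESPONSES = ["b", None, "a", "c", None, "d"]

main_score = maths_score(RESPONSES, KEY)
answered_only_score = maths_score(RESPONSES, KEY, missing_as_incorrect=False)
```

Comparing the two scores for the same participant shows how much the treatment of skipped items can move an individual result, which is why nonresponse patterns matter for the main analysis.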
Supplementary experiment 1 (positive stereotype task): Reading comprehension on empathy (a domain stereotypically favorable to women). N = 680 (335 women, 345 men), randomly assigned to masculine (N = 333) or feminine (N = 347) address. Outcome: test score.
Supplementary experiment 2 (gender-neutral task): Timed (60 s) word-generation task producing words starting with consecutive letters; score based on accuracy and total letters. N = 674 (334 women, 340 men), masculine address N = 343, feminine address N = 331.
Ethics: IRB approvals from Interdisciplinary Center Herzliya and Tel Aviv University; written informed consent obtained. Materials (Hebrew and English translations), data, and code are available via Open-ICPSR.
Key Findings
Main experiment outcomes:
- Descriptives: Average maths score 63%. Only 79.85% completed the full post-test questionnaire (IAT and explicit attitudes).
- Gender gap: Among native Hebrew speakers (N = 759), women scored lower than men: women mean 57.8 vs men 68.0 (t-test p < 0.001).
- Effect of address on women’s performance: Women addressed in the feminine scored higher than women addressed in the masculine: 59.5 vs 54.5 (t-test p = 0.059; N = 383). Authors estimate addressing women in the feminine reduces the gender gap by about one-third.
- Regression results (Table 2; OLS predicting maths grades): Female coefficient ≈ −0.150 (SE ≈ 0.034, p < 0.001); Feminine generics ≈ −0.069 to −0.077 (SE ≈ 0.038, p < 0.05); Triple interaction Female × Feminine generics × Immigration age positive and significant across models (e.g., 0.120, SE 0.051, p < 0.05). Controlling for demographics, addressing women in the masculine is associated with women’s maths scores being lower by about 6.14 percentage points relative to addressing them in the feminine (Model 2; F(1,737) = 3.24, p = 0.066; N = 759).
- Language proficiency moderation: Each additional year older at immigration reduces the effect of feminine address on women by about 1.1 percentage points (F(1,917) = 3.80, p = 0.083; N = 926), consistent with greater sensitivity among those more proficient in Hebrew. Exploratory comparisons suggest larger effects for immigrants from gendered-language countries, but small subgroup sizes preclude firm inference.
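To see how a triple interaction such as Female × Feminine generics × Immigration age translates into an effect for a given participant, the sketch below sums every coefficient that multiplies the address indicator. It assumes a fully interacted linear model with 0/1 coding and immigration age set to 0 for native speakers; neither assumption is confirmed by this summary, and the coefficient values are hypothetical placeholders, not estimates from Table 2.

```python
def address_effect_for_women(
    b_address: float,
    b_female_x_address: float,
    b_address_x_age: float,
    b_female_x_address_x_age: float,
    immigration_age: float,
) -> float:
    """Change in a woman's predicted score when address switches from 0 to 1.

    Assumes score = ... + b_address*A + b_female_x_address*F*A
                        + b_address_x_age*A*G + b_female_x_address_x_age*F*A*G,
    with F = 1 for women, A = 1 for feminine address, G = immigration age.
    """
    return (
        b_address
        + b_female_x_address
        + (b_address_x_age + b_female_x_address_x_age) * immigration_age
    )


# Hypothetical coefficients on a 0-1 score scale (placeholders, not Table 2).
COEFS = dict(
    b_address=-0.02,
    b_female_x_address=0.08,
    b_address_x_age=0.0,
    b_female_x_address_x_age=-0.01,
)

effect_native = address_effect_for_women(immigration_age=0, **COEFS)
effect_age_6 = address_effect_for_women(immigration_age=6, **COEFS)
```

With these placeholder values the effect of feminine address on women shrinks as immigration age rises, which is the qualitative pattern the moderation result describes.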
Effort (time) outcomes:
- Feminine address increased women’s time on the test and decreased men’s. When addressed in the masculine, women spent 1.87 minutes less than men (F(1,684) = 6.15, p = 0.013; N = 688). Women addressed in the feminine spent 1.18 minutes more than women addressed in the masculine (the text also gives this increase as 0.35; Model 1, p < 0.001; N = 759). Men addressed in the feminine spent 1.44 minutes less than men addressed in the masculine (F(1,648) = 3.48; N = 688); the text reports p = 0.603 for this contrast, which would indicate nonsignificance. Demographic controls did not materially change these effects.
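The F statistics quoted for the time analyses each have one numerator degree of freedom, so each equals a squared t statistic, and with several hundred denominator degrees of freedom a normal approximation gives a rough two-sided p-value. The sketch below is a quick plausibility check under that approximation, not the authors' procedure; an exact value would use the F distribution. The function name is hypothetical.

```python
import math


def approx_p_from_f(f_stat: float) -> float:
    """Approximate two-sided p-value for an F(1, df2) statistic with large df2.

    With one numerator degree of freedom, F equals a squared t statistic.
    For df2 in the hundreds, t is close to standard normal, so
    p is roughly P(|Z| > sqrt(F)) = erfc(sqrt(F / 2)).
    """
    if f_stat < 0:
        raise ValueError("F statistics are non-negative")
    return math.erfc(math.sqrt(f_stat / 2.0))


# F statistics quoted for the time analyses.
p_women_vs_men = approx_p_from_f(6.15)    # F(1,684) = 6.15
p_men_by_address = approx_p_from_f(3.48)  # F(1,648) = 3.48
```

The first value comes out close to the reported p = 0.013, while the second comes out near 0.06 rather than 0.603, so that figure is worth checking against the original paper.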
Attitudes and implicit bias:
- “Science is for men” agreement was lower when the statement was addressed in the feminine: mean 4.84 vs 5.03 (t-test p = 0.007); no significant gender-by-condition differences. No significant differences for “arts and humanities are for women.” No significant differences in IAT scores between address conditions.
Supplementary experiments:
- Positive stereotype task (empathy reading comprehension): Women outperformed men overall (53.7 vs 50.2; p = 0.058; N = 680). Women performed better when addressed in the masculine vs feminine (56.6 vs 51.0; p = 0.042); men’s differences were nonsignificant.
- Gender-neutral task (word generation): Women produced more letters than men overall (42.8 vs 39.6; p = 0.035). Addressing women in the masculine reduced their performance relative to feminine address (40.8 vs 45.0; p = 0.058). Men’s differences were nonsignificant (38.2 masculine vs 41.2 feminine; p = 0.202).
Discussion
The experiment provides causal evidence that the grammatical gender used to address test-takers influences performance: addressing women in the masculine diminishes their maths performance and effort, and increases endorsement of the notion that science is for men, consistent with alienation and stereotype threat mechanisms. Effects are stronger among those more proficient in Hebrew (earlier age of acquisition), implying that sensitivity to gendered forms grows with language proficiency. Effects on men when addressed in the feminine are weaker and less robust, likely reflecting persistent stereotypes of men’s higher competence in maths. Supplementary experiments support mechanism interpretations: when the task aligns with positive stereotypes about women (empathy-related reading comprehension), masculine address enhances women’s performance (consistent with stereotype activation); for a relatively gender-neutral task (word generation), feminine address benefits women relative to masculine address, suggesting general gender stereotypes can still influence performance even without task-specific stereotypes. Overall, the findings indicate that gendered language choices can shape performance and contribute to sustaining gender inequalities in STEM-related contexts.
Conclusion
This study experimentally demonstrates that gendered language can causally affect performance: addressing women in the feminine (rather than the masculine generic) reduces the gender gap in maths performance, partly by increasing women’s effort and reducing perceptions that science is for men. Effects are stronger among individuals with higher proficiency in the gendered language. Supplementary experiments indicate that task-specific and general gender stereotypes are activated by gendered address, modulating performance in stereotype-consistent and neutral tasks. Contributions include causal evidence linking language use to performance and insights into stereotype threat mechanisms embedded in language. Practical implications suggest revising exam and classroom language to include feminine and gender-neutral forms and potentially allowing women to choose preferred forms of address; however, such changes alone will not eliminate gender gaps, which are multi-determined. Future research should test classroom and high-stakes settings, explore long-term effects, examine interactions with examiner gender and peer composition, and assess impacts across different gendered languages and proficiency levels.
Limitations
- Setting: Tests were taken at home individually rather than in classrooms, limiting external validity to typical exam environments and potential interactions with classroom dynamics, proctor gender, and peer composition.
- Scoring: Missing answers were coded as incorrect in the primary analysis, which could bias results if nonresponse patterns differ by condition or gender.
- Subgroup sizes: Small samples in certain subgroups (e.g., female immigrants from non-gendered-language countries) limit statistical power for subgroup comparisons.
- Generalizability: Findings are from Hebrew, a highly gendered language; effects may vary across other languages and cultural contexts. Effects on men were weaker and less consistent.
- Measurement: Some reported p-values and effect magnitudes in time analyses show inconsistencies in the text; IAT measures may be insensitive to short-term manipulations.