Less than one percent of words would be affected by gender-inclusive language in German press texts

Linguistics and Languages

C. Müller-Spitzer, S. Ochs, et al.

This study by Carolin Müller-Spitzer, Samira Ochs, Alexander Koplenig, Jan Oliver Rüdiger, and Sascha Wolfer finds that making German press texts gender-inclusive would require changing less than 1% of the words. The finding challenges common claims about readability problems associated with gender-inclusive language.

Introduction
Public and academic debates often claim that gender-inclusive German makes texts cumbersome, longer, and harder to read, particularly for L2 learners. Prior psycholinguistic work suggests that gender-inclusive language does not reduce comprehensibility and that readers rapidly habituate to inclusive forms, but resistance remains widespread. This study asks: how much of a German press text would actually need to be changed to make it gender-inclusive? Addressing this provides quantitative evidence against claims about excessive length or complexity. Because identifying whether masculine forms are used generically or specifically cannot be done reliably by surface form or automation alone, the study employs manual annotation to quantify the proportion of text that refers to persons and the subset that would require change under gender-inclusive rewriting.
Literature Review
The paper reviews how German encodes gender in personal nouns across grammatical, lexical, and morphological means: (1) double-gender forms via articles/adjective inflection (singular) with plural neutralization; (2) lexical gender (e.g., Mann/Frau; -mann/-frau compounds); and (3) morphological feminization (e.g., -in). Epicene nouns and collectives can neutralize referential gender. Masculine forms serve both as masculine specifics and as so-called masculine generics; distinguishing these uses requires context. Psycholinguistic research shows that masculine generics often induce a male bias and do not represent all genders equally well. Since 2018, recognition of a third gender option (divers) has heightened interest in inclusive reference, spurring strategies like pair forms and the use of gender symbols (e.g., Lehrer:innen, Lehrer*innen), particularly effective in plurals. While guidelines for inclusive language have spread, critics argue such forms are nonstandard or reduce readability, especially for learners and vulnerable groups. Quantitative corpus evidence on the scale of affected textual material has been lacking, motivating the present study.
Methodology
Corpus and sources: Texts were drawn from the German Reference Corpus (DeReKo), focusing on four sources: DPA (German Press Agency) and three magazines (Brigitte, Zeit Wissen, Psychologie Heute). DPA was the central source due to its reach, impartiality, and pre-2021 practice of not consciously using gender-inclusive forms. Only texts from 2006–2020 were included. A control corpus combined the three magazines to contrast patterns found in DPA.
Sampling: From 2,322,095 available documents, the inner 90% by token length (5th–95th percentile per source) were eligible (e.g., DPA 87–837 tokens). Random samples: 190 DPA and 40 per magazine (total 310). After annotation, 261 texts had dual annotations (184 DPA; 35 Brigitte; 36 Psychologie Heute; 6 Zeit Wissen). Token statistics for sampled and annotated sets are reported to show representativeness.
Annotation focus and protocol: The goal was to mark all tokens that would need to change if the text were rewritten in a gender-inclusive way. Annotations targeted heads of noun phrases referring to persons (personal nouns), pronouns with person reference, and dependent NP elements (articles, attributive adjectives) that might change if the head changed. A bottom-up approach identified heads first, then dependent elements. A decision-tree scheme with 11 categories captured: linguistic class (personal noun, pronoun, dependent element), generic vs specific uses (e.g., masculine generic), epicenes, lexical gender nouns, referent gender identifiable from context, and whether a gender-inclusive form would be necessary. Two trained annotators independently annotated the same texts; the manual was refined through pretests. Inter-annotator agreement across tokens with person reference was 77.89%. Analyses prioritized tokens with matching class annotations.
Computation: Proportions were computed at document level and aggregated as weighted means (weights by document token count). 95% confidence intervals used hypergeometric assumptions due to sampling without replacement. Punctuation was excluded from token totals to avoid underestimating affected shares.
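The aggregation step described under Computation can be illustrated with a minimal sketch. It assumes each document contributes a share (marked tokens divided by all tokens) and that the weight is the document's token count, as the summary states. The function name `weighted_mean_share` and the toy numbers are hypothetical and are not taken from the study; the confidence-interval procedure is not reproduced here because the summary does not specify it in enough detail.

```python
from typing import Sequence


def weighted_mean_share(marked: Sequence[int], tokens: Sequence[int]) -> float:
    """Token-weighted mean of per-document shares (share = marked / tokens)."""
    if len(marked) != len(tokens):
        raise ValueError("marked and tokens must have the same length")
    if not tokens or any(n <= 0 for n in tokens):
        raise ValueError("every document needs a positive token count")
    total = sum(tokens)
    shares = [m / n for m, n in zip(marked, tokens)]
    # Each document's share counts in proportion to its length.
    return sum(share * n for share, n in zip(shares, tokens)) / total


# Hypothetical toy documents, NOT figures from the study.
marked_tokens = [2, 0, 9]          # tokens marked in each document
document_tokens = [100, 300, 600]  # token count of each document (no punctuation)

weighted = weighted_mean_share(marked_tokens, document_tokens)
unweighted = sum(m / n for m, n in zip(marked_tokens, document_tokens)) / len(document_tokens)
pooled = sum(marked_tokens) / sum(document_tokens)

print(f"weighted={weighted:.4f} unweighted={unweighted:.4f} pooled={pooled:.4f}")
```

With token counts as weights, the weighted mean coincides with the pooled share (all marked tokens divided by all tokens), so short documents with unusually high shares do not dominate the estimate the way they would in an unweighted mean.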
Key Findings
- Corpus size and person reference: 261 dual-annotated texts contained 120,626 tokens (93,533 without punctuation). Of these, 11,375 tokens (12.2%) were marked by at least one annotator as person reference candidates; 8,840 tokens had matching class annotations and were used as reliable person references. Weighted means by document show: DPA texts contain on average 7.99% (95% CI: 7.75–8.23) tokens with person reference; the control corpus 11.06% (10.77–11.35); combined 9.45% (9.26–9.64).
- Share of tokens needing gender-inclusive change: DPA average 0.73% (0.66–0.81); control corpus 1.18% (1.09–1.29); combined 0.95% (0.89–1.01). Considering only tokens with person reference, 9.13% (8.25–10.08) in DPA and 10.67% (9.82–11.56) in control would be affected; combined 9.99% (9.37–10.63).
- Affected linguistic classes: Of all affected tokens (N=887), personal nouns account for 90.08% (799 tokens). Average proportion of personal nouns requiring change within documents is 25.00% (23.51–26.54). Pronouns are minimally affected on average (0.12%, 0.02–0.34), and dependent elements 2.62% (2.08–3.24), indicating limited grammatical ripple effects within NPs.
- Distribution of personal noun types (all sources): epicene nouns 27.32% (25.78–28.90), masculine generics 24.97% (23.48–26.51), masculine specifics 22.93% (21.49–24.43), lexical gender nouns 9.95% (8.93–11.04), feminized forms 8.04% (7.12–9.04). No feminine generics were jointly annotated.
- Source differences: In DPA, masculine specifics dominate (36.47%, 34.10–38.89), reflecting reporting on specific (mostly male) individuals; the control corpus shows higher shares of epicenes, masculine generics, lexical gender nouns, and feminized forms.
- Referent gender where identifiable: In DPA, 80.37% (77.46–83.05) of such tokens refer to men and 19.01% (16.37–21.89) to women, evidencing a strong male bias. In the control corpus, women slightly predominate: women 52.87% (48.56–57.14), men 45.29% (41.04–49.59).
- Document-level prevalence: 44.44% of texts contain both masculine generics and specifics; 39.85% contain only one of these; 15.71% contain neither. Gender-inclusive re-editing would affect 62.45% of documents, implying that more than one third would remain unchanged under inclusive rewriting.
Overall, fewer than 1% of all tokens would need changes for gender-inclusive language in the sampled German press texts.
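Several of the reported percentages can be cross-checked from the raw counts given above. The sketch below is a plain arithmetic check, not the authors' analysis code. It assumes that the counts quoted in this summary (93,533 tokens without punctuation, 11,375 candidates, 8,840 matched person references, 887 affected tokens, 799 affected personal nouns) all refer to the same combined corpus; the variable names are illustrative.

```python
# Counts quoted in the summary (combined corpus, punctuation excluded).
tokens_without_punct = 93_533
person_candidates = 11_375       # marked by at least one annotator
matched_person_tokens = 8_840    # matching class annotations
affected_tokens = 887
affected_personal_nouns = 799


def pct(part: float, whole: float) -> float:
    """Return part / whole as a percentage."""
    return 100 * part / whole


candidate_share = pct(person_candidates, tokens_without_punct)
person_share = pct(matched_person_tokens, tokens_without_punct)
affected_share = pct(affected_tokens, tokens_without_punct)
noun_share_of_affected = pct(affected_personal_nouns, affected_tokens)
# Pooled share of person-reference tokens that would change.
affected_among_person = pct(affected_tokens, matched_person_tokens)

# Document-level shares: both / only one / neither masculine form.
document_shares = [44.44, 39.85, 15.71]
unchanged_documents = 100 - 62.45

print(f"candidates: {candidate_share:.1f}% of tokens")
print(f"person references: {person_share:.2f}% of tokens")
print(f"affected: {affected_share:.2f}% of tokens")
print(f"personal nouns among affected: {noun_share_of_affected:.2f}%")
print(f"affected among person references (pooled): {affected_among_person:.2f}%")
print(f"documents left unchanged: {unchanged_documents:.2f}%")
```

Under that assumption, the pooled ratios reproduce the combined figures of 12.2%, 9.45%, 0.95%, and 90.08%. The pooled share of affected person references comes out marginally above the reported 9.99%; the summary does not say how that figure was weighted, so a small gap is not surprising.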
Discussion
The findings directly address claims that gender-inclusive German substantially lengthens or complicates texts. Empirically, only about 0.95% of all tokens would change if non-inclusive texts were rewritten inclusively, and fewer than 11% of person references are implicated. Changes are overwhelmingly confined to personal nouns, with negligible impact on pronouns and limited adjustment of dependent NP elements, suggesting minimal structural disruption. This small footprint challenges the assumption that inclusive language poses a major barrier to readability or language learning. At the same time, the data highlight that masculine generics, not surface-distinct from masculine specifics, are common and can yield referential ambiguity and male-biased interpretation, as shown in psycholinguistic research. Source differences reveal that newswire texts (DPA) skew toward specific male referents, while magazine texts use more epicenes and inclusive strategies, indicating topic and audience effects. The annotated dataset provides a basis to train or validate automated systems for detecting personal nouns and distinguishing generic from specific uses and to support media audits of gender representation beyond named entities.
Conclusion
This study provides the first quantitative estimate of how much German press text would need to change under gender-inclusive rewriting. Three key values emerge: on average, 9.45% of tokens are person references; 0.95% of all tokens would be affected by inclusive re-editing; and 9.99% of person-reference tokens would change. More than one third of documents (37.55%) would remain entirely unchanged. These results undermine claims that gender-inclusive language substantially impairs comprehensibility or increases text length and complexity, especially given available neutralization strategies (e.g., epicenes like Lehrkraft). Future research directions include combining these annotations with automatic personal noun detection to distinguish generics and specifics, lexical-level analyses of which nouns are prone to generic use, expanding to other text types (e.g., civic and corporate websites, parliamentary debates, official addresses), extending the approach cross-linguistically, and leveraging LLM-based methods to compare processing difficulty and comprehensibility between original and inclusively rewritten texts across different inclusive strategies.
Limitations
- Scope and representativeness: The core analysis relies on 261 documents with dual annotations (184 DPA; 35 Brigitte; 36 Psychologie Heute; 6 Zeit Wissen). Zeit Wissen is underrepresented, and some sampled texts were not doubly annotated due to annotator availability and tool errors. Results focus on press texts, primarily 2006–2020, and may not generalize to other genres or time periods.
- Annotation constraints: Inter-annotator agreement was 77.89% for person-reference classes, with greatest uncertainty for dependent NP elements, reflecting challenges in phrase-structure decisions. Tokens with non-matching or single annotations were excluded from main counts to ensure reliability, potentially yielding conservative estimates.
- Inference limits: Distinguishing masculine generics from specifics requires context and cannot be deduced from form alone; the study intentionally avoids external research beyond the text, which may leave some referents’ genders unresolved. Non-binary references were not identified in this corpus.
- Token accounting: Punctuation was excluded to avoid underestimating affected shares; while methodologically justified, this choice affects comparability with studies that include punctuation.
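The exclusion rule under annotation constraints (keep only tokens where both annotators assigned the same class) can be sketched as follows. This is a minimal illustration using simple percentage agreement over tokens marked by at least one annotator; the summary does not state exactly how the 77.89% figure was computed, so that formula is an assumption. The function names, token positions, and class labels are hypothetical.

```python
from typing import Dict, Tuple


def agreement_and_matches(
    annotator_a: Dict[int, str], annotator_b: Dict[int, str]
) -> Tuple[float, Dict[int, str]]:
    """Return (percent agreement, tokens with matching class annotations).

    Keys are token positions, values are class labels. A token counts as a
    match only if BOTH annotators marked it and chose the same class.
    """
    marked_by_any = annotator_a.keys() | annotator_b.keys()
    if not marked_by_any:
        raise ValueError("no annotated tokens")
    matched = {
        pos: annotator_a[pos]
        for pos in annotator_a.keys() & annotator_b.keys()
        if annotator_a[pos] == annotator_b[pos]
    }
    return 100 * len(matched) / len(marked_by_any), matched


# Hypothetical annotations for one short text, NOT data from the study.
a = {3: "personal_noun", 7: "pronoun", 12: "dependent", 20: "personal_noun"}
b = {3: "personal_noun", 7: "personal_noun", 12: "dependent", 25: "pronoun"}

agreement, matched = agreement_and_matches(a, b)
print(f"agreement: {agreement:.1f}%")
print(f"retained tokens: {sorted(matched)}")
```

In this toy case, tokens 7, 20, and 25 are dropped because the annotators disagree or only one of them marked the token. Dropping such tokens is what makes the resulting counts conservative.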