Intergenerational differences in Russian color naming in the globalized era: linguistic analysis
Y. A. Griber, D. Mylonas, et al.
The study investigates intergenerational (apparent-time) differences in Russian color naming. Building on sociolinguistic work showing synchronic heterogeneity and age-related stability of language after childhood, the authors ask how color-term (CT) inventories and naming patterns vary across age cohorts in contemporary Russia. The context includes the Berlin and Kay framework of basic color term (BCT) evolution, known age-related differences in vocabulary and lexical diversity, and post-1991 sociocultural changes in Russia (globalization, market openness). The purpose is to quantify age-related diversity in CT lexicon, the prevalence of BCTs versus non-BCTs, modifications/compounds, object referents, and naming patterns, and to identify potential emerging BCTs in Russian. The study is important for understanding how sociocultural change and generational experience shape lexical systems for color, a domain central to perception and communication.
The paper reviews prior anthropological and linguistic studies of age-related variability in CT inventories. In languages of non-industrialized societies at stages III–V of BCT evolution, younger speakers often display more advanced BCT systems and increased use of loanwords (e.g., Aguaruna, Binumarien, Futunese; Damara; Setswana). In European languages with 11 BCTs (stage VII), intergenerational variation appears in the PURPLE and ORANGE areas and in lexical replacement processes (e.g., brun → marron in French; castaño → marrón in Galician; skär → rosa and violett → lila in Swedish; changes in Polish blue terms). Older speakers tend to use richer modifiers and compounds; younger speakers often use fewer modifiers but introduce “fancy” or loan terms. Aging affects chromatic discrimination, particularly at short wavelengths, though perceptual compensation maintains color appearance. For Russian, the review notes 12 BCTs (including sinij ‘dark blue’ and goluboj ‘light blue’) and a rich non-BCT inventory, with sirenevyj ‘lilac’, salatovyj ‘lettuce-colored’, and birûzovyj ‘turquoise’ previously flagged as potentially emerging basic terms. The authors frame their analysis within these cross-linguistic patterns and Russian-specific sociocultural shifts.
Design: Apparent-time, web-based, unconstrained color-naming study conducted at http://colournaming.com.
Participants: N=1927 native Russian speakers (1307 females), aged 16–98 years (born 1927–2003), residing across the Russian Federation. Recruited via social media (2018–2020) and snowball sampling, and stratified into seven age groups: 16–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70+. Color vision was screened online (Dynamic Color Vision Test); data from participants with color-vision anomalies were excluded. Demographics collected: residency, nationality, native language confirmation, multilingualism, education, gender, color competence.
Stimuli: 606 color samples distributed approximately uniformly in the Munsell Renotation Dataset, plus 8 sRGB cube-corner colors and 9 neutrals, presented as self-luminous patches within the sRGB gamut.
Procedure: Participants named randomly presented colors by typing Russian responses (Cyrillic). Unconstrained responses allowed single words, compounds, and modifiers/qualifiers; responses were recorded verbatim.
Data cleaning: Multi-step process addressing typos, spelling errors, non-Russian items, abbreviations, acronyms, and numerals. Steps: (1) SQL extraction of unique strings (initially 4770 items) and gross filtering; (2) manual correction to canonical forms with a replacement table; (3) automated replacement via VBA in Excel. This was followed by case normalization, merging of inflectional variants (gender, number), and transliteration to Latin script for analysis.
Lexical units: Tokens = total responses; Word types = unique forms (including variant modifier orderings); Lemmas = canonical forms.
Final dataset: 55,515 tokens and 3,128 word types (across the total sample), with per-age-group token/type counts reported (e.g., 16–19: 2,176 tokens, 354 types; 70+: 3,864 tokens, 209 types).
Analyses:
- Diversity: Margalef index D=(S−1)/ln N for each age group and per age-year to assess richness of color lexicon; note sensitivity to sample size.
- Similarity: Agglomerative hierarchical clustering (Ward’s method, ward.D2) on age groups’ color lexicons; visualization via ggplot2 in R.
- Frequency and dominance: Ranked top-30 color names per group; identification of dominant types (covering ≥50% of tokens).
- BCT usage: Proportion of tokens that are unmodified BCTs (12 Russian BCTs), comparisons across age groups, and chi-square tests.
- BCT families: Occurrence of derived, modified (e.g., light/dark/bright/pale), and compounded forms per BCT.
- Descriptor structure: Distribution of single-word BCTs, monolexemic non-BCTs, and multi-component types by age.
- Non-BCT categorization: Classification into logical, evocative, obscure, and conventional categories; shares per age group.
- Object referents: Identification and quantification of object-derived CTs (tokens/types), taxonomy of 30 referent categories grouped into six classes (Flora, Fauna, Inanimate nature, Food & beverages, Man-made objects, Body/bodily products), with per-age-group counts and shares.
- Patterns: Frequencies of “cveta X” (‘color of X’) and noun-clad adjectives (“X”).
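Two of the lexical measures listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline (which the summary describes as SQL, Excel/VBA, and R). The function names `margalef` and `dominant_types` and the toy response list are assumptions made for the example. The block computes the Margalef index from token and type counts and finds the smallest set of top-ranked names that covers at least half of the tokens.

```python
import math
from collections import Counter


def margalef(n_types: int, n_tokens: int) -> float:
    """Margalef richness index D = (S - 1) / ln N."""
    if n_tokens < 2:
        raise ValueError("need at least two tokens, since ln(1) = 0")
    return (n_types - 1) / math.log(n_tokens)


def dominant_types(responses, coverage=0.5):
    """Return the most frequent names that together cover >= `coverage` of tokens."""
    counts = Counter(responses)
    needed = coverage * len(responses)
    chosen, covered = [], 0
    for name, freq in counts.most_common():
        if covered >= needed:
            break
        chosen.append(name)
        covered += freq
    return chosen


# (tokens N, word types S) as reported in the summary for two age groups.
reported = {"16-19": (2176, 354), "70+": (3864, 209)}
indices = {group: margalef(s, n) for group, (n, s) in reported.items()}

# Toy responses (hypothetical, for illustration only).
toy = ["sinij"] * 4 + ["zelënyj"] * 3 + ["sirenevyj"] * 2 + ["bordo"]
top = dominant_types(toy)

for group, d in indices.items():
    print(f"{group}: D = {d:.2f}")
print("dominant types in toy data:", top)
```

With the reported counts this gives D of roughly 45.9 for the 16–19 group and roughly 25.2 for the 70+ group. These values are computed here from the counts in the summary and are not figures quoted from the paper.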
- Diversity and richness: Younger cohorts exhibit greater lexical diversity. Word-type proportions and Margalef indices are highest under 40, lower at 40–59, and decline steeply after 60. Example: 16–19 group had 354 word types (16% of their responses), whereas 70+ had 209 types (5.43%).
- Clustering: Ward’s clustering grouped the Millennial cohorts (20–29, 30–39) together; the 40–49, 50–59, and 60–69 groups formed another cluster, with 70+ adjoining it; the 16–19 group (Gen Z) stood apart, which the authors link to its post-Soviet sociocultural environment.
- BCT usage increases with age: Overall, 48% of tokens were unmodified BCTs, and younger groups used proportionally fewer BCT tokens than older groups. For example, the 20–29 group produced 4,679 BCT tokens (40%) out of 11,581 responses, whereas the 70+ group produced 2,785 (72%) out of 3,864 (χ²=35.04, p<0.001). Older cohorts showed reduced relative use of the BCTs fioletovyj ‘purple’, zelënyj ‘green’, goluboj ‘light blue’, and seryj ‘gray’.
- High-frequency CTs: Across age groups, PURPLE (fioletovyj) and PINK (rozovyj) are frequent, partly reflecting the stimulus set. The achromatic BCTs (belyj ‘white’, čërnyj ‘black’) rank below 12th, likely because of the stimulus distribution.
- BCT families and modifiers: Younger (<30) used modifiers like ‘dark’ (tëmnyj) and ‘pale’ (blednyj) more often; 70+ relied on a narrow set (dark, light, bright, pale), with other qualifiers (e.g., ‘dirty’, ‘saturated’, ‘pastel’) rare or absent. Modified forms such as tëmno-sinij, tëmno-zelënyj, svetlo-zelënyj, tëmno-/svetlo-fioletovyj were among frequent items, indicating fine-grained GREEN and PURPLE distinctions.
- Descriptor structure: With age, share of single-word BCTs rises (to 69–72% for 60–69 and 70+), while monolexemic non-BCTs and multiword compounds decline. Younger cohorts produce more multiword, nuanced, and expressive descriptors.
- Non-BCT categories: Younger/middle-aged (20–60) produced more ‘evocative’ non-BCTs; the oldest and the youngest (16–19, 60+) favored ‘logical’ and ‘conventional’ terms. ‘Obscure’ terms (idiosyncratic, expressive) were more common among Gen Z and Millennials.
- Object-derived CTs: 18,300 tokens (≈33%) were object-derived; these spanned 2,297 word types, constituting ≈73% of all word types. Younger/middle-aged groups contributed higher proportions and greater variety of object-derived terms; 20–29 had the most object referents (≈258), versus 92 (16–19) and 77 (70+). Across ages, Flora and Inanimate nature were most common classes; younger groups shifted towards Man-made objects (artefacts, dyes/pigments) and Food & beverages (sweets/pastries, dairy).
- Patterns ‘cveta X’ and noun-clad adjectives: ‘Color of X’ was relatively high in the two youngest groups, decreased in middle age, and increased again in 60+. Noun-clad adjectives (‘X’) appeared mainly in younger/middle-aged, virtually absent in the oldest cohorts.
- Loanwords and curtailed forms: Surge of transliterated calques (often from English) and curtailed Russian adjectives (e.g., oranž, bordo, bež) across ages, especially among the young.
- Emerging BCT candidates: Three frequent non-BCTs—sirenevyj ‘lilac’, birûzovyj ‘turquoise’, salatovyj ‘lettuce-colored’—compete with BCTs across age groups, with sirenevyj and birûzovyj particularly rising in younger cohorts (e.g., birûzovyj ranking as high as 10th in 16–39 vs 17th in 70+), suggesting ascendance toward basic status.
- Category refinement: Strong refinement in PURPLE (bordovyj, malinovyj, lilovyj, fuksia) and GREEN (salatovyj, bolotnyj, khaki, gorčičnyj; and light/dark variants). Terms like beževyj and persikovyj remain frequent; okhra rises among the youngest.
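The percentages quoted in the findings above can be re-derived from the counts given alongside them. The sketch below is only an arithmetic check on the figures in this summary; the dictionary labels and the helper name `share` are chosen for the example and do not come from the paper.

```python
# (part, whole) count pairs quoted in the findings above.
reported_counts = {
    "word types / tokens, 16-19": (354, 2176),
    "word types / tokens, 70+": (209, 3864),
    "BCT tokens / responses, 20-29": (4679, 11581),
    "BCT tokens / responses, 70+": (2785, 3864),
    "object-derived tokens / all tokens": (18300, 55515),
    "object-derived types / all types": (2297, 3128),
}


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


shares = {label: share(p, w) for label, (p, w) in reported_counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
```

The recomputed shares agree with the rounded figures in the text (16%, 40%, 72%, ≈33%, ≈73%). The type share for the 70+ group comes out near 5.41%, marginally below the quoted 5.43%, which may reflect rounding or a slightly different token base in the source.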
The results address how generational differences shape Russian color lexicon and naming strategies under rapid sociocultural change. Apparent-time patterns show incrementation: younger speakers possess richer, more diverse CT inventories, employ more non-BCTs (including loans), more modifiers/compounds, and more object-derived referents from man-made domains, while older speakers rely more on unmodified BCTs and traditional, transparent non-BCTs. The findings align with cross-linguistic observations of late-emerging color categories (PURPLE, ORANGE) and ongoing lexical replacement/refinement, while highlighting Russian-specific dynamics rooted in post-1991 globalization (expanded product palette, marketing discourse, English influence). The identification of sirenevyj and birûzovyj (and to a degree salatovyj) as ascendant non-BCTs supports the hypothesis of emerging BCTs in Russian, paralleling trends in English and Japanese, and underscores active refinement at category boundaries (GREEN, PURPLE, and the beige/peach/ocher zone). Age-related perceptual and cognitive factors likely contribute to older speakers’ greater reliance on BCTs and fewer modifiers. Overall, the study reveals that communicative needs, consumer culture, and generational identities jointly drive expansion and stylistic diversification of Russian color naming.
This apparent-time study demonstrates strong intergenerational differences in Russian color naming. Younger and middle-aged cohorts (Gen Z, Millennials, Gen X) exhibit greater lexical diversity, heavier use of non-BCTs (including loans), multiword and expressive descriptors, and a shift in object referents toward man-made and marketed artifacts; older cohorts favor unmodified BCTs and transparent, traditional non-BCTs. Evidence indicates emerging basic status for sirenevyj ‘lilac’ and birûzovyj ‘turquoise’ (with salatovyj also prominent) and continued refinement within PURPLE and GREEN categories and the beige/peach/ocher region. These dynamics reflect both general mechanisms of lexical change and Russian-specific post-1991 sociocultural transformations. Future research should: (1) rigorously test basicness criteria for sirenevyj and birûzovyj (frequency, consensus, response times, denotative extent); (2) conduct longitudinal real-time studies to complement apparent-time inferences; (3) expand cross-linguistic comparisons of emerging categories and boundary refinements; (4) examine interactions between perceptual aging, stimulus sampling, and naming patterns; and (5) analyze marketing/media corpora to track loanword diffusion and noun-clad adjective usage.
- Apparent-time design infers change from cross-sectional age differences; longitudinal (real-time) confirmation is not provided.
- Diversity estimates via Margalef index are sensitive to sample size, potentially affecting between-group comparisons (youngest and oldest groups differ in N).
- Stimulus set bias: sRGB-limited, with relative prevalence of PURPLE and PINK samples and fewer achromatics, likely inflating frequencies of fioletovyj/rozovyj and depressing belyj/čërnyj.
- Web-based convenience sampling may not fully represent the Russian-speaking population (regional, socio-economic, internet-use biases).
- Unconstrained naming yields wide idiosyncrasy; data cleaning decisions (normalization, lemmatization) may influence counts.
- Color vision screening was online; residual undetected variability may remain.
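The sample-size sensitivity of the Margalef index noted in the limitations can be illustrated on synthetic data. The sketch below is an assumption-laden illustration, not a procedure described in the summary. It draws tokens from a made-up lexicon with Zipf-like weights, then compares the index on the full sample with the index on a random subsample of half the size. Subsampling to a common token count is one possible way to make groups of different size more comparable; the summary does not say whether the authors did this.

```python
import math
import random


def margalef(n_types: int, n_tokens: int) -> float:
    """Margalef richness index D = (S - 1) / ln N."""
    return (n_types - 1) / math.log(n_tokens)


def subsampled_margalef(tokens, size, rng):
    """Margalef index on a random subsample of `size` tokens (without replacement)."""
    sample = rng.sample(tokens, size)
    return margalef(len(set(sample)), size)


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Synthetic lexicon (hypothetical): 300 terms, the term of rank r has weight 1/r.
vocab = [f"term{r}" for r in range(1, 301)]
weights = [1 / r for r in range(1, 301)]
tokens = rng.choices(vocab, weights=weights, k=4000)

full = margalef(len(set(tokens)), len(tokens))
half = subsampled_margalef(tokens, 2000, rng)

print(f"D on all 4000 tokens:     {full:.2f}")
print(f"D on a 2000-token sample: {half:.2f}")
```

Both values describe the same underlying lexicon, so any gap between them comes from sample size alone. This is why raw indices from age groups with very different token counts should be compared with caution.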