Stereotypes, disproportions, and power asymmetries in the visual portrayal of migrants in ten countries: an interdisciplinary AI-based approach

J. S. Olier and C. Spadavecchia

This interdisciplinary study by Juan Sebastian Olier and Camilla Spadavecchia uses AI-based image analysis to examine how migrants, refugees, and expats are visually portrayed in ten countries. It finds stark disparities between the groups that often align with stereotypes, and it relates the demographics and emotions shown in these images to the colonial undertones of the 'expat vs. migrant' distinction.

Introduction

The paper examines how media images, as political and meaning-making representations, shape perceptions and behaviors toward social groups, with a focus on migrants. Visuals can convey unverbalized meanings, trigger stronger emotional reactions than text, and influence political agendas, particularly on sensational, law-and-order issues. Prior research shows media images reflect societal biases, reinforcing stereotypes that can translate into discriminatory behaviors. In Western media, migrants are frequently framed through dehumanizing visual strategies (massification, infantilization, marginalization) and conflated categories (immigrant/refugee/asylum seeker), with disproportionate visibility of asylum seekers. Migrants are often shown as faceless crowds linked to security threats, while refugees have been reframed from political agents to voiceless victims. Social media can both reproduce mainstream frames and amplify hate discourses. Portrayal varies by media political orientation (humanitarian-victim frames vs threat frames) and is ethnicized (e.g., US media associating migrants with Latin America, low-skilled jobs, or illegality). A hierarchy positions highly skilled, often Western movers as “expats,” contrasted with “migrants” racialized as from poorer, non-white regions. Existing studies largely focus on textual analysis, single outlets, or single countries, and often equate migrants with refugees. This study addresses these gaps by: (1) conducting cross-country analysis in ten countries with diverse political/media systems and migration figures; (2) using machine learning on large-scale, unconstrained web images; (3) applying an intersectional lens (gender, age, facial features, emotions); and (4) comparing portrayals of three groups—migrants, refugees, expats—to reveal stereotypes, disproportions, and power asymmetries. 
The research question centers on how these groups are visually portrayed across contexts and how portrayals differ demographically and emotionally, relative to official statistics.

Literature Review

Related work is synthesized across five themes.

(1) Migrants, refugees, and expats: The IOM defines migrants broadly; refugees have a legal definition (1951 Convention and regional instruments). “Expat” is commonly used for skilled, often Western professionals, reflecting privilege and colonial legacies; categories are unstable, political, and context-dependent, naturalizing power relations and inequalities. Global statistics indicate 281M international migrants (164M workers, 27.1M refugees, 4.6M asylum seekers), but the media's over-focus on refugees misrepresents the diverse migrant population.

(2) Gender: Migrant women are underrepresented and stereotyped (e.g., veiled/oppressed or sexualized), with media defaulting to men as migrants. Women are framed around cultural/community issues or dependency, affecting policy and public attitudes.

(3) Gender-age intersection: Studies set in the 2015 European context show images skewed towards young males; female children may be more visible than male children, and elders are underrepresented. Overall, age is underexamined, especially for expats.

(4) Emotions: Media use emotions to frame stories, with gendered emotional stereotypes (women as happy, men as angry) and negative emotions prevalent in refugee coverage. There is a gap in emotional analysis beyond refugees.

(5) Geographical context: Media portrayals are context-dependent; cross-country comparisons are needed to capture variation in political/media systems, migration figures, and policies affecting coverage and implicit bias formation.
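To make the disproportion in theme (1) concrete, the quoted global figures can be turned into shares of the 281M total. This is only an arithmetic sketch: it assumes the three subgroup counts are directly comparable with the total, as the sentence presents them, and the variable names are illustrative.

```python
# Global figures as quoted in the summary, in millions of people.
TOTAL_MIGRANTS_M = 281.0
subgroups_m = {
    "migrant workers": 164.0,
    "refugees": 27.1,
    "asylum seekers": 4.6,
}

# Share of each subgroup relative to all international migrants.
shares = {name: count / TOTAL_MIGRANTS_M for name, count in subgroups_m.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of international migrants")
```

On these figures, refugees account for roughly a tenth of international migrants, while workers account for more than half.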

Methodology

Design: An automatic visual content analysis approach leveraging Deep Learning quantifies demographic (gender, age, facial features) and emotional information in images associated with “migrants,” “refugees,” and “expats.” Images were retrieved via Google Custom Search API using English terms translated into official languages per country (Google Translate API). Searches were constrained by country/language context. Up to 200 images per group per location were collected; total retrieved images: 17,898 (some corrupted/not downloadable). Only images with detectable faces were used for demographic/emotional estimations; people were also detected when faces were not detectable.

Groups and locations: Nine categories were analyzed (expat, expat man, expat woman; migrants, migrant man, migrant woman; refugees, refugee man, refugee woman) across ten countries: Australia (au), Canada (ca), Colombia (co), Hungary (hu), Italy (it), Spain (sp), Sweden (se), The Netherlands (nl), United Kingdom (uk), United States (us).

Detection and preprocessing: Images resized to max dimension 1024 px. Faces detected via MTCNN (confidence >95%; resolution >40×40 px). Faces <64×64 px were upscaled using face super-resolution (FSRGAN-DB) to 224×224. People detected via DETR. Feature vectors (ViT-S/16 DINO) used for crowd similarity analysis.

Crowd similarity: Approximately 50 “Crowd” images per location (528 total) were downloaded. For each crowd image, distances to group images’ feature vectors were computed (normalized squared Euclidean). For each crowd image, the 50 lowest distances were aggregated into normalized histograms per group and averaged to estimate P_crowd(C|Group=g).

Demographic estimation: FairFace model (Karkkainen & Joo, 2021) estimated gender (Male/Female), facial features (White, Black, Latino-Hispanic, East Asian, South-East Asian, Indian, Middle Eastern), and age ranges (0–2, 3–9, 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70+). Reported accuracies: gender ~94%, facial features ~73%, age ~51.5%. Instead of hard class labels, per-face probability distributions were averaged per image/group/location.

Emotion estimation: EmoNet (Toisoul et al., 2021) estimated per-face discrete emotions (Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger, Contempt) as distributions, and continuous valence/arousal scores.

Validation with statistics: Official immigration statistics were compiled for EU countries (Hungary, Italy, Netherlands, Spain, Sweden), USA (US Census), and Canada (Statistics Canada); also EU first-time asylum applications. For the US, immigrants were split into highly skilled migrants (HSMs: bachelor’s+), assumed to approximate “expats,” vs low-skilled migrants (LSMs). Facial feature categories were mapped to US Census’s race taxonomy as: White (White+Middle Eastern), Black, Asian (East/South-East/Indian), Other (Latino-Hispanic). Comparisons were made between image-based estimates and official stats.

Analysis: Intersectional analyses by group, gender, age, facial features, emotions, and location. Differences between distributions across groups/locations were quantified using Kullback–Leibler divergence (KL). Statistical tests (e.g., Z-test) assessed gender differences in valence.

Data scale summary (Table 1): Total images: 17,898; total faces: 21,076; total persons detected: 59,777 across 15,241 images (avg 3.92 persons/image). Group-level stats:

| Group | Images | Faces | Images w/ face (%) | Faces/image | Persons/image | Crowd probability (%) |
|---|---|---|---|---|---|---|
| expat | 1929 | 1087 | 26 | 2.11 | 3.2 | 5.8 |
| expat man | 1948 | 1920 | 64 | 1.53 | 2.4 | 3.0 |
| expat woman | 1970 | 2137 | 64 | 1.67 | 2.5 | 2.7 |
| migrants | 2019 | 1739 | 37 | 1.68 | 3.1 | 43.2 |
| migrant man | 2009 | 2368 | 51 | 1.92 | 3.4 | 7.0 |
| migrant woman | 2001 | 3072 | 61 | 4.06 | 7.4 | 7.1 |
| refugees | 2042 | 1909 | 63 | 1.54 | 2.4 | 23.8 |
| refugee man | 1957 | 2558 | 71 | 1.77 | 2.9 | 4.6 |
| refugee woman | 2023 | 4286 | 54 | 3.82 | 7.0 | 2.5 |
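The crowd similarity step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the features are random stand-ins for ViT-S/16 DINO vectors, and "normalized squared Euclidean" is assumed here to mean the squared Euclidean distance between L2-normalized vectors.

```python
import numpy as np

def normalized_sq_euclidean(a, b):
    """Pairwise squared Euclidean distances between L2-normalized rows.

    Assumption: the summary does not spell out the normalization, so
    unit-length feature vectors are used here.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    # For unit vectors, ||a - b||^2 = 2 - 2 * (a . b).
    return np.clip(2.0 - 2.0 * (a @ b.T), 0.0, None)

def crowd_probability(crowd_feats, image_feats, image_groups, n_groups, k=50):
    """Estimate how strongly each group is associated with crowd images.

    For every crowd image, take the k group images with the lowest distance,
    build a normalized histogram over their group labels, then average the
    histograms over all crowd images.
    """
    image_groups = np.asarray(image_groups)
    dists = normalized_sq_euclidean(crowd_feats, image_feats)
    k = min(k, dists.shape[1])
    nearest = np.argsort(dists, axis=1)[:, :k]  # indices of the k lowest distances
    hists = np.zeros((dists.shape[0], n_groups))
    for i, idx in enumerate(nearest):
        counts = np.bincount(image_groups[idx], minlength=n_groups)
        hists[i] = counts / counts.sum()
    return hists.mean(axis=0)

# Hypothetical data: 5 crowd images, 300 group images, 3 groups.
rng = np.random.default_rng(0)
crowd_feats = rng.normal(size=(5, 384))
image_feats = rng.normal(size=(300, 384))
image_groups = rng.integers(0, 3, size=300)
print(crowd_probability(crowd_feats, image_feats, image_groups, n_groups=3))
```

The output is a distribution over groups, which is consistent with the crowd probabilities in Table 1 summing to roughly 100% across the nine groups.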

Key Findings

- Gender representation:
  - Across all images, male faces are more prevalent (58.6%).
  - For general groups: expats 60.5% men; migrants 76.5% men; refugees 66.4% men (Table 2). Gender-specific groups behave accordingly, but migrant woman and refugee woman still include sizable male faces (~32% and ~26%, respectively).
  - Image-based gender proportions diverge from official immigrant stats (which are more gender-balanced) but resemble EU asylum seeker demographics (male-heavy). Discrepancies by country: largest in Italy (+11% male) and Hungary (+16%), smallest in Spain (+3%) and Sweden (+6%).
- Facial features (USA comparison, Table 4):
  - Expats are depicted as predominantly White (75.8%) with underrepresented Asian (16.0%) compared to official HSM statistics (Asian 42.0%, White 48.9%).
  - Migrants/refugees show higher Black proportions (migrants 17.7%, refugees 18.5%) than expats (3.9%). Middle Eastern features are higher among refugees (general group ~27%).
  - Overall, expat portrayals over-associate HSMs with White faces and underrepresent Asians; migrants/refugees align more with LSM demographics.
- Age patterns:
  - Estimated mean ages (men): expats 40.9, migrants 40.1, refugees 39.6. (women): expats 34.8, migrants 34.6, refugees 32.8.
  - Women are portrayed younger than men; expat women are the youngest among female groups. Children (<10) are underrepresented overall, appearing more with refugees/migrants than with expats; expat women have the lowest child presence among female groups.
- Emotions:
  - Average valence: migrants/refugees negative; expats neutral to positive. Expat women are the only group with positive mean valence; women display higher valence than men across groups (p < 0.001).
  - Emotion categories: Happy more frequent on female faces (>60%), especially among migrants; Anger more frequent on male faces, highest for expat men. Fear follows as another frequent negative emotion in migrant/refugee groups.
  - Cross-country emotional variation: Australia and Canada show higher valence; Spain and Italy lower. Sweden and Australia show higher Happy; Colombia, Spain, Hungary higher Anger (notably for migrant men), and Spain higher Fear; Sweden higher Contempt.
- Group size and crowds (Table 1):
  - Migrants are most often depicted as crowds: P_crowd migrants 43.2%, refugees 23.8%, vs 2.5–7.1% for other groups.
  - Migrant/refugee women images contain larger groups on average (faces/image: 4.06 and 3.82; persons/image: 7.4 and 7.0) than male counterparts (typ. 1.5–1.9 faces/image; 2.4–3.4 persons/image).
  - Only 26% of expat images contain detectable faces; migrants 37%; refugees 63%.
- Cross-country clustering (KL divergence):
  - Two clusters by location: (UK, Canada, USA, Netherlands, Italy, Australia) vs (Colombia, Spain, Sweden, Hungary). Largest divergence between Sweden and Hungary.
  - Clear separation between expats and migrants/refugees across demographics and emotions; expat women diverge most from migrant/refugee groups.
- Overall narratives:
  - Media portrayals conflate migrants with asylum seekers, associating them with poverty and risks to host societies.
  - The “expat vs migrant” dichotomy reveals power and colonial dynamics, privileging Western whiteness and masking the significant share of Asian HSMs in official data.
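The KL-based comparison behind the clustering result can be illustrated on a small scale. The sketch below is an assumption-laden example rather than the study's pipeline: it uses only the male/female shares quoted from Table 2 (taking the female share as the complement of the male share), whereas the study compared fuller demographic and emotional distributions. The helper name `kl_divergence` and the `eps` smoothing term are illustrative choices.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same categories.

    eps guards against log(0); inputs are renormalized to sum to 1.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# [male, female] shares of faces for the general groups (Table 2 figures above).
gender_share = {
    "expats": [0.605, 0.395],
    "migrants": [0.765, 0.235],
    "refugees": [0.664, 0.336],
}

names = list(gender_share)
# Pairwise divergence matrix; rows are p, columns are q.
divergence = np.array(
    [[kl_divergence(gender_share[a], gender_share[b]) for b in names] for a in names]
)

for name, row in zip(names, divergence):
    print(name, np.round(row, 4))
```

KL divergence is not symmetric, so a clustering step would typically symmetrize the matrix first (for example by averaging it with its transpose); whether the study did so is not stated in this summary.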

Discussion

The study’s cross-country, AI-based visual analysis demonstrates that media images systematically diverge from demographic realities and reproduce entrenched hierarchies among migrant categories. Migrants and refugees are visually framed akin to asylum seekers—male-dominated, negatively valenced, and frequently shown as crowds—supporting dehumanizing threat/humanitarian tropes and potentially shaping public attitudes and policy agendas. Conversely, expats are framed as positively valenced, individualized professionals, disproportionately White, with women portrayed as especially positive and young. These portrayals map onto and reinforce the political construction of migrant sub-categories, reflecting colonial legacies that normalize Western, privileged mobility while racializing “migrants” as non-Western, non-white. The gendered emotional patterns (female-happy, male-anger) align with societal emotional stereotypes, suggesting that visual media may propagate gender norms alongside migration narratives. Cross-country divergences indicate context-sensitive media ecologies: countries with higher gender equality show more balanced depictions; emotional valence differences may relate to national migration contexts, policies, and economic/cultural factors. Altogether, findings address the research question by evidencing systematic, intersectional biases in visual portrayals across groups and geographies, highlighting their relevance for discrimination, agenda-setting, and public understanding of migration.

Conclusion

This work introduces an interdisciplinary, large-scale, AI-based approach to quantify and compare the visual portrayal of migrants, refugees, and expats across ten countries. It contributes by: (1) extending visual framing research with multi-country, intersectional analyses (gender, age, facial features, emotions); (2) documenting consistent mismatches between portrayals and official demographics (e.g., male overrepresentation, youth/child underrepresentation, negative valence for migrants/refugees); (3) evidencing colonial/power asymmetries in the “expat vs migrant” dichotomy, including White overrepresentation and Asian underrepresentation among expats; and (4) demonstrating the utility of Deep Learning for media studies at scale. Future directions include investigating the causes of discrepancies between images and statistics (e.g., editorial choices, platform dynamics), expanding emotional analyses across more contexts and over time, exploring links between portrayed emotions and countries’ attractiveness to HSMs or macro indicators, and extending the methodology to additional countries and media sources. Refinements to mitigate potential model/data biases and to incorporate broader gender identities are also warranted.

Limitations

- Method-related biases: Demographic and emotion estimators are trained on web-sourced datasets that may embed societal and annotation biases. FairFace training/validation sets show age distribution discrepancies by gender (female faces skew younger), potentially reinforcing observed patterns. AffectNet exhibits gender-related valence/emotion frequency differences (female-positive, male-anger), which may bias EmoNet outputs.
- Ambiguity in facial-feature categories: Model outputs are proxies for visual features, not racial classification. Mapping to official “race” categories (e.g., US Census) requires aggregation and may introduce mismatch.
- Data and detection constraints: Only images with sufficiently detectable faces contribute to demographic/emotion estimates; high crowd prevalence reduces face detectability. Super-resolution and detection thresholds may affect inclusion/exclusion of faces.
- Sampling via search engines: Results reflect images surfaced by search algorithms per language/country and may not represent all media outlets equally; they can mirror platform-specific biases and temporal events.
- Age estimation accuracy is limited (~51.5%), and emotion recognition in the wild remains challenging; continuous valence/arousal and categorical emotions may be sensitive to pose, lighting, occlusions, and cultural display rules.
- Comparability with statistics: Official immigration data categories (e.g., HSM proxies, asylum seekers) imperfectly align with media categories (expat/migrant/refugee), limiting direct one-to-one validation.
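The aggregation into US Census categories mentioned above, combined with the averaging of per-face probability distributions described in the methodology, can be sketched as follows. The class order and the mapping follow the text; the function names and the two example faces are hypothetical, and this is not the authors' code.

```python
import numpy as np

# Facial-feature classes in the order listed in the methodology.
FAIRFACE_CLASSES = [
    "White", "Black", "Latino-Hispanic", "East Asian",
    "South-East Asian", "Indian", "Middle Eastern",
]

# Mapping to the US Census taxonomy as described in the text.
CENSUS_MAP = {
    "White": ["White", "Middle Eastern"],
    "Black": ["Black"],
    "Asian": ["East Asian", "South-East Asian", "Indian"],
    "Other": ["Latino-Hispanic"],
}

def average_soft_labels(face_probs):
    """Average per-face probability vectors instead of taking hard labels."""
    return np.asarray(face_probs, dtype=float).mean(axis=0)

def aggregate_to_census(probs):
    """Sum class probabilities into the coarser census categories."""
    index = {name: i for i, name in enumerate(FAIRFACE_CLASSES)}
    return {
        target: float(sum(probs[index[source]] for source in sources))
        for target, sources in CENSUS_MAP.items()
    }

# Two made-up faces, each a probability distribution over the seven classes.
face_probs = [
    [0.60, 0.10, 0.05, 0.05, 0.05, 0.05, 0.10],
    [0.20, 0.30, 0.10, 0.10, 0.10, 0.10, 0.10],
]
census = aggregate_to_census(average_soft_labels(face_probs))
print(census)
```

Because probabilities are summed rather than re-estimated, any uncertainty or bias in the seven-class outputs carries over directly into the four aggregated categories.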
