The representation of migration in parliamentary settings: critical cross-linguistics corpus-assisted discourse analyses

Political Science

M. C. Pérez

This compelling study by María Calzada Pérez explores how migration is articulated in the Spanish Chamber, European Parliament, and British House of Commons during Spain's eighth Legislature. Delving into the nuances of language and socio-political contexts, the research sheds light on the dynamic nature of migration representation across these parliaments.
Introduction

The study investigates how migration is linguistically represented in European parliamentary settings, linking corpus-assisted discourse studies with a socio-political framework to better understand how parliaments construct and reflect ideological debates about migration. It focuses on cross-cultural representation across the Spanish Chamber of Deputies (CD), the European Parliament (EP), and the British House of Commons (HC) during 2004–2008, a key phase in Spain’s discourse formation on migration. The purpose is to derive synchronic and diachronic insights and connect linguistic patterns to socio-political implications using Zapata-Barrero’s reactive/proactive discourse principles. Research questions: (1) What differences and similarities can be determined in the corpus-assisted synchronic images of migration representation at the CD, EP and HC? (2) What differences and similarities can be determined in the corpus-assisted diachronic images of the evolution of migration representation at the CD, EP and HC? (3) What are the comparative and socio-political consequences of these representations?

Literature Review

Prior linguistic research on migration has often examined migrants’ language use or language effects (e.g., Adserà and Pytlíková 2016; Canagarajah 2017; Siegel 2018). Fewer studies focus on the language used to describe migrants, and fewer still on parliamentary discourse (e.g., Martín Rojo and Van Dijk 1997; Van Der Valk 2003); those that do are typically qualitative and manual. Corpus-assisted discourse studies (CADS) offer scalable, replicable methods to handle large datasets and reveal both widespread patterns and rare but telling instances. The ESRC-funded RASIM project (Baker et al. 2008; Gabrielatos and Baker 2006, 2008) demonstrated synergy between critical discourse analysis and corpus methods, identifying semantic associations, categorising representations (e.g., provenance, number, entry, economic problems, residence, return, legality, plight) and tracing diachronic changes. Cross-linguistic CADS, notably Taylor’s work (2014, 2020; Taylor and Marchi 2018), emphasised transparency and replicability, examined mechanisms such as opposition and conflation (e.g., legal/illegal binaries), and invoked higher-level sociological concepts such as moral panic. The present study aligns with cross-linguistic CADS but proposes an alternative abstraction framework to moral panic: Zapata-Barrero’s principles (efficiency of resources; stability and security; cohesion and trust; equality and non-discrimination), which underpin reactive vs. proactive discourses. This framework connects critical discourse notions with socio-political legitimacy principles and is applied to Spanish (CD), European (EP), and British (HC) parliamentary data during 2004–2008, a period of discourse consolidation in Spain.

Methodology

Data source: European Comparable and Parallel Corpus Archive of Parliamentary Speeches (ECPC), accessed via ECPC-Web (a CQPweb-based fourth-generation concordancer).
Sub-corpora (2004–2008):
CD_04-08 (Spanish Chamber): 371 texts, 16,014,420 tokens.
EP-ES_04-08 (European Parliament, Spanish versions; originals and translations): 287 texts, 19,440,223 tokens.
EP-EN_04-08 (European Parliament, English versions; originals and translations): 287 texts, 18,970,278 tokens.
HC_04-08 (UK House of Commons): 721 texts, 50,598,543 tokens.
Translations in the EP corpora were not separated from originals, reflecting how the texts function in practice. The time frame aligns with Spain’s eighth legislature (2004–2008), part of the period in which migration discourse was being formed (per Zapata-Barrero et al. 2008).
Nodes queried: primary families migra*, immigra*/inmigra*, emigra*; secondary families refuge*/refug* and asyl*/asil*.
Frequency analysis: both raw and normalised frequencies, with normalisation per million words (pmw).
Inferential statistics: log-likelihood (LL), with p<0.05 and p<0.01 considered significant; for keyness sensitivity, these thresholds were reinforced with the Bayesian Information Criterion (BIC), following Wilson (2013). Positive BIC values indicate evidence against similarity; negative values indicate similarity (with guide thresholds provided).
Effect size: Hardie’s Log Ratio (LogR), interpreted as the binary log of the ratio of relative frequencies (e.g., LogR=1 means twice as frequent, LogR=2 four times as frequent).
Collocation analysis: symmetrical window of 5L–5R; collocates had to appear at least 5 times in each sub-corpus; collocations were filtered for high significance (LL-based p) and strong association (LogR).
The top 35 collocations by LogR per corpus were thematically tagged inductively (following RASIM, Zapata-Barrero, Taylor): denominations of migration events, participants, institutions, movements, places, causes, effects, proposals/plans of action, quantity/objectification, and other grammatical/cohesive items. Synchronic comparisons were performed across chambers; diachronic plots assessed annual trends (2004–2008). Statistical tools: UCREL LL and effect size calculator; ECPC-Web for concordancing and counts.
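The three statistics named above can be sketched in a few lines of Python. This is a minimal illustration, not the UCREL calculator's code. It assumes the standard two-corpus log-likelihood formula and the approximation BIC ≈ LL − ln(N1 + N2) for one degree of freedom, which appears consistent with the figures reported in this summary. The raw counts 2822 and 2383 are not given in the source: they are back-calculated here from the reported migra* pmw values and corpus sizes, and all function names are hypothetical.

```python
import math

# Corpus sizes (tokens) as reported for the sub-corpora.
EP_EN_TOKENS = 18_970_278
HC_TOKENS = 50_598_543

# ASSUMPTION: raw migra* counts back-calculated from the reported pmw
# values (148.76 and 47.10); the source does not list raw counts.
EP_EN_MIGRA = 2822
HC_MIGRA = 2383


def per_million(count, tokens):
    """Normalised frequency per million words (pmw)."""
    return count / tokens * 1_000_000


def log_likelihood(o1, n1, o2, n2):
    """Two-corpus log-likelihood from observed counts and corpus sizes."""
    total = o1 + o2
    e1 = n1 * total / (n1 + n2)  # expected count in corpus 1
    e2 = n2 * total / (n1 + n2)  # expected count in corpus 2
    ll = 0.0
    for observed, expected in ((o1, e1), (o2, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def log_ratio(o1, n1, o2, n2):
    """Binary log of the ratio of relative frequencies (zero counts not handled)."""
    return math.log2((o1 / n1) / (o2 / n2))


def bic_approx(ll, n1, n2):
    """ASSUMPTION: BIC approximated as LL - ln(N) for 1 degree of freedom."""
    return ll - math.log(n1 + n2)


ll = log_likelihood(EP_EN_MIGRA, EP_EN_TOKENS, HC_MIGRA, HC_TOKENS)
print(round(ll, 2))
print(round(log_ratio(EP_EN_MIGRA, EP_EN_TOKENS, HC_MIGRA, HC_TOKENS), 2))
print(round(bic_approx(ll, EP_EN_TOKENS, HC_TOKENS), 2))
```

With these back-calculated counts, the results should land close to the values reported for EP-EN vs HC on migra* (LL=1672.93; LogR=1.66; BIC=1654.87); small discrepancies would stem from the rounding of the counts.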

Key Findings

Frequencies (2004–2008, relative per million words): the EP corpora displayed the highest overall attention to migration-related nodes; HC the lowest. Frequencies (pmw) by node:
migra*: CD 41.53; EP-ES 91.05; EP-EN 148.76; HC 47.10.
immigra*: CD 332.01; EP-ES 287.65; EP-EN 251.39; HC 112.67.
emigra*: CD 34.28; EP-ES 19.91; EP-EN 9.38; HC 1.26.
refug*: CD 10.80; EP-ES 80.55; EP-EN 79.86; HC 21.62.
asyl*: CD 7.81; EP-ES 71.86; EP-EN 73.48; HC 57.29.
Overall comparisons (all nodes combined): EP-EN had more than twice as many node-related terms as HC (LogR=1.23; LL=3942.16; p<0.0001; BIC=3924.10). EP-ES and EP-EN were virtually the same overall (LogR=0.03; LL=2.43; p>0.05; BIC=-15.04). CD vs EP-ES showed a small but significant difference (LogR=0.37; LL=278.53; p<0.0001; BIC=261.15).
For migra* specifically (Table 1):
CD vs EP-ES: LL=328.89; p<0.0001; LogR=1.13; BIC=311.50.
EP-ES vs EP-EN: LL=269.59; p<0.0001; LogR=0.71; BIC=252.13.
EP-EN vs HC: LL=1672.93; p<0.0001; LogR=1.66; BIC=1654.87.
Diachronic trends (migra*): CD exhibited a spiky, unstable pattern (sensitive to events); HC showed steady growth; EP-ES and EP-EN mirrored each other, with EP-EN at a consistently higher frequency.
Collocations: raw counts were CD 37, EP-ES 57, EP-EN 97, HC 105. Normalised per million tokens of corpus size: EP-EN 5.11; EP-ES 2.93; CD 2.31; HC 2.08. The percentage gaps below are expressed relative to the larger of the two normalised values. CD generated 21.16% fewer collocations than EP-ES (LL=1.29; p>0.05; BIC=-16.09, indicating similarity). HC generated 59.30% fewer than EP-EN (LL=39.24; p<0.00001; BIC=21.18, indicating a strong difference). EP-ES generated 42.66% fewer than EP-EN (LL=11.51; p<0.001; BIC=-5.95).
Common cross-chamber collocates of migra*: illegal/ilegal; flow(s)/flujo(s)/corrientes; control/controlar; manage/gestión/gestionar. These often indicate a reactive framing (security/stability; resource efficiency).
EP-EN specifics: additional reactive metaphors and denominations (admission, irregular, patterns, channels, influx, waves); participants as resources (unskilled); also one proactive plan-of-action collocate (integrating).
EP-ES specifics: references to the Euro-African context and to causas (causes; a proactive acknowledgment), but also economic and quantifying collocates (altamente) linked to resource efficiency.
CD specifics: overlaps with EP on presión, legal, circular, rutas, movimientos, regulación; some neutral general collocates (fenómeno/s).
HC specifics: overlaps with EP on asylum, refugees (neutral); reactive overlaps with EP-EN (patterns, irregular, unskilled, influx). HC-only collocates emphasised reactive framing: unauthorised, indigenous, impact(s), exploitation, net, highly, large-scale, point-based; institutionalised bodies (Migration Advisory Committee; Migration Impact Forum; group on Balanced Migration); movements inward and outward; locus concerns including A8/Eastern Europe and non-EU.
Overall, immigration-related terms were the most frequent in all parliaments; emigration terms were notably more frequent in CD and nearly absent in HC. EP’s Spanish and English versions behaved similarly overall, apart from a significant difference in migra* frequencies (higher in EP-EN). HC consistently showed the lowest attention and, relative to corpus size, the fewest collocational connections, standing out as the outlier.
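The overall effect sizes and the collocation percentages above can be re-derived from the reported figures alone. The sketch below makes two assumptions that the source does not state explicitly: that the "all nodes combined" LogR is the binary log of the ratio of summed pmw values, and that each percentage gap is computed from the two-decimal normalised values, relative to the larger one. Both readings appear to match the reported numbers. All names are illustrative.

```python
import math

# Corpus sizes (tokens) and raw collocate counts as reported.
TOKENS = {"CD": 16_014_420, "EP-ES": 19_440_223,
          "EP-EN": 18_970_278, "HC": 50_598_543}
COLLOCATES = {"CD": 37, "EP-ES": 57, "EP-EN": 97, "HC": 105}

# Reported pmw per node family: migra*, immigra*, emigra*, refug*, asyl*.
PMW = {
    "CD": [41.53, 332.01, 34.28, 10.80, 7.81],
    "EP-ES": [91.05, 287.65, 19.91, 80.55, 71.86],
    "EP-EN": [148.76, 251.39, 9.38, 79.86, 73.48],
    "HC": [47.10, 112.67, 1.26, 21.62, 57.29],
}


def overall_log_ratio(a, b):
    """ASSUMPTION: overall LogR = log2 of the ratio of summed pmw values."""
    return math.log2(sum(PMW[a]) / sum(PMW[b]))


def collocates_per_million(corpus):
    """Collocate count per million tokens, rounded to two decimals."""
    return round(COLLOCATES[corpus] / TOKENS[corpus] * 1_000_000, 2)


def relative_gap(a, b):
    """ASSUMPTION: percentage gap relative to the larger normalised value."""
    x, y = collocates_per_million(a), collocates_per_million(b)
    return abs(x - y) / max(x, y) * 100


for pair in (("EP-EN", "HC"), ("EP-EN", "EP-ES"), ("EP-ES", "CD")):
    print(pair,
          round(overall_log_ratio(*pair), 2),
          round(relative_gap(*pair), 2))
```

Because the gap is taken relative to the larger value, it describes how far the smaller corpus falls short rather than how much the larger one exceeds it; for EP-EN vs HC the excess measured against HC would be considerably larger than 59.30%.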

Discussion

The findings address the research questions by showing that, synchronically, EP (in both Spanish and English) gives the most attention to migration-related discourse and builds denser collocational networks; HC gives the least attention and displays distinctive, strongly reactive framing. CD is closer to EP than to HC in overall focus but shows more variability. Diachronically, CD’s spiky trajectory signals a period of discourse formation, while HC’s steady increase suggests a different national trajectory. Collocational evidence across chambers converges on a reactive representation: migration as illegal flows requiring control and management—mapping onto Zapata-Barrero’s principles of security/stability and resource efficiency. Limited proactive signals are present, mainly via neutral legitimising labels (asylum, refugees) and occasional cohesion/integration-oriented collocates (integrating, integration). The cross-linguistic comparison shows broad similarity between EP-ES and EP-EN with a significant difference for migra* frequencies, potentially reflecting translation dynamics and institutional practices. The observed patterns suggest that at the European level migration is a larger agenda item than at the national level, and that dominant parliamentary discourse during 2004–2008 primarily constructs migration as a challenge to be controlled rather than an opportunity—especially in CD’s early discourse-building phase and in HC’s institutionalised, effects-focused framing.

Conclusion

The study demonstrates that combining corpus-assisted discourse methods with Zapata-Barrero’s socio-political framework yields mutually reinforcing insights: quantitative patterns (frequencies, collocations, significance, effect sizes) enable higher-level abstractions about reactive versus proactive discourses across parliaments and languages. EP shows the highest attention and collocational activity; HC the lowest and most distinctive, largely reactive profile; CD aligns more closely with EP but with unstable diachronic dynamics consistent with a discourse-building phase. Collocations common to all chambers—illegal, flow(s), control, manage/management—underscore a predominantly reactive representation, while proactive cues are rarer and largely tied to neutral legitimising terms (asylum, refugees) and integration-related items. The work contributes comparative, cross-linguistic, and diachronic evidence on migration representation in parliamentary discourse, strengthening generalisability and theoretical robustness. Future research directions include: extending the timeline beyond 2008 to trace shifts (e.g., post-2015 refugee movements, Brexit era); isolating and analysing translation effects within EP more systematically; expanding node sets and discursive frames; incorporating additional parliaments/languages; and triangulating with qualitative analyses to explore how proactive signals evolve in contemporary ideologies.

Limitations

Several constraints temper interpretation: (1) Temporal scope limited to 2004–2008; (2) Some statistical comparisons (e.g., CD vs HC for all nodes) and detailed node-by-node analyses are omitted due to space; (3) EP corpora combine originals and translations without separation, potentially influencing cross-linguistic differences (e.g., migra* frequency gap) that were not fully explored; (4) Collocation analysis focuses on items occurring at least five times and highlights the top 35 by LogR, possibly overlooking lower-frequency but meaningful collocates; (5) Discussion of emigration/refuge/asylum collocations is selective due to space; (6) BIC thresholds applied are stringent, occasionally yielding ambiguous evidence assessments.
