Navigating the financial frontier: a serendipitous journey between corpus linguistics and discourse analysis of economy in parliamentary speeches

S. Altamimi

This research by Sadiq Altamimi examines how the British economy is portrayed linguistically in parliamentary speeches from 1900 to 2020. By combining corpus linguistics and discourse analysis, the study identifies finance and hardship as the major themes and groups their collocates into three semantic categories: alleviation, scale, and source of economic issues.

Introduction
British parliamentary speeches are a fundamental means of communication and leadership in a democratic society, shaping public opinion, influencing policy, and contributing to the nation’s cultural, political, and economic landscape. This study adopts an interdisciplinary linguistic approach integrating quantitative corpus linguistics (CL) and qualitative discourse analysis (DA) to target a large corpus of political speeches and extract a representative subset for analysis. It addresses criticisms directed at applying CL or DA in isolation—such as DA’s perceived fragmentation and lack of systematicity, and CL’s potential decontextualization, bias in seed word selection, and arbitrary cut-off thresholds—by using a triangulatory corpus-assisted discourse studies (CADS) approach. The approach moves back and forth between quantitative and qualitative procedures to enhance objectivity in data selection and analysis. The study constructs a corpus of approximately two million words comprising Conservative and Labour MPs’ speeches from 1900–2020 and seeks to identify terms related to the British economy (BE) for subsequent analysis. It poses two research questions: (1) How can synergy between corpus linguistics and discourse analysis be developed to enhance data selection and representation? (2) How is the British economy discursively represented by British parliamentarians? By situating texts within their socio-political contexts and comparing discourse types, the study aims to reveal how BE is “talked into being” across time, enhancing interpretive validity while maintaining methodological rigour.
Literature Review
The literature review addresses critiques of both CL and DA and motivates their integration. CL is criticized for decontextualization and overreliance on frequency and collocation patterns (e.g., Widdowson), potential bias in seed word selection, and arbitrary cut-off thresholds for keywords or seed words. Responses include manual examination of concordance lines using DA to ensure contextual validity, comprehensive dictionary-based seed word formulation to mitigate selection bias, and iterative decision-making on cut-off points guided by analysis rather than a priori constraints. DA is criticized for data fragmentariness and lack of systematicity, potentially leading to politically motivated or impressionistic interpretations. CADS provides corroboration through quantitative tools (word lists, concordance, collocation) to identify representative items and systematically generate research questions. Prior CADS work shows that triangulation across CL and DA increases reliability and enables richer interpretations while overcoming limitations associated with either method alone. The review concludes that integrating CL and DA through CADS allows broader coverage, reduces researcher bias, and enhances systematic, context-sensitive analysis of discourse.
Methodology
The study employs a five-step CADS procedure combining quantitative CL and qualitative DA, using AntConc for keyword identification and concordance inspection and Sketch Engine for word sketch analyses of collocation and grammatical relations.
Data: A purpose-built corpus (BPSs) of 1,973,521 words comprising Conservative and Labour MPs’ speeches from 1900–2020, partitioned into three socio-historically motivated periods (1900–1949: wars and the Great Depression; 1950–2000: postwar prosperity, inflation and unemployment crises; 2001–2020: economic challenges, campaigns and legislation).
Step 1 (Seed word formulation): Consult three online dictionaries (Oxford Learner’s, Collins, Cambridge) to gather 813 BE-related synonyms (including poverty/poor and related terms), remove 342 duplicates and refine compounds, yielding 435 single-word seed terms.
Step 2 (Corpus examination): Search the corpus for all seed words; 315 occur at least once; apply a standard threshold selecting items occurring ≥10 times (93 items), check party subcorpora, and merge to a 60-item BE seed list (covering both parties).
Step 3 (Keyword analysis): Identify BE-related keywords by comparing BPSs with two reference corpora: the British National Corpus (BNC) and the CORPS corpus of political speeches. From the 60 BE seeds, 37 are keywords vs BNC and 40 vs CORPS; 26 keywords are common across both comparisons. Party-specific comparisons produce 44 total BE keywords (33 Conservative, 40 Labour).
Step 4 (KWIC disambiguation): Manually read extended concordance lines for each keyword (paragraph-level KWIC), using a 5-word span and MI scores to focus on content collocates and exclude false positives (e.g., non-economic “debt”). Classify keyword occurrences by semantic relevance to BE: high (70–100%), medium (40–69%), low (0.1–39%), or none (0%). Exclude 16 keywords with no BE relevance in context. The final BE-informed list comprises 28 keywords in context (KWICs) with 956 total occurrences (317 Conservative; 639 Labour).
Step 5 (Collocational/word sketch analysis): Use Sketch Engine to examine grammatical relations and collocational profiles of the 28 keywords to identify discourse representations. Group keywords into four semantic/discourse categories (finance, workforce, living standards, hardship) based on contextual usage and collocation patterns. Further analyze frequent collocates of BE keywords within the two most frequent discourses (finance and hardship) into three semantic-functional categories: actions to alleviate BE, scale of BE, and source of BE.
Throughout, the study manages precision-recall trade-offs by iteratively shuttling between CL and DA, delaying subjective choices until later stages, and ensuring that selected items are both statistically salient and contextually meaningful.
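The filtering logic of Steps 2 to 4 can be sketched in a few lines of Python. This is an illustration only, not the study's implementation: the study used AntConc, and the summary does not say which keyness statistic or significance cut-off was applied. The sketch assumes log-likelihood keyness with a critical value of 3.84 (p < 0.05, one degree of freedom), treats the relevance bands as contiguous (a share of 69.5% falls in "medium"), and uses made-up word counts and an illustrative reference corpus size. All function names are hypothetical.

```python
import math

MIN_COUNT = 10  # frequency threshold stated in Step 2


def frequency_filter(counts, min_count=MIN_COUNT):
    """Keep seed words that occur at least `min_count` times in the corpus."""
    return {word: n for word, n in counts.items() if n >= min_count}


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness (assumed statistic, not confirmed by the source).

    a: word frequency in the study corpus, c: study corpus size in tokens
    b: word frequency in the reference corpus, d: reference corpus size
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_study)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total


def is_keyword(a, b, c, d, critical=3.84):
    """True if the word is relatively more frequent in the study corpus
    and its keyness exceeds the (assumed) critical value."""
    overused = a / c > b / d
    return overused and log_likelihood(a, b, c, d) > critical


def relevance_band(relevant, total):
    """Map the share of BE-relevant concordance lines to a Step 4 band."""
    if total == 0 or relevant == 0:
        return "none"
    share = 100 * relevant / total
    if share >= 70:
        return "high"
    if share >= 40:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    seed_counts = {"debt": 120, "thrift": 4, "deficit": 95}
    kept = frequency_filter(seed_counts)
    print(sorted(kept))

    study_size = 1_973_521       # corpus size reported in the Data paragraph
    reference_size = 100_000_000  # illustrative reference corpus size
    print(is_keyword(120, 2_000, study_size, reference_size))

    # 18 of 20 hypothetical concordance lines judged economy-related.
    print(relevance_band(18, 20))
```

Keeping the statistical filter and the manual relevance judgement as separate functions mirrors the shuttling between CL and DA that the procedure describes: the first two functions are mechanical, while the inputs to `relevance_band` come from a human reading of concordance lines.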
Key Findings
- The integrated CADS approach reduced bias and improved representativeness by combining dictionary-based seed word formulation, corpus-wide frequency filtering (≥10 occurrences), dual-reference keyword comparisons (BNC and CORPS), and manual KWIC disambiguation.
- From 813 initial synonyms, refinement yielded 435 seeds; 315 occurred in the corpus; 93 met the ≥10 threshold; merging party lists produced 60 BE seeds. Keyword analysis identified 37 (vs BNC) and 40 (vs CORPS) BE keywords, with 26 in common; more than 35 (76.08%) of the BE seed words functioned as keywords across comparisons.
- After KWIC filtering, 28 BE-relevant keywords remained (956 total occurrences): 19 in the Conservative corpus (317 occurrences) and 24 in the Labour corpus (639 occurrences).
- Four BE discourses were identified via collocational analysis: finance, workforce, living standards, and hardship. The finance and hardship discourses dominated: 90.35% of the BE Conservative corpus and 91.39% of the BE Labour corpus were accounted for by these two discourses; workforce and living standards together represented only 9.46% (Conservative) and 8.60% (Labour).
- Within Conservative finance discourse, ‘debt’ (79) and ‘deficit’ (42) constituted 78.57% of finance keyword uses; within Conservative hardship, ‘need’ (67) and ‘poverty’ (43) accounted for 82.76%.
- Within Labour finance, ‘debt’ (57), ‘deficit’ (59) and ‘low’ (50) made up 89.24%; within Labour hardship, ‘poverty’ (184), ‘poor’ (85) and ‘need’ (81) comprised about 87.93%.
- Collocational behavior within finance and hardship clustered into three semantic-functional categories: actions to alleviate BE (216 occurrences: 69 Conservative; 147 Labour), scale of BE (337 occurrences: 108 Conservative; 229 Labour), and source of BE (118 occurrences: 29 Conservative; 89 Labour).
- Overall, parliamentarians concentrate on finance and hardship as the primary discourses when representing the British economy; collocations emphasize actions (e.g., reduce debt, fight poverty), scale (e.g., huge debt, massive need), and sources/loci (e.g., global poverty, war debt, child poverty).
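The counts in the findings above can be cross-checked with simple arithmetic. The Python sketch below sums the per-party collocate counts, computes each party's share of the 956 keyword occurrences, and back-calculates the total that a reported percentage implies. The party shares and the implied total are derived here for illustration and are not figures reported in the source; the function names are hypothetical.

```python
# Collocate counts per category and party, as reported in the findings.
COLLOCATE_COUNTS = {
    "alleviate": {"Conservative": 69, "Labour": 147},
    "scale": {"Conservative": 108, "Labour": 229},
    "source": {"Conservative": 29, "Labour": 89},
}

# Occurrences of the 28 BE keywords per party, as reported in the findings.
KEYWORD_OCCURRENCES = {"Conservative": 317, "Labour": 639}


def category_totals(counts):
    """Sum the party counts within each semantic-functional category."""
    return {name: sum(by_party.values()) for name, by_party in counts.items()}


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


def implied_total(part, reported_percent):
    """Back-calculate the whole that makes `part` equal `reported_percent`.

    The result is a derived estimate, not a figure stated in the source.
    """
    return round(100 * part / reported_percent)


if __name__ == "__main__":
    print(category_totals(COLLOCATE_COUNTS))

    all_occurrences = sum(KEYWORD_OCCURRENCES.values())
    for party, n in KEYWORD_OCCURRENCES.items():
        print(party, percent(n, all_occurrences))

    # Conservative finance: 'debt' (79) + 'deficit' (42) reported as 78.57%.
    print(implied_total(79 + 42, 78.57))
```

A check of this kind only confirms that the reported counts and percentages are mutually consistent; it says nothing about how the underlying occurrences were classified.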
Discussion
The findings address the research questions by demonstrating that a triangulated CADS framework can systematically enhance data selection and representation and reveal how the British economy is discursively constructed in parliamentary speeches. By integrating CL tools (frequency filtering, keyword comparisons with BNC and CORPS, word sketch analyses) and DA (context-sensitive KWIC disambiguation, semantic categorization), the study minimizes decontextualization and researcher bias while maintaining representativeness and interpretive depth. The corpus partitioning by socio-historical periods and the reliance on contextual reading situate linguistic patterns within broader political and economic events, showing that economic discourse is contingent on historical developments and policy agendas. The dominance of finance and hardship discourses suggests that MPs frame BE largely in terms of fiscal metrics (debt, deficit, low) and social deprivation (poverty, poor, need), using collocational patterns that stress actions to alleviate problems, the magnitude of challenges, and their sources (including global and demographic dimensions). This focus aligns with political strategies of problem definition and justification for policy action, highlighting how parliamentary discourse communicates economic threats, responsibilities, and remedies to the public.
Conclusion
The study demonstrates that synergizing corpus linguistics and discourse analysis via a five-step CADS procedure enhances objectivity in data selection, improves representativeness, and supports contextually grounded interpretation. By constructing a large, periodized corpus of UK parliamentary speeches and iteratively narrowing from dictionary-derived seeds to statistically significant, context-validated BE keywords, the approach mitigates common critiques of both CL and DA. The analysis identifies four discourses of the British economy—finance, workforce, living standards, and hardship—with finance and hardship predominating. Word sketch analyses reveal three recurring semantic-functional categories—alleviation actions, scale, and sources—underpinning these discourses. The workflow is transferable to other subjects and corpora. Future research could test the reception and impact of these discursive patterns among the public and further examine how discourse shifts across political contexts and time.
Limitations
- Although the triangulated approach reduces bias, some subjectivity remains in qualitative KWIC disambiguation and semantic categorization.
- Cut-off thresholds (e.g., ≥10 occurrences) and keyword selection criteria, while standard, may exclude less frequent but meaningful terms.
- The corpus, while large and periodized, represents selected parliamentary speeches and may not capture all facets of political-economic discourse.
- CL findings can still risk decontextualization without extensive manual review; DA interpretations may not reflect public reception.
- The study notes that corpus analyses inherently involve limitations, and the public’s reception of identified discourse patterns was not evaluated, suggesting a direction for future work.