Exploring the global geography of cybercrime and its driving forces

S. Chen, M. Hao, et al.

Explore the intricate geography of cybercrime through a novel global analysis. This research, conducted by Shuai Chen, Mengmeng Hao, Fangyu Ding, Dong Jiang, Jiping Dong, Shize Zhang, Qiquan Guo, and Chundong Gao, highlights how socioeconomic development and technological factors intertwine to influence cybercrime patterns around the world.

Introduction

The paper examines cybercrime as a complex social phenomenon shaped by social, economic, political, technological, and cybersecurity contexts. Motivated by the limitations of purely technical countermeasures and the lack of standardized global cybercrime statistics, the study asks how macro-level contextual factors drive the geography of cybercrime. It develops and tests five hypotheses: (H1) social factors are positively associated with cybercrime; (H2) economic factors are positively associated with cybercrime; (H3) political factors (e.g., control of corruption, governance) are negatively associated with cybercrime; (H4) technological factors (e.g., ICT infrastructure, penetration) are positively associated with cybercrime; and (H5) cybersecurity preparedness is negatively associated with cybercrime. The study aims to map subnational patterns of cybercrime using a novel IP-based dataset and to quantify the direct and indirect effects of these contextual drivers globally and across income groups.

Literature Review

Prior research shows macro-level spatial heterogeneity in cybercrime, with certain regions (e.g., Eastern Europe) exhibiting higher activity linked to factors such as technological capacity and corruption. Studies have associated attack origination with higher corruption and bandwidth, and targeting with higher GDP per capita and ICT infrastructure. Others found that better technology and economic capital can increase origination, while stronger cybersecurity readiness can reduce it. Malware reporting has been linked to better technological infrastructure and political freedom, and increases in spam have been tied to unemployment and internet use. However, prior work often used a limited set of predictors, focused on specific aspects of cybercrime, and typically operated at the national scale, potentially masking intra-country disparities. The paper integrates criminological theories (routine activities, rational choice, general strain, general deterrence) to propose a broader framework encompassing five contextual domains and their pathways to cybercrime, addressing gaps in variable breadth and spatial resolution.

Methodology

Data: The study uses the FireHOL IP blocklist (level 1) as a proxy for cybercrime-related activity. This composite list aggregates malicious or illegitimate IPs (e.g., abuse, attacks, botnets, malware, command-and-control, spam), comprising ~2900 subnets and over 600 million unique IPs. Anonymous IPs (open proxies, VPN providers) were excluded. IPs were geolocated with IP2Location Lite to country/region/city and coordinates, with reported accuracies of ~98% at country and ~60% at city level. Analyses focus on subnational (state/region) level to reduce uncertainty, counting unique IPs per area to quantify cybercrime distribution.
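
The counting step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the CIDR blocks are documentation-range examples, `locate` is a hypothetical stand-in for a geolocation lookup such as IP2Location Lite, and assigning a whole subnet to one region is a simplifying assumption.

```python
import ipaddress
from collections import defaultdict

def count_unique_ips(cidrs):
    """Count distinct addresses covered by a list of CIDR blocks."""
    networks = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    # collapse_addresses merges overlapping or adjacent blocks,
    # so an address listed twice is only counted once.
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(networks))

def count_by_region(cidrs, locate):
    """Aggregate unique-IP counts per region.

    locate is a hypothetical callable mapping a network to a region label;
    a real pipeline would query a geolocation database instead.
    """
    per_region = defaultdict(list)
    for c in cidrs:
        net = ipaddress.ip_network(c, strict=False)
        per_region[locate(net)].append(net)
    return {
        region: sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))
        for region, nets in per_region.items()
    }

# Illustrative blocklist entries (documentation ranges, not real FireHOL data).
blocklist = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/30"]

def toy_locate(net):
    # Toy lookup used only for this example.
    return "Region A" if str(net).startswith("192.0.2.") else "Region B"

total = count_unique_ips(blocklist)
by_region = count_by_region(blocklist, toy_locate)
```

Here the /25 block lies inside the /24 block, so it adds nothing to the total once the networks are collapsed.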

Explanatory variables: Five domains were operationalized. Social: population, population aged 15–64, education index, nighttime light index, human development index (HDI). Economic: income index, GDP growth, Gini index, unemployment (% labor force), poverty rate. Political: World Governance Indicators dimensions—control of corruption, government effectiveness, rule of law, political stability and absence of violence/terrorism, voice and accountability. Technological: internet infrastructure (counts of data centres and internet exchange centres), internet users (% population), international bandwidth (per internet user), secure internet servers (per 1 million people), fixed broadband subscriptions (per 100 people). Cybersecurity: ITU Global Cybersecurity Index (GCI) five pillars—legal, technical, organizational, capacity development, cooperation—and overall index. Population, income index, education index, HDI, nighttime light, and infrastructure were compiled at subnational level; other variables at country level. Log10 transforms were applied to skewed variables (population, nighttime light, infrastructure, fixed broadband, secure servers, bandwidth). All variables were normalized.
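
As a rough illustration of the preprocessing above, the following sketch applies a log10 transform to a skewed variable and then rescales it. The summary does not say which normalization was used or how zero values were handled, so the min-max scaling and the `+ 1` offset are both assumptions made for this example.

```python
import math

def log10_transform(values, offset=1.0):
    """Apply log10 to a skewed variable.

    The offset keeps zero counts defined; it is an assumption,
    not something stated in the summary.
    """
    return [math.log10(v + offset) for v in values]

def min_max(values):
    """Rescale values to the [0, 1] range (assumed normalization method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical infrastructure counts for four regions.
infrastructure = [0, 9, 99, 999]
logged = log10_transform(infrastructure)
normalized = min_max(logged)
```

After the transform the four regions are evenly spaced between 0 and 1, whereas scaling the raw counts would have compressed the first three regions near zero.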

Models: Generalized linear models (GLMs) were fitted globally and by World Bank income groups using R (stats::glm) with Gaussian family. Model selection used AIC, R², and predictor significance (p-values). Variance inflation factors (car package) identified multicollinearity; variables with VIF>10 were excluded. Relative contributions and coefficients were visualized (GGally). Structural equation modeling (SEM) tested direct and indirect causal paths from five latent domains (social, economic, political, technological, cybersecurity) to cybercrime, with model fit assessed via RMSEA, SRMR, CFI, and TLI. SEM was conducted in AMOS.
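
The VIF screening step can be sketched in plain Python. The study used R's car package; the code below is an independent illustration of the textbook definition, in which the VIF of predictor j equals 1/(1 − R_j²) and also equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The example data are invented.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Invented predictors: the first pair is uncorrelated, the second nearly collinear.
independent = vif([[1, -1, 1, -1], [1, 1, -1, -1]])
collinear = vif([[1, 2, 3, 4, 5], [2, 4, 6, 8, 11]])
flagged = [v > 10 for v in collinear]  # VIF > 10 is the exclusion rule quoted above
```

With only two predictors both VIFs are equal, so a real screening would drop one variable of the pair and recompute before testing the rest.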

Key Findings
  • Spatial patterns: Subnational mapping shows high concentrations of cybercrime-related IPs in North America, Central and Eastern Europe, East Asia, India, and eastern Australia. Lower counts are seen across much of Africa (except South Africa), parts of South and Central America, the Middle East, Central Asia, and parts of Southeast Asia. Counts increase from Africa to Europe; high-income regions host the majority of IPs and lower-middle-income regions the least.
  • GLM (global): After removing 8 collinear variables (government effectiveness, rule of law, HDI, and 5 GCI pillars) and 7 non-significant variables (GDP growth, unemployment, poverty, political stability, voice and accountability, bandwidth, internet users), the final global GLM included 11 predictors with R²=0.82. Social and technological domains contribute most (53.4% and 30.1% relative contribution, respectively). Infrastructure alone accounts for 18.1% relative contribution; a model with only infrastructure yields R²≈0.504. Adding population raises R² to 0.596 (a relative gain of 18.3%), and further adding education raises it to 0.766 (a further relative gain of 28.5%).
  • GLM (by income group): Contributions vary by income level. Income index contribution decreases from low- to high-income regions. Gini index is more influential in upper-middle and high-income regions. Fixed broadband contributes most in low-income regions and least in high-income regions. Cybersecurity preparedness exerts greater influence in low- and lower-middle-income regions.
  • Coefficient directions: Globally, cybercrime is positively associated with social, economic, and technological factors, and negatively associated with political (e.g., control of corruption) and cybersecurity factors. In low-income countries, cybercrime correlates strongly with higher income index, education, infrastructure, and fixed broadband. In high-income countries, Gini inequality and education are stronger drivers; control of corruption negatively correlates with cybercrime in lower-middle, upper-middle, and high-income regions.
  • SEM: Good fit (CFI=0.917, TLI=0.899, SRMR=0.058). Direct standardized effects on cybercrime: technological +0.61, economic +0.10, social +0.03, political −0.22, cybersecurity −0.07. Cybercrime latent variable R²≈0.80. Social and economic factors also have notable indirect effects mediated by technological and political factors. Overall, technological capacity is a necessary but not sufficient condition; broader contexts significantly shape cybercrime through both direct and mediated pathways.
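
The percentage gains quoted for the global GLM are relative changes in R² between successive models, not percentage-point differences. A short check of that arithmetic, using only the R² values reported above:

```python
def relative_gain(before, after):
    """Relative change from one R² value to the next, in percent."""
    return (after - before) / before * 100

# R² values reported for the nested global models.
steps = [
    ("infrastructure only", 0.504),
    ("+ population", 0.596),
    ("+ education", 0.766),
]

gains = [
    round(relative_gain(prev[1], curr[1]), 1)
    for prev, curr in zip(steps, steps[1:])
]
```

The two entries of `gains` correspond to the 18.3% and 28.5% figures in the text; in absolute terms the same steps add about 0.092 and 0.170 to R².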

Discussion

Findings confirm that regions with more advanced ICT infrastructure and connectivity tend to host more cybercrime-related activity, reflecting lower costs and greater opportunities for offenders. However, beyond technology, socioeconomic development, inequality, governance quality, and cybersecurity capacity significantly shape cybercrime patterns. The heterogeneity by income group suggests different mechanisms: in low-income settings, cybercrime emerges from relatively more developed subregions with better communications and education, while in high-income contexts, inequality and education are stronger drivers. Negative associations with governance (control of corruption) and cybersecurity preparedness imply that strengthening institutions and cyber capacity can mitigate domestic origination of cybercrime. Given cyberspace’s borderless nature, international cooperation in legal, technical, organizational, and capacity domains is essential. Technological development mediates the impacts of socioeconomic conditions on cybercrime, highlighting the digital divide’s role in structuring risk and opportunity landscapes.
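
To make the mediation idea concrete: in path analysis, a standardized indirect effect is the product of the coefficients along the mediated path, and the total effect is the direct effect plus all indirect effects. The sketch below uses the reported direct effects of the social (+0.03) and technological (+0.61) domains on cybercrime. The social-to-technological coefficient is not given in this summary, so the value used here is a purely hypothetical placeholder.

```python
def total_effect(direct, mediated_paths):
    """Return (indirect, total) for a standardized path model.

    mediated_paths is a list of (a, b) pairs, where a is the path from the
    source domain to a mediator and b is the mediator's path to the outcome.
    """
    indirect = sum(a * b for a, b in mediated_paths)
    return indirect, direct + indirect

DIRECT_SOCIAL = 0.03        # reported direct effect: social -> cybercrime
TECH_TO_CYBERCRIME = 0.61   # reported direct effect: technological -> cybercrime
SOCIAL_TO_TECH = 0.50       # HYPOTHETICAL placeholder, not a value from the study

indirect, total = total_effect(
    DIRECT_SOCIAL, [(SOCIAL_TO_TECH, TECH_TO_CYBERCRIME)]
)
```

Under this placeholder the mediated path would dominate the small direct effect, which is the pattern the discussion describes qualitatively; the actual magnitudes depend on the path coefficients estimated in the paper.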

Conclusion

The study contributes a subnational, globally comprehensive mapping of cybercrime-related activity using the FireHOL IP blocklist and proposes a multi-domain framework integrating social, economic, political, technological, and cybersecurity drivers. Empirically, GLMs show that adding a broad suite of socioeconomic variables greatly improves explanatory power beyond technology alone (global R²=0.82), while SEM quantifies both direct and indirect pathways, establishing technology as a key mediator and necessary but not sufficient condition. Policy implications include prioritizing governance reforms, anti-corruption measures, and cybersecurity capacity building alongside managing ICT growth. Future research should integrate multi-source data (surveys, police/court records, and technical attribution), incorporate temporal dynamics, and disaggregate by cybercrime type and finer-grained micro-level variables to refine causal understanding.

Limitations
  • IP-based geolocation challenges: Malicious IPs may reflect compromised hosts or hosting choices rather than offenders’ true locations; attribution is difficult, so associations with socioeconomic factors require caution.
  • Data sources: The FireHOL blocklist lacks temporal attributes, limiting the ability to analyze dynamics (e.g., COVID-19-related changes).
  • Aggregation across cybercrime types: Using the total of all types may mask distinct mechanisms affecting specific categories; segmentation risks data sparsity.
  • Variable granularity: Many explanatory variables are at country level; micro-level individual/behavioral data and finer-grained contextual measures are needed.
  • Potential measurement biases: Excluded anonymous IPs reduce but do not eliminate uncertainty; differing reporting/collection across sources may introduce bias.