Risk assessment and categorization of terrorist attacks based on the Global Terrorism Database from 1970 to 2020


Z. Xu, Y. Lin, et al.

A study by Zonghuang Xu, Yao Lin, Hongyu Cai, Wei Zhang, Jin Shi, and Lingyun Situ uses quantitative methods to assess the risk of terrorist attacks over five decades. Drawing on the Global Terrorism Database, the research ranks the September 11 attacks as the highest-risk events, identifies four major risk hotspots around the globe, and offers recommendations for counter-terrorism strategy.

Introduction
The paper addresses how to quantitatively assess and categorize the risk of terrorist attacks worldwide, moving beyond subjective, unilateral metrics (e.g., casualties or economic loss). Terrorism remains a complex global threat with evolving characteristics such as decentralized actors, diverse means, politicized goals, and regional concentration. Existing quantitative research often suffers from short time spans, limited geographic scopes, few indicators, and reliance on subjective or singular methods. The study aims to construct a comprehensive indicator system from the GTD (1970–2020), develop optimal indicator weighting by combining subjective and objective methods via moment estimation theory, apply multiple evaluation models to identify the riskiest events, and categorize risk levels using diverse clustering methods validated by internal metrics and spatial visualization. The purpose is to provide robust, data-driven insights to support counter-terrorism policy and resource allocation.
Literature Review
The authors note that much terrorism research remains qualitative with limited quantitative analyses. Quantitative studies often focus on short periods or single regions/countries, use few and simple indicators, and adopt subjective methods such as AHP, fuzzy evaluation, and K-means without robust cross-method validation. Prior work emphasizes casualties/economic loss but underrepresents contextual factors (political, economic, social) and methodological pluralism. The paper references GTI reports and empirical studies on patterns, spatial clustering, terrorism dynamics, and classification, highlighting gaps in comprehensive multi-indicator frameworks, objective weighting, multi-model evaluation, and rigorous validation of clustering outcomes. This study addresses these gaps by constructing a broader indicator system from GTD, combining subjective and objective weighting, triangulating multiple evaluation and clustering methods, and validating with clustering indices and spatial-temporal visualization.
Methodology
Datasets: The study uses the GTD (maintained by START), which records 210,454 terrorist incidents worldwide from 1970 to 2020. Other datasets (ITERATE, TWEED, RDWTI, WITS) are noted, but the GTD is primary because of its scope. Acknowledged limitations include media-based selection bias, underreporting, and a lack of broader contextual variables.
Data exploration and cleaning: Multidimensional exploration was conducted across year, weapon type, region/country, attack type, and target/victim type. Data cleaning removed: (i) records with Doubtterr = 1 or -9 (suspected terrorism); (ii) records with “Unknown” values in fields such as specificity, attacktype1, weaptype1, targtype1, property, propextent, city, corp1, gname, claimmode, nperps, nperpcap, ishostkid, INT_LOG; (iii) records with null values in latitude, longitude, nkill, nwound, propextent, corp1, ishostkid, guncertain1. Non-numeric fields were quantified, and variables below 20% completeness (e.g., secondary/tertiary weapon fields) were removed.
Indicator construction: An indicator system with nine primary and 22 secondary indicators was built (examples include: Iyear; extended; crit3; multiple; region; vicinity; specificity; attacktype1; success; suicide; weaptype1; targtype1; guncertain1; individual; nkill; nwound; property; propextent; ishostkid; INT_LOG; INT_IDEO; INT_MISC). Redundant narrative fields and sparsely populated variables were excluded.
Indicator correlation analysis: Independence among the 22 indicators was assessed using Kendall’s tau because the data are non-normal. The very low inter-indicator correlations support indicator quality and independence.
Optimal weight method (moment estimation theory): Subjective and objective weights were combined to derive optimal indicator weights. Subjective methods: AHP (hierarchy construction, judgment matrices, consistency tests) and ORA (ordering relative importance and synthesizing structural weights). Objective methods: Entropy Weight (EW) and CRITIC (contrast intensity and inter-indicator conflict).
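As a rough illustration of one of the objective methods named above, the entropy weight idea can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the function name `entropy_weights`, the toy matrix, and the assumption that all inputs are non-negative and already oriented so that larger means riskier are all hypothetical. CRITIC and the subjective methods are not covered here.

```python
import math

def entropy_weights(rows):
    """Entropy weight method: one weight per column (indicator).

    rows: list of equal-length lists of non-negative numbers,
    one row per event. Assumes larger values mean higher risk.
    """
    m = len(rows)
    if m < 2:
        raise ValueError("need at least two rows to compute entropy")
    n = len(rows[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in rows]
        total = sum(column)
        if total == 0:
            # An all-zero indicator carries no information.
            divergences.append(0.0)
            continue
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # treat 0 * log(0) as 0
                entropy -= p * math.log(p)
        entropy /= math.log(m)  # scale entropy into [0, 1]
        # Clamp tiny negative values caused by floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        return [1.0 / n] * n  # no indicator discriminates: equal weights
    return [d / spread for d in divergences]

# Hypothetical toy data: column 0 is constant, column 1 varies strongly.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 9.0]]
print(entropy_weights(toy))
```

An indicator whose values barely differ across events has entropy close to 1 and therefore receives a weight close to 0, which is why the constant column in the toy data ends up with almost no weight.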
Moment estimation integrates subjective and objective weight samples via an optimization minimizing divergence from both, with importance coefficients α and β computed from expectations of subjective/objective weights. The multi-objective model is converted to single-objective via linear weighting and solved under constraints Σw_j = 1 and 0 ≤ w_j ≤ 1.
Risk assessment models: Four models were applied to compute comprehensive scores per event:
- LWE: weighted sum across indicators.
- FCE: fuzzy membership matrix R with weight vector W to synthesize evaluation outcomes.
- TOPSIS: compute distances to positive/negative ideal solutions, then relative closeness f_i = s^-/(s^+ + s^-).
- PSO-PPE: projection pursuit evaluation optimized via particle swarm to handle nonlinear, non-normal, discrete variables; objective function Q(a) = S(a)·B(a) maximized subject to normalized projection vector constraints.
Risk categorization (clustering): Five clustering approaches representing partitioning, hierarchical, density-based, grid-based, and model-based families were tested: FCM, CURE, DBSCAN, CLIQUE, and GMM. Internal validation metrics used were Silhouette Coefficient (SC), Calinski–Harabasz Index (CHI), and Davies–Bouldin Index (DBI). The best-performing clustering configuration guided selection of the final categorization method and number of clusters.
Spatial visualization: Kernel Density Estimation (KDE) in ArcGIS visualized spatial risk distribution. Parameters used included bandwidth h = 50 km and raster cell size 0.005 degrees.
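The TOPSIS relative-closeness formula described in the methodology above, f_i = s^-/(s^+ + s^-), can be made concrete with a short sketch. The following is illustrative only: the function name `topsis`, the vector normalization, the toy indicator values, and the assumption that every indicator is "larger means riskier" are choices made for this example, not details confirmed by the summary.

```python
import math

def topsis(rows, weights):
    """Relative closeness of each row to the positive ideal solution.

    rows: one list of indicator values per event.
    weights: one weight per indicator (assumed to sum to 1).
    Assumes every indicator is benefit-type: larger = higher risk.
    """
    n = len(weights)
    # Vector-normalize each column, then apply the indicator weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in rows)) or 1.0 for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in rows]
    # Positive and negative ideal solutions: column-wise best and worst.
    best = [max(row[j] for row in weighted) for j in range(n)]
    worst = [min(row[j] for row in weighted) for j in range(n)]
    scores = []
    for row in weighted:
        s_plus = math.dist(row, best)    # distance to positive ideal
        s_minus = math.dist(row, worst)  # distance to negative ideal
        denom = s_plus + s_minus
        scores.append(s_minus / denom if denom else 0.0)
    return scores

# Hypothetical events described by two indicators (e.g., deaths, injuries).
events = [[10.0, 5.0], [1.0, 1.0], [5.0, 3.0]]
print(topsis(events, [0.6, 0.4]))
```

A score near 1 means the event sits close to the positive ideal (the riskiest profile in the sample), and a score near 0 means it sits close to the negative ideal.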
Key Findings
- Indicator independence: Kendall correlation analysis showed very low correlations among the 22 secondary indicators, supporting their independence and quality.
- Optimal weighting: Moment estimation yielded importance coefficients α = 0.4711 (subjective) and β = 0.5289 (objective). Among the optimal primary indicator weights (Table 3), Casualties and Consequences ranked highest (38.12%), followed by Incident Location (11.07%), Perpetrator Information (9.43%), Target/Victim Information (9.31%), GTD ID and Date (8.35%), Attack Information (7.81%), Incident Information (6.57%), Additional Information and Sources (5.70%), and Weapon Information (3.63%).
- Top 10 riskiest attacks (1970–2020): Averaging normalized scores from LWE, FCE, TOPSIS, and PSO-PPE identified the top 10 events, led by the September 11, 2001 attacks (EventIDs 200109110005 and 200109110004). Other entries include: 11/12/2016 Pakistan bombings; coordinated vehicle bombs in Iraq; 4/21/2004 Basra car bombings; 8/17/2015 Bangkok bombings; 10/18/2007 Pakistan attack; 3/11/2008 Islamabad suicide bombs; 1/27/2018 Kabul ambulance bombing; 3/11/2004 Madrid bombings; 11/24/2017 Al-Rawda mosque attack in Egypt. Reported impacts include ~2767 deaths and >16,000 injuries for 9/11, 191 deaths and >1800 injuries for the Madrid bombings, and at least 311 deaths for Al-Rawda.
- Downward counterfactuals: Using Kendall correlation and event metadata, highly correlated precursor or related events were identified for each top event (coefficients often >0.9). For 9/11, notable downward counterfactual (DC) events include the 1998 Nairobi US Embassy bombing (coefficient 0.9038), the 1993 WTC bombing (0.8734), and the 2001 LTTE airport attack (0.8404), illustrating thematic and tactical continuities.
- Clustering and risk levels: Internal metrics indicated the best performance for FCM with k=4 (CURE performed best at k=2; CLIQUE and GMM at k=3; DBSCAN was optimal at ε≈0.45). The final categorization used FCM (k=4). Event counts by level: Level-I: 412; Level-II: 931; Level-III: 2224; Level-IV: 6599. Higher-risk levels concentrate in Middle East & North Africa and South Asia, with a notable presence in Central America & Caribbean and East Asia/North America for certain levels.
- Spatial patterns: KDE revealed four primary "turbulent cores" of terrorist attack risk: Central Asia; Middle East & North Africa; South Asia; Central America & Caribbean. These patterns align with socio-political instability, cross-border militant activity, and economic and governance challenges.
- Descriptive patterns: Over 1970–2020, attacks increased notably after the early 2000s; explosives and firearms dominate weapon types; bombing/explosion and armed assault are the most common attack modes; private citizens/property, police, and military are the primary targets; the countries with the highest counts include Iraq, Afghanistan, Pakistan, and India.
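The ranking step reported in the findings above, averaging normalized scores from the four models, can be illustrated with a small sketch. The summary does not say which normalization the authors used, so min-max scaling is an assumption here, and the function names, event labels, and score values are hypothetical toy data.

```python
def min_max(values):
    """Scale a list of scores into [0, 1]; constant lists map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_events(event_ids, model_scores, top_n=10):
    """Average min-max-normalized scores across models and rank events.

    model_scores: dict mapping a model name to a list of raw scores,
    aligned with event_ids. Returns (event_id, mean_score) pairs,
    highest first.
    """
    normalized = [min_max(scores) for scores in model_scores.values()]
    combined = [sum(column) / len(normalized) for column in zip(*normalized)]
    ranked = sorted(zip(event_ids, combined), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]

# Hypothetical raw scores on different scales for three events.
toy_scores = {
    "LWE": [0.9, 0.2, 0.5],
    "FCE": [80.0, 10.0, 60.0],
    "TOPSIS": [0.7, 0.1, 0.8],
    "PSO-PPE": [3.0, 1.0, 2.0],
}
for event_id, score in rank_events(["A", "B", "C"], toy_scores):
    print(event_id, round(score, 3))
```

Normalizing before averaging keeps a model with a wide raw score range (FCE in the toy data) from dominating the combined ranking.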
Discussion
The multi-model assessment framework provides a more balanced risk evaluation than unilateral measures, capturing casualties, event dynamics, location, perpetrator and target information. The concordant rankings across LWE, FCE, TOPSIS, and PSO-PPE bolster confidence in the identified top-risk events, led by 9/11. The downward counterfactual analysis contextualizes extreme events within broader patterns of tactics and targets, offering actionable early-warning insights. Clustering validated by SC, CHI, and DBI supports a four-level risk categorization via FCM, enabling rapid triage of event risk profiles. Spatial KDE illuminates persistent regional cores of risk across five decades, informing strategic allocation of counter-terrorism resources and tailored interventions. The findings directly address the research goal by quantifying risk across multiple dimensions, validating categorization, and identifying geographic hot spots, thereby enhancing situational awareness and guiding preventive, protective, and response measures.
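To give a feel for the internal validation mentioned above, the following sketch computes the Silhouette Coefficient (SC) for a given labeling, using plain Euclidean distance. It is a hedged illustration only: the function name `silhouette_score` and the toy points are hypothetical, and the study's FCM clustering and its CHI and DBI computations are not reproduced.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        widest = max(a, b)
        if widest > 0:
            total += (b - a) / widest
    return total / len(points)

# Hypothetical 2-D points forming two well-separated groups.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(silhouette_score(pts, [0, 0, 1, 1]))  # sensible labeling
print(silhouette_score(pts, [0, 1, 0, 1]))  # labeling that mixes the groups
```

Values close to 1 indicate compact, well-separated clusters, so comparing SC across candidate values of k is one way a configuration such as k=4 could be preferred over alternatives.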
Conclusion
The study constructs a scientific indicator system from GTD (1970–2020) and integrates subjective and objective weighting via moment estimation theory to evaluate terrorist attack risk using LWE, FCE, TOPSIS, and PSO-PPE, and categorizes risk using a suite of clustering methods. Key conclusions: (i) Terrorist attacks have increased over five decades, with concentrations in Middle East & North Africa and South Asia; explosives/firearms and bombing/explosion tactics predominate; private citizens/property are frequently targeted. (ii) The integrated evaluation identifies the top 10 riskiest events (headed by 9/11) and their downward counterfactuals, overcoming limitations of previous unilateral, subjective assessments. (iii) Clustering (FCM, k=4) and KDE reveal four global "turbulent cores" of risk: Central Asia, Middle East & North Africa, South Asia, and Central America & Caribbean. Policy recommendations tailored to the four cores include: strengthening border security and intelligence sharing; investing in economic development; supporting political transitions and conflict resolution; enhancing regional cooperation; addressing governance gaps and promoting interfaith dialogue; dismantling transnational criminal networks and tackling socio-economic drivers. Future research should integrate GTD with contextual datasets (e.g., economic, social, political, ACLED, GDELT) and explore root causes and sociopolitical dynamics to further refine risk assessment and policy design.
Limitations
The study relies on GTD, which is subject to selection and reporting biases due to dependence on media and open sources, with variability across time and regions. GTD focuses on incident-level attributes and lacks broader contextual variables (social, political, economic, ideological) that shape terrorism risk and impacts. Using a single data source constrains analysis. Future work should integrate multiple structured and unstructured sources (e.g., ACLED, GDELT, economic indicators, surveys, social media) for cross-validation and to capture environmental context and downstream impacts, while deepening analysis of root causes and motivations.