An AI approach for managing financial systemic risk via bank bailouts by taxpayers

D. Petrone, N. Rodosthenous, et al.

This research by Daniele Petrone, Neofytos Rodosthenous, and Vito Latora combines a dynamic financial network framework with artificial intelligence to find bank bailout strategies that minimize taxpayer losses. It identifies the balance that optimal intervention strategies must strike during financial crises.
Introduction

The paper addresses whether and when government bank bailouts are value for money from the taxpayers' perspective. Motivated by crises such as 2008–2009 and COVID‑19, the authors argue that systemic risk must be assessed through the interbank network of exposures rather than at the level of individual institutions. Governments and macroprudential authorities need quantitative tools to evaluate interventions that trade off stabilizing the financial system against the potential cost to taxpayers. The study frames bailout decisions as dynamic controls over a financial network in distress, aiming to minimize expected taxpayer losses during a crisis episode. It asks under what conditions (e.g., loss severity upon default, network distress, exposure strength, time horizon) interventions become optimal, and how network position and prior interventions affect optimal policies.

Literature Review

The authors survey strands of literature on financial networks and systemic risk, including contagion via interbank exposures and fire-sale externalities, and network resilience analytics. They reference extensive work on bailouts’ effects on bank performance, underwriting, market discipline, sovereign risk, and the interaction between risk-taking and regulation. Prior intervention models often optimize social costs, system wealth, or bailout expenditures, and typically assume pre-specified failing banks. The paper distinguishes itself by focusing on minimizing taxpayer losses irrespective of system wealth, allowing preventive interventions before defaults materialize, and embedding governmental controls directly into a dynamic network model. It builds on PD-based network contagion models and joint default modeling while introducing control via capital injections and an AI-based solution for the induced MDP.

Methodology
  • Dynamic financial network model: Banks are nodes i with total assets W_i(t), equity E_i(t), and per-step default probabilities PD_i(t). Directed edge weights w_ij capture the exposure of i to j (credit exposures and potentially fire-sale spillovers). When other nodes default at time t, node i incurs a total impact I_i(t)=∑_{j≠i} w_ij δ_j(t), where δ_j(t)=1 if node j defaults at t and 0 otherwise; the impact reduces both W_i and E_i by I_i(t). Government injections ΔJ_i(t) increase W_i and E_i by the same amount. Liabilities B_i=W_i−E_i remain due unless i defaults.
  • PD mapping: The effect of losses and injections on PDs is modeled via the Merton structural model mapping (W_i,E_i, μ, σ) to a one-step PD, with a PD floor PDM_floor to avoid unrealistically low PDs. If impacts exhaust equity (W_i<B_i), PD is set to 1 at the next step. Joint defaults within a step are modeled using a Gaussian latent variable model with correlation matrix Σ, determining probabilities for simultaneous defaults across subsets of nodes.
  • Default simulation: At each time step t, draw latent Gaussian vector X and set δ_i(t)=1 if X_i<Φ^{-1}(PD_i(t)). Losses propagate via impacts to other banks for t+1.
  • Taxpayer loss on default: If bank i defaults at time t, taxpayer loss is L_i(t)=α W_i(t)+LGD_i J_i(t), where α is the fraction of total assets lost upon default borne by taxpayers, and LGD_i is the loss on cumulative government equity investments J_i(t) (often set to 1 for equity-like injections).
  • MDP formulation: States S_t encode bank-level variables (W_i,E_i,PD_i,J_i), parameters (LGD, α, μ_i, σ_i, Σ_ij), the set of prior defaults I_def(t), and remaining time M−t. Actions A_s are vectors of capital injections ΔJ_i(t) (possibly constrained menus). Transitions P_a(s,s′) are induced by updated PDs and the Gaussian latent model; defaulted nodes remain defaulted. Rewards are negative taxpayer losses at each step (sum of L_i for defaults at t). Objective: maximize expected discounted reward (minimize discounted taxpayer losses) over a finite horizon 0…M with discount factor γ.
  • AI solution approach: Direct dynamic programming is infeasible due to a vast state space and sparse rewards. The authors devise a tailored variation of Fitted Value Iteration with:
    1. Value function approximation V_t(s,β) as a weighted sum of expected direct loss contributions Z_{ik}(s) for each active node and remaining step k, with time-varying weights β.
    2. Construction of Z_{ik} using approximated future PD paths, survival probabilities, expected impacts from others’ defaults, and the effect of sequentially chosen greedy actions at each remaining step to minimize total expected direct loss contributions.
    3. Action-value approximation Q_t(s,a,β) using the Bellman decomposition and a duality that rewrites expectations over successor states as sums over nodes’ default events, drastically reducing computational burden. Immediate expected reward terms are computed from updated PDs and L_i; future value terms are estimated via Monte Carlo sampling of default sets using the Gaussian latent model.
    4. Backward-in-time learning: Define a Bellman value V_B(s_t,β_{t+1})=max_a Q_t(s_t,a,β_{t+1}). Starting from terminal step where V=0, fit β at each time t by ridge regression (with cross-validation) to minimize discrepancies between V_t(s,β_t) and V_B(s,β_{t+1}) over a representative portfolio of reachable states. The procedure iterates backward from M−1 to 0.
  • Implementation setup for case studies: Crisis horizon M=7, γ=0.98, initial J_i(0)=0, μ_i=0, LGD_i=1 for equity injections, Σ_ij=0.5 among active nodes, σ_i calibrated at t=0 from reported PD_i(0), W_i(0), E_i(0) via the Merton map, PDM_floor=0.00021. Action menus specify injections as fractions of W_i (varied per case). A convenience metric Conv(s)=max_a Q_t(s,a) − Q_t(s,a_inaction) quantifies the net benefit of intervening versus not intervening at state s.
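The network dynamics listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the PD map uses the textbook Merton distance-to-default formula with a one-step horizon, and the homogeneous correlation is generated with a one-factor Gaussian representation, which yields the same pairwise correlation ρ between all latent variables. All function names (`merton_pd_raw`, `merton_pd`, `calibrate_sigma`, `inject`, `simulate_step`) are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
PD_FLOOR = 0.00021  # PDM_floor from the case-study setup


def merton_pd_raw(assets, equity, mu, sigma):
    """Unfloored one-step PD from a textbook Merton map (assumed form)."""
    if equity <= 0:  # equity exhausted: default is certain
        return 1.0
    debt = assets - equity
    if debt <= 0:  # no liabilities, so nothing to default on
        return 0.0
    d2 = (math.log(assets / debt) + mu - 0.5 * sigma**2) / sigma
    return STD_NORMAL.cdf(-d2)


def merton_pd(assets, equity, mu, sigma):
    """One-step PD with the floor applied."""
    return max(merton_pd_raw(assets, equity, mu, sigma), PD_FLOOR)


def calibrate_sigma(assets, equity, target_pd, mu=0.0):
    """Bisect for the sigma reproducing a reported PD.

    Relies on the PD increasing in sigma, which holds for mu = 0
    and positive equity.
    """
    lo, hi = 1e-6, 5.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if merton_pd_raw(assets, equity, mu, mid) < target_pd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inject(W, E, J, i, amount):
    """Government injection: assets, equity and cumulative stake all rise."""
    W[i] += amount
    E[i] += amount
    J[i] += amount


def simulate_step(W, E, J, alive, w, alpha, lgd, mu, sigma, rho, rng):
    """Advance one step in place; return the taxpayer loss realised in it."""
    n = len(W)
    common = rng.gauss(0.0, 1.0)  # shared factor (assumed one-factor model)
    defaults = []
    for i in range(n):
        if not alive[i]:
            continue
        pd_i = merton_pd(W[i], E[i], mu, sigma[i])
        x_i = math.sqrt(rho) * common + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        if pd_i >= 1.0 or x_i < STD_NORMAL.inv_cdf(pd_i):
            defaults.append(i)
    # Taxpayer loss L_i = alpha * W_i + LGD * J_i for each defaulting bank.
    loss = sum(alpha * W[j] + lgd * J[j] for j in defaults)
    for j in defaults:
        alive[j] = False  # defaulted nodes stay defaulted
    # Impacts I_i = sum_j w_ij * delta_j hit the survivors' assets and equity.
    for i in range(n):
        if alive[i]:
            impact = sum(w[i][j] for j in defaults if j != i)
            W[i] -= impact
            E[i] -= impact
    return loss
```

Passing `random.Random(seed)` as `rng` and calling `simulate_step` M times gives one simulated crisis path; averaging discounted losses over many paths gives a crude Monte Carlo estimate of the expected taxpayer loss under a fixed injection plan. That brute-force estimate is only a baseline for intuition and does not replace the fitted value iteration described above.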
Key Findings
  • KK (Krackhardt kite) synthetic network (10 nodes, W_i(0)=100, E_i(0)=3; PD_i(0)=0.01 for nodes 4,8,10 and 0.001 otherwise; homogeneous w_ij=1):
    • For low α=0.0001, the optimal action is no investment; as α increases, larger injections become preferable. For α=0.01, the best action is investing 1.5% of assets in all nodes; investing the maximum 2% in all nodes is never optimal, and investing 0.5% in all nodes is consistently the worst, as it puts more funds at risk without making banks sufficiently resilient.
    • Network position matters: investing in the central node (node 4) dominates investing in a peripheral node (node 10) for the same amount when there are no prior investments.
    • With pre-existing investment of 0.5% W_10 in node 10, the optimal action becomes to continue investing significantly more in node 10 (e.g., 10@15 or 10@20), outperforming alternatives including investing in the central node. This highlights path dependence and potential moral hazard: prior injections create incentives to keep supporting already rescued banks.
    • As time to end M−t decreases, expected losses |Q| shrink for all actions due to less time for defaults/contagion; the importance of network position diminishes near the horizon.
  • EBA (European GSII) reconstructed network (bank balance-sheet and PD inputs from EBA/Fitch):
    • Convenience to intervene Conv>0 for α∈{0.001, 0.005, 0.01}, and Conv<0 for α=0.0001: bailouts pay off only when taxpayer wealth-at-default stakes are sufficiently high.
    • Conv increases with longer time to horizon when positive (and decreases in absolute value when negative) and grows with greater network distress: lower initial capital E_i(0) (e.g., halved capital scenario), higher bilateral exposures w_ij, and longer crisis duration.
    • Sensitivity of optimal policy:
      • Higher discount factor γ increases the magnitude of expected losses for all actions; qualitative ranking of actions remains stable.
      • Increasing baseline PDs raises expected losses for all actions; the best action becomes clearly the largest available broad-based injection in risky institutions; the worst action shifts from 0.5% to medium-size injections (1.0–1.5%) when PDs are high, as these are still insufficient for resilience but put more funds at risk.
      • Larger bilateral exposures w_ij worsen outcomes for all actions; the advantage of sizeable interventions over inaction grows with exposure strength.
    • Critical threshold in α: There exists α_c≈0.00079 (baseline) such that for α<α_c, inaction is optimal; for α>α_c, the optimal action at time 0 is to inject the maximum considered amount (3.0% of W_i) in all risky institutions. In a severely distressed network (capital halved), α_c≈0.0005 and the optimal action decreases to 1.5% in all risky institutions.
    • In all but the high-PD scenarios, the worst action is the smallest widespread injection (0.5% in risky institutions), which elevates taxpayer exposure without making banks sufficiently resilient. Overall, the convenience to intervene increases with higher α, longer horizons, weaker capitalization, stronger interbank exposures, and greater network distress; optimal policies can be discontinuous as α crosses α_c.
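To make the convenience metric and the critical threshold concrete, the sketch below evaluates Conv and locates α_c by bisection in a deliberately tiny toy: a single bank, a single step, no network and no contagion. It is not the authors' model and its numbers should not be compared with the reported α_c values. The PD map is an assumed textbook Merton form with μ=0, the balance sheet (W=100, E=3), σ=0.013 and the injection menu are illustrative choices, and every function name is hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf
PD_FLOOR = 0.00021  # PDM_floor from the case-study setup
MENU = (0.005, 0.01, 0.02)  # illustrative injections as fractions of assets


def one_step_pd(assets, equity, sigma):
    """Floored one-step PD from an assumed textbook Merton map with mu = 0."""
    if equity <= 0:
        return 1.0
    debt = assets - equity
    d2 = (math.log(assets / debt) - 0.5 * sigma**2) / sigma
    return max(PHI(-d2), PD_FLOOR)


def q_value(alpha, fraction, assets=100.0, equity=3.0, sigma=0.013):
    """Negative expected taxpayer loss when injecting `fraction` of assets."""
    x = fraction * assets
    pd = one_step_pd(assets + x, equity + x, sigma)
    # Loss on default: alpha * W + LGD * J, with LGD = 1 and J = x.
    return -pd * (alpha * (assets + x) + x)


def convenience(alpha, menu=MENU):
    """Conv = best intervention value minus the value of inaction."""
    return max(q_value(alpha, f) for f in menu) - q_value(alpha, 0.0)


def critical_alpha(menu=MENU, lo=1e-6, hi=0.1, iters=60):
    """Bisect for the alpha at which Conv changes sign.

    Assumes Conv < 0 at `lo`, Conv > 0 at `hi`, and Conv increasing in alpha,
    which holds in this toy because each action's value gap is linear in alpha.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if convenience(mid, menu) > 0:
            hi = mid
        else:
            lo = mid
    return hi
```

Even this toy reproduces the qualitative pattern in the findings: for small α the funds put at risk by an injection outweigh the reduction in default probability, so Conv is negative, while above the threshold returned by `critical_alpha()` intervening is preferable to inaction.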
Discussion

The study's findings directly address the central question of when bailouts minimize taxpayer losses. By integrating network contagion, structural default probabilities, and a control framework, the authors show that interventions are not universally beneficial; they become optimal only when the taxpayer-stake parameter α exceeds a critical value α_c, which is determined endogenously by the system's condition and structure. This threshold explains why, in low-loss environments, inaction is preferable, while in high-loss regimes, decisive, sufficiently large and broad-based capital injections reduce expected losses by preventing cascades. Sensitivity analyses indicate that longer crisis horizons, higher exposures, and weaker capitalization increase the value of intervention, aligning with intuition about compounding contagion and vulnerability. The KK case illustrates how network centrality and, importantly, prior investments shape optimal choices, implying that governments may rationally continue supporting already rescued banks, which raises concerns about moral hazard. These insights are pertinent for regulators designing bailout programs: small, widespread injections can be counterproductive; interventions should be sized above resilience thresholds and targeted considering network structure and prior support; and policymakers should weigh moral hazard alongside systemic stabilization benefits.

Conclusion

The paper contributes a dynamic, PD-based financial network model with explicit governmental control and a bespoke AI method to solve the resulting MDP, enabling quantitative evaluation and optimization of bailout policies from taxpayers’ perspective. Empirically informed case studies (KK and EBA networks) reveal a critical taxpayer-stake threshold α_c that separates regimes where intervention is beneficial from those where it is not, and show that the convenience and optimal size of interventions rise with crisis duration, bank distress, and interbank exposures. The framework indicates that governments may optimally continue supporting previously bailed-out institutions, highlighting moral hazard risks. Future research directions include: modeling post-crisis market dynamics and optimizing sale timing and proceeds of acquired equity; allowing the crisis horizon to be action-dependent (e.g., large bailouts shorten crises); expanding instruments beyond equity (varying LGD and capital structure impacts); incorporating heterogeneous correlations and richer contagion channels; and conducting professional calibration with supervisory data to deploy the tool in policy settings.

Limitations
  • Data and calibration: Bilateral exposures are reconstructed from aggregates; results may differ from true networks. The approach relies on supervisory-quality calibration to be operational; the paper uses stylized or publicly available inputs.
  • Modeling assumptions: Equity-focused injections with LGD=1; neutral post-crisis returns beyond M; fixed PD floor and homogeneous correlation in some setups; Gaussian latent dependence; fixed horizon M with no endogenous resolution.
  • Action space discretization: Investment menus are discretized and may not capture optimal continuous policies.
  • Moral hazard and behavior: Banks’ strategic responses to bailouts are not endogenously modeled; changes in risk-taking are discussed qualitatively.
  • Generalizability: Case studies are illustrative; thresholds like α_c depend on calibration and network specifics and may shift with different data.
  • Computational approximations: Value function approximation and Monte Carlo introduce approximation error; convergence depends on representative state sampling and parameter fitting choices.