Transportation
Topological assessment of recoverability in public transport networks
R. Massobrio and O. Cats
Public transport networks (PTNs), like many complex systems, can be represented as graphs to understand how connectivity and interactions shape system dynamics. While extensive work has examined PTN robustness—how systems withstand node/link failures—far less is known about recoverability, namely how quickly and effectively networks return to normal performance after disruptions. Moreover, prior studies often focus only on topology, overlooking service characteristics and passenger implications. This study addresses these gaps by proposing a topological, passenger-centric framework to quantify recoverability in PTNs, extending approaches used in optical networks. Using a curated dataset of 42 metro systems worldwide, represented as weighted infrastructure (L-space) and service (P-space) graphs that incorporate in-vehicle travel times, waiting times, and transfer penalties, the study evaluates how recoverability metrics relate to network size, efficiency, density, redundancy, and hierarchy. The purpose is to inform both design and operational decisions to enhance resilience and service quality.
Prior research has emphasized PTN vulnerability and robustness, simulating random or targeted node/link attacks to assess performance degradation. Foundational works in network science and transport network analysis outline how topology influences resilience and observed system features. However, recoverability—the post-failure rebound—has received limited attention in PTNs. Existing recovery studies in other domains (e.g., optical networks) have proposed topological recoverability metrics, which inspire this work. Past PTN studies commonly used unlabeled (unweighted) graphs; in contrast, this study employs weighted representations capturing service characteristics (in-vehicle times, waiting times), offering richer insights. The paper situates its contribution within this gap, also noting the need for approaches that reflect passenger-centric performance (e.g., generalized travel cost).
Dataset and graph construction: The authors assembled GTFS-based datasets for 51 metro networks that meet UITP criteria (urban, exclusive right-of-way, grade-separated, high-frequency/high-capacity). Using gtfspy, they created initial graphs and curated station nodes by merging stops representing the same physical station via a three-step process: (1) automatic merging of identically named stops within 200 m; (2) candidate merging of stops with ≥75% name similarity within 500 m subject to manual confirmation; (3) manual merging through a visual interface. After sanity checks, two directed graph representations were derived: L-space (infrastructure, links labeled with average in-vehicle travel time) and P-space (service, links representing reachability without transfers; labels approximate average waiting time as half the joint headway). Networks with fewer than 50 links were excluded, yielding 42 networks for analysis.
Performance metric: Network efficiency e(G) is defined as the mean of the reciprocals of the Generalized Travel Cost (GTC) over all node pairs. For each pair (i,j), consider the m shortest L-space paths by in-vehicle time; compute for each path the in-vehicle time, accumulated waiting time per leg from P-space, and number of transfers. GTC(i,j) is the minimum over paths of [in_vehicle_time + α × waiting_time + β × n_transfers]; if no path exists, GTC(i,j)=0. Efficiency is the average of 1/GTC(i,j) across all pairs.
Failure and recovery processes: Starting from L-space G0(V,E0), failures remove K=αE0 links uniformly at random in K steps. At step i, retained performance ratio PG,i equals efficiency at step i normalized by the original efficiency. Recovery iteratively reintroduces the removed links. At each recovery step, a greedy heuristic adds the link (from the removed set) that yields the largest improvement in PG,i. Recoverability indicators are defined as: F = Σi=1..K (1 − PG,i) (cumulative performance loss during failure), R = Σi=1..K (PG,i − PG,i−1) (cumulative performance gain during recovery), along with R/F and R − F.
Experimental design: Failure levels α ∈ {0.25, 0.50, 0.75, 1.00}. Route choice parameters: m=5, α=2, β=5 for all networks except London, Paris, and New York (m=1 due to computational limits). A minimum of 10 independent failure/recovery runs per network ensured 95% confidence; distributions of F and R were normal by Shapiro–Wilk (99% level), so means are reported. Topological indicators computed include |V|, |E|, diameter D, density γ = |E|/(|V|(|V|−1)/2), meshedness α = |E|/(|V|−1), average shortest path SP (hops) and SP^W (in-vehicle time), assortativity r, and rates λ, λ^W of exponential fits to node betweenness centrality distributions when statistically significant (KS test, α=0.99). Correlations between recoverability and topology were assessed (Pearson), and cross-threshold consistency was examined (Spearman). Visualization included retained-performance curves, rankings, maps, and scatter plots (e.g., F vs SP; R vs α (meshedness)).
- Greedy recovery effectiveness: Failure and recovery curves are asymmetric; the area over recovery curves is much smaller than over failure curves, indicating the simple greedy strategy rapidly restores performance, especially as more links are reintroduced. With complete link removal, recovery becomes deterministic per network.
- Cross-threshold consistency: Spearman correlations across link-removal levels show high consistency for F (0.86–0.98) and for combined indicators R/F and R−F; R shows moderate positive correlations that decrease as removal thresholds diverge (0.66–0.89), suggesting some networks excel under partial vs complete failures differently for R.
- Rankings and exemplars:
- Lowest cumulative performance loss (F): Marseille, Lyon, Warsaw at 25% removal (F ≤ 10.1) and 100% removal (F ≤ 76.3). Among large networks, London, Santiago, Paris outperform New York, Madrid, Berlin, Chicago across thresholds. Highest F: Oslo, Stockholm, Rome (F ≥ 14.0 at 25%; F ≥ 84.4 at 100%).
- Highest cumulative performance gain (R): Santiago best across all thresholds (R = 19.9 at 25% removal; R = 59.4 at 100%). New York, London, Paris, Madrid, Valencia also consistently high. Lowest R: Oslo, Rome, Stockholm; Boston and Prague also low at higher removals.
- Combined indicators (R/F, R−F): Top five—Santiago, London, New York, Paris, Lyon. Bottom—Boston, Chicago, Rome, Stockholm, Oslo.
- Correlations with topology (Pearson across thresholds):
- F positively correlates with diameter D (0.39–0.57) and average shortest paths SP (0.60–0.70) and SP^W (0.42–0.48), indicating less efficient networks (longer hops/travel times) suffer larger losses; F shows negligible correlation with size (|V|, |E|) but negative correlation with density γ (−0.37 to −0.65), especially strong at higher removal levels, implying denser networks better withstand disruptions.
- R positively correlates with meshedness α (0.70–0.78), indicating networks with more cycles (redundancy) rebound faster; R also correlates with size (|V|: 0.53–0.60; |E|: 0.56–0.64), suggesting larger decision spaces facilitate greater performance gains during recovery.
- Combined indicators correlate positively with meshedness (0.52–0.70), weakly with size (|V|: 0.23–0.41; |E|: 0.27–0.46), and moderately to weakly with SP/SP^W (weaker at higher removals). Assortativity r and exponential rates of node betweenness (λ, λ^W) show negligible to low correlations with all recoverability metrics.
- Illustrative relations (complete link removal): Lower F aligns with shorter SP (smaller and some efficient large networks like London outperform peers); higher R aligns with greater meshedness, with Santiago an outlier exhibiting exceptionally high meshedness and recoverability.
- Practical implication: Efficient (short SP, SP^W), dense networks better withstand disruptions; larger, more meshed networks recover faster via greedy restoration.
The study demonstrates that a simple, topology-driven greedy recovery heuristic can quickly restore PTN performance after widespread link failures, even without detailed operational optimization. The analysis links recoverability to structural properties: efficiency (shorter average paths) and density reduce performance loss during failures, while redundancy (meshedness) and network size enable faster recovery. These insights align with passenger-centric performance, since shorter in-vehicle and waiting times and fewer transfers both improve baseline efficiency and provide alternative routing options during recovery. The findings suggest practical levers for planners: reduce average in-vehicle travel times to limit failure impacts, and increase redundancy (cycles, alternate paths) to accelerate recovery. Because correlations are robust across failure scales for most indicators, planners can expect similar relative performance under both partial and complete failure scenarios, although some networks may differ in R depending on disruption scale. The negligible role of assortativity and betweenness distribution parameters implies hierarchical mixing patterns are less critical to recoverability than global efficiency and redundancy.
This work introduces a topological, passenger-centric framework to quantify PTN recoverability using efficiency based on generalized travel cost and four recoverability indicators (F, R, R/F, R−F), applied to 42 metro networks with weighted infrastructure and service representations. Key contributions include: (i) an end-to-end recoverability assessment pipeline combining random link failures and greedy link restoration; (ii) empirical evidence that efficiency and density mitigate performance loss, and that meshedness and size enhance recovery; and (iii) actionable insights for design and operations, such as prioritizing redundant connections and reducing in-vehicle travel times. Future research directions include expanding datasets (including additional modes), devising targeted failure scenarios and smarter recovery algorithms that leverage disruption information, incorporating broader performance metrics beyond GTC, and analyzing multimodal PTNs to assess cross-mode interplay during failures and recovery.
The approach adopts a strictly service-topology perspective and does not capture infrastructure operational details (e.g., junctions, passing loops, extra tracks) that could influence real-world recovery dynamics. Results rely on GTFS data availability and quality, which can be a bottleneck. Computational constraints required limiting the number of considered paths (m=1) for very large networks (London, Paris, New York), which may affect path choice richness. The analysis excludes very small networks (<50 links), so findings may not generalize to small systems.
Related Publications
Explore these studies to deepen your understanding of the subject.

