Transportation
A measurement model of pedestrian tolerance time under signal-controlled conditions
X. Hu, N. Wang, et al.
The study addresses the need for low-carbon mobility and pedestrian safety amidst urban signal control strategies that often overlook heterogeneous pedestrian tolerance to red lights. Current signal timings are typically based on aggregate walking speeds and fixed parameters, leading to prolonged waits that can exceed pedestrians’ tolerance, prompting illegal crossings and compromising safety. The authors define observation cycles within signal phases and classify pedestrian states as k1 (arrive during green, cross on green), k2 (arrive during red, wait for green; censored tolerance), and k3 (arrive during red, cross during red; uncensored tolerance). The research focuses on predicting individual red light tolerance time for k3 pedestrians, where both start and end of tolerance are observed, to support dynamic signal timing that could convert illegal (k3) to legal waiting behavior (k2). The paper highlights gaps: lack of micro-level individual predictions, reliance on single models despite complex factors, and absence of pedestrian type classification. The objective is to build a machine learning framework that classifies pedestrians by tolerance type and predicts individual tolerance times to inform flexible, safety-oriented signal timing.
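The three pedestrian states can be written as a small labelling rule. The sketch below is only illustrative: the `Observation` fields, the time convention (seconds within one observation cycle), and the choice to censor a k2 record at the green onset are assumptions made for the example, not the authors' data schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Observation:
    """One pedestrian in one observation cycle (hypothetical schema, times in seconds)."""
    arrival_s: float    # when the pedestrian reaches the kerb
    crossing_s: float   # when the pedestrian starts to cross
    red_start_s: float  # start of the pedestrian red phase
    red_end_s: float    # end of the red phase (green onset)


def classify(obs: Observation) -> Tuple[str, Optional[float], Optional[int]]:
    """Return (state, duration_s, event), where event=1 means a red-light run."""
    arrived_on_red = obs.red_start_s <= obs.arrival_s < obs.red_end_s
    if not arrived_on_red:
        # k1: arrives on green and crosses on green, so no waiting time is measured.
        return "k1", None, None
    if obs.crossing_s < obs.red_end_s:
        # k3: crossed during red, so the tolerance time is fully observed.
        return "k3", obs.crossing_s - obs.arrival_s, 1
    # k2: waited for green; the true tolerance is at least the observed wait (censored).
    return "k2", obs.red_end_s - obs.arrival_s, 0
```

Under these assumptions, a pedestrian who arrives 10 s into an 85 s red and starts crossing at 40 s is labelled k3 with a tolerance time of 30 s, while one who waits until the green onset is labelled k2 with a censored duration of 75 s.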
Prior work spans three areas: (1) influencing factors, (2) numerical tolerance levels, and (3) modeling approaches. Factors shown to affect tolerance include age, gender, walking speed, crossing type, group size, distraction (e.g., mobile phone use), countdown timers, and intersection-specific effects (Kumar and Ghosh, 2022; Shaaban and Abdelwarith, 2020; Truong et al., 2022; Cui et al., 2022; Chen et al., 2023). Reported tolerance benchmarks vary by context: in Japan, maximum tolerance 40–60 s with increased anxiety at 21–28 s (Asaba and Saito, 1998); Beijing estimates of ideal 18.7 s and limit 52.8 s (Zhang et al., 2016); about 20% of pedestrians with zero tolerance (Jain et al., 2014). Deterministic models (e.g., logistic regression) are interpretable but sensitive to data quality and often ignore censoring. Survival analysis has been used to address censoring (Hamed, 2001; Tiwari et al., 2007), but nonparametric approaches don’t quantify covariate effects; parametric Weibull models impose distributional assumptions that may not hold (Dhoke et al., 2021; Liu et al., 2022). Deep survival methods have emerged to capture nonlinearities without strong assumptions (e.g., DeepWait, Kalatian and Farooq, 2019), but real-world validation and inclusion of broader factors remain limited. The literature thus lacks micro-level predictive methods integrating classification by tolerance types and multi-model ensembles with robust hyperparameter optimization.
Data: Video-based observations were collected at three high-density signalized crosswalks in Chongqing, China (two commercial areas and one urban road), on sunny days from 7:00–20:00 between Sept. 12–29, 2023, at 60 fps. Signal timings: sites had red durations of 75, 85, and 105 s and green durations of 20, 15, and 20 s.
Features: Features recorded for each pedestrian included gender (i1), age (i2), distraction status (i3), walking speed (i4), presence of violators ahead (i5), time of day (day/night) (i6), pedestrian volume (i7), and group size (i8). The event label indicates running the red (1) vs. not (0). For k3 (illegal) cases, tolerance time is observed as the time from arrival during red to the start of illegal crossing. For k2 (legal waiting), data are censored since the red ends before tolerance is exhausted.
Preprocessing: Duplicates and missing entries were removed; tolerance times were constrained to 5–105 s; outliers were assessed via isolation forest. Resulting dataset: 1223 normal crossings (k2, censored) and 1527 illegal crossings (k3, uncensored).
Descriptive statistics of illegal cases: 59.5% male; age distribution underage 15.3%, prime 59.2%, elderly 25.5%; 39.6% distracted; 55.7% alone; pedestrian volume loose 35.4%, moderate 30.3%, crowded 34.3%; daytime 71.1%; walking speed low 29.3%, medium 38.6%, high 32.1%; violators ahead 37.5%.
Tolerance-type grouping: Random Survival Forests (RSF) were trained on combined k2 (censored) and k3 (uncensored) data, defining the red-light run as the event. RSF produced individual cumulative hazard functions and risk scores (integrated mortality). K-means clustering on risk scores segmented pedestrians into three tolerance groups. The silhouette method indicated an optimal cluster count of 3. The k3 illegal dataset was split into low, medium, and high tolerance sub-datasets with 553, 452, and 522 samples, respectively.
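The clustering step of the tolerance-type grouping can be illustrated with a dependency-free sketch. It assumes the RSF risk scores have already been computed and are passed in as plain numbers. The function names, the deterministic centroid initialisation, and the mapping "higher risk score means lower tolerance" are assumptions made for this example, not details confirmed by the paper.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int = 3, max_iter: int = 100) -> Tuple[List[int], List[float]]:
    """Plain Lloyd's algorithm on scalar values; returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic start: one centroid from the middle of each of k equal-sized slices.
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]

    def assign(cs: List[float]) -> List[int]:
        # Each value goes to its nearest centroid.
        return [min(range(k), key=lambda c: abs(v - cs[c])) for v in values]

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(centroids), centroids


def tolerance_groups(risk_scores: List[float]) -> List[str]:
    """Label each pedestrian 'low', 'medium' or 'high' tolerance from a risk score."""
    labels, centroids = kmeans_1d(risk_scores, k=3)
    # Assumption: a higher risk score means an earlier red-light run, i.e. lower tolerance.
    by_risk = sorted(range(3), key=lambda c: centroids[c], reverse=True)
    name = {cluster: group for cluster, group in zip(by_risk, ["low", "medium", "high"])}
    return [name[lab] for lab in labels]
```

In practice a library implementation with multiple random restarts and a silhouette check over several values of k would be the more robust choice. The hand-written loop here only makes the assignment and update steps explicit.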
Feature grouping: To reduce dimensionality and data collection burden while maintaining accuracy, features were grouped from multiple perspectives: M1 (personal: i1–i4), M2 (environmental: i5–i8), M3 (features selected as important by RF), and M4 (features jointly identified as important by RF, SVR, and XGB; a consensus/voting set). These groups served as inputs to separate models per tolerance group.
Modeling framework: A two-layer stacking ensemble was used for numerical prediction of individual tolerance time within each tolerance group. Base learners: RF, XGB, MLP, SVR. Meta-learner: multiple linear regression (MLR). To prevent leakage, 5-fold CV on the training portion generated out-of-fold predictions from each base model, which were concatenated as meta-features; test-set predictions were averaged across folds to form test meta-features. The meta-model was trained on these features to yield final predictions.
Hyperparameter optimization: Bayesian Optimization (BOA) with a Gaussian Process surrogate and acquisition maximization was applied to tune base-model hyperparameters over predefined ranges (e.g., MLP hidden units and learning rate; XGB learning rate, tree depth, subsample, regularization; RF number of trees, depth, min samples split; SVR C and epsilon). The objective was validation performance (e.g., MSE) within the CV framework, iterating until convergence or iteration limits.
Evaluation: Metrics included MSE, MAE, and MAPE on held-out test sets within each tolerance group. Additional comparisons were made against single models (MLP, XGB, RF, SVR) and an ungrouped stacking model to assess benefits of tolerance grouping and feature grouping.
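The out-of-fold mechanics of the stacking framework can be sketched without external libraries. This is a toy illustration, not the authors' implementation: two one-feature least-squares lines stand in for the RF, XGB, MLP, and SVR base learners, a closed-form two-input regression stands in for the MLR meta-learner, and all function names are hypothetical.

```python
import random
from statistics import mean


def fit_line(col):
    """Return a fitter for a least-squares line on one feature column (toy base learner)."""
    def fitter(X, y):
        xs = [row[col] for row in X]
        mx, my = mean(xs), mean(y)
        sxx = sum((x - mx) ** 2 for x in xs)
        slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx if sxx else 0.0
        return lambda row: my + slope * (row[col] - mx)
    return fitter


def fit_mlr2(Z, y):
    """Linear regression with intercept on exactly two meta-features (closed form).
    Assumes the two meta-feature columns are not perfectly collinear."""
    m1, m2, my = mean([z[0] for z in Z]), mean([z[1] for z in Z]), mean(y)
    s11 = sum((z[0] - m1) ** 2 for z in Z)
    s22 = sum((z[1] - m2) ** 2 for z in Z)
    s12 = sum((z[0] - m1) * (z[1] - m2) for z in Z)
    c1 = sum((z[0] - m1) * (t - my) for z, t in zip(Z, y))
    c2 = sum((z[1] - m2) * (t - my) for z, t in zip(Z, y))
    det = s11 * s22 - s12 ** 2
    b1 = (c1 * s22 - c2 * s12) / det
    b2 = (c2 * s11 - c1 * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return lambda z: b0 + b1 * z[0] + b2 * z[1]


def stack_fit_predict(X_train, y_train, X_test, fitters, k=5, seed=0):
    """Out-of-fold stacking with two base learners; returns (test predictions, OOF matrix)."""
    order = list(range(len(X_train)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    oof = [[0.0] * len(fitters) for _ in X_train]
    test_meta = [[0.0] * len(fitters) for _ in X_test]
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in order if i not in held_out]
        X_tr = [X_train[i] for i in train_idx]
        y_tr = [y_train[i] for i in train_idx]
        for j, fitter in enumerate(fitters):
            model = fitter(X_tr, y_tr)
            for i in fold:
                oof[i][j] = model(X_train[i])      # prediction for a row this model never saw
            for t, row in enumerate(X_test):
                test_meta[t][j] += model(row) / k  # average test predictions over the folds
    meta = fit_mlr2(oof, y_train)                  # meta-learner sees only out-of-fold predictions
    return [meta(z) for z in test_meta], oof
```

Because every meta-feature for a training row comes from a model that was fitted without that row, the meta-learner never sees predictions that were fitted to their own targets. This is the leakage the 5-fold scheme is meant to prevent.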
- RSF + K-means effectively segmented pedestrians into three tolerance groups; silhouette analysis supported three clusters. Illegal-crossing dataset sizes: low 553, medium 452, high 522.
- Across all tolerance groups, the M4 feature set (multi-model voted important features) yielded the best performance:
  • Low tolerance: MSE 6.58, MAE 1.91 s, MAPE 19.78%.
  • Medium tolerance: MSE 4.82, MAE 1.53 s, MAPE 7.63%.
  • High tolerance: MSE 33.32, MAE 3.89 s, MAPE 10.14%.
- The low tolerance group showed a higher MAPE because its target values are smaller, but its absolute errors remained small (≈2 s on average).
- Model comparison on test data showed grouped stacking outperformed single models and ungrouped stacking:
  • Single models (MSE/MAE): MLP 39.86/4.90; XGB 37.29/4.12; RF 41.74/5.00; SVR 67.23/7.19.
  • Ungrouped stacking: 34.98/3.97.
  • Grouped stacking (best per group with M4): low 6.58/1.91; medium 4.82/1.53; high 33.32/3.89.
- Meta-model feature importance varied by group (example MLR weights/importance): in the medium group RF predictions had the highest importance (0.764), while in the high group XGB dominated (0.537), indicating heterogeneous model strengths across tolerance types.
- The approach enables deriving probability distributions and statistics (expectation, variance) of crowd tolerance time per red phase, supporting data-driven signal timing.
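The three error metrics reported above can be computed as follows. The two sets of numbers are invented for illustration and are not the study's data. They are chosen so that both have the same absolute error, which shows why a group with short tolerance times can have a much higher MAPE at a similar MAE.

```python
def mse(y_true, y_pred):
    """Mean squared error (in squared seconds when the inputs are in seconds)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up tolerance times in seconds; every prediction is off by exactly 2 s.
short_true, short_pred = [8.0, 10.0, 12.0], [10.0, 8.0, 14.0]
long_true, long_pred = [40.0, 50.0, 60.0], [42.0, 48.0, 62.0]

print(mae(short_true, short_pred), mape(short_true, short_pred))
print(mae(long_true, long_pred), mape(long_true, long_pred))
```

Both toy sets have an MAE of 2 s, but the short-time set has a MAPE of roughly 20.6% against roughly 4.1% for the long-time set, because each error is divided by a smaller true value.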
The study’s goal was to predict individual pedestrian red light tolerance times to inform dynamic, pedestrian-friendly signal control. By classifying pedestrians into tolerance groups via RSF-derived risk scores and tailoring stacked regressors per group with optimized hyperparameters, the model captured heterogeneous behavioral patterns and improved accuracy relative to single-model baselines and ungrouped ensembles. The findings demonstrate that micro-level, group-specific modeling reduces errors across tolerance types and enables constructing crowd-level tolerance distributions in real time. This supports two practical applications: (1) active signal timing adaptation—estimating current waiting crowd tolerance distributions to set red durations that minimize illegal crossings and balance pedestrian/vehicle efficiency; and (2) passive traffic management—identifying low-tolerance individuals for targeted interventions during high vehicular flow periods to reduce red-light violations. Overall, the results directly address the research gap in predicting individual tolerance times, showing that ensemble learning with classification and feature grouping is effective and operationally relevant.
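The first application, choosing a red duration from the predicted tolerance times of the waiting crowd, can be sketched as below. This is a simplified illustration rather than the paper's procedure: it assumes every pedestrian arrives at the start of red, the candidate durations and the acceptable violation share are hypothetical inputs, and all function names are invented for the example.

```python
from statistics import mean, pvariance
from typing import List, Optional, Tuple


def crowd_summary(pred_tolerance_s: List[float]) -> Tuple[float, float]:
    """Expectation and population variance of the predicted tolerance times."""
    return mean(pred_tolerance_s), pvariance(pred_tolerance_s)


def share_exceeded(pred_tolerance_s: List[float], red_s: float) -> float:
    """Share of the crowd whose predicted tolerance is shorter than the red duration.
    Simplifying assumption: everyone arrives at the start of red."""
    return sum(t < red_s for t in pred_tolerance_s) / len(pred_tolerance_s)


def longest_acceptable_red(pred_tolerance_s: List[float],
                           candidates_s: List[float],
                           max_share: float) -> Optional[float]:
    """Longest candidate red duration whose expected violation share stays within max_share."""
    ok = [r for r in candidates_s if share_exceeded(pred_tolerance_s, r) <= max_share]
    return max(ok) if ok else None
```

A real controller would also have to account for staggered arrivals and vehicle demand. The sketch only shows how a crowd-level distribution of individual predictions turns into a timing decision.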
The paper proposes a two-layer stacking model (RF, XGB, MLP, SVR as base; MLR as meta), with Bayesian hyperparameter optimization, to predict individual pedestrian red light tolerance times under signal control. Using RSF risk scores and K-means clustering, pedestrians were segmented into low, medium, and high tolerance groups, and multi-perspective feature groupings were evaluated. The consensus feature group (M4) consistently yielded the best performance across groups, and grouped stacking surpassed single models and ungrouped stacking. The approach enables dynamic estimation of individual and crowd tolerance time distributions to guide flexible pedestrian signal timing and targeted management to reduce illegal crossings. Future work includes expanding observable features (e.g., travel purpose/time, temperature, weather, built environment) and validating generalization across diverse contexts to enhance robustness.
- Model performance may be dataset-dependent; machine learning and deep learning components can overfit to the specific data used, risking reduced generalization to other cities or conditions.
- Feature set is limited to demographics/behavioral and simple environmental attributes; unobserved factors (e.g., weather, temperature, built environment, trip purpose) were not included and may affect tolerance.
- Censored data (k2) were used only for classification, not direct time prediction; integrating censored-data-aware regression could further utilize available information.
- Data were collected at three sites in one city under fair weather and daytime-dominant periods, potentially limiting external validity.
- Video-derived labels and manual preprocessing may introduce measurement noise and potential biases.