A stochastic prediction of minibus taxi driver behaviour in South Africa

Transportation


J. Schlüter, M. Frewer, et al.

Explore the insights of the South African minibus taxi industry, as researchers Jan Schlüter, Manuel Frewer, Leif Sörensen, and Justin Coetzee delve into the working behaviors of taxi drivers. This study presents novel findings on labor supply choices with implications for understanding economic decision-making in a unique context.

Introduction
The study examines how South African minibus taxi drivers decide when to stop working within an informal, demand-responsive paratransit system that dominates public transport for low-income users. It aims to determine whether drivers’ labour supply decisions align with reference-dependent preferences (income or hours targets) or with neoclassical profit maximization (intertemporal substitution). Using trip-level operational data from Rustenburg (and additional observations from Cape Town), the paper seeks to identify thresholds in hours worked and revenues earned that trigger stopping behaviour, with implications for regulating and formalizing paratransit operations and for potential technology-enabled dispatching.
Literature Review
Research on South Africa’s minibus taxi industry has largely focused on formalization efforts, policy integration with BRT systems, and governance challenges, often noting resistance to top-down reforms and the sector’s informal, decentralized nature (e.g., Barrett 2003; Schalekamp and Behrens 2009, 2010; Woolf and Joubert 2013; Venter 2013; Del Mistro and Behrens 2015). Quantitative analyses are scarce due to data limitations. In behavioural economics, prospect theory and reference-dependent preferences (Kahneman and Tversky 1979; Tversky and Kahneman 1991; Thaler 1980; Kőszegi and Rabin 2006) have been applied to taxi drivers’ labour supply, with mixed evidence on income targeting versus neoclassical substitution (Camerer et al. 1997; Farber 2005, 2008, 2015; Crawford and Meng 2011; Martin 2017; Thakral and Tô 2017). Prior work typically relies on wage elasticities, hazard-of-stopping models with income brackets, or deviations from reference points. Results vary, and recent studies suggest positive wage elasticities, with income targeting found primarily among inexperienced drivers.
Methodology
Data: Onboard observation surveys using the GoMetro Pro app. Rustenburg: 7765 trip observations over 49 days (Nov 2016–Sep 2017) covering full operating days for sampled vehicles; Cape Town: 4012 observations over 27 days (Apr–May 2017). Variables include trip times, boarding/alighting counts, revenues, and derived daily accumulated revenues and hours worked. Cleaning excluded trips with average speed <10 km/h or >80 km/h. Hours worked defined as elapsed time from the day’s first trip start to the current trip end; gaps may include waiting or breaks (not distinguishable).
Outcome: Binary indicator for whether a trip is the last trip of the driver’s day (1) vs not (0), interpreted as the hazard of stopping.
Models:
- Parametric regressions: OLS (for reference), logit, and probit on polynomials and linear splines of accumulated revenues and hours worked. Splines set at inner quintiles (20%, 40%, 60%, 80%) for both revenues and hours; also alternative spline boundaries based on quintiles of last-trip observations. Controls: time-of-day indicators, day-of-week, and driver fixed effects.
- Non-parametric: Generalized Additive Models (GAM) with a logit link, smoothing the effects of accumulated revenues and hours worked while including standard controls parametrically. Model selection via AIC.
Estimation details: GAM smoothing parameters were selected data-adaptively (e.g., revenue smooth ≈2.47/7.59; hours smooth ≈5.51/1.322 depending on inclusion of hourly fixed effects). Model fits compared via AIC (e.g., ≈4167 without hourly fixed effects, ≈4101 with them).
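
The variable construction described above can be illustrated with a minimal sketch. It is not the authors' code: the `Trip` fields, the `driver_day` key, and the sample trips are hypothetical, and the truncated linear basis is only one common way to build linear splines (the summary does not say which parameterization the paper uses). Knots would be set at the inner quintiles of the observed values.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Trip:
    driver_day: str   # hypothetical key: one driver on one calendar day
    start_h: float    # trip start, hours since midnight
    end_h: float      # trip end, hours since midnight
    revenue: float    # fare revenue collected on the trip (ZAR)
    km: float         # trip distance in km


def keep(trip):
    """Cleaning rule from the text: drop trips slower than 10 or faster than 80 km/h."""
    duration = trip.end_h - trip.start_h
    if duration <= 0:
        return False
    return 10 <= trip.km / duration <= 80


def derive(trips):
    """Per-trip hours worked, accumulated revenue, and last-trip indicator."""
    clean = sorted((t for t in trips if keep(t)),
                   key=lambda t: (t.driver_day, t.start_h))
    rows = []
    for key, group in groupby(clean, key=lambda t: t.driver_day):
        day = list(group)
        first_start = day[0].start_h
        cum_revenue = 0.0
        for i, t in enumerate(day):
            cum_revenue += t.revenue
            rows.append({
                "driver_day": key,
                # elapsed time since the day's first trip start (includes gaps)
                "hours_worked": t.end_h - first_start,
                "cum_revenue": cum_revenue,
                # outcome: 1 if this is the last trip of the driver's day
                "last_trip": int(i == len(day) - 1),
            })
    return rows


def linear_spline_basis(x, knots):
    """Truncated linear basis (assumed form): x plus one hinge term per knot."""
    return [x] + [max(0.0, x - k) for k in knots]


# Made-up sample day; the third trip (4 km/h) is removed by the speed filter.
trips = [
    Trip("d1", 6.0, 6.5, 120.0, 15.0),
    Trip("d1", 7.0, 7.5, 90.0, 12.0),
    Trip("d1", 8.0, 8.5, 60.0, 2.0),
    Trip("d1", 17.0, 18.0, 150.0, 30.0),
]
rows = derive(trips)
```

Each row of `rows` could then be expanded with `linear_spline_basis` for revenues and hours and passed, together with the controls, to a logit or probit routine.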
Key Findings
- Stopping is driven primarily by time-of-day and hours worked, not by accumulated revenue until very high levels:
  - In GAM without hourly fixed effects, hours worked shows a strong non-linear pattern: reduced stopping likelihood in the first ~4 hours; roughly flat from ~5–8 hours; rapidly increasing stopping likelihood after ~9 hours, peaking in the 10–12 hour range.
  - With hourly fixed effects included, the hours-worked smooth loses significance (p≈0.47), indicating that time-of-day largely explains stopping, consistent with drivers following a day schedule (morning start, evening stop).
  - Accumulated revenue exhibits minimal effect on stopping until very high earnings (~>1500 ZAR). When significant (p≈0.04), the effect is small.
- Evidence of flexible adjustment around a 10–12 hour workday: on “good days” (higher earnings), drivers tend to extend work by about an hour; on “bad days” (low earnings, e.g., ≤300 ZAR), higher stopping probabilities (~0.3) occur after 4–6 hours.
- Parametric models indicate nonlinearity (significant higher-order revenue polynomials) but splines often lack statistical significance; time-of-day indicators are strongly positive in evening hours.
- AIC comparisons favor GAM over simple parametric forms (e.g., AIC ≈4167 without hourly fixed effects; ≈4101 with them), aligning with non-linear stopping dynamics.
- Additional note (intra-trip): number of passengers boarded (proxy for trip-level income) has a highly significant, non-linear effect on deciding to end a trip segment, suggesting a possible trip-level reference point distinct from day-level behaviour.
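
For orientation, the additive logit specification behind the GAM results above can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $y_{it}$ is the last-trip indicator for driver $i$ on trip $t$, $R_{it}$ and $H_{it}$ are accumulated revenue and hours worked, $f_R$ and $f_H$ are smooth functions, $\alpha_i$ is a driver fixed effect, $\gamma_{d(t)}$ a day-of-week effect, and $\delta_{h(t)}$ an hour-of-day effect.

```latex
\operatorname{logit}\,\Pr(y_{it} = 1)
  = \alpha_i + f_R(R_{it}) + f_H(H_{it}) + \gamma_{d(t)} + \delta_{h(t)}
```

Under this reading, the variant "without hourly fixed effects" drops the $\delta_{h(t)}$ term. The finding that $f_H$ loses significance once $\delta_{h(t)}$ is included is what suggests that clock time, rather than elapsed hours, anchors the stopping decision.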
Discussion
Findings indicate that minibus taxi drivers predominantly determine the end of their working day by time-of-day and typical work duration rather than by meeting a fixed daily income target. The sharp rise in stopping probability after 10–12 hours suggests a reference in hours (or a conventional workday), while small revenue effects at the day level imply limited income targeting for daily totals. Nonetheless, drivers display profit-maximizing flexibility: shortening shifts on low-earning days and extending by about an hour on high-earning days. The dissipation of the hours-worked effect after controlling for hourly fixed effects implies that stopping decisions are anchored to evening hours, with later starters working fewer hours. The intra-trip analysis points to meaningful trip-level reference dynamics based on passenger counts. Overall, behaviour combines elements of reference dependence (hours) and profit maximization (adjustments to market conditions), which is in line with the mixed evidence in the broader literature and may reflect learning effects over experience.
Conclusion
Using parametric splines and GAM on large-scale observational data, the study shows that South African minibus taxi drivers typically work 10–12 hours and end shifts in the evening, with modest flexibility tied to daily earnings. Drivers cannot be cleanly categorized as either reference-dependent or profit-maximizing; instead, both behaviours are present, with clearer thresholds in hours worked. These insights can inform algorithmic dispatch and planning for paratransit formalization and integration, potentially improving efficiency and environmental outcomes. Future research should leverage richer, technology-enabled datasets (including driver characteristics, area and weather effects, precise waiting vs break times, and longer panels) and explore interactive effects and multi-period (weekly/monthly) reference points, as well as trip-level reference dynamics suggested by passenger-boarding effects.
Limitations
- Hours worked variable includes all elapsed time since first trip start (cannot distinguish waiting in queue from breaks), potentially biasing the hours effect.
- Limited covariates: lack of weather, route/area start-stop fixed effects, and other contextual factors; only day-of-week, time-of-day, and driver fixed effects included.
- No socio-economic or experience data on drivers; cannot test learning effects or driver heterogeneity in depth.
- Additive model specification; potential interactions between time and revenue not modeled to retain interpretability.
- Observations track drivers only within single days; cannot assess weekly or monthly reference points or turnover cycles.
- Per-driver variation is limited (≈18 trips on average) for estimating individual reference levels; common reference levels can be problematic.
- Revenue effects may be underestimated due to flat fare structures and limited revenue variance at the day level.