Introduction
Alcohol consumption is a major contributor to the global disease burden and mortality, with drunk driving a leading cause of fatal traffic accidents. Current reliable methods for detecting intoxicated driving, such as ignition interlock devices, are expensive and require regular maintenance, which motivates the development of scalable, low-cost alternatives. This research proposes a machine learning system based on driver monitoring cameras, which are already mandated in many countries and will soon be standard in new vehicles. The system aims to provide real-time information on a driver's blood alcohol concentration (BAC), enabling early warnings and interventions to prevent alcohol-related harm. The study focuses on developing and rigorously evaluating a machine learning model capable of detecting two critical BAC thresholds: (1) any BAC above 0.00 g/dL but below the WHO-recommended limit of 0.05 g/dL (early warning), and (2) BAC levels exceeding the 0.05 g/dL limit.
Literature Review
Existing digital interventions for monitoring alcohol consumption and intervening when necessary have explored various approaches, including gait analysis, smartphone interactions, smart breathalyzers, wrist-worn devices, and social media data. These interventions range from self-management tools to peer and family notification systems and virtual-agent interventions; however, they lack focus on high-risk situations such as driving. In driver state detection, the focus has been on drowsiness, initially using vehicle signals from the CAN bus and later shifting to camera-based systems for better performance. Research on camera-based driver monitoring addresses various use cases, including driver identification, distraction detection, and emotion detection, and most systems issue an audiovisual warning when impairment is detected. While research exists on the effect of alcohol on driving performance, real-time drunk-driving detection from vehicle signals has proven difficult because driving behavior varies widely across individuals and environments. This study addresses that gap by leveraging driver monitoring cameras to focus on driver behavior, specifically eye and head movements, rather than relying solely on driving behavior data.
Methodology
This study employed a non-randomized, single-blinded, interventional, single-center clinical trial (ClinicalTrials.gov NCT04980846) involving 30 participants (15 female, 15 male; age 37.03 ± 9.24 years). Inclusion criteria included a valid driver's license, moderate alcohol consumption (validated using the Alcohol Use Disorders Identification Test (AUDIT) and phosphatidylethanol (PEth) levels), and absence of certain health conditions. The procedure involved a simulator training session followed by driving tasks in a research-grade driving simulator under controlled alcohol administration. Three BAC levels were targeted: (1) no alcohol (0.00 g/dL), (2) moderate (0.00 g/dL < BAC ≤ 0.03 g/dL), and (3) severe (0.05 g/dL < BAC ≤ 0.07 g/dL). Breath alcohol was measured with a calibrated device (Dräger Alcotest 6820). Participants drove for 30 minutes in each condition across three scenarios (highway, rural, urban). Gaze behavior and head movements were recorded with an infrared camera system (Tobii Pro Nano) at 60 Hz. The machine learning system extracted features from eye movements (velocity, acceleration), gaze events (fixations, saccades), and head movements, using a sliding-window approach (60-second window, 1-second shift) for feature engineering; a sketch of this step follows below. Logistic regression with lasso regularization was employed for predictive modeling, with two classification tasks: 'Early Warning' (BAC > 0.00 g/dL and ≤ 0.03 g/dL) and 'Above Limit' (BAC > 0.05 g/dL). Model evaluation used leave-one-subject-out (LOSO) cross-validation with AUROC as the primary performance metric (see the second sketch below), and a CAN-only baseline was included for comparison. A post-hoc interpretability analysis examined model coefficients to understand feature importance and to align findings with known pathophysiological effects of alcohol.
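To make the sliding-window step concrete, here is a minimal sketch assuming gaze samples arrive as an (N, 2) array of gaze coordinates sampled at 60 Hz; the function name and the reduced feature set (velocity and acceleration statistics only) are illustrative assumptions, not the paper's exact pipeline, which also covers gaze-event and head-movement features.

```python
import numpy as np

def sliding_window_features(gaze, fs=60, window_s=60, shift_s=1):
    """Slice a gaze recording into overlapping windows and compute
    simple per-window descriptors (illustrative sketch only)."""
    win, step = window_s * fs, shift_s * fs
    features = []
    for start in range(0, len(gaze) - win + 1, step):
        w = gaze[start:start + win]                   # (win, 2) gaze points
        velocity = np.linalg.norm(np.diff(w, axis=0), axis=1) * fs
        acceleration = np.diff(velocity) * fs
        features.append([
            velocity.mean(), velocity.std(),          # eye-movement velocity stats
            acceleration.mean(), acceleration.std(),  # eye-movement acceleration stats
        ])
    return np.asarray(features)
```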
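The modeling and evaluation step can likewise be sketched with scikit-learn, assuming a feature matrix X, binary labels y for one of the two tasks, and a per-window subject-ID array; the helper name and the regularization strength C are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def loso_auroc(X, y, subjects):
    """Leave-one-subject-out CV for an L1-regularized (lasso) logistic
    regression; returns mean and std of per-subject AUROC. Assumes each
    held-out subject contributes windows of both classes."""
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    )
    scores = []
    for train, test in LeaveOneGroupOut().split(X, y, groups=subjects):
        model.fit(X[train], y[train])
        prob = model.predict_proba(X[test])[:, 1]     # P(intoxicated)
        scores.append(roc_auc_score(y[test], prob))
    return np.mean(scores), np.std(scores)
```

Because lasso drives uninformative coefficients to zero, inspecting the fitted model's coefficients (here, `model[-1].coef_`) gives the kind of feature-importance reading used in the post-hoc interpretability analysis.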
Key Findings
The machine learning system demonstrated high performance in detecting drunk driving. AUROC was 0.88 ± 0.09 for the 'Early Warning' task and 0.79 ± 0.10 for the 'Above Limit' task, and performance was consistent across driving scenarios. The camera-only approach significantly outperformed a CAN-only baseline (AUROC 0.74 for 'Early Warning' and 0.66 for 'Above Limit'), while combining camera and CAN data yielded no significant improvement over the camera-only approach. Sensitivity analyses showed robustness to unseen driving scenarios, varying window lengths, and alternative machine learning models. Interpretability analysis revealed that gaze events (fixations and saccades) were the most important features, with longer fixation durations and shorter saccade amplitudes associated with higher BAC levels, consistent with known pathophysiological effects of alcohol on visual processing and attention. The system reached a reliable decision after approximately 90 seconds, further improved by majority-vote aggregation over multiple windows (sketched below). In a three-class view, true positive rates were 70% for the 'no alcohol' state, 45% for 'moderate', and 55% for 'severe' intoxication, with some confusion between the moderate and severe levels (30%).
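A minimal sketch of the majority-vote aggregation follows. The figure of 31 windows is an assumption derived from the 60-second window and 1-second shift (31 overlapping windows span roughly 90 seconds of driving); it is not stated as the paper's exact configuration.

```python
import numpy as np

def majority_vote(window_preds, n_windows=31):
    """Aggregate the most recent per-window predictions (0/1) into a
    single decision by strict majority vote (illustrative sketch)."""
    votes = np.asarray(window_preds[-n_windows:])
    return int(votes.sum() * 2 > len(votes))  # 1 if most windows say "intoxicated"
```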
Discussion
The findings demonstrate the effectiveness of using driver monitoring cameras for real-time drunk driving detection, outperforming previous approaches based on driving behavior alone. The system's ability to detect even moderate alcohol levels allows for early warnings, potentially enabling timely interventions. The model's reliance on gaze behavior aligns with known pathophysiological effects of alcohol, adding biological plausibility. The high performance and generalizability across participants and scenarios highlight the practical potential for integrating this system into existing vehicle technologies. The relatively short decision time suggests the feasibility of real-time implementation. This system could be part of a comprehensive driver warning system providing transparent feedback to drivers and supporting effective interventions like warnings, speed limits, and even forced stops.
Conclusion
This study presents the first rigorous evaluation of a machine learning system for drunk driving detection using driver monitoring cameras. The high accuracy, generalizability, and interpretability of the model demonstrate the significant potential of this technology for reducing alcohol-related road accidents. Future work should involve real-world driving studies to validate the system's performance in diverse and complex environments and investigate its effectiveness in different populations. Integrating this system with other driver state monitoring systems (e.g., drowsiness, distraction) offers a promising pathway for improving road safety.
Limitations
The study used a driving simulator, which may not fully capture the complexities of real-world driving, including environmental factors and secondary tasks. The participant sample consisted of healthy individuals with moderate alcohol consumption, so generalizability to other populations (e.g., elderly drivers, individuals with visual impairments, different ethnicities) needs further investigation. Individual differences in alcohol tolerance and visual scanning behavior could affect the system's performance for certain individuals. While the study mitigated learning effects and drowsiness, residual effects may still exist.