Introduction
Alcohol consumption is a major contributor to the global disease burden and mortality, with drunk driving a leading cause of fatal traffic accidents. Current reliable methods for detecting intoxicated driving, such as ignition interlock devices, are expensive and require regular maintenance, which motivates the development of scalable, low-cost alternatives. This research proposes a machine learning system based on driver monitoring cameras, which are already mandated in many countries and will soon be standard in new vehicles. The system aims to provide real-time information on a driver's blood alcohol concentration (BAC), enabling early warnings and interventions to prevent alcohol-related harm. The study focuses on developing and rigorously evaluating a machine learning model capable of detecting two critical BAC thresholds: (1) any BAC above 0.00 g/dL but below the WHO-recommended limit of 0.05 g/dL (early warning), and (2) BAC levels exceeding the 0.05 g/dL limit.
Literature Review
Existing digital interventions for monitoring alcohol consumption and intervening when necessary have explored various approaches, including gait analysis, smartphone interactions, smart breathalyzers, wrist-worn devices, and social media data. These interventions range from self-management tools to peer and family notification systems and virtual-agent interventions; however, they lack focus on high-risk situations such as driving. In driver state detection, the focus has been on drowsiness, initially using vehicle signals from the CAN bus and later shifting to camera-based systems for better performance. Research on camera-based driver monitoring addresses various use cases, including driver identification, distraction detection, and emotion detection, and most systems issue an audiovisual warning when impairment is detected. While research exists on the effect of alcohol on driving performance, real-time drunk-driving detection from vehicle signals has proven difficult because driving behavior varies widely across individuals and environments. This study addresses that gap by leveraging driver monitoring cameras to focus on driver behavior, specifically eye and head movements, rather than relying solely on driving behavior data.
Methodology
This study employed a non-randomized, single-blinded, interventional, single-center clinical trial (ClinicalTrials.gov NCT04980846) involving 30 participants (15 female, 15 male; age 37.03 ± 9.24 years). Inclusion criteria included a valid driver's license, moderate alcohol consumption (validated using the Alcohol Use Disorders Identification Test (AUDIT) and phosphatidylethanol (PEth) levels), and absence of certain health conditions. The procedure involved a simulator training session followed by driving tasks in a research-grade driving simulator under controlled alcohol administration. Three BAC levels were targeted: (1) no alcohol (0.00 g/dL), (2) moderate (0.00 g/dL < BAC ≤ 0.03 g/dL), and (3) severe (0.05 g/dL < BAC ≤ 0.07 g/dL). Breath alcohol was measured with a calibrated device (Dräger Alcotest 6820). Participants drove for 30 minutes in each condition across three scenarios (highway, rural, urban). Gaze behavior and head movements were recorded with an infrared camera system (Tobii Pro Nano) at 60 Hz. The machine learning system extracted features from eye movements (velocity, acceleration), gaze events (fixations, saccades), and head movements, using a sliding-window approach (60-second window, 1-second shift) for feature engineering; a sketch of this step follows below. Logistic regression with lasso regularization was employed for predictive modeling, with two classification tasks: 'Early Warning' (BAC > 0.00 g/dL and ≤ 0.03 g/dL) and 'Above Limit' (BAC > 0.05 g/dL). Model evaluation used leave-one-subject-out (LOSO) cross-validation with AUROC as the primary performance metric (see the second sketch below), and a CAN-only baseline was included for comparison. A post-hoc interpretability analysis examined model coefficients to understand feature importance and to align findings with known pathophysiological effects of alcohol.
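To make the sliding-window step concrete, here is a minimal sketch assuming gaze samples arrive as an (N, 2) array of gaze coordinates sampled at 60 Hz; the function name and the reduced feature set (velocity and acceleration statistics only) are illustrative assumptions, not the paper's exact pipeline, which also covers gaze-event and head-movement features.

```python
import numpy as np

def sliding_window_features(gaze, fs=60, window_s=60, shift_s=1):
    """Slice a gaze recording into overlapping windows and compute
    simple per-window descriptors (illustrative sketch only)."""
    win, step = window_s * fs, shift_s * fs
    features = []
    for start in range(0, len(gaze) - win + 1, step):
        w = gaze[start:start + win]                   # (win, 2) gaze points
        velocity = np.linalg.norm(np.diff(w, axis=0), axis=1) * fs
        acceleration = np.diff(velocity) * fs
        features.append([
            velocity.mean(), velocity.std(),          # eye-movement velocity stats
            acceleration.mean(), acceleration.std(),  # eye-movement acceleration stats
        ])
    return np.asarray(features)
```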
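The modeling and evaluation step can likewise be sketched with scikit-learn, assuming a feature matrix X, binary labels y for one of the two tasks, and a per-window subject-ID array; the helper name and the regularization strength C are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def loso_auroc(X, y, subjects):
    """Leave-one-subject-out CV for an L1-regularized (lasso) logistic
    regression; returns mean and std of per-subject AUROC. Assumes each
    held-out subject contributes windows of both classes."""
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    )
    scores = []
    for train, test in LeaveOneGroupOut().split(X, y, groups=subjects):
        model.fit(X[train], y[train])
        prob = model.predict_proba(X[test])[:, 1]     # P(intoxicated)
        scores.append(roc_auc_score(y[test], prob))
    return np.mean(scores), np.std(scores)
```

Because lasso drives uninformative coefficients to zero, inspecting the fitted model's coefficients (here, `model[-1].coef_`) gives the kind of feature-importance reading used in the post-hoc interpretability analysis.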
Key Findings
The machine learning system demonstrated high performance in detecting drunk driving. AUROC was 0.88 ± 0.09 for the 'Early Warning' task and 0.79 ± 0.10 for the 'Above Limit' task, and performance was consistent across driving scenarios. The camera-only approach significantly outperformed a CAN-only baseline (AUROC 0.74 for 'Early Warning' and 0.66 for 'Above Limit'), while combining camera and CAN data yielded no significant improvement over the camera-only approach. Sensitivity analyses showed robustness to unseen driving scenarios, varying window lengths, and alternative machine learning models. Interpretability analysis revealed that gaze events (fixations and saccades) were the most important features, with longer fixation durations and shorter saccade amplitudes associated with higher BAC levels, consistent with known pathophysiological effects of alcohol on visual processing and attention. The system reached a reliable decision after approximately 90 seconds, further improved by majority-vote aggregation over multiple windows (sketched below). In a three-class view, true positive rates were 70% for the 'no alcohol' state, 45% for 'moderate', and 55% for 'severe' intoxication, with some confusion between the moderate and severe levels (30%).
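A minimal sketch of the majority-vote aggregation follows. The figure of 31 windows is an assumption derived from the 60-second window and 1-second shift (31 overlapping windows span roughly 90 seconds of driving); it is not stated as the paper's exact configuration.

```python
import numpy as np

def majority_vote(window_preds, n_windows=31):
    """Aggregate the most recent per-window predictions (0/1) into a
    single decision by strict majority vote (illustrative sketch)."""
    votes = np.asarray(window_preds[-n_windows:])
    return int(votes.sum() * 2 > len(votes))  # 1 if most windows say "intoxicated"
```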
Discussion
The findings demonstrate the effectiveness of using driver monitoring cameras for real-time drunk driving detection, outperforming previous approaches based on driving behavior alone. The system's ability to detect even moderate alcohol levels allows for early warnings, potentially enabling timely interventions. The model's reliance on gaze behavior aligns with known pathophysiological effects of alcohol, adding biological plausibility. The high performance and generalizability across participants and scenarios highlight the practical potential for integrating this system into existing vehicle technologies. The relatively short decision time suggests the feasibility of real-time implementation. This system could be part of a comprehensive driver warning system providing transparent feedback to drivers and supporting effective interventions like warnings, speed limits, and even forced stops.
Conclusion
This study presents the first rigorous evaluation of a machine learning system for drunk driving detection using driver monitoring cameras. The high accuracy, generalizability, and interpretability of the model demonstrate the significant potential of this technology for reducing alcohol-related road accidents. Future work should involve real-world driving studies to validate the system's performance in diverse and complex environments and investigate its effectiveness in different populations. Integrating this system with other driver state monitoring systems (e.g., drowsiness, distraction) offers a promising pathway for improving road safety.
Limitations
The study used a driving simulator, which may not fully capture the complexities of real-world driving, including environmental factors and secondary tasks. The participant sample consisted of healthy individuals with moderate alcohol consumption, so generalizability to other populations (e.g., elderly drivers, individuals with visual impairments, different ethnicities) needs further investigation. Individual differences in alcohol tolerance and visual scanning behavior could affect the system's performance for certain individuals. While the study mitigated learning effects and drowsiness, residual effects may still exist.