Predicting loss aversion behavior with machine-learning methods
Ö. Saltık, W. U. Rehman, et al.
The paper situates its research within behavioral economics and neuroeconomics, focusing on Prospect Theory and Cumulative Prospect Theory and the well-established phenomenon of loss aversion, whereby losses loom larger than equivalent gains. The introduction reviews seminal work by Kahneman and Tversky (1979, 1992) and notes that loss aversion influences financial decision-making, risk-taking, and consumer behavior, with reference dependence and neural evidence supporting its robustness. The study motivates practical implications for policymaking, market behavior, and macroeconomic dynamics, arguing that machine learning can complement behavioral theories to handle complex data and transmission mechanisms. It proposes to predict individual loss-averse choices by integrating demographic (age, income, gender), psychological (overconfidence, hopelessness, financial literacy), and behavioral timing (reaction times) features within hybrid machine-learning models, aiming to improve foresight and policy design.
The literature review summarizes evidence on loss aversion across contexts, including financial decisions, consumer behavior, and macroeconomic implications. It discusses relationships between loss aversion and overconfidence (potentially leading to risky choices and feedback effects), hopelessness (mutual reinforcement with risk avoidance), and financial literacy (bidirectional influences on risk management and decision quality). Evidence on gender, age, and income differences in loss aversion is mixed, with context-dependent findings; some studies suggest women and older individuals are more loss averse, and higher income may correlate with lower loss aversion. Additional studies are reviewed: Molins et al. (2022) linked negativity bias to loss aversion and reinforcement learning; Liu and Fan (2022) found stock downturns increased antidepressant use consistent with loss aversion; He (2022) modeled overconfidence and loss aversion in adaptive markets; Yiwen (2021) examined Chinese markets linking investor biases to firm prices; Baek et al. (2017) found heightened risk and loss aversion in depressed suicide attempters with neural correlates; Gächter et al. (2007) documented socioeconomic heterogeneity in loss aversion; and hybrid ML approaches (Plonsky et al., 2017, 2019; Bourgin et al., 2019) showed that integrating behavioral theories with ML improves prediction, with tree-based models often outperforming other methods.
Design and measures: The study examines loss aversion in mixed gambles using a controlled experimental task. Twenty-eight volunteer students participated (sample size determined via G*Power; the study was ethics-approved). Each trial presented a mixed gamble with a 50% chance of a gain and a 50% chance of a loss, with four response options: Reject, Strictly Reject, Accept, Strictly Accept; the alternative to gambling was a certain outcome of 0 TL. Gain amounts ranged from 10 TL (lower bound) to 40 TL (upper bound) in 2 TL increments, and loss amounts from 5 TL (lower bound) to 20 TL (upper bound) in 1 TL increments, generating 256 gambles per participant. Tasks were programmed in Visual Basic within Microsoft Excel 2016, with screen recording used to compute reaction times (in milliseconds) from stimulus onset.
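Under the stated bounds and step sizes, the stimulus grid and the gain/loss ratio feature described later can be reconstructed as follows. This is a minimal sketch for illustration; the variable names are placeholders, not the authors' code (which was written in Visual Basic):

```python
from itertools import product

# Reconstructing the stimulus grid: 16 gain levels (10..40 TL, step 2)
# crossed with 16 loss levels (5..20 TL, step 1) gives 256 mixed gambles.
gains = list(range(10, 41, 2))          # 10, 12, ..., 40 TL
losses = list(range(5, 21, 1))          # 5, 6, ..., 20 TL
gambles = list(product(gains, losses))  # 16 x 16 = 256 (gain, loss) pairs

# 'diff' as used in the feature set: the gain/loss ratio per gamble.
diff = [g / l for g, l in gambles]

print(len(gambles))          # 256
print(min(diff), max(diff))  # 0.5 8.0
```

The 16 × 16 crossing is what yields the 256 trials per participant, and 28 participants × 256 gambles gives the 7168 observations used for modeling.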
Features: Inputs included objective features (Number of Gamble, Participant ID, Upper Bound, Lower Bound), naive features (Milliseconds, Gain/Loss Differences), and sociodemographic/psychological features (Gender, Age, Department, Income, Self-Confidence scale, Beck Hopelessness scale, Financial Literacy scale). Rather than using the loss aversion coefficient directly as a feature, the gain/loss ratio of each gamble was included as 'diff'. Outputs were the individual accept/reject decisions and the associated acceptance probabilities.
Data and pipeline: The dataset comprised 7168 observations (28 participants × 256 gambles). Data were partitioned into training (80%) and testing (20%). The raw data were also segmented into five blocks by trial index (1–50; 51–100; 101–150; 151–200; 201–256) for robustness. The modeling pipeline used a hybrid approach: first, classification algorithms predicted binary accept/reject decisions (0/1). These predicted labels were then included as inputs to regression models that estimated continuous acceptance probabilities (probpred) for each individual and gamble.
Algorithms and implementation: Supervised models included Decision Tree Classifier/Regressor, Random Forest Classifier/Regressor, Kernel SVC/SVR, and k-NN Classifier/Regressor. Representative hyperparameters (per Table 3) included, for Random Forests, n_estimators=100 with gini criterion (classifier) and mse (regressor), bootstrap=True. Computations were carried out in Python 3.6 using Keras, Pandas, Numpy, Matplotlib, Plotly; MATLAB 2016b curve fitting tools supported visualization. Feature importance and decision tree feature selection were examined to assess the contribution of inputs, particularly the ‘diff’ variable. Reaction time metrics and behavioral scales were integrated as predictive covariates.
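The two-stage hybrid pipeline (classification, then regression on features augmented with the predicted labels) can be sketched with scikit-learn. The data below are synthetic stand-ins for the real inputs, which also include demographics and scale scores; the Random Forest settings follow those reported above, except that recent scikit-learn versions name the regression criterion "squared_error" rather than "mse", so the default is used here:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the task data: 'diff' (gain/loss ratio) and a
# reaction-time column, for 7168 observations (28 participants x 256 gambles).
X = np.column_stack([rng.uniform(0.5, 8.0, 7168),    # diff
                     rng.normal(1600, 200, 7168)])   # milliseconds
y_class = (X[:, 0] > 3.1).astype(int)                # toy accept/reject labels
y_prob = 1 / (1 + np.exp(-(X[:, 0] - 3.1)))          # toy acceptance probability

# 80/20 train/test partition, as in the paper.
X_tr, X_te, yc_tr, yc_te, yp_tr, yp_te = train_test_split(
    X, y_class, y_prob, test_size=0.2, random_state=0)

# Stage 1: classify accept/reject (100 trees, gini criterion).
clf = RandomForestClassifier(n_estimators=100, criterion="gini", random_state=0)
clf.fit(X_tr, yc_tr)

# Stage 2: append the predicted labels as an input feature and regress
# the continuous acceptance probability ('probpred').
reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(np.column_stack([X_tr, clf.predict(X_tr)]), yp_tr)
probpred = reg.predict(np.column_stack([X_te, clf.predict(X_te)]))
```

The design choice worth noting is that stage 2 consumes stage 1's *predicted* labels (not the ground truth), so the regressor learns to refine a discrete decision into a graded acceptance probability per individual and gamble.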
- Random Forest superiority: Among hybrid models, Random Forest achieved the best regression performance (lowest MSE) and competitive classification accuracy, aligning with prior literature on predicting human decisions.
- Loss aversion coefficient: The median λ was 3.1 (range 0.5–6), indicating losses were weighted roughly three times as heavily as equivalent gains; this is higher than the 2–2.5 range commonly reported in prior work.
- Classification accuracy (test): Kernel SVC 0.896; Random Forest Classifier 0.874; Decision Tree Classifier 0.845; k-NN Classifier 0.808.
- Regression error (MSE/100, lower is better): Random Forest Regression 0.919; Kernel SVR 1.079; Decision Tree Regression 1.399; k-NN Regression 1.501.
- Decision durations and acceptance patterns: The average decision time was ~1600 ms. Accepted gambles (n=4702) had a mean of 1564 ms, while choosing the certain gain (rejecting the gamble; n=2466) had a mean of 1670 ms, indicating that acceptance decisions were faster. The study terms the minimum in decision times near the typical loss-aversion level the "irresistible impulse of gambling."
- Duration by diff and probability: When diff≈0 and probpred was 10–20%, mean duration was ~1681 ms; at moderate acceptance probabilities (50–60%), ~1618 ms; at high diff (6–8) and probpred 90–100%, ~1536 ms; for diff 3–8 overall, ~1506 ms. Around λ≈3.0 (diff 2.9–3.1), 364 decisions had a mean of ~1560 ms; for diff 1.1–2.1, 2660 decisions averaged ~1639 ms; for diff 0–2, 840 decisions averaged ~1623 ms. Acceptance was low when diff<0 and increased as diff approached and exceeded 1.
- Behavioral consistency: Decision times were longer for options involving certain zero gain (rejection) and potential losses, consistent with “losses loom larger than gains” and with fast/slow thinking distinctions.
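The reported median λ implies a simple acceptance rule for 50/50 mixed gambles: under a piecewise-linear prospect-theory value function, expected value is 0.5·gain − 0.5·λ·loss, so a gamble is accepted exactly when the gain/loss ratio ('diff') exceeds λ. A toy sketch, where the linear value function is a simplifying assumption and not the paper's estimation procedure:

```python
def accept_gamble(gain: float, loss: float, lam: float = 3.1) -> bool:
    """Accept a 50/50 mixed gamble under a linear prospect-theory value
    function: v = 0.5*gain - 0.5*lam*loss. Accept iff gain/loss > lam.
    lam defaults to the study's median loss aversion coefficient (3.1)."""
    return 0.5 * gain - 0.5 * lam * loss > 0

# diff = 2.0 < 3.1, so a 20 TL gain vs 10 TL loss is rejected;
# diff = 4.0 > 3.1, so a 40 TL gain vs 10 TL loss is accepted.
assert not accept_gamble(20, 10)
assert accept_gamble(40, 10)
```

This is consistent with the finding above that acceptance rises as diff approaches and exceeds the individual's λ, and that decisions become faster as diff moves clearly above it.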
The findings demonstrate that hybrid machine learning models, particularly Random Forests, effectively predict individual acceptance of mixed gambles when enriched with behavioral-theoretic features such as the gain/loss ratio (diff), reaction times, and psychological/demographic variables. The central research goal—to forecast loss-averse behavior—was addressed by showing strong predictive performance across low, medium, and high acceptance probabilities and by identifying ‘diff’ as a highly informative feature. The elevated median λ (3.1) and the systematic variation of decision durations with diff and acceptance probabilities corroborate core propositions of Prospect Theory (losses loom larger than gains) and align with Kahneman’s fast/slow thinking framework, where quicker decisions accompany higher acceptance under favorable gain–loss ratios. These results underscore the value of integrating descriptive decision theories with data-driven ML for understanding and forecasting human choices, with implications for policy modeling, market behavior, and the timing of decisions under risk.
The study shows that hybrid machine learning approaches can successfully model and predict loss aversion in individual mixed-gamble choices by integrating objective task parameters, reaction times, and psychological and sociodemographic features. Random Forests delivered the best overall performance (lowest regression MSE), and the gain/loss ratio (diff) emerged as a key predictor. Empirically, the median loss aversion coefficient was λ=3.1, and decision times were shorter for accepting gambles than for choosing certain zero gains, supporting behavioral theory. Policy implications include incorporating fine-tuned behavioral parameters (e.g., loss aversion) into macroeconomic and market models to anticipate behavior under risk, tailoring interventions by demographics and financial literacy, and designing communications that account for loss-averse responses. Future research directions proposed include extending hybrid models to other biases (herding, overconfidence, endowment effect, anchoring), analyzing neural signatures alongside behavioral features, and exploring group decision-making and intertemporal choices within hybrid frameworks.
The study notes several limitations: (1) Ecological validity: laboratory tasks may not fully reflect real-world decisions; (2) Simplicity: tasks (sure gain vs gamble) may not capture real-world complexity; (3) Self-reported measures: psychological scales can be biased; (4) Data issues: potential missing data/outliers may affect results; (5) Sample size: a small sample (n=28) limits power and generalizability; (6) Confounds: uncontrolled variables may confound causal interpretations.