Introduction
South Korea's increasing inequality since the 1997 Asian financial crisis has spurred numerous studies. Recent cultural content such as *Parasite* and *Squid Game* also reflects growing social awareness of unequal opportunity distribution, especially among young Koreans. This study aims to identify the roots of this inequality by applying algorithmic approaches. It focuses on understanding the importance of various factors rather than simply measuring the extent of inequality. The study uses a decision tree classification algorithm, LightGBM, and SHAP to analyze survey data, focusing on the ex-ante utilitarian perspective of inequality of opportunity. This approach analyzes inequalities between social groups defined by shared circumstances, specifically examining those in the most adverse socio-economic condition (below minimum wage) to identify the influential circumstantial factors.
Literature Review
Existing research on inequality of opportunity often employs parametric approaches, which are susceptible to bias and model selection issues. Nonparametric tests avoid some of these problems but are criticized for segmenting the sample arbitrarily. This study addresses both limitations with a decision tree classification algorithm. Because the method assumes neither a parametric form nor linearity, and because the algorithm itself partitions the sample, it avoids the bias and model selection problems of parametric approaches as well as the arbitrary segmentation of nonparametric tests. Previous studies using decision trees focused on measuring inequality; this study uses them to identify its roots by estimating and interpreting the importance of variables. The study also uses LightGBM to mitigate the decision tree's overfitting and instability, and SHAP to interpret the results and analyze the importance of individual variables. The theoretical framework draws upon Rawls' theory of justice and Roemer's distinction between circumstances and effort, with a focus on the ex-ante utilitarian perspective.
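To illustrate what it means for an algorithm, rather than the researcher, to partition the sample, the following is a minimal sketch (not the study's code) of how a classification tree can choose a single split by minimizing weighted Gini impurity. The function names and the toy records are hypothetical, and Gini impurity is only one common splitting criterion; the summary does not state which criterion the study used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_binary_split(rows, labels, features):
    """Return (feature, value, weighted_impurity) for the best 'feature == value' split."""
    best = None
    n = len(rows)
    for feat in features:
        for value in sorted({row[feat] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feat] == value]
            right = [y for row, y in zip(rows, labels) if row[feat] != value]
            if not left or not right:
                continue  # a split must send records to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feat, value, score)
    return best


# Hypothetical toy records (not survey data); label 1 = wage below the minimum wage.
rows = [
    {"region": "A", "gender": "F"},
    {"region": "A", "gender": "M"},
    {"region": "A", "gender": "F"},
    {"region": "B", "gender": "M"},
    {"region": "B", "gender": "F"},
    {"region": "B", "gender": "M"},
]
labels = [1, 1, 1, 0, 0, 0]

split = best_binary_split(rows, labels, ["region", "gender"])
print(split)
```

In this toy data the split on region separates the two classes perfectly, so it is chosen over gender. A full tree would apply the same search recursively within each resulting group.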
Methodology
The study uses data from the Youth Panel Survey conducted by the Korea Employment Information Service. The survey provides information on the circumstances (region of upbringing, gender, parental background, housing, etc.) and current wages of South Korean millennials. The dependent variable is binary: whether an individual's wage is above or below the 2017 minimum wage in South Korea. The study employs a decision tree classification algorithm, which partitions the sample into mutually exclusive regions according to classification conditions learned from the data. This allows the importance of each variable in the binary classification to be estimated. To address the decision tree's overfitting and instability, the study also employs LightGBM, a gradient boosting decision tree method that sequentially builds and updates multiple weak classifiers to improve accuracy and stability. SHAP (SHapley Additive exPlanations), which builds on Shapley values from cooperative game theory, provides consistent estimates of variable importance, enabling interpretation of the results and analysis of each variable's impact. The decision tree and LightGBM models are compared on an 80/20 training/testing split, using metrics such as accuracy, precision, recall, F1 score, and ROC-AUC.
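The evaluation step described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the study's pipeline: the helper names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and a real analysis would more likely rely on an established library for these metrics.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it, e.g. 80% training / 20% testing."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical split of ten record indices into training and testing sets.
train_ids, test_ids = train_test_split(list(range(10)))

# Hypothetical test labels (1 = below minimum wage) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Computing the same metrics for each model on the same held-out 20% is what makes the decision tree and LightGBM results comparable.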
Key Findings
The SHAP summary plots for the decision tree and LightGBM models show the importance of each variable in predicting whether an individual's wage is above or below the minimum wage. The LightGBM model, being more stable, is deemed more reliable for interpretation. Its results indicate that region of upbringing has the strongest impact, followed by gender and father's occupation. The father's occupation and educational background have a greater influence than the mother's. The analysis reveals a significant regional disparity in South Korea and the persistence of gender inequality. The tenancy status of the home (owned vs. rented) also shows a significant influence, reflecting wealth disparities.
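The ranking in a SHAP summary plot is typically based on the mean absolute Shapley value of each variable across records. The sketch below shows that idea with a brute-force Shapley computation on a toy scoring function. Everything here is an assumption made for illustration: the function `toy_score`, its weights, the feature names, and the records are hypothetical and are not the study's model or data. Tree-based SHAP explainers use faster algorithms than this exhaustive enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for record x; features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [f for f in names if f != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = {f: x[f] if (f in subset or f == i) else baseline[f] for f in names}
                without_i = {f: x[f] if f in subset else baseline[f] for f in names}
                total += weight * (predict(with_i) - predict(without_i))
        phi[i] = total
    return phi


def toy_score(r):
    """Hypothetical risk score with made-up weights and one interaction term."""
    return (0.3 * r["region_rural"] + 0.2 * r["female"] + 0.1 * r["father_manual"]
            + 0.1 * r["region_rural"] * r["female"])


baseline = {"region_rural": 0, "female": 0, "father_manual": 0}
records = [
    {"region_rural": 1, "female": 1, "father_manual": 1},
    {"region_rural": 1, "female": 0, "father_manual": 0},
    {"region_rural": 0, "female": 1, "father_manual": 1},
]

per_record = [shapley_values(toy_score, r, baseline) for r in records]

# Global importance: mean absolute Shapley value per feature, ranked high to low.
mean_abs = {f: sum(abs(p[f]) for p in per_record) / len(per_record) for f in baseline}
ranking = sorted(mean_abs, key=mean_abs.get, reverse=True)
print(per_record[0], ranking)
```

For each record the Shapley values sum to the difference between that record's score and the baseline score, which is the additivity property that lets per-record contributions be aggregated into a global ranking.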
Discussion
The findings address the research question by identifying the key circumstances contributing to inequality of opportunity in South Korea. The significant influence of region highlights the structural inequalities embedded in the spatial distribution of resources and opportunities. The strong impact of father's occupation and gender underlines the role of social and family structures in shaping an individual's socio-economic outcomes. The results align with the spoon class theory prevalent in South Korea, emphasizing the impact of inherited social status. The study’s findings demonstrate that achieving equality of opportunity requires addressing structural inequalities related to region, gender, and family background. This study bridges classical theories of justice with data-driven methods, offering a novel approach to understanding inequality of opportunity.
Conclusion
This study uses algorithmic approaches to identify the roots of inequality of opportunity in South Korea. The combination of tree-based models and SHAP provides consistent estimation of variable importance, showing that region, gender, and father’s occupation are key factors. LightGBM offers more stable and reliable results compared to the decision tree classification. The significant regional disparity and the influence of gender and paternal background highlight the need for policies that address these structural inequalities to promote equality of opportunity. Future research could explore more detailed regional disparities and additional socioeconomic criteria.
Limitations
Despite its robust methods, the study has several limitations. The connection between the theoretical framework, the empirical approach, and the algorithmic methods requires further discussion. The study uses the minimum wage as its criterion for socio-economic disparity, and other criteria could be explored. Although the study identifies region as highly influential, a detailed analysis of regional stratification and disparities is beyond its scope. Finally, the study relies on self-reported data, which may be biased.