Education
Evaluation of college admissions: a decision tree guide to provide information for improvement
Y. Liu and L. Lee
Taiwan implemented a 12-year basic national education program grounded in multiple intelligences to reduce exam pressure, support adaptive development, and improve educational equity. In the 5-year junior college track, an exam-free admissions system considers multiple criteria (e.g., multiple learning performance, giftedness, disadvantaged status, balanced performance, counseling competency) and allows applicants to choose schools aligned with their interests while schools select according to set standards. However, oversubscription at popular colleges, particularly in metropolitan areas, leads to high failure rates for applicants and underenrollment at rural/agricultural colleges, raising concerns about fairness and potential marginalization. This study asks whether the exam-free system achieves its educational objectives and uses decision tree analysis to identify the factors associated with enrollment success or failure, providing actionable guidance for applicants and institutions.
The study situates itself within data mining and knowledge discovery in databases (KDD), noting decision trees as intuitive supervised learning classifiers widely applied in educational contexts for prediction and explanation. Prior work has used decision trees and related models to predict student achievement, evaluations of teaching, learning behaviors, early warning systems, and enrollment decision support. Applications to admissions include decision support for university choice and predicting institutional popularity. The paper also references debates on multiple intelligences as an educational foundation, noting critiques about limited empirical support and concerns about fairness and resource allocation, as well as ethical considerations surrounding educational big data.
Data were drawn from Taiwan’s joint admissions committee for the 5-year junior college exam-free pathway in 2016, five years after the reform. The dataset covers 6013 applicants registering through the exam-free system across four colleges (A–D) that differ by location (metropolitan vs agricultural counties), size, and departments. At registration, applicants submitted verified personal and performance data. Outcome variable: Y1 (enrollment result: success/fail). Independent variables: 21 candidate variables, including college and location (X1–X3), distance from applicant to college (X4: ordinal 0–5), registration threshold met (X5), school type and origin variables (U1–U3), size of JHS graduating class (U4), competition (U5), service-learning (U6), daily life performance (U7), physical fitness (U8), multiple learning performance (U9), technically/artistically gifted (U10), disadvantaged status (U11), balanced learning performance (U12), counseling competency (U13), comprehensive assessment (U14), writing test (U15), and other factors (U16; e.g., GEPT/TOEIC plus points for Schools A and B). Analyses included descriptive statistics; chi-square tests of independence for categorical variables; and two-sample t-tests, Mann–Whitney U-tests, and Kolmogorov–Smirnov tests for continuous variables. For predictive/explanatory modeling, a CHAID decision tree classifier was employed: variables were treated categorically (continuous variables discretized as needed), with Bonferroni-adjusted splits at alpha 0.05, maximum depth 6, minimum parent node size 100, and minimum child node size 50. Data were split 80/20 into training (n=4809) and test (n=1204) sets. Model performance metrics (risk, sensitivity, overall accuracy) and variable importance ordering were reported, and decision rules with ≥75% accuracy were extracted for interpretation.
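CHAID's splitting criterion, as configured above, selects at each node the predictor most significantly associated with the outcome by a chi-square test, with a Bonferroni correction over the candidate predictors. A minimal sketch of that step follows; the function names, the toy data, and the restriction to binary predictors (2×2 tables, so df = 1) are my own simplifications, not the study's implementation:

```python
import math
from collections import Counter

def chi2_2x2(table):
    """Pearson chi-square statistic and df=1 p-value for a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = [a + b, c + d], [a + c, b + d]
    stat = sum((table[i][j] - rows[i] * cols[j] / n) ** 2
               / (rows[i] * cols[j] / n)
               for i in range(2) for j in range(2))
    # chi-square survival function with 1 degree of freedom
    return stat, math.erfc(math.sqrt(stat / 2.0))

def best_chaid_split(records, predictors, outcome, alpha=0.05):
    """CHAID's core step, restricted to binary variables for brevity:
    pick the predictor most significantly associated with the outcome
    after a Bonferroni correction over the candidate predictors."""
    best = None
    for var in predictors:
        vals = sorted({r[var] for r in records})
        outs = sorted({r[outcome] for r in records})
        if len(vals) != 2 or len(outs) != 2:
            continue  # sketch handles 2x2 (df = 1) splits only
        counts = Counter((r[var], r[outcome]) for r in records)
        table = [[counts[(v, o)] for o in outs] for v in vals]
        stat, p = chi2_2x2(table)
        p_adj = min(1.0, p * len(predictors))  # Bonferroni adjustment
        if p_adj < alpha and (best is None or p_adj < best[2]):
            best = (var, stat, p_adj)
    return best

# Hypothetical toy data: a metro indicator strongly predicts failure,
# a noise variable does not.
records = ([{"metro": 1, "noise": i % 2, "fail": 1} for i in range(80)]
           + [{"metro": 1, "noise": i % 2, "fail": 0} for i in range(20)]
           + [{"metro": 0, "noise": i % 2, "fail": 1} for i in range(50)]
           + [{"metro": 0, "noise": i % 2, "fail": 0} for i in range(50)])
print(best_chaid_split(records, ["metro", "noise"], "fail"))
```

In the full algorithm this selection is applied recursively, subject to the stopping rules reported above (maximum depth 6, minimum parent node 100, minimum child node 50).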
- Cohort and distribution: 6013 applicants; the paper states 2294 enrolled and 3719 failed (61.85% failure). Elsewhere (Table 4) counts used in analyses are failures N=3965 and successes N=2048 (~66% failure), consistent with node base rates in the trees. Schools in metropolitan areas filled to capacity, while agricultural-county schools were underenrolled (possible 1544 places vs actual 1286).
- Chi-square associations (p<0.01): enrolling college (X1), county/city (X2), metropolitan vs agricultural (X3), distance (X4), registration threshold (X5), junior high school type (U1), and remote/outlying origin (U3) were all significantly associated with enrollment outcomes.
- Distance patterns: The largest group of applicants was local (X4=0; 35.6%), and failure rates increased markedly with distance (e.g., ≈80% failure for applicants ≥3 counties/cities away).
- Performance comparisons (mean differences, failures minus successes; positive values mean failures scored higher on average): failures had higher averages in U4 graduating-class size (16.643; p=0.009), U5 competition (0.077; p=0.028), U6 service-learning (0.137; p<0.001), U7 daily life (0.098; p<0.001), U9 multiple learning (0.254; p=0.001), U14 comprehensive assessment (0.746; p<0.001), and U15 writing (0.126; p<0.001). Successes scored higher on U10 technically/artistically gifted (-0.267; p<0.001), U12 balanced learning (-0.844; p<0.001), and U16 other (e.g., English test plus points; -0.490; p<0.001). These patterns suggest a reverse selection phenomenon in some contexts.
- Decision tree (CHAID) performance: training/test risk 0.231/0.252; SE 0.006/0.013; sensitivity to failures 81.9%/80.6% (precision 67.4%/63.2%); overall accuracy 76.9%/74.8%; 39 nodes. Variable importance order: X3 (metropolitan vs agricultural), U16 (other), X5 (threshold), U10 (technically/artistically), U14 (comprehensive assessment), U11 (disadvantaged), X4 (distance), U1 (school type), U15 (writing).
- First-level split: college location. Metropolitan colleges had much higher failure rates (≈77–78%) than agricultural county colleges (≈51–53%), indicating stronger competition/selection in metro areas.
- Key metro-area rules (selected, with training/test probabilities):
  - If other ≤1.2 and threshold=No → Fail 100%/100%.
  - If other ≤1.2, threshold=Yes, U10=0, U14≤14, and U11≤0 → Fail 98.3%/99.1% (disadvantaged status >0 offered some safeguard, lowering the failure rate).
  - If other ≤1.2, threshold=Yes, U10>0, and distance>0 (≤2) → Fail ≈78.1%/80.0% (greater distance increases failure risk).
  - If other >1.2, distance ≤0, and U14 ≤15.4 → Success 76.1%/71.4%.
- Key agricultural-county rules:
  - If distance >2 and writing >3 → Fail 83.5%/78.2%.
  - If distance ≤1, U14 ≤11.2, writing ≤3, and distance=0 → Success 86.2%/73.7%.
- Reverse selection indicators in agricultural counties: lower comprehensive assessment or writing scores sometimes associated with higher enrollment success (e.g., Nodes 12, 23), suggesting better-qualified applicants may decline offers, and colleges may relax conditions to fill seats.
- Additional observations: Applicants from remote/outlying areas had a high failure rate (90.2% of such applicants failed), and metro-area schools’ English plus points (other) and meeting thresholds were critical determinants.
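The group comparisons of measured variables reported above used two-sample t-tests and Mann–Whitney U-tests. As a self-contained illustration of the latter, here is a pure-Python sketch of the two-sided Mann–Whitney U-test via its large-sample normal approximation (average ranks for ties, no tie correction); the score lists are hypothetical:

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the large-sample normal
    approximation (average ranks for ties; no tie correction)."""
    combined = sorted((v, g) for g, vals in ((0, x), (1, y)) for v in vals)
    n = len(combined)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1  # j now marks the end of a run of tied values
        avg_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j)
                                     if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value

# Hypothetical comprehensive-assessment scores, failures vs successes
u, p = mann_whitney_u([15, 17, 18, 20, 21, 22], [11, 12, 13, 14, 16, 19])
print(f"U = {u}, p = {p:.3f}")
```

Production analyses would typically use a library routine (e.g., `scipy.stats.mannwhitneyu`), which also applies tie corrections and exact small-sample p-values.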
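The reported tree metrics are internally consistent: risk is the misclassification rate, i.e., 1 − overall accuracy (0.231 = 1 − 0.769 on training data). A minimal sketch of how such metrics derive from a confusion matrix, treating "failure" as the positive class as the paper does; the counts below are hypothetical, not the study's:

```python
def tree_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics with 'failure' as the
    positive class (matching the reported tree results)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    return {
        "risk": 1.0 - accuracy,          # misclassification rate
        "sensitivity": tp / (tp + fn),   # share of true failures flagged
        "precision": tp / (tp + fp),     # share of flagged that truly failed
        "accuracy": accuracy,
    }

# Hypothetical confusion-matrix counts (tp = failures predicted as failures)
m = tree_metrics(tp=80, fp=20, fn=10, tn=90)
print(m)
```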
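The extracted metro-area rules lend themselves to a simple lookup that applicants or advisors could apply directly. The function below is an illustrative encoding, not the study's software: the function and argument names and the return format are my own, while the thresholds and training-set probabilities come from the metro-area rules listed above.

```python
def metro_rule_prediction(other, threshold_met, gifted, comp, disadv, distance):
    """Illustrative encoding of selected metro-area CHAID rules.
    Returns (predicted_label, training_probability), or None when no
    listed rule applies to the given applicant profile."""
    if other <= 1.2:                          # few/no English plus points
        if not threshold_met:
            return ("fail", 1.00)             # Fail 100%
        if gifted == 0 and comp <= 14 and disadv <= 0:
            return ("fail", 0.983)            # Fail 98.3%
        if gifted > 0 and 0 < distance <= 2:
            return ("fail", 0.781)            # Fail ~78.1%
    elif distance <= 0 and comp <= 15.4:      # plus points, local applicant
        return ("success", 0.761)             # Success 76.1%
    return None

# Hypothetical applicant: no plus points, threshold not met
print(metro_rule_prediction(other=1.0, threshold_met=False,
                            gifted=0, comp=10, disadv=0, distance=0))
```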
The analysis addresses whether Taiwan’s exam-free junior college admissions aligns with goals of multiple intelligences and equitable access. The decision tree shows location as the primary determinant of outcomes, confirming that metropolitan colleges possess a strong selection advantage with substantially higher failure rates, while agricultural-county colleges face weaker competition and underenrollment. Factors such as English proficiency plus points (other), meeting the registration threshold, technical/artistic giftedness, and distance play decisive roles. Notably, some academic performance metrics (comprehensive assessment, writing) being higher among failures, especially in agricultural counties, point to reverse selection: stronger candidates often opt out for preferred institutions, leaving rural colleges to backfill with lower-scoring applicants. This dynamic undermines the intended equity of the system and may exacerbate urban–rural disparities in talent cultivation. The model’s extracted rules provide actionable guidance for applicants (e.g., proximity and English plus points matter greatly in metro areas; distance is critical in agricultural counties) and for institutions to adjust recruitment strategies and thresholds. Overall, the findings suggest that applicant preferences and institutional advantages interact to produce inequitable outcomes, challenging the realization of multiple-intelligence-based admissions fairness.
Using CHAID decision trees and complementary statistical tests on 2016 exam-free admissions data for Taiwan’s 5-year junior colleges, the study shows: (1) college location is the dominant factor in enrollment outcomes; (2) metropolitan colleges exert stronger selection pressure, with key determinants including English plus points and meeting thresholds; (3) distance strongly affects outcomes, especially in agricultural counties; and (4) evidence of reverse selection arises, with better-qualified applicants often declining offers at rural colleges. The work contributes interpretable rules for stakeholders to anticipate risks of failure/success and highlights structural imbalances that impede the equity aims of multiple-intelligences-based admissions. Policy suggestions include measures to mitigate underenrollment and reverse selection in agricultural counties (e.g., guaranteed or priority admission schemes in rural districts) and reconsideration of incentive structures (such as English plus points) that may amplify metropolitan advantages. Future research could extend to multi-year datasets, additional institutions, and alternative modeling approaches to validate and refine decision rules and assess the impact of policy changes.