Validity of Scottish predictors of child obesity (age 12) for risk screening in mid-childhood: a secondary analysis of prospective cohort study data—with sensitivity analyses for settings without various routinely collected predictor variables

G. Carrillo-Balam, L. Doi, et al.

This study analyses the Growing Up in Scotland cohort to identify predictors, available by school entry, of obesity at age 12. Using multivariable logistic regression, it identifies the factors most predictive of later obesity and considers the practicalities of implementing risk screening within the Scottish healthcare system. The study was led by Gabriela Carrillo-Balam and colleagues.

Introduction
The study addresses whether routinely collected, machine-readable, and linkable data from antenatal life to age 5–6 can predict obesity by age 12. Given the ongoing global childhood obesity pandemic and limited success of interventions, there is interest in early risk prediction to enable timely, targeted treatment before obesity becomes entrenched. Prior work highlights multiple early-life risk factors and the need to identify which routinely collected variables across datasets best predict pre-pubertal obesity. Using the Growing Up in Scotland cohort, the authors aimed to develop and internally validate multivariable prediction models for obesity at age 12 using predictors available by school entry, and to determine which data elements are most critical, particularly for national child obesity surveillance systems.
Literature Review
Multiple systematic reviews have documented early-life risk factors for childhood overweight/obesity in high-income countries, including male sex, lower socioeconomic status (maternal education, income, area deprivation), high maternal pre-pregnancy BMI, excessive gestational weight gain, gestational diabetes, maternal smoking during pregnancy, higher birthweight, rapid infant weight gain, lack of breastfeeding, and early introduction of solid foods. Some factors require primary data collection beyond the perinatal period (e.g., breastfeeding duration, indoor smoking, age of solids introduction), while BMI at age 5–6 is routinely measured in Scotland (Primary 1 exam) but not in all countries. Prior analyses of the Growing Up in Scotland (GUS) cohort linked maternal overweight/obesity and smoking in pregnancy to obesogenic trajectories, and documented widening social inequalities in obesity by age 10. Evidence suggests that adverse childhood experiences (ACEs) are associated with higher BMI; the authors created proxies for ACEs and for positive childhood experiences (PCEs) within GUS, noting critiques that ACE measures omit protective experiences. This study builds on that literature by focusing on mid-childhood predictors of later pre-pubertal obesity and on the feasibility of using routinely collected data across settings.
Methodology
Design and data source: Secondary analysis of the Growing Up in Scotland (GUS) Birth Cohort 1 (n=5217; born 2004/05), with nine sweeps of data to age 12 (2016–17). Ethical approval was obtained via expedited review; data were anonymised.

Participants: 2917 were followed to sweep 9; 2787 had complete outcome data at age 12 (height, weight, age).

Outcome: Obesity at age 12, defined per ISD Scotland using UK reference BMI centiles (Cole’s LMS): BMI ≥95th centile.

Candidate predictors (available by age 5–6) were literature-informed and feasible for routine collection in high-income countries: maternal age at birth; maternal ethnicity; child birth order; maternal smoking in pregnancy; maternal BMI at child age 5–6 (continuous); gestational diabetes/diabetes; maternal education; urban/rural location; equivalized household income quintile; Scottish Index of Multiple Deprivation (SIMD) quintile; indoor household smoking; caesarean delivery; gestational age (<3 weeks early vs ≥3 weeks early); birthweight (<2500 g vs ≥2500 g); breastfeeding (never, <6 months, ≥6 months); age at introduction to solid foods (<4 vs ≥4 months); child’s sex; child’s ethnicity; child’s BMI at age 5–6 (continuous); ACEs count (7 proxies); PCEs count (5 proxies). Their detailed construction is given in the Supplementary Material.

Sample size and missing data: Of the 2787 with outcomes, 26.2% (n=735) had ≥1 missing predictor. Missingness analyses suggested possible bias, so missing predictors were imputed for all 2787 using multiple imputation by chained equations (30 imputed datasets; no auxiliary variables).

Statistical analysis: Analyses were unweighted (survey weights were not used, owing to their limitations and because predictors of attrition were included in the models). Bivariate associations screened predictors (p<0.1) for multivariable modelling; 16 of 21 entered the full model. Polychoric correlations assessed collinearity; no strong correlations were identified. Variable selection used stepwise backward/forward selection (p=0.06), retaining variables causing >10% change in other betas (none added by this criterion). Two final models were produced: an Optimum Data model (allowing non-routine variables) and a Scottish Data model (excluding non-routine ACEs/PCEs and equivalized income; substituting SIMD; adding age at solids). Internal validation via bootstrapping produced shrinkage factors, which were applied to recalibrate the coefficients. Performance was assessed by Nagelkerke R², Harrell’s C-statistic, and AUROC. Optimal cut-offs were chosen using Youden’s Index; Positive/Negative Predictive Values and referral burden were computed. Analyses used R (mice, psfmi, ROCit) and Stata 16.
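The cut-off step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code (they report using ROCit in R): it scans candidate cut-offs, picks the one maximising Youden's J (sensitivity + specificity − 1), and then reports PPV, NPV, and referral burden at that cut-off. The function names and the toy outcomes and risks are hypothetical and are not taken from the study.

```python
def screening_metrics(outcomes, risks, cutoff):
    """Summarise screening performance when risk >= cutoff is flagged.

    Assumes `outcomes` contains at least one 0 and one 1.
    """
    tp = fp = tn = fn = 0
    for obese, risk in zip(outcomes, risks):
        flagged = risk >= cutoff
        if flagged and obese:
            tp += 1
        elif flagged:
            fp += 1
        elif obese:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
        # Share of all screened children who would be referred.
        "referral_burden": (tp + fp) / len(outcomes),
    }


def youden_cutoff(outcomes, risks):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(risks)):
        m = screening_metrics(outcomes, risks, cutoff)
        j = m["sensitivity"] + m["specificity"] - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical toy data: 1 = obese at follow-up, risks = predicted probabilities.
toy_outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
toy_risks = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60, 0.80]

cutoff, j = youden_cutoff(toy_outcomes, toy_risks)
metrics = screening_metrics(toy_outcomes, toy_risks, cutoff)
```

Youden's J weights sensitivity and specificity equally; a screening programme constrained by service capacity might instead pick a higher cut-off to lower the referral burden, at the cost of sensitivity.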
Key Findings
Participants: 2787 children; 18.3% were obese at age 12 (393 observed, per the text; the table shows 18.5% at age 11–12); just over half were male; the majority lived in urban areas.

Final predictors:
- Optimum Data model (6 predictors): maternal BMI; indoor household smoking; equivalized income quintile; child’s sex; child BMI at age 5–6; ACEs count.
- Scottish Data model (6 predictors): maternal BMI; indoor household smoking; SIMD quintile; age at introduction to solid foods (≥4 months protective vs <4 months); child’s sex; child BMI at age 5–6.

Internal validation and performance:
- Optimum Data model: shrinkage 0.974; AUROC 0.855 (95% CI 0.852–0.859); Nagelkerke R² 0.374; C-statistic 0.851 after validation.
- Scottish Data model: shrinkage 0.981; AUROC 0.849 (95% CI 0.846–0.852); Nagelkerke R² 0.364; C-statistic 0.845 after validation.

Youden-optimal cut-offs and screening metrics:
- Optimum Data cut-off (0.217): sensitivity 76.3%; specificity 77.6%; PPV 37.8%; NPV 93.6%; referral burden 37.0% (784/2118). Of 387 false positives, 41.4% (n=160) were overweight at age 12; indisputable false positives ≈10.7% of those screened.
- Scottish Data cut-off (0.226): sensitivity 76.2%; specificity 79.2%; PPV 44.7%; NPV 93.8%; referral burden 30.8% (702/2279). Of 388 false positives, 41.5% (n=161) were overweight; indisputable false positives ≈10.0% of those screened.

Notable effect directions (per Table 2): higher maternal BMI and higher child BMI at 5–6 increased risk; indoor smoking increased risk; greater deprivation (SIMD Q2, Q5) increased risk; male sex increased risk; introduction of solids at ≥4 months was protective; increasing ACEs count increased risk. Equivalized income categories showed limited statistical significance in the Optimum model after validation.
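The referral-burden and "indisputable false positive" percentages can be reproduced from the counts quoted above. The sketch below assumes that an indisputable false positive means a flagged child who was neither obese nor overweight at age 12, and that the denominator is the number screened (2118 and 2279); both are readings of the summary rather than definitions confirmed by the source. The dictionary keys and the `summarise` helper are hypothetical names.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts as quoted in the Key Findings text.
optimum = {"screened": 2118, "flagged": 784,
           "false_pos": 387, "false_pos_overweight": 160}
scottish = {"screened": 2279, "flagged": 702,
            "false_pos": 388, "false_pos_overweight": 161}


def summarise(counts):
    # Assumed definition: false positives who were not even overweight.
    indisputable = counts["false_pos"] - counts["false_pos_overweight"]
    return {
        "referral_burden_pct": pct(counts["flagged"], counts["screened"]),
        "indisputable_fp": indisputable,
        "indisputable_fp_pct": pct(indisputable, counts["screened"]),
    }


optimum_summary = summarise(optimum)
scottish_summary = summarise(scottish)
```

Under these assumptions both models leave 227 flagged children who were neither obese nor overweight at age 12, so the Scottish Data model's lower referral burden comes from flagging fewer children overall rather than from fewer clear-cut false positives.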
Discussion
Using variables routinely available by age 5–6, the models achieved good discrimination for predicting obesity at age 12, addressing the research question about which routinely collected data elements are most informative. The Scottish Data model, which relies solely on routinely collected and feasible variables (SIMD and age of introduction of solid foods instead of income and ACEs), provided slightly higher specificity and a lower referral burden at near-identical sensitivity, suggesting practical advantages for implementation in Scottish settings. Predictive factors point to familial and environmental influences (maternal and child BMI, indoor smoking, socioeconomic deprivation, early feeding practices), underscoring potential intervention targets. Nonetheless, even with improved specificity, the projected referral burden (≈31–37% of children screening positive) may exceed current service capacity, highlighting the need to consider system readiness, potential harms (labelling, false reassurance), and the real-world effectiveness of early interventions before adopting universal screening. These findings can guide national surveillance systems in selecting key data elements for early risk identification while balancing feasibility and ethical considerations.
Conclusion
A small set of routinely collected predictors available by age 5–6 can predict obesity at age 12 with acceptable validity, enabling early identification of higher-risk children. The Scottish Data model demonstrates that feasible, routinely collected data (maternal BMI, indoor smoking, SIMD, age at solids, child sex, child BMI at 5–6) can perform nearly as well as models including less routinely available variables. However, the associated referral burden is substantial, and broader evaluations—ethical, logistical, capacity, economic, and potential unintended effects—are necessary before implementing universal screening. Future research should include external validation in other populations, piloting of screening pathways with robust evaluation of outcomes and harms, and exploration of strategies to reduce referral burden (e.g., alternative thresholds, risk stratification, stepped-care approaches).
Limitations
- Differential attrition typical of cohort studies, with greater loss among more deprived families; while key predictors of attrition (maternal age/education, income, SIMD) were included and multiple imputation used, residual bias may remain.
- Many predictors were self-reported (e.g., smoking, breastfeeding), susceptible to social desirability and recall bias (breastfeeding and birth data collected at 10 months).
- Survey weights were not used; although justified, this may affect generalisability of regression estimates.
- ACEs were represented by proxies and may be infeasible and sensitive for universal routine collection; practical constraints favour the Scottish Data model, which omits ACEs.
- Internal validation only; no external validation was performed, limiting generalisability across settings.
- Some predictors (e.g., equivalized income) showed limited significance in final models; model transportability to contexts without routine BMI at 5–6 may be reduced.