Childhood obesity is a significant global health concern, and measures to control it have had limited success. Universal or targeted screening programs aim to predict and address obesity early in life, before it becomes difficult to treat. Such programs require validated prediction algorithms built on universally available, machine-readable, and linkable data. While numerous studies have explored predictors of childhood obesity, few have examined how validly models based on data up to mid-childhood predict obesity later in childhood, particularly before puberty, when adult health outcomes are largely determined. This study uses data from the Growing Up in Scotland (GUS) study, a high-quality population-based cohort, to address the research question: which predictors, collected between antenatal life and age 5-6 and available in routinely collected databases, allow reasonably valid prediction of obesity at age 12? The aim is to identify key data elements for national child obesity surveillance systems, so that high-risk children can be identified early and offered timely intervention before obesity is fully established.
Literature Review
A systematic review of 282 epidemiological studies in high-income countries (HICs) identified several risk factors for childhood obesity, including male sex, parental socioeconomic status (maternal education, family income, deprivation index), maternal pre-pregnancy BMI, maternal smoking during pregnancy, birthweight, and early childhood factors such as high weight gain in the first year of life and lack of breastfeeding. Many of these are routinely collected in HICs. Previous work using the GUS dataset identified maternal overweight/obesity and maternal smoking during pregnancy as independent risk factors for obesogenic growth trajectories. A detailed analysis showed obesity and overweight rates increasing between ages 6 and 10, with widening inequalities by social class. Recent research on the relationship between Adverse Childhood Experiences (ACEs) and obesity found that a higher ACEs count was associated with higher BMI. This study uses measures of both ACEs and Protective Childhood Experiences (PCEs), in response to the criticism that ACEs measures are inherently imbalanced because they omit positive childhood influences.
Methodology
This secondary analysis used data from the Growing Up in Scotland (GUS) birth cohort 1 (n=5217), born in 2004/05. Data from nine sequential interviews and examinations up to age 12 were used. The primary outcome was obesity at age 12 (BMI ≥95th centile). Potential predictors included maternal age, ethnicity, child's birth order, maternal smoking in pregnancy, maternal BMI, GDM/diabetes in pregnancy, maternal education, location, equivalized household income, SIMD quintile, household indoor smoking, mode of delivery (Caesarean), gestational age at birth, birthweight, breastfeeding duration, age at introduction of solid foods, child's sex, child's ethnicity, child's BMI at age 5-6, ACEs count, and PCEs count. Of the initial 5217 children, the 2787 with complete outcome data were included. Missing data (26.2%) were handled with multiple imputation by chained equations (MICE). Multivariable logistic regression with stepwise selection (p=0.06) was used to reduce the set of predictors. Two final models were developed: an "Optimum Data" model using all available predictors, and a "Scottish Data" model using only variables that are routinely collected and machine-readable in Scotland. Internal validation used bootstrapping to obtain shrinkage factors for recalibrating the odds ratios. Model performance was assessed using Nagelkerke's R², Harrell's C-statistic, AUROC, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and referral burden. The optimal cut-off point was determined using Youden's Index.
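The cut-off selection and the screening metrics named above can be illustrated with a minimal Python sketch. It is not the study's code: the function and class names are hypothetical, the risks and outcomes are made-up toy values, and referral burden is assumed here to mean the share of all screened children who are flagged, which may differ from the study's exact definition.

```python
from dataclasses import dataclass


@dataclass
class ScreenStats:
    """Screening metrics at one risk cut-off (hypothetical helper type)."""
    threshold: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    referral_burden: float  # assumption: share of all screened children flagged

    @property
    def youden_j(self) -> float:
        # Youden's Index: J = sensitivity + specificity - 1
        return self.sensitivity + self.specificity - 1.0


def _ratio(numerator: int, denominator: int) -> float:
    # Undefined ratios (empty denominator) are reported as NaN.
    return numerator / denominator if denominator else float("nan")


def screen_stats(risks, outcomes, threshold) -> ScreenStats:
    """Flag children whose predicted risk is >= threshold and score the result."""
    tp = fp = tn = fn = 0
    for risk, obese in zip(risks, outcomes):
        flagged = risk >= threshold
        if flagged and obese:
            tp += 1
        elif flagged:
            fp += 1
        elif obese:
            fn += 1
        else:
            tn += 1
    return ScreenStats(
        threshold=threshold,
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        referral_burden=_ratio(tp + fp, tp + fp + tn + fn),
    )


def best_cutoff(risks, outcomes) -> ScreenStats:
    """Try every observed risk as a cut-off and keep the one that maximizes J."""
    return max(
        (screen_stats(risks, outcomes, t) for t in sorted(set(risks))),
        key=lambda stats: stats.youden_j,
    )


# Made-up predicted risks and age-12 outcomes (1 = obese), for illustration only.
toy_risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]
toy_outcomes = [0, 0, 0, 0, 1, 0, 1, 1]

best = best_cutoff(toy_risks, toy_outcomes)
print(f"cut-off={best.threshold:.2f}  J={best.youden_j:.2f}  "
      f"PPV={best.ppv:.2f}  referral burden={best.referral_burden:.0%}")
```

On this toy data the selected cut-off is 0.40, where sensitivity is 1.0 and specificity is 0.8. Youden's Index weights sensitivity and specificity equally; a screening program with limited referral capacity might reasonably prefer a cut-off that trades some sensitivity for a lower referral burden.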
Key Findings
The final "Optimum Data" model included six predictors: maternal BMI, indoor smoking, equivalized income quintile, child's sex, child's BMI at age 5-6, and ACEs. After internal validation, the AUROC was 0.855 (95% CI 0.852–0.859). The optimal cut-off yielded 76.3% sensitivity and 77.6% specificity, with a 37.0% referral burden. The "Scottish Data" model, which excluded equivalized income and ACEs and instead used SIMD quintile and age at introduction of solid foods, showed slightly lower sensitivity (76.2%) and higher specificity (79.2%), resulting in a smaller referral burden (30.8%). Both models correctly identified over three-quarters of the children who went on to be obese at age 12, although a significant portion of the children flagged were "false positives". Approximately 41% of these false positives were nevertheless overweight at age 12, suggesting they could still benefit from early intervention. The remaining "indisputable" false positives amounted to approximately 10% of screened children, so the models can be regarded as moderately reliable.
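The "approximately 10%" figure for indisputable false positives can be reproduced with some back-of-envelope arithmetic. The Python sketch below is an illustration, not the study's calculation: the function name is hypothetical, and it assumes that the 18.3% obesity prevalence quoted in the Discussion applies to the screened sample and that the 41% overweight share applies equally to both models.

```python
def false_positive_breakdown(specificity, prevalence, overweight_share):
    """Split false positives into overweight and 'indisputable' shares.

    All returned values are fractions of ALL screened children.
    """
    # Children who are not obese at age 12 but are flagged by the model.
    fp_share = (1.0 - prevalence) * (1.0 - specificity)
    # False positives who are nevertheless overweight at age 12.
    overweight_fp = fp_share * overweight_share
    # False positives who are neither obese nor overweight at age 12.
    indisputable_fp = fp_share - overweight_fp
    return fp_share, overweight_fp, indisputable_fp


PREVALENCE = 0.183       # assumption: obesity prevalence quoted in the Discussion
OVERWEIGHT_SHARE = 0.41  # share of false positives who were overweight at age 12

for model, specificity in (("Optimum Data", 0.776), ("Scottish Data", 0.792)):
    fp, overweight, indisputable = false_positive_breakdown(
        specificity, PREVALENCE, OVERWEIGHT_SHARE
    )
    print(f"{model}: false positives {fp:.1%} of screened, "
          f"overweight {overweight:.1%}, indisputable {indisputable:.1%}")
```

Under these assumptions the indisputable false positives come to roughly 10 to 11% of screened children for both models, in line with the figure quoted above.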
Discussion
The findings demonstrate that routinely collected data from before age 6 can predict obesity at age 12 with reasonable accuracy. Both models performed similarly, but the "Scottish Data" model's higher specificity and lower referral burden offer advantages for practical implementation. The high prevalence of obesity (18.3%) and the resulting referral burden pose a significant challenge to existing healthcare systems. While the models provide valuable information for early identification, factors such as referral system capacity, treatment efficacy, and potential side effects of screening (labeling effects and missed cases) need careful consideration before widespread implementation. Further research should pilot actual screening programs, focusing on the effectiveness of early interventions and on the consequences of screening for both true and false positives.
Conclusion
This study demonstrates that a limited set of routinely collected data from early to mid-childhood can effectively predict obesity risk at age 12, offering potential for cost-effective universal screening. The "Scottish Data" model's higher specificity and lower referral burden make it preferable in practice. However, the substantial referral burden calls for further research into the feasibility, effectiveness, and overall consequences of implementing such screening programs within real-world healthcare capacity.
Limitations
The study is subject to differential attrition, a common limitation of cohort studies, with particular loss of families from more deprived backgrounds. Although imputation methods were used, residual bias might affect the results. Self-reported data may be subject to social desirability bias, and recall bias might influence data collected early in the study. The limited number of non-white participants restricts the generalizability of the findings to other ethnic groups.