Data integration for prediction of weight loss in randomized controlled dietary trials

R. L. Nielsen, M. Helenius, et al.

This study examines whether individual features such as the gut microbiome and host genetics can predict weight loss in overweight Danish adults. With models reaching ROC-AUC values of 0.84–0.88, Nielsen et al. point toward personalized weight management strategies.

Introduction

The study addresses the challenge that individuals vary widely in weight loss response to the same dietary intervention. Prior research has explored predictors such as energy balance, macronutrients, anthropometrics, glycemic/insulinemic status, and microbiome features (e.g., Prevotella-to-Bacteroides ratio), but individual-level prediction remains difficult. Multi-omics approaches have linked obesity and metabolic health to gut microbiome, metabolome, and genetics, yet integrating heterogeneous data with small sample sizes and missingness is challenging. Building on two randomized cross-over trials in Danish adults comparing whole grain-rich, low-gluten, and refined grain diets, the authors hypothesize that baseline multi-omics profiles combined with diet assignment can predict who will lose weight over 8 weeks. The purpose is to develop and evaluate machine learning models integrating diet, gut microbiome, urine metabolome, host genetics, and clinical measures to classify weight loss responders versus non-responders.

Literature Review

Previous modeling efforts focused on energy intake/expenditure, macronutrient balance, anthropometrics, glycemic and insulinemic statuses, and gut microbiome composition (including Prevotella-to-Bacteroides ratio). Multi-omics studies have elucidated host-microbe-metabolome relationships in metabolic health and obesity, with associations in gut microbiome diversity and composition, plasma metabolome perturbations in obesity, and genetic contributions. Integrative approaches have been used to study weight change in insulin-sensitive and insulin-resistant individuals. However, individual-level prediction accuracy is hampered by data heterogeneity, high dimensionality, limited sample sizes, and missing data. Ensemble machine learning and robust feature selection have improved stability in related prediction tasks, such as glycemic response prediction.

Methodology

Design and participants: Data derive from two randomized cross-over trials in Denmark: an 8-week whole grain-rich versus refined grain diet study and an 8-week low-gluten versus refined grain diet study, with a 6-week washout between interventions. A total of 102 participants completed the trials (50 in the whole grain study; 52 in the low-gluten study). Participants were overweight with cardio-metabolic risk markers but generally healthy. Baseline (pre-intervention) data at the start of each intervention period were used as features.
Outcome: Weight loss responders were defined as any individual with a decrease in body weight over the 8-week period; non-responders had no change or gained weight. Across 204 potential instances (two baselines per participant), 203 were analyzed (one missing weight), yielding 106 responders and 97 non-responders.
Measurements: Anthropometrics and physiology (28 variables including BMI, CRP, IL-6, HbA1c, HOMA-IR, zonulin, blood pressure, lipid and carbohydrate metabolism markers), gastrointestinal transit time, self-reported VAS, postprandial response to a standardized meal (free fatty acids, GLP-2, glucose, insulin; breath H2) with time series sampling, stool microbiome (16S rRNA amplicon sequencing and shotgun metagenomics), urine metabolomics (GC-MS and LC-MS), and host genome (Illumina CoreExome-24 BeadChip).
Data processing: 16S data were processed with QIIME2 and Deblur; low-abundance OTUs were filtered; taxonomy was assigned using SILVA 128; relative abundances were computed. Shotgun data were mapped via MGmapper against Human Microbiome, MetaHit Assembly (not used for butyrate species), Bacteria, and Bacteria Draft; relative abundances were computed. Metabolites (LC-MS/GC-MS) were processed and putatively annotated (HMDB, METLIN; MSI level 3-4). Genotyping QC was performed in PLINK (call rates ≥98%, ancestry, HWE p≥0.005, MAF ≥1%); 105 samples and 272,588 SNPs remained.
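The responder definition given under Outcome can be sketched in a few lines of Python. This is only an illustration of the stated rule (any decrease in body weight counts as a response; an instance with a missing weight is dropped). The function name `label_responder` and the example weights are hypothetical and are not taken from the study.

```python
from typing import Optional


def label_responder(weight_start_kg: Optional[float],
                    weight_end_kg: Optional[float]) -> Optional[int]:
    """Return 1 for a responder, 0 for a non-responder, None if a weight is missing."""
    if weight_start_kg is None or weight_end_kg is None:
        return None  # instance cannot be labeled and is excluded from analysis
    # Any decrease counts as a response; no change or a gain is a non-response.
    return 1 if weight_end_kg < weight_start_kg else 0


# Hypothetical intervention periods: (weight at baseline, weight after 8 weeks) in kg.
periods = [(92.4, 90.1), (85.0, 85.0), (78.3, 79.0), (101.2, None)]
labels = [label_responder(start, end) for start, end in periods]
analyzed = [lab for lab in labels if lab is not None]

print(labels)         # [1, 0, 0, None]
print(len(analyzed))  # 3
```

Because each participant contributes two intervention periods, the same person can be a responder in one period and a non-responder in the other.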
Feature engineering and selection: Total initial feature space 287,596, spanning Clinical (28), TransitTime (2), Diet (3), Intake (whole grain and gluten continuous), VAS (5), PostPran (16 derived features), 16S OTUs (10,093), metagenomic species (MGmapper species across catalogs and MGS), urine metabolites (GC-MS 85, LC-MS 1285), and genome (272,588 SNPs).
Prior knowledge filtering produced: ClinicalA (8 variables: age, sex, BMI, CRP, IL-6, HbA1c, HOMA-IR, zonulin), literature-based SNP sets (LitPath 703 SNPs; LD-pruned to 56 SNPs plus forward selection), five weighted genetic risk scores (GRS; 32 SNPs total across obesity-related traits, body weight/sagittal diameter change), microbiome species prioritized as butyrate producers (17 literature-curated species across MGmapper catalogs; 30 features across catalogs), top 14 altered MGS from each prior trial (28 total), and top varying 16S-based OTUs (top 10 and top 250 by prevalence/variance).
Postprandial dynamics were represented via volatility features (fluc1: cumulative absolute differences; fluc2/fluc3: image grid-based occupancy with 10x10 and 50x50, fluc3 requiring consecutive occupancy).
Data-driven selection included exhaustive search of metabolite pairs/triplets and forward selection across datasets (adding features iteratively to maximize cross-validated ROC-AUC, with pruning of low-performing candidates; typical max features 8 or ~5–15).
Machine learning: RandomForestClassifier in scikit-learn with n_estimators=50, random_state=42, max_features=None, min_impurity_decrease=0.01. Evaluation via stratified fivefold cross-validation repeated 50 times with shuffle-split (consistent initialization across models). Primary metric: ROC-AUC; also sensitivity, specificity, and Matthews correlation coefficient (MCC).
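The classifier settings and evaluation scheme described above might be set up roughly as follows. This is a sketch under assumptions, not the authors' pipeline: the hyperparameters are the ones listed in the text, but `RepeatedStratifiedKFold` is used here as a stand-in for the repeated stratified fivefold scheme, the helper `evaluate_random_forest` is a hypothetical name, and the data are synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score


def evaluate_random_forest(X, y, n_repeats=50, seed=42):
    """Mean and standard deviation of ROC-AUC over repeated stratified fivefold CV."""
    clf = RandomForestClassifier(
        n_estimators=50,
        random_state=42,
        max_features=None,           # consider every feature at each split
        min_impurity_decrease=0.01,  # skip splits that barely reduce impurity
    )
    # Assumption: a fixed seed reproduces the same splits for every model compared.
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=n_repeats, random_state=seed)
    scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    return scores.mean(), scores.std()


# Synthetic stand-in for a small selected feature set (not study data).
X, y = make_classification(n_samples=130, n_features=8, n_informative=4,
                           random_state=0)

# Fewer repeats than the 50 described in the text, only to keep the demo fast.
mean_auc, sd_auc = evaluate_random_forest(X, y, n_repeats=3)
print(f"ROC-AUC: {mean_auc:.2f} +/- {sd_auc:.2f}")
```

Fixing the split seed across models means that differences in ROC-AUC between feature sets reflect the features rather than the particular folds drawn.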
Robustness: Assessed via permutation tests on class labels under multiple setups (retraining on permuted labels with features from true-label models; full selection on permuted labels; evaluating true-label-trained models on permuted labels).
Modeling cohorts: “Complete data” subset (N=130) with all selected datasets available used for comparable single-dataset and combined-dataset models; “All available data” models included all individuals per combination (N=147–203).
Ensemble modeling: Models with ROC-AUC>0.62 (above diet-only baseline) were included, yielding 334 models across seven data combinations (diet; forward-selected clinical; literature SNPs; postprandial response; transit time; butyrate-producing species; 16S OTUs; LC-MS metabolites). Ensemble scoring schemes included mean of scores, majority voting, mean of confident scores using thresholds (e.g., s≤0.25 or ≥0.75), and majority voting on confident scores. Thresholding of final ensemble score s supported classification at varying operating points, with calculation of sensitivity, specificity, PPV, and NPV.
Feature importance: Gini importance (mean decrease in impurity) summarized across models; features selected in ≥15% of CV runs were highlighted.
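The ensemble scoring schemes and the thresholding step can be illustrated with a minimal sketch. The function names, the per-model scores, and the labels below are hypothetical. The sketch also applies one threshold to all individuals, which is a simplification of the separate operating points for responders and non-responders reported in the findings.

```python
from statistics import mean


def mean_of_scores(scores):
    """Ensemble score as the plain average of per-model scores."""
    return mean(scores)


def majority_vote(scores, cutoff=0.5):
    """Fraction of models voting 'responder' (score at or above the cutoff)."""
    return mean([1 if s >= cutoff else 0 for s in scores])


def confident_mean(scores, low=0.25, high=0.75):
    """Average only the confident scores (s <= low or s >= high); None if there are none."""
    confident = [s for s in scores if s <= low or s >= high]
    return mean(confident) if confident else None


def classification_metrics(ensemble_scores, labels, threshold):
    """Sensitivity, specificity, PPV and NPV when s >= threshold predicts 'responder'."""
    tp = fp = tn = fn = 0
    for s, y in zip(ensemble_scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical scores from three models for four individuals (1 = responder).
per_individual = [[0.9, 0.8, 0.6], [0.2, 0.1, 0.4], [0.6, 0.45, 0.5], [0.3, 0.8, 0.7]]
labels = [1, 0, 0, 1]

ensemble = [mean_of_scores(s) for s in per_individual]
metrics = classification_metrics(ensemble, labels, threshold=0.5)
print(metrics)
```

Returning `None` when no model is confident makes the abstention explicit, so that individuals without a confident prediction can be reported separately instead of being forced into a class.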

Key Findings

Diet-only baseline models had modest predictive ability: ROC-AUC 0.62 (N=203) and 0.61 (N=130 complete data).
Adding habitual whole grain and gluten intake modestly changed performance (ROC-AUC ~0.63), indicating diet alone is insufficient for individual prediction.
Microbiome and metabolome features markedly improved prediction:
• Diet + forward-selected 16S OTUs (from top 250 varying): ROC-AUC 0.82 (N=130); 0.81 (N=179).
• Diet + forward-selected MGmapper species (Bacteria draft): ROC-AUC 0.82 (N=130); 0.80 (N=183).
• Diet + butyrate-producing species (MGmapper Bacteria draft) + LC-MS urine metabolites: ROC-AUC 0.90 (N=130; Sens 0.84, Spec 0.79, MCC 0.64); 0.88 (N=173; Sens 0.83, Spec 0.78, MCC 0.62).
• Diet + forward-selected 16S OTUs + LC-MS urine metabolites: ROC-AUC 0.86 (N=130; Sens 0.80, Spec 0.76, MCC 0.57); 0.84 (N=169; Sens 0.78, Spec 0.74, MCC 0.52).
Genetic and clinical features contributed but were less predictive alone: Diet + LitPathLD SNPs ROC-AUC 0.81 (N=130) and 0.77 (N=185); Diet + ClinicalB ROC-AUC 0.72 (N=130 and N=196).
Important microbial taxa/species associated with classification included family Ruminococcaceae (higher in responders) and genus Streptococcus (higher in non-responders), and butyrate-producing species Faecalibacterium prausnitzii, Eubacterium ramulus, and Roseburia faecis (high Gini importance in best models).
Ensemble of 334 models across seven data combinations achieved ROC-AUC up to 0.86 (mean of scores). A confident-score ensemble (s≤0.25 or s≥0.75) achieved ROC-AUC 0.83.
Operating thresholds enabled targeted identification:
• With classification threshold s=0.30 for non-responders, the ensemble correctly identified 64% of non-responders with 17% false negatives among them (NPV ≈ 0.83).
• For responders at s=0.70, sensitivity 61% with 26% false positives.
Excluding microbiome and urine metabolome in the ensemble still yielded ROC-AUC 0.72 using genotype, clinical markers, transit time, and postprandial response, indicating useful predictive signal in settings where microbiome and metabolome data are unavailable.
Permutation analyses confirmed models trained on true labels significantly outperformed models trained on permuted labels (p << 1e-6), supporting non-random predictive signal.
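The logic of a label-permutation test can be sketched with the standard library alone. The sketch covers only the simplest of the setups listed in the Methodology (scoring fixed predictions against permuted labels) and does not retrain any model. The function names, scores, and labels are hypothetical, and the rank-based ROC-AUC is a plain reimplementation for illustration.

```python
import random


def roc_auc(scores, labels):
    """ROC-AUC as the chance that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_permutations=2000, seed=0):
    """Observed ROC-AUC and the share of label shuffles that match or beat it."""
    rng = random.Random(seed)
    observed = roc_auc(scores, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between scores and labels
        if roc_auc(scores, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical model scores and true labels (1 = responder).
scores = [0.91, 0.85, 0.78, 0.74, 0.66, 0.58, 0.52, 0.41, 0.37, 0.30, 0.22, 0.15]
labels = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0]

observed_auc, p_value = permutation_p_value(scores, labels)
print(f"observed ROC-AUC = {observed_auc:.3f}, permutation p = {p_value:.4f}")
```

With this estimator the smallest reportable p-value is 1/(n_permutations + 1), so a very small p-value cannot be resolved by a few thousand shuffles and needs either far more permutations or a different approach.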

Discussion

The study demonstrates that integrating baseline multi-omics and physiological data with diet assignment substantially improves prediction of individual weight loss response beyond diet alone. The findings address the research question by showing that gut microbiome composition and urine metabolomics, in particular, carry strong predictive signals for short-term weight change under dietary interventions. The identification of Ruminococcaceae and Streptococcus (16S) and specific butyrate-producing species such as F. prausnitzii, E. ramulus, and R. faecis as important features aligns with known roles of SCFA-producing microbes and intestinal health in metabolic regulation. The random forest approach captured non-linear feature interactions, explaining why prevalence differences alone were not necessarily significant while combined patterns were predictive. The ensemble framework provided robustness to missing data and allowed confidence-based predictions, facilitating practical application where not all omics types are available. Clinically, these results support tailoring weight management strategies by pre-screening individuals to identify likely non-responders to certain grain-related dietary interventions and potentially directing them to alternative approaches. Even without microbiome/metabolome data, a moderate predictive performance indicates utility of accessible clinical and genetic markers.

Conclusion

This work shows that individual weight loss response to whole grain-rich, low-gluten, or refined grain diets over 8 weeks can be predicted with good accuracy by integrating baseline features from diet assignment, gut microbiome, urine metabolome, genetics, and clinical measures using random forests and ensemble methods. Multi-omics models (microbiome plus metabolome) improved ROC-AUC from ~0.62 (diet-only) to 0.84–0.90, and an ensemble achieved ROC-AUC up to 0.86 while enabling confidence-based classifications (e.g., identifying 64% of non-responders with high confidence). These findings support the development of AI-based tools for personalized dietary weight management. Future research should validate these models in independent and larger cohorts, incorporate more frequent longitudinal weight measurements to define individualized significant weight change thresholds, include lifestyle factors such as exercise and detailed fiber/polysaccharide composition, improve metabolite annotation, and explore causal links between microbial/metabolic pathways and weight regulation.

Limitations

Short-term weight change thresholds for clinical significance at the individual level are not well established; daily weight fluctuations were not captured by multiple longitudinal measurements during the 8-week periods.
No data were available on exercise habits or detailed polysaccharide/fiber composition, both of which can affect weight loss, though participants were asked not to change their lifestyle.
Binary responder definition based on single pre/post measurements may not capture clinically meaningful changes for all individuals; restricting to extreme responders was not feasible due to limited sample size.
Cohort size was modest (102 participants; 203 intervention instances), posing challenges for high-dimensional data integration despite feature selection and cross-validation.
Many informative urine metabolite features lacked definitive annotations.
Although permutation tests supported robustness, external validation in independent cohorts is needed to confirm generalizability and refine feature importance.
Some author-assigned microbial species relied on prior knowledge (butyrate producers), which could bias selection; nonetheless, these contributed substantially to top-performing models.
