Faculty, Staff and Student Publications
Language
English
Publication Date
10-22-2025
Journal
Scientific Reports
DOI
10.1038/s41598-025-20936-4
PMID
41125742
PMCID
PMC12546719
PubMedCentral® Posted Date
10-22-2025
PubMedCentral® Full Text Version
Post-print
Abstract
Obesity is a major public health concern. Predicting obesity risk from lifestyle data can guide targeted interventions, but current models remain limited. This study first evaluates ensemble learning methods and then combines approaches to improve prediction accuracy and generalizability. Four ensemble techniques (boosting, bagging, stacking, and voting) were tested. Five boosting and five bagging models were constructed alongside voting and stacking models. Hyperparameter tuning optimized performance, and feature importance analysis guided potential feature elimination. In phase two, hybrid stacking and voting models integrated the best-performing boosting and bagging models to enhance predictive capability. Model robustness was ensured through k-fold cross-validation and statistical validation. SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) improved interpretability by analyzing feature contributions. Hybrid stacking and voting models outperformed other ensemble methods, with stacking achieving the best performance (accuracy: 96.88%, precision: 97.01%, and recall: 96.88%). Feature importance analysis identified key predictors, including sex, weight, food habits, and alcohol consumption. The results demonstrated that hybrid ensembles significantly improved obesity risk prediction while preserving interpretability. Integrating multiple ensemble and explainability techniques provides a reliable framework for obesity prediction, supporting clinical decisions and personalized healthcare strategies to mitigate obesity risk.
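The phase-two idea described in the abstract, combining a boosting model and a bagging model under stacking and voting and then checking them with k-fold cross-validation, can be sketched with scikit-learn as follows. This is an illustrative sketch only, not the authors' code: the synthetic dataset, the choice of `GradientBoostingClassifier` and `RandomForestClassifier` as stand-ins for the best boosting and bagging models, the logistic-regression meta-learner, and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Assumption: synthetic 3-class data stands in for the lifestyle dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6,
    n_classes=3, random_state=0,
)

# Assumption: one boosting and one bagging-style model as the "best" of each family.
base_models = [
    ("boost", GradientBoostingClassifier(n_estimators=30, random_state=0)),
    ("bag", RandomForestClassifier(n_estimators=30, random_state=0)),
]

# Hybrid stacking: a meta-learner is trained on the base models' predictions.
hybrid_stacking = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Hybrid voting: the base models' predicted class probabilities are averaged.
hybrid_voting = VotingClassifier(estimators=base_models, voting="soft")

# Stratified k-fold cross-validation (k = 5 here) for both hybrids.
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=folds, scoring="accuracy")
    for name, model in [("stacking", hybrid_stacking), ("voting", hybrid_voting)]
}

for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f}")
```

The accuracies printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract.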
Keywords
Humans; Obesity; Life Style; Male; Female; Machine Learning; Adult; Risk Factors; Obesity prediction; Ensemble learning; Boosting, bagging, stacking, voting; XAI; SHAP; LIME; Friedman’s rank analysis; Post hoc analysis
Published Open-Access
yes
Recommended Citation
Shahid Mohammad Ganie, Pijush Kanti Dutta Pramanik, and Zhongming Zhao, "Lifestyle Data-Based Multiclass Obesity Prediction With Interpretable Ensemble Models Incorporating SHAP and LIME Analysis" (2025). Faculty, Staff and Student Publications. 806.
https://digitalcommons.library.tmc.edu/uthshis_docs/806