Faculty, Staff and Student Publications
Language
English
Publication Date
4-29-2025
Journal
Bioengineering
DOI
10.3390/bioengineering12050472
PMID
40428091
PMCID
PMC12108849
PubMedCentral® Posted Date
4-29-2025
PubMedCentral® Full Text Version
Post-print
Abstract
Background: Cancer is a leading cause of death worldwide, and its early detection is crucial for improving patient outcomes. This study aimed to develop and evaluate ensemble learning models, specifically stacking, for the accurate prediction of lung, breast, and cervical cancers using lifestyle and clinical data.
Methods: Twelve base learners were trained on datasets for lung, breast, and cervical cancer. Stacking ensemble models were then built from these base learners. The models were evaluated on accuracy, precision, recall, F1-score, AUC-ROC, Matthews correlation coefficient (MCC), and kappa. SHAP, an explainable AI technique, was used to interpret model predictions.
Results: The stacking ensemble model outperformed the individual base learners across all three cancer types. Averaged over the three cancer datasets, it achieved 99.28% accuracy, 99.55% precision, 97.56% recall, and 98.49% F1-score. Similarly high performance was observed for AUC, kappa, and MCC. The SHAP analysis revealed the most influential features for each cancer type: fatigue and alcohol consumption for lung cancer; worst concave points, mean concave points, and worst perimeter for breast cancer; and the Schiller test for cervical cancer.
Conclusions: The stacking-based multi-cancer prediction model demonstrated superior accuracy and interpretability compared with traditional models. Combining diverse base learners with explainable AI offers both predictive power and transparency in clinical applications. Key demographic and clinical features driving cancer risk were also identified. Further research should validate the model on more diverse populations and additional cancer types.
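The stacking and evaluation workflow described in the Methods can be illustrated with a minimal sketch. This is not the authors' configuration: it assumes scikit-learn, uses only three base learners rather than twelve, picks logistic regression as the meta-learner, and runs on scikit-learn's bundled breast cancer dataset as a stand-in. The function name `evaluate_stacking` and all hyperparameters are hypothetical, and the SHAP step is omitted.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             matthews_corrcoef, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate_stacking(random_state=0):
    """Fit a small stacking ensemble and return the abstract's seven metrics."""
    # Stand-in data (assumption): scikit-learn's bundled breast cancer set.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state)

    # Three illustrative base learners (the paper reports twelve).
    base_learners = [
        ("lr", make_pipeline(StandardScaler(),
                             LogisticRegression(max_iter=1000))),
        ("rf", RandomForestClassifier(n_estimators=100,
                                      random_state=random_state)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ]

    # The meta-learner is trained on out-of-fold predictions of the base
    # learners (cv=5); logistic regression is an assumed choice here.
    model = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1

    # Precision, recall, and F1 use scikit-learn's default positive label (1).
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc_roc": roc_auc_score(y_test, y_prob),
        "mcc": matthews_corrcoef(y_test, y_pred),
        "kappa": cohen_kappa_score(y_test, y_pred),
    }


if __name__ == "__main__":
    for name, value in evaluate_stacking().items():
        print(f"{name}: {value:.4f}")
```

The scores this sketch prints depend on the assumed learners and split and are not comparable to the figures reported in the Results.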
Keywords
cancer prediction, ensemble learning, stacking, lung cancer, breast cancer, cervical cancer, XAI, SHAP
Published Open-Access
yes
Recommended Citation
Shahid Mohammad Ganie, Pijush Kanti Dutta Pramanik, and Zhongming Zhao, "Enhanced and Interpretable Prediction of Multiple Cancer Types Using a Stacking Ensemble Approach with SHAP Analysis" (2025). Faculty, Staff and Student Publications. 677.
https://digitalcommons.library.tmc.edu/uthshis_docs/677