Language
English
Publication Date
4-1-2026
Journal
JAMIA Open
DOI
10.1093/jamiaopen/ooaf152
PMID
41873434
PMCID
PMC13006063
PubMedCentral® Posted Date
3-13-2026
PubMedCentral® Full Text Version
Post-print
Abstract
Objectives: We evaluated the data requirement for modern AI tools to outperform simpler models in predicting short-term mortality in over 500 000 patients with hemodialysis-dependent kidney failure.
Materials and methods: We compared logistic regression, boosting, and transformers using increasingly complex feature sets (from last-visit data to full trajectories). Performance was measured using the area under the ROC curve (AUC-ROC) and the Precision-Recall curve (AUC-PR) across training data sizes ranging from 500 to 490 197 samples.
Results: Features with temporal information were beneficial across all models. On the full dataset, transformers (AUC-ROC = 0.8568) and boosting (AUC-ROC = 0.8598) performed similarly.
Discussion: Transformers require large datasets to outperform simpler models such as boosting, which limits their usefulness on smaller datasets; even with about 500 000 samples, they only matched boosting.
Conclusion: Modern AI tools require substantial data to justify their computational cost over simpler approaches. However, a more complex feature set seems to be beneficial across all models.
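Illustrative note: the two metrics named in Materials and methods can be sketched in a few lines of Python. This is a minimal sketch for intuition only, not the authors' evaluation code. The function names and the toy labels and scores are hypothetical. AUC-PR is approximated here by average precision, which is one common estimator and may differ from the one used in the study.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive is
    scored above a randomly chosen negative (ties count as one half).
    Quadratic in sample count: fine for a toy example, too slow for
    hundreds of thousands of samples."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values measured at the
    rank of each positive sample. Assumes scores contain no ties."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 1 = died within the prediction window, 0 = survived.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(toy_labels, toy_scores))            # 0.75
print(average_precision(toy_labels, toy_scores))  # 0.8333...
```

AUC-PR is typically reported alongside AUC-ROC when the outcome is rare, because precision depends on the proportion of positive cases while AUC-ROC does not.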
Keywords
mortality, risk prediction, artificial intelligence, machine learning, hemodialysis
Published Open-Access
yes
Recommended Citation
K, Karthikeyan; Flythe, Jennifer E; Pun, Patrick H; et al., "Advanced Artificial Intelligence vs Simpler Models for 1-Year Death Prediction Among Patients Receiving Hemodialysis" (2026). Faculty, Staff and Students Publications. 6775.
https://digitalcommons.library.tmc.edu/baylor_docs/6775