Publication Date

8-7-2024

Journal

JMIR Diabetes

DOI

10.2196/53338

PMID

39110490

PMCID

PMC11339561

PubMedCentral® Posted Date

8-7-2024

PubMedCentral® Full Text Version

Post-print

Published Open-Access

yes

Keywords

pediatric type 1 diabetes, postdiagnosis diabetic ketoacidosis, risk prediction and stratification, XGBoost, Shapley values, ketoacidosis, risks, predict, prediction, predictive, gradient-boosted ensemble model, diabetes, pediatrics, children, machine learning

Abstract

BACKGROUND: Diabetic ketoacidosis (DKA) is the leading cause of morbidity and mortality in pediatric type 1 diabetes (T1D), occurring in approximately 20% of patients, with an economic cost of $5.1 billion/year in the United States. Although multiple risk factors for postdiagnosis DKA are known, there is still a need for explainable, clinic-ready models that accurately predict DKA hospitalization in established patients with pediatric T1D.

OBJECTIVE: We aimed to develop an interpretable machine learning model to predict the risk of postdiagnosis DKA hospitalization in children with T1D using routinely collected time-series of electronic health record (EHR) data.

METHODS: We conducted a retrospective case-control study using EHR data from 1787 of the 3794 patients with T1D treated at a large tertiary care US pediatric health system from January 2010 to June 2018. We trained a state-of-the-art, explainable, gradient-boosted ensemble (XGBoost) of decision trees with 44 regularly collected EHR features to predict postdiagnosis DKA. We measured the model's predictive performance using the area under the receiver operating characteristic curve-weighted F

RESULTS: Our model distinguished the cohort that develops DKA postdiagnosis from the one that does not (P

CONCLUSIONS: We have built an explainable, predictive, machine learning model with potential for integration into clinical workflow. The model risk-stratifies patients with pediatric T1D and identifies patients with the highest postdiagnosis DKA risk using limited follow-up data starting from the time of diagnosis. The model identifies key time points and risk factors to direct clinical interventions at both the individual and cohort levels. Further research with data from multiple hospital systems can help us assess how well our model generalizes to other populations. The clinical importance of our work is that the model can predict patients most at risk for postdiagnosis DKA and identify preventive interventions based on mitigation of individualized risk factors.
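
The Methods name the area under the receiver operating characteristic curve (AUROC) as one of the performance measures. As a minimal sketch of what that number captures, and not the authors' implementation, the following computes AUROC from scratch using the rank-sum (Mann-Whitney U) identity: the probability that a randomly chosen DKA case receives a higher risk score than a randomly chosen non-case. The function name `auroc` and the toy labels and scores are hypothetical and are not study data.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 outcomes (1 = event, e.g. a DKA hospitalization).
    scores: iterable of predicted risk scores, higher = more at risk.
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the run of identical scores starting at i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy example: four patients, two with the event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

A value of 0.5 corresponds to a model that ranks cases and non-cases no better than chance, and 1.0 to perfect separation. In practice a library routine would normally be used instead of a hand-written function.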
