Language
English
Publication Date
3-1-2026
Journal
Clinical Therapeutics
DOI
10.1016/j.clinthera.2026.01.007
PMID
41723010
Abstract
Purpose: Ensemble machine learning (ML) has demonstrated potential for improving predictions based on big health care data. We developed and validated interpretable ensemble ML models to evaluate nonadherence to and nonpersistence with biological or targeted synthetic disease-modifying antirheumatic drugs (b/tsDMARDs) in rheumatoid arthritis (RA).
Methods: This retrospective study used 5% Medicare claims data and included older patients (aged ≥65 years) with a diagnosis of RA (identified by International Classification of Diseases, Ninth and Tenth Revision codes) who initiated b/tsDMARDs between 2013 and 2019; the initiation date served as the index date. Nonadherence, defined as a medication possession ratio < 80%, was evaluated during the 12-month follow-up. Nonpersistence was defined as the duration (in days) from the index date until a treatment gap of ≥60 days during the follow-up period, with death and insurance disenrollment as censoring criteria. Data were split into 75% for model training and 25% for evaluation, with patients' demographic and clinical characteristics measured in the baseline period. ML ensemble classification models (random forest, eXtreme Gradient Boosting [XGBoost], and prediction rule ensembles [PRE]) evaluated the binary nonadherence outcome, and ML ensemble survival models (random survival forest, XGBoost survival, and PRE survival) assessed time until nonpersistence. Using the testing cohort, the concordance index (C-index) was reported for the ML survival models and the area under the receiver operating characteristic curve (AUC) was reported for the ML classification models.
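The two outcome definitions in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `(fill_date, days_supply)` claim layout, the capping of the ratio at 1, and the choice to date nonpersistence at the day the supply on hand runs out are all assumptions made for illustration. Only the 80% threshold, the 60-day gap, and the 12-month window come from the abstract.

```python
from datetime import date, timedelta

def medication_possession_ratio(fills, index_date, follow_up_days=365):
    """Days of drug supplied inside the follow-up window divided by window length.

    fills: list of (fill_date, days_supply) tuples (hypothetical claim layout).
    The ratio is capped at 1.0, which is an assumption of this sketch.
    """
    window_end = index_date + timedelta(days=follow_up_days)
    supplied = 0
    for fill_date, days_supply in fills:
        if index_date <= fill_date < window_end:
            # Count only the part of the supply that falls inside the window.
            supplied += min(days_supply, (window_end - fill_date).days)
    return min(supplied / follow_up_days, 1.0)

def is_nonadherent(mpr, threshold=0.80):
    """Nonadherence as stated in the abstract: MPR below 80%."""
    return mpr < threshold

def time_to_nonpersistence(fills, index_date, censor_date, gap_days=60):
    """Return (days_from_index, event_observed).

    event_observed is True when a gap of at least gap_days without supply
    occurs before censor_date (end of follow-up, death, or disenrollment).
    """
    supply_end = index_date  # first day with no drug on hand
    for fill_date, days_supply in sorted(fills):
        if fill_date > censor_date:
            break
        if (fill_date - supply_end).days >= gap_days:
            return (supply_end - index_date).days, True
        # Early refills extend the existing supply rather than overlapping it.
        supply_end = max(supply_end, fill_date) + timedelta(days=days_supply)
    if (censor_date - supply_end).days >= gap_days:
        return (supply_end - index_date).days, True
    return (censor_date - index_date).days, False

# Hypothetical patient: two monthly fills, then a long gap before a third fill.
index = date(2015, 1, 1)
fills = [(date(2015, 1, 1), 30), (date(2015, 2, 1), 30), (date(2015, 6, 1), 30)]
mpr = medication_possession_ratio(fills, index)
outcome = time_to_nonpersistence(fills, index, censor_date=date(2015, 12, 31))
```

For this hypothetical patient, 90 of 365 days are covered, so the patient is flagged as nonadherent, and the gap that opens after the second fill runs out exceeds 60 days, so the nonpersistence event is observed rather than censored.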
Findings: We identified 3927 eligible patients with RA (mean age 73 ± 6 years; 75% female). Of these, 53.65% were adherent (medication possession ratio 0.72 [0.31]; 0.01-1) and 18.5% were at risk of nonpersistence (mean time to nonpersistence 1110 ± 670 days) in the 12-month follow-up. Compared with PRE (AUC, 0.6271; 95% CI, 0.5923-0.6618), random forest (0.6315; 95% CI, 0.5969-0.6661; P = 0.7127) and XGBoost (0.6277; 95% CI, 0.5930-0.6624; P = 0.9643) had comparable performance in predicting nonadherence. Age, claims-based index for RA severity, type of b/tsDMARDs, frailty score, cancer, and Elixhauser comorbidity index were commonly ranked among the top features for nonadherence. With reference to PRE survival (C-index, 0.634; 95% CI, 0.611-0.658), random survival forest (C-index, 0.661; 95% CI, 0.665-0.675; P = 0.0276) and XGBoost survival (C-index, 0.670; 95% CI, 0.666-0.676; P = 0.0033) had improved performance for predicting nonpersistence. Age, Elixhauser score, frailty score, and claims-based index for RA severity were commonly identified as top predictors for nonpersistence.
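As an aid to reading the C-index values above, the following minimal sketch computes a pairwise concordance index for right-censored data. The abstract does not state which C-index estimator was used; this sketch assumes a Harrell-style definition, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index (assumed definition, for illustration).

    A pair (i, j) is comparable when subject i had the event and a strictly
    shorter observed time than subject j. The pair is concordant when the
    model gives i the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: days to nonpersistence, event flags (False = censored), model risk.
times = [5, 10, 15]
events = [True, True, False]
perfect = concordance_index(times, events, [0.9, 0.5, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.5, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to a model whose risk ordering carries no information and 1.0 to a perfect ordering, which is the scale on which the reported C-index values of 0.634 to 0.670 sit.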
Implications: We developed ML models predicting nonadherence to and nonpersistence with b/tsDMARDs for older adults with RA. There is potential to leverage ML to understand how patients use b/tsDMARD treatment, with the goal of personalized medicine in RA.
Keywords
Humans; Arthritis, Rheumatoid; Female; Aged; Retrospective Studies; Machine Learning; Antirheumatic Agents; Male; Predictive Learning Models; Boosting Machine Learning Algorithms; Random Forest; United States; Aged, 80 and over; Prediction Algorithms; Medicare; Classification Algorithms; Adherence; Ensemble learners; Interpretable machine learning; Machine learning; Persistence; Rheumatoid arthritis
Published Open-Access
yes
Recommended Citation
Yinan Huang and Sandeep K Agarwal, "Interpretable Ensemble Machine Learning Prediction of Nonadherence and the Risk of Nonpersistence of Targeted Disease-Modifying Antirheumatic Agents in Older Adults With Rheumatoid Arthritis" (2026). Faculty, Staff and Students Publications. 6813.
https://digitalcommons.library.tmc.edu/baylor_docs/6813