Faculty, Staff and Student Publications
Publication Date
1-18-2024
Journal
Journal of the American Medical Informatics Association
Abstract
OBJECTIVE: The early stages of chronic disease typically progress slowly, so symptoms are usually only noticed until the disease is advanced. Slow progression and heterogeneous manifestations make it challenging to model the transition from normal to disease status. As patient conditions are only observed at discrete timestamps with varying intervals, an incomplete understanding of disease progression and heterogeneity affects clinical practice and drug development.
MATERIALS AND METHODS: We developed the Gaussian Process for Stage Inference (GPSI) approach to uncover chronic disease progression patterns and assess the dynamic contribution of clinical features. We tested the ability of the GPSI to reliably stratify synthetic and real-world data for osteoarthritis (OA) in the Osteoarthritis Initiative (OAI), bipolar disorder (BP) in the Adolescent Brain Cognitive Development Study (ABCD), and hepatocellular carcinoma (HCC) in the UTHealth and The Cancer Genome Atlas (TCGA).
RESULTS: First, GPSI identified two subgroups of OA based on image features, where these subgroups corresponded to different genotypes, indicating the bone-remodeling and overweight-related pathways. Second, GPSI differentiated BP into two distinct developmental patterns and defined the contribution of specific brain region atrophy from early to advanced disease stages, demonstrating the ability of the GPSI to identify diagnostic subgroups. Third, HCC progression patterns were well reproduced in the two independent UTHealth and TCGA datasets.
CONCLUSION: Our study demonstrated that an unsupervised approach can disentangle temporal and phenotypic heterogeneity and identify population subgroups with common patterns of disease progression. Based on the differences in these features across stages, physicians can better tailor treatment plans and medications to individual patients.
Keywords
Gaussian process, disease progression, unsupervised learning