Faculty, Staff and Student Publications
Language
English
Publication Date
1-1-2024
Journal
AMIA Annual Symposium Proceedings
PMID
40417515
PMCID
PMC12099382
Abstract
Large electronic health record (EHR) databases have been widely implemented and are available for research activities. The magnitude of such databases often requires storage and computing infrastructure that is distributed across different sites. Restrictions on data sharing due to privacy concerns have been another driving force behind the development of a large class of distributed and/or federated machine learning methods. Although the missing data problem is also present in distributed EHRs, and is potentially more complex there, distributed multiple imputation (MI) methods have not received as much attention. An important advantage of distributed MI, as well as distributed analysis, is that it allows researchers to borrow information across data sites, mitigating potential fairness issues for minority groups that do not have enough volume at certain sites. In this paper, we propose communication-efficient and privacy-preserving distributed MI algorithms for variables that are missing not at random.
Keywords
Electronic Health Records, Algorithms, Humans, Machine Learning, Confidentiality, Privacy
Published Open-Access
yes
Recommended Citation
Yi Lian, Xiaoqian Jiang, and Qi Long, "Federated Multiple Imputation for Variables that Are Missing Not At Random in Distributed Electronic Health Records" (2024). Faculty, Staff and Student Publications. 738.
https://digitalcommons.library.tmc.edu/uthshis_docs/738