Faculty, Staff and Student Publications

Language

English

Publication Date

1-1-2024

Journal

AMIA Annual Symposium Proceedings

PMID

40417515

PMCID

PMC12099382

Abstract

Large electronic health records (EHR) have been widely implemented and are available for research activities. The magnitude of such databases often requires storage and computing infrastructure that are distributed at different sites. Restrictions on data-sharing due to privacy concerns have been another driving force behind the development of a large class of distributed and/or federated machine learning methods. While missing data problem is also present in distributed EHRs, albeit potentially more complex, distributed multiple imputation (MI) methods have not received as much attention. An important advantage of distributed MI, as well as distributed analysis, is that it allows researchers to borrow information across data sites, mitigating potential fairness issues for minority groups that do not have enough volume at certain sites. In this paper, we propose a communication-efficient and privacy-preserving distributed MI algorithms for variables that are missing not at random.

Keywords

Electronic Health Records, Algorithms, Humans, Machine Learning, Confidentiality, Privacy

Published Open-Access

yes

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.