
Faculty, Staff and Student Publications
Publication Date
4-2-2022
Journal
BMC Medical Informatics and Decision Making
Abstract
BACKGROUND: Logistic regression (LR) is a widely used classification method for modeling binary outcomes in many medical data classification tasks. Researchers who collect and combine datasets from multiple data custodians and jurisdictions gain statistical power for their analyses. However, combining data from different sources raises serious privacy concerns that must be addressed.
METHODS: In this paper, we propose two privacy-preserving protocols for logistic regression that use the Newton-Raphson method to estimate the model parameters. Our proposals are based on secure Multi-Party Computation (MPC) and are tailored to the honest-majority and dishonest-majority security settings, respectively.
RESULTS: The proposed protocols are evaluated on both synthetic and real-world datasets in terms of efficiency and accuracy, and are compared with ordinary (non-private) logistic regression. The experimental results demonstrate that the proposed protocols are highly efficient and accurate.
CONCLUSIONS: Our work introduces two iterative algorithms to enable the distributed training of a logistic regression model in a privacy-preserving manner. The implementation results show that our algorithms can handle large datasets from multiple sources.
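For readers unfamiliar with the estimation method named in the abstract, the following is a minimal sketch of ordinary, non-private logistic regression fitted with Newton-Raphson updates, that is, the kind of plaintext baseline the protocols are compared against. It is not the paper's MPC protocol: no secret sharing or multi-party communication is shown, and the function names (sigmoid, solve, fit_logistic_newton), the iteration limit, and the tolerance are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(A, b):
    # Solve the linear system A x = b by Gaussian elimination
    # with partial pivoting (A is a small dense square matrix).
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logistic_newton(X, y, iterations=25, tol=1e-10):
    # Hypothetical helper: plaintext Newton-Raphson for logistic regression.
    # X is a list of feature rows (include a leading 1.0 for the intercept),
    # y is a list of 0/1 labels. Returns the coefficient vector beta.
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(iterations):
        grad = [0.0] * d                      # X^T (y - p)
        hess = [[0.0] * d for _ in range(d)]  # X^T W X, with W = diag(p (1 - p))
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for j in range(d):
                grad[j] += (yi - p) * xi[j]
                for k in range(d):
                    hess[j][k] += w * xi[j] * xi[k]
        # Newton-Raphson update: beta <- beta + (X^T W X)^{-1} X^T (y - p)
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


if __name__ == "__main__":
    # Toy, non-separable data: an intercept column plus one feature.
    X = [[1.0, float(v)] for v in range(6)]
    y = [0, 0, 1, 0, 1, 1]
    print(fit_logistic_newton(X, y))
```

Each iteration needs only the gradient and the Hessian, both of which are sums over records. That additive structure is one reason Newton-Raphson is a natural candidate when records are held by several custodians, although how the paper's protocols actually compute these quantities under MPC is not described in this record.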
Keywords
Algorithms, Humans, Logistic Models, Privacy, Logistic regression, Secret sharing, Multi-party computation, Privacy-preserving, Newton–Raphson
DOI
10.1186/s12911-022-01811-y
PMID
35366870
PMCID
PMC8977014
PubMedCentral® Posted Date
4-2-2022
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes