
Faculty, Staff and Student Publications
Publication Date
10-1-2021
Journal
American Journal of Public Health
Abstract
Objectives. To develop an imputation method to produce estimates for suppressed values within a shared government administrative data set to facilitate accurate data sharing and statistical and spatial analyses.
Methods. We developed an imputation approach that incorporated known features of suppressed Massachusetts surveillance data from 2011 to 2017 to predict missing values more precisely. Our methods for 35 de-identified opioid prescription data sets combined modified previous or next substitution followed by mean imputation and a count adjustment to estimate suppressed values before sharing. We modeled 4 methods and compared the results to baseline mean imputation.
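The abstract names the steps (previous or next substitution, mean imputation, count adjustment) without giving their details, so the following Python sketch is one possible reading and not the authors' implementation. It makes three assumptions that are not in the source: each suppressed value is known to lie in a range `[low, high]`, a substituted neighbor is clipped into that range, and the count adjustment rescales imputed values toward a known total. The function names, the parameters, and the use of the range midpoint as the mean-imputation fallback are hypothetical.

```python
from typing import List, Optional, Sequence


def nearest_observed(counts: Sequence[Optional[int]], i: int) -> Optional[int]:
    """Return the closest earlier observed value, else the closest later one."""
    for j in range(i - 1, -1, -1):
        if counts[j] is not None:
            return counts[j]
    for j in range(i + 1, len(counts)):
        if counts[j] is not None:
            return counts[j]
    return None


def impute_suppressed(
    counts: Sequence[Optional[int]],
    low: int,
    high: int,
    known_total: Optional[int] = None,
) -> List[float]:
    """Fill None entries, which are assumed to lie in [low, high]."""
    midpoint = (low + high) / 2  # assumed fallback when no neighbor exists
    result: List[float] = []
    suppressed: List[int] = []

    for i, value in enumerate(counts):
        if value is not None:
            result.append(float(value))
            continue
        suppressed.append(i)
        neighbor = nearest_observed(counts, i)
        if neighbor is None:
            result.append(midpoint)
        else:
            # Assumed modification: clip the neighbor into the known range.
            result.append(float(min(max(neighbor, low), high)))

    # Assumed count adjustment: rescale imputed values toward a known total,
    # keeping each one inside the suppression range.
    if known_total is not None and suppressed:
        observed_sum = sum(v for v in counts if v is not None)
        imputed_sum = sum(result[i] for i in suppressed)
        target = known_total - observed_sum
        if imputed_sum > 0 and target > 0:
            scale = target / imputed_sum
            for i in suppressed:
                result[i] = min(max(result[i] * scale, low), high)

    return result


# Hypothetical series: None marks a suppressed count assumed to be in 1..10.
example = impute_suppressed([14, None, 12, None, 25], low=1, high=10, known_total=66)
```

Because the rescaled values are clipped back into the range, the adjusted series is not guaranteed to match the known total exactly.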
Results. We assessed performance by comparing root mean squared error (RMSE), mean absolute error (MAE), and proportional variance between imputed and suppressed values. Our method outperformed mean imputation: it retained 46% of the suppressed values' proportional variance and was more precise (22% lower RMSE and 26% lower MAE) than simple mean imputation.
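For reference, the two error metrics can be computed as in the minimal Python sketch below, which uses the standard definitions of RMSE and MAE. The function names are hypothetical, and `relative_reduction` assumes that a figure such as "22% lower" is measured relative to the baseline's error. No study data are reproduced here.

```python
import math
from typing import Sequence


def rmse(true: Sequence[float], imputed: Sequence[float]) -> float:
    """Root mean squared error between true and imputed values."""
    squared = [(t - p) ** 2 for t, p in zip(true, imputed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(true: Sequence[float], imputed: Sequence[float]) -> float:
    """Mean absolute error between true and imputed values."""
    absolute = [abs(t - p) for t, p in zip(true, imputed)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline_error: float, method_error: float) -> float:
    """Fraction by which a method's error falls below the baseline's error."""
    return (baseline_error - method_error) / baseline_error
```

With these definitions, a method whose RMSE is 7.8 against a baseline RMSE of 10 would show a relative reduction of 0.22.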
Conclusions. Our easy-to-implement imputation technique largely overcomes the adverse effects of low-count value suppression and yields better results than simple mean imputation. This novel method is generalizable to researchers sharing protected public health surveillance data. (Am J Public Health. 2021;111(10):1830–1838. https://doi.org/10.2105/AJPH.2021.306432)
Keywords
Algorithms; Analgesics, Opioid; Data Interpretation, Statistical; Drug Prescriptions; Humans; Information Dissemination; Massachusetts; Outcome Assessment, Health Care; Research Design
DOI
10.2105/AJPH.2021.306432
PMID
34529494
PMCID
PMC8561211
PubMedCentral® Posted Date
October 2021
PubMedCentral® Full Text Version
Post-print