Sensitivity and Positive Predictive Values of ICD-10 Codes Among COVID-19 Patients From Optum Database in the United States: A Retrospective Cohort Study

Muneeza Qureshi, The University of Texas School of Public Health


As of April 20, 2022, coronavirus disease 2019 (COVID-19) had resulted in 503.1 million confirmed cases and over 6.2 million deaths globally. These statistics are important for understanding the impact of COVID-19 on population health, and analyzing them during the pandemic requires data that are rapid to obtain, efficient to collect, high in quality, and drawn from multiple centers. One option for obtaining these data is the International Classification of Diseases (ICD), a classification and coding system for patients’ diagnoses, symptoms, and procedures, whether or not they are associated with hospital care. The system defines each ICD code and its classification criteria for diagnostic results, which healthcare providers rely on to categorize patients with various illnesses. In March 2020, in response to the COVID-19 outbreak, the ICD Tenth Revision (ICD-10) emergency codes U07.1 and U07.2 were created to classify confirmed and suspected cases of COVID-19, respectively. These codes have been used since the previous pandemic year to categorize patients with COVID-19 and to support algorithms for identifying COVID-19 cases. However, some ICD-10 codes may lack validity for the intended condition or may misclassify an illness, especially illnesses that have COVID-19–related symptoms. It is also not known whether the performance of these codes has remained stable or improved since they were updated in January 2021. In particular, there is a gap in knowledge about the accuracy and validity of these ICD-10 codes in a large national database in the United States.
Thus, the objectives of the present study are i) to evaluate the validity of ICD-10 codes for correctly classifying COVID-19-positive patients, and ii) to report the prevalence of positive PCR (polymerase chain reaction) test results for COVID-19, in both cases using a large national (United States) database with PCR test results as the reference standard. The extent to which ICD-10 codes can accurately classify a patient with COVID-19 needs to be examined more closely using a large patient-based database. To fill this gap, the present study used the Optum COVID database. In the short term, this study provides estimates of the sensitivity and positive predictive value (PPV) of COVID-19 ICD-10 codes. In the intermediate term, it may inform healthcare providers about the categorization of patients with suspected or confirmed COVID-19 infection. In the long term, it can help scientists and healthcare researchers revise or add codes in response to specific medical conditions.
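
For readers less familiar with the two measures: with PCR results as the reference standard, sensitivity is the share of PCR-positive patients who carry the ICD-10 code, and PPV is the share of coded patients who are PCR-positive. The following is a minimal sketch in Python, assuming a hypothetical list of (has_code, pcr_positive) pairs per patient. The function name and the sample counts are illustrative only and are not taken from the study or from the Optum database.

```python
from typing import Iterable, Tuple


def sensitivity_and_ppv(records: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (sensitivity, PPV) from (has_code, pcr_positive) pairs.

    has_code:     the patient carries the ICD-10 code under evaluation
    pcr_positive: the patient's PCR test (reference standard) is positive
    """
    tp = fp = fn = 0
    for has_code, pcr_positive in records:
        if has_code and pcr_positive:
            tp += 1  # coded and PCR-positive (true positive)
        elif has_code:
            fp += 1  # coded but PCR-negative (false positive)
        elif pcr_positive:
            fn += 1  # PCR-positive but not coded (false negative)
        # uncoded, PCR-negative patients do not enter either measure

    if tp + fn == 0:
        raise ValueError("no PCR-positive patients: sensitivity is undefined")
    if tp + fp == 0:
        raise ValueError("no coded patients: PPV is undefined")

    sensitivity = tp / (tp + fn)  # TP / (TP + FN)
    ppv = tp / (tp + fp)          # TP / (TP + FP)
    return sensitivity, ppv


# Made-up counts for illustration: 6 TP, 2 FP, 4 FN, 8 TN.
example_records = (
    [(True, True)] * 6
    + [(True, False)] * 2
    + [(False, True)] * 4
    + [(False, False)] * 8
)
sensitivity, ppv = sensitivity_and_ppv(example_records)
print(f"sensitivity = {sensitivity:.2f}, PPV = {ppv:.2f}")
```

Unlike sensitivity, PPV depends on how common PCR-positive results are among the patients examined, so an estimate from one database may not carry over to a population with a different prevalence.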

