Faculty, Staff and Student Publications
Language
English
Publication Date
1-1-2025
Journal
AMIA Summits on Translational Science Proceedings
PMID
40502221
PMCID
PMC12150708
PubMedCentral® Posted Date
6-10-2025
PubMedCentral® Full Text Version
Post-print
Abstract
SNOMED CT is extensively employed to standardize data across diverse patient datasets and support cohort identification, with studies revealing its benefits and challenges. In this work, we developed a SNOMED CT-driven cohort query system over a heterogeneous Optum® de-identified COVID-19 Electronic Health Record dataset leveraging concept mappings between ICD-9-CM/ICD-10-CM and SNOMED CT. We evaluated the benefits and challenges of using SNOMED CT to perform cohort queries based on both query code sets and actual patients retrieved from the database, leveraging the original ICD-9-CM and ICD-10-CM as baselines. Manual review of 80 random cases revealed 65 cases containing 148 true positive codes and 25 cases containing 63 false positive codes. The manual evaluation also revealed issues in code naming, mappings, and hierarchical relations. Overall, our study indicates that while the SNOMED CT-driven query system holds considerable promise for comprehensive cohort queries, careful attention must be given to the challenges of falsely included codes and patients.
Published Open-Access
yes
Recommended Citation
Hao, Xubing; Huang, Yan; Cui, Licong; et al., "Leveraging SNOMED CT for Patient Cohort Identification Over Heterogeneous EHR Data" (2025). Faculty, Staff and Student Publications. 1393.
https://digitalcommons.library.tmc.edu/uthmed_docs/1393