Faculty, Staff and Student Publications
Language
English
Publication Date
1-27-2025
Journal
medRxiv
DOI
10.1101/2025.01.24.25321096
PMID
39974073
PMCID
PMC11839006
PubMedCentral® Posted Date
1-27-2025
PubMedCentral® Full Text Version
Author MSS
Abstract
Electronic health record (EHR) data are a rich and invaluable source of real-world clinical information, enabling detailed insights into patient populations, treatment outcomes, and healthcare practices. The availability of large volumes of EHR data is critical for advancing translational research and developing innovative technologies such as artificial intelligence. The Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network, established in 2015 with funding from the National Center for Advancing Translational Sciences (NCATS), aims to accelerate translational research by democratizing access to EHR data for all Clinical and Translational Science Awards (CTSA) hub investigators. The present ENACT network provides access to structured EHR data, enabling cohort discovery and translational research across the network. However, a substantial amount of critical information is contained in clinical narratives, and natural language processing (NLP) is required to extract this information to support research. To address this need, the ENACT NLP Working Group was formed to make NLP-derived clinical information accessible and queryable across the network. This article describes the implementation and deployment of NLP infrastructure across ENACT. First, we describe the formation and goals of the Working Group, the practices and logistics involved in implementation and deployment, and the specific NLP tools and technologies used. Then, we describe how we extended the ENACT ontology to standardize and query NLP-derived data, as well as how we conducted multisite evaluations of the NLP algorithms. Finally, we reflect on the experience and lessons learnt, which may be useful for other national data networks that are deploying NLP to unlock the potential of clinical text for research.
Keywords
translational research, electronic health records, natural language processing, network, ENACT
Published Open-Access
yes
Recommended Citation
Wang, Yanshan; Hilsman, Jordan; Li, Chenyu; et al., "Development and Validation of Natural Language Processing Algorithms in the ENACT National Electronic Health Record Research Network" (2025). Faculty, Staff and Student Publications. 628.
https://digitalcommons.library.tmc.edu/uthshis_docs/628