Faculty, Staff and Student Publications
Language
English
Publication Date
10-23-2024
Journal
Database
DOI
10.1093/database/baae103
PMID
39444329
PMCID
PMC11498178
PubMedCentral® Posted Date
10-23-2024
PubMedCentral® Full Text Version
Post-print
Abstract
This manuscript presents PheNormGPT, a framework for extracting and normalizing key findings in clinical text. PheNormGPT uses large language models to extract key findings and phenotypic data from unstructured clinical text and map them to Human Phenotype Ontology concepts. It uses OpenAI's GPT-3.5 Turbo and GPT-4 models with fine-tuning and few-shot learning strategies, including a novel strategy that selects custom-tailored few-shot examples for each request. PheNormGPT was evaluated in the BioCreative VIII Track 3 shared task, Genetic Phenotype Extraction from Dysmorphology Physical Examination Entries. It achieved an F1 score of 0.82 for standard matching and 0.72 for exact matching, securing first place in the shared task.
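The abstract does not say how few-shot examples are chosen for each request. As a rough illustration only, the sketch below assumes a simple token-overlap (Jaccard) similarity between the incoming text and a small pool of annotated examples; this is a stand-in technique, not the method used in PheNormGPT. The names `select_examples`, `build_prompt`, and `EXAMPLE_POOL`, as well as the example texts and prompt layout, are hypothetical.

```python
import re

# Hypothetical annotated pool; texts and layout are illustrative only.
EXAMPLE_POOL = [
    {"text": "EYES: widely spaced eyes, hypertelorism noted", "hpo_id": "HP:0000316"},
    {"text": "HEAD: microcephaly, head circumference below 3rd percentile", "hpo_id": "HP:0000252"},
    {"text": "HANDS: long slender fingers, arachnodactyly", "hpo_id": "HP:0001166"},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_examples(query, pool, k=2):
    """Return the k pool entries most similar to the query (assumed metric)."""
    q = tokenize(query)
    ranked = sorted(pool, key=lambda ex: jaccard(q, tokenize(ex["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, examples):
    """Assemble a few-shot prompt: selected examples first, then the query."""
    lines = []
    for ex in examples:
        lines.append(f"Text: {ex['text']}")
        lines.append(f"HPO: {ex['hpo_id']}")
    lines.append(f"Text: {query}")
    lines.append("HPO:")
    return "\n".join(lines)


if __name__ == "__main__":
    query = "EYES: mild hypertelorism"
    print(build_prompt(query, select_examples(query, EXAMPLE_POOL, k=1)))
```

In a real system the similarity function would more plausibly be an embedding-based measure, and the resulting prompt would be sent to the language model; both steps are left out here because the record gives no details about them.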
Keywords
Humans, Phenotype, Data Mining, Natural Language Processing
Published Open-Access
yes
Recommended Citation
Ekin Soysal and Kirk Roberts, "PheNormGPT: A Framework for Extraction and Normalization of Key Medical Findings" (2024). Faculty, Staff and Student Publications. 540.
https://digitalcommons.library.tmc.edu/uthshis_docs/540