
Faculty, Staff and Student Publications
Publication Date
10-23-2024
Journal
Database
Abstract
This manuscript presents PheNormGPT, a framework for extracting and normalizing key findings in clinical text. PheNormGPT leverages large language models to extract key findings and phenotypic data from unstructured clinical text and map them to Human Phenotype Ontology concepts. It uses OpenAI's GPT-3.5 Turbo and GPT-4 models with fine-tuning and few-shot learning, including a novel strategy that selects custom-tailored few-shot examples for each request. PheNormGPT was evaluated in the BioCreative VIII Track 3 shared task, Genetic Phenotype Extraction from Dysmorphology Physical Examination Entries, where it achieved an F1 score of 0.82 for standard matching and 0.72 for exact matching, securing first place.
Keywords
Humans, Phenotype, Data Mining, Natural Language Processing
DOI
10.1093/database/baae103
PMID
39444329
PMCID
PMC11498178
PubMedCentral® Posted Date
10-23-2024
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes