Faculty, Staff and Student Publications

Publication Date

10-23-2024

Journal

Database

Abstract

This manuscript presents PheNormGPT, a framework for extraction and normalization of key findings in clinical text. PheNormGPT relies on an innovative approach, leveraging large language models to extract key findings and phenotypic data in unstructured clinical text and map them to Human Phenotype Ontology concepts. It utilizes OpenAI's GPT-3.5 Turbo and GPT-4 models with fine-tuning and few-shot learning strategies, including a novel few-shot learning strategy for custom-tailored few-shot example selection per request. PheNormGPT was evaluated in the BioCreative VIII Track 3: Genetic Phenotype Extraction from Dysmorphology Physical Examination Entries shared task. PheNormGPT achieved an F1 score of 0.82 for standard matching and 0.72 for exact matching, securing first place for this shared task.

Keywords

Humans, Phenotype, Data Mining, Natural Language Processing

DOI

10.1093/database/baae103

PMID

39444329

PMCID

PMC11498178

PubMedCentral® Posted Date

10-23-2024

PubMedCentral® Full Text Version

Post-print

Published Open-Access

yes

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.