Faculty, Staff and Student Publications

Publication Date

3-10-2025

Journal

Scientific Reports

Abstract

The performance of deep learning-based natural language processing systems is based on large amounts of labeled training data which, in the clinical domain, are not easily available or affordable. Weak supervision and in-context learning offer partial solutions to this issue, particularly using large language models (LLMs), but their performance still trails traditional supervised methods with moderate amounts of gold-standard data. In particular, inferencing with LLMs is computationally heavy. We propose an approach leveraging fine-tuning LLMs and weak supervision with virtually no domain knowledge that still achieves consistently dominant performance. Using a prompt-based approach, the LLM is used to generate weakly-labeled data for training a downstream BERT model. The weakly supervised model is then further fine-tuned on small amounts of gold standard data. We evaluate this approach using Llama2 on three different i2b2/ n2c2 datasets for clinical named entity recognition. With no more than 10 gold standard notes, our final BERT models weakly supervised by fine-tuned Llama2-13B consistently outperformed out-of-the-box PubMedBERT by 4.7-47.9% in F1 scores. With only 50 gold standard notes, our models achieved close performance to fully fine-tuned systems.

Keywords

Humans, Deep Learning, Natural language processing, Large language models, Electronic health records, Weak supervision, Translational research, Machine learning, Computer science

DOI

10.1038/s41598-024-68168-2

PMID

40064991

PMCID

PMC11893743

PubMedCentral® Posted Date

3-10-2025

PubMedCentral® Full Text Version

Post-print

Published Open-Access

yes

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.