Publication Date

1-1-2024

Journal

AMIA Summits on Translational Science Proceedings

Abstract

The results of clinical trials are a valuable source of evidence for researchers, policy makers, and healthcare professionals. However, online trial registries do not always contain links to the publications that report on their results, instead requiring a time-consuming manual search. Here, we explored the application of pre-trained transformer-based language models to automatically identify result-reporting publications of cancer clinical trials by computing dense vectors and performing semantic search. Models were fine-tuned on text data from trial registry fields and article metadata using a contrastive learning approach. The best performing model was PubMedBERT, which achieved a mean average precision of 0.592 and ranked 70.3% of a trial's publications in the top 5 results when tested on the holdout test trials. Our results suggest that semantic search using embeddings from transformer models may be an effective approach to the task of linking trials to their publications.
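As an illustration of the retrieval and evaluation steps the abstract describes, the following minimal Python sketch ranks candidate articles by cosine similarity to a trial embedding, then computes average precision and recall in the top k. It is not the authors' implementation. The function names, the toy two-dimensional vectors, and the exact metric definitions are assumptions made for illustration; in practice the vectors would come from a fine-tuned transformer encoder, and the paper's precise definition of its top-5 figure may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_articles(trial_vec, article_vecs):
    """Return article ids sorted by descending similarity to the trial vector."""
    return sorted(
        article_vecs,
        key=lambda aid: cosine(trial_vec, article_vecs[aid]),
        reverse=True,
    )


def average_precision(ranked_ids, relevant_ids):
    """Mean of precision values at each rank where a relevant article appears."""
    hits, total = 0, 0.0
    for rank, aid in enumerate(ranked_ids, start=1):
        if aid in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of a trial's relevant articles that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


# Hypothetical toy embeddings (real ones would have hundreds of dimensions).
trial_vec = [1.0, 0.0]
article_vecs = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [0.7, 0.7]}
relevant = {"a", "b"}  # assumed ground-truth result publications for this trial

ranking = rank_articles(trial_vec, article_vecs)
ap = average_precision(ranking, relevant)
r_at_2 = recall_at_k(ranking, relevant, k=2)
```

Averaging `average_precision` over all test trials gives a mean average precision of the kind reported above, and averaging `recall_at_k` with `k=5` gives one plausible reading of the top-5 figure.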

PMID

38827077

PMCID

PMC11141816

PubMedCentral® Posted Date

May 2024

PubMedCentral® Full Text Version

Post-Print

Published Open-Access

yes

Included in

Bioinformatics Commons, Biomedical Informatics Commons, Data Science Commons

Faculty, Staff and Student Publications

Linking Cancer Clinical Trials To Their Result Publications
