Publication Date

4-13-2022

Journal

Journal of the American Medical Informatics Association

Abstract

OBJECTIVES: Scanned documents (SDs), while common in electronic health records and potentially rich in clinically relevant information, rarely fit well with clinician workflow. Here, we identify scanned imaging reports requiring follow-up with high recall and practically useful precision.

MATERIALS AND METHODS: We focused on identifying imaging findings for 3 common causes of malpractice claims: (1) potentially malignant breast (mammography) and (2) lung (chest computed tomography [CT]) lesions and (3) long-bone fracture (X-ray) reports. We train our ClinicalBERT-based pipeline on existing typed/dictated reports classified manually or using ICD-10 codes, evaluate using a test set of manually classified SDs, and compare against string-matching (baseline approach).

RESULTS: A total of 393 mammogram, 305 chest CT, and 683 bone X-ray reports were manually reviewed. The string-matching approach had an F1 of 0.667. For mammograms, chest CTs, and bone X-rays, respectively: models trained on manually classified training data and optimized for F1 reached an F1 of 0.900, 0.905, and 0.817, while separate models optimized for recall achieved a recall of 1.000 with precisions of 0.727, 0.518, and 0.275. Models trained on ICD-10-labeled data and optimized for F1 achieved F1 scores of 0.647, 0.830, and 0.643, while those optimized for recall achieved a recall of 1.000 with precisions of 0.407, 0.683, and 0.358.
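The contrast between "optimized for F1" and "optimized for recall" can be illustrated with a minimal sketch. The abstract does not say how the optimization was carried out, so the threshold sweep below is only an assumption for illustration, and the names `precision_recall_f1` and `pick_threshold`, along with the toy labels and scores, are hypothetical. It shows how a single set of classifier scores can yield either an F1-maximizing operating point or a recall-1.000 operating point with lower precision.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def pick_threshold(y_true, scores, objective="f1", min_recall=1.0):
    """Sweep candidate thresholds (hypothetical procedure, not the paper's).

    objective="f1":     return the threshold with the highest F1.
    objective="recall": among thresholds whose recall is at least
                        min_recall, return the one with highest precision.
    Ties are broken in favor of the higher threshold.
    """
    best = None
    for thr in sorted(set(scores)):
        y_pred = [1 if s >= thr else 0 for s in scores]
        p, r, f = precision_recall_f1(y_true, y_pred)
        if objective == "recall":
            if r < min_recall:
                continue
            key = (p, thr)
        else:
            key = (f, thr)
        if best is None or key > best[0]:
            best = (key, thr, p, r, f)
    if best is None:
        raise ValueError("no threshold satisfies the recall constraint")
    _, thr, p, r, f = best
    return thr, p, r, f


# Toy data, invented for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.15, 0.7, 0.3, 0.2, 0.1, 0.6]

f1_point = pick_threshold(y_true, scores, objective="f1")
recall_point = pick_threshold(y_true, scores, objective="recall")
print("F1-optimized:     thr=%.2f precision=%.3f recall=%.3f f1=%.3f" % f1_point)
print("recall-optimized: thr=%.2f precision=%.3f recall=%.3f f1=%.3f" % recall_point)
```

On this toy data the recall-constrained operating point flags every abnormal report but admits more false positives, the same trade-off visible in the reported precisions at a recall of 1.000.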

DISCUSSION: Our pipeline can identify abnormal reports with potentially useful performance and so decrease the manual effort required to screen for abnormal findings that require follow-up.

CONCLUSION: It is possible to automatically identify clinically significant abnormalities in SDs with high recall and practically useful precision in a generalizable and minimally laborious way.

Keywords

Electronic Health Records, Natural Language Processing, Research Report, Tomography, X-Ray Computed

DOI

10.1093/jamia/ocac007

PMID

35146510

PMCID

PMC9714594

PubMedCentral® Posted Date

February 2022

PubMedCentral® Full Text Version

Post-print

Published Open-Access

yes

Included in

Internal Medicine Commons

Faculty, Staff and Student Publications

Closing The Loop: Automatically Identifying Abnormal Imaging Results In Scanned Documents
