Faculty, Staff and Student Publications

Medical Image Segmentation Assisted with Clinical Inputs via Language Encoder in A Deep Learning Framework

Language

English

Publication Date

3-1-2025

Journal

Machine Learning: Science and Technology

DOI

10.1088/2632-2153/adb371

PMID

41078606

PMCID

PMC12509794

PubMedCentral® Posted Date

2-14-2026

PubMedCentral® Full Text Version

Author MSS

Abstract

Introduction: Auto-segmentation of tumor volumes and organs at risk (OARs) is a critical step in cancer radiotherapy treatment planning, where rapid, precise adjustments to treatment plans are required to match the patient anatomy. Although auto-segmentation has been clinically accepted for most OARs, auto-segmentation of tumor volumes, particularly clinical target volumes (CTVs), remains a challenge. This difficulty arises because images alone are often insufficient to capture the necessary information for accurate delineation of microscopic tumor invasion invisible on the image itself.

Methods: We propose a deep learning-based medical image segmentation framework designed to mimic the clinical process of delineating CTVs and OARs. At its core, the model performs precise segmentation of medical images while enhancing accuracy by integrating clinical information in text format. A transformer-based text encoder converts textual clinical data into vectors, which are incorporated into the segmentation process with image features. This integration bridges the gap between traditional automated segmentation methods and clinician-guided, context-rich delineations. The framework's effectiveness is demonstrated through a prostate segmentation example in the context of radiation therapy for localized prostate cancer, where incorporating clinical context significantly impacts the delineation process.
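Illustrative sketch: the abstract does not say how the text vectors are combined with the image features, so the snippet below shows only one common way this could be done, which is feature-wise scaling and shifting of each image channel conditioned on the text embedding. The function names, the weight matrices, and the toy values are all hypothetical and are not taken from the paper; `text_vec` stands in for the output of a transformer-based text encoder.

```python
# Hypothetical sketch, NOT the authors' implementation: condition image
# features on a text embedding via per-channel scale and shift.

def project(text_vec, weights, bias):
    """Linear layer; weights has shape [out_dim][in_dim]."""
    return [sum(w * x for w, x in zip(row, text_vec)) + b
            for row, b in zip(weights, bias)]

def fuse_text_into_features(feature_map, text_vec,
                            w_scale, b_scale, w_shift, b_shift):
    """feature_map has shape [channels][height][width].

    Each channel c is transformed as (1 + scale[c]) * value + shift[c],
    where scale and shift are linear projections of the text embedding.
    """
    scale = project(text_vec, w_scale, b_scale)
    shift = project(text_vec, w_shift, b_shift)
    return [[[(1.0 + s) * v + t for v in row] for row in channel]
            for channel, s, t in zip(feature_map, scale, shift)]

# Toy inputs (assumed values): 2 channels of 2x2 features, 3-dim text vector.
feature_map = [[[1.0, 2.0], [3.0, 4.0]],
               [[0.5, 0.5], [0.5, 0.5]]]
text_vec = [1.0, 0.0, -1.0]  # placeholder for a text-encoder output
w_scale = [[0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_scale = [0.0, 0.0]
w_shift = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
b_shift = [0.0, 0.0]

fused = fuse_text_into_features(feature_map, text_vec,
                                w_scale, b_scale, w_shift, b_shift)
```

With all projection weights and biases set to zero, the fusion leaves the image features unchanged, so in this sketch a model without clinical text reduces to an image-only baseline.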

Results: In our experiments, we included additional clinical information potentially influencing clinicians' prostate segmentation. The results show that our proposed method not only outperforms the baseline model, but also surpasses current state-of-the-art methods, with or without clinical contexts. Furthermore, our method demonstrates high performance even with limited data.

Conclusion: The proposed segmentation framework has been shown to significantly improve auto-segmentation, particularly for CTVs, in cancer radiotherapy.

Keywords

Medical image segmentation, transformer, CLIP, deep learning, language encoder

Published Open-Access

yes

Recommended Citation

Zhao, Hengrui; Wang, Biling; Mistry, Deepkumar; et al., "Medical Image Segmentation Assisted with Clinical Inputs via Language Encoder in A Deep Learning Framework" (2025). Faculty, Staff and Student Publications. 6670.
https://digitalcommons.library.tmc.edu/uthgsbs_docs/6670

Included in

Bioinformatics Commons, Biomedical Informatics Commons, Genetic Phenomena Commons, Medical Genetics Commons, Oncology Commons
