Faculty, Staff and Student Publications
Language
English
Publication Date
3-1-2026
Journal
Oral Oncology
DOI
10.1016/j.oraloncology.2026.107877
PMID
41621281
Abstract
Background: The management of head and neck cancer relies on multidisciplinary expertise; however, access to tumor boards remains variable. Large language models (LLMs) may support guideline-based decision-making, although performance in complex oncologic scenarios is not well defined.
Methods: Fourteen synthetic cases based on real tumor board encounters were evaluated. Five blinded comparator arms produced recommendations: a human expert, Non-RAG-GPT-4, Non-RAG-GPT-5, RAG-GPT-4, and RAG-GPT-5. Eight head and neck oncologic surgeons scored each recommendation for appropriateness, clarity, specificity, and feasibility using 5-point Likert scales. Arms were compared using paired permutation testing, and inter-rater reliability was assessed.
Results: LLM outputs showed close alignment with expert recommendations. RAG-based models achieved the highest mean scores across domains, with some statistically significant differences versus the expert comparator in appropriateness and clarity; however, absolute differences were modest. Inter-rater reliability was strong (ICC 0.73-0.87).
Conclusions: Advanced LLMs can generate guideline-concordant management recommendations in simulated head and neck cancer cases, supporting potential utility for decision support and education; prospective validation and expert oversight remain essential.
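Illustrative note: the paired permutation testing named in the Methods can be sketched as follows. This is a minimal sketch, not the authors' analysis code. The abstract does not state the test statistic, the number of permutations, or how rater scores were aggregated per case, so the mean paired difference, the exact sign-flip enumeration, the function name paired_permutation_test, and the toy scores are all assumptions made for illustration.

```python
from itertools import product


def paired_permutation_test(a, b):
    """Exact two-sided paired (sign-flip) permutation test.

    Hypothetical helper: a and b are per-case scores for two arms,
    paired by case. Returns (mean difference, two-sided p-value).
    Enumerates all 2**n sign assignments, so it suits small n only.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n

    extreme = 0
    total = 0
    # Under the null hypothesis the sign of each paired difference is
    # exchangeable, so every sign pattern is equally likely.
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        if stat >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
        total += 1
    return sum(diffs) / n, extreme / total


# Toy values only (NOT data from the study): four cases, two arms.
arm_a = [4, 5, 4, 5]
arm_b = [3, 4, 3, 4]
mean_diff, p_value = paired_permutation_test(arm_a, arm_b)
print(mean_diff, p_value)
```

With 14 cases, exact enumeration covers 2**14 = 16,384 sign patterns, which is inexpensive. A Monte Carlo variant that samples random sign flips would be the usual substitute for larger samples.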
Keywords
Humans, Head and Neck Neoplasms, Decision Support Techniques, Language, Reproducibility of Results, Large Language Models
Published Open-Access
yes
Recommended Citation
Hack, Sholem; Karni, Ron J; Maniaci, Antonino; et al., "Evaluation of Large Language Models As Decision Support Tools for Head and Neck Cancer Management: A Blinded Multidisciplinary Simulation Study" (2026). Faculty, Staff and Student Publications. 4040.
https://digitalcommons.library.tmc.edu/uthmed_docs/4040