Faculty, Staff and Student Publications
Publication Date
7-1-2023
Journal
EBioMedicine
Abstract
BACKGROUND: Differentiating intrahepatic cholangiocarcinomas (iCCA) from hepatic metastases of pancreatic ductal adenocarcinoma (PAAD) is challenging. Both tumours have similar morphological and immunohistochemical patterns and share multiple driver mutations. We hypothesised that DNA methylation-based machine-learning algorithms may help perform this task.
METHODS: We assembled genome-wide DNA methylation data for iCCA (n = 259), PAAD (n = 431), and normal bile duct (n = 70) from publicly available sources. We split this cohort into a reference (n = 399) and a validation set (n = 361). Using the reference cohort, we trained three machine learning models to differentiate between these entities. Furthermore, we validated the classifiers on the technical validation set and used an internal cohort (n = 72) to test our classifier.
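The cohort sizes above can be checked with a short sketch. This is only an illustration: the plain random shuffle, the fixed seed, and the names `COHORT_COUNTS` and `split_cohort` are assumptions, since the abstract does not state how the split was performed.

```python
import random

# Class counts as reported in the abstract (259 + 431 + 70 = 760 samples).
COHORT_COUNTS = {"iCCA": 259, "PAAD": 431, "normal_bile_duct": 70}


def split_cohort(counts, n_reference, seed=0):
    """Shuffle placeholder sample IDs and split them into two sets.

    Assumption: a plain random split. The abstract does not say whether
    the authors split at random, by class, or by data source.
    """
    samples = [(label, i) for label, n in counts.items() for i in range(n)]
    rng = random.Random(seed)
    rng.shuffle(samples)
    return samples[:n_reference], samples[n_reference:]


reference, validation = split_cohort(COHORT_COUNTS, n_reference=399)
print(len(reference), len(validation))  # 399 361
```

The two set sizes add up to the 760 publicly available samples, matching the reference (n = 399) and validation (n = 361) cohorts described above.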
FINDINGS: On the validation cohort, the neural network, support vector machine, and random forest classifiers reached accuracies of 97.68%, 95.62%, and 96.5%, respectively. Filtering by anomaly detection and thresholds improved the accuracy to 99.07% (37 samples excluded by filtering), 96.22% (17 samples excluded), and 100% (44 samples excluded) for the neural network, support vector machine, and random forest, respectively. Because the neural network offered the best balance between accuracy and the number of predictable cases, we tested it with filters applied on the in-house cohort, obtaining an accuracy of 95.45%.
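The trade-off between accuracy and the number of predictable cases can be illustrated with a minimal sketch. It assumes the threshold is applied to a per-sample confidence score; the anomaly detection step is not reproduced, and the function name `filtered_accuracy`, the threshold value, and the toy data are hypothetical rather than taken from the study.

```python
def filtered_accuracy(y_true, y_pred, confidence, threshold):
    """Return (accuracy on retained samples, number of excluded samples).

    Assumption: a sample is excluded when its confidence score falls
    below the threshold. Accuracy is None if no sample is retained.
    """
    kept = [(t, p) for t, p, c in zip(y_true, y_pred, confidence) if c >= threshold]
    n_excluded = len(y_true) - len(kept)
    if not kept:
        return None, n_excluded
    correct = sum(t == p for t, p in kept)
    return correct / len(kept), n_excluded


# Toy data, not from the study: one low-confidence prediction is wrong.
y_true = ["iCCA", "PAAD", "PAAD", "normal", "iCCA"]
y_pred = ["iCCA", "PAAD", "iCCA", "normal", "iCCA"]
confidence = [0.99, 0.95, 0.55, 0.90, 0.85]

accuracy, excluded = filtered_accuracy(y_true, y_pred, confidence, threshold=0.8)
print(accuracy, excluded)  # 1.0 1
```

In this toy case the unfiltered accuracy is 4/5, and excluding the single low-confidence sample raises it to 4/4. Filtering trades coverage for accuracy in the same way in the reported results, where each classifier's accuracy rises once some samples are excluded.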
INTERPRETATION: We developed a classifier that can differentiate between iCCAs, intrahepatic metastases of PAAD, and normal bile duct tissue with high accuracy. This tool can be used to improve the diagnosis of pancreato-biliary cancers of the liver.
Keywords
Pathology, Machine learning, Oncology, Molecular diagnosis, Epigenetic
Included in
Bioinformatics Commons, Biomedical Informatics Commons, Medical Sciences Commons, Oncology Commons, Pathological Conditions, Signs and Symptoms Commons
Comments
Supplementary Materials
PMID: 37348162