Publication Date
10-1-2020
Journal
JAMA Ophthalmology
DOI
10.1001/jamaophthalmol.2020.3269
PMID
32880609
PMCID
PMC7489388
PubMedCentral® Posted Date
9-3-2020
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Keywords
Algorithms, Artificial Intelligence, Cross-Sectional Studies, Deep Learning, Diabetic Retinopathy, Female, Humans, Male, Neural Networks, Computer, ROC Curve, Rare Diseases, Retrospective Studies
Abstract
IMPORTANCE: Recent studies have demonstrated the successful application of artificial intelligence (AI) for automated retinal disease diagnostics but have not addressed a fundamental challenge for deep learning systems: the current need for large, criterion standard-annotated retinal data sets for training. Low-shot learning algorithms, which aim to learn from a relatively small amount of training data, may be beneficial for clinical situations involving rare retinal diseases or when addressing potential bias resulting from training data that may not adequately represent certain groups, such as individuals older than 85 years.
OBJECTIVE: To evaluate whether low-shot deep learning methods are beneficial when using small training data sets for automated retinal diagnostics.
DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional study, conducted from July 1, 2019, to June 21, 2020, compared different diabetic retinopathy classification algorithms, traditional and low-shot, for 2-class designations (diabetic retinopathy warranting referral vs not warranting referral). The public domain EyePACS data set was used, which originally included 88 692 fundi from 44 346 individuals. Statistical analysis was performed from February 1 to June 21, 2020.
MAIN OUTCOMES AND MEASURES: The performance (with 95% CIs) of the various AI algorithms was measured via receiver operating characteristic (ROC) curves and their area under the curve (AUC), precision-recall curves, accuracy, and F1 score, evaluated for different training data sizes, ranging from 5120 to 10 samples per class.
RESULTS: Deep learning algorithms, when trained with sufficiently large data sets (5120 samples per class), yielded comparable performance: an AUC of 0.8330 (95% CI, 0.8140-0.8520) for a traditional approach (eg, fine-tuned ResNet) compared with an AUC of 0.8348 (95% CI, 0.8159-0.8537) for a low-shot method using self-supervised Deep InfoMax (our method, denoted as DIM). However, when far fewer training images were available (n = 160), the AUC of the traditional deep learning approach decreased to 0.6585 (95% CI, 0.6332-0.6838), and it was outperformed by a low-shot method using self-supervision, which had an AUC of 0.7467 (95% CI, 0.7239-0.7695). At very low shots (n = 10), the traditional approach had performance close to chance, with an AUC of 0.5178 (95% CI, 0.4909-0.5447), compared with the best low-shot method (AUC, 0.5778 [95% CI, 0.5512-0.6044]).
CONCLUSIONS AND RELEVANCE: These findings suggest the potential benefits of using low-shot methods for AI retinal diagnostics when a limited number of annotated training retinal images are available (eg, with rare ophthalmic diseases or when addressing potential AI bias).
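The outcome measures named in the abstract (AUC with a 95% CI, accuracy, and F1 score) can be illustrated with a minimal Python sketch. This is not the study's code. The function names are hypothetical, and the Hanley-McNeil normal approximation used for the interval is an assumption, since the abstract does not state how its CIs were derived. The 0.5 decision threshold in the usage note is likewise an illustrative choice.

```python
import math


def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate 95% CI for an AUC (assumed method, not from the abstract)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    half_width = z * math.sqrt(variance)
    # Clamp to the valid AUC range [0, 1].
    return max(0.0, auc - half_width), min(1.0, auc + half_width)


def accuracy_and_f1(labels, predictions):
    """Accuracy and F1 for binary labels (1 = referable, 0 = not referable)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # F1 is undefined with no true positives and no errors; report 0.0 then.
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1
```

For example, given model scores for referable and nonreferable images, `auc_mann_whitney(pos, neg)` gives the AUC, `hanley_mcneil_ci(auc, len(pos), len(neg))` gives an approximate interval, and thresholding the scores at 0.5 produces the binary predictions that `accuracy_and_f1` expects. With small test sets the interval is wide, which is consistent with reporting CIs alongside every AUC as the abstract does.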
Included in
Biochemistry, Biophysics, and Structural Biology Commons, Biology Commons, Medical Sciences Commons, Medical Specialties Commons
Comments
Associated Data