Collective human intelligence outperforms artificial intelligence in a skin lesion classification task

Abstract

Background and objectives: Convolutional neural networks (CNNs) enable accurate diagnosis of medical images and perform at or above the level of individual physicians. Recently, collective human intelligence (CoHI) was shown to exceed the diagnostic accuracy of individuals. This study therefore compared the diagnostic performance of CoHI (120 dermatologists), individual dermatologists, and two state-of-the-art CNNs.

Patients and Methods: Cross-sectional reader study in which 30 clinical cases were presented to 120 dermatologists. Six diagnoses were offered, and votes were collected via remote voting devices (quizzbox®, Quizzbox Solutions GmbH, Stuttgart, Germany). Dermatoscopic images were classified by a binary and a multiclass CNN (FotoFinder Systems GmbH, Bad Birnbach, Germany). Three sets of diagnostic classifications were scored against ground truth: (1) CoHI, (2) individual dermatologists, and (3) CNN.

Results: In binary classifications, CoHI attained a significantly higher accuracy [95 % confidence interval] (80.0 % [62.7 %–90.5 %]) than individual dermatologists (75.7 % [73.8 %–77.5 %]) and CNN (70.0 % [52.1 %–83.3 %]; all P < 0.001). Moreover, CoHI achieved a higher sensitivity (82.4 % [59.0 %–93.8 %]) and specificity (76.9 % [49.7 %–91.8 %]) than individual dermatologists (sensitivity 77.8 % [75.3 %–80.2 %], specificity 73.0 % [70.6 %–75.4 %]) and CNN (sensitivity 70.6 % [46.9 %–86.7 %], specificity 69.2 % [42.4 %–87.3 %]). In the multiclass evaluation, the diagnostic accuracy of CoHI was superior to that of individual dermatologists (P < 0.001), whose accuracy was comparable to that of the multiclass CNN.

Conclusions: Our analysis revealed that the majority vote of an interconnected group of dermatologists (CoHI) outperformed individuals and CNN in a demanding skin lesion classification task.
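To make the majority-vote idea and the reported intervals concrete, here is a minimal Python sketch. It rests on several assumptions that are not stated in the abstract: the vote labels are invented for illustration, the counts 24/30 and 21/30 are inferred from the reported 80.0 % and 70.0 % accuracies over 30 cases, and the Wilson score interval is used only because it happens to reproduce the reported bounds. The abstract does not name the interval method the authors used, and the function names `majority_vote` and `wilson_interval` are hypothetical.

```python
from collections import Counter
from math import sqrt


def majority_vote(votes):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(votes).most_common(1)[0][0]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95 %)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical votes for one case; the real study collected 120 votes per case.
votes = ["melanoma", "nevus", "melanoma", "melanoma", "nevus"]
collective_diagnosis = majority_vote(votes)

# Assumed counts: 24 of 30 cases correct for CoHI, 21 of 30 for the binary CNN.
cohi_low, cohi_high = wilson_interval(24, 30)
cnn_low, cnn_high = wilson_interval(21, 30)

print(collective_diagnosis)
print(f"CoHI accuracy 95% CI: {cohi_low:.1%} to {cohi_high:.1%}")
print(f"CNN accuracy 95% CI: {cnn_low:.1%} to {cnn_high:.1%}")
```

Under these assumptions the computed bounds agree with the reported intervals to one decimal place (62.7 % to 90.5 % for CoHI, 52.1 % to 83.3 % for the CNN). The width of both intervals reflects the small sample of 30 cases.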

Cite

Winkler, J. K., Sies, K., Fink, C., Toberer, F., Enk, A., Abassi, M. S., … Haenssle, H. A. (2021). Collective human intelligence outperforms artificial intelligence in a skin lesion classification task. JDDG - Journal of the German Society of Dermatology, 19(8), 1178–1184. https://doi.org/10.1111/ddg.14510
