Researchers from MIT have found that doctors are less accurate at diagnosing skin diseases from images when the patient has darker skin. Their study included more than 1,000 dermatologists and general practitioners. Dermatologists correctly identified the disease in about 38% of the images they saw, but in only 34% of the images showing darker skin. General practitioners showed a similar pattern.
The researchers also found that assistance from an artificial intelligence (AI) algorithm improved diagnostic accuracy, particularly for patients with lighter skin. The study is the first to demonstrate disparities in doctors' diagnostic accuracy across skin tones, and the MIT researchers suggest that the predominance of lighter skin images in dermatology textbooks and training materials may be a contributing factor.
Matt Groh, lead author of the study and assistant professor at Northwestern University, highlighted the importance of empirical evidence in changing dermatology policies. Groh was particularly interested in applying machine learning to improve medical decision-making and patient outcomes.
To evaluate doctors' diagnostic accuracy, the team gathered 364 images from dermatology materials, representing a wide range of skin diseases and skin tones. Dermatologists were more accurate than general practitioners. However, both groups were about four percentage points less accurate on images of darker skin.
The researchers also developed an AI algorithm to assist the doctors, which improved accuracy for both dermatologists (up to 60%) and general practitioners (up to 47%). Even with AI assistance, general practitioners remained more proficient on images of lighter skin than on images of darker skin.
The researchers hope the findings will encourage medical schools and the authors of dermatology materials to include more training on darker skin, and that they will help guide the use of AI programs in dermatology. The study was funded by the MIT Media Lab Consortium and the Harold Horowitz Student Research Fund.