A study by Massachusetts Institute of Technology (MIT) researchers has found that physicians are less accurate at diagnosing skin diseases from images alone when the patient has darker skin. The study involved more than 1,000 dermatologists and general practitioners. Dermatologists correctly characterized 34% of images showing disease on darker skin, compared with 38% of images showing lighter skin. General practitioners, who were less accurate overall, showed a similar decline with darker skin.
The researchers posited that artificial intelligence (AI) could help boost physicians’ accuracy. However, AI assistance delivered greater improvements for conditions on lighter skin.
Notably, this is the first study to demonstrate diagnostic disparities arising from differences in skin tone. Previous studies have, however, noted a bias towards lighter skin tones in the images used in dermatology textbooks and training materials. This bias, together with the possibility that some practitioners have less experience treating darker skin, could contribute to the observed discrepancy.
The MIT study, led by Matt Groh, PhD, an assistant professor at Northwestern University, emphasizes the urgent need to revise dermatology education. The study used 364 images showing 46 skin diseases across a range of skin tones, and found that these conditions, including Lyme disease, present differently on darker and lighter skin.
The participants comprised 389 certified dermatologists, 116 dermatology residents, 459 general practitioners, and 154 other physicians. Specialists were more accurate than generalists, at 38% compared with 19%, but both groups showed a significant drop of about 4 percentage points when diagnosing conditions on darker skin.
The team also tested an AI algorithm, trained on roughly 30,000 images, which on its own achieved an accuracy of 47%. With the tool's assistance, accuracy rose significantly for both groups: to 60% for dermatologists and to 47% for general practitioners. Surprisingly, the physicians rarely adopted AI suggestions that were incorrect, which indicates that they are competent at ruling out diseases. The algorithm itself was equally accurate for light and dark skin.
The study urges training institutions and textbook authors to include more patients with darker skin in their studies and materials. The results may also shape how AI assistance programs for dermatology are developed and deployed, given the many firms now venturing into this space.