AI-Driven Dermatology Could Leave Dark-Skinned Patients Behind

Machine learning has the potential to save thousands of people from skin cancer each year—while putting others at greater risk.

LaToya Smith was 29 years old when she died from skin cancer. The young doctor had gotten her degree in podiatry from Rosalind Franklin University, in Chicago, just four years prior, and had recently finished a medical mission in Eritrea. But a diagnosis of melanoma in 2010 meant she would work in private practice for only a year before her death.

As a black woman, LaToya reflected a stark imbalance in skin-cancer statistics in America. While fair-skinned people are at the highest risk for developing skin cancer, the mortality rate for African Americans is considerably higher: Their five-year survival rate is 73 percent, compared with 90 percent for white Americans, according to the American Academy of Dermatology.

As the rates of melanoma for all Americans continue a 30-year climb, dermatologists have begun exploring new technologies, including artificial intelligence, to try to reverse this deadly trend. There’s been a growing hope in the field that using machine-learning algorithms to diagnose skin cancers and other skin issues could make for more efficient doctor visits and more reliable diagnoses. The earliest results are promising, but also potentially dangerous for darker-skinned patients.

Earlier this month, Avery Smith, LaToya’s husband and a software engineer in Baltimore, Maryland, co-authored a paper in JAMA Dermatology that warns of the potential racial disparities that could come from relying on machine learning for skin-cancer screenings. Smith’s co-author, Adewole Adamson of the University of Texas at Austin, has conducted multiple studies on demographic imbalances in dermatology. “African Americans have the highest mortality rate [for skin cancer], and doctors aren’t trained on that particular skin type,” Smith told me over the phone. “When I came across the machine-learning software, one of the first things I thought was how it will perform on black people.”

Recently, a study that tested machine-learning software in dermatology, conducted by a group of researchers primarily out of Germany, found that “deep-learning convolutional neural networks,” or CNNs, detected potentially cancerous skin lesions better than the 58 dermatologists included in the study group. The data used for the study come from the International Skin Imaging Collaboration, or ISIC, an open-source repository of skin images to be used by machine-learning algorithms. Given the rise in melanoma cases in the United States, a machine-learning algorithm that assists dermatologists in diagnosing skin cancer earlier could conceivably save thousands of lives each year.

The deployment of such an algorithm, according to Carlos Charles, a New York City–based dermatologist whose practice specializes in treating patients with darker skin tones, does hold the possibility of helping marginalized people get diagnosed at higher rates than they currently are. “You could take this type of tech and it could have a big role in helping marginalized communities who can’t get to the dermatologist,” he says. “Potentially, if you could combine different forms of telemedicine and machine vision, you could access more people and make more educated diagnoses.”

But, he says, the technology “is a long way from prime time.” Chief among the prohibitive issues, according to Smith and Adamson, is that the data the CNN relies on come from primarily fair-skinned populations in the United States, Australia, and Europe. If the algorithm is basing most of its knowledge on how skin lesions appear on fair skin, then theoretically, lesions on patients of color are less likely to be diagnosed. “If you don’t teach the algorithm with a diverse set of images, then that algorithm won’t work out in the public that is diverse,” says Adamson. “So there’s risk, then, for people with skin of color to fall through the cracks.”
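
To make that concern concrete, here is a minimal sketch of how an overall detection rate can hide a gap between skin-tone groups. Everything in it is hypothetical: the group labels, the record layout, and the toy counts are invented for illustration and do not come from the CNN study, the ISIC archive, or Adamson and Smith's paper.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Return {group: share of truly malignant lesions that were flagged}.

    Each record is a (group, is_malignant, flagged) tuple. This layout is
    an assumption made for the sketch, not a real data format.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, is_malignant, flagged in records:
        if is_malignant:
            totals[group] += 1
            if flagged:
                hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Invented toy data: many malignant lesions on lighter skin, few on darker skin.
toy_records = (
    [("lighter", True, True)] * 9      # 9 melanomas caught
    + [("lighter", True, False)] * 1   # 1 melanoma missed
    + [("darker", True, True)] * 1     # 1 melanoma caught
    + [("darker", True, False)] * 1    # 1 melanoma missed
)

per_group = sensitivity_by_group(toy_records)

# The pooled rate is dominated by the larger group.
malignant = [r for r in toy_records if r[1]]
overall = sum(1 for r in malignant if r[2]) / len(malignant)

print(per_group)
print(round(overall, 3))
```

With these made-up counts, the pooled detection rate is 10 out of 12, yet the darker-skin group sees only 1 of its 2 melanomas caught. Reporting results separately for each group, rather than as one pooled figure, is one way such a gap would become visible.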

As Adamson and Smith’s paper points out, racial disparities in artificial intelligence and machine learning are not a new issue. Algorithms have mistaken images of black people for gorillas, misunderstood Asians to be blinking when they weren’t, and “judged” only white people to be attractive. An even more dangerous issue, according to the paper, is that decades of clinical research have focused primarily on people with light skin, leaving out marginalized communities whose symptoms may present differently.

The reasons for this exclusion are complex. According to Andrew Alexis, a dermatologist at Mount Sinai, in New York City, and the director of the Skin of Color Center, compounding factors include a lack of medical professionals from marginalized communities, inadequate information about those communities, and socioeconomic barriers to participating in research. “In the absence of a diverse study population that reflects that of the U.S. population, potential safety or efficacy considerations could be missed,” he says.

Adamson agrees, elaborating that with inadequate data, machine learning could misdiagnose people of color with nonexistent skin cancers—or miss them entirely. But he understands why the field of dermatology would surge ahead without demographically complete data. “Part of the problem is that people are in such a rush. This happens with any new tech, whether it’s a new drug or test. Folks see how it can be useful and they go full steam ahead without thinking of potential clinical consequences. What these folks [in the CNN trial] have done is they’ve gone after easily accessible data sets. But data sets are inherently biased.”

The ideal solution, then, would be to ensure more equitable demographic participation in clinical trials, and in the case of machine learning, to collect photo sets of skin conditions on diverse skin types for the algorithm to “learn” from. Adamson believes that the remedy “is not necessarily easy, but it is simple.”

Timo Buhl, a dermatologist at the University Medical Center Göttingen and a co-author of the CNN study, readily admits to the study’s demographic data gaps. “Most images in our study were taken of moles and melanomas of white people, which reflects the vast majority of patients here [in Germany],” he says. Buhl adds that he’s currently building data sets and running experiments with images from “other parts of the world.” The ISIC, too, is looking to expand its archive to include as many skin types as possible, according to Allan C. Halpern, a dermatologist at Memorial Sloan Kettering, in New York City, and a spokesperson for the organization.

Adamson wants dermatologists to begin actively contributing photos of lesions on their patients with darker skin tones to the open-source ISIC. Smith agrees, saying that contributions will be most valuable if they extend beyond the United States and Europe. “You have [dozens of] countries across the world with majority-black populations,” he says. “There needs to be more photos taken of their moles.”

Improving machine-learning algorithms is far from the only method to ensure that people with darker skin tones are protected against the sun and receive diagnoses earlier, when many cancers are more survivable. According to the Skin Cancer Foundation, 63 percent of African Americans don’t wear sunscreen; both they and many dermatologists are more likely to delay diagnosis and treatment because of the belief that dark skin is adequate protection from the sun’s harmful rays. And due to racial disparities in access to health care in America, African Americans are less likely to get treatment in time.

“These organizations are training their machine learning on Caucasian skin, so it ends up providing advancement for the population that has the highest survival rate,” Smith says.

“AI isn’t bad; quite the opposite,” Adamson adds. “I just think it should be inclusive.”