Rooting Out Racial Bias in Health Care AI, Part 2

June 1, 2023


Artificial intelligence could revolutionize health care. It could also perpetuate and exacerbate generations of racial inequities. In Part 2 of our special series on racial bias in health care AI, we dig into what the Biden administration is doing to keep biased algorithms from getting to the bedside.


Doctors, data scientists and hospital executives believe artificial intelligence may help solve what until now have been intractable problems. Hospitals are already using AI to help clinicians diagnose breast cancer, read X-rays and predict which patients need more care. But as excitement grows, there’s also a risk: These powerful new tools can perpetuate long-standing racial inequities in how care is delivered.

“If you mess this up, you can really, really harm people by entrenching systemic racism further into the health system,” said Mark Sendak, a lead data scientist at the Duke Institute for Health Innovation.

These new health care tools are often built using machine learning, a subset of AI where algorithms are trained to find patterns in large data sets like billing information and test results. Those patterns can predict future outcomes, like the chance a patient develops sepsis. These algorithms can constantly monitor every patient in a hospital at once, alerting clinicians to potential risks that overworked staff might otherwise miss.
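The scoring-and-alerting loop described above can be sketched in a few lines. This is a hypothetical stand-in, not Duke's system: a real tool would learn its scoring function from training data, while the hand-coded score and the threshold below are invented purely to show the shape of the monitoring loop.

```python
# Hypothetical sketch of continuous patient monitoring with a risk model.
# A real model would be learned from data; this hand-coded score is a stand-in.
def sepsis_risk(vitals):
    """Combine a few vital signs into a rough risk score (illustrative only)."""
    score = 0.0
    if vitals["temp_c"] > 38.5:      # fever
        score += 0.4
    if vitals["heart_rate"] > 120:   # elevated heart rate
        score += 0.3
    if vitals["wbc"] > 15.0:         # high white blood cell count
        score += 0.3
    return score

ALERT_THRESHOLD = 0.6  # invented cutoff for this sketch

patients = {
    "bed_1": {"temp_c": 39.1, "heart_rate": 135, "wbc": 16.2},
    "bed_2": {"temp_c": 37.0, "heart_rate": 90, "wbc": 8.1},
}

# The system scores every patient at once and flags those above threshold
alerts = [bed for bed, vitals in patients.items()
          if sepsis_risk(vitals) >= ALERT_THRESHOLD]
print(alerts)
```

The value of such a system is exactly what the article describes: it never gets tired and never stops watching, which is also why any bias baked into its scoring reaches every patient at once.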

The data these algorithms are built on, however, often reflect inequities and bias that have long plagued U.S. health care. Research shows clinicians often provide different care to white patients and patients of color. Those differences in how patients are treated get immortalized in data, which are then used to train algorithms. People of color are also often underrepresented in those training data sets.

“When you learn from the past, you replicate the past. You further entrench the past,” Sendak said. “Because you take existing inequities and you treat them as the aspiration for how health care should be delivered.”

A landmark 2019 study published in the journal Science found that an algorithm used to predict health care needs for more than 100 million people was biased against Black patients. The algorithm relied on health care spending to predict future health needs. But with less access to care historically, Black patients often spent less. As a result, Black patients had to be much sicker to be recommended for extra care under the algorithm.
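The proxy problem the study identified can be reproduced with a toy simulation (all numbers synthetic and purely illustrative, not drawn from the study): if two groups have identical true medical need but one historically spends less because of access barriers, a model that ranks patients by spending will demand that the low-spending group be sicker before flagging them.

```python
# Toy illustration of label bias: using spending as a proxy for health need.
import random

random.seed(0)

def simulate(access, n=1000):
    # Each patient is (true_need, observed_spending). Both groups share the
    # same distribution of true need; the low-access group spends less per
    # unit of need, mimicking historical barriers to care.
    return [(need, need * access)
            for need in (random.uniform(0, 10) for _ in range(n))]

group_a = simulate(access=1.0)  # full access to care
group_b = simulate(access=0.7)  # spends ~30% less at the same level of need

# "Model": flag any patient whose spending clears group A's median spending
threshold = sorted(s for _, s in group_a)[len(group_a) // 2]

def avg_need(flagged):
    return sum(need for need, _ in flagged) / len(flagged)

flagged_a = [(n, s) for n, s in group_a if s >= threshold]
flagged_b = [(n, s) for n, s in group_b if s >= threshold]

print(f"avg true need of flagged group A patients: {avg_need(flagged_a):.1f}")
print(f"avg true need of flagged group B patients: {avg_need(flagged_b):.1f}")
```

Running this, the flagged patients from the low-access group are sicker on average than those from the high-access group, and fewer of them get flagged at all: the same dynamic the Science study found for Black patients.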

“You’re essentially walking where there’s land mines,” Sendak said of trying to build clinical AI tools using data that may contain bias, “and [if you’re not careful] your stuff’s going to blow up and it’s going to hurt people.”

The challenge of rooting out racial bias

In the fall of 2019, Sendak teamed up with pediatric emergency medicine physician Emily Sterrett to develop an algorithm to help predict childhood sepsis in Duke’s emergency department.

Sepsis occurs when the body overreacts to an infection and attacks its own organs. While rare in children — roughly 75,000 annual cases in the U.S. — this preventable condition is fatal for nearly 10% of kids. If caught quickly, antibiotics effectively treat sepsis. But diagnosis is challenging because typical early symptoms — fever, high heart rate and high white blood cell count — mimic other illnesses including the common cold.

An algorithm that could predict the threat of sepsis in kids would be a game changer for physicians across the country. “When it’s a child’s life on the line, having a backup system that AI could offer to bolster some of that human fallibility is really, really important,” Sterrett said.

But the groundbreaking study in Science about bias reinforced to Sendak and Sterrett that they needed to be careful with their design. The team spent a month teaching the algorithm to identify sepsis based on vital signs and lab tests instead of easily accessible but often incomplete billing data. Any tweak to the program over the first 18 months of development triggered quality control tests to ensure the algorithm found sepsis equally well regardless of race or ethnicity.
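A quality-control test like the one the team describes amounts to comparing the model's performance across subgroups. This is a generic sketch of that idea, not Duke's actual test suite; the function, the group labels, and the sample records are all hypothetical.

```python
# Hypothetical subgroup quality-control check: compare sensitivity
# (share of true sepsis cases the model catches) across patient groups.
from collections import defaultdict

def sensitivity_by_group(records):
    """records: iterable of (group, had_sepsis, model_flagged) tuples."""
    hits = defaultdict(int)
    cases = defaultdict(int)
    for group, truth, predicted in records:
        if truth:  # only true sepsis cases count toward sensitivity
            cases[group] += 1
            hits[group] += int(predicted)
    return {g: hits[g] / cases[g] for g in cases}

# Invented sample data to exercise the check
records = [
    ("white", True, True), ("white", True, True), ("white", True, False),
    ("hispanic", True, True), ("hispanic", True, False), ("hispanic", True, False),
    ("white", False, False), ("hispanic", False, False),
]
print(sensitivity_by_group(records))
```

A large sensitivity gap between groups after any tweak to the model would be the signal to stop and investigate before the change goes anywhere near patients.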

But nearly three years into their intentional and methodical effort, the team discovered possible bias still managed to slip in. Ganga Moorthy, a global health fellow with Duke’s pediatric infectious diseases program, showed the developers research indicating that doctors at Duke took longer to order blood tests for Hispanic kids who were eventually diagnosed with sepsis than for white kids.

“One of my major hypotheses was that physicians were taking illnesses in white children perhaps more seriously than those of Hispanic children,” Moorthy said. She also wondered if the need for interpreters slowed down the process.

“I probably had my hands on my face within seconds, just like, oh my God,” Sendak said. “We totally missed all of these subtle things that if any one of these was consistently true could introduce bias into the algorithm.”

Sendak said the team had overlooked this delay, which could have inaccurately taught their AI that Hispanic kids develop sepsis more slowly than other kids, a time difference that could be fatal.

“I was angry with myself. How could we not see this?”

Regulators are taking notice

Over the last several years, hospitals and researchers have formed national coalitions to share best practices and develop “playbooks” to combat bias. But signs suggest few hospitals are reckoning with the equity threat this new technology poses.

Researcher Paige Nong interviewed officials at 13 academic medical centers last year, and only four said they considered racial bias when developing or vetting machine learning algorithms.

“If a particular leader at a hospital or a health system happened to be personally concerned about racial inequity, then that would inform how they thought about AI,” Nong said. “But there was nothing structural, there was nothing at the regulatory or policy level that was requiring them to think or act that way.”

Several experts say the lack of regulation leaves this corner of AI feeling a bit like the “wild west.” Separate 2021 investigations found the Food and Drug Administration’s policies on racial bias in AI to be uneven, with only a fraction of algorithms even including racial information in public applications.

The Biden administration over the last 10 months has released a flurry of proposals to design guardrails for this emerging technology. The FDA says it now asks developers to outline any steps taken to mitigate bias and the source of data underpinning new algorithms.

The Office of the National Coordinator for Health Information Technology proposed new regulations in April that would require developers to share with clinicians a fuller picture of what data were used to build algorithms. Kathryn Marchesini, the agency’s chief privacy officer, described the new regulations as a “nutrition label” that helps doctors know “the ingredients used to make the algorithm.” The hope is more transparency will help providers determine if an algorithm is unbiased enough to safely use on patients.

The Office for Civil Rights at the U.S. Department of Health and Human Services last summer proposed updated regulations that explicitly forbid clinicians, hospitals and insurers from discriminating “through the use of clinical algorithms in [their] decision-making.” The agency’s director, Melanie Fontes Rainer, said while federal anti-discrimination laws already prohibit this activity, her office wanted “to make sure that [providers and insurers are] aware that this isn’t just buy a product off the shelf, close your eyes and use it.”

Industry welcoming — and wary — of new regulation

Many experts in AI and bias welcome this new attention, but there are concerns. Several academics and industry leaders said they want to see the FDA spell out in public guidelines exactly what developers must do to prove their AI tools are unbiased. Others want ONC to require developers to share their algorithm “ingredient list” publicly, allowing independent researchers to evaluate code for problems.

Some hospitals and academics worry these proposals — especially OCR’s explicit prohibition on using discriminatory AI — could backfire. “What we don’t want is for the rule to be so scary that physicians say, okay, I just won’t use any AI in my practice. I just don’t want to run the risk,” said Carmel Shachar, executive director of the Petrie-Flom Center at Harvard Law School. Shachar and several industry leaders said that without clear guidance, hospitals with fewer resources may struggle to stay on the right side of the law.

Duke’s Mark Sendak welcomes new regulations to eliminate bias from algorithms, “but what we’re not hearing regulators say is, ‘We understand the resources that it takes to identify these things, to monitor for these things. And we’re going to make investments to make sure that we address this problem.’”

The federal government invested $35 billion to entice and help doctors and hospitals adopt electronic health records earlier this century. None of the regulatory proposals around AI and bias include financial incentives or support.

‘You have to look in the mirror’

A lack of additional funding and clear regulatory guidance leaves AI developers to troubleshoot their own problems for now.

At Duke, the team had feared the delay in treating Hispanic patients would be baked into their algorithm for predicting sepsis among children. But after weeks of additional rigorous testing, they determined the algorithm predicted sepsis at the same speed for all patients.

Sendak’s best guess is that Duke has treated too few cases of sepsis among children for the algorithm to learn the human bias. He said the conclusion was more sobering than reassuring.

“I don’t find it comforting that in one specific rare case, we didn’t have to intervene to prevent bias,” he said. “Every time you become aware of a potential flaw, there’s that responsibility of [asking], ‘Where else is this happening?’”

Sendak plans to build a more diverse team, with anthropologists, sociologists, community members and patients working together to root out bias in Duke’s algorithms. But for this new class of tools to do more good than harm, Sendak believes the entire health care sector must address its underlying racial inequity.

“You have to look in the mirror,” he said. “It requires you to ask hard questions of yourself, of the people you work with, the organizations you’re a part of. Because if you’re actually looking for bias in algorithms, the root cause of a lot of the bias is inequities in [our] care.”

Tradeoffs coverage on diagnostic excellence is supported in part by the Gordon and Betty Moore Foundation.


Episode Resources

Selected Reporting, Research and Analysis on Regulation of AI and Racial Bias:

AI leaders issue a plea to Congress: Regulate us, and quickly (Casey Ross, STAT News, 5/16/2023)

Congress wants to regulate AI, but it has a lot of catching up to do (Claudia Grisales, NPR, 5/15/2023)

‘Nutrition Facts Labels’ for Artificial Intelligence/Machine Learning-Based Medical Devices—The Urgent Need for Labeling Standards (Sara Gerke, George Washington Law Review, 4/18/2023)

Artificial Intelligence and Machine Learning Blog Series (Kathryn Marchesini, ONC, 4/13/2023)

New ONC rule aims to raise trust in clinical decision support algorithms (Rebecca Pifer, Healthcare Dive, 4/12/2023)

FDA proposes a new plan to streamline updates to medical devices that use AI (Casey Ross, STAT News, 3/30/2023)

Prevention of Bias and Discrimination in Clinical Practice Algorithms (Carmel Shachar and Sara Gerke; JAMA; 1/5/2023)

HHS’s proposed rule prohibiting discrimination via algorithm needs strengthening (Ashley Beecy, Steve Miff and Karandeep Singh; STAT News; 11/3/2022)

In new guidance, FDA says AI tools to warn of sepsis should be regulated as devices (Casey Ross, STAT News, 9/27/2022)

FDA Review Can Limit Bias Risks in Medical Devices Using Artificial Intelligence (Liz Richardson, Pew Charitable Trusts, 10/7/2021)

Ethics and governance of artificial intelligence for health (World Health Organization, 6/28/2021)

How medical AI devices are evaluated: limitations and recommendations from an analysis of FDA approvals (Eric Wu, Kevin Wu, Roxana Daneshjou, David Ouyang, Daniel E. Ho and James Zou; Nature Medicine; 4/5/2021)

As the FDA clears a flood of AI tools, missing data raise troubling questions on safety and fairness (Casey Ross, STAT News, 2/3/2021)

Selected Reporting on AI and Racial Bias:

A research team airs the messy truth about AI in medicine — and gives hospitals a guide to fix it (Casey Ross, STAT News, 4/27/2023)

How Doctors Use AI to Help Diagnose Patients (Sumathi Reddy, Wall Street Journal, 2/28/2023)

How Hospitals Are Using AI to Save Lives (Laura Landro, Wall Street Journal, 4/10/2022)

From a small town in North Carolina to big-city hospitals, how software infuses racism into U.S. health care (Casey Ross, STAT News, 10/13/2020)

Widely used algorithm for follow-up care in hospitals is racially biased, study finds (Shraddha Chakradhar, STAT News, 10/24/2019)

Selected Research and Analysis on AI and Racial Bias:

Predictive Accuracy of Stroke Risk Prediction Models Across Black and White Race, Sex, and Age Groups (Chuan Hong, Michael J. Pencina, Daniel M. Wojdyla, Jennifer L. Hall, Suzanne E. Judd, Michael Cary, Matthew M. Engelhard, Samuel Berchuck, Ying Xian, Ralph D’Agostino Sr, George Howard, Brett Kissela and Ricardo Henao; JAMA; 1/24/2023)

Responsible AI: Fighting Bias in Healthcare (Chris Hemphill, Actium Health, 4/18/2022)

Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations (Laleh Seyyed-Kalantari, Haoran Zhang, Matthew B. A. McDermott, Irene Y. Chen and Marzyeh Ghassemi; Nature Medicine; 12/10/2021)

Racial Bias in Health Care Artificial Intelligence (NIHCM, 9/30/2021)

Artificial Intelligence in Healthcare: The Hope, The Hype, The Promise, The Peril (National Academy of Medicine, 2019)

Dissecting racial bias in an algorithm used to manage the health of populations (Ziad Obermeyer, Brian Powers, Christine Vogeli and Sendhil Mullainathan; Science; 10/25/2019)

Selected Best Practices to Mitigate Racial Bias in AI:

Organizational Governance of Emerging Technologies: AI Adoption in Healthcare (Health AI Partnership, 5/10/2023)

Blueprint for Trustworthy AI: Implementation Guidance and Assurance for Healthcare (Coalition for Health AI, 4/4/2023)

Preventing Bias and Inequities in AI-Enabled Health Tools (Trevan Locke, Valerie J. Parker, Andrea Thoumi, Benjamin A. Goldstein and Christina Silcox; Duke Margolis Center for Health Policy; 7/6/2022)

Algorithmic Bias Playbook (Ziad Obermeyer, Rebecca Nissan, Michael Stern, Stephanie Eaneff, Emily Joy Bembeneck and Sendhil Mullainathan; Chicago Booth Center for Applied Artificial Intelligence; June 2021)

Ensuring Fairness in Machine Learning to Advance Health Equity (Alvin Rajkomar, Michaela Hardt, Michael D. Howell, Greg Corrado and Marshall H. Chin; Annals of Internal Medicine; 12/18/2018)

Episode Credits


Emily Sterrett, MD, Associate Professor of Pediatrics, Director of Improvement Science, Duke University School of Medicine Department of Pediatrics

Mark Sendak, MD, MPP, Population Health & Data Science Lead, Duke Institute for Health Innovation

Minerva Tantoco, Chief AI Officer, New York University McSilver Institute for Poverty, Policy and Research

Carmel Shachar, JD, MPH, Executive Director, Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics at Harvard Law School

Kathryn Marchesini, JD, Chief Privacy Officer, Office of the National Coordinator for Health Information Technology

Melanie Fontes Rainer, JD, Director, HHS Office for Civil Rights

Ryan Levi, Reporter/Producer, Tradeoffs

The Tradeoffs theme song was composed by Ty Citerman, with additional music this episode from Blue Dot Sessions and Epidemic Sound.

This episode was reported by Ryan Levi, edited by Dan Gorenstein and Cate Cahan, and mixed by Andrew Parrella and Cedric Wilson.

Special thanks to: Suresh Balu, Jordan Everson, Sara Gerke and Jeff Smith.

Additional thanks to: Julia Adler-Milstein, Brett Beaulieu-Jones, Bennett Borden, David Dorr, Malika Fair, Marzyeh Ghassemi, Maia Hightower, John Halamka, Chris Hemphill, John Jackson, Jen King, Elaine Nsoesie, Ziad Obermeyer, Michael Pencina, Yolande Pengetnze, Deb Raji, Juan Rojas, Keo Shaw, Mona Siddiqui, Yusuf Talha Tamer, Ritwik Tewari, Danny Tobey, Alexandra Valladares, David Vidal, Steven Waldren, Anna Zink, James Zou, the Tradeoffs Advisory Board and our stellar staff!