Tradeoffs LIVE! Rooting Out Racial Bias in Health Care AI

February 8, 2024

(L-R) Tradeoffs host Dan Gorenstein, ONC Director Micky Tripathi and Oracle Health Government Services Chief Health Officer James Ellzy onstage in Washington, D.C., at the ONC Annual Meeting on Dec. 15, 2023. (Image courtesy of ONC)

A live conversation recorded onstage in Washington, D.C., between a top federal health care official and a top executive with Oracle Health. They discussed new Biden administration regulations designed to inject transparency into the black box world of health care AI, and how regulators and industry leaders can work together to keep biased algorithms from harming patients.

Scroll down to listen to the full episode, read the transcript and get more information. You can check out our previous in-depth coverage of AI and racial bias here.

If you want more deep dives into health policy research, check out our Research Corner and subscribe to our weekly newsletters.

Dan Gorenstein: Hey, it’s Dan.

We’re bringing you something a little different this week: Tradeoffs LIVE!

On December 15, I spent the morning talking with health care leaders about how the industry can avoid racial bias in artificial intelligence, or AI. We were in Washington, D.C., for the Annual Meeting of ONC — the federal Office of the National Coordinator for Health Information Technology. ONC is one of the federal regulators in charge of creating rules around health care AI, and this meeting brought together about 1,600 leaders from government and industry. I moderated three conversations on stage between federal regulators and top industry executives. The goal was to explore how the two sides — regulators and the companies they regulate — can work together to achieve a shared goal: reduce bias in health care AI. Today, one of those conversations featuring the head of ONC and a top executive at Oracle Health. From the studio at the Leonard Davis Institute at the University of Pennsylvania, I’m Dan Gorenstein. This is Tradeoffs.



********

DG: The conversation you’re about to hear is between Micky Tripathi, the national coordinator for health IT, and James Ellzy, a physician, electronic health record expert and chief health officer for Oracle Health Government Services. Oracle Health is one of the largest electronic medical record companies in the country. We’ve edited down the conversation for clarity. In the first half, Micky explains the problems ONC saw with AI and racial bias and how they’re looking to fix them. We’ll hear from James and audience questions in the second half.

DG: Micky, could you talk with us a little bit about when it comes to racial bias and AI, what is the purview of ONC? Just as a quick introduction.

Micky Tripathi: So we actually started looking at this in the summer of 2021, as we were thinking hard about health equity issues. And the secretary was asking us, what are the health equity issues related to health IT? So we started looking deeper into it, and there was mounting evidence that Ziad Obermeyer and a number of other researchers were starting to uncover about the unintended consequences of relatively naive applications of algorithms. One, there was a provider organization in California that had taken an algorithm for what seemed to be a pretty benign thing, efficiency in scheduling, because, as you know, no-shows are a big deal in health care, right? It costs all of us a lot of money. That’s a very valuable, highly skilled set of people who are left standing without anything to do. And none of us really wants that. We want productivity. And what they did with the algorithm is they would determine, looking across a patient population, who has a higher risk of being a no-show. And for those patients, we’re going to double or triple book the providers. And what ended up happening then, you can imagine, is someone shows up, and if they have transportation insecurity, for example, which a number of people do, they would have a higher likelihood of a much longer waiting period when they showed up. And you can imagine that applying in a wide variety of settings: in inner city settings, in rural settings where people have to travel 50 miles with perhaps an old car that happened to break down twice in the last six months. Now, their bad luck: the algorithm has picked them up as being at risk for not showing up, and all of a sudden they’re waiting three hours in the waiting room, and perhaps they lose their job because, as we know, a lot of people don’t have the flexibility to say, I’m still at the doctor’s office, I’ll be in a little late. Right? That just means, oh, don’t show up tomorrow then. So we’re looking at things like that.
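
[To make the mechanics concrete, here is a minimal sketch, in Python, of the kind of disparity check that surfaces this unintended consequence: given a no-show model's risk scores, ask whether the patients who end up double-booked, and therefore waiting, cluster in one group. The data, column names and 0.6 threshold below are all hypothetical, not from any real deployment.]

```python
import pandas as pd

# Hypothetical appointments: a no-show risk score from the model, plus a
# trait (transportation insecurity) the score may be correlated with.
appointments = pd.DataFrame({
    "no_show_risk":      [0.82, 0.31, 0.75, 0.12, 0.66, 0.28],
    "has_transport_gap": [True, False, True, False, True, False],
})

# The clinic's deployment rule: double-book slots for high-risk patients.
appointments["double_booked"] = appointments["no_show_risk"] > 0.6

# Rate of double-booking per group. A large gap means the "efficiency"
# algorithm is shifting waiting-room time onto one group of patients.
print(appointments.groupby("has_transport_gap")["double_booked"].mean())
```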

DG: And let me jump in. And so you see this kind of algorithm and you’re like, okay, this is intended to try to solve a problem, a meaningful problem that could actually end up helping a lot of people. But the unintended consequence ultimately was what?

MT: The unintended consequence was that communities of people who have less access to resources for a wide variety of reasons are being treated in a differential way that can really negatively affect them in ways that no one had really predicted. And to the credit of that provider organization in California, once they were made aware of it, which I think was through their own studies actually, they quickly shut it down and they tried to figure out different ways of doing it.

DG: Great. So, Micky, you’re seeing these algorithms out in the wild. You’re getting concerned that different people are going to be treated inequitably. And you have responded with a set of regulations which, as most people in this room know, were finalized on Wednesday. When it comes specifically to trying to reduce the amount of racial bias in AI, what is this regulation doing? And you’ve got a slide to sort of walk us through here, I think.

MT: Sure. So what we’ve proposed is nine general categories of information that you can think of as a nutrition label. And we can talk about whether that analogy works or not. But across the industry this has been an emerging concept of saying we ought to have a way of providing some descriptive information about an algorithm, so that the user of that algorithm is better informed about the appropriateness of using it in their particular setting. And so we’ve proposed nine categories, with, I think, 31 different requirements underneath them, and an EHR developer who is certified by ONC would be required to establish transparency for the user of the AI-enabled tools that are in the system. And that transparency means they have to create functionality to allow this data to be documented. And what we hope is that that will create an incentive for higher quality, because provider organizations and others who are the users will say, well, wait a minute: this algorithm has all these fields filled out, and I have a much better sense of whether there are known risks that I’m now aware of. And this algorithm is pretty spotty; they either weren’t able to give me good information or they actually left the fields blank. And you know, we’re not saying that that should tell the provider, use this or don’t use this. We’re not approving these in the way that FDA approves medical devices. What we’re saying is, give that information to the provider organization and let them decide.
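
[A hypothetical sketch of what one entry in such a label might look like as data. The field names below are illustrative only; they are not the actual 31 source attributes enumerated in the ONC rule.]

```python
# Hypothetical "nutrition label" record for one algorithm. Illustrative
# field names, not the source attributes the ONC rule actually requires.
model_card = {
    "intended_use": "Predict 30-day readmission risk for adult inpatients",
    "out_of_scope_uses": "Pediatric patients; emergency department triage",
    "training_population": "2018-2022 discharges from 12 Midwestern hospitals",
    "known_risks_and_biases": None,   # left blank by the developer
    "fairness_evaluation": None,      # left blank by the developer
    "last_validated": "2023-06-01",
}

# Per Tripathi's point, a spotty label is itself a signal: surfacing the
# blank fields lets the provider organization judge quality for itself.
blank = [field for field, value in model_card.items() if value is None]
print(f"{len(blank)} of {len(model_card)} fields left blank: {blank}")
```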

DG: And so you’re asking for something like the nutrition label that you see on the back of a jar of mayonnaise. And you’ve got these 31 source attributes, the categories that need to be included on that AI nutrition label: the fats, the sugars, the calories, etc. And I know one of the key ingredients that ONC thinks is super important in all of this is this idea of fairness, which we all sort of know as a word, but in this context is a little confusing. Can you walk us through what fairness means as it applies to this nutrition label?

MT: Sure. The idea of fairness generally is anything that leads to conclusions based on inherited or acquired traits which really have nothing to do with what you’re trying to accomplish at the end of the day. And so it’s prejudice in some way, shape or form. As you know, my good friend John Halamka likes to say, Mayo Clinic has fantastic algorithms developed on a million blond Lutherans in Rochester, Minnesota. And now I’m in San Juan, Puerto Rico, trying to figure out whether this algorithm applies. In many cases, it may or may not matter. Right? There are no biological differences. The science suggests that it’s OK, but the idea is that you ought to be aware that that’s where it came from. So those are the kinds of things that I think we want to be able to look for and raise and sort of surface. There’s no clear answer to those. What we’re trying to do is say, let’s take a first step here. The industry needs to get a lot more consensus around the best ways of doing this. But let’s start. You have to start with transparency. You have to start with making available the information, so that people can see that there are actually issues here.

DG: And so, going back to this nutrition label, ONC is not, and you’re very clear about this, endorsing one way that companies like James’ should measure fairness, or how it should even be displayed. That is not the role for ONC. Why could it be a problem if different companies use different measures of fairness or display them in different ways? Who cares?

MT: We don’t have industry consensus on what’s the best way of representing these things, which are very complicated. And there are arguably very reasonable variants in the way that you might measure certain things and the way you would present them. And for right now, what we want to say is, let’s leave that a little bit open to see how the industry responds. And then we as an industry can work together on developing more standardized ways of doing that.

DG: Right. So basically, just to sum this up in layperson’s terms: flexibility at the beginning. It’s cutting edge, people don’t know what they don’t know. They need that room. They need that space to be creative, to experiment, because there could be some great, amazing idea. Let’s not try to over-regulate initially, and see what we can get. And as we do that, over time we’ll begin to refine this idea more and more. But we understand that we need the standardization, because at the end of the day, if you have one label that says x, y, z and another that says one, two, three, it’s like, what the hell do you do with that?

MT: Right. If we tried to be that prescriptive right now, I can guarantee you 100%. This isn’t an “it depends.” This is a 100% guarantee from a federal official: We would be precisely wrong. I can guarantee that we would be precisely wrong, because we would be overreaching in a space where there’s still a lot of fluidity. Stuff like this, in my experience with data, with statistics, with visualizations, only gets better through use. It doesn’t get better by someone sitting in a room. Whether you’re the smartest software engineer in the room or the smartest statistician or data scientist in the room, there’s no perfect answer to that. The best way of doing it is getting it out, having people use it and bang away at it, and you start to get convergence around the things that work best.

DG: When we come back, how an industry leader is thinking about measuring and sharing fairness data with clinicians, and how industry and regulators can work together to reduce bias in health care AI.

MIDROLL

DG: Welcome back. Today we’re bringing you a conversation about artificial intelligence in health care between Micky Tripathi, the director of health IT for the Biden administration, and Dr. James Ellzy, chief health officer for Oracle Health Government Services. This conversation happened live on stage in Washington, D.C., in December at the ONC Annual Meeting.

DG: James, let’s bring you in here. So Oracle Health is one of the companies that will have to follow these new regs. And you’re going to have to put together this nutrition label.

MT: You’re welcome. 

DG: Can you walk us through how Oracle Health is thinking about which fairness measures it thinks are most important?

James Ellzy: So we’ve been looking at fairness for a little bit, actually. And it’s looking at the idea of, as we have these datasets and models that people are building inside of our cloud infrastructure, how do we give them the tools to actually look at fairness? And we have something called Oracle Guardian AI, which gives you fairness metrics, looking more on the statistical side. So not trying to address unconscious bias, but really that statistical bias. Whether it be gender, whether it be race, whether it be age: getting the statistical prediction that you think you would, regardless of which group you’re looking at. Just trying to make sure that the predictions are true no matter what variables you’re introducing.
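
[A minimal illustration of the statistical check Ellzy describes: compare a model's positive-prediction rate across groups, a quantity often called statistical or demographic parity. The data are made up, and this is a generic sketch, not Oracle Guardian AI's actual API.]

```python
from collections import defaultdict

# Hypothetical (group, prediction) pairs; 1 = model predicted positive.
predictions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

totals, positives = defaultdict(int), defaultdict(int)
for group, pred in predictions:
    totals[group] += 1
    positives[group] += pred

# Positive-prediction rate per group; roughly equal rates mean the model
# makes the prediction "you think you would" regardless of group, while a
# large gap flags the kind of statistical bias worth investigating.
rates = {g: positives[g] / totals[g] for g in totals}
gap = max(rates.values()) - min(rates.values())
print(rates, f"statistical parity gap = {gap:.2f}")
```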

DG: Right. And how do you think you will display this for end users, these clinicians, these health system execs who are not experts in this, so that it’s easy and accessible and meaningful? Right? Because that’s the whole point.

JE: We’re struggling with that. We have many ideas, but we don’t have anything concrete yet. Because when you talk about, I think it’s box three, that says where you should not use an algorithm. Well, that can be a very lengthy conversation on why you shouldn’t use it. Not just don’t use it, but, well, why shouldn’t I use it? So how do you put that into something simple? It’s tempting to give it a grade or a letter or a number, just to say boom. Or do you do narrative? And I was thinking it was very interesting, because you’re more of a statistician, and we are trying to figure out how to get the statistician PhD to translate something for the MD.

DG: And so great, you’re an MD. What would you want?

JE: Something I can digest in seconds. No, truly. My fellow clinicians in the front. I mean, if I have 20 minutes with a patient, the first 5 or 10 minutes is trying to get the patient, who is new to me, to trust me, to actually tell me what’s going on and to get that history. And then I do my physical exam and look at what’s already been done for that patient. And now I have five minutes left to figure out what we should do going forward. So I don’t have time to go and figure out and read a long narrative of, well, on this population, it was done on a Swedish population in the ’70s, and therefore — I don’t have time for that. I need you to truly tell me: you saw what patient I have, and based on that, you’re giving me a probability of 97% that this applies to your patient. And here’s what you should do.

MT: I would agree with that. And one thing I would add on that: we had made a proposal that this information be available to the end users, essentially like a drill-down. And a lot of feedback we got on that was, don’t do that, because that may not make sense, and it could clutter the screen and it could confuse people. And so what we did is we sort of pulled that back one level and we said it needs to be made available to the customer, to a limited set of users that the customer will determine. So Mayo Clinic can have its governance committee and make it available to that governance committee. They will figure out who are the limited users they want to make it available to, and then they will make decisions about what they’re going to enable for the frontline users. And if they decide, with Oracle or Epic or whoever they’re working with, that we would actually like that to be made available to our end users, then great. That’s up to them. But we’re not making it a requirement.

JE: Thank you.

DG: James, I’m curious. So ONC is looking for some sort of standardization. Based on the conversations you and others at Oracle are having, handicap this for us. How close do you think the industry is to coalescing around an idea of what fairness means, specifically around measuring fairness and displaying fairness?

JE: I think there are kind of two parts of the industry. There is more of the model development side of the industry, and then there’s more the EHR side of the industry. And I think they look at it differently. I think when you look at it from an EHR standpoint, it’s more looking at almost a patient safety lens: Are we causing harm to that individual patient? Versus looking to say, is this model valid across the entire population? I think Oracle is very well positioned, because we cover both of those sides, to really blend those together and help move the industry along. And I came in wanting to say one thing about having more standards up front. But Micky, you’ve turned me around with your earlier comments, that you really did set us up for success, to work, to partner together. To say, while we’re not being prescriptive right now, we are probably going to be in the future, and help us figure out what that looks like, so that we can actually have standards in the way we display this information for the customers.

DG: Well, really, thank you so much, James, for introducing that. Micky, what do you feel like you really need from the Oracles, the Epics, the other companies out there around this? What do you need from the regulated entities to help you pull this off and ensure that there is not racial bias or other kinds of bias being baked into the AI? What do you need from the regulated?

MT: Unwavering compliance. That was a joke. I think it’s really working in partnership with industry, with the Oracles, with the provider organizations, with the experts in the field to say, how do we converge on a set of industry conventions that are going to work for us here? I think that people who aren’t sort of deeply involved in standards often have a misconception that standards come from the top down and then get enforced and everyone uses them. And there are some that happen that way. And I would argue the vast majority of those are terrible and they don’t work. The best standards are the ones that come more from the bottom up: people have problems to solve, and they start getting together and say, I’ve got this problem to solve. And now we’ve developed different ways of standardizing, and then it starts to emerge that, okay, now we’ve just got four variants of something, and now we can take that over the line. And that’s where the federal government can come in and say, all right, we’re going to pick that up and say, all of you agreed basically on four variants of the same thing. Let’s now all agree on this one. Then at that point most people are like, okay, fine, great. I’m already there.

DG: And James what do you need from ONC here so you guys can do this thoughtful, complicated, sometimes seemingly impossible task? What do you think would help you help ONC?

JE: I think definitely the continued dialogue like we had for this rule where you had it out there, you listened to us and our concerns. But also the conversation we’re having today where you said, well, this is where we know we need to go in the next couple of years and let’s partner together and do that.

DG: James, what’s one thing that you want people in this room to know as they go, thinking about some of the challenges and opportunities that exist for companies, your company, but also companies like yours in the industry here?

JE: I think I want people to know that this is not something new, that we have been thinking about bias and fairness for years. We are happy that ONC is now kind of pushing us as an industry to address it, and we’re looking forward to the time when we are all addressing it in a more standardized way.

MT: Thank you.

DG: So many thank yous. Okay, great. We’ve got a couple of questions. Fantastic.

Audience: Hi. Good morning. I’m Nick VanDuyne and I’m from Healthix in New York City. I have a question about where do you see AI handling the non-quantifiable issues, such as the fact that African American women are more predisposed to having a stroke if they’re affected by racism, things that you can’t really quantify in a dataset?

MT: You know, that’s precisely the area that AI can actually be really good at, right? There are pattern detection capabilities with these kinds of advanced tools that allow you to see patterns that are very, very hard to find using traditional mechanisms and means. So with more and more unstructured data being made available (ONC’s information blocking rules, for example, require the availability of all electronic health information), you’ve got deeper notes, deeper narratives, more social determinants of health information to be able to detect those kinds of patterns. As long as you have the guardrails, right? So I mean, on net, we think that from a health equity perspective this can actually be a good thing, even as, right now, we’re identifying a lot of unintended consequences where it could be a bad thing.

JE: I think also looking at the datasets that we do have, and how do we get the right datasets to look at those kinds of questions. So Meharry in Nashville is looking at a T4C project to really get the dataset; maybe you could go look at that. The VA’s got the Million Veteran Program. So: getting the right datasets to be able to look at and answer the questions we’re asking. Something we’re working on with our more rural partners and CommunityWorks users is a learning health network, because many of our datasets come from the mega universities, the large cities. You’re not getting many people from rural America. So if we can go to those smaller facilities, those FQHCs, and get their data and bring it in, you can also apply whatever algorithms you want onto those datasets.

Audience: The question I had is for Micky. I think historically, when we’ve sort of tried to regulate a space, very often the bigger players can engage. They have the cash to engage, but also, because they’re big, they’re often the go-to people to figure out what makes sense. And then naturally, given the incentives, they’re likely to give ideas that work from their point of view. In AI, I think it’s so important, because I think a lot of the ecosystem innovation is going to come from the very vast community of smaller companies and startups that will hopefully be big companies at some point. What’s a way in which they can be engaged more?

MT: Yeah. I think, you know, we’ve already got a variety of ways through our advisory recruiting process, which we try to make sure actually has good representation and isn’t just the big players. But it’s always hard, because the smaller players don’t have the time or the resources to be able to invest in that. But I would just say that through the different collaborative organizations who we work with, through our advisory committee, and we are always open to talking to organizations as well. I mean, Micky-dot-Tripathi-at-HHS-dot-gov. We, you know, absolutely have a team who are on that.

DG: Say that slower.

MT: Micky-dot-Tripathi-at-HHS-dot-gov.

DG: And on that, you mentioned earlier wanting to make sure that patient voices are included, and probably community voices. I mean, that sounds good, but how does that actually happen? I would suspect this is a moment where leadership is required to insist that those voices are at the table. Otherwise it’s going to be impossible.

MT: Yeah. And we do that in our advisory committee. We try to engage as much as possible with the patient community. It’s not dissimilar from the challenge that Suchi was describing with small developers: individuals don’t have time to participate in these as well. So fortunately, we’ve got great patient representatives like Grace and others who can speak on behalf of patients. But we never get enough.

JE: And I dare say that everyone in this room is a patient in some way themselves. So don’t forget to bring that to the table when you’re having conversations.

DG: Okay. Thank you very much. Let’s give it up for James and Micky here.

DG: That was ONC Director Micky Tripathi and Oracle Health Government Services Chief Health Officer James Ellzy talking with me live onstage in December 2023 at the ONC Annual Meeting in Washington, DC. We had two other conversations that morning about AI and health care. One with the FDA and a biotech founder.

Troy Tazbaz: Government cannot regulate this alone, because it’s moving at a pace that requires very, very clear engagement between the public and private sectors.

Suchi Saria: It’s exciting to see what’s possible, but then also sort of the anxiety of like, you know, things being done to us as opposed to with us. 

DG: And another featuring Duke Health and the federal health department’s Office for Civil Rights.

Jenny Ma: If you are a physician who has 25 years of experience and suddenly the AI algorithms are telling you something very different from your medical training, don’t just go with the machines.

Mark Sendak: These regulations have to come with on the ground support for health systems.

DG: You can find links to videos of all three conversations in our show notes or on our website tradeoffs.org. I’m Dan Gorenstein. This is Tradeoffs.

Tradeoffs’ coverage of diagnostic excellence is funded in part by the Gordon and Betty Moore Foundation.

Want more Tradeoffs? Sign up for our weekly newsletter!

Episode Resources

Additional Resources and Coverage of Tradeoffs LIVE:

Full ONC Annual Meeting Recording

Can government and industry solve racial bias in AI? (Andrea Fox, Healthcare IT News, 12/20/2023)

‘A long time in the making’: New ONC rule targets healthcare AI transparency (Emily Olsen, Healthcare Dive, 12/14/2023)

Rooting Out Racial Bias in Health Care AI (Ryan Levi, Tradeoffs, Summer 2023)

Episode Credits

Guests:

Micky Tripathi, PhD, MPP, National Coordinator for Health Information Technology, U.S. Department of Health and Human Services

James Ellzy, MD, Chief Health Officer, Oracle Health Government Services

The Tradeoffs theme song was composed by Ty Citerman.

This episode was produced by Ryan Levi, edited by Dan Gorenstein, and mixed by Andrew Parrella.

Additional thanks to: Cannon Leavelle, Zhan Caplan, Kevin Eike, Troy Tazbaz, Suchi Saria, Greg Thole, Andrew Roberts, Jeff Smith, Kathryn Marchesini, Jordan Everson, Jenny Ma, Daniel Shieh, Rebecca Winkert, Mark Sendak, the Tradeoffs Advisory Board and our stellar staff!
