Physician and New Yorker writer Dhruv Khullar says artificial intelligence is a powerful tool for getting quicker, more accurate diagnoses. But it can also be dangerous.
Dhruv Khullar became a doctor to solve medical mysteries.
“I loved the idea that you could talk to someone, understand what is ailing them, and then put that information together in your mind along with lab tests and imaging studies to come up with a reason, a diagnosis that might put them on a path to feeling better,” Khullar told us in a recent interview.
Khullar, a physician and professor at Weill Cornell Medicine in New York, recently noticed his colleagues and his patients adding a new component to the diagnosis mix: artificial intelligence. One survey found that 1 in 6 American adults use AI chatbots at least once a month to get health information, and another survey showed that 2 in 3 physicians are using AI in their work.
Khullar — who is also a contributing writer at The New Yorker — was initially skeptical about how useful AI could be as a diagnostic tool. But what he learned through his research for a story changed his mind.
Here are three key lessons Khullar shared with us that he thinks every doctor, nurse and patient should know to safely get the most out of AI in health care:
- Some AI diagnostic tools are really good. One model in development at Harvard Medical School can accurately diagnose complex cases in minutes and explain step-by-step how it arrived at its conclusion — in a manner and voice that seems remarkably human. “If you were just listening to the presentation, it was hard to distinguish [the AI] from many of the human doctors that I had heard during my own medical training,” Khullar said after seeing a demonstration.
- If we rely too much on AI, doctors could get worse at their job, and patients could get hurt. “I think there’s this notion now that all a doctor has to be is an empathic human, because everything else can be looked up or given to you by an AI. And that is not at all true,” Khullar said, noting that some of the more widely available AI tools, like ChatGPT, often provide inaccurate or incomplete medical advice.
- Khullar recommends clinicians and patients treat AI as a guide — not a god — when it comes to making a diagnosis. As a doctor, he often looks to AI for a second opinion once he’s arrived at a diagnosis, to check if he’s missed anything. And he thinks patients could benefit from AI by using it to prepare for an office visit — asking it to look over their medical records and suggest questions to ask their clinician.
Listen to our full conversation with Khullar above, or read the transcript below. We’ve included audio of that human-sounding AI “doctor” Khullar met, as it reasons through a diagnosis. And you’ll hear a very human Dhruv Khullar answer questions from Tradeoffs listeners.
Episode Transcript and Resources
Episode Transcript
Dan Gorenstein (DG): One of my favorite things to do is riding my bike. But I started noticing recently that my leg muscles would get really stiff in the days after one of my long rides.
So I did what a lot of people have started doing: I asked ChatGPT for help.
Dan (computerized voice): What steps should I take to avoid muscle stiffness after a long bike ride?
ChatGPT: To avoid muscle stiffness after a long bike ride, here are the key steps you should take before, during, and after your ride.
DG: It told me to stretch beforehand, drink lots of water, and eat a protein- and carb-heavy meal 30-60 minutes after I finished.
ChatGPT: Would you like a simple post-ride stretching routine to follow?
DG: I didn’t. But this whole experience got me thinking that I was actually comfortable turning to artificial intelligence for at least this kind of health care advice, something that would’ve felt unthinkable to me even a year ago.
DG: A 2024 survey found that 1 in 6 American adults use AI chatbots at least once a month to get health information.
Physicians are increasingly using AI too.
But both groups still have a lot of questions about how reliable the technology is, especially when it comes to making a diagnosis.
Dhruv Khullar (DK): It’s the most important challenge because everything else hinges on having the right diagnosis.
DG: Today, a physician and New Yorker contributing writer shares three things every patient and doctor should know to get the most out of AI — safely.
From the studio at the Leonard Davis Institute at the University of Pennsylvania, I’m Dan Gorenstein. This is Tradeoffs.
******
DK: My name is Dhruv Khullar. I’m a physician at Weill Cornell Medicine in New York. I’m also an associate professor of health policy and economics.
DG: Last month, Dhruv published an article in The New Yorker called “If A.I. Can Diagnose Patients, What Are Doctors For?”
I started our conversation by asking Dhruv why he became a doctor.
DK: Well, my dad’s a doctor, so that’s a high risk factor for becoming a physician. But independent of that, I loved the idea that you could talk to someone, understand what is ailing them, and then put that information together in your mind along with lab tests and imaging studies to come up with a reason, a diagnosis that might put them on a path to feeling better.
DG: There’s a line, Dhruv, that you wrote early in your piece that really got me. You said misdiagnosis disables hundreds of thousands of people each year, and that autopsy studies suggest that misdiagnosis contributes to perhaps one in every ten deaths in the U.S. — 10% of all deaths.
Can you share an example, an anecdote that helps explain why in 2025, misdiagnosis continues to be such a problem?
DK: Diagnosis is an incredibly complex challenge. So at its core, it’s an exercise in trying to match what a patient is feeling to the thousands of ways that the human body can fail. And sometimes a patient’s symptoms line up perfectly with the textbook version of disease, and sometimes they don’t. So I want to start by saying that it’s a really hard challenge.
I had a patient a couple months ago that came in with shortness of breath. We thought that maybe the patient has heart failure. So we started to give them diuretics, medications to remove fluid.
That didn’t seem to be working. And we took a look at the X-ray and it seemed like maybe there was a pneumonia hiding behind the heart. And so we started antibiotics, and even still, the patient’s symptoms got worse.
Throughout this entire experience, there’s not one test that can give you the correct answer. But you try treatments almost as a diagnostic test. If they get better with some of these remedies, then you know that they have the diagnosis that you thought they did.
DG: You were telling me about this patient before we started recording … and you said it took five days before you figured out what was actually causing their breathing issues.
DK: Yeah, in the end, they had a completely different diagnosis, which was interstitial lung disease that was making it difficult for oxygen to enter their bloodstream. And we ended up at the correct diagnosis. But that was only after several false starts. And that type of thing is quite common, actually, in the hospital.
DG: So as your story, as your anecdote about this patient with the breathing problems illustrates, diagnosing someone sort of sounds easy, but oftentimes can be really, really hard. And it’s this process of trial and error.
Here comes AI and one of the promises of AI is that it can help improve this misdiagnosis problem that we have. Before you started reporting this story for The New Yorker, how did you think about AI as a diagnostic tool? What assumptions did you have?
DK: What I assumed at the beginning of my reporting was that AI would be good at generating a list of potential diagnoses, so that if a patient came in with shortness of breath, it might offer five, six, seven potential diagnoses, some general sense of which ones appear to be most likely, but not much more than that.
DG: So based on your reporting and your experience as a clinician, you’ve come up with three big ideas that you think every patient and doctor should know in order to get the most out of AI safely. Can you hit us with the first one, please, Dhruv?
DK: The first big idea is the technology is just really, really good.
DK: Particularly the models that have been trained in a particular way are able to deliver the right diagnosis on very complex diagnostic challenges. And so I want people to understand that these technologies are powerful. They will play a larger role in medical care in the years to come. And the real challenge now is to figure out how to integrate them into clinical practice, how doctors can do that and how patients can do that.
DG: I mean, listening to you, I can tell how almost despite yourself, you were really impressed with this AI as a diagnostic tool. And that came through most clearly in the story when you went to Harvard this summer and you saw an AI model called CaBot do a sort of diagnosis showdown against one of the country’s best human diagnosticians, a physician named Daniel Restrepo.
You actually sent us a recording of the AI’s presentation.
CaBot: Good morning, everyone. Thanks for joining this case conference. I’m Doctor CaBot.
DG: You describe how Restrepo and CaBot were both given a series of symptoms for a 41-year-old man who’d come to the hospital. And then they were asked to “show their work,” essentially walk through how they arrived at their diagnosis. Dhruv, what was that like watching CaBot in action?
DK: Yeah, what was so remarkable about this experience was that not only did CaBot arrive at the correct answer, but it was able to cite the literature. It was able to walk the audience through how it arrived at its diagnosis.
CaBot: The features scream perilymphatic distribution, which to me, um, really narrows the list.
DK: It did all this in the style and the cadence and the humor that a human physician might do.
CaBot: No exotic exposures, just life in urban New England with a cat that scratched him six months ago. Which, you know, I keep in the back of my mind, but I’m not married to it.
DK: You know, if you were just listening to the presentation, it was hard to distinguish from many of the human doctors that I had heard during my own medical training. And I write in the piece that I’ve been, you know, kind of skeptical that AI could reproduce the cognitive work of doctors. I always thought, you know, how could it do what we’re doing?
But after this presentation, I flipped the question, which was how could it not? How could it not be a bigger and bigger part of medical care, given that it can perform as effectively as it did?
CaBot: Thanks everyone for your attention. Happy to take questions.
DG: I laughed out loud when you wrote the line that not only did CaBot do all this impressively, you know, citing the literature, sounding kind of like a human, replicating human cognition. But you say, quote, it created that in the time it takes me to brew a cup of coffee.
DK: That’s right.
DG: Dude, that’s like a kind of holy shit moment, isn’t it?
DK: Oh, absolutely. Absolutely. You know, when I was telling you about a misdiagnosis earlier with the patient with shortness of breath in my own practice, you know, it’s very possible that if CaBot had been next to me, it would have picked up a very subtle imaging finding that was on the CT scan and may have gotten us to that diagnosis of interstitial lung disease sooner.
DG: So CaBot sounds incredible, but you also write a lot in the article about the risks of replacing a doctor’s diagnosis with AI. The second idea that you’re bringing to us touches on the downside. What is that second thing, Dhruv?
DK: So the second part of this is that if we rely too much on AI, doctors might get worse at what they do and patients could actually get hurt.
DG: For doctors you called this risk “cognitive de-skilling,” a situation where doctors become so reliant on AI that they forget how to make complex diagnoses on their own. Or maybe they never even learn in the first place. I think the story of Benjamin Popokh illustrated this really nicely. Can you tell us about Benjamin?
DK: Sure. So Benjamin, he’s a medical student, and he told me that for a while after these models came out, every time he stepped out of a patient’s room, he would basically put the patient’s labs and symptoms, what they had told him, into the AI, which would then generate a diagnosis for him.
And over time, he grew concerned that he wasn’t thinking about how to go through a case by himself. And he started to feel strange about presenting his thoughts knowing that they were actually just the AI’s output.
You know, there’s already evidence that doctors can get de-skilled pretty quickly. And at the end of the day, we need to be the people that are making the most consequential decisions for patients.
DG: I mean, Adam Rodman, one of the co-creators of CaBot, the AI diagnostic model, told you that, quote, we’re screwed if this cognitive de-skilling happens. How worried are you, Dhruv, that doctors will effectively lose their ability to do the job?
DK: Well, my hope is that, you know, these tools get integrated responsibly and slowly into medical practice such that, you know, we are able to figure out how to make the most of them without losing the skills that we need to ultimately adjudicate their output.
DG: I mean, Dhruv, you know as well as I do, some healthcare places struggle to get doctors and nurses to wash their hands still. And so I really want to get back to this, like, how confident are you that we’re going to get all of this right? That we’re going to implement those best practices?
DK: Yeah. You know what? What gives me pause is that these tools are so pervasive. I think there’s definitely going to be growing pains. I think both on the side of people who are not using AI enough, and people who are overrelying on AI.
But medicine has undergone technological revolutions in the past. It takes time to sort out the wrinkles and the challenges. But I’m confident that on the other side of this, patients are going to get better care and doctors are going to have a more satisfying type of work that they’re doing.
DG: We’ve been talking a lot about physicians, clinicians here. But for patients, the risk of becoming too reliant on AI is also high. One study showed that ChatGPT got two-thirds of open-ended medical questions wrong. In another study, the popular chatbot misdiagnosed more than 80% of complex pediatric cases.
Dhruv, is AI just the newest flavor of “Dr. Google,” or is this a significantly more dangerous tool?
DK: So the main difference here is that AI is so much more fluent and so much more persuasive and so much more personalized than Dr. Google ever was. So in the past, if you went to Google, you could see how well the symptoms you were experiencing lined up with a list of diseases on some website.
But now you can input your medical data. It can respond to you directly. You can have a conversation with it. And that’s much more powerful if it’s used well, but it’s also much more dangerous if it’s used poorly.
And so, you know, in the piece, I talk about the case of the 60-year-old man who was concerned about how much salt he was eating in his diet, and so he asked for substitutes from ChatGPT.
And the AI suggested something called bromide. And bromide can be toxic. And in his case, he ended up having paranoia. He started hallucinating. He went to the hospital. And that’s the type of thing that would never happen in an interaction with a real doctor. And it speaks to the limitations of AI.
DG: When we come back, Dhruv explains how he thinks patients and doctors can best use AI right now and he answers your questions about this rapidly evolving technology.
BREAK
DG: Welcome back. We’re talking with Dhruv Khullar, a physician and researcher at Weill Cornell Medicine in New York and a contributing writer for The New Yorker.
Based on one of his recent pieces in the New Yorker, Dhruv is walking us through three big things he thinks could help doctors, nurses and patients get the most out of AI – safely.
So Dhruv, you’ve told us about the power of AI to do the work of world-class diagnosticians in minutes, and potentially save lives in the process. You’ve told us how becoming too reliant on AI could leave our doctors less capable and patients more at risk. The obvious next question to me is how do we strike the right balance?
DK: I think the main thing you want to think about is using AI as a guide, a way to navigate through a diagnosis or through the medical system, as opposed to just something that’s going to magically produce the ultimate result.
This is a concept that I got from a doctor at UCSF named Gurpreet Dhaliwal, who told me that, you know, if you ask AI to solve the case, you’re kind of starting with the end, you’re starting with the destination that you’re trying to get to.
What’s going to be more helpful for doctors and probably for patients as well, is to ask for help with wayfinding, you know, going with you along the diagnostic journey. And so AI in this way might alert us to a recent informative study that we weren’t aware of, or might propose the next test that we should be thinking about.
And for patients in particular, I think it’s a nice way to prepare for your interactions with the doctor, by asking it to maybe look over your medical records, asking it to explain what’s going on with your medical care in more detail and over more time. And then using that to make the most of your interactions, when you actually see the physician in front of you.
DG: This idea of “guide, not God” is compelling. And I know Dhaliwal, like, I interviewed him before, and I really respect him a lot and think he’s a pretty wise person. But like, what everybody wants to know is what’s wrong with me, right? Whether it’s the doctor or the patient, the impulse is to get to diagnosis.
DK: Well, we all want the final diagnosis. Of course we all want the answer. The challenge is that AI at this stage is not good enough to reliably give that to us every time in a way that I think can be or should be used at scale. And so we need to think about it in the same way that you might think of a lab test, or a CT scan. It is one input that goes into the equation of what’s actually wrong with someone at the end of the day.
And so thinking about it as another piece of data that we’re incorporating into your story to get to the final answer, I think that’s the best way to think about it right now.
DG: All right. Before we wrap up, we want to do a bit of a lightning round with some questions that we got from our listeners. The first one dives a little deeper on the conversation from earlier about AI potentially making doctors worse.
Tiffany Vaughn: My name is Tiffany Vaughn, and I’m an internal medicine resident at Yale. How should we integrate artificial intelligence into medical training such that it enhances and doesn’t undermine the development of clinical knowledge and reasoning skills for early career physicians? Thank you so much.
DK: It’s a great question. I think this is the fundamental question for medical educators. I think a lot of the basics need to remain the same. I think people still need to carry around a ton of information in their minds. I think there’s this kind of notion now that all a doctor has to be is an empathic human, because everything else can be looked up or now given to you by an AI, and that is not at all true.
And we need to train not just medical students and residents, but also attending physicians on how they should be using these tools.
DG: Next question gets to the heart of the parts of medicine that feel particularly human and tough to replicate for AI.
Bhav Jain: My name is Bhav Jain, and I’m a medical student at Stanford. I’m very interested to understand from your perspective whether patients will ever truly feel human connection from AI. How do we implement AI such that it mimics human-led care?
DK: I think that AI is going to play a larger and larger role in the transactional forms of health care: I just need this prescription. I want a quick check-in on my ankle sprain.
But for the really important aspects of health, I think there’s always going to be a central role for humans, as people who manage uncertainty, who make judgments, who integrate values. Ultimately, as someone who’s taking responsibility for the care of another person. I don’t see those things being displaced by AI.
DG: Eric Maurer, who works at a community health center in Minnesota, has a question about whether a type of AI known as large language models, or LLMs, could make care more accessible.
Eric Maurer: Last year patients spoke over 50 languages and dialects in our clinic. I’m curious what Dr. Khullar sees as the opportunities and potential barriers for using LLMs to reduce written and spoken language barriers in health care.
DK: I think this is a huge potential advance. So already we see that we can do real-time translation in the latest version of the AirPods. And this is only going to get better and better. And there’s a lot of evidence that suggests that people who don’t speak English, they have worse access to care, they’re less likely to understand the nuances of their treatment.
In some cases, providers may avoid caring for them because it takes more time, or they may not spend the necessary time to go over diagnosis and treatment options. So I think this is a potentially transformative advance for people who don’t have English proficiency.
DG: Last audience question comes from a colleague of yours, I believe, at Cornell, David Scales.
David Scales: What policy mechanisms are coming out to ensure AI has accountability for when it egregiously fails, like with AI-induced psychosis, suicides, or other times when it suggests behaviors that lead to harm?
DK: The FDA and other regulatory bodies are still trying to figure out how to regulate these technologies. And so I think a lot of the issues that we’re discussing around liability, at least, will be adjudicated by the courts. Some of this may occur through self-regulation on the part of the companies.
OpenAI, for instance, has recently rolled out parental controls after some adolescents have had very negative experiences. And so I think this is going to be an area that receives a lot of attention in the coming years, as we try to figure out how to regulate AI such that it’s safe, but we don’t diminish the potential for innovation.
DG: How are you different, Dhruv, now, as a doctor, because of the reporting that you’ve done on AI?
DK: In some ways I’m more humble. You know, it’s breathtaking to see how good some of these models are at doing the type of work that we spent years, in some cases, decades, trying to become proficient at. But tactically, I use AI as a second opinion to pressure test the conclusions that I’ve arrived at, to try to broaden my thinking of what’s potentially going on.
I’ve been thinking a lot about how I prompt the AI, how I give it the right details, to get the best answer to the question that I’m trying to understand. And so I think we all need to be thinking about how we can use these technologies in our own work to enhance what we’re doing.
DG: In some ways, it makes me think that communication has become even more critical, and that tomorrow’s best diagnosticians will be expert communicators: one, they will be listening very closely to their patients, and two, they’ll be able to turn that around and share it with the AI tool to get the best possible information.
DK: That’s right. And I think there’s a third layer of communication here, which is talking to patients about what the AI is doing and how you’re using the AI to support the work that you’re doing. I think we need to do a lot to engender trust in these models, if they’re going to play a larger and larger role in health care.
DG: Dhruv, thanks so much for taking the time to talk to us on Tradeoffs.
DK: This was great. Thanks so much for having me.
DG: I’m Dan Gorenstein, this is Tradeoffs.
Episode Resources
Additional Reporting and Resources on Artificial Intelligence in Health Care:
- If A.I. Can Diagnose Patients, What Are Doctors For? (Dhruv Khullar, The New Yorker, 9/22/2025)
- 2025: The State of AI in Healthcare (Greg Yap, Derek Xiao, Johnny Hu, Ph.D., JP Sanday and Croom Beatty; Menlo Ventures; 10/21/2025)
- An AI System With Detailed Diagnostic Reasoning Makes Its Case (Catherine Caruso, Harvard Medical School, 10/8/2025)
- OpenAI launches parental controls in ChatGPT after California teen’s suicide (Reuters, 9/29/2025)
- Are A.I. Tools Making Doctors Worse at Their Jobs? (Teddy Rosenbluth, New York Times, 8/28/2025)
- Educational Strategies for Clinical Supervision of Artificial Intelligence Use (Raja-Elie E. Abdulnour, Brian Gin and Christy K. Boscardin; New England Journal of Medicine; 8/20/2025)
- Lots of Hospitals Are Using AI. Few Are Testing For Bias (Ryan Levi and Dan Gorenstein, Tradeoffs, 2/27/2025)
- AI Chatbots as Health Information Sources (Irving Washington and Hagere Yilma, KFF, 8/22/2024)
- Rooting Out Racial Bias in Health Care AI (Ryan Levi, Tradeoffs, May 2023)
Episode Credits
Guests:
- Dhruv Khullar, Physician, Weill Cornell Medical College; Contributing Writer, The New Yorker
This episode was produced by Ryan Levi, edited by Dan Gorenstein and Deborah Franklin, and mixed by Andrew Parrella and Cedric Wilson.
The Tradeoffs theme song was composed by Ty Citerman. Additional music this episode from Blue Dot Sessions and Epidemic Sound.
Tradeoffs reporting for this story was supported, in part, by the Gordon and Betty Moore Foundation.
