Is AI better than a doctor?

Dec 10

“ChatGPT outperforms human doctors at accurately diagnosing patients.”

This attention-grabbing headline from a recent Futurism article deserves a closer look. Let’s examine the data and what it means for the future of medical diagnosis.

The study design

This was a small but well-structured trial involving 50 clinicians randomized 1:1 to use either the large language model (LLM) ChatGPT or conventional tools like Google. The participating doctors had been in practice for a median of 3 years, specializing in internal medicine, family medicine, or emergency medicine. Notably, 25% of participants had minimal LLM experience (used once or never), while only 16% reported regular weekly usage.

Participants evaluated standardized clinical vignettes within a one-hour timeframe, typically completing 5 cases. They received scores based on both diagnostic accuracy (incorrect, partially correct, or correct) and their proposed management steps.

Surprising results

The primary outcome — diagnostic performance — showed no significant advantage for AI-assisted doctors (76% vs 74% for conventional tools). Similarly, the time spent per case (secondary outcome) remained comparable (8.7 minutes vs 9.4 minutes).

The unexpected finding came when the LLM was tested on its own. Operating without clinician input, it achieved a remarkable 92% accuracy, outperforming both groups of clinicians.

Understanding the implications

This disparity between AI-alone and AI-assisted performance raises important questions. The researchers attributed the LLM’s superior solo performance to receiving more optimized prompts than those used by clinicians, not to any inherent superiority over human medical judgment.

The authors emphasized this point in their conclusions, stating that the “results of this study should not be interpreted to indicate that LLMs should be used for diagnosis autonomously without physician oversight.”

The complexity factor

Another recent UK study examined how accurately ChatGPT answered exam questions in obstetrics and gynecology and found clear limitations, particularly in complex clinical reasoning tasks. The tool achieved 72.5% accuracy on single-answer questions but dropped to 50.4% when faced with more complex scenarios.

It would be interesting to know whether these limitations stem from a lack of training data, which would reflect historical biases in women’s healthcare data collection and research. Such a gap in data quality and representation could explain why AI tools may struggle with complex obstetric and gynecological cases, and it would point to a broader systemic issue in healthcare data.

Looking forward

The rapid adoption of AI in healthcare is clear — a UK study revealed that 20% of GPs are already using generative AI in clinical practice. While this enthusiasm is understandable, it demands thoughtful implementation.

Current AI chatbots excel at processing textbook-style medical knowledge but fail to replicate the nuanced understanding gained from years of clinical experience. More fundamentally, they cannot replace the human connection and empathy that lie at the heart of quality patient care. So, will AI replace doctors in the future? In short, no.

As these tools evolve, their true value will likely emerge as augmentative aids that enhance clinical decision-making rather than replacements for medical professionals. Success lies in striking the right balance: leveraging AI’s powerful pattern recognition capabilities while preserving the essential human elements of healthcare. Looking ahead, our priority must be developing robust frameworks for AI integration that uphold both patient safety and clinical excellence.

***

Alison Doughty

Hello! I'm Alison, and I translate tomorrow's healthcare breakthroughs into today's insights for forward-looking clinicians and healthcare business leaders.

For over two decades, I've operated at the intersection of science, healthcare, and communication, making complex innovations accessible and actionable.

As the author of the Healthy Innovations newsletter, I distil the most impactful advances across medicine, biotechnology, and digital health into clear, strategic insights. From AI-powered diagnostics to revolutionary gene therapies, I spotlight the innovations reshaping healthcare and explain what they mean for you, your business and the wider community.

https://alisondoughty.com
