Welcome to Pulse: Amplify, where we sit down with the leaders and changemakers shaping the future of health.
A recent Nature Medicine study went viral after reporting that ChatGPT Health under-triaged more than half of emergency cases when tested using clinician-written scenarios. The finding raised serious concerns about whether consumer AI tools are safe for medical triage.
But researchers from Macquarie University’s Australian Institute of Health Innovation took a closer look at the study design and suspected the results might reflect the evaluation format rather than the AI’s clinical capability.
In this episode of Pulse: Amplify, Louise and George speak with David Fraile Navarro about their follow-up study testing five frontier AI models across more than a thousand trials. Their research suggests that when AI systems are evaluated using more natural, patient-style interactions rather than exam-style prompts, triage performance improves significantly.
The discussion explores why prompt structure, forced answer formats, and restrictions on clarifying questions can dramatically alter model behaviour, and why designing realistic evaluation methods is essential as millions of people begin using AI for health advice.
The conversation also examines broader questions:
How should AI triage tools be evaluated?
What role should clinicians play in AI-mediated care?
And what do patients need to know before trusting AI with health decisions?
References
Fraile Navarro D, Magrabi F, Coiera E. (2026). Evaluation format, not model capability, drives triage failure in the assessment of consumer Health AI. Zenodo. https://doi.org/10.5281/zenodo.18975048
Connect with David Fraile Navarro:
LinkedIn
Visit Pulse+IT.news to subscribe to breaking digital news, weekly newsletters and a rich treasure trove of archival material. People in the know get their news from Pulse+IT – Your leading voice in digital health news.