**Your Apple Watch Knows Your Heart Better Than Your Doctor? Not Quite.**

Alex: So apparently your Apple Watch can now detect heart disease with ninety-nine percent accuracy. Which is brilliant, really. Why bother with cardiologists when you've got a fancy wristband?

Bill: Yeah, I saw that headline everywhere this week. And my first thought was: ninety-nine percent accuracy at *what* exactly?

Alex: Right, because that's the question, isn't it? When something sounds too good to be true in a headline, I've learned to immediately ask what they're not telling me.

Bill: This one's from Yale, presented at the American Heart Association conference in November. The headlines are all variations of "AI-powered smartwatch detects heart disease with 99% accuracy."

Alex: And people are understandably excited. I mean, structural heart disease (we're talking weakened heart muscle, damaged valves, dangerous thickening), that's serious stuff. If your watch could catch that early, that's potentially life-saving.

Bill: Absolutely. And the underlying research is actually pretty sophisticated. They trained an AI algorithm on over 266,000 ECGs from hospital patients, then tested it on 600 people using just the single-lead ECG from an Apple Watch.

Alex: Okay, so far that sounds reasonable. What's the catch?

Bill: The catch is that "99% accuracy" isn't accuracy at all. It's something called negative predictive value, and it's answering a completely different question.

Alex: Go on.

Bill: So when they say 99%, what they actually mean is: if your smartwatch says you *don't* have heart disease, there's a 99% chance that's correct. That sounds great, but here's the thing: that number is massively inflated by how rare the disease is in their study population.

Alex: Hang on. You're saying the stat looks good because the disease is uncommon?

Bill: Exactly. Only 21 people out of 596 in their study actually had structural heart disease. That's about 5%.
When disease prevalence is that low, almost any test with decent performance will give you a really high negative predictive value. It's a mathematical property of screening rare conditions.

Alex: So the 99% is less about the test being brilliant and more about the disease being uncommon.

Bill: Right. And here's what the headlines buried: the test's sensitivity, its ability to actually *catch* disease when it's there, was only 86%.

Alex: Which means it missed...

Bill: Fourteen percent. Roughly one in seven people with actual structural heart disease got a negative result and false reassurance.

Alex: That's not a small problem. If someone's watch says they're fine, they might not bother going to their doctor even if they're having symptoms.

Bill: Exactly. And we're talking about 3 people in this study who had serious heart conditions (weakened pumping, damaged valves), but the algorithm said they were fine.

Alex: So the 99% statistic is technically true but functionally misleading.

Bill: And that's not even the worst part. Let's talk about what happens when the test says you *do* have a problem.

Alex: Oh no.

Bill: The positive predictive value, how often a positive result is actually correct, was only 27%.

Alex: Twenty-seven percent? So if your watch flags you for potential heart disease...

Bill: There's a 73% chance it's wrong. Three out of four positive results are false alarms.

Alex: Which means unnecessary echocardiograms, cardiology appointments, anxiety, cost. I remember when I was covering health stories, we saw this exact pattern with early smartwatch screening for atrial fibrillation. Loads of false positives clogging up the system.

Bill: Same issue. And when you scale this up (imagine millions of people wearing these watches), you're creating a massive downstream problem for healthcare systems.

Alex: But here's what I'm wondering. The researchers at Yale, they must have known these limitations. Did they actually claim this was ready for clinical use?
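
For anyone who wants to check the arithmetic in this part of the conversation, here is a minimal Python sketch. The 596 participants, 21 disease cases and 3 missed cases are the figures quoted in the discussion. The false-positive count of 49 is not stated anywhere in the conversation: it is an assumption, back-solved from the quoted 27% positive predictive value, and the function name `screening_metrics` is purely illustrative. Note that 21 out of 596 works out to roughly 3.5%, a little under the rounded "about 5%" mentioned in conversation; the sketch uses the raw counts.

```python
# Figures quoted in the conversation.
TOTAL = 596            # participants analysed
DISEASED = 21          # confirmed structural heart disease
FALSE_NEGATIVES = 3    # diseased participants the algorithm cleared
TRUE_POSITIVES = DISEASED - FALSE_NEGATIVES  # 18

# ASSUMPTION: not stated in the conversation. Back-solved from the quoted
# 27% PPV (18 true positives out of roughly 67 positive results).
FALSE_POSITIVES = 49
TRUE_NEGATIVES = TOTAL - DISEASED - FALSE_POSITIVES  # 526


def screening_metrics(tp, fp, fn, tn):
    """Return the standard screening metrics for a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "prevalence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),   # share of diseased people flagged
        "specificity": tn / (tn + fp),   # share of healthy people cleared
        "ppv": tp / (tp + fp),           # share of positive results that are right
        "npv": tn / (tn + fn),           # share of negative results that are right
        "accuracy": (tp + tn) / total,   # share of all results that are right
    }


metrics = screening_metrics(
    TRUE_POSITIVES, FALSE_POSITIVES, FALSE_NEGATIVES, TRUE_NEGATIVES
)

# NPV of a useless "test" that tells every single participant they are fine.
always_negative_npv = (TOTAL - DISEASED) / TOTAL

for name, value in metrics.items():
    print(f"{name:12s} {value:.3f}")
print(f"NPV if everyone is told 'negative': {always_negative_npv:.3f}")
```

Under these assumed counts, sensitivity, PPV and NPV round to 0.86, 0.27 and 0.99, in line with the quoted figures. A "test" that simply tells everyone they are fine would already reach an NPV of about 96.5% in this group, which is the sense in which the headline number mostly reflects how rare the disease is.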

Bill: No, and that's what's frustrating. The lead researcher, Dr. Rohan Khera, explicitly said: "On its own, a single-lead ECG is limited; it can't replace a 12-lead ECG test available in health care settings."

Alex: So the researchers were appropriately cautious.

Bill: Completely. They acknowledged the small number of disease cases, the false positive problem, and they even emphasized this was preliminary work presented at a conference: not peer-reviewed, not published in a journal yet.

Alex: And somehow between the conference abstract and the headlines, all of that nuance just... vanished.

Bill: It's the classic translation gap. The American Heart Association press release emphasized the 99% figure. Media outlets ran with it. And suddenly preliminary research became "your Apple Watch can detect heart disease."

Alex: This is exactly the sort of thing that used to drive me mad when I was working in journalism. You'd have a perfectly good study with appropriate caveats, and then the headline would promise a miracle.

Bill: The thing is, the underlying research is actually interesting. Using AI to extract structural information from a single-lead smartwatch ECG is technically impressive. An 86% sensitivity for preliminary work isn't bad.

Alex: So this isn't junk science.

Bill: Not at all. It's solid early-stage research that demonstrates proof-of-concept. The problem is presenting it as if it's clinically validated and ready to replace actual medical testing.

Alex: What about the way they trained the algorithm? Because I imagine a single-lead ECG from a watch you're wearing while walking around is a lot noisier than a hospital ECG.

Bill: They actually thought about that. They added artificial noise to their training data, Gaussian noise, to simulate real-world conditions. That's a smart approach.

Alex: But?

Bill: But simulated noise isn't the same as actual noise from movement, poor contact, electrical interference.
We don't know how well this performs when someone takes a reading while they're stressed, or their wrist is sweaty, or they're in a car.

Alex: And presumably the 600 people in the study were taking their smartwatch ECGs in a controlled setting at Yale.

Bill: Exactly. They were already there for echocardiograms. This was a clinical population, median age 62, already being evaluated for cardiac concerns. That's not representative of the general public wearing Apple Watches.

Alex: So we've got a study with 21 actual disease cases, conducted in a hospital setting, on a population that's already at higher risk, and the technology hasn't been peer-reviewed yet.

Bill: And headlines saying "99% accuracy."

Alex: What's the actual prevalence of structural heart disease in the general population? Because if it's lower than that 5% in the study...

Bill: It is lower. In an actual population-wide screening where prevalence might be 1%, that positive predictive value would be even worse than 27%. You'd have even more false alarms per true case.

Alex: So the problem compounds as you expand the screening.

Bill: Right. And here's the thing: current medical guidelines don't even recommend routine screening for structural heart disease in healthy adults without symptoms or risk factors.

Alex: Because the harm from false positives potentially outweighs the benefit.

Bill: Exactly. This is why screening tests need to be really, really good, and why context matters so much.

Alex: I think what bothers me most is that someone with actual heart disease symptoms might look at their Apple Watch, see a negative result, and think "well, I don't need to see a doctor then."

Bill: That's the false reassurance problem. And with 14% of disease cases being missed, that's a real risk.

Alex: So what should people actually take away from this research?

Bill: I think the real finding is: AI applied to smartwatch ECG data shows promise for future development.
It's a proof-of-concept that deserves further study with larger populations and real-world testing.

Alex: But it is not...

Bill: ...a validated diagnostic tool. It cannot replace clinical evaluation. And a negative result does not rule out heart disease.

Alex: And that 99% figure?

Bill: Is the perfect example of why you need to ask: 99% at *what*? In this case, it's the probability that a negative test is correct, which sounds impressive until you realize it's mostly just telling you the disease is rare.

Alex: When I see health headlines now, especially ones with really precise, impressive-sounding percentages, I've learned to look for what they're not telling you. What was the actual sample size? Who were the participants? What metric are they actually measuring?

Bill: And whether the study has been peer-reviewed. Conference abstracts are an important part of scientific communication, but they're preliminary. They're the start of the conversation, not the end.

Alex: The American Heart Association actually includes a disclaimer on these abstracts saying they're "considered preliminary until published as a full manuscript in a peer-reviewed scientific journal."

Bill: But that disclaimer doesn't make it into the headlines.

Alex: Of course not. "Preliminary research shows modest promise but needs more work" doesn't exactly grab attention.

Bill: Although honestly, as someone who used to work with data, preliminary research that shows promise is actually exciting. It's the beginning of figuring something out.

Alex: But we've turned it into the finish line.

Bill: Yeah. And that does a disservice both to the public, who gets misleading information, and to the researchers, whose work gets hyped beyond what they actually claimed.

Alex: So if you're someone who wears an Apple Watch and you're wondering whether you should trust it for heart disease screening...

Bill: Don't. If you have symptoms, risk factors, or concerns, see an actual doctor.
Get a real ECG, get an echocardiogram if needed. Your smartwatch can't replace that, and the researchers themselves explicitly said so.

Alex: And if you see a headline claiming some new technology is 99% accurate at detecting a serious medical condition...

Bill: Ask what that 99% actually measures. Ask about false negatives and false positives. Ask whether it's been peer-reviewed. And remember that in healthcare, accuracy isn't just a number: it's about real people getting the right diagnosis and the right care.

Alex: The technology will probably get there eventually. This kind of research is how we make progress.

Bill: Absolutely. But we're not there yet, and pretending we are doesn't help anyone.

Alex: Except maybe Apple Watch sales.

Bill: Fair point.
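
The earlier point about prevalence, that the same test produces more false alarms per true case as the disease gets rarer, can be sketched with Bayes' rule. This is a rough illustration under stated assumptions, not the study's own analysis. The 86% sensitivity and the hypothetical 1% prevalence come from the conversation. The specificity of 0.915 is an assumption: it is never quoted, and is simply about what the 27% positive predictive value implies at the study's prevalence. The function name `predictive_values` is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


SENSITIVITY = 0.86    # quoted in the conversation
SPECIFICITY = 0.915   # ASSUMPTION: not quoted, inferred from the 27% PPV

# Prevalence in the study group (21 of 596) versus the hypothetical
# 1% general-population figure mentioned in the conversation.
ppv_study, npv_study = predictive_values(SENSITIVITY, SPECIFICITY, 21 / 596)
ppv_general, npv_general = predictive_values(SENSITIVITY, SPECIFICITY, 0.01)

print(f"study group: PPV {ppv_study:.1%}, NPV {npv_study:.1%}")
print(f"1% prevalence: PPV {ppv_general:.1%}, NPV {npv_general:.1%}")
```

With these inputs the positive predictive value drops from about 27% to around 9% at 1% prevalence, while the negative predictive value climbs even closer to 100%. The headline-friendly number gets better exactly as the test becomes less useful for the people it flags.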