**The 64% Illusion: What That Viral Addiction Wearable Story Actually Found**

Alex: So this one landed in my inbox about six different times last week. "Smart patch reduces cravings for alcohol and drugs: wearable technology shown to help addiction recovery with 64% reduction in substance use." From Harvard. Published in one of the most prestigious psychiatry journals on the planet.

Bill: And I'll be honest, when I first saw it, even I went "oh, that's significant." Sixty-four percent is a big number. That's not a rounding error kind of number, that's a headline number.

Alex: Which is exactly why we need to talk about it. Because addiction affects something like one in seven people in the US at some point in their lives, and in the UK the numbers aren't much better. People are desperate for new tools. Their families are desperate.

Bill: Right, and that's what makes this one matter beyond just "oh the media got a stat wrong again." If someone reads "64% reduction in substance use" and decides a wearable gadget is their plan, that's a real decision with real consequences.

Alex: So let's actually look at what happened here. The study is real, published in JAMA Psychiatry in October 2025, out of Massachusetts General Hospital and Harvard Medical School. Researchers took roughly a hundred people in early addiction recovery and split them into two groups.

Bill: 115, actually.

Alex: 115, yes, thank you. Split into two groups. One group got the wearable device plus their normal treatment. The other group just got their normal treatment. And the device uses something called heart rate variability biofeedback, HRVB. Basically, you wear this patch, it monitors your heart rhythm, and it coaches you through breathing exercises designed to regulate your nervous system.

Bill: Which, honestly, makes mechanistic sense. There's decent prior evidence that controlled breathing can reduce stress and anxiety. I get why researchers wanted to test this.

Alex: Yeah.
And the study actually found some real results. Over the eight weeks, the HRVB group showed significant reductions in negative affect and craving. Their stress and urges to use went down. That's a genuine finding.

Bill: Right. But then there's the substance use number. The 64%. And this is where I need to get a bit technical for a second. I promise it's worth it.

Alex: Go on.

Bill: So when you measure outcomes in a study like this, there are two completely different ways to look at the data. There's what researchers call the "between-person" effect: do the two groups look different overall? And then there's the "within-person" effect...

Alex: Which is...

Bill: Did people actually change over the course of the study. Did they move. Did the treatment shift something over time.

Alex: Okay.

Bill: And those are two genuinely different questions. "Are these groups different" versus "did the treatment actually move the needle." The 64% reduction? That's the between-person number. It's comparing the two groups overall.

Alex: Right.

Bill: The within-person analysis, which is the one that actually tells you whether the treatment changed behavior over time, showed no statistically significant change in substance use at all.

Alex: Wait. So the study found the groups were different, but not that the treatment made people use less?

Bill: That's it exactly. And the researchers say this directly in the paper. I'm going to quote this because it matters: "At the within-person level, we did not find study day to be associated with daily alcohol or other drug use." That's the gold standard test for whether a treatment actually worked. And it came back null.

Alex: So where does the 64% actually come from? Because it's plastered across every headline.

Bill: Here's the sleight of hand. The HRVB group started the study with fewer substance use days in the past 90 days than the control group: 8.34 days versus... actually, let me make sure I don't muddle this... 10.50 days.
They were already using less before the intervention began. The 64% largely reflects that starting difference, not a treatment effect.

Alex: The groups weren't identical going in.

Bill: Right. And with only 115 people, you can get that kind of baseline imbalance even with proper randomization. The math just doesn't guarantee balance in a sample that size.

Alex: Okay, but here's where I want to push back on something, because I think there's an even bigger problem than the baseline numbers.

Bill: Yeah?

Alex: The no-placebo issue. There was no control group doing fake biofeedback, or a different breathing exercise, or wearing a dummy patch. Nothing. And I think that's actually more damaging to this study than the baseline imbalance, because it doesn't just undermine the substance use number; it contaminates the craving findings too. The ones you just called real.

Bill: Mm. I hear that, but I'd say the baseline imbalance is the headline killer. That's the thing that directly explains where the 64% came from. The placebo issue is more about the stress and craving data.

Alex: But that's exactly my point. You said the reduction in craving was a genuine finding. But without a placebo group, how do we know that? We know that in psychiatric studies, just believing you're receiving treatment can meaningfully change mood and craving. Those are precisely the measures that improved here. So we can't call them genuine findings. We can't know.

Bill: ...Okay. Yeah. You're right. I was treating the craving data as solid because it's within-person, and the within-person analysis is the better test. But that doesn't protect you from placebo effects. Placebo effects are within-person too.

Alex: The researchers actually say it themselves in the paper. They wrote that "without a placebo group, it cannot be known how expectancy effects may have influenced study outcomes." They flagged it. The press releases just left that part at home.

Bill: No, you've convinced me.
The placebo issue is more fundamental. I came into this thinking the baseline imbalance was the main thing to nail, and the craving finding was the salvageable bit. But if we can't rule out expectancy effects on the craving data, then we can't actually say what this device did or didn't do.

Alex: Right. And this is the thing that used to drive me mad when I was covering health stories. Press releases have this way of taking the authors' own caveats and just... quietly setting them aside.

Bill: Were they always that bad?

Alex: Worse, honestly. At least now journals post plain-language summaries. Back when I started, you were basically just working from whatever the university communications team decided to highlight. Which was never "we can't rule out placebo effects."

Bill: That tracks.

Alex: Anyway, the dose-response point. Because I think that's important and we haven't got there yet.

Bill: Right, yes. Okay, so this is a thing that I find genuinely exciting from a methodology standpoint, and I'll try not to go on too long about it.

Alex: Famous last words.

Bill: The idea is simple. If the device is actually driving the outcome, you'd expect more practice to produce better results. More of the thing, more of the effect. That's called a dose-response relationship. And it wasn't there. No relationship between how much people practiced with the device and how much they reduced their substance use. None.

Alex: That's the bit that would have made me very suspicious, reading this. If the thing works, more of the thing should work more.

Bill: It's one of the standard tests for causality. When I was doing A/B testing at my old job, we used it constantly: if a feature was genuinely driving user behavior, you'd see the effect scale with exposure. If it didn't scale, that was a serious red flag that something else was causing the change. The same logic applies here.
And honestly, this is where being in data science for years helps you see the shape of a real effect versus a spurious one.

Alex: So you'd have red-flagged this immediately.

Bill: I'd have red-flagged this immediately. And actually, there's another layer to the practice data. Let me back up. Only 24% of participants actually hit the daily practice target. The goal was 15 minutes of biofeedback a day, and the average was about 9 minutes. Three-quarters of people weren't meeting the adherence threshold.

Alex: So not only did more practice not predict better outcomes, but most people weren't even doing the practice. And yet somehow the headline is "64% reduction."

Bill: Right. And look, I want to be fair here, because this is where it matters to understand what kind of study this actually is. This was designed as a phase 2 trial. The researchers are explicit about that in the paper.

Alex: Can you break that down? Because I think "phase 2" sounds technical and people's eyes glaze over.

Bill: Sure. Think of it like a scouting report before you commit to signing a player. Phase 1: is it safe? Phase 2: does it show any signal of working, in a small group, enough to justify a bigger test? Phase 3: does it actually work, in a large diverse group, with all the controls in place? The researchers themselves conclude the paper by calling for phase 3 trials. That's the scientific equivalent of saying "we have a lead, not an answer."

Alex: And phase 3 trials typically involve hundreds to thousands of participants, long-term follow-up, proper placebo controls. This study had 115 people, eight weeks, no follow-up at all once the trial ended.

Bill: So we have no idea if the improvements (even the real ones, if we set aside the placebo question for a moment) lasted beyond week eight. We don't know if it's a novelty effect. We don't know if it's just the effect of tracking yourself twice a day with mood surveys, which is a real thing.
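
Bill's dose-response check from earlier in the conversation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every number below is invented, the names `pearson_r`, `minutes_per_day`, `drop_if_dose_matters`, and `drop_if_dose_irrelevant` are hypothetical, and a plain correlation is a stand-in for whatever model the paper actually used.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily practice minutes for six imaginary participants.
minutes_per_day = [3, 6, 9, 12, 15, 18]

# Scenario A (invented): the drop in use days scales with practice time.
drop_if_dose_matters = [0.5 * m for m in minutes_per_day]

# Scenario B (invented): the drop in use days ignores practice time.
drop_if_dose_irrelevant = [5, 3, 4, 4, 3, 5]

r_dose = pearson_r(minutes_per_day, drop_if_dose_matters)
r_flat = pearson_r(minutes_per_day, drop_if_dose_irrelevant)

print(f"dose matters:    r = {r_dose:.2f}")
print(f"dose irrelevant: r = {r_flat:.2f}")
```

In Scenario A the correlation is essentially 1, which is the "more of the thing, more of the effect" pattern. In Scenario B it is essentially 0, which is the shape Bill says the study's practice data had. Real data would be noisier than either toy case.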
Alex: That's actually the bit that stuck with me longest, that. Participants were filling out twice-daily smartphone surveys about their mood, cravings, and substance use. There's decent evidence that monitoring yourself more closely changes your behavior on its own, completely separately from any device. And we have no way to untangle that here.

Bill: The population matters too. Every participant was in their first year of an abstinence-based recovery attempt. That's the highest-motivation, highest-natural-recovery group you can find. Natural recovery rates in the first year of that kind of attempt run somewhere between 20 and 40 percent on their own.

Alex: So the bar the device needs to clear is actually higher than it looks. Because if you pick the most likely-to-succeed group and they succeed...

Bill: That doesn't tell you much about the device specifically. It tells you motivated people in early recovery improve. Which we already knew.

Alex: Okay. So here's what I think the honest version of this story is. There are real findings in this paper, or at least potentially real, with the placebo caveat sitting over all of it. There's also an interesting signal that people in the HRVB group were less likely to actually use after experiencing a craving, which suggests the device might be doing something specifically at the craving-to-use step. That's... actually worth following up.

Bill: That's the scientifically honest version. "We found a preliminary signal that this device might help with stress and craving management in highly motivated early-recovery participants. We need a much bigger, better-controlled study to know if it actually reduces substance use." That's the real paper.

Alex: Instead of "Harvard patch cuts addiction by 64%." Which is what got shared tens of thousands of times.
Bill: The practical takeaway, if you have addiction in your life or you're in recovery yourself: this is not evidence to swap out therapy, medication, or recovery support for a wearable device. Full stop. The researchers didn't claim that. The journal didn't claim that. The press releases claimed that.

Alex: And the frustrating thing is that the researchers seem to have done the work honestly. They acknowledged the limitations, they called it phase 2, they called for more rigorous trials. The gap is between what the paper says and what went into the headline.

Bill: That gap is exactly what gets people hurt. Because someone shares that headline to a family member who's struggling, and suddenly "have you tried this patch" becomes the conversation instead of "have you spoken to your doctor about treatment options."

Alex: But what about people who might read this and think, well, surely it can't hurt to try it alongside proper treatment?

Bill: That's not necessarily wrong. If someone wants to add breathing exercises to their recovery toolkit, the mechanism is plausible, there's no safety concern flagged in this study. The problem is specifically when it replaces evidence-based treatment, or when the 64% number gives someone false confidence that they've found the answer.

Alex: Right. The headline doesn't say "promising lead." It says "shown to help." That's the word doing the damage: "shown."

Bill: Exactly.

Alex: So, what to watch for next time you see a health headline with a big percentage: ask whether it's a relative reduction or absolute, ask whether it was measured within the same people over time or just compared two different groups, and ask what phase the trial was. If the answer is phase 2, the honest translation is "promising lead, not proven treatment."

Bill: And if there's no placebo group in a study measuring mood and craving, apply extra skepticism. Those are among the measures most likely to respond to expectancy effects.
The researchers even told you so. You just have to read that far.

Alex: The device might turn out to be genuinely useful. We simply don't know yet. But "might turn out to be useful after a much bigger study" and "64% reduction confirmed" are very different things. And only one of them is in the actual paper.

Bill: The study is real. The researchers are credible. The journal is prestigious. None of those things tell you whether a finding is clinically meaningful or ready for practice. That's what phase 2 versus phase 3 means, and that distinction is the whole story.
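
Two of the checklist questions above, relative versus absolute and between-person versus within-person, can be made concrete with a small Python sketch. Everything in it is invented for illustration: the weekly use-day counts, the group names, and the helper `slope_over_time` are hypothetical and are not taken from the paper. The per-person slope is a simplified stand-in for the kind of within-person test Bill quotes, not the study's actual model.

```python
from statistics import mean


def slope_over_time(weekly_counts):
    """Least-squares slope of one person's counts against week index."""
    weeks = range(len(weekly_counts))
    w_bar, c_bar = mean(weeks), mean(weekly_counts)
    num = sum((w - w_bar) * (c - c_bar) for w, c in zip(weeks, weekly_counts))
    den = sum((w - w_bar) ** 2 for w in weeks)
    return num / den


# Invented use days per week over eight weeks; one row per imaginary person.
device_group = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 2, 1, 1, 1, 1, 2, 0],
]
control_group = [
    [2, 2, 2, 2, 2, 2, 2, 2],
    [1, 3, 2, 2, 2, 2, 3, 1],
]

# Between-person view: compare the two groups' overall averages.
device_avg = mean(mean(person) for person in device_group)
control_avg = mean(mean(person) for person in control_group)
absolute_gap = control_avg - device_avg
relative_gap = absolute_gap / control_avg

# Within-person view: did any individual trend up or down over the weeks?
slopes = [slope_over_time(p) for p in device_group + control_group]

print(f"relative gap between groups: {relative_gap:.0%}")
print(f"absolute gap between groups: {absolute_gap:.1f} use days per week")
print(f"within-person slopes: {slopes}")
```

In this toy data the device group averages half the use days of the control group, a 50% relative gap that amounts to one use day per week in absolute terms, yet every individual's slope over time is zero. A big between-group percentage and no within-person change can sit side by side, which is the pattern the episode describes.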