In the Interim...

On the latest episode of "In the Interim…", Dr. Scott Berry provides an empirical examination of two recent JAMA trials: TRACK (low-dose rivaroxaban in advanced kidney disease) and VICTORY (IV vitamin C in severe burn injury). The TRACK trial lacked any pre-specified futility criteria, with a DSMB-initiated stop based on conditional power calculations. Scott argues that conditional power, especially in this interim context, is a poor, misleading tool—contrasting it against a Bayesian predictive probability calculation that produced a much lower and more realistic estimate of success. In VICTORY, a pre-specified risk ratio threshold for futility was incorporated, with simulation confirming minimal effect on bias and statistical power. Scott underscores the practical and ethical importance of rigorously pre-specified, simulation-based futility rules and operationalizes the case for Bayesian predictive probability as a decision metric in interim monitoring. He reiterates that responsibility for defining futility belongs to trial designers, not left to ad hoc DSMB judgment, and calls for precise statistical planning in adaptive trial protocols.

Key Highlights

TRACK: No pre-specified futility rule; DSMB stopped for futility using conditional power post hoc.
Technical critique of conditional power as misguided at interim, supporting Bayesian predictive probability instead.
VICTORY: Pre-specified futility threshold, with simulation confirming minimal operational bias and power reduction.
Emphasizes pre-specified, simulation-based futility planning and predictive probability monitoring as standards for all trials.

For more, visit us at https://www.berryconsultants.com/

Creators and Guests

Host

Scott Berry

President and a Senior Statistical Scientist at Berry Consultants, LLC

What is In the Interim...?

A podcast on statistical science and clinical trials.

Explore the intricacies of Bayesian statistics and adaptive clinical trials. Uncover methods that push beyond conventional paradigms, ushering in data-driven insights that enhance trial outcomes while ensuring safety and efficacy. Join us as we dive into complex medical challenges and regulatory landscapes, offering innovative solutions tailored for pharma pioneers. Featuring expertise from industry leaders, each episode is crafted to provide clarity, foster debate, and challenge mainstream perspectives, ensuring you remain at the forefront of clinical trial excellence.

Judith: Welcome to Berry's In the
Interim podcast, where we explore the

cutting edge of innovative clinical
trial design for the pharmaceutical and

medical industries, and so much more.

Let's dive in.

Well, welcome everybody
back to "In The Interim."

I'm your host, Scott Berry.

And for today's topic, I feel a
little bit like, uh, years ago,

I was on faculty at Texas A&M.

I was assistant professor of statistics
at Texas A&M, and, uh, just starting

at A&M, I had come from graduate
school, came from Carnegie Mellon.

And, uh, and I don't know if they
still do it, but Texas A&M had a, at

the time, a policy that every, um,
graduate student, everybody pursuing a

dissertation, would get a random faculty
member on their dissertation committee.

So if you were a statistics graduate
student doing a dissertation, you might

get a faculty member from, uh, meat
sciences or, uh, from something to be

on your dissertation committee, a voting
member of the dissertation committee.

And so when I got there, I was assigned to
a student's committee, and it was when I

walked in, it was sort of like, "Oh, no."

The, the-- generally, throughout
all of the graduate programs, the

worst thing that could happen to you
was getting a statistician randomly

assigned to your dissertation committee.

So I think the first one I did was
forestry and crop sciences, and

they're actually doing experiments.

It was, it was fascinating.

And the second one I think was,
was wildlife and fishery science.

It was a fascinating project where they
were following wolves, and they were

tracking them, and they were reintroducing
wolves into the wild, and they were

trying to figure out their behaviors.

It was dynamic data.

It was absolutely fascinating, but they,
they were fearful of a statistician

walking in like, "Oh no, I now have
a statistician on my committee."

Now what, what does that have
to do with today's topic?

Today's topic, uh, I don't think it's
the same sort of scenario because

statisticians are rampant in the clinical
trial, uh, uh, clinical trial world.

But today I am gonna talk about a
look at a recent issue of JAMA.

So a wonderful thing about clinical trial
design is you can go see the results,

the published results of other trials and
see those trials' design, see how they

carry out the analyses, the interpretation
of those, and of course, JAMA, New

England Journal, Lancet, all of these
provide this really nice opportunity.

So I make sure whenever possible to
open up these trials, to check out the

design, see what they say, what's being
done in the world of trial design,

what, what would I have done differently
in them, and do I agree with the

interpretation from the medical journal?

So I get, I get to jump in a
statistician review, and I'm gonna

talk about them on this show.

So I always-- I, I've
done, I've done one other.

You can go back to the episode where I,
I sort of randomly fell upon a really,

really interesting trial of using, um,
iron supplements in heart failure where

the, the patients are deficient in iron.

Fascinating, fascinating trial.

So you should go back to that episode.

These trials, uh, I'm gonna look at two
trials from read-- The email comes in

from JAMA and I, I click on them and, you
know, these are, these are interesting

enough I, I, I wanna talk about these.

Okay, so a little bit is doing this, doing
this a bit publicly, and I hope people

don't say, "Oh no, a, a statistician is
reviewing my trials," uh, in this setting.

By the way, an interesting thing, one
of the m- more interesting things I

got to do when I was at Texas A&M, I
was a columnist for Chance Magazine.

It's a publication of the ASA.

And each quarter I got to write
an article that was a statistician

reads the sports pages.

So I got to go in and find
topics that were coming out,

and I did this for 10 years.

It was a quarterly magazine, so I wrote
40, uh, articles about reading the sports

pages, things that come up in sports, uh,
statistician's interpretation of them.

So this is a little bit like that.

This is a statistician reads JAMA and, and
you know, what, what, what is, what, what

do I look at when I read these trials?

Okay, probably a bit of a bias in
the trials that, that I look at, but

this is a re- recent, uh, episode
of what showed up in my mail.

So the first trial was published
June 4th, 2026, and it's called the

TRACK Randomized Clinical Trial.

It is a trial of, and the title of it is
"Low-Dose Rivaroxaban in Cardiovascular

Events in Advanced Kidney Disease."

So a little bit about the trial, and
I, I am not a clinician, so I, I'm,

I'm reading the article of this.

So rivaroxaban is an anticoagulant, and
apparently anticoagulants in some

form of kidney disease with aspirin have
demonstrated good clinical outcomes.

So this is a trial looking at low-dose
rivaroxaban in that looking at does it

alter cardiovascular outcomes for patients
with advanced chronic kidney disease. Now

these are, uh, CKD, chronic kidney disease
stage four or five, including patients

on dialysis, and they're randomized
to, uh, a rivaroxaban low dose, uh,

or not in the trial, and the primary
outcome is a composite of cardiovascular

death, non-fatal MI, stroke, uh,

or peripheral artery disease event.

And the primary safety outcome is
major bleeding, which is typical

of trials of anticoagulation.

The risk, of course, are
bleeding events, uh, in that.

So that's the trial, the, the major part
of what the trial is trying to address.

Now, a little bit about the trial design.

It's an event-driven trial.

It, it's a little bit…

I, I found it a little bit
confusing in the trial design.

It's an event-driven trial, but it
talks about, uh, uh, the trial is

designed to enroll nineteen hundred
participants over three years with

an expected follow-up of five years.

So the nineteen hundred patients would
provide statistical power to detect a

reduction, so a hazard ratio of point
seven five, would have ninety percent

power of, to detect a difference.

They're gonna do a log-rank test.

They're gonna look at the
hazard ratio in a Cox model

for, um, the, the cardiovascular
events, time to first event.

Now it does-- And then it says
this, the nineteen hundred is a

bit, they're gonna enroll that, but
they're gonna wait for the events.

It's an event-driven trial, five
hundred and fifteen primary, uh, outcome

events, assuming a log-rank test.

Uh, ninety percent power for a
point seven five hazard ratio.

And it also says the number of
events provides eighty percent

pow-power for a twenty-two percent
risk reduction, point seven eight.

It's, we'll come back to that
interesting point seven eight has

eighty percent power, uh, in the trial.
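
As a rough check on the design numbers just quoted, the event count and the power can be approximated with Schoenfeld's formula for a log-rank test. This is only a sketch, not the trial's own calculation: it assumes 1:1 randomization and a two-sided 5% test, and the function names are made up for illustration.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def events_needed(hazard_ratio, power, alpha=0.05):
    """Schoenfeld approximation: events for a two-sided log-rank test, 1:1 allocation."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return 4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2

def power_at(hazard_ratio, events, alpha=0.05):
    """Approximate power of the log-rank test given a total number of events."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(abs(log(hazard_ratio)) * sqrt(events) / 2 - z_alpha)

events_075 = events_needed(0.75, 0.90)
power_078 = power_at(0.78, 515)
print(f"events for HR 0.75 at 90% power: {events_075:.0f}")
print(f"power for HR 0.78 with 515 events: {power_078:.2f}")
```

Under these assumptions the first figure lands a little above five hundred events and the second close to eighty percent, which is in line with the two power statements read out above.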

Now, the first thing I, I try to
diagnose from the trial was this

adaptive i-in any way in the trial,
or is it a trial and we're gonna

wait five years and look at the data?

Well, it says two formal interim
analyses are planned to assess efficacy

when approximately one-third, which is
pretty early in event-driven trials,

and two-thirds of the primary endpoint
outcomes have been observed. They

use Haybittle-Peto stopping rules.

It says, uh, using three
standard deviations at either

of the first two interims.

So that's a, a, a nominal
alpha of, of 0.27%,

um, uh, within that.

So the, uh, you know, the typical two
point five percent, this is, this is 0.27%,

uh, which, as it describes, has
minimal impact on the nominal type one

error rate by the end of the trial.

So four point eight two
percent is the alpha.

So, uh, uh, at the end of the trial,
so it pre- preserves a good bit of it.
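
The nominal alpha attached to a three-standard-deviation boundary is a one-line normal tail calculation. The sketch below assumes the boundary is applied to a standard normal test statistic; it shows both the two-sided value mentioned here and its one-sided half.

```python
from statistics import NormalDist

z_boundary = 3.0  # interim boundary, in standard deviations
one_sided = 1 - NormalDist().cdf(z_boundary)
two_sided = 2 * one_sided
print(f"one-sided nominal alpha: {one_sided:.4%}")
print(f"two-sided nominal alpha: {two_sided:.4%}")
```

The two-sided value is the roughly 0.27% figure, which is why so little of the overall 5% is spent at each interim.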

Now, it says nothing about
futility analyses in here.

It says a detailed DSMB charter will
be developed before starting the study

in consultation with DSMB members.

That's essentially the extent of it.

I looked through the SAP, uh,
that which was a, uh, which was an

appendix to this, a supplement to it.

Looked through the SAP, and this is
really the wording it gives there.

I think in hindsight, the answer is there
were no futility analyses in the trial.

So it's a superiority trial for
rivaroxaban on cardiovascular events.

Now, what happened in the trial?

So it says, and this is in the, in the
paper, in the report, on July 8th, 2025,

during a pre-specified review of interim
data, and it's kind of interesting

because the, the-- it says two hundred and
fifty-four primary outcome events, which

is fifty percent of the planned events.

Fourteen hundred and
thirty-two patients randomized.

Remember they said nineteen hundred,
so fourteen hundred and thirty-two

randomized, and it, it's at two
hundred and fifty-four events.

They said that they're gonna
do a superiority at a third of

events and two-thirds of events.

So it says on July 8th, uh, fifty percent
of the events, two hundred and fifty-four.

The Data and Safety Monitoring Board
recommended early termination of the trial

due to concerns regarding net harm and a low
probability of demonstrating efficacy.

Okay, so the DSMB made this
recommendation, and then it says

the recommendation was not based
on pre-specified stopping rules.

So from all my read, there were
no pre-specified futility, but the

DSMB makes this recommendation.

Net harm and low probability
of demonstrating efficacy.

And then it says it
gives additional detail.

It says, "Rather, the decision
was based on a post hoc analysis

estimating a conditional power of
16% for a hazard ratio of 0.78,

indicating a low likelihood of
demonstrating efficacy even if

the trial had continued until the
planned 515 primary outcome events."

Okay, so a lot to unpack here.

So no pre-specified rules.

They do a post-hoc.

I think with no pre-specified
rules, everything is post-hoc.

So I, I, I found that wording somewhat
weird, uh, within that setting

as though that's super relevant.

But…

And they do conditional power,
and they say a low likelihood

of demonstrating efficacy.

So first of all, what
is conditional power?

16% probability of demonstrating
s- uh, superiority at the final

analysis, and it says 16%.

Now, conditional power is given the
current data at the interim at 254 events.

It's a calculation of what's the
probability of seeing superiority

at the end of the trial.

Now, conditional power assumes a
single value for the hazard ratio,

and it can be done in many ways.

Sometimes conditional power looks at the
current estimate of the hazard ratio.

Sometimes it uses a preset
hazard ratio like it did here.

So it assumes a hazard ratio of 0.78

for the rest of the trial, uh, with
variability of the rest of the data.

Suppose the truth is 0.78.

What's the probability that at
515 events we see superiority?

That's the calculation
that's made, and that's 16%.

My first reaction is that's not that low.

That's, that, that, that likelihood,
I d- I don't find that low at all.

Uh, it was interesting.

My, my son, Cooper, who I've talked
about on this show before, he's, um,

he was in seventh grade, and we h-
would not let him play tackle football.

And he came to us with an argument that he
should be allowed to play tackle football.

And as part of his argument, he
found, uh, uh, data that said the

likelihood of a neurological event
for a kid playing tackle football,

a serious injury was only 13%.

And he presented that in his
PowerPoint presentation to us.

Yes, we have a weird family, um, in
this which, uh, that was only 13%, so he

should be allowed to play tackle football.

And our reaction was, "Oh
my God, 13% is enormous."

You know, the 13% chance of, of serious
injury in that setting is very large.

So 16% probability of demonstrating
superiority by the end of the

trial doesn't strike me as that
low, actually, that unlikely.

Now, there's multiple parts to this.

It uses this .78

to make this calculation.

Now, I went back and figured out if
the, uh, you can find the, the formula

for conditional power, uh, in this.

There's a beautiful normal
approximation to the log hazard ratio.

If you assume .78

and the conditional power for 16% by
the end of the trial, at the time, it

means that the hazard ratio was 1.03.

So it was worse than one at the

time, and the, the data was doing worse.

The 95% credible interval for that 1.03

hazard ratio at the time of 254
events goes down to about .81.

So somebody's calculating this conditional
power for what's the chance this trial

is successful, and they're using .78.

The 95% confidence
interval goes down to 0.81

as the lower bound of that.

So 0.78

is not even in the 95% confidence interval
of when they're making this calculation.

It's a very, very unlikely value.
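
The back-calculation described above can be sketched with the normal approximation to the log hazard ratio. Everything beyond the quoted numbers (254 interim events, 515 planned events, an assumed hazard ratio of 0.78, conditional power of 16%) is an assumption here: 1:1 randomization so that information is roughly events divided by four, a final critical value of 1.96, and illustrative function names.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_CRIT = 1.96                    # assumed final critical value (two-sided 5%)
D_INTERIM, D_FINAL = 254, 515    # events at the interim and planned at the end

def conditional_power(hr_observed, hr_assumed):
    """P(final Z >= Z_CRIT | interim data) if future data follow hr_assumed."""
    info_now = D_INTERIM / 4                 # information ~ events / 4
    info_future = (D_FINAL - D_INTERIM) / 4
    info_final = D_FINAL / 4
    score_now = -log(hr_observed) * info_now     # positive favors treatment
    drift = -log(hr_assumed) * info_future       # expected future score
    shortfall = Z_CRIT * sqrt(info_final) - score_now - drift
    return 1 - STD_NORMAL.cdf(shortfall / sqrt(info_future))

def solve_observed_hr(target_cp, hr_assumed, lo=0.5, hi=2.0):
    """Bisection: conditional power falls as the observed hazard ratio rises."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if conditional_power(mid, hr_assumed) > target_cp:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

hr_interim = solve_observed_hr(0.16, 0.78)
lower_95 = exp(log(hr_interim) - 1.96 * sqrt(4 / D_INTERIM))
print(f"implied interim HR: {hr_interim:.2f}")
print(f"lower 95% bound:    {lower_95:.2f}")
```

With these inputs the implied interim hazard ratio comes out slightly above one and the lower bound sits above 0.78, consistent with the point that the assumed value falls outside the interval.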

So what is the relevance
of this calculation?

It's a super odd calculation, and I
think it's, it provides no information

for the DSMB to make the decision.

It's not a low number.

You sh- probably shouldn't be
stopping trials for futility at 16%.

It's not that unusual of a number.

But it's calculated using-- Remember
when we talked about the way they

powered the trial, they said it
has 80% power, assuming 0.78.

So halfway through, through the
trial, if you still assume that 0.78,

it's gone from 80% to 16%.

Doesn't seem that super odd at the time.

I, I-- That I feel like it's not
a very informative number at all.

You won't be surprised to find out I'm
not really a fan of conditional power.

Now, conditional power assumes
you know the answer and what's the

probability that that, that happens.

And it uses a number that's incredibly
unlikely given the data we've observed.

And so it's, it's, it's
a bad analysis in that.

Now, what would I do?

I would do a Bayesian
predictive probability.

I would use the current posterior
distribution given 254 events

with a current hazard ratio of 1.03.

What is the posterior
distribution of the hazard ratio?

And then given that estimate of the
hazard ratio, what's the probability the

trial's going to demonstrate superiority?

Okay, I did that with this data,
and it comes out to be about 1.4%.

So given the current data, the, and
the current posterior distribution

of the hazard ratio at the interim,
integrating over that, the probability

of seeing statistical significance
at the end of the trial is 1.4%.

Now, I think that's a
really, really useful number.

And by the way, it's a number we use.

It's, it's a calculation we use very
frequently in s- for stopping for futility

because it incorporates the uncertainty
at the interim, the variability of

the data to come of demonstrating
superiority or the goal of the trial.

It's, it's a beautiful number.

And you can use, you can
use prior distributions for

that that are informative.

You could use fairly weak priors.

Uh, that, that, uh, analysis prior that
goes into that, that's a word that the

FDA draft guidance used, the analysis
prior that goes in to calculate that

could be based on, uh, an optimist's view.

And when an optimist thinks it's
unlikely, you could stop the trial.

Now, if your optimist believes 0.78

with no variability, that's
the conditional power.

That's equivalent to what they calculated.

You have a prior probability
that the hazard ratio is 0.78

with zero variability.

You think the chance of
winning this trial is 16%.

I don't think that optimist would stop the
trial, but I think it's, it, it's not the

right calculation at that particular time.
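
A minimal sketch of the predictive probability idea, and of how a prior with no variability collapses it to conditional power. The prior and exact interim estimate behind the figure quoted in the episode are not stated, so the choices here are assumptions: a normal prior on the negative log hazard ratio, an interim hazard ratio of 1.03, information of events divided by four, and a final critical value of 1.96. The function name is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_CRIT = 1.96                    # assumed final critical value
D_INTERIM, D_FINAL = 254, 515
HR_INTERIM = 1.03                # interim estimate discussed in the episode

def predictive_probability(prior_mean, prior_sd):
    """P(final Z >= Z_CRIT), averaging over a normal prior on theta = -log(HR)."""
    info_now = D_INTERIM / 4
    info_future = (D_FINAL - D_INTERIM) / 4
    info_final = D_FINAL / 4
    theta_hat = -log(HR_INTERIM)
    # Conjugate normal update of the prior with the interim estimate.
    post_precision = 1 / prior_sd ** 2 + info_now
    post_mean = (prior_mean / prior_sd ** 2 + theta_hat * info_now) / post_precision
    post_var = 1 / post_precision
    # Future score: sampling noise plus remaining uncertainty about theta.
    future_mean = post_mean * info_future
    future_var = info_future + info_future ** 2 * post_var
    shortfall = Z_CRIT * sqrt(info_final) - theta_hat * info_now - future_mean
    return 1 - STD_NORMAL.cdf(shortfall / sqrt(future_var))

pp_vague = predictive_probability(prior_mean=0.0, prior_sd=100.0)
pp_point_mass = predictive_probability(prior_mean=-log(0.78), prior_sd=1e-6)
print(f"vague prior:            {pp_vague:.3f}")
print(f"point mass at HR 0.78:  {pp_point_mass:.3f}")
```

The vague prior gives a value on the order of one percent, the same neighborhood as the number quoted above, while the near point mass at 0.78 gives back roughly the 16% conditional power. The exact predictive probability depends on the prior and on rounding of the interim estimate.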

Okay, so a lot went into that.

Now, the other component to it is that
at the time, they saw increased major

bleeding, which was the safety endpoint,
which is what anticoagulants do.

They n- you know the bleeding's gonna
be higher, but you're hoping it's offset

by reduction in cardiovascular events.

Now, the final data, I don't know what
the data were at the interim on bleeding.

The final data on bleeding was for
the rivaroxaban arm, it was 8.8%

major bleed.

That was 64 events, uh, on that
arm, 44 events on placebo for a

s- a 6% rate of major bleeding.

So that went from six to 8.8,

which surely went into the DSMB's
decision to recommend stopping the trial.

The probability of showing superiority,
and I think they used an awkward number

for that, but with this elevated rate.

Now, we know major
bleeding's gonna be higher.

That's what the drug does.

But the perception of this 2.8%

increase with a hazard ratio of 1.3

at the time had them recommend
stopping for futility.

Now, the final result of the
trial was a hazard ratio on

cardiovascular events that was 1.09.

There was additional follow-up.

They, they, um, at this July event,
they stopped the trial a month

later, so there were deliberations
on this, but they stopped it.

There was additional follow-up,
probably additional events

going through adjudication.

With the final events, it ended
with a hazard ratio of 1.09

in that, um, 13, uh, uh, events
per 100 patient-years on rivaroxaban, 11.8

on placebo.

It was doing worse than placebo, uh,
at this particular time, and stopped

with 1,463 patients randomized, about
400-some patients fewer

than what they planned to enroll in it.

Okay.

So overall, what does this mean?

So I think the DSMB got it right.

Um, knowing what I know of the results,
uh, I would have recommended stopping.

But I think the problem is, I don't think
that's the group to make this decision.

Uh, the thing that bothers me here
is there were no prospective futility

rules. So now you turn over this trial.

It is an incredible amount of effort by
a ton of people, 1400 patients agreeing

to enter this trial, and there's no rules
for stopping the trial for futility.

You leave it up to smart
people on the DSMB.

But those are really hard
decisions for a DSMB.

I think this is something that the
PIs in the trial should define.

They should set up
what, what is futility?

What is the be- should it be a
balance of, of bleeding to, to that?

Are they happy that the trial was
stopped at this particular time?

Took about a month before they,
they eventually stopped it.

Did they review the data?

Did they agree with this?

Uh, I feel like it has
to be part of the design.

It's frustrating as a statistician
to read trials like this that

don't have futility rules set up.

It's a five-year trial with
nineteen hundred patients.

Cardiovascular death is part of
this endpoint, major bleeding.

So I think the DSMB did right,
but they're in a really hard spot

to make a decision for futility.

Is sixteen percent right?

Is that conditional power
number the right thing in it?

Now, this is a trial looking at a readily
available treatment, so this is kind of

comparative effectiveness, if you will.

I don't think this is gonna result--
Maybe it results in some level of,

of guideline saying, "Oh, patients
with CKD stage four or five should

take low-dose rivaroxaban," if the
data were in a particular situation.

But it's not regulatory approval.

It's not a sponsor going after this.

I don't, I don't believe,
uh, in the setting.

Um, but especially in sponsor-designed
trials, the sponsor should make

the decision about futility.

When is enough is enough?

When should we stop the trial?

It's a business decision.

It's a money decision.

There's an ethical component to
it about asking patients to be

randomized into a trial with a very
low likelihood of demonstrating a,

a, a result that changes practice.

And, uh, so are they contributing to
science if the probability of success

is one point four four percent?

All these questions should go
into the design of the trial, not

humans that have to sit around and
make this decision post-hoc in it.

The last part about it is
I, I think conditional power

is just a bad tool for that.

It's a bad tool.

Integrate over the uncertainty of what you
know at the time, reflective that point

seven eight was very, very unlikely, so
sixteen percent is just the wrong measure.

Whether it's big or small, it- it's
the wrong measure. So I'd love to

see them use a much more realistic,
relevant calculation in that.

Okay, so that was article number
one that shows up in my email box.

Uh, article number two is the
VICTORY randomized clinical trial.

Love the names.

And this one caught my
attention right off the bat.

This is the first one I looked at.

It was-- This came out June 10th,
published online in JAMA, June 10th, 2026.

And this is the VICTORY trial, and
it is high-dose intravenous vitamin

C, uh, studying mortality and organ
dysfunction in severe burn injury.

I don't, I don't know much.

We've done a little bit in burn, but
I don't, I don't know much about it.

Um, uh, the d- the, the syndrome,
uh, within it, you can imagine

the clinical syndrome of this.

Uh, but I don't know much about
treatment in this or endpoints.

So, but it struck me was
the vitamin C part of it.

Why did vitamin C strike me?

Well, um, in, in our REMAP-CAP trial, we
investigated, um, REMAP-CAP and LOVIT.

LOVIT was a trial investigating
vitamin C in sepsis at the time,

and COVID, the pandemic broke out.

The two trials combined their
data together to investigate and

randomize patients with COVID and
investigating does high-dose vitamin

C improve outcomes in severe COVID.

And our trial came out with--
and the endpoint was organ

support-free days, in the back of
my mind, I remember this result.

It came out, it estimated essentially
harm on the ordinal endpoint of mortality

and then organ support-free days, it came
out with an adjusted odds ratio of .88,

less than one is harm, with a ninety-one
point four percent probability of harm.

It was doing worse on mortality, uh, so
on survival was fifty-seven percent in

the vitamin C group and sixty percent
on the non-vitamin C, the control.

So it did worse.

The trial was stopped for futility,
uh, uh, in, uh, in that case.

So I knew this vitamin C
did not do well in COVID.

And then I saw that
LOVIT read out in sepsis.

And so this, the paper came
out, um, uh, came out in two

thousand twenty-two reporting the
results of vitamin C compared to

placebo in patients with sepsis.

And the, um, the primary endpoint
in that trial was persistent organ

dysfunction, uh, or mortality.

So they died or I believe at
day twenty-eight they still had

persistent organ dysfunction.

They couldn't get off organ
support, a bad outcome.

It was a dichotomous outcome of that.

And in the vitamin C group, the
proportion of patients that met that

condition, died or organ dysfunction
was forty-four point five percent.

In placebo, it was thirty-eight
point five percent.

So a six percent increase in that primary
outcome in sepsis, uh, statistically

significant, uh, harm in that trial.

So it sort of-- I, I remembered
vitamin C was not doing well,

uh, in, in those two trials.

So here vitamin C is being given
to patients with severe burn injury.

So I said, "Oh, I've, I've
got to look at, see this.

Does vitamin C work here?"

So the trial design, one-to-one
randomized high-dose vitamin

C given by IV versus placebo.

The primary outcome was a composite
of 28-day mortality, so clearly severe

burn, and persistent organ dysfunction,
uh, defined as dependence on mechanical

ventilation, kidney replacement,
or needing vasopressors or inotropes.

So, uh, still, uh, needing, you know,
low blood pressure vasopressors.

Uh, you'd have that support at 28 days.

So this, uh, this composite outcome
of bad things at day 28 is the

primary endpoint, similar to the LOVIT
trial actually, for the Burn trial.

So the total sample size was planned
at 333 patients per group.

So 666-patient trial, which
would provide 78% power, uh, for

demonstrating a decrease in that
outcome from 27% for placebo to 18%.

So that, that reduction,
it would be 78% powered.

So still relatively rare,
27% on placebo was thought to

be the rate of this outcome.
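
The stated power can be roughly reproduced with a normal approximation for comparing two proportions. This is a sketch under assumptions: a two-sided 5% test, no interim looks, an unadjusted comparison, and a made-up function name.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_power(p_control, p_treat, n_per_arm, alpha=0.05):
    """Normal-approximation power for a two-sided test of two proportions."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p_control + p_treat) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt((p_control * (1 - p_control) + p_treat * (1 - p_treat)) / n_per_arm)
    return std_normal.cdf((abs(p_control - p_treat) - z_alpha * se_null) / se_alt)

power = two_proportion_power(0.27, 0.18, 333)
print(f"approximate power: {power:.2f}")
```

The result is close to eighty percent, in line with the high-seventies power described for 333 patients per group.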

My first thought as a statistician,
I'm just jumping in, is when I

read this is, boy, there's got
to be a better endpoint here.

A dichotomous outcome of these bad
events, but there's got to be more

to this to, to look at this outcome
where you learn something from the

73% of patients that aren't dead
or aren't in this really bad state.

Now, death is worse than being on
vasopressors on day 28 and, and likewise.

So I don't like the endpoint.

I'd like to see something better.

Now, that isn't so much the point of
this, but as a statistician reads through

this, I cringe a little bit at that.

Now, it says, while allowing for two
interim analyses for futility/safety.

The, and it says, "The operating
characteristics of the trial under

the pre-specified futility rule."

So they have a pre-specified futility
rule here in this vitamin C trial, maybe

triggered by the poor results
of these other trials, the harm,

the reasonable likelihood of harm.

In one case, 92%, the other one was
statistically significantly harmful.

So the pre-specified futility rule is an
adjusted risk ratio of greater than 1.1.

So less than one is
good, favors vitamin C.

So at these two pre-specified interims,
if they saw a relative risk above 1.1

favoring control, it-- they would
s- they would recommend futility.

That was the pre-specified rule, and
these w- these interims were to take

place at one-third and two-thirds of
the events that if it was above 1.1,

they would stop for futility.

And it says it was determined
through simulation.

So in the SAP, they actually go
through and they look at the operating

characteristics of this futility
rule, and they talk about, uh,

it reduces power from 79 to 78%.

Futility rules reduce power because
every once in a while, the trial

w- hits those futility rules,
depending on what it is, and that

would've gone on to be successful.

Depending on that rule, you
can see reduction in power.

So they, they report what it is,
and it says minimal effect on bias,

which, um, uh, futility rules,
uh, any stopping rule has bias.

Bias is not bad, and in this
case, this bias is not bad.

So they say it's minimal, so that's good.

But they investigated, they created
a pre-specified futility rule in this

trial as opposed to the TRACK trial
where there were no, uh, futility rules.
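
The kind of simulation described above can be sketched in a few lines. This is not the trial's SAP simulation; it assumes an unadjusted risk ratio in place of the adjusted one, interims at 222 and 444 patients split evenly between arms, a simple two-proportion z-test at the end, and illustrative names throughout.

```python
import random
from math import sqrt

N_PER_ARM = 333
INTERIM_PER_ARM = (111, 222)   # 222 and 444 patients in total
RR_FUTILITY = 1.1
Z_CRIT = 1.96                  # assumed two-sided 5% final test

def simulate_trial(p_control, p_treat, rng):
    """Return (stopped_for_futility, final_success_if_never_stopped)."""
    control = [rng.random() < p_control for _ in range(N_PER_ARM)]
    treat = [rng.random() < p_treat for _ in range(N_PER_ARM)]
    stopped = False
    for n in INTERIM_PER_ARM:
        events_c, events_t = sum(control[:n]), sum(treat[:n])
        # Equal arm sizes, so the risk ratio is the ratio of event counts.
        if events_t > RR_FUTILITY * events_c:
            stopped = True
            break
    rate_c, rate_t = sum(control) / N_PER_ARM, sum(treat) / N_PER_ARM
    pooled = (rate_c + rate_t) / 2
    se = sqrt(2 * pooled * (1 - pooled) / N_PER_ARM)
    success = se > 0 and (rate_c - rate_t) / se > Z_CRIT
    return stopped, success

def operating_characteristics(p_control, p_treat, n_sims=2000, seed=1):
    rng = random.Random(seed)
    results = [simulate_trial(p_control, p_treat, rng) for _ in range(n_sims)]
    power_no_rule = sum(success for _, success in results) / n_sims
    power_with_rule = sum(success and not stopped for stopped, success in results) / n_sims
    stop_rate = sum(stopped for stopped, _ in results) / n_sims
    return power_no_rule, power_with_rule, stop_rate

power_no_rule, power_with_rule, stop_alt = operating_characteristics(0.27, 0.18)
_, _, stop_null = operating_characteristics(0.27, 0.27)
print(f"power without rule: {power_no_rule:.3f}, with rule: {power_with_rule:.3f}")
print(f"futility stops if drug works: {stop_alt:.3f}, if no effect: {stop_null:.3f}")
```

Because the same simulated trials are scored with and without the rule, the power with the rule can never exceed the power without it. The gap is the price of the rule, and the stop rate under no effect is what it buys.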

Now, it doesn't mention anything about
efficacy, so it looks like they did two

futility rules and, and not for efficacy.

Fantastic, uh, in the trial.

Now, I might throw out a little bit
of, you know, superiority at the

time, but, uh, maybe they wanted
enough data to demonstrate and, and,

and change practice in that. Okay.

So two interim analyses were pre-planned
at 222 patients, 444 patients of the

666 total sample size, uh, in it.

Uh, by the way, I wonder, again,
statistician reading this, they

report the sample size in the
trial as 333 per group, and then

later they talk about the interims
happening at 222 total and 444 total.

Did they not wanna write
the sample size of 666?

I throw that out there as a possibility.

So they re- said it was 333 per group.

Um, the rule is 1.1,

and so what happened in the trial?

This is where they report that.

This threshold was crossed at the first
interim analysis, prompting the DSMB

to recommend termination of the trial.

By the way, I think that's a, a very
reasonable thing to do, that you have

pre-specified rules, but you have a human
group that's looking at the data and they

say, "Yes, we think it's appropriate."

But you've predefined what that is
in the setting. So predetermined

rules hit the trigger.

DSMB said, "Yes, you should
stop the trial," and the trial

was stopped in this setting.

Okay, what were the data?

The data were on this endpoint
were, it was at 120 patients

randomized on vitamin C,

41%.

Remember they talked about
making 27% on control go to 18.

There were…

I, I, I'll do placebo first.

30% of placebo patients met this
endpoint, very close to the 27 that

they said, which was a, a pretty
good design estimate of that.

It was 41% on vitamin C,
an increase from 30 to 41%.

The Fisher exact test
p-value at the time was .08,

of course, on the way to
harm, um, uh, in that setting.

Uh, I don't know why you give two-sided
p-values in a superiority trial, but

we, we, we, we can interpret that, uh,
Fisher, Fisher exact p-value, of course,

going the wrong way in the setting.

The mortality rate on
placebo was 8%, so 9 of 118.

It was 15%, 18 out of 120 on…

I'm sorry, I, I may have said that wrong.

On placebo it was 8%, nine of 118.

On vitamin C it was 15%, so it went
from 8 to 15%, almost a doubling of the

mortality rate, uh, within the setting.

And it hit the, the relative risk,
the final relative risk was 1.38,

which hit the 1.1

and the recommendation, recommendation
to stop, uh, for futility.

Okay.

So overall within this, uh, and
by the way, the, the result, I,

I, I love reading the conclusion.

So I imagine, uh, clinicians don't have
time to read all these, these results,

but they wanna get the highlights of this.

And that's why I think this little
cartoon they do in JAMA, and I think

New England and other journals do that,
is a summary of the trial results.

And many times I disagree
with the conclusion.

I strikingly disagree with it.

I think it's, it's bad.

But here it says the conclusion is
high-dose intravenous vitamin C did not

reduce mortality or organ dysfunction.

So again, that's, it, you know,
didn't reach statistical significance

and may be associated with harm in
patients with severe burn injury.

I think that's pretty
reasonable in this setting.

It wasn't statistically
significantly harmful.

You know, if we
do a Bayesian interpretation

of this p-value, there's a
96% chance of harm.
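
One way to read the 96% figure: half of the two-sided p-value of .08 is roughly a one-sided p-value of .04, and under a flat prior that is close to the posterior probability that vitamin C is not worse than placebo. The sketch below checks this with a small Monte Carlo calculation. The Beta(1, 1) priors are my assumption, as are the counts 49 of 120 and 35 of 118, which are back-calculated from the reported 41% and 30%. This is not the trial's own analysis.

```python
import random


def prob_treatment_worse(x_trt, n_trt, x_ctl, n_ctl, draws=100_000, seed=1):
    """Monte Carlo estimate of P(rate_trt > rate_ctl | data), Beta(1, 1) priors."""
    rng = random.Random(seed)
    worse = 0
    for _ in range(draws):
        # Posterior of each arm's event rate is Beta(1 + events, 1 + non-events).
        p_trt = rng.betavariate(1 + x_trt, 1 + n_trt - x_trt)
        p_ctl = rng.betavariate(1 + x_ctl, 1 + n_ctl - x_ctl)
        worse += p_trt > p_ctl
    return worse / draws


# Shortcut from the reported p-value: one minus half of the two-sided value.
two_sided_p = 0.08
approx_from_p_value = 1 - two_sided_p / 2

# Direct posterior calculation with ASSUMED counts (49/120 vs 35/118).
posterior_harm = prob_treatment_worse(49, 120, 35, 118)

print(f"from the p-value:  {approx_from_p_value:.2f}")
print(f"from the posterior: {posterior_harm:.3f}")
```

The two numbers should be close but not identical, since the Fisher exact p-value and a flat-prior posterior are different calculations.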

So I would've been upset had they
just said, "Did not reduce mortality

or organ dysfunction."

I think there's reasonable
evidence that it is harmful.

And of course, in the results I've seen
with vitamin C, it seems to be

harmful for people with severe disease.

This is a third result now where it
does worse and seems to be harmful.

So it says may be
associated with harm.

I thought that was a very reasonable conclusion.

Okay.

So I liked this.

I liked this, that they put
this futility rule in there.

By the way, I think it could have stopped
earlier than that in the trial.

I think had they seen it even
earlier, they might have stopped for

futility, but they had it set up.

And by the
way, it stopped at 33%

of the sample size.

So it stopped at roughly
240 patients rather than

666 patients. Continuing would likely
have done harm to those patients, and certainly

would not have contributed scientific
results that are going to change care

in this setting by showing that vitamin C is a
good thing to add to the

way we care for patients with severe burns.

I don't love the rule of relative
risk greater than one point one.

At a third of the way through the
trial, 1.1 is a very

different predictive probability than
1.1 two-thirds of the way through the trial.

And this isn't a setting where
you know the number of events.

This could have been an event-driven
trial, but that's rarely done with a

binary yes/no at day 28 kind of
endpoint, so they use the number of patients.

So had the event rate gone up or down, 1.1
is a very different statistical

conclusion as to the likelihood of
success of the trial if the event rate on

control had been 15% or 40%.

It means very, very different
things a third of the way through.
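
A rough way to see why the same relative risk of 1.1 carries different amounts of evidence is to look at the standard error of the log relative risk, which shrinks as events accumulate. The sketch below uses the usual large-sample formula with expected event counts. The 333 patients per arm assumes equal allocation of the 666 total, and the z-score is only a stand-in for strength of evidence, not the predictive probability itself. The function name is illustrative.

```python
from math import log, sqrt


def z_for_relative_risk(rr, control_rate, n_per_arm):
    """Approximate z-score of log(rr), using expected event counts in each arm."""
    events_ctl = control_rate * n_per_arm
    events_trt = rr * control_rate * n_per_arm
    # Large-sample standard error of the log relative risk.
    se_log_rr = sqrt(1 / events_trt - 1 / n_per_arm + 1 / events_ctl - 1 / n_per_arm)
    return log(rr) / se_log_rr


N_FINAL_PER_ARM = 333  # ASSUMED: 666 total, split equally between arms

for label, n in (("one third", 111), ("two thirds", 222)):
    for control_rate in (0.15, 0.40):
        z = z_for_relative_risk(1.1, control_rate, n)
        print(f"{label:>10} of trial, control rate {control_rate:.0%}: z = {z:.2f}")
```

The observed ratio is 1.1 in every row, yet the z-score roughly doubles between a 15% and a 40% control rate and grows again with more patients, which is why a fixed threshold means different things in different scenarios.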

Now, they simulated the rule,
presumably under a number of

different scenarios and different
control rates, but I'd much prefer

predictive probability here.

I think I'd even prefer conditional
power using the MLE, which I don't

love, and there are better ways to do
it, over just an absolute value.

But when you do sufficient
simulations of that, and you time

it and carry out the analyses at the
appropriate time, you know that's

a reasonable summary of the data.

Now here, you know, there could
have been fewer events, and

it's a different kind of thing.

So I don't love the rule, but
I love the fact that they did

futility, that it was pre-specified, and that they
carried out the futility analyses.

It's kind of a no-brainer that this
trial had a very low likelihood of

success and a high likelihood of harm.

And by the way, they should
have calculated the Bayesian

predictive probability at the time.

It was much less than 1%.

No chance of success in
this trial.
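
To make the "much less than 1%" remark concrete, here is a minimal sketch of a Bayesian predictive probability at the interim. Everything beyond the quoted percentages is an assumption: the counts (49 of 120 on vitamin C, 35 of 118 on placebo), Beta(1, 1) priors, 333 patients per arm at the end, and a final one-sided pooled z-test at the 0.025 level. The trial's actual final analysis may differ, so treat the output as an order-of-magnitude illustration.

```python
from math import exp, lgamma, sqrt


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(y, m, a, b):
    """P(y events among m future patients) when the rate has a Beta(a, b) posterior."""
    log_choose = lgamma(m + 1) - lgamma(y + 1) - lgamma(m - y + 1)
    return exp(log_choose + log_beta(a + y, b + m - y) - log_beta(a, b))


def final_trial_succeeds(x_trt, x_ctl, n_per_arm, z_crit=1.96):
    """ASSUMED final analysis: one-sided pooled z-test that treatment lowers the rate."""
    pooled = (x_trt + x_ctl) / (2 * n_per_arm)
    se = sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        return False
    return (x_ctl - x_trt) / n_per_arm / se > z_crit


def predictive_probability(x_trt, n_trt, x_ctl, n_ctl, n_final_per_arm):
    """Probability the final test succeeds, averaged over all future outcomes."""
    m_trt = n_final_per_arm - n_trt  # patients still to enroll on treatment
    m_ctl = n_final_per_arm - n_ctl  # patients still to enroll on control
    pmf_trt = [beta_binomial_pmf(y, m_trt, 1 + x_trt, 1 + n_trt - x_trt)
               for y in range(m_trt + 1)]
    pmf_ctl = [beta_binomial_pmf(y, m_ctl, 1 + x_ctl, 1 + n_ctl - x_ctl)
               for y in range(m_ctl + 1)]
    total = 0.0
    for y_trt, w_trt in enumerate(pmf_trt):
        for y_ctl, w_ctl in enumerate(pmf_ctl):
            if final_trial_succeeds(x_trt + y_trt, x_ctl + y_ctl, n_final_per_arm):
                total += w_trt * w_ctl
    return total


pp = predictive_probability(49, 120, 35, 118, 333)
print(f"predictive probability of final success: {pp:.5f}")
```

The calculation enumerates every possible outcome for the remaining patients, weights each by its beta-binomial probability, and adds up the weight of outcomes where the final test would succeed. A rule stated on this scale automatically accounts for how far through the trial the interim falls.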

So I commend the designers on it.

I would have done it differently.

Lots of flavors to that.

So two papers in JAMA.

Interestingly, a lot of people
say these medical journals

only publish positive
results, that there's publication bias.

Here are two papers where
futility was the result of both

trials, and they got published in JAMA.

I think these are really important
results, I suspect.

I don't know whether vitamin C is
commonly used in burns,

but I imagine anticoagulation is
used, and not that uncommonly, in patients

with chronic kidney disease.

So very, very important
results. All right.

I leave you with this: when you're
designing trials, design futility.

It's a really
important part of the trial design.

Don't leave it to the DSMB to
make hard decisions.

Help make those decisions ahead of time.

I appreciate you joining me here.

I'll look at several papers.

Again, we can go back to the design
and think about them from the results.

But I appreciate you joining
me here, and joining again.

And until next time, we
will be here in the interim.
