A podcast on statistical science and clinical trials.
Explore the intricacies of Bayesian statistics and adaptive clinical trials. Uncover methods that push beyond conventional paradigms, ushering in data-driven insights that enhance trial outcomes while ensuring safety and efficacy. Join us as we dive into complex medical challenges and regulatory landscapes, offering innovative solutions tailored for pharma pioneers. Featuring expertise from industry leaders, each episode is crafted to provide clarity, foster debate, and challenge mainstream perspectives, ensuring you remain at the forefront of clinical trial excellence.
Judith: Welcome to Berry's In the
Interim podcast, where we explore the
cutting edge of innovative clinical
trial design for the pharmaceutical and
medical industries, and so much more.
Let's dive in.
Scott: All right.
Welcome everybody back to In the Interim.
I'm your host, Scott Berry.
My common
co-host, Kert Viele, is back.
Though Kert, it's been a bit of a,
Kert Viele: Been a gap.
Yep, happy to be here.
Happy to be here
Scott: We may find out if
Kert has any pet peeves for today.
But get to the topic of the day,
and you know, we are somewhere in
the 60-plus episodes, Kert, and
we have not done an episode on
response adaptive randomization.
Something that is…
makes people passionate.
Uh, it's controversial.
Um, it's, it's a great topic
for innovative trial designs.
I think it's especially a great
topic as the world
of clinical trials is evolving.
You know, what is, what is the role of,
of response adaptive randomization in it?
What are our thoughts on it?
Um, lots of critics of response
adaptive randomization.
So let's, let's get to the, to the
bottom, to the top, to the sides
of response adaptive randomization
Kert Viele: Sounds great
Scott: Okay.
So it's a controversial topic.
Let's, let's, let's, uh, start with
what is response adaptive randomization?
Remember, my, my, my, my wife
loves to tune in on these, and
she's gonna wanna make sure that
everybody understands the topic.
So explain to Tammy what response
adaptive randomization is.
Kert Viele: All right.
First off, hi, Tammy.
Good to see you.
Um, the, uh, so the idea behind RAR,
response adaptive randomization - we'll
abbreviate it from here on out - um,
is essentially, uh, all adaptive
trials, they do interim analyses, and
they make some kind of adjustment to
the trial at those interim analyses.
In RAR, the adjustment that you're making
is to change the allocation probabilities.
And usually, what we would do is
we would increase allocation to
arms that are performing better.
We decrease allocation to arms
that are performing worse.
The basic idea is to do a couple things.
One is to try to treat patients better
in the trial, try to get them the
better-performing arms, and we hope we get
some better inferences at the same time.
But I know we're gonna touch on
that when we get into some of the
controversies because that sometimes
happens and sometimes doesn't
Scott: Yeah.
So, examples of how this may
be used: we've been involved
in multiple trials where response
adaptive randomization has been used.
One of those, the BAN2401
trial, was a trial of lecanemab, which
is now approved for Alzheimer's disease.
Their phase two trial had a placebo
and five active arms of lecanemab.
It was actually two
frequencies and three doses.
The first
hundred and ninety-six patients enrolled
under fixed randomization
to those six arms in the trial.
An interim analysis took place,
and a new randomization probability
was set for each of the arms.
Essentially, the
control was a fixed rate.
It was slightly different than that, but
for all intents and purposes, the control
was fixed, and the other five could vary.
With the remaining probability, a
new randomization vector was created.
For the next period of time, patients were
randomized using those probabilities,
and then, once fifty more patients had enrolled,
a new interim was done, and this was
repeated during the course of the trial.
And it ended up that two of
those doses, the two higher
doses with the two frequencies,
were where the most patients were allocated.
One of them was moved to phase
three, and it was
successful in a phase three trial.
The goal of that phase two trial
was to find the best dose.
What is the best arm,
and how good is that arm?
And we'll come back to whether RAR
is good or not, but that, that was
an example of, of response-adaptive
randomization being used.
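A minimal sketch of the kind of interim update described above: control keeps a fixed share, and the remaining probability is spread over the active arms. Everything specific here is an assumption for illustration and not the BAN2401 algorithm: the binary outcome, the Beta(1, 1) priors, weighting by the probability that each arm is best, the 25% control share, the interim counts, and the function names `prob_best` and `allocation`.

```python
import random

def prob_best(successes, failures, draws=4000, seed=0):
    """Monte Carlo estimate of P(arm is best) under independent
    Beta(1 + successes, 1 + failures) posteriors (an assumed model)."""
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        samples = [rng.betavariate(1 + s, 1 + f)
                   for s, f in zip(successes, failures)]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]

def allocation(control_share, active_weights):
    """Keep control fixed; split the remaining probability
    across the active arms in proportion to their weights."""
    remaining = 1.0 - control_share
    total = sum(active_weights)
    if total == 0:
        # No information yet: split the remainder equally.
        share = remaining / len(active_weights)
        return [control_share] + [share] * len(active_weights)
    return [control_share] + [remaining * w / total for w in active_weights]

# Hypothetical interim data for five active arms (responders / non-responders).
successes = [5, 6, 8, 12, 13]
failures = [25, 24, 22, 18, 17]

weights = prob_best(successes, failures)
probs = allocation(control_share=0.25, active_weights=weights)
print([round(p, 3) for p in probs])  # control first, then the five active arms
```

Rerunning `allocation` at each interim with updated counts gives the new randomization vector for the next block of patients.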
I'll point out one more.
We'll, we'll probably talk about
other trials and how they do it and
the, the good and the bad of it.
Uh, I'll touch on another one just
because I've done two recent podcasts
on it, and we are doing a podcast
on the results of the ICECAP trial.
By the way, it's been recorded, and
we're just waiting for JAMA to publish
the article, and then we're gonna release
the discussion of the ICECAP trial.
The ICECAP trial had 10 arms.
The arms were the duration of hypothermic
cooling for the treatment of patients
after cardiac arrest resuscitation.
You go to the emergency room,
and they do different durations of
hypothermic cooling. The goal was
to find the best duration, and largely,
was there an increasing dose response?
Largely, what is the best duration?
That was the goal, and there
were 10 arms in that trial.
Kert Viele: And you should also
add that, for the lecanemab story,
you published that article as well, so
people can go back and read that and
see what happened interim to interim.
I thought that was…
A lot of times we don't publish
that kind of information.
I thought it was good that it's out there
Scott: Yeah, it's the only trial I know of
that published the actual interim reports
that were done at each of, I think there
were 18 or 19 interim analyses, and it,
it shows the quantities that were used to
calculate the randomization probabilities.
It shows what each one
was at all the interims.
You can see those reports.
Fantastic.
I'm trying to get the ICECAP people
to publish that, and we will.
We'll, we'll show each of the interims for
that as well so people can follow along.
Watch the movie of the trial
at home.
So that gives a little bit of
a flavor in just explaining
response-adaptive randomization.
Now, there…
To some extent, it
sounds fantastic.
If I'm a patient, I feel like
I would like to be in a trial
that uses response-adaptive
randomization just at face value.
At the same time,
I think there are a
lot of people who are unfamiliar
with clinical trials, and when I
explain adaptive trials to them,
they look at me like I have three heads.
Like, don't all trials do this?
Shouldn't the allocation of
treatments in the latter part of the
trial be informed by the earlier part
of the trial and what's given to them?
And when you explain that, no, most of
them do not, they're surprised by that.
So there's, there's that part.
But there are, there are critics of it.
There are…
And I don't know
where we fall in this, Kert.
I don't necessarily…
I'm not pro-RAR as such; I
don't think there's value in that.
It's a tool in the trial to, in some
cases, get better answers, in some cases
treat patients better, but it's a tool
available to us as trial designers,
and it has good and bad in it.
I think it's fair to say there
are authors out there who
are against RAR and write articles.
"Don't, don't fall for the RAR trick.
Don't do it.
It's bad."
Uh, but it is quite
controversial in the literature
Kert Viele: At least we're very
persuasive, because we're
laying out traps and we're getting
people to do our stuff.
But ironically, we don't-- how
many of these do we actually do?
We certainly do them.
I mean, we do, you know, hundreds
of trials over the years, but it's
certainly not 60% of these involve RAR.
It's particular cases where we've done
it and particular cases where we haven't
Scott: Very much so.
So let's talk about
what those cases are.
What are
the criticisms of RAR?
Where does this perhaps not go well?
Kert Viele: So, and well, this gets
at whether, you know, we're critics
of RAR for certain cases as well.
So, I mean, the, the central thing, I
think what you said from the patient
standpoint, you know, obviously
if you're enrolling late in the
trial, you wanna be, "Hey, a lot of
people have generated information.
I wanna get the benefit of that."
I would fully understand that perspective.
There's a scientific aspect that the
purpose of the trial is to generate
information beyond the trial.
If you're doing, in lecanemab,
there are millions of people who
are suffering from Alzheimer's that
want a good and effective drug.
So we wanna keep them in mind while we're
also trying to help patients in the trial.
One thing that happens is to get a lo--
I'm not gonna do a lot of math here at
all, of course, but if you're looking
at estimating a treatment effect, so the
sample sizes in each of the arms matter.
You're comparing control to treatment.
If you can make both of those
arms have bigger sample sizes,
you're gonna get better estimates.
You're gonna get more information.
So now the question is, well,
how does RAR actually do that?
Well, RAR does that by
increasing allocation to some
arms, decreasing it to others.
If you're in a two-arm trial, for
example, and this is the classic case, a
lot of the papers that are against
RAR focus on two arms, with
reason to say it's problematic here.
You increase the treatment allocation
potentially, but you decrease the control.
And this gets into kind of a
robbing Peter to pay Paul situation.
You can actually decrease…
You estimate treatment effects
worse in two-arm RAR than you
would in a fixed trial.
So a lot of people have noted this.
There's a problem.
Two-arm trials, we
don't do RAR very often.
In this case, I think it's
a really specialized issue.
All that's gonna be turned on its head
in a multiple-arm setting, uh, and I
think we'll get to that in a minute.
But two arms is a very-- If you read the
literature, two arms is a very, very,
very controversial case, and I think we
agree with the, the limitations there
Scott: So a lot of critics
of RAR will point out that in a
two-arm trial, moving away from
one-to-one reduces your power.
It reduces your
inferences about
the difference between them.
Now, there are some odd cases where
it might be good to reduce one arm, where
the variability's different in the
two arms and all of that.
But largely, you reduce power in that.
And then they make a sweeping argument
that so you should never do RAR.
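A small sketch of the two-arm power point, under simplifying assumptions that are not from the conversation: a normal outcome with equal, known standard deviation in both arms, a one-sided test at 0.025, a total of 200 patients, a true difference of 0.4 standard deviations, and a fixed (not data-driven) allocation ratio. The standard error of the difference is proportional to the square root of 1/n_treat + 1/n_control, which is smallest at one-to-one.

```python
from statistics import NormalDist

def power_two_arm(n_treat, n_control, delta, sigma=1.0, alpha=0.025):
    """Approximate power of a one-sided z-test for a difference in means."""
    se = sigma * (1.0 / n_treat + 1.0 / n_control) ** 0.5
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return NormalDist().cdf(delta / se - z_crit)

def power_by_ratio(total_n, ratios, delta):
    """Power for treatment:control allocation ratios r:1 at a fixed total size."""
    results = {}
    for r in ratios:
        n_control = total_n / (r + 1)
        n_treat = total_n - n_control
        results[r] = power_two_arm(n_treat, n_control, delta)
    return results

# Hypothetical numbers chosen only to illustrate the trend.
powers = power_by_ratio(total_n=200, ratios=[1, 2, 3, 9], delta=0.4)
for r, p in powers.items():
    print(f"{r}:1 allocation -> power {p:.3f}")
```

With unequal variances in the two arms, the best ratio is no longer one-to-one, which is the odd case mentioned above.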
Kert Viele: Yeah, so that's my pet peeve.
You wanted a pet peeve.
This is-- My, my pet peeve is how
overgeneralized the RAR literature is.
Uh, uh, RAR, it's a lot
of different methods.
You can do it a lot of different ways.
You can do it in a lot of different cases.
A lot of papers are written, "I'm
gonna explore this version of RAR
in this setting, but I'm gonna
generalize it to everything," and
that just doesn't serve us well
Scott: And they say, "Look,
this method doesn't work here.
RAR doesn't work," and they
generalize it to every possible
case of RAR.
Kert Viele: So
Scott: Okay, so you talked
about the two-arm case, the
two-arm setting, and so that would
be a specialized scenario where the
goal isn't entirely treatment effect.
There may be other goals, and
we'll come to that a little bit.
So let's take another
criticism of it, which is temporal trends.
So what is the
criticism about temporal trends?
Kert Viele: So there's something
really magical about doing fixed
randomization, not doing RAR.
So if, if I run a trial and I enroll
patients one-to-one, and I have patients
that enroll early in the trial, I have
patients that enroll late in the trial.
If I look at the treatment and control
groups, however those split between
how many patients came in early and
how many came in late, they're the same
between treatment and control because
I've done one-to-one the whole way.
I've equalized allocation the
whole way through the trial.
If I do RAR, that doesn't
happen necessarily.
If I start the trial one-to-one and then
shift to three-to-one, for example, or
some people would say nine-to-one, uh,
which I don't think we get to very often.
Uh, but in any case, if you do that, so
what's gonna happen is you go three-to-one
in favor of treatment later in the trial.
You look at the treatment subjects,
they're more late patients.
You look at the control subjects,
there are fewer late patients.
If there's any kind of time trend going on,
and the classic example is COVID, where
there's a new variant coming in and
people react to it differently,
that is gonna generate a bias, and
that is potentially really, really bad.
So I mean, it depends on how big
the bias is, but it can completely
invalidate your conclusions.
Typically, we would
add terms to our model.
We'd model time to address this, uh, but
that's generally what the issue is, is you
do need to take active steps to correct
for that and make sure you're robust.
Scott: Oh, that creates biases.
And that term, "it creates biases," always
depends on how you're analyzing the data.
Kert Viele: Right.
Scott: If you ignore the fact that it's
different over time and it's not part
of the model, then it could cause bias.
If you're adjusting by time, then, you
know, assuming you have additive
effects across time, then it's
not biased within that setting.
So there are, there are ways to
adjust for time if it's a concern.
There are some cases where we have
much less concern about time effects.
In some cases, we have
larger concerns about time.
So there are ways to adjust for that.
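The temporal-trend argument can be sketched with expected values rather than a full simulation. The assumptions here are illustrative only: two enrollment periods of 100 patients each, one-to-one allocation early and three-to-one late, an additive drift of 0.10 in the outcome for everyone enrolled late, and a true treatment effect of zero. The function name `expected_estimates` is hypothetical.

```python
def expected_estimates(n_early, n_late, p_treat_early, p_treat_late,
                       drift, effect=0.0):
    """Expected naive (pooled) and period-adjusted treatment differences
    when late patients' outcomes are shifted by `drift` in both arms."""
    # Each period: (size, probability of treatment, outcome shift).
    periods = [(n_early, p_treat_early, 0.0), (n_late, p_treat_late, drift)]

    n_treat = sum(n * p for n, p, _ in periods)
    n_control = sum(n * (1 - p) for n, p, _ in periods)
    mean_treat = sum(n * p * (shift + effect) for n, p, shift in periods) / n_treat
    mean_control = sum(n * (1 - p) * shift for n, p, shift in periods) / n_control
    naive = mean_treat - mean_control

    # Compare arms within each period, then average over periods by size.
    adjusted = sum(n * ((shift + effect) - shift) for n, _, shift in periods)
    adjusted /= (n_early + n_late)
    return naive, adjusted

naive_rar, adjusted_rar = expected_estimates(100, 100, 0.5, 0.75, drift=0.10)
naive_fixed, adjusted_fixed = expected_estimates(100, 100, 0.5, 0.5, drift=0.10)
print(f"RAR:   naive {naive_rar:.4f}, period-adjusted {adjusted_rar:.4f}")
print(f"Fixed: naive {naive_fixed:.4f}, period-adjusted {adjusted_fixed:.4f}")
```

Under fixed one-to-one allocation the drift cancels between arms; under the shifted allocation the pooled comparison picks up part of the drift, while the within-period comparison does not, provided the drift really is additive.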
Okay, there's one other potential
criticism, which is the rare circumstance
where people may be unblinded
to what the allocation is:
you may learn one treatment is
doing better than another one just
by the fact it's being given more.
In a blinded trial where patients
are blinded, investigators are
blinded, nobody would know that a
particular treatment is given more.
In trials where they're unblinded,
this becomes also a potential
criticism of RAR.
Kert Viele: Yeah, and among many other
criticisms of an unblinded trial.
Lots of things can happen in those
settings, and RAR may make them worse, so
Scott: Okay, so we've set
up potential negatives.
This unblinding aspect of
it is a potential negative.
Temporal trends is a potential negative to it,
and in a two-arm trial, you reduce power.
So, okay.
Kert Viele: Sounds awful, Scott.
Scott: Sounds bad.
So what are the potential positives of RAR?
We started, and it seemed
like a positive thing.
Kert Viele: well, so there is a
case where RAR is a good match.
So in that case, the one we often
use it for is situations where
we are looking at multiple arms.
We're not in a two, two-arm setting,
and we're looking for the best
arm among those alternatives.
So the classic examples
here would be dose-finding.
I'm not particularly worried about,
you know, which arm is the fourth-best.
I wanna know what is the best.
Um, in the epidemic settings, the
pandemic settings, we wanna know what's
the best treatment for, for patients.
We don't necessarily
need to know the others.
In those cases, what RAR does for you
is because you are increasing allocation
to the best arm, it's the one doing the
best, it's the one you're gonna raise,
you get more information on that best arm.
You wanna be careful here.
This is the thing we've always been
telling people is you wanna make sure
you maintain the control allocation
because that's part of your comparison.
If you're gonna do an estimate at the
end to compare to control, you need
controls, so don't, don't skimp on those.
But in those cases, RAR will give
you more power 'cause it gets
more patients on the, on the arms
that matter in your comparison.
It gets you lower bias, lower,
um, uh, lower variance, so
basically you get better estimates.
All of these inferential things
start working in your favor
Scott: So in a setting, and we
looked at this in a potential
trial for Ebola, for example, you might
have four active treatments that could
be brought in as a treatment for it.
If you're increasing the probability
of one or two of those treatments and
keeping control the same, you have a
better chance of figuring out the best.
But now you've reduced the
randomization to arm four, and
the question is, do you care about that?
So power is such an odd…
It's not an odd thing, but
it's really important to
make sure we understand what
we are trying to do in the trial.
What are the goals of the trial?
If it's trying to identify every
possible treatment that's good or
bad, you might not wanna do RAR,
'cause the goal is not finding the best.
If it's about finding the best, then
identifying the effect, it can increase
the power, and dose finding is a great
example of that because you don't care
about arms that aren't as effective.
You're trying to identify the best.
And so it's a perfect example, and
where we do it the most at Berry
is a dose-finding trial, where
it can create a better trial design.
Okay.
Now, at the same time, if somebody
enters into a trial, there's
this framework that's been set up
of a VIP: suppose you
were running a trial and a VIP came
in and you wanted to treat that patient
as well as possible, and you could look at the
data and assign one of the arms.
As the trial goes on, you're more and
more likely to put that patient on a
better arm because you're learning,
and you would treat that VIP better.
Now, we don't do that in trials,
and VIPs don't get to come in
and look at the data of it.
But if we're
increasing the probability to arms
that are doing better and we're
learning more efficiently, we're
treating patients better in the trial.
And so a patient that comes in later
in the trial is more likely to be
given a more effective arm in the trial
Kert Viele: So we've talked about this.
Um, it probably doesn't happen in
trials very often, but we've talked
about this example in the context of
a learning healthcare system where
the, the whole healthcare system
is trying to improve itself, but
somebody walks in the door going,
"I, I don't want you to randomize me.
You just look at the data and tell me
which one's the best, and I want that."
And you can imagine situations where
that's-- I mean, we talk about the
ethics of it for a long period of time.
Uh, but it's at least conceivable.
The nice thing about RAR is in that
ethical case, if you were doing
fixed randomization, that VIP coming
in wanting special treatment, they
actually get treated much better
than the average patient because the
average patient's getting randomized,
not making use of information.
If you're doing RAR, what's happening
is the difference between that
VIP and the average patient in the
trial is actually pretty darn small.
You know, if you're
doing mortality, it's a couple percent
rather than twenty, twenty-five percent.
So it really helps on getting
patients treated well, with
the idea that we're still
learning as well
as making use of that information.
Scott: Yep.
Yep.
So from that perspective, and I started
this off with if I'm, if I'm a
patient going into a trial and there
are two trials, and one of them is
doing RAR and one of them is not,
I'd rather go in the trial with RAR.
I find that a, a, a better place to go.
Interestingly, we do a number of trials,
especially in rare disease, say we're treating
a pediatric disease like Duchenne
muscular dystrophy, where you might do
two-to-one or three-to-one randomization.
There's a very strong preference
of a patient to enter into a trial
where they're more likely to be put
on the active arm, and it's
done as a tool to increase enrollment
within the trial, to
increase the likelihood patients
would agree to enter into that trial
because it's favoring the active arm.
Now you've got a procedure that learns
from the data and goes to different
arms within it, and the
perception generally, I think, is that
patients would rather be in that trial.
Kert Viele: And I, I think we ought to
add kind of to the, the objections to RAR
here that, you know, we simulate these
trials and how are we doing this RAR?
How are we increasing the allocation?
You can, of course, do this badly.
Uh, when we present this to people
who have never, you know, done
adaptive trials before, the
natural question is, well, what if
you overreact to early data and the
early data is just noisy and wrong?
And certainly, the answer is if you do
RAR badly, you can overreact to that.
There are papers that have gone
through, you know, doing interims
at a sample size of three where,
hey, this isn't working out well.
So you wanna simulate this.
You wanna make sure that you do this
in a way where you don't overreact, but
you're making use of the information.
And that requires, uh, basically
a seasoned practitioner in
order to find the right balance
Scott: Yeah.
So we would never enter into
a trial, put an algorithm together
that does RAR, and just hope it works.
We simulate thousands of trials.
We…
Millions of trials.
We make sure that under a huge range of
scenarios, it's getting better answers,
or it's accomplishing the goals
that it's intended to improve upon,
whether that's within the trial itself, where we
wanna improve the outcome of patients,
or it's to learn the right dose.
Uh, so we simulate them
e-extensively ahead of time.
There's a famous example of an RAR
algorithm where, in hindsight, I'm not
sure that they liked the outcome of it,
and it's the ECMO trial.
Bartlett, years ago, ran a
trial in newborns that
had particular respiratory issues.
They were being randomized between
conventional care and ECMO,
an ECMO machine, for them.
Now, what was the design?
It was called a play-the-winner
design, or Pólya urn design,
where initially there's a bowl
and there's two balls in the bowl.
One is red for conventional,
one is blue for ECMO.
And the first patient comes in, you grab a
ball, and whatever ball is selected,
that's assigned to the patient. If that
patient is a success and they survive,
survival was the endpoint, you put in
another ball for the therapy they were given.
So if they were given the blue
therapy and they lived, you put
another blue ball in the, in the bowl.
And then the next patient that
comes in has a two-thirds chance
of being given the blue therapy.
If they're given blue and they
don't survive, you put a red ball in
the urn, and now it's more
likely the patient would get red.
And as every patient comes in, that continues,
and you put more and more balls in the
urn, and a therapy that's doing
better has more balls in there, and if a therapy's
doing worse, the other one has more.
And it was a procedure for doing
response adaptive randomization that way.
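The urn scheme as described can be sketched for two arms. The arm labels follow the conversation; the survival rates passed to `simulate` are hypothetical, and the function names are illustrative rather than taken from any published implementation.

```python
import random

def update_urn(urn, arm, survived):
    """Add one ball: for the assigned arm on survival,
    for the other arm on death (two-arm urn only)."""
    other = next(a for a in urn if a != arm)
    urn[arm if survived else other] += 1

def prob_of(urn, arm):
    """Probability that the next patient is assigned to `arm`."""
    return urn[arm] / sum(urn.values())

def simulate(true_survival, n_patients, seed=0):
    """Run the urn scheme once, starting from one ball per arm."""
    rng = random.Random(seed)
    urn = {arm: 1 for arm in true_survival}
    counts = {arm: 0 for arm in true_survival}
    for _ in range(n_patients):
        arms = list(urn)
        arm = rng.choices(arms, weights=[urn[a] for a in arms])[0]
        survived = rng.random() < true_survival[arm]
        update_urn(urn, arm, survived)
        counts[arm] += 1
    return urn, counts

# Replaying two patients: one conventional death, then one ECMO survival.
urn = {"conventional": 1, "ECMO": 1}
update_urn(urn, "conventional", survived=False)
update_urn(urn, "ECMO", survived=True)
print(prob_of(urn, "ECMO"))  # 3 of 4 balls are ECMO: 0.75

# Hypothetical survival rates, to see how lopsided allocation can get.
final_urn, counts = simulate({"conventional": 0.2, "ECMO": 0.8}, n_patients=20)
print(counts)
```

Running `simulate` over many seeds is the kind of check that shows, ahead of time, how often such a scheme ends with only one or two patients on one arm.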
Now, what happened in the trial
is the first patient that came
in was randomized to conventional
care, and the patient died.
And so a new ball was put in for ECMO.
The next patient that came
in got ECMO and survived.
Now a new ball is put
in, so now it's 75% ECMO.
It went on a run of like 27
consecutive ECMO patients,
and they were all surviving.
I think there may have
been a death in there.
But the trial ended up, I believe,
like 26 out of 27 on ECMO
and oh for one or oh for two on conventional.
I think at the end they even added a few
more conventional patients on there.
But it ended up incredibly
disparate in the trial.
Now, it turns out ECMO
is highly effective.
It's better than conventional.
We know that now.
But the trial was greatly criticized,
and I had the, I had the benefit once
of asking Bartlett, "Had you seen a
simulation that showed what happened
in the trial, would you have said
that was good, I'm glad it did it?"
He said, "No.
No.
We would have done something different."
And that's the value of simulation, to be
able to evaluate that particular algorithm
Kert Viele: So, one aspect of
that: all of these play-the-winner,
Pólya urn schemes, they
come out of a particular literature,
which obviously, you know, your
father Don's heavily contributed to.
But there is one aspect of that.
A lot of that literature is based
on trying to treat a sequence
of patients well without having
any regulatory aspect to it.
So at no point do you have to
generate, you know, P less than .025,
so to speak.
Um, one thing that people should keep
in mind is that when you put that kind
of regulatory requirement on it, you
need a different kind of evidence.
And so that's one of the things
more modern RAR methods do is they
take into account the regulatory
environment as well, which is a
change over the past 40 years.
Scott: Which is doable.
Kert Viele: Yeah.
Scott: There are guidance documents
that say response adaptive randomization
can be done in confirmatory trials.
It's probably not very
common because phase three trials a
lot of times are two-arm trials,
Kert Viele: Yeah.
Scott: and there may not be
benefit in a two-arm trial,
especially in that regulatory setting.
It's unlikely that you would do
RAR in a two-arm phase three trial.
Kert Viele: Well, you have examples
of seamless phase two/three trials that have
done RAR in the first part and then
lowered the number of doses.
Scott: Right.
So the trial of Eli Lilly's treatment Trulicity,
dulaglutide, one of the
earliest GLP-1 receptor agonists, was
a seven-arm phase two trial that at
some point could trigger phase three.
It selected two doses to move forward into
a fixed randomization phase three portion.
By the way, it included data from the
first part, uh, in the final analysis
of that, which was done through
response adaptive randomization.
And it honed in on the 1.5
milligram dose.
Higher doses were having safety issues,
high heart rate, high blood pressure.
It moved away from them, and 1.5
was the dose that moved forward,
and it eventually has done incredibly
well and is, you know, billions
of dollars a year as a treatment.
So that is certainly done in
seamless phase two/three trials, where the
first part is to find the right dose.
An interesting side effect of that is
that the original design in that trial
was three arms in the phase two part.
And when they looked at expanding
that to a range of seven doses,
the thought was that increases the sample size by
that factor, you know, seven thirds,
and it's, oh, that's too big.
But by doing response adaptive
randomization and honing in on a
particular part of the dose response
curve, the sample size isn't
seven thirds as big, and the sample size
wasn't even any bigger to do seven
doses than three doses because
of the modeling in the trial.
So there's a huge benefit to that.
The ICECAP is very similar to that.
ICECAP was three durations, but they
wanted a wider range of durations.
They went to 10 with largely the
same sample size because it hones
in very quickly in the region of
interest, uh, in the scenario.
So there are…
The, the higher power
isn't just higher power.
It may enable more doses, uh, to
be used in a better trial design
Kert Viele: And one thing that you
talked about, well, you've obviously
talked about the Trulicity trial
for a long time, but the
notion that it incorporated safety.
You were doing RAR not just on
an efficacy measure, but a combination,
which allowed them to pick a dose
that balanced several features.
I forget what the features
were in that trial?
Scott: A clinical utility
index that was HbA1c change,
Kert Viele: Yeah
Scott: and, amazingly in the
world we're in, weight loss, with
the GLP-1s now approved
solely for weight loss, and
blood pressure and heart rate.
So there are four endpoints that
went into selecting the optimal
therapeutic dose in it.
Okay.
Within this, by the way,
we should mention, it's sort
of a really interesting story
in and of itself.
You mentioned Don's work in this and
bandit problems, and its relationship
to response adaptive randomization.
But even going back farther
than this: largely, I think the first
randomized clinical
trial in humans was in the 1940s.
Streptomycin,
Kert Viele: 1948, streptomycin.
Scott: and that was a
two-arm
trial with fixed randomization.
But you can go back to 1933, and there's
a paper by Thompson where he introduces
response adaptive randomization,
and it was a long-forgotten paper
that has now
been cited many, many times,
and it's even a little bit of the
Google search
things and bandit problems that
have brought that paper back to life.
Kert Viele: And not just
brought it back to life.
I mean, it's one of the more cited papers.
I mean, talk about
having to wait for fame.
You publish it in 1933, you get
very little attention for 70 years,
and now suddenly you have 5,000
citations or whatever it's at now
Scott: Yep.
Bandit problems are largely this: you've
got multiple arms to pull, and you decide
on which arm to allocate to a patient.
Usually, they're done
in a deterministic way.
Oh, give them arm three,
give them arm one.
And usually, you've set up a goal
of that particular trial, which can
include a very large horizon that
at some point you have to pick a
treatment and go with one treatment.
But you set up a goal to
save as many patients as possible
or achieve a particular outcome.
And this was my father's dissertation
work, bandit problems.
So closely related to response
adaptive randomization is this whole
literature of bandit problems, which
have also been cited many, many times.
And much of this is now
advertising, online advertising.
If Google has multiple ways in
which it can present an ad to you,
and I'm using Google just as an
Kert Viele: Yeah.
Scott: example, it can
throw out three and find
out how many clicks it gets.
Well, maybe we'll try
two, maybe we'll try one.
Maybe we'll personalize it to
individuals within that, so they
can be using this technology to
increase the number of clicks.
Okay.
Now, within the properties
of this, platform
trials have brought out
a really interesting part to this.
Platform trials are those where we
have multiple agents in the trial,
a relatively new thing.
And we might have four
experimental treatments and a
control in the trial at one time.
So this opens up the interesting
question of whether we wanna do response
adaptive randomization in platform trials,
which largely are multi-arm trials.
Kert Viele: So I think with platform trials,
to me, we still haven't explored
this in as much detail as we would like.
We've done simulations for our own
trials and so on, but RAR, it's a
really different beast in platform
trials, or at least it can be.
If, if I'm doing an umbrella trial where
I've got four arms, I've got to enroll
two hundred patients, when I do RAR,
if I'm increasing allocation to certain
arms, I'm decreasing allocation to others.
In a platform trial that's perpetual,
so I'm gonna do four arms, I'm
gonna replace an arm when I drop it,
increasing the allocation to one arm doesn't
necessarily lower the sample size on the others.
It slows them down.
And so you could say, "I'm gonna speed up
the arms that look the most promising,"
but I'm not gonna abandon everything else.
It's still there for me to
get to it later, potentially.
And I've been really interested in
how all that plays out in practice,
and we've done this both ways.
But I, I think this is one of the
open research areas is how this works.
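The point about a perpetual platform can be made concrete with simple arithmetic. This is a rough sketch only: the accrual rate, the per-arm target, and the allocation shares are hypothetical, the control arm is ignored, and allocation is held constant over time, which real RAR would not do.

```python
def months_to_fill(arm_target, accrual_per_month, allocation):
    """Months for one arm to reach its target at a constant allocation share."""
    return arm_target / (accrual_per_month * allocation)

accrual = 20   # hypothetical patients per month across the platform
target = 60    # hypothetical per-arm sample size

equal = months_to_fill(target, accrual, 0.25)     # four arms, equal split: 12.0
favored = months_to_fill(target, accrual, 0.40)   # arm sped up by RAR: 7.5
slowed = months_to_fill(target, accrual, 0.20)    # each of the other three: 15.0

print(equal, favored, slowed)
```

Under these assumptions every arm still reaches its 60 patients. The favored arm gets its answer sooner and the others later, which is the sense in which shifting allocation changes timing rather than sample size.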
Scott: Yeah, one of the controversial
things or the w- things that may
be problematic is if there's four
sponsors that own-- four separate
Kert Viele: Yeah
Scott: that own the drugs, and we
accelerate sponsor A, it slows down
sponsor B by, by, by definition,
and that may be an undesirable
thing to recruiting arms in a trial.
So many of them in that scenario,
phase two setting, don't use response
adaptive randomization because
Kert Viele: It would be even worse if
we said we weren't going to explore an
arm at all, rather than just slow it down.
That often generates more controversy
with sponsors, with reason: "Hey,
we want you to give us an answer."
Scott: Right.
One trial where this was done with
multi-sponsors, and it was generally
perceived by all to be a good thing, is
the I-SPY2. So the I-SPY2 trial is a Phase
2 trial in neoadjuvant breast cancer,
and there would be 20% fixed on the
control, and then the remaining 80% was
set up across the experimental arms that
were in the trial, and it would
increase the, the probability for the arms doing better.
The, the really interesting thing there
is breast cancer, we, we understand
heterogeneity of disease in breast
cancer better than most diseases.
HER2 breast cancer is, HER2
positive is a different breast
cancer than HER2 negative.
So the trial was enrolling, uh,
hormone receptor by HER2 status,
also had MammaPrint status.
So there's eight types of cancer that
women are being enrolled within that.
So if you came in with HER2 negative,
hormone receptor negative, triple
negative disease, you have
different randomization probabilities
over the four arms that are in
the trial than somebody who's HER2
hormone receptor negative within that.
The…
we might accelerate one drug for one
type of cancer, but a different drug is
being accelerated in another type, and
they had a fixed 120-patient sample size.
So they got more patients within
the subtypes where they were
doing better, and they might have
gotten less in other settings.
So you were treating patients better, but
the, the sponsors perceived this to be a
positive because of the multiple subgroups
where RAR was done differently by subgroup.
So it wasn't just slowing
down one for the other.
One slows down in one place and
speeds up in the other place.
That was thought to be
a very positive thing
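The subtype-specific allocation described above can be sketched roughly as follows. This is not the actual I-SPY2 algorithm, which used its own Bayesian model. It assumes a simple Beta-binomial model per subtype and splits the 80% among experimental arms in proportion to each arm's probability of being best. All counts, subtype labels, and function names are hypothetical.

```python
import random

def prob_best(responders, non_responders, n_draws=4000, seed=1):
    """Monte Carlo estimate of the probability that each arm has the highest response rate."""
    rng = random.Random(seed)
    wins = [0] * len(responders)
    for _ in range(n_draws):
        draws = [rng.betavariate(r + 1, n + 1)
                 for r, n in zip(responders, non_responders)]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]

def allocation(responders, non_responders, control_share=0.20):
    """Fixed control share first, then the remainder split by probability of being best."""
    pb = prob_best(responders, non_responders)
    return [control_share] + [(1 - control_share) * p for p in pb]

# Hypothetical (responders, non-responders) for four experimental arms, by subtype
data = {
    "HER2-/HR-": ([12, 4, 5, 6], [8, 16, 15, 14]),
    "HER2+/HR-": ([3, 11, 5, 4], [17, 9, 15, 16]),
}
alloc = {subtype: allocation(r, n) for subtype, (r, n) in data.items()}
for subtype, probs in alloc.items():
    print(subtype, [round(p, 3) for p in probs])
```

In this made-up data the first experimental arm is favored in one subtype and the second in the other, so no single sponsor is simply slowed down across the whole trial.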
Kert Viele: Well, and it, it speeds up
their phase three because if you know
where your treatment has the biggest
benefit, you can design a smaller
trial to detect that in phase three.
You enroll the people who benefit, and you
don't dilute the effect through people who
don't, and obviously the people who don't
benefit can go to other trials where they
may achieve benefit and benefit them there
Scott: So, well, one of the places where
I've run into the most controversy is,
uh, so I-I've been involved in, uh,
multiple dozens of trials that use RAR.
Very happy with all of them.
And really very happy
with e-every use of RAR.
We did have a couple issues
in the REMAP-CAP trial, uh, so
fair to sort of point that out.
Uh, these have been made public.
Where REMAP-CAP is multifactorial.
So a patient that comes in could
be randomized to drug A, yes or no.
And then a different domain of
treatments where you could get drug
one, two, or three, uh, within that.
And, you know, w- I think the
most a patient's been randomized
to is seven different domains,
simultaneously randomized.
It uses response adaptive randomization.
Some of those domains
are two, um, arm domains.
There's a control, and then
there's, yes, you get therapy.
Should you get high-dose
vitamin C, yes or no?
That was one of the domains in the trial.
Now, so it was deemed this is a trial
that was treating a pandemic, COVID-19,
and I'll, I'll be specific to that.
We're also enrolling
non-pandemic community-acquired
pneumonia, uh, within that.
But within the pandemic, it was thought
that this was highly beneficial to
do response adaptive randomization.
It was more likely that groups, uh, that,
that, there would be a positive to
entering into this trial if it was
using the most up-to-date information.
It's treating the pandemic while
it's learning at the same time.
Now, we did end up in a two-arm domain,
the case of simvastatin, where it ended
up about ninety percent probability
for simvastatin versus no simvastatin.
And it hovered around determining
efficacy, and it probably took longer
for it to declare efficacy, uh, than it
would have if it stayed one-to-one.
But meanwhile, it was increasing
the probability of patients getting
simvastatin during the pandemic.
So this was a case that was debated a
lot as to whether RAR was a good thing
or not in that particular scenario
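A back-of-the-envelope calculation shows why a heavily skewed two-arm allocation can take longer to reach a conclusion than one-to-one. This is a sketch under stated assumptions: it reads the ninety percent as the allocation probability, assumes equal outcome variance in both arms, and treats the split as constant, whereas real RAR starts near one-to-one and drifts.

```python
def relative_variance(share_treated):
    """Variance of the difference in arm means, up to a constant, per patient enrolled."""
    return 1 / share_treated + 1 / (1 - share_treated)

one_to_one = relative_variance(0.5)   # 4.0
skewed = relative_variance(0.9)       # about 11.1
inflation = skewed / one_to_one       # about 2.8

print(one_to_one, round(skewed, 2), round(inflation, 2))
```

Under these assumptions, patients enrolled at 90/10 carry roughly a third of the comparative information of patients enrolled at 50/50, so the trade-off is between treating more patients with the arm that looks better and reaching the efficacy threshold sooner.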
Kert Viele: So we've got-- There's another
example in addition to the pandemic.
We have a trial called PROSPECT, which
ha- we've-- there's a protocol paper out.
The results paper is not out yet.
We're hoping that will
come relatively soon.
Um, but anyway, it is a
two-by-two factorial experiment.
So technically there are four arms,
but you could think of it as doing
two arms twice on the rows and
columns of that two-by-two table.
Scott: Yep
Kert Viele: There were similar discussions
there over this is-- PROSPECT is on,
uh, respiratory distress in children.
And so the notion was, hey, we
want to treat these kids well
while we're trying to learn.
And that was exactly
part of the debate there.
Scott: Uh, one other case just to be
upfront is we did have a domain where
initially data were reported and flipped.
Um, and so what happens is the RAR,
intending to favor the better arm, um,
went the wrong way, uh, within that.
Lot, a, a number of lessons learned.
Now, the final data that went into
it was all fixed and all of that.
So by the way, there's a, a, a
strong operational burden upon,
uh, I, I don't wanna say strong.
There's an operational burden upon
doing RAR, making sure the data's
right that goes into the RAR,
the very- various pieces of this.
It's, uh, an important part of the story
is the logistical part of doing RAR.
Some settings, it's
reasonably straightforward.
I-SPY 2, it, it ran weekly, automated,
um, but you gotta get the data right
Kert Viele: I remember doing
the RACE trial and I would get
data every 12 patients, and
this was back in the old days.
And so I would be sitting there,
I basically put together the
randomization table and it got
replaced in the randomization system.
But I was more or less manually
doing that, which was not what
you'd want if you can avoid it.
I think everything worked out fine in
this, but it's been good seeing more
and more groups learn how to do this
and be able to do this automatically.
But as you said, there are still
some snafus that occur and you
want an experienced group doing
this, not just on the design side,
but on the implementation side
Scott: Yep.
Uh, I do wanna come back to something
you said that I think in sort of looking
forward to the future here, uh, of…
I, I think platform trials,
multi-arm experiments, uh, are
growing in popularity, even
in comparative effectiveness.
I do think somewhat of the future
is a learning healthcare system,
where if you imagine where we are
today, we do these experiments to
learn the right therapy, but 99.9%
of people are treated outside of those,
and we don't learn from them at all.
As we start to merge learning about
different treatments with treating
patients at the same time, we, we
do experiments where the Belmont
Report lays out that it's okay that
these people are being experimented
on, that we do one-to-one.
But in a learning healthcare system
where we want to treat patients
better and learn about therapies,
response-adaptive randomization
is an incredibly powerful tool.
If I owned a healthcare system and
there was a particular treatment
and there's five available drugs
to treat that, uh, IBS, psychiatry,
number of scenarios, randomizing
patients to the different treatments,
response-adaptive randomization,
learning the right therapies, we don't
learn from these patients at all i-in
a randomized causal way for sure,
uh, is an incredibly powerful thing.
So I think response-adaptive
randomization, while it's a powerful
tool in the right setting, it's
a bad tool in other settings, and
that's, that's the part of it.
Um, it has a really strong future
as I think healthcare becomes more
integrated, learning healthcare.
Uh, I think it, it, it's,
it's a powerful future
Kert Viele: And you're gonna be
exploring combinations of therapies.
All of this is gonna go together
in ways that typically aren't done
in, say, a hundred-patient trial.
So just 'cause I can't
Scott: And like the I-SPY 2 trial, we're
gonna have a much better idea of the, how
diseases are cl- similar but different,
heterogeneity of treatment effect.
Uh, all of this comes together where I
think res- response adaptive randomization
is, is going to be even more valuable.
Thompson is going to, post
100 years, get more and more
citations as, as time goes on.
Uh, which I think is Thompson
sampling, it's even called.
Um,
Kert Viele: Certain types?
Yep
Scott: All right.
Well, it took us, it took us 60-plus
episodes, uh, before we got to
response adaptive randomization.
I, I suspect it's not our last
discussion of this, but, uh, one of
my favorite topics and, and one of
those that keeps people listening
So thanks for joining, Kert.
Appreciate everybody out
there tuning in today.
Until next time, we'll
be here in the interim
Kert Viele: Thanks, Scott