A podcast on statistical science and clinical trials.
Explore the intricacies of Bayesian statistics and adaptive clinical trials. Uncover methods that push beyond conventional paradigms, ushering in data-driven insights that enhance trial outcomes while ensuring safety and efficacy. Join us as we dive into complex medical challenges and regulatory landscapes, offering innovative solutions tailored for pharma pioneers. Featuring expertise from industry leaders, each episode is crafted to provide clarity, foster debate, and challenge mainstream perspectives, ensuring you remain at the forefront of clinical trial excellence.
Judith: Welcome to Berry's In the
Interim podcast, where we explore the
cutting edge of innovative clinical
trial design for the pharmaceutical and
medical industries, and so much more.
Let's dive in.
Scott: Well, welcome everybody
back to In The Interim.
I'm your host, Scott Berry.
And as you can tell,
my studio has changed.
So, the new studio.
This is a bit seasonal:
I am here in northern
Minnesota, and you can see it's beautiful.
That's not a
virtual background.
Those are the woods outside
of our little cabin here in Minnesota.
We get to escape the heat of
Austin, Texas and spend some time
here in the woods of Minnesota.
So I'm in a little
office above the garage
here in northern Minnesota, and
I appreciate you joining me.
So I wanna introduce today's topic
with a bit of a panicked email, and
then a phone call, I got in October 2016.
The panic was from my father,
my collaborator on I-SPY 2.
And you can go back to previous
episodes of In The Interim
looking at the I-SPY 2 trial.
We actually did two consecutive
episodes about the trial.
Fantastic, fantastic trial.
I'll introduce it
a little bit.
An important part of the story is
the primary endpoint in the trial.
The trial is I-SPY 2, in
neoadjuvant breast cancer.
And the important part of the neoadjuvant
piece is what that means.
Traditionally, surgery comes first.
You're diagnosed with breast cancer.
A surgeon goes in and removes as much
of the breast cancer
as they can, and then you're given
therapies. That's adjuvant treatment.
Neoadjuvant is where you don't go
in and remove the tumor immediately.
You give treatment, and six
months later, surgery is done.
And an advantage of that is you
get to see whether or not the treatment
had any benefit on the tumor itself.
The primary endpoint in I-SPY 2
was pathologic complete response.
That means that when surgery is done six
months after the start of the trial
for a patient, the cancer's gone.
It's a complete response.
There's no evidence of tumor
at the time of surgery.
And again, this is the statistician
telling you the endpoint.
A breast cancer
oncologist could do a much better
job of this, but I'll give you the
parts the statistics needs.
So that surgery is done six months
after, and there's no evidence of cancer.
The tumor's gone.
Meaning that the treatment
that you started and initiated
six months earlier had really
strong benefit for that patient.
A lack of pathologic complete
response would be that surgery is
done, and there's tumor remaining that
the surgeon removes at that time.
Now, that endpoint is
a six-month endpoint.
In the trial, we don't know the result of
that endpoint for a patient for six months.
Now, in the trial, there are earlier
readings within a patient that might
be predictive of that response.
There's an MRI that's taken at
one month, three months, and
then right before surgery.
We call it six months,
but it's relatively
proximal to when surgery is done,
and that MRI gives an estimate of
the amount of tumor that is there.
And even at six months, for example,
the MRI might say there's no tumor
left, but surgery reveals
some remaining tumor.
So even at six months it's not perfect.
Now, in the trial, those
values, the MRI values, are used
within a patient in the modeling.
I'll come back to that.
So what is the
email and phone call panic that
I got? It is that pembrolizumab,
which was one of the treatments in
the trial, had triggered graduation.
Graduation is an adaptive outcome
in the trial where there's at least an
eighty-five percent probability that that
treatment would beat control in a phase
III trial of pathologic complete response.
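To make the graduation rule concrete, here is a minimal Python sketch of a predictive probability of Phase 3 success. It is not the I-SPY 2 code: the Beta posteriors, the 150-per-arm Phase 3 size, the one-sided z-test, and every function name are assumptions chosen only for illustration.

```python
import math
import random

GRADUATION_THRESHOLD = 0.85  # from the episode: at least 85% to graduate


def phase3_wins(p_trt, p_ctl, n_per_arm, rng, z_crit=1.96):
    """Simulate one hypothetical two-arm Phase 3 and test treatment > control."""
    x_trt = sum(rng.random() < p_trt for _ in range(n_per_arm))
    x_ctl = sum(rng.random() < p_ctl for _ in range(n_per_arm))
    pooled = (x_trt + x_ctl) / (2 * n_per_arm)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        return False  # no variation, so no evidence of a difference
    return (x_trt - x_ctl) / n_per_arm / se > z_crit


def predictive_probability(trt_beta, ctl_beta, n_per_arm=150, n_sims=2000, seed=1):
    """Average Phase 3 success over posterior draws of the two pCR rates."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        p_trt = rng.betavariate(*trt_beta)  # assumed Beta posterior, treatment
        p_ctl = rng.betavariate(*ctl_beta)  # assumed Beta posterior, control
        wins += phase3_wins(p_trt, p_ctl, n_per_arm, rng)
    return wins / n_sims


# Made-up posteriors with means near 0.60 and 0.23.
pp = predictive_probability((18, 12), (23, 77))
print(pp, pp >= GRADUATION_THRESHOLD)
```

The point of the sketch is that graduation is a statement about a future trial, so uncertainty in the current estimates and sampling noise in the future trial both enter the calculation.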
Now, the role of pathologic complete
response is uncertain.
Do we need disease-free
survival as an endpoint?
Do we need a different endpoint in this?
And what's the role of pathologic
complete response in breast cancer?
Let's not get into that.
But in the trial,
that's one of the outcomes,
and think of it as a positive
outcome: the arm has graduated.
It means enrollment to that arm in I-SPY stops.
All patients would go through six months.
They would continue on their therapy,
but no new patients would be enrolled.
Essentially, the treatment
has had a positive outcome.
I-SPY 2 is a platform trial with
many therapies going through it,
pembrolizumab being one of those.
Now, before I get to the panic, I'll back
up even a little bit more. In
I-SPY 2, Berry Consultants wrote the
code that runs that adaptive trial.
The trial uses response adaptive
randomization across multiple subtypes
of disease, defined by hormone receptor status
and HER2 status, and the adaptive design
was driven by Bayesian modeling of that.
And Berry Consultants and I were
involved in writing the code for that.
Now, coming back to
that, what's the panic?
At the time that it triggered for
graduation, the estimated pathologic
complete response rate for pembrolizumab
was sixty percent, and the control rate
had an estimate of twenty-three percent.
Very, very successful in the trial.
The panic was only one patient at
that time had an observed outcome
of pathologic complete response.
That one patient had had the six-month
surgery, and that was the only outcome we
knew out of the patients on pembrolizumab.
And the panic was, what's happening?
Is the modeling broken?
We have one patient, and it
estimates a sixty percent response
rate compared to twenty-three
percent on control.
Something must be broken.
Okay, so that's the panic.
Now, I'll come back to, to that
part of the story, but I wanna
shift things to adaptive designs,
and I-SPY 2 was an adaptive trial.
I'll come back and talk a little bit
more about I-SPY 2 and this October
2016 panic.
But let's talk a little
bit about adaptive trials.
So what is an adaptive trial?
An adaptive trial is a, a clinical
trial that has pre-specified
dynamic aspects to it.
I try to not use the word change
because the protocol is set up to
have these dynamic pieces to it.
So in I-SPY 2, there's adaptive allocation
to the different treatment arms.
There's triggering of
graduation of an arm.
All of that's pre-specified.
The protocol doesn't change.
It says in the protocol
these things will happen.
These are key aspects of the
trial that are dynamic and by
design can happen in the trial.
Yes, from the perspective of the
design, it feels like these are
things that are changing in the trial.
You know, we stop enrolling
to pembrolizumab when it
hits a trigger of graduation.
The role of that is to accelerate
development of it to get it to Phase 3.
I-SPY 2 is a Phase 2 trial.
Now, during the trial, these things are
triggered by data in the trial itself.
When I explain to people one
of the things I do as a statistician
(the main thing Berry does is build
adaptive trials, and I've been building
adaptive trials for 26-plus
years now), it seems strange to them:
"Okay, I don't understand.
You mean normally
the trial doesn't change?"
And to explain that, really the opposite
of an adaptive design is a fixed trial.
At the beginning of the trial
in a fixed trial, you say, "Let's
enroll 200 patients on a control.
Let's enroll 200 patients on a
treatment, and let's wait till all
of those patients get six-month data,
and then we'll look at the data and
figure out whether pembrolizumab
or whatever treatment it is works.
And then we make the
decisions going forward."
That's a fixed trial.
Now, many times in a fixed trial, you
look at the data and you say, "Oh, shoot,
I wish the sample size was smaller.
We might've known this treatment
was effective long before we
got to, to, to 200 on each arm."
In I-SPY 2,
120 is the maximum, so
long before you get to 120,
you might think that treatment
works or it doesn't work.
Or I wish we'd have put different
kinds of patients on that treatment
compared to another treatment.
I wish we would've changed doses.
More generally, outside of
I-SPY 2, we could be changing
doses in a trial.
We could be altering the
patients enrolled in a trial.
We could move it to a different stage
of trial where it selects a dose and
it moves to a, a confirmatory part
of the trial, and I'll talk about
a trial that, that did that today.
So lots of things in the trial
can be dynamic.
Now, why is that? What's
the promise of adaptive trials?
We think about fixed trials where you
say, "Let's go enroll 200 patients on
each arm, and we'll look at the data
when all of the patients have reached
a time point that we consider primary."
We look at the data and we learn something
from that, and maybe we approve a
treatment, maybe we go to phase three,
maybe we run another phase two.
Whatever it is, the role of the
trial, that's a fixed trial.
The promise of this is that we can change
this trial in ways that improve the trial.
By, by changing the allocation,
by changing the doses, by
accelerating to another stage,
we make the trial design better.
That's really promising.
There are a lot of things in trials,
in a fixed trial, we guess at.
We guess at the effect size.
We guess at which patients
might, might benefit from it.
We, we guess at the doses.
We, we guess at a lot in the trial.
We guess at the variability
of the outcome, the, the
prevalence of events in trials.
We guess at a lot.
The promise of adaptive trials is
during the course of the trial,
you learn a lot about that.
If you look at the data in the
trial itself, you learn how
effective the treatment is.
You learn about variability.
You learn about the behavior of
different treatments, the behavior
of different doses, and then we
make the trial better given that.
Had we known that information
before the trial started, we
would've designed a better trial.
That's the promise of adaptive
trials, that we learn things to
make that trial more efficient.
We make the engine better.
It's an incredibly promising
thing about adaptive trials.
And if you think about it, the
opposite of it is you close your eyes
and you hope, four years later, that
everything you designed was the right thing to do.
So that's the promise
of an adaptive trial.
Now, the crux of it is as
that adaptive trial is going,
we need mechanisms to learn.
There's data
being generated in the trial.
We need to learn from that data. If you
think about I-SPY 2, we need to learn from
the pathologic complete response outcomes
that come out of that trial at six months.
In an Alzheimer's trial, we might be
interested in the 18-month response.
In a trial of a vaccine,
we're interested in events
where somebody comes down with
the disease that the vaccine's
supposed to combat, and we're
waiting for those events to happen.
That becomes the thing
we're learning from.
So we need models in the trial itself as
a learning mechanism, and those models
drive the dynamic things in the trial.
That's, that's what we need.
So there's two key parts
to an adaptive trial.
We need to learn so that we make
changes, and then when we design the
trial, we figure out what changes
in the trial make learning more
efficient, make the trial better,
treat patients better in the trial.
All of those things are the
dynamic changes, but we need
the data in the trial to learn.
If we try to make
adaptations and we've learned nothing
in the trial, it's all
based on things we thought going
in, and we already designed the
trial as well as we can for that.
So it's really these two key things.
We need to learn from the data, and that
drives changes in the trial that make it
more efficient, make it a better trial.
So in that I-SPY 2 example, graduation
is an adaptation in the trial.
We need to learn from the data to say,
"Ah, this treatment should graduate."
Now, the circumstance of
the panicked email was that we have
one patient at six months.
The patient
was a positive response.
It was a pathologic complete
response, but that's one.
How is your learning mechanism
saying we have at least an eighty-five
percent chance that it's better than
a twenty-three percent response rate?
If it's based on one out of one,
we're not gonna come up with
an eighty-five percent probability.
The data's not strong enough.
In that trial, we have built
models to accelerate the learning.
Now, you come back to this whole
question about adaptive trials.
If we're waiting for everybody to
get to six months, the trial's slower.
The trial, to make changes, to make dynamic
adaptations, has to wait a long time, has
to wait for patients to get to six months.
Meanwhile, we've enrolled a bunch
of patients earlier than that.
Have we allocated them correctly?
Now, in the trial, I brought
up these MRIs that are done
at one month and three months.
In I-SPY 2, we built a model that looks
at the correlation between pathologic
complete response and the MRI at one month
and three months. The MRI is a quantitative
measure of the reduction in tumor from baseline;
a hundred percent reduction is getting
to pathologic complete response.
If there's no tumor in the MRI, we
think that patient is highly likely
to be a pathologic complete response.
And hence, patients that have taken
pembrolizumab that have MRI values that
are really positive or really negative
inform what we think the pathologic
complete response rate is through what
I'm gonna call a longitudinal model.
So the longitudinal model is
the statistical mechanism that's
learning what does one-month
MRI predict about six-month PCR?
What does three-month MRI
predict about six-month PCR?
What does six-month MRI predict
about six-month pathologic
complete response?
Surgery happens after that, not too
far from it, so it's not a huge benefit
to the trial at six months, but three
months can speed up the learning.
One month can speed up the learning.
So we build models for that to accelerate
the learning about the primary endpoint.
We haven't changed the endpoint.
The endpoint is six-month pathologic
complete response, but early
values are driving that estimate,
which is driving the adaptation.
Now, an important question is,
what does that model look like?
I'll come back to specifically
what that model looks like
and how we learn from that.
That's the key aspect in adaptive trials.
So when people ask me what
scenarios, what trials can benefit from
adaptive design, the most important
part of it is that speed of learning.
So suppose you have a migraine trial.
There are treatments that are
trying to prevent migraines, but suppose
it's the treatment of acute migraine.
A lot of times we look at two
hours later, are you pain-free?
So when you give a treatment to
a patient, you know the outcome
of that patient two hours later.
We can do a lot of really efficient
things adaptively because the
learning is really fast, and so we
can be very efficient. Alternatively, if
the endpoint doesn't come for 18 months,
in that scenario we've got a whole lot
of patients that have been enrolled that
we don't know anything about, and we're
waiting for them to get to 18 months.
It's hard for the trial to be efficient.
Should we stop enrolling?
Should we change doses?
Should we make changes?
We don't know anything at that point.
So can we speed that up?
So adaptive trials become more
efficient relative to fixed trials
the faster we can accelerate that
learning, and that becomes a huge part
of adaptive trials, and it becomes
the huge fight that we have with time.
We're fighting time, in that
case, to make the trial more
efficient, to treat patients
better, to learn more efficiently.
That's the whole promise
of adaptive trials.
A huge part of building adaptive trials
is this acceleration of learning within
single patients that are moving through.
It's what I've
been doing for 26 years.
Now, the interesting thing about it, a
lot of fixed trials don't care about this.
If your primary endpoint is 12 months,
and you're gonna enroll 200 patients,
and you're gonna wait till everybody
gets to 12 months, you don't really
care if six months is predictive of
12, if three months is predictive.
It's irrelevant to the primary analysis.
Now, you might do an MMRM model that
incorporates that, but it's all done when
all patients have had the opportunity
to complete the primary endpoint time.
Now, there's a whole world of
missing data, but this is a
different kind of missing data.
These are patients that just haven't
reached that time point, which is a
whole new thing in adaptive trials that
doesn't come up in fixed trials.
So this whole thing of trying to
accelerate the learning is new and
relatively specific to adaptive trials.
Okay, so a lot of what we do is
spend time building
that predictive piece.
So think about fighting time here.
Let's go back to I-SPY 2.
So in I-SPY 2,
what does this model look like?
At one month and three months and
six months, we get an observation
of the size of the
tumor relative to baseline.
And let's think about percent reduction.
So the endpoint that we get at one month
and three months is a percent reduction.
We could get 100% reduction.
We could get 50% reduction in that.
Now, suppose
a woman at one month
has an 80% reduction in tumor size.
What does that mean about the likelihood
of pathologic complete response?
That's really what we're
interested in.
The beauty of the trial itself
is that patients pass through
one month and six months.
We get to see a lot of observations
on women that went from one month and
had an MRI, and then we found out were
they a pathologic complete response.
So we get a lot of pairs of what was
the percent reduction at one month, and
were they a responder at six months?
So we're building a model
here for that prediction.
You could build a logistic regression
model for that, from that percent
reduction to a yes/no response.
Now, we built that original model, and
a lot of times we're building that model
for an adaptive trial, and we don't have
the data in the middle of the trial.
We don't get the luxury of
opening up the data and building
a model and saying, "Oh, this is
the behavior of the model.
This is the best-fitting model.
Here's the BIC," or
whatever modeling approach you take.
We don't have it.
We build it before the trial starts.
It's hard-coded.
It runs the model at, at that
time point, so we have to
build the model ahead of time.
In I-SPY 2, we actually had
the luxury of I-SPY 1.
I-SPY 1 was an observational cohort.
It didn't change the
treatment of patients, but it got
these observations, and so we had
about 40 patients where we
could build a model.
Now, we want that model to be robust.
We don't want it to be overly strong
in ways that make bad forecasts
of PCR. It's somewhat hard-coded,
it's running, and it doesn't
necessarily know it's not fitting
well, and then it's driving decisions
about allocation of women with
breast cancer based on a model that
maybe isn't fitting very well.
So we want this to be really robust.
So the model we built in that
circumstance is a piecewise
prediction model where we actually
had 13 pieces to the model that then
forecasts the relative odds of a PCR.
What do I mean by that?
We broke that interval up. Suppose
the reduction is zero to 10%.
That's an outcome.
By the way, the tumor could grow.
That's the worst outcome.
Then it could be zero to 10,
10 to 20, 20 to 30, 30 to 40, and
so on, all the way up to 80
to 90, and then we break out 90 to 95.
Remember, pathologic complete response
means at six months there's zero tumor.
So 90 to 95, 95 to 99,
and then more than 99.
Those are the 13 classifications of the
MRI at one month, and then a separate
instance of the model is at three months,
and a separate instance is at six months.
Then we learn the
relative odds of a responder as a
function of which category your MRI is in.
And that's learned
from past women in the trial.
One of the beautiful things
about a platform trial is we have
a lot of data on that in I-SPY 2
from previous arms and
the control throughout.
So that model is running.
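The 13-category idea can be sketched in a few lines of Python. This is a simplified illustration, not the I-SPY 2 model: the episode describes learning relative odds, while this sketch just keeps an independent Beta-binomial pCR rate per category. The bin-edge handling, the Beta(0.5, 0.5) prior, the function names, and the example pairs are all assumptions.

```python
# Upper edges of the bins between "tumor grew" (category 0)
# and "more than 99% reduction" (category 12).
EDGES = [10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99]


def mri_category(pct_reduction):
    """Map percent reduction from baseline to one of 13 categories (0..12)."""
    if pct_reduction < 0:
        return 0  # tumor grew
    for i, upper in enumerate(EDGES):
        if pct_reduction <= upper:  # assumed: an edge value falls in the lower bin
            return i + 1
    return 12  # more than 99% reduction


def category_pcr_rates(pairs, prior_a=0.5, prior_b=0.5):
    """Posterior mean pCR rate per category from (pct_reduction, pcr) pairs."""
    counts = [[prior_a, prior_b] for _ in range(13)]
    for pct, pcr in pairs:
        counts[mri_category(pct)][0 if pcr else 1] += 1
    return [a / (a + b) for a, b in counts]


# Made-up (one-month MRI reduction, six-month pCR) pairs from past patients.
pairs = [(100, True), (99.5, True), (97, True), (97, False),
         (45, False), (5, False), (-12, False)]
rates = category_pcr_rates(pairs)
print([round(r, 2) for r in rates])
```

A separate set of rates would be kept for the three-month and six-month MRIs, matching the separate model instances described above.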
So at the time, let's go back to
this time where one patient had gone
through, and this estimate was 60%.
We had a number of women at that time in
the trial, and I'm gonna try to remember.
I believe we had some 60-plus
women who were on the arm that
had been allocated at that time,
but only one had passed through.
Now, all of the other women in the trial
have some level of maturity of
their MRI data, one month, three month.
Some haven't even
reached one month, and they
don't inform the
six-month PCR response.
And then we have some that
have six months but haven't had
surgery yet at that time point.
So when this panic happened, of
course, we jumped in and we're
looking at the model.
You want humans to watch this
automatic pilot to make sure that
if the model is broken to some
extent, we catch it and make sure
it's doing what it's supposed to do.
So we went in and we investigated.
And as Don says, pembrolizumab
was just melting the tumor away.
The values of the MRI for that
treatment were really, really good.
Lots of them were 99% reduction, 95
to 99% reduction, so the model was
predicting that a lot of these were
going to be six-month PCRs, and the
fraction of them predicted to be
six-month PCRs was 60%.
It was all based on
the longitudinal model.
We had one patient.
It was entirely driven by the longitudinal
prediction of those patients. By the
way, at that time, I should back up.
Some 60 patients had been
enrolled with pembrolizumab.
The I-SPY 2 trial had these
different
subtypes that are
differentiated largely by HER2
status and hormone receptor status.
One of those subtypes is called
triple-negative breast cancer, and
that's where you're negative on
hormone receptor status and HER2 status.
There were 29 of those patients that
had been enrolled, and one of those
had passed through the six months.
That was the subtype that was
estimated to be a 60% response.
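How can one observed outcome sit next to a 60% estimate? A rough Python sketch of the mechanics follows. It is a plug-in simplification rather than the trial's Bayesian model, and the category rates, the patient records, and the function name are invented for illustration.

```python
def expected_pcr_rate(patients, category_rate):
    """Plug-in estimate of an arm's pCR rate.

    Observed outcomes count as 0 or 1. Patients still waiting for surgery
    contribute the pCR probability predicted from their latest MRI category.
    """
    total, n = 0.0, 0
    for p in patients:
        if p["pcr"] is not None:        # surgery done, outcome known
            total += 1.0 if p["pcr"] else 0.0
        elif p["mri_cat"] is not None:  # outcome pending, MRI available
            total += category_rate[p["mri_cat"]]
        else:                           # no MRI yet: contributes nothing
            continue
        n += 1
    return total / n if n else None


# Hypothetical predicted pCR probabilities for three MRI categories.
category_rate = {12: 0.8, 11: 0.6, 5: 0.2}
patients = [
    {"pcr": True, "mri_cat": 12},     # the one patient with surgery
    {"pcr": None, "mri_cat": 12},
    {"pcr": None, "mri_cat": 11},
    {"pcr": None, "mri_cat": 5},
    {"pcr": None, "mri_cat": None},   # enrolled, no MRI yet
]
estimate = expected_pcr_rate(patients, category_rate)
print(estimate)
```

A full Bayesian version would also carry the uncertainty in each predicted probability, which matters when deciding whether a threshold such as 85% has been met.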
At the time, the model said, "Graduate this,
stop enrolling, and pembrolizumab should
go to Phase 3." The final analysis then takes
place when all patients get to six months.
For statisticians, it's
another really neat, exciting
area where we're doing these forecasts.
It's forecasts and predictions of
what's gonna happen, and you stop
enrolling a treatment arm, and then
you get to see the outcome of it.
So you're doing these predictions,
but you get to observe the outcome.
In many circumstances, we might make
predictions, and we never actually
get to see the outcome of that, but
it's forecasting what would happen.
Here, we get to see the outcome of that.
So there was nothing wrong with the model.
It was functioning exactly as designed.
It made some people nervous that it
was entirely this forecast.
Well, pembrolizumab stopped enrolling, and
the trial read out, and it was published
in JAMA Oncology, and the final estimate
for the PCR response rate was 60%.
The final posterior
estimate matched what the model had
forecast when it triggered, almost exactly.
And by the way, it was an
over 99% probability of winning a
Phase 3 trial in triple-negative
breast cancer, and of course,
pembrolizumab is an amazing therapy
now in many, many different
cancers. So it actually hit exactly.
It ended up 60% to a control
rate of 22%, and you can go see
the posterior distribution in
that paper, Nanda et al.
So it triggered perfectly, six months
ahead of when it would have if we'd waited for
all of those patients, and meanwhile,
we would've enrolled more in that arm.
Now, in the design you wanna set
up, what do we want the trial to do?
Do we want it to do that?
These are all decisions made
before the arm started in the trial.
Okay, so accelerating it.
That's a case where even six months...
By the way, a huge problem in breast
cancer is there are many really effective
therapies, looking at endpoints like
overall survival, event-free survival.
These endpoints are
thankfully very, very long.
To do trials and learning
trials in an adjuvant setting
looking at those endpoints is
really, really hard and long.
And so a huge desire in many of
these diseases is looking at endpoints
that are earlier and that are predictive.
That's a different part of
disease modeling and longitudinal
modeling and hugely important.
But that's a separate thing:
looking for surrogate markers. It's a huge thing,
and Berry does a lot of that modeling
in other diseases and cancer.
But this is the setting where
let's fix that the
trial has a primary endpoint, and
within that trial, we're trying to
accelerate to that primary endpoint.
Okay.
What are other examples of this?
What are other examples of
fighting time in clinical trials?
I'll go back to the BAN2401
adaptive trial for Alzheimer's.
The treatment is now called lecanemab.
It was the first disease-modifying
treatment approved for Alzheimer's.
Eisai ran
the development of this.
They ran a Phase 2 trial, and in the Phase
2 trial, they had five doses and control.
The doses were different dose
levels and frequencies, monthly
or bimonthly infusions.
And they ran an adaptive trial
that included response adaptive
randomization over the different doses.
Whichever doses
were working better, they wanted
to put more patients on those to learn
what is the right dose going forward.
They wanted proof of concept.
If the predictive probability
of winning Phase 3 got high enough,
they wanted to jump to Phase 3.
They could stop for futility.
The trial had an adaptive sample size
from three hundred to eight hundred.
The primary endpoint in the
trial was 18 months in that.
Now, to make all these decisions in
the trial, it's hard to do a response
adaptive randomization if you have to
wait for patients to get to 18 months.
Now, the primary endpoint in that trial
was ADCOMS, which is a cognitive measure.
It's actually a combination of different
questions from CDR Sum of Boxes,
ADAS-Cog, and MMSE, selected
to be cognitive-type endpoints
that would be more informative
early in the disease course, for
early Alzheimer's disease, which
is what was studied in this trial.
In order to be an efficient design, we
needed to learn faster than 18 months.
We get three, six, nine,
12, 15, 18 months values.
Every three months, we get patients' data.
All of that data was used to forecast the
effect of the different doses relative
to control, estimate the control rate,
to, to speed that learning up, to have
all of those adaptations take place.
So we use the same
endpoint observed at earlier time periods.
They're not primary.
They're being used
as an auxiliary endpoint, a predictive
endpoint for the final cognitive outcome
for a patient, which is 18 months.
Now, in this trial,
the early part of the
trial was largely predicting 12 months, but
the full exposure went out to 18.
So I'm gonna alter that a little
bit and talk about 12 months.
The, the adaptive algorithm
was driven by the 12 months.
Now, the model we used was a linear
regression model between three months and
12 months, six months and 12 months, nine
months and 12 months, informed by previous
patients that had passed through
the trial, passed through those endpoints
in the trial itself, and we had previous
data on the expected correlation of those.
We used prior distributions for those
correlations from previous data,
but then that's updated in the trial
itself on that correlation to then
drive the estimates of the treatment.
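As an illustration of a prior on the early-to-late relationship being updated inside the trial, here is a minimal Python sketch. It deliberately simplifies: one slope through the origin, a known residual standard deviation, and a normal prior on the slope. None of these choices, names, or numbers come from the BAN2401 design.

```python
import math


def update_slope(prior_mean, prior_sd, xs, ys, resid_sd):
    """Conjugate normal update for the slope in y = slope * x + noise.

    xs are early-visit changes, ys are the matching 12-month changes,
    taken from patients who have completed both visits.
    """
    precision = 1 / prior_sd**2 + sum(x * x for x in xs) / resid_sd**2
    mean = (prior_mean / prior_sd**2
            + sum(x * y for x, y in zip(xs, ys)) / resid_sd**2) / precision
    return mean, math.sqrt(1 / precision)


def predict_final(x_early, slope_mean, slope_sd, resid_sd):
    """Predictive mean and sd of the 12-month change given an early change."""
    mean = slope_mean * x_early
    sd = math.sqrt((x_early * slope_sd) ** 2 + resid_sd**2)
    return mean, sd


# Prior from earlier data (hypothetical), then updated with in-trial pairs.
slope_mean, slope_sd = update_slope(1.5, 0.5, [0.1, 0.2, 0.3], [0.2, 0.4, 0.6], 0.01)
print(slope_mean, slope_sd)
print(predict_final(0.15, slope_mean, slope_sd, 0.01))
```

Early in the trial the prior dominates; as more patients complete both visits, the in-trial pairs take over, and the prediction interval for patients with only early data tightens.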
It's a beautiful example, in
part because Eisai published the results
of every one of the interim analyses.
You can go in and see what it
forecast at the first interim analysis
of the trial, which was at 200 patients.
And then every 50 patients in the
trial, it did an update.
At that first interim analysis,
which is triggering 12 months, there
were zero patients at 12 months.
At the second interim analysis
at 250, I think we had several
that had reached 12 months.
A good number have gotten to nine
months, which is really predictive
of 12 months and six months, so
they're progressing through that.
Response adaptive randomization
started at that 200.
At that time, the response adaptive
randomization immediately wanted the two
high doses, 10 monthly and 10 bimonthly,
and kind of went away from the lower doses
based on cognitive values at
mostly three and six months.
The randomization probabilities
change, then the data get updated 50
patients later, and the exposure of the
early patients gets longer, which is
critical in adaptive design because that's
the most valuable data in the trial.
The values were updated again.
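One common way to turn posterior estimates into randomization probabilities is to allocate in proportion to each dose's probability of being the best. The Python sketch below shows that generic rule only; it is not claimed to be the rule used in the BAN2401 trial, and the normal posteriors, arm names, and numbers are assumptions.

```python
import random


def allocation_probabilities(arm_posteriors, n_draws=5000, seed=7):
    """Estimate P(arm is best) for each arm by Monte Carlo.

    arm_posteriors maps arm name to (mean, sd) of a normal posterior
    for benefit over control, where larger is better.
    """
    rng = random.Random(seed)
    wins = {arm: 0 for arm in arm_posteriors}
    for _ in range(n_draws):
        draws = {arm: rng.gauss(m, s) for arm, (m, s) in arm_posteriors.items()}
        wins[max(draws, key=draws.get)] += 1
    return {arm: w / n_draws for arm, w in wins.items()}


# Hypothetical interim posteriors for three active doses.
probs = allocation_probabilities({
    "low": (0.00, 0.10),
    "mid": (0.05, 0.10),
    "high": (0.30, 0.10),
})
print(probs)
```

Re-running this at each interim, as exposure on the early patients lengthens and the posteriors sharpen, is what moves allocation toward the better-performing doses.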
Interestingly in the trial, the final
sample sizes of this, by the time
everybody got to 18 months, were about
250 patients on the control, placebo.
About 250 patients on 10 monthly
and about 175 on 10 bimonthly.
If you add together the other
three doses, it was about 175.
So these two doses got the lion's
share of patients, appropriately so.
The 10 bimonthly arm is
the arm that eventually went to the phase three
trial and demonstrated superiority,
about a 27% slowing of disease.
Interestingly, in the adaptive trial
of 800 patients, if you go through the
movie of what happened, at about
interim five or six or
seven, right in there, the algorithm
started to estimate a 25% to 30% slowing
and a probability of benefit above 95%.
The probability of at least a 25% slowing,
which was the criterion in the
trial, what they wanted to use to
define go, at interim five was about 75%.
Seventy-five percent chance that
you have at least a 25% slowing.
The final number when all patients got
through 18 months, years later, was 76%.
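A quantity like "the probability of at least a 25% slowing" can be computed from posterior draws. This Python sketch assumes slowing is defined as one minus the ratio of treated decline to placebo decline, and it uses made-up normal posteriors; neither the definition nor the numbers are taken from the trial's analysis plan.

```python
import random


def prob_slowing_at_least(placebo, treated, threshold=0.25, n_draws=20000, seed=3):
    """P(1 - treated_decline / placebo_decline >= threshold) by Monte Carlo.

    placebo and treated are (mean, sd) of normal posteriors for decline.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        d_placebo = rng.gauss(*placebo)
        d_treated = rng.gauss(*treated)
        # Slowing is only meaningful when placebo patients actually decline.
        if d_placebo > 0 and 1 - d_treated / d_placebo >= threshold:
            hits += 1
    return hits / n_draws


# Hypothetical posteriors: placebo declines 0.20, treated declines 0.14.
p_go = prob_slowing_at_least((0.20, 0.02), (0.14, 0.02))
print(p_go)
```

This is a different question from "is there any benefit at all", which is why a probability of benefit above 95% and a probability of at least 25% slowing near 75% can hold at the same time.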
Now, there was the huge value of
the follow-up and all of that, but
the algorithm was forecasting that.
Now it's forecasting it by dose and
placing patients on
the right doses, and it made for a
much more efficient trial design.
It enabled them to explore
five doses by the ability to do
response adaptive randomization.
But none of that happens without the
longitudinal modeling of the endpoint.
You learn nothing until it's too
late to make those adaptations.
Okay.
A couple other examples of this,
just to give you the wide view of
what this looks like, different
trial designs utilizing this.
Trulicity is Eli Lilly's...
It was a GLP-1 that kind of kicked off Eli
Lilly's jump into GLP-1s, and
I assume everybody understands
the story of the impact of these.
This was treatment of diabetes,
and they ran a seamless two-three
trial, the AWARD-5 trial.
You can go see the publications of that.
And in the phase two part of the
seamless trial, they were exploring seven
different doses of dulaglutide.
The trade name became Trulicity
after it was eventually approved.
The seven different doses were explored.
The primary endpoint was HbA1c.
Interestingly, there was a utility
function over four endpoints where they
wanted to make the decision on allocating
the doses and dose selection for phase
three, taking into account HbA1c
change, blood pressure, and heart rate.
So they didn't want those
things to be elevated to the
point of cardiovascular risk.
The fourth endpoint was weight loss.
The more weight loss, the more benefit
was perceived, the better the
therapeutic profile, and the more
patients would want to take the treatment.
If there was weight gain, there was no
commercial aspect to this treatment.
Ironically, that became the huge part:
GLP-1s are now approved specifically
for weight loss, even in non-diabetics,
and that played a huge role.
In that trial, that was one of four
endpoints in a utility function.
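To make the idea of a utility function over four endpoints concrete, here is a minimal sketch that scores a dose by multiplying one component utility per endpoint. The multiplicative form, every cut point, and the names `piecewise` and `dose_utility` are hypothetical choices for illustration; they are not the utility function used in the trial.

```python
def piecewise(x, points):
    """Linear interpolation through sorted (x, utility) points, flat outside."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, u0), (x1, u1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return u0 + (u1 - u0) * (x - x0) / (x1 - x0)

def dose_utility(hba1c_change, weight_change_kg, bp_change, hr_change):
    """Combine four endpoints into one score in [0, 1] (hypothetical scales)."""
    u_hba1c = piecewise(hba1c_change, [(-1.5, 1.0), (0.0, 0.0)])  # more reduction is better
    u_weight = piecewise(weight_change_kg, [(-3.0, 1.0), (0.0, 0.5), (2.0, 0.0)])
    u_bp = piecewise(bp_change, [(0.0, 1.0), (5.0, 0.0)])    # penalize elevation
    u_hr = piecewise(hr_change, [(0.0, 1.0), (10.0, 0.0)])   # penalize elevation
    return u_hba1c * u_weight * u_bp * u_hr

with_weight_loss = dose_utility(-1.0, -2.0, 0.0, 0.0)
with_weight_gain = dose_utility(-1.0, 1.0, 0.0, 0.0)
print(with_weight_loss, with_weight_gain)
```

With a multiplicative form, a dose that fails badly on any one endpoint, such as a large heart rate increase, scores zero regardless of how good its HbA1c effect is.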
Over the seven doses, response
adaptive randomization was done, with the
ability to jump from phase two to phase
three, and it was predicting whether it
would show non-inferiority to sitagliptin
as an active comparator for the primary
endpoint of the eventual phase three.
That's a 12-month
endpoint, and during the trial we're
enrolling and we're getting monthly
values on HbA1c change, and we built a
model between those values and 12 months.
It jumped to phase three.
It selected two doses, 1.5 milligrams
and 0.75 milligrams, to go to phase
three, and
there were no patients at 12 months.
It was entirely based on the
longitudinal modeling of HbA1c change,
which was pretty well understood,
the behavior of that endpoint.
They had previous data on
patients, and again, we spent a
lot of time building that model.
Interestingly, early on, the
question was, what's a better
predictor of 12-month HbA1c?
Early values of HbA1c or fasting
blood glucose, which is a measure
that is thought to move faster, and
maybe that's a better predictor.
We actually had data to jump
in and look at if we were gonna
predict 12 months which one was
better, and HbA1c was better.
It's thought that that's kind of an
integrated value over time, and maybe
it's slow-moving, but it was a
much better predictor of 12 months
than fasting blood glucose was.
And so these decisions in the building
of that trial all go in because
we're trying to accelerate it.
Now, Lilly did something tremendous
in that trial, and I've not seen it
in another trial: they knew that
this longitudinal model was a really
important part of selecting a dose.
That dose that went to phase 3 spawned
cardiovascular trials, other trials.
That was the decision on dose.
It was making that drug
development decision.
The 1.5 milligram dose became the dose
that was used for dulaglutide and has
made billions for Eli Lilly.
And so that key decision, they
didn't want to enroll too fast.
Remember, this whole thing
is about time to information.
How quickly do we learn to make good
decisions, like a seamless decision to
go to phase 3? They controlled enrollment.
We simulated the trial for a range of
enrollments, and we saw that if you enroll
too fast, you pile patients in, it's less
efficient than enrolling slower to allow
for further follow-up and data.
We were making better decisions.
It led to a more efficient
development strategy to go slower.
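A back-of-the-envelope calculation shows why faster enrollment can carry less information per enrolled patient. Assuming perfectly uniform accrual (an idealization) and a hypothetical helper named `patients_with_followup`, the sketch counts how many patients have reached a given follow-up at the moment the n-th patient is enrolled. The enrollment rates are illustrative, not those of the trial.

```python
def patients_with_followup(n_enrolled, per_month, followup_months):
    """Patients with at least followup_months of follow-up at the moment
    the n-th patient enrolls, assuming uniform accrual of per_month."""
    months_elapsed = n_enrolled / per_month
    cutoff = months_elapsed - followup_months  # enrolled before this time
    return max(0, min(n_enrolled, int(cutoff * per_month)))

# Same 200 patients enrolled, same 12-month endpoint, different speeds.
slow = patients_with_followup(200, per_month=10, followup_months=12)
fast = patients_with_followup(200, per_month=40, followup_months=12)
print(slow, fast)
```

At 10 patients a month, 80 of the 200 already have the 12-month value when the 200th patient enrolls; at 40 a month, none do, so every decision up to that point rests entirely on early values.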
And they were worried that if they handed
it over, operationally the instinct
would be to enroll fast, fast, fast.
They knew that slower was actually a more
efficient development strategy.
And they controlled it, and it went there.
It jumped to phase three
as soon as it could.
The data were amazing for the
treatment, and it showed superiority
to sitagliptin on HbA1c change.
And they say it accelerated
development by 12 to 18 months. The
key to that was the longitudinal
modeling of HbA1c and weight loss,
the prediction of six-month weight
loss based on earlier values, which
is very, very predictive, by the way.
Okay, so also a really
nice example of that.
We do a lot of stroke trials.
ICECAP was the topic of
a podcast last week where the
endpoint is 90-day modified Rankin.
It's an ordinal scale, zero through
six, of their status at 90 days.
In the ICECAP trial, we
use their 30-day modified Rankin
status as a predictor of 90 days, and
it's an incredibly good predictor.
Interestingly, by the way, if you go
back to the I-SPY 2 example, I know
I'm jumping around here a bit, the
prediction from the MRIs, what was really
interesting is the most predictive thing
is if the reduction wasn't very great,
if it was less than a 50% reduction.
At one month, that wasn't too bad.
But if it was a 25 or 30% reduction,
very unlikely you're gonna be a pCR.
pCR is a high hurdle.
If you weren't seeing 50%-plus
reduction at the first month and
90-plus at the third, you were unlikely.
If you were seeing 90...
90 to 95, it was kind of 50/50
whether you're gonna get there.
If you're seeing 99%-plus,
then you're highly likely.
Some of the most predictive part in
that trial was the lack of response.
The lack of good MRIs became
very, very predictive, whereas other
values are less predictive but still
valuable in showing it's not negative.
In mRS, for example,
a 30-day mRS of six, which means the
patient has died, is a really strong
predictor of death at 90 days.
Some of the middling values can
bounce around a little bit, but it's
incredibly predictive of 90 days.
You can accelerate your adaptive design
by two months by using 30-day values.
Now, how do you do this?
Well, we incorporate Bayesian models,
not surprisingly, within it.
They're really good for longitudinal
models because they incorporate
uncertainty really, really well.
They can use prior models, prior
estimates of that correlation.
So you're doing analyses with,
with little data early on.
You want it to be smart
in that circumstance.
So what we do is we build a model,
and we have a distribution of the
parameters of the longitudinal model.
That's updated through Bayesian
modeling, Markov chain Monte Carlo.
And then, based on that longitudinal
model, we make a forecast for every
patient that's early in the trial.
We, we multiply impute those values,
which gives you a complete set of data.
Every patient has a
final value for that.
Then we update the final analysis modeling
because we have a complete trial, and
then we do that again and again and again.
So at every interim, we're simulating
ten thousand complete trials
where that imputation is done
through the longitudinal modeling.
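The loop just described (draw the longitudinal model parameters, impute a final value for every patient still in follow-up, run the final analysis on the completed trial, repeat) can be sketched in a deliberately simplified setting. The assumptions here are loud ones: a binary early marker and a binary final outcome, conjugate Beta posteriors standing in for Markov chain Monte Carlo, a transition model pooled across arms, and made-up data. All function and variable names are hypothetical.

```python
import random

def draw_transition(completers, rng):
    """Draw P(final = 1 | early = e) for e in (0, 1) from Beta posteriors."""
    probs = {}
    for e in (0, 1):
        finals = [f for (early, f) in completers if early == e]
        s = sum(finals)
        probs[e] = rng.betavariate(1 + s, 1 + len(finals) - s)
    return probs

def impute_arm(completers, pending_early, probs, rng):
    """Complete one arm: observed finals plus imputed finals for pending patients."""
    successes = sum(f for (_, f) in completers)
    successes += sum(rng.random() < probs[e] for e in pending_early)
    return successes, len(completers) + len(pending_early)

def prob_treatment_better(arms, n_sims=2000, seed=0):
    """Estimate P(treatment rate > control rate), averaging over imputations."""
    rng = random.Random(seed)
    pooled = arms["control"]["completers"] + arms["treatment"]["completers"]
    wins = 0
    for _ in range(n_sims):
        probs = draw_transition(pooled, rng)  # one draw of the longitudinal model
        rates = {}
        for name, arm in arms.items():
            s, n = impute_arm(arm["completers"], arm["pending_early"], probs, rng)
            rates[name] = rng.betavariate(1 + s, 1 + n - s)  # final analysis draw
        wins += rates["treatment"] > rates["control"]
    return wins / n_sims

# Made-up interim data: (early, final) pairs for completers, early only for pending.
arms = {
    "control": {
        "completers": [(0, 0)] * 14 + [(1, 1)] * 4 + [(1, 0)] + [(0, 1)],
        "pending_early": [0] * 15 + [1] * 5,
    },
    "treatment": {
        "completers": [(1, 1)] * 12 + [(0, 0)] * 6 + [(1, 0)] + [(0, 1)],
        "pending_early": [1] * 14 + [0] * 6,
    },
}
p = prob_treatment_better(arms)
print(f"P(treatment better) = {p:.3f}")
```

Drawing fresh transition probabilities on every pass is what carries the uncertainty in the longitudinal model through to the final analysis, rather than treating the imputed values as if they had been observed.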
Multiple imputation has become a very
common technique now for missing data.
Very valuable technique for missing data.
That's similar to what we're doing in
the middle of an adaptive trial.
So the fun part of this, we do a
lot of this with different kinds
of endpoints, different kinds of
predictors within the trial.
We get to build these models
from them, this ordinal model
of thirty days to ninety days.
If you're a three, how do we
predict values at ninety days?
If you're a four, how
do we predict values?
Do we want those forecasts only to
look at the patients that were threes
at the early time point?
Or do we want a monotonic model, like in
I-SPY 2, where there was a monotonic model
that the more reduction, the better?
So you're learning about all of the values
from neighboring ones as well potentially.
If you're a two at 30 days, which
is a better value than a three, do
we want the forecasts to be better?
So do we want that, or do we want to only
look at patients that are on that value?
Might even have non-monotonic forecasts.
These are all the things that go
into building this adaptive design
machinery that, by the way, when the
trial's over, the machinery goes away
unless you're using it for missing
data within the circumstances of it.
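One of the options raised here, forecasting a patient's 90-day score using only the patients who had the same 30-day score, can be sketched as a separate Dirichlet posterior for each row of a transition table. This is an illustrative sketch, not the model from any of the trials mentioned: the prior weights, function names, and example data are hypothetical, and a monotonic model that borrows across neighboring scores would need additional structure that is not shown.

```python
import random

LEVELS = list(range(7))  # modified Rankin scores 0 through 6

def transition_counts(pairs):
    """Count completed patients by (30-day score, 90-day score)."""
    counts = [[0] * 7 for _ in LEVELS]
    for early, final in pairs:
        counts[early][final] += 1
    return counts

def draw_row(early, counts, rng, stay_weight=1.0, base_weight=0.2):
    """Draw P(90-day score | 30-day score) from a Dirichlet posterior.

    Hypothetical prior: a little mass on every score plus extra mass
    on staying at the same score.
    """
    alphas = [base_weight + (stay_weight if k == early else 0.0)
              + counts[early][k] for k in LEVELS]
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def forecast_90_day(early, counts, rng):
    """Impute one 90-day score for a patient with only a 30-day score."""
    if early == 6:  # a score of six is death, which cannot change
        return 6
    row = draw_row(early, counts, rng)
    return rng.choices(LEVELS, weights=row)[0]

# Made-up completed patients: (30-day score, 90-day score).
rng = random.Random(7)
completed = [(3, 3), (3, 2), (3, 3), (3, 4), (4, 4), (4, 3), (2, 1), (2, 2)]
counts = transition_counts(completed)
forecasts = [forecast_90_day(3, counts, rng) for _ in range(5)]
print(forecasts)
```

With independent rows, a 30-day score that few patients have had so far is forecast almost entirely from the prior, which is the practical argument for letting neighboring scores inform each other.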
We've actually now done a number of stroke
trials where we're using 24-hour NIHSS,
the NIH Stroke Scale, to forecast 90 days,
and it's actually a really good forecast.
It's really, really good.
And then updating it based on seven-day
status, 30-day status to 90-day status.
Again, accelerating the learning
to make a better trial design.
Okay, there are a number of other examples
of this. Wide range of types of variables,
wide range of potential outcome
variables, spending a lot of time on
what those models look like.
How do we make inferences?
What do we know going in?
Prior distributions.
All in the name of accelerating learning
to make a more efficient adaptive trial,
treat patients better, make better
development decisions, enroll patients
in a more efficient way, making a
better design through this mechanism.
All right, appreciate you
joining me here today.
Appreciate you joining me
in the fight against time.
Uh, sometimes we're trying to slow time.
We're all, we're all trying to slow time.
Here, we're trying to speed up time.
We want to improve learning, speed up
learning through longitudinal modeling.
All right, I hope you can make your
adaptive trials more efficient, and until
next time, we'll be here in the interim.