In the Interim...

In this episode of "In the Interim…", Dr. Scott Berry explores the challenge of protracted endpoint timelines in adaptive clinical trials and the statistical strategies used to increase the rate of actionable information gain. Drawing on detailed case studies from breast cancer (I-SPY 2), Alzheimer’s disease (BAN2401), diabetes (AWARD-5/Trulicity), and cardiac arrest, Scott addresses the technical demands of longitudinal modeling and interim data imputation for accelerating learning. The discussion takes a critical, empirical perspective, demonstrating how carefully constructed statistical models, simulation, and Bayesian methods can convert interim patient data into more robust estimates of delayed outcomes and support key design adaptations. The episode is a direct account of the methods, uncertainties, and real-world impact of fighting time in adaptive trials.

Key Highlights

Analyzes how delayed primary endpoints challenge adaptive trial efficiency, and how adaptive trial designs use accumulating in-trial data to inform adaptive allocation, arm graduation, and early trial conclusions.
Dissects the use of longitudinal models in I-SPY 2, in which interim MRI measurements at one and three months are mapped to predicted six-month pathologic complete response, through an ordinal stratified, pre-specified modeling approach—illustrating both the strengths and limits of interim forecasting.
Reviews the BAN2401 adaptive Alzheimer’s trial, where early cognitive assessments were modeled to forecast 12-month outcomes, enabling response adaptive randomization and sample size adaptation based on projections from interim data.
Details the AWARD-5 seamless trial for dulaglutide (Trulicity), where strategic enrollment pacing, predictive modeling of early HbA1c and weight loss, and a utility function across four endpoints supported both dose selection and seamless transition to phase 3 without requiring full cohort maturation.
Summarizes a recent cardiac arrest trial (ICECAP), using 30-day ordinal scales and multiple imputation to predict 90-day outcomes and improve interim decision-making.
Unpacks the importance of prior-data-driven modeling, simulation, and strict robustness checks in the construction of all predictive models used for interim adaptation.

For more, visit us at https://www.berryconsultants.com/

Creators and Guests

Host

Scott Berry

President and a Senior Statistical Scientist at Berry Consultants, LLC

What is In the Interim...?

A podcast on statistical science and clinical trials.

Explore the intricacies of Bayesian statistics and adaptive clinical trials. Uncover methods that push beyond conventional paradigms, ushering in data-driven insights that enhance trial outcomes while ensuring safety and efficacy. Join us as we dive into complex medical challenges and regulatory landscapes, offering innovative solutions tailored for pharma pioneers. Featuring expertise from industry leaders, each episode is crafted to provide clarity, foster debate, and challenge mainstream perspectives, ensuring you remain at the forefront of clinical trial excellence.

Judith: Welcome to Berry's In the
Interim podcast, where we explore the

cutting edge of innovative clinical
trial design for the pharmaceutical and

medical industries, and so much more.

Let's dive in.

Scott: Well, welcome everybody
back to In The Interim.

I'm your host, Scott Berry.

And as you can tell,
my studio has changed.

So the new studio, which, um, uh,
if you go back a bit seasonally,

that, um, I am here in northern
Minnesota, and you can see beautiful.

That's actually, uh, that,
that's not a, that's not a

virtual, uh, background to that.

That is, that is, uh, the woods outside
of our little cabin here in Minnesota.

We get to escape the heat of
Austin, Texas and spend some time

here in the woods of Minnesota.

So I'm in a little, uh, uh,
office, uh, above the garage

here in northern Minnesota, and
I appreciate you joining me.

So I, I wanna introduce today's topic
with a bit of a panicked email and

then phone call I got in October 2016.

The, the panic was from my father,
uh, my collaborator on I-SPY 2.

And you can go back to previous
episodes of In The Interim

looking at the I-SPY 2 trial.

We actually did, uh, two consecutive
episodes about the trial.

Fantastic, fantastic trial.

I'll, I'll, I'll introduce
a little bit, but it is…

A, a important part of the story is
the primary endpoint in the trial.

The trial is I-SPY 2
neoadjuvant breast cancer.

And the important part of the neoadjuvant
part of th-this is that what that means is

rather than, uh, traditionally surgery…

You're diagnosed with breast cancer.

A surgeon goes in and removes as much
as the breast-- of the breast cancer

as they can, and then you're given
therapies. That's adjuvant treatment.

Neoadjuvant is where you don't go
in and remove the tumor immediately.

You give treatment, and six
months later, surgery is done.

And the-- a-an advantage of that is you
get to see whether or not the treatment

had any benefit on the tumor itself.

The primary endpoint in I-SPY 2 was--
it, it was pathologic complete response.

That means that when surgery is done six
months after the start of the trial with--

for a patient, is that the cancer's gone.

There's-- It's a complete response.

There's no evidence of tumor
at the time of surgery.

And again, this is the statistician
telling you the endpoint.

A, a, um, an oncologist, breast cancer
oncologist could do a much better

job of this, but I'll give you the
statistical needed parts of this.

So that surgery is done six months
after, and there's no evidence of cancer.

The, the, the tumor's gone.

Meaning that the treatment
that you started and initiated

six months earlier had really
strong benefit for that patient.

A, a, a lack of pathologic complete
response would be that surgery is

done, and there's tumor remaining that
the, the surgeon removes at that time.

Now, that endpoint is a, is
a six-month endpoint.

In the trial, we don't know the result of
that endpoint for a patient for six months.

Now, in the trial, there are earlier
readings within a patient that might

be predictive of that response.

There's an MRI that's taken
one month, three months, and

then actually before surgery.

It's, it's, we call it six months,
but it's relatively, um, um,

proximal to when surgery is done,
and that MRI gives an estimate of

the amount of tumor that is there.

And now, now we, uh, we'll go back
and even at six months, for example,

the MRI might say there's no tumor
left, but surgery reveals some layer or

some, some, some aspects of, of tumor.

So even at six months it's not perfect.

Now, in the trial, that-- those
values, the MRI values, are used

within a patient in the modeling.

I'll come back to that.

What, what is the, what is the, the,
the email and phone call panic that

I got is that pembrolizumab, which
one-- it was one of the treatments in

the trial, had triggered graduation.

Graduation is a, a, is an adaptive outcome
in the trial where there's at least an

eighty-five percent probability that that
treatment would beat control in a phase

III trial of pathologic complete response.
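
The graduation rule just described can be sketched as a predictive-probability calculation. This is a minimal illustration, not the I-SPY 2 code: the Beta posteriors, the 150-patients-per-arm Phase III size, the pooled z-test, and all function names are assumptions made for the example. Only the eighty-five percent threshold comes from the discussion.

```python
import math
import random

GRADUATION_THRESHOLD = 0.85  # threshold quoted in the episode


def phase3_wins(x_trt, x_ctl, n_per_arm, z_crit=1.96):
    """Pooled two-proportion z-test, one-sided in favor of treatment (assumed test)."""
    pooled = (x_trt + x_ctl) / (2 * n_per_arm)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        return False
    return (x_trt - x_ctl) / n_per_arm / se > z_crit


def predictive_probability(post_trt, post_ctl, n_per_arm=150, sims=1000, seed=1):
    """Fraction of simulated future Phase III trials that treatment wins.

    post_trt, post_ctl: (a, b) parameters of hypothetical Beta posteriors
    for the pCR rate on treatment and control.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(sims):
        # Draw plausible true rates from the current posteriors.
        p_trt = rng.betavariate(*post_trt)
        p_ctl = rng.betavariate(*post_ctl)
        # Simulate the future trial under those rates.
        x_trt = sum(rng.random() < p_trt for _ in range(n_per_arm))
        x_ctl = sum(rng.random() < p_ctl for _ in range(n_per_arm))
        wins += phase3_wins(x_trt, x_ctl, n_per_arm)
    return wins / sims


def graduates(post_trt, post_ctl):
    """True when the predictive probability reaches the graduation threshold."""
    return predictive_probability(post_trt, post_ctl) >= GRADUATION_THRESHOLD
```

The point of averaging over posterior draws is that uncertainty about the true rates is carried into the forecast, so a treatment graduates only when the current evidence makes a future win likely, not merely possible.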

Now, it's uncertain the role
of pathologic complete response

and do we need disease-free, uh,
survival, uh, uh, as an endpoint?

Do we need a different endpoint in this?

And what's the role of pathologic
complete re-response in breast cancer?

Let, let's not get into that.

But in the, in the trial, that's
the-- that's one of the outcomes,

and think of that as a positive,
uh, uh, outcome that has graduated.

It means enrollment in I-SPY stops.

All patients would go through six months.

They would continue on their therapy,
but no new enrollment would be given.

Essentially, the treatment
has had a positive outcome.

I-SPY 2 is a platform trial with
many therapies going through it,

pembrolizumab being one of those.

Now, the panic was, and, and I'll back
up even a little bit more, is that in

I-SPY 2, Berry Consultants wrote the
code that runs that adaptive trial.

The trial uses response adaptive
randomization across multiple subtypes

of disease, hormone receptor status,
HER2 status, and the, the adaptive design

was driven by Bayesian modeling of that.

And Berry Consultants and I was
involved in writing the code for that.

Now, coming back to
that, what's the panic?

At the time that it triggered for
graduation, the estimated pathologic

complete response rate for pembrolizumab
was sixty percent, and the control rate

had an estimate of twenty-three percent.

Very, very successful in the trial.

The panic was only one patient at
that time had an observed outcome

of pathologic complete response.

Had had the six-month surgery and
we knew the data on one patient out

of the patients on pembrolizumab.

And the panic was, what's happening?

Is, is the modeling broken?

We have one patient, and it
estimates a sixty percent response

rate compared to the twenty-three
percent on control.

Something must be broken.

Okay, so that's the panic.

Now, I'll come back to, to that
part of the story, but I wanna

shift things to adaptive designs,
and I-SPY 2 was an adaptive trial.

I'll come back and talk a little bit
more about I-SPY 2 and this October

2016 panic, uh, uh, from I-SPY 2.

But let's talk a little
bit about adaptive trials.

So what is an adaptive trial?

An adaptive trial is a, a clinical
trial that has pre-specified

dynamic aspects to it.

I try to not use the word change
because the protocol is set up to

have these dynamic pieces to it.

So in I-SPY 2, there's adaptive allocation
to the different treatment arms.

There's triggering of
graduation of an arm.

All of that's pre-specified.

The protocol doesn't change.

It says in the protocol
these things will happen.

These are key aspects of the
trial that are dynamic and by

design can happen in the trial.

Yes, from the perspective of the
design, it feels like these are

things that are changing in the trial.

You know, we stop enrolling
to pembrolizumab when it

hits a trigger of graduation.

The role of that is to accelerate
development of it to get it to Phase 3.

I-SPY 2 is a Phase 2 trial.

Now, during the trial, these things are
triggered by data in the trial itself.

Now that's it seems…

When, when I explain to people one
of the things I do as a statistician,

the main thing Berry does is build
adaptive trials, and I've been building

adaptive trials for, for 26 plus
years now, is it, it seems strange

that, "Okay, I don't understand.

You mean normally we don't…

the trial doesn't change?"

And to explain that is really the opposite
of an adaptive design is a fixed trial.

We- w- at the beginning of the trial
in a fixed trial, you say, "Let's

enroll 200 patients on a control.

Let's enroll 200 patients on a
treatment, and let's wait till all

of those patients get six-month data,
and then we'll look at the data and

figure out whether pembrolizumab
or whatever treatment it is works.

And then we make the
decisions going forward."

That's a fixed trial.

Now, many times in a fixed trial, you
look at the data and you say, "Oh, shoot,

I wish the sample size was smaller.

We might've known this treatment
was effective long before we

got to, to, to 200 on each arm."

Uh, in the, in I-SPY 2 it's
120 is the maximum, so you…

long before you get to 120,
you might think that treatment

works or it doesn't work.

Or I wish we'd have put different
kind of patients on that treatment

compared to another treatment.

I wish we would've changed doses.

But more generally, outside of
I-SPY 2, we could be changing

doses in a, in a trial.

We could be altering the
patients enrolled in a trial.

We could move it to a different stage
of trial where it selects a dose and

it moves to a, a confirmatory part
of the trial, and I'll talk about

a trial that, that did that today.

So lots of things in the trial
can, can be dynamic. Now what's,

what, why is that, what's, what's
the promise of adaptive trials?

We think about fixed trials where you
say, "Let's go enroll 200 patients on

each arm, and we'll look at the data
when all of the patients have reached

a time point that we consider primary."

We look at the data and we learn something
from that, and maybe we approve a

treatment, maybe we go to phase three,
maybe we run fa- another phase two.

Whatever it is, the role of the
trial, that's a fixed trial.

The promise of this is that we can change
this trial in ways that improve the trial.

By, by changing the allocation,
by changing the doses, by

accelerating to another stage,
we make the trial design better.

That's really promising.

There are a lot of things in trials,
in a fixed trial, we guess at.

We guess at the effect size.

We guess at which patients
might, might benefit from it.

We, we guess at the doses.

We, we guess at a lot in the trial.

We guess at the variability
of the outcome, the, the

prevalence of events in trials.

We guess at a lot.

The promise of adaptive trials is
during the course of the trial,

you learn a lot about that.

If you look at the data in the
trial itself, you learn how

effective the treatment is.

You learn about variability.

You learn about the behavior of
different treatments, the behavior

of different doses, and then we
make the trial better given that.

Had we known that information
before the trial started, we

would've designed a better trial.

That's the promise of adaptive
trials, that we learn things to

make that trial more efficient.

We make the engine better.

It's a, it's incredibly promising
thing of, of adaptive trials.

A- a- and if you think about it, the
opposite of it is you close your eyes

and you hope everything you designed four
years later was the right thing to do.

So that's the promise of an adaptive
t- uh, of an adaptive trial.

Now, the crux of it is as
that adaptive trial is going,

we need mechanisms to learn.

There's data and, and data's ge-
being generated in the trial.

We need to learn from that data. If you
think about I-SPY 2, we need to learn from

the pathologic complete response outcomes
that come out of that trial at six months.

In an Alzheimer's trial, we might be
interested in the 18-month response.

In a, in a, a trial of a vaccine,
we're interested in, uh, events

that are somebody comes up with
the disease that the vaccine's

supposed to combat, and we're
waiting for events to happen in that.

That becomes the thing
we're learning from.

So we need models in the trial itself as
a learning mechanism, and those models

drive the dynamic things in the trial.

That's, that's what we need.

So there's two key parts
to an adaptive trial.

We need to learn so that we make
changes, and then when we design the

trial, we figure out what changes
in the trial make learning more

efficient, make the trial better,
treat patients better in the trial.

All of those things are the
dynamic changes, but we need

the data in the trial to learn.

If we don't-- If, if we try to make
adaptations and we've learned nothing

in the trial, you know, it, it, it's all
based on, on, on things we thought going

in, which is we already designed the
trial, uh, a-as well as we can for that.

So it's really these two key things.

We need to learn from the data, and that
drives changes in the trial that make it

more efficient, make it a better trial.

So in that I-SPY 2 example, graduation
is an adaptation in the trial.

We need to learn from the data to say,
"Ah, this treatment should graduate."

Now, in that circumstance of
the panicked email was we have

one patient at six months.

The patient was a, a, a…

was a, a positive response.

It was a pathologic complete
response, but that's one.

How are-- How is your learning mechanism
saying we have at least an eighty-five

percent chance that it's better than
a twenty-three percent response rate?

If it's based on one out of one,
that's-- we're not gonna come up with

an eighty-five percent probability.

The data's not strong enough.

In that trial, we have built
models to accelerate the learning.

Now, you come back to this whole
question about adaptive trial.

If we're waiting for everybody to
get to six months, the trial's slower.

The trial to make changes, to make dynamic
adaptations, has to wait a long time, has

to wait for patients to get to six months.

Meanwhile, we've enrolled a bunch
of patients earlier than that.

Have we allocated them correctly?

Now, in the trial, I, I, I brought
up these MRIs that are done

at one month and three months.

In I-SPY 2, we built a model that looks
at the correlation between the MRI,

which is a quantitative measure of
the reduction in tumor from baseline.

Hundred percent reduction is getting
to pathologic complete response

at one month and three months.

If there's no tumor in the MRI, we
think that patient is highly likely

to be a pathologic complete response.

And hence, patients that have taken
pembrolizumab that have MRI values that

are really positive or really negative
inform what we think the pathologic

complete response is through what
I'm gonna call a longitudinal model.

So the longitudinal model is
the statistical mechanism that's

learning what does one-month
MRI predict about six-month PCR?

What does three months MRI
predict about six-month PCR?

What does six-month MRI predict
about six-month pathologic

comp-complete response?

Surgery happens after that, not too
far from it, so it's not a huge benefit

to the trial at six months, but three
months can speed up the learning.

One month can speed up the learning.

So we build models for that to accelerate
the learning about the primary endpoint.

We haven't changed the endpoint.

The endpoint is six-month pathologic
complete response, but early

values are driving that estimate,
which are driving the adaptation.

Now, an important part of, uh, you
know, what does that model look like?

And I'll come back to specifically
what that model looks like

and how we learn from that.

That's the aspect in adaptive trials.

So what-- when people ask me what
scenarios, what trials can benefit from

adaptive design, the most important
part of it is that speed of learning.

So if you have a trial where the endpoint,
uh, i- is a migraine trial where it's

a treatment of acute migraine, there
are, there are treatments that are

trying to prevent migraines, but suppose
it's the treatment of acute migraine.

A lot of times we look at two
hours later, are you pain-free?

So when you give a treatment to
a patient, you know the outcome

of that patient two hours later.

We can do a lot of really efficient
things adaptively because the, the, the

learning ratio is really fast, and so we
can be very efficient. Alternatively, if

the endpoint doesn't come for 18 months
in that scenario, we've got a whole lot

of patients that have been enrolled that
we don't know anything about, and we're

waiting for them to get to 18 months.

It's hard for the trial to be efficient.

Should we stop enrolling?

Should we change doses?

Should we make changes?

We don't know anything at that point.
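
The contrast between a two-hour endpoint and an 18-month endpoint can be made concrete with a little arithmetic. This is a rough sketch assuming constant accrual and no dropout. The function name and the example accrual rate are hypothetical.

```python
def outcomes_known(months_elapsed, accrual_per_month, endpoint_delay_months):
    """Return (patients enrolled, patients with a known primary outcome).

    Assumes constant accrual and no dropout: a patient's outcome is known
    once endpoint_delay_months have passed since enrollment.
    """
    enrolled = accrual_per_month * months_elapsed
    matured = accrual_per_month * max(0.0, months_elapsed - endpoint_delay_months)
    return enrolled, matured


# One year into a trial enrolling 10 patients per month (hypothetical rate):
two_hours_in_months = 2 / (24 * 30)
fast = outcomes_known(12, 10, two_hours_in_months)  # acute migraine style endpoint
slow = outcomes_known(12, 10, 18)                   # 18-month endpoint
```

With the fast endpoint nearly every enrolled patient has already contributed an outcome. With the 18-month endpoint none has, which is the gap that modeling earlier measurements tries to close.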

So can we speed that up?

So adaptive trials become more
efficient relative to fixed trials

the faster we can accelerate that
learning, and that becomes a huge part

of adaptive trials, and it becomes
the huge fight that we have with time.

We're fighting time, in that
case, to make the trial more

efficient, to treat patients
better, to learn more efficiently.

That's the whole promise
of adaptive trials.

A huge part of building adaptive trials
is this acceleration of learning within

single patients that are moving through.

It's what w- what I've
been doing for 26 years.

Now, the interesting thing about it, a
lot of fixed trials don't care about this.

If your primary endpoint is 12 months,
and you're gonna enroll 200 patients,

and you're gonna wait till everybody
gets to 12 months, you don't really

care if six months is predictive of
12, if three months is predictive.

It's irrelevant to the primary analysis.

Now, you might do an MMRM model that
incorporates that, but it's all done when

all patients have had the opportunity
to complete the primary endpoint time.

Now, the whole world of this
is missing data, but this is

different kind of missing data.

These are patients that just haven't
reached that time point, which is a

whole new thing in adaptive trials that
don't come into it in fixed trials.

So this whole thing of trying to
accelerate the learning is new and

relatively specific to adaptive trials.

Okay, so a lot of what we do is
spending this time on building

that predictive piece of it.

So thinking about fighting time here.

Let's go back to I-SPY 2.

So in I-SPY 2, this-- what,
what does this model look like?

At one month and three months and
six months, we get an observation

of the, the, the size of the
tumor relative to baseline.

And let's think about percent reduction.

So the endpoint that we get at one month
and three months is a percent reduction.

We could get 100% reduction.

We could get 50% reduction in that.

Now, we're interested in suppose somebody
at one month, a, a woman at one month

has an 80% reduction in tumor size.

What does that mean about the likelihood
of pathologic complete response?

That's, that's really what we're
interested in, uh, from that perspective.

The beauty of the trial itself
is that patients pass through

one month and six months.

We get to see a lot of observations
on women that went from one month and

had an MRI, and then we found out were
they a pathologic complete response.

So we get a lot of pairs of what was
the percent reduction at one month, and

were they a responder at six months?

So we're building a model
here for that prediction.

You could build a logistic regression
model for that as that percent

reduction to a response yes/no.
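
As a point of reference, the simple logistic regression mentioned here could look like the following. It is a bare-bones sketch fitted by gradient ascent, and the ten data pairs are invented for illustration, not I-SPY data. Function names and tuning values are assumptions.

```python
import math


def fit_logistic(xs, ys, lr=0.5, iters=5000):
    """Fit P(pCR) = 1 / (1 + exp(-(b0 + b1 * x))) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(iters):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def predict_pcr(b0, b1, reduction):
    """Predicted pCR probability for a fractional tumor reduction (0.8 means 80%)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * reduction)))


# Hypothetical (one-month reduction, six-month pCR) pairs.
reductions = [0.10, 0.25, 0.40, 0.55, 0.60, 0.75, 0.80, 0.90, 0.95, 1.00]
pcr = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]
b0, b1 = fit_logistic(reductions, pcr)
```

A single slope forces the relationship to be smooth and monotone across the whole range of reductions, which is a strong structural assumption to hard-code before seeing trial data.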

Now, we built that original model, and
a lot of times we're building that model

in adaptive trial, and we don't have
the data in the middle of the trial.

We, we don't get the luxury of
opening up the data and building

a model and say, "Oh, this is
what the behavior of the model.

This is the best-fitting model.

Here's the BIC," you know, whatever,
whatever modeling approach you take.

We don't have it.

We build it before the trial starts.

It's hard-coded.

It runs the model at, at that
time point, so we have to

build the model ahead of time.

We had the luxury in I-SPY
2 of actually I-SPY 1.

I-SPY 1 was an observational cohort.

It didn't, it didn't change the
treatment of patients, but it got

these observations, and so we had
about 40 patients where we, we

could build a tri-- build a model.

Now, we want that model to be robust

We don't want it to be overly strong
in ways that it's making bad forecasts

of PCR because it's somewhat hard coded
and it's running that and it doesn't

necessarily know it's not fitting
well, and then it's driving decisions

about allocation to women that way with
breast cancer based on a model that

maybe isn't, isn't fitting very well.

So we want this to be really robust.

So the model we built in that
circumstance is a piecewise

prediction model where we actually
had 13 pieces to the model that then

forecasts the relative odds of a PCR.

What do I mean by that?

We broke that interval up into suppose
the reduction is in the-- is zero to 10%.

That's an outcome.

By the way, the tumor could grow.

That's the, that's the worst outcome.

Then it could be at zero to 10,
10 to 20, 20 to 30, 30 to 40, and

so on, all the way up to 80
to 90, and then we break 90 to 95.

Remember, pathologic complete response
is six months, there's zero tumor.

So 90 to 95, 95 to 99,
and then more than 99.

Those are the 13 classifications of the
MRI at one month, and then a separate

instance of the model is at three months,
and a separate instance is at six months.
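
The 13 classifications listed above can be written down as a small lookup. The cut points follow the description. How exact boundary values are assigned (for example, whether exactly 10% falls in 0 to 10 or 10 to 20) is an assumption of this sketch, as are the names.

```python
import bisect

# Lower edges of each category after "growth", in percent reduction from baseline.
CUTS = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99]
LABELS = [
    "growth", "0-10", "10-20", "20-30", "30-40", "40-50", "50-60",
    "60-70", "70-80", "80-90", "90-95", "95-99", "99+",
]


def mri_category(percent_reduction):
    """Index (0 to 12) of the MRI category for a percent reduction.

    Negative values mean the tumor grew. A value equal to a cut point is
    placed in the higher category (an assumed convention).
    """
    return bisect.bisect_right(CUTS, percent_reduction)


def mri_label(percent_reduction):
    """Human-readable label for the category."""
    return LABELS[mri_category(percent_reduction)]
```

The finer cuts near the top (90 to 95, 95 to 99, and above 99) put the resolution where it matters, since the primary endpoint is about whether any tumor remains.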

Then we learn that relative, uh, the
relative odds of a responder as a

function of which category your MRI is in.

And based on that, that's learned
from past women in the trial.
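
One simple way to picture learning from past patients is a separate Beta posterior for the pCR probability in each MRI category. This is a deliberate simplification: the episode describes modeling relative odds by category, whereas this sketch treats the categories independently with a uniform Beta(1, 1) prior. Both choices are assumptions, as are the function names.

```python
def category_pcr_posteriors(history, n_categories=13, prior=(1.0, 1.0)):
    """Update a Beta posterior per MRI category from completed patients.

    history: iterable of (category_index, pcr) pairs, where pcr is 1 for a
    pathologic complete response at surgery and 0 otherwise.
    Returns a list of (a, b) Beta parameters, one per category.
    """
    a0, b0 = prior
    post = [[a0, b0] for _ in range(n_categories)]
    for category, pcr in history:
        if pcr:
            post[category][0] += 1  # one more responder in this category
        else:
            post[category][1] += 1  # one more non-responder
    return [tuple(ab) for ab in post]


def posterior_mean(ab):
    """Mean of a Beta(a, b) distribution."""
    a, b = ab
    return a / (a + b)
```

Because each category keeps its own counts, a category with little history stays close to the prior rather than borrowing a trend from its neighbors, which is one route to the robustness described above.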

It's one of the beautiful things
about a platform trial is we have

a lot of data on that in I-SPY 2
from previous arms and previous…

the, the control throughout
that. So that model is running.

So at the time, let's go back to
this time where one patient had gone

through, and this estimate was 60%.

We had a number of women at that time in
the trial, and I'm gonna try to remember.

I believe we had some 60-plus
women who were on the arm that

had been allocated at that time,
but only one had passed through.

Now, all of the other women in the trial
have some level of, of maturity of their,

their MRI data, one month, three month.

Some have-- don't, haven't even
reached one month, and they

don't inform the, the l- the, the
response, the PCR six-month response.

And then we have some that
have six months but haven't had

surgery yet at that time point.

So when this panic happened, of
course, we jumped in and we're,

we're looking at the model.

You want humans to, to watch this
automatic pilot to make sure that

if, if the model is broken to some
extent, that w- we, we make sure

it's doing what it's supposed to do.

So we went in and we investigated.

And as Don says, pembrolizumab
was just melting the tumor away.

The values of the MRI for that
treatment were really, really good.

Lots of them were 99% reduction, 95
to 99% reduction, that the model was

predicting that a lot of these were
going to be six-month PCRs, and the

fraction of them that are going to be
six-month PCRs was 60% out of them.

It was all based on
the longitudinal model.

We had one patient.
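
How an arm-level estimate can come almost entirely from predictions can be sketched with multiple imputation. In this illustration the one observed outcome and the count of 29 patients echo the story, but every predicted probability is invented for the example. The sketch also ignores uncertainty in the predictions themselves, which a full Bayesian model would carry through.

```python
import random


def imputed_pcr_rates(observed, pending_probs, draws=2000, seed=7):
    """Multiple imputation of the arm's pCR rate.

    observed: list of known 0/1 outcomes at surgery.
    pending_probs: predicted pCR probability for each patient without
    surgery yet, based on that patient's latest MRI category.
    Returns one completed-data pCR rate per imputation draw.
    """
    rng = random.Random(seed)
    n = len(observed) + len(pending_probs)
    known = sum(observed)
    rates = []
    for _ in range(draws):
        # Fill in each pending outcome with a coin flip at its predicted probability.
        imputed = sum(rng.random() < p for p in pending_probs)
        rates.append((known + imputed) / n)
    return rates


# Hypothetical arm: 1 observed responder and 28 patients still pending.
pending = [0.85] * 12 + [0.60] * 8 + [0.25] * 8
rates = imputed_pcr_rates([1], pending)
estimate = sum(rates) / len(rates)
```

The spread of `rates` across draws shows how much of the remaining uncertainty comes from outcomes that have not been observed yet, and it shrinks as more patients reach surgery.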

It was entirely driven by the longitudinal
prediction of those patients. Uh, by the

way, at that time, I, I should back up.

Some 60 patients had been
enrolled with pembrolizumab.

The I-SPY 2 trial had these
different, uh, categories that…

Uh, different subtypes that are
differentiated largely by HER2

status and hormone receptor status.

One of those subtypes is called
triple-negative breast cancer, and

that's that you're negative on hor- uh, a
hormone receptor status and HER2 status.

There were 29 of those patients that
had been enrolled, and one of those

had passed through the six months.

That was the subtype that was
estimated to be a 60% response.

At the time that it said, "Graduate this,
stop enrolling, and pembrolizumab should

go to Phase 3," the final analysis takes
place when all patients get to six months.

It's another really…

As statisticians, it's just ano- a-
another really, um, neat, exciting

area where we're doing these forecasts.

It's forecasts and predictions of
what's gonna happen, and you stop

enrolling a treatment arm, and then
you get to see the outcome of it.

So you're doing these predictions,
but you get to observe the outcome.

In many circumstances, we might make
predictions, and we never actually

get to see the outcome of that, but
it's forecasting what would happen.

Here, we get to see the outcome of that.

So there was nothing wrong with the model.

It was functioning exactly as designed.

Made some people nervous that it
was this, this forecast entirely.

Well, they…

The pembrolizumab stopped enrolling, and
the trial read out, and it was published

in JAMA Oncology, and the final estimate
for the PCR response rate was 60%.

It actually, the, the final posterior
estimate of that, it, it triggered

it almost exactly in the setting.

Really, uh, uh…

And, and by the way, it was an
over 99% probability of winning

a Phase 3 trial in triple-negative
breast cancer, and of course,

pembrolizumab is an amazing therapy
now, um, uh, in many, many different

cancers. So it actually hit exactly.

Uh, ended up 60% to a control
rate of 22%, and you can go see

the posterior distribution in
that paper, uh, Nanda et al.,

and, uh, within that.

So it triggered it perfectly six months
ahead of what if we'd have waited for

all of those patients, and meanwhile,
we would've enrolled more in that.

Now, in the design you wanna set
up, what do we want the trial to do?

Do we want it to do that?

These are all decisions made
before the arm started in the trial.

Okay, so accelerating it.

That's a case where even six months…

By the way, a, a huge problem in breast
cancer is there are really many effective

therapies looking at endpoints like
overall survival, event-free survival.

These endpoints are, are
thankfully very, very long.

To do trials and learning
trials in an adjuvant setting

looking at those endpoints are
really, really hard and long.

And so a huge desire in, in many of
these diseases are looking at endpoints

that are earlier that are predictive.

That's a different part of
disease modeling and longitudinal

modeling and hugely important.

And that, that, that's a separate thing.

Look for surrogate markers and huge thing.

And Berry does a lot of that modeling
in other diseases and, and cancer.

But this is the setting where
let's sort of fix in that the

trial has a primary endpoint, and
within that trial, we're trying to

accelerate to that primary endpoint.

Okay.

What are other examples of this?

What are other examples of
fighting time in clinical trials?

Uh, I'll go back to, uh, BAN2401
adaptive trial for Alzheimer's.

This is now the treatment is lecanemab.

It's, it was the first disease-modifying
treatment approved for Alzheimer's.

This is Eisai, uh, ran
the development of this.

They ran a Phase 2 trial, and in the Phase
2 trial, they had five doses and control.

The doses were different dose
levels and frequency, monthly

or bi-monthly infusions.

And they ran a, an adaptive trial
that included response adaptive

randomization over the different doses.

Whichever doses were doing-- were,
were, were working better, they wanted

to put more patients on that to learn
what is the right dose going forward.

They wanted proof of concept.

If, if the predictive probability
of winning Phase 3 got high enough,

they wanted to jump to Phase 3.

They could stop for futility.

The trial had an adaptive sample size
from three hundred to eight hundred.

The primary endpoint in the
trial was 18 months in that.

Now, to make all these decisions in
the trial, it's hard to do a response

adaptive randomization if you have to
wait for patients to get to 18 months.

Now, the primary endpoint in that trial
was ADCOMS, which is a cognitive measure.

It's actually a combination of different,
uh, questions from CDR sum of boxes,

ADAS-Cog, and MMSE, and they were, uh,
selected to be earlier in the disease

course, cognitive-type endpoints,
that would be more informative for

early Alzheimer's disease, which
is what was done in this trial.

In order to be an efficient design, we
needed to learn faster than 18 months.

We get three, six, nine,
12, 15, 18 months values.

Every three months, we get patients' data.

All of that data was used to forecast the
effect of the different doses relative

to control, estimate the control rate,
to, to speed that learning up, to have

all of those adaptations take place.

The prediction-- So we use the same
endpoint observed at earlier time periods.

They're not primary.

They're the, they, they're being used
as an auxiliary endpoint, a predictive

endpoint for the final cognitive outcome
for a patient, which is 18 months.

Now, in this trial, we did, uh,
largely the early part of the

trial was predicting 12 months, but
the full exposure went out to 18.

So I'm gonna alter that a little
bit and talk about 12 months.

The, the adaptive algorithm
was driven by the 12 months.

Now, the model we used was a linear
regression model between three months and

12 months, six months and 12 months, nine
months and 12 months, informed by previous

patients that had passed through
those time points

in the trial itself, and we had previous
data on the expected correlation of those.

We used prior distributions for those
correlations from previous data,

but then that's updated in the trial
itself on that correlation to then

drive the estimates of the treatment effect.
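A minimal sketch of that prior-then-update idea, not the trial's actual model. It assumes a single slope linking an early value to the 12-month value, no intercept, a known residual standard deviation, and a normal prior on the slope. The function names and all numbers are hypothetical.

```python
def update_slope(prior_mean, prior_sd, early, final, resid_sd):
    """Conjugate normal update of the slope b in: final = b * early + noise.

    Assumes a known residual sd and no intercept (illustrative simplification).
    """
    prior_prec = 1.0 / prior_sd ** 2
    sxx = sum(x * x for x in early)
    sxy = sum(x * y for x, y in zip(early, final))
    post_prec = prior_prec + sxx / resid_sd ** 2
    post_mean = (prior_prec * prior_mean + sxy / resid_sd ** 2) / post_prec
    return post_mean, post_prec ** -0.5


def forecast_final(early_value, slope_mean, slope_sd, resid_sd):
    """Predictive mean and sd of a patient's 12-month value from an early value."""
    mean = slope_mean * early_value
    sd = (early_value ** 2 * slope_sd ** 2 + resid_sd ** 2) ** 0.5
    return mean, sd


# Hypothetical numbers: a prior from historical data, then three in-trial completers.
slope_mean, slope_sd = update_slope(1.5, 0.5, [1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 1.0)
print(forecast_final(2.0, slope_mean, slope_sd, 1.0))
```

With no in-trial completers the forecast rests entirely on the prior. As patients reach 12 months, the slope estimate tightens and the forecasts for patients still in follow-up sharpen.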

It's a beautiful example, in
part because Eisai published the results

of every one of the interim analyses.

You can go in and see what did it
forecast at the first interim analysis

of the trial, which was 200 patients.

And then every 50 patients in the
trial, it did an update.

At that first interim analysis,
which is triggering 12 months, there

were zero patients at 12 months.

At the second interim analysis
at 250, I think we had several

that had reached 12 months.

A good number had gotten to nine
months, which is really predictive

of 12 months, and to six months, so
they're progressing through that.

Response adaptive randomization
started at that 200.

At that time, the response adaptive
randomization immediately wanted the two

high doses, 10 monthly and 10 bimonthly,
and kind of went away from the lower doses

based on cognitive values at three…

mostly three and six months.

The randomization probabilities
change, then the data get updated 50

patients later, and the exposure of the
early patients gets longer, which is

critical in adaptive design because that's
the most valuable data in the trial.

The values were updated again.

Interestingly in the trial, the final
sample sizes of this, by the time

everybody got to 18 months, were about
250 patients on the control, placebo.

About 250 patients on 10 monthly
and about 175 on 10 bimonthly.

If you add together the other
three doses, it was about 175.

So these two doses got the lion's
share of patients, appropriately so.

The 10 bimonthly arm is eventually
the arm that went to the phase three

trial and demonstrated superiority.

About a 27% slowing of disease.

Interestingly, in the adaptive trial
of 800 patients, if you go through the

movie of what happened, at about
interim five or six or

seven, right in there, the algorithm
started to estimate a 25% to 30% slowing

and probability of benefit above 95%.

The probability of at least a 25% slowing,
which was the criterion in the

trial, what they wanted to use to
define go, at interim five was about 75%.

Seventy-five percent chance that
you have at least a 25% slowing.

The final number when all patients got
through 18 months, years later, was 76%.
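A short sketch of how a quantity like "probability of at least a 25% slowing" can be read off posterior draws. It assumes slowing is defined as one minus the ratio of treated decline to control decline. The draws below are made up for illustration and are not trial data.

```python
def prob_slowing_at_least(control_declines, treated_declines, threshold=0.25):
    """Share of paired posterior draws where the relative slowing meets the threshold.

    Slowing is defined here as 1 - treated_decline / control_decline (an assumption).
    """
    hits = 0
    usable = 0
    for c, t in zip(control_declines, treated_declines):
        if c <= 0:
            continue  # slowing is undefined when control does not decline
        usable += 1
        if 1.0 - t / c >= threshold:
            hits += 1
    return hits / usable if usable else float("nan")


# Four hypothetical posterior draws of decline (control, treated).
control = [0.20, 0.20, 0.20, 0.20]
treated = [0.10, 0.14, 0.16, 0.19]
print(prob_slowing_at_least(control, treated))
```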

Now, there was the huge value of
the follow-up and all of that, but

the algorithm was forecasting that.

Now it's forecasting it by dose and
placing patients on

the right doses, and it made for a
much more efficient trial design.

It enabled them to explore
five doses by the ability to do

response adaptive randomization.

But none of that happens without the
longitudinal modeling of the endpoint.

You learn nothing until it's too
late to make those adaptations.

Okay.

A couple other examples of this,
just to give you the wide view of

what this looks like, different
trial designs utilizing this.

Trulicity is Eli Lilly's…

It was a GLP-1 that kind of kicked off Eli
Lilly's jump into GLP-1s,

and I assume everybody understands
the story of the impact of these.

This was for treatment of diabetes,
and they ran a seamless two-three

trial, the AWARD-5 trial.

You can go see the publications of that.

In the phase two part of the
seamless trial, they were exploring seven

different doses of dulaglutide.

The trade name became Trulicity
after it was eventually approved.

The seven different doses were explored.

The primary endpoint was HbA1c.

Interestingly, there was a utility
function over four endpoints where they

wanted to make the decision on allocating
the doses and dose selection for phase

three, taking into account HbA1c
change, blood pressure, and heart rate.

So they didn't want those
things to be elevated to the

point of cardiovascular risk.

The fourth endpoint was weight loss.

The more weight loss, the more benefit
was perceived, the better

the therapeutic profile, and the more
patients would want to take the treatment.

If there was weight gain, there was no
commercial aspect of this treatment.

Ironically, that became the huge part:
GLP-1s are now approved specifically

for weight loss, even in non-diabetics,
and that played a huge role.

In that trial, that was one of four
endpoints in a utility function.

Over the seven doses, response
adaptive randomization was done, along with the

ability to jump from phase two to phase
three, and it was predicting whether it would

show non-inferiority to sitagliptin as
an active comparator for the primary

endpoint of the eventual move to phase
three. That's a 12-month

endpoint, and during the trial we're
enrolling and we're getting monthly

values on HbA1c change, and we built a
model between those values and 12 months.

It jumped to phase three.

It selected two doses, 1.5 milligrams
and 0.75 milligrams, to go to phase three,

and there were no patients at 12 months.

It was entirely based on the
longitudinal modeling of HbA1c change,

which was pretty well understood,
the behavior of that endpoint.

They had previous data on
patients, and again, we spent a

lot of time building that model.

Interestingly, early on, the
question was, what's a better

predictor of 12-month HbA1c?

Early values of HbA1c or fasting
blood glucose, which is a measure

that is thought to move faster, and
maybe that's a better predictor.

We actually had data to jump
in and look at if we were gonna

predict 12 months which one was
better, and HbA1c was better.

It's thought that that's kind of an
integrated value over time, and maybe

it's slow-moving, but it was a
much better predictor of 12 months

than fasting blood glucose was.

And so these decisions in the building
of that trial all go in because

we're trying to accelerate it.

Now, Lilly did something tremendous
in that trial that I've not seen

in another trial. They knew that
this longitudinal model was a really

important part of selecting a dose.

That dose that went to phase 3 spawned
cardiovascular trials, other trials.

That was the decision on dose.

It was making that drug
development decision.

The 1.5 milligram dose became the dose that
was used for dulaglutide and has

made billions for Eli Lilly.

And so that key decision, they
didn't want to enroll too fast.

Remember, this whole thing
is about time to information.

How quickly do we learn to make good
decisions like a seamless decision to

go to phase 3? They controlled enrollment.
We simulated the trial for a range of

enrollment rates, and we saw that if you enroll
too fast, you pile patients in, it's less

efficient than enrolling slower to allow
for further follow-up and data.

We were making better decisions.

It led to a more efficient
development strategy to go slower.
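A back-of-the-envelope illustration of why enrollment pace matters. This is not the simulation that was run for the trial. It assumes perfectly even enrollment and an interim triggered when a fixed number of patients is enrolled, then counts how many of them already have a given amount of follow-up. The interim size, rates, and follow-up length are hypothetical.

```python
def patients_with_follow_up(n_enrolled, per_month, months):
    """Count patients with at least `months` of follow-up when patient n enrolls.

    Assumes perfectly even enrollment: patient i (0-based) enrolls at i / per_month.
    """
    # Follow-up of patient i at the interim is (n_enrolled - 1 - i) / per_month months.
    return sum(
        1 for i in range(n_enrolled)
        if (n_enrolled - 1 - i) >= months * per_month
    )


# Same interim size (200 enrolled), three hypothetical enrollment rates per month.
for rate in (40, 20, 10):
    print(rate, patients_with_follow_up(200, rate, 6))
```

At the same interim size, the slower rates leave many more patients with six months of data, which is the information the longitudinal model feeds on.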

And they were worried that if they handed
it over, the operational instinct would be

"we wanna enroll fast, fast, fast."

They knew that slower was actually a more
efficient development strategy.

And they controlled it, and it went there.

It jumped to phase three
as soon as it could.

The data were amazing for the
treatment, and it showed superiority

to sitagliptin on HbA1c change.

And they say it accelerated
development by 12 to 18 months. The

key to that was the longitudinal
modeling of HbA1c and weight loss,

the prediction of six-month weight
loss based on earlier values, which

is very, very predictive, by the way.

Okay, so also a really
nice example of that.

We do a lot of stroke trials.

ICECAP was the topic of
a podcast last week where the

endpoint is 90-day modified Rankin.

It's an ordinal scale, zero through
six, of their status at 90 days.

In the ICECAP trial, we
use their 30-day modified Rankin

status as a predictor of 90 days, and
it's an incredibly good predictor.

Interestingly, by the way, if you go
back to the I-SPY 2 example, I know

I'm jumping around here a bit, the
prediction of the MRIs, what was really

interesting is the most predictive thing
is if the reduction wasn't very great,

if it was less than a 50% reduction.

At one month, that wasn't too bad.

But if it was a 25 or 30% reduction,
very unlikely you're gonna be a pCR.

pCR is a high hurdle.

If you weren't seeing 50%-plus
reduction at the first month and

90-plus at the third, you were unlikely.

If you were seeing 90 to 95, it was kind of 50/50
whether you're gonna get there.

If you're seeing 99%-plus,
then you're highly likely.

Some of the most predictive part in
that trial was the lack of response.

The lack of good MRIs became
very, very predictive, where other

values are less predictive but still
valuable in that they're not negative.

In mRS, for example, a
30-day mRS of six, which means the

patient is dead, is a really strong
predictor of death at 90 days.

Some of the middling values can
bounce around a little bit, but it's

incredibly predictive of 90 days.

You can accelerate your adaptive design
by two months by using 30-day values.

Now, how do you do this?

Well, we incorporate Bayesian models,
not surprisingly, within it.

They're really good for longitudinal
models because they incorporate

uncertainty really, really well.
They can use prior models, prior

estimates of that correlation.

So you're doing analyses with
little data early on.

You want it to be smart
in that circumstance.

So what we do is we build a model,
and we have a distribution of the

parameters of the longitudinal model.

That's updated through Bayesian
modeling, Markov chain Monte Carlo.

And then we make a forecast for every
patient that's early in the trial,

based on that longitudinal
model.

We multiply impute those values,
which gives you a complete set of data.

Every patient has a
final value for that.

Then we update the final analysis modeling
because we have a complete trial, and

then we do that again and again and again.

So at every interim, we're simulating
ten thousand complete trials

where that imputation is done
through the longitudinal modeling.
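The loop just described can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the production machinery. The longitudinal model is a single slope with a normal posterior rather than MCMC output. The "final analysis" is a plain comparison of arm means. Lower final values are taken to be better. The default is 1,000 imputations rather than ten thousand. All names and numbers are hypothetical.

```python
import random


def predictive_prob_benefit(patients, slope_mean, slope_sd, resid_sd,
                            n_imputations=1000, seed=0):
    """Fraction of imputed complete trials in which treatment beats control.

    patients: list of (arm, early_value, final_value_or_None),
    with arm in {"control", "treated"}. Lower final values are assumed better.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_imputations):
        slope = rng.gauss(slope_mean, slope_sd)  # one draw of the longitudinal model
        finals = {"control": [], "treated": []}
        for arm, early, final in patients:
            if final is None:  # impute the missing final value from the early value
                final = slope * early + rng.gauss(0.0, resid_sd)
            finals[arm].append(final)
        # Stand-in "final analysis": compare arm means on the completed data set.
        mean_c = sum(finals["control"]) / len(finals["control"])
        mean_t = sum(finals["treated"]) / len(finals["treated"])
        if mean_t < mean_c:
            wins += 1
    return wins / n_imputations


patients = [
    ("control", 1.0, 2.1), ("control", 1.2, None), ("control", 0.9, None),
    ("treated", 0.6, 1.3), ("treated", 0.7, None), ("treated", 0.5, None),
]
print(predictive_prob_benefit(patients, slope_mean=2.0, slope_sd=0.2, resid_sd=0.5))
```

Each pass through the outer loop is one "complete trial". Redrawing the slope every time means the uncertainty in the longitudinal model is carried into the interim estimate instead of being ignored.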

It's become a very common technique
now for missing data.

Very valuable technique for missing data.

That's similar to what we're doing for
an adaptive trial in the middle of it.

So the fun part of this, we do a
lot of this with different kinds

of endpoints, different kinds of
predictors within the trial.

We get to build these models
from them, this ordinal model

of thirty days to ninety days.

If you're a three, how do we
predict values at ninety days?

If you're a four, how
do we predict values?

Do we want those forecasts only to
look at the patients that were threes

at the early time point?

Or do we want a monotonic model? In
I-SPY 2, there was a monotonic model

that the more reduction, the better.

So you're learning about all of the values
from neighboring ones as well potentially.

If you're a two at 30 days, which
is a better value than a three, do

we want the forecast to be better?

So do we want that, or do we want to only
look at patients that are on that value?

Might even have non-monotonic forecasts.
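The simplest of those options, looking only at patients who had the same 30-day value, can be sketched as a Dirichlet-multinomial update for one row of a 30-day to 90-day transition table. This is an assumed illustration rather than any specific trial's model. It deliberately does no borrowing from neighboring values and imposes no monotonicity. The prior counts and data are hypothetical.

```python
def posterior_transition_probs(prior_counts, observed_pairs, day30_value):
    """Posterior mean P(90-day mRS = k | 30-day mRS = day30_value), for k = 0..6.

    prior_counts: 7 Dirichlet pseudo-counts for this row (e.g. from historical data).
    observed_pairs: (day30, day90) pairs from patients who have completed 90 days.
    Uses only patients with the same 30-day value (no borrowing from neighbors).
    """
    counts = list(prior_counts)
    for d30, d90 in observed_pairs:
        if d30 == day30_value:
            counts[d90] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical data: flat prior, four completers who were a 3 at 30 days.
pairs = [(3, 2), (3, 2), (3, 2), (3, 3), (4, 4)]
print(posterior_transition_probs([1] * 7, pairs, day30_value=3))
```

A monotonic or neighbor-borrowing model would tie the rows together so that a 30-day value of two never forecasts worse than a three. That is the design choice being weighed here.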

These are all the things that go
into building this adaptive design

machinery that, by the way, when the
trial's over, the machinery goes away

unless you're using it for missing
data within the circumstances of it.

We've actually now done a number of stroke
trials where we're using 24-hour NIHSS,

the NIH Stroke Scale, to forecast 90 days,
and it's actually a really good forecast.

It's really, really good.

And then updating it based on seven-day
status, 30-day status to 90-day status.

Again, accelerating the learning
to make a better trial design.

Okay, there are a number of other examples of this.

Wide range of types of variables, wide
range of potential outcome

variables, spending a lot of time on
what those models look like.

How do we make inferences?

What do we know going in?

Prior distributions.

All in the name of accelerating learning
to make a more efficient adaptive trial,

treat patients better, make better
development decisions, enroll patients

in a more efficient way, making a
better design through this mechanism.

All right, appreciate you
joining me here today.

Appreciate you joining me
in the fight against time.

Sometimes we're trying to slow time.

We're all trying to slow time.

Here, we're trying to speed up time.

We want to improve learning, speed up
learning through longitudinal modeling.

All right, I hope you can make your
adaptive trials more efficient, and until

next time, we'll be here in the interim.
