The NeuralPod AI Podcast
The NeuralPod is all about deep diving into technical machine learning topics and showing their real-world impact, bridging the gap between AI-first companies and adoption.
We chat with subject matter experts across reinforcement learning, recommender systems, deep learning and generative AI, as well as with business leaders, VCs, and operations and strategy experts.
Who are NeuralRec.ai? NeuralRec is a recruitment and staffing agency. We build niche ML teams and represent some of the globe's best ML talent.
Chris : Hi Schaun Wheeler, uh,
Chief Scientist of AMP.
Welcome to the show today.
Schaun Wheeler: Glad to be here.
Chris : So, um, yeah, let's just jump into this.
Do you wanna just start by, uh, talking us through your career history, an introduction to you, and, and how you got to be the, uh, the Chief Scientist at, at AMP?
Schaun Wheeler: Uh, sure.
Um, so let's see where to start.
Uh, I'm a, I'm a cognitive
anthropologist by training.
Um, so that's different from general
anthropology in that it looks
at how people think and organize
knowledge and not just how they live.
Um, and it's different from psychology
in that it's not, it's not based in a lab
trying to understand individual brains.
It's trying to understand
shared ways of thinking.
So, um, my, my earliest experiences, uh, in research were actually doing ethnography in a Central Asian country named Kyrgyzstan.
Um, that's a, that's a folk instrument, uh, from Kyrgyzstan back there.
But, um, my, my research was on, uh, social movements.
Um, people who, uh, joined religious movements. I interacted with and interviewed people who had, uh, engaged in a, a nonviolent, uh, overthrow of their, their corrupt president.
Um, things like that.
I basically studied how people
came to make up their minds about
a subject strongly enough that they
were willing to sacrifice for it.
Um, so that got me a job, uh, with the
US Department of the Army of all places.
Um, at the time the war in Afghanistan
was, uh, was, um, the largest conflict
that the, the military was involved in.
And, uh, they brought in me and several
other people to try to help them treat
the conflict less like a military
problem and more like a people problem.
Um, but the US Army is very, very
good at many things like logistics
and maneuvers and, and taking on huge
tasks on very short notice, but they
aren't all that great with people.
So, um, our group was responsible for kind of making sense of all of the unstructured data coming out of the war.
Um, that's actually, that's where I, I hacked myself into being a, a data scientist. I, I taught myself, uh, how to code, started teaching myself machine learning.
Our team was actually one of the very first to use machine learning for intelligence analysis in the US government.
Chris : Oh, that's cool.
And do you think, um, anthropology, that connection with how people think, has played a part in you going into, uh, agent infrastructure for kind of marketing?
Schaun Wheeler: I always actually say that anthropology, uh, has more to do with me being a good data scientist than anything else.
Uh, data science in general, but agentic, uh, approaches to, to decisions, uh, in particular, those are people problems.
It, it's a matter of human computer interaction.
Um, and if you go into, actually, just
like the Army went in saying, can we
treat people like they're bombs and tanks?
No, you can't, you can't go into a human
computer interaction treating a person as
if they are a computer, as if they're just
another interface, um, with your program.
So that, yeah, I, I, I find
that, uh, for a while I wondered
whether my, my background in
anthropology actually was worth it.
I mean, it was a lot of years to get that
PhD, but, um, uh, I, I find that I use it
more often than I use a lot of my methods,
like my technical methods training.
Chris : Yeah.
It's one, it's one of the pieces of feedback we get from, from our customers: you know, a good data scientist should be able to link it to the, the business problem.
And it sounds like that's
potentially where you started.
So I'm, I'm sure
Schaun Wheeler: Yeah.
Chris : uh, very worthwhile.
Um, okay, I'm, I'm moving on.
Uh, you know, your background in
civil service, education, ad tech,
um, I think that's a unique blend
of seeing machine learning, uh,
problems across different industries.
You know, I know problems can be
very specific to certain industries,
but, um, are there any kind of
common themes to machine learning
that you see across, across those
industries and how it's impacted them?
Schaun Wheeler: Yeah.
Um, I think the, maybe the biggest
theme I've seen is that, uh, data
science is the great unkept promise.
Like it, it was supposed to be
the, the sexiest job of the 21st
century and people were supposed to
gather all this data together and
somehow get tons of value from it.
Um, and there are certainly many
companies that have gotten a lot
of value, uh, from their data.
Um, but that's been very unevenly distributed.
Um, if you look at the companies who pay attention to this, Gartner, VentureBeat, who do surveys on this, they estimate that, like, what were the numbers?
I saw something between 70 and 90% of ML projects never actually get to the point where they yield value.
Um, and that tracks with my experience.
There, there's tons and tons of hope put in, uh, machine learning and data science, but it very often doesn't come to fruition for
Chris : Yeah.
Schaun Wheeler: most companies.
Chris : Yeah.
And do, do you think that's still the case with, like, a GenAI project?
Because I, I read something similar in a Harvard Business Review article a while back, around 80% of projects failed, but it was a, a three-year-old study.
You know, fast forward
Schaun Wheeler: Mm-hmm.
Chris : you know, what's happened in the last year is, um, you know, there's been a bit of a, quote unquote, gold rush with, with AI and a lot of hype, you know, a lot of companies investing into agents.
So do you think lessons have been learned, or is it still around 80%?
Schaun Wheeler: I, I, I think it's probably more true with, with agentic systems than, than, uh, traditional ML systems.
I mean, an agentic system is even less deterministic than an ML system.
You're not just introducing a random seed in there to get your, your, uh, your predictions.
Um, you can literally put in the exact same inputs over and over again and get entirely different responses.
Um, this is especially true if you're talking about, uh, LLMs, um, but any kind of agentic system is more in line with the, the uncertainty that you get from interacting with a human.
A human can interact in different ways depending on lots of different contexts.
Um, and I, I think that's, I think,
uh, I mean from what I've seen so far,
uh, I mean I, I know everyone is, is
rushing for the, the gold in there.
But, uh, most of the examples I've seen have been small proofs of concept.
Um, very fragile, reliant on
things working out just right.
Um, and as far as putting those
things into production, I've
seen very few examples so far
that have, have really worked.
I think we're way off there.
Chris : Yeah.
Interesting.
And, and I think in your, your experience,
just to zoom in on that a little bit,
what do you think the major pitfalls of
developing agents are at, at the moment?
Where, where do you see people,
um, going wrong or where do you
believe people are going wrong?
Schaun Wheeler: I actually think it's the same problem that, uh, I've seen with more traditional ML. Um, I've, I've never seen an ML project fail because you couldn't get the model to converge or because you couldn't get your feature set engineered or anything like that.
It's because you couldn't get
the right interface with the
human part of the organization.
Whether that is as simple as just not being able to get buy-in from the people who need to buy in, to not really understanding the human workflows that this system is supposed to inject itself into.
So, um, there were many times, um, throughout my career that I, or data scientists that I was working with, managing, mentoring, um, got all the way to where they had a beautiful technical product, but it couldn't get used, because they realized they had built it to solve a problem that didn't actually exist.
They had misunderstood the human problem.
And I think that's what I'm seeing in a lot of cases with, with, um, agentic systems as well.
You're, you have people who are, um,
making some pretty, uh, quick and
simplistic assumptions about what
the underlying problem is, and then
trusting that somehow the, the, the,
the data and the algorithms will
make sense of that problem for them.
But data doesn't, data doesn't solve your problems.
Data is a tool.
It's not a, it's not a person.
Um, and I, I think that's true of agentic systems as well as traditional ML.
Chris : Makes sense.
Well, hopefully you can talk us through, uh, AMP and some of the, uh, successful agentic systems you're working on here shortly.
Um, but yeah, just to keep in step with the introduction, I think one thing we like to discuss is, is just some of the lessons you've learned as a data scientist.
I think it's, you know, a hit and miss market for, for junior people with, with some of the AI systems coming out.
And, um, you know, some, some lessons you've learned.
And then with that in mind, how you would, and I appreciate there's no syllabus to this, but how, how you would position yourself in the current market.
Schaun Wheeler: How I would position myself, that's one I don't know I have an answer to, but I
think, I think if I've learned anything
about this over my career, it's that the,
the data science market, the tech industry
in general, but especially data science,
is very much a gamble in terms of, of
where you end up in, in your career,
um, where you're, you're faced with trying to get hired by people who do not actually know what they want you to do.
Um, everyone wants a, a sprinkling of that data science magic.
Um, but unless you are a data scientist or have worked with them extensively at a technical level, most people have only a very vague idea of what's actually going to happen.
Um, and that makes for a very
difficult hiring scenario.
So, um, the way I, I handled
it in my career was, uh, I
embraced that, that ignorance.
So, uh, finding a, a job that isn't doing what you want to do, but where, because they don't know what they want you to do, it actually gives you freedom to steer the ship a little bit.
You have more free time, maybe, than you would if someone actually knew what they wanted you to do.
They can't say, are you doing X, Y, and Z, if they don't know what X, Y, and Z are.
But also, you have more room to go through and suggest and say, I know you want to do this in a very simple way, but I have this fancy tool, and maybe it's a fancy tool that you just want to learn.
Um, but there's, there's value in that
for the business, but there's value in
that for you in terms of building up
your skillset that you can then leverage
to get into a job that's closer fit.
So, um, I went through several jobs
that were not a good fit for me,
um, that I took for a variety of
reasons, but a lot of the benefit I
got from them was, was from using that
as sort of an on the job training.
And I think that's a very valid way to,
to navigate a situation when there isn't
consensus on what, uh, best practices,
uh, look like for the most part.
Chris : Yeah, that's a, a really interesting approach.
And in terms of engineering at the moment, you know, you've got coding assistants like Claude or, you
Schaun Wheeler: Yeah.
Chris : know, insert assistant here.
Would you encourage people to really focus on, on engineering-first approaches, or being really strong engineers?
Um, or, you know, is it a case where you think, uh, software engineers might be replaced in five years, for
Schaun Wheeler: No, no, no.
They, they're not gonna be replaced.
Um, and anyone who has used, uh, anyone outside of, of, like, that rare breed of, like, LinkedIn influencer, anyone who has used an LLM to code, um, knows that they're, uh, they're pretty good at boilerplate.
Basically, I, I saw the graph of, like, Stack Overflow traffic, it has been going down, down, down, down, down.
Um, because it's easier to, to find, uh, the information you need through an LLM than, than by searching through it there.
But if you need to do anything at all, um, off the beaten path, or if you need to do anything that requires a large multi-part code base, um, the LLMs aren't there yet.
And I'm not sure that, I mean, to
get a context window that big and
to get the right attention mechanism
in place to handle a context window
that big, um, that's, that's not a,
I don't view that as a problem that's
going to be solved really soon.
Schaun Wheeler: Uh, in, in the end, LLMs are, in this way, like another person.
It's like pair coding, and no,
I, I don't, I don't think anyone would
ever ask, like, do you think we're
reaching a place where you're gonna
pair code and you're just gonna offload
the entire job to your pair coder?
Like, it's always this
conversation between you and,
and you do some things right?
And then the pair coder, like, catches something you did wrong and says, hey, you should change this.
You do the same with them.
That's a, that kind of back and
forth iterative relationship.
Like I, I use LLMs every time I code.
Um, but it's, it's always a very
collaborative relationship and a lot of
me correcting on scope, on pointing out
obvious things that, that were, are wrong.
Um, if nothing else, um, the more you offload your coding to an LLM, the less maintainable your code base is going to be.
Um, they're incredibly redundant.
They'll create three versions of the same function and name them all differently, because it can't remember that it created it back there and that it does the same thing.
Those are real problems if you're talking
about putting a system in production.
Um, because ultimately the LLM is not going to maintain your code base.
If something breaks, you need to be able to go in and make sense of it.
Um, so yeah, I, I'm a skeptic on the, on the LLMs-are-going-to-replace-engineers front.
Chris : I think, I think
that's a, a sensible approach.
And I'm sure some, uh, software engineers
listening will, will breathe a sigh
of relief after, after hearing that.
Um, okay.
Let, let's move on to, to AMP, which we'll break into two parts.
Um, some about the product that you're building, how you're developing the agent infrastructure, and, as I'd say, uh, from our talks in the past, a really, um, unique approach.
And then more around how you've, you've built AMP and, um, you know, some of the remote culture that, that you've built.
And, um, we'll move on to a bit of a quick fire round at the end.
But yeah, just introduce people to who AMP is and, um, give an introduction to agentic infrastructure,
'cause that may be the first time they've, they've heard that term.
Schaun Wheeler: Sure.
So, um, what AMP is, depends
on who you're talking to.
Um, if you're in product, then AMP is
an adaptive experimentation engine.
It does feature validation.
Um, if you're in lifecycle, it optimizes message copy, timing, and frequency for every user individually.
If you're in engineering,
it's orchestration.
Um, if you're in data science, then
it's a, a deployment layer, uh,
for your experiments and a source
of truth for those experiments.
Um, as, as an architect of the system,
I tend to think of it as, I guess you
could call it iterative alignment.
So every user on any app has things
they want and don't want, things
they like and and don't like.
Um, and we assign a dedicated agentic learner to each individual user on an app.
And that agent's sole purpose in life is to adjust the holistic user experience over time for that one user, to better line up with what that user wants.
So it cuts across the
entire user experience.
It's not just messages, it's not just
app screens, it's the entire interaction
landscape that this agent is managing.
And the idea is not to get the
user to change their behavior,
to fit what the business wants.
It's to get the business's, um, serving of content and interactions to change dynamically to fit what the user wants.
That way the user gets more of what they
want, and that gets into a virtuous cycle
where the business gets more of what they
want because their user base is happy.
Chris : Hmm.
We, we talked about successful agent, um, projects.
So what, what kind of companies or industries are you seeing getting the most, um, benefit from this kind of agentic infrastructure?
Schaun Wheeler: Um, so we, we primarily work with consumer apps.
Um, this works really well in any situation where, uh, the things the user does that really add value for you as a business are things that could happen any day.
Um, so e-commerce apps, it works extremely well with, with e-commerce.
This is a case where, um, yes, some users, like, don't have the money to spend at certain times, or aren't in market for certain things, but plausibly any user, if you get the right thing in front of them at the right time, could say, yeah, okay, I'll, I'll put out some money for that.
Uh, and so in that case, the
agents are very good at taking
users who are already active and
incrementing up their, their activity.
Taking users that aren't active and giving them undivided attention that brings them back, um, and cross-selling, things like that.
But we also, um, we work really well
in, uh, uh, gaming, uh, streaming.
Um, we also do well on, uh, in, uh, lots
of different subscription situations.
Um, finance management, uh, actually
streaming in some ways is, is both a,
a transactional system in that you want
people to watch or listen, but also a
subscription system in that you want
them to, to, um, pay for a membership.
That's, that's a lot of ways that
you, that you, uh, make money there.
So, uh, in all of those situations
we're, we're basically dealing with,
um, users sitting around and they
could do lots of different things
with their time at any given moment.
And what you're trying to do is
find the right moment to remind
them that this particular app is one
of the things they find value in.
It's one of the things
that they like doing.
Chris : Okay.
No, that's, that sounds, uh,
super interesting and we, we
kind of, uh, touched on LLMs at
the, the start of the show here.
Um, you know, AMP doesn't rely on LLMs.
Do you wanna just talk us through, kind
Schaun Wheeler: Sure,
Chris : of, some of the traditional AI methods and, you know, the unique approach that you're taking?
Schaun Wheeler: yeah.
Uh, we don't rely on LLMs, uh, really
at all, but there's the option.
Uh, as a, as a last stage, um, we
actually find that very few customers
take advantage of that option, um,
usually because of concerns around quality
control, brand voice, um, consistency.
Like, you don't want the LLM suddenly becoming racist, or, or offering 90% off or something like that.
Um, and so most, uh, like enterprise
companies are really concerned
about seeing what could go out
to a user and pre-approving it.
So, um, in the cases where they do use LLMs, it's usually, uh, to basically augment their human efforts to pre-populate a content inventory.
Um, so messages or content
that could go on an app screen
that's already been approved.
Um, but we, I mean we, we started this before ChatGPT came on the scene.
Uh, and when it did come on the scene,
I mean, it obviously caught a lot of
people's attention and we, we considered
for a little bit like, should we be
using this kind of technology instead?
And we, um, we pretty quickly decided
that it wasn't, that wasn't a viable
option for what we wanted to do.
Um, LLMs,
uh, not to get too wonky here,
but LLMs replicate what in humans
is called procedural memory.
It, it, it's sequences.
So do A, then B, then C.
Um, shuffling a deck of cards, like, you do it in a certain order, you coordinate your muscles in a certain order.
Uh, riding a bike, writing a sentence.
Um, and that works really well, uh, in certain kinds of situations, where knowing the right behavioral pattern is the key to success.
So, um, think about
like a game like chess.
Success in chess is largely a matter of good tactics.
Like you, you have your pieces here.
Which piece do you move?
Okay, you move this piece.
I have three options.
What's the best place to move?
Uh, that's all very
procedural, very tactical.
Um, there's a psychologist named, uh, Hogarth, who, uh, called this a kind learning environment.
Like there, there's definite rules,
there's stable rules, you can
know what they are ahead of time.
Um, LLMs work really well in
those kinds of situations.
So if you go in and say, Hey, I
want to write this boilerplate
piece of code, um, that, that is
like a pretty common use case.
Like they, it, it can say, okay, what's
the right behavior, the right tokens to
put together to, to satisfy that use case.
Um, but we found that with
customer engagement, um, you
don't have clear rules like that.
It's what, it's what he called a wicked learning environment.
So, um, there aren't rules.
If there are rules, you
can't know them anyway.
Even if you can know
them, they keep changing.
And so in that case, what humans do
isn't to memorize behavior patterns.
'cause the behavior isn't going to
be consistently successful 'cause
your environment keeps changing.
Instead, you remember environmental cues.
You learn features of the environment and
say, when I recognize this kind of thing,
at least I know what I'm dealing with.
And then I will, I will figure out my behavior on the fly.
That uses a different kind of memory, what, what's called semantic memory.
It's categorization.
And humans take these categories of, here's what's important in our environment, and we use, uh, what's called associative learning to basically add positive and negative weights onto those categories, um, over time, from our experience.
So, uh, you learn, okay, you wanna go someplace to work, at a cafe.
And so you have several different cafes, and you know that this cafe has great wifi, but their coffee stinks.
And this cafe doesn't have great wifi, this one doesn't have, uh, outlets, but they have great muffins.
Like you're, you're taking all these
experiences and putting them all
together to kind of get the lay of
the conceptual land, and then you
make a decision based on how you
feel about your different options.
And that's what user engagement is.
You have all of these different
options for engaging users.
You have just basic things like what time of day and what day of week, how often should you contact them?
You have things like, I have,
I'm an e-commerce company.
I sell clothing, and I could, I
could talk to 'em about shirts
or shoes or dresses or pants.
What do I talk to them about?
You have, uh, value propositions,
like why should they engage?
You might have different incentives.
You can offer different tones of voice.
There's like countless things
that, that are ways you can
categorize the experience for them.
And you're not going to be able to learn,
oh, the user looked at this and then did
this, and therefore this is obviously
the way I should interact with them.
That that kind of behavioral stability
doesn't exist in human, in humans, period.
Um, but you can say, over time I have found that, even though I sell 50 different kinds of clothing, this user only uses me for shirts and pants.
Apparently they get their shoes and hats
and everything like that somewhere else.
So that tells me something about
how I can interact with them.
Either are you in the market
for the things you normally get?
Let me try to do that, or I'm gonna
have to make a cross sell attempt, which
is different from approaching someone
who's already in the market for it.
So you have to create different
content for those situations.
And then you could also say, well, this
person doesn't so much care about price
point, so incentives aren't really gonna
make much of a difference for them.
They're willing to pay for quality,
but they want to be assured that it's
quality, so they want social proof.
So that means I need to put together
communication that emphasizes
ratings, um, reviews people have
given, um, popularity of the item.
Is this something that a lot of people
on the, on the system are buying?
Stuff like that, that's going
to be more likely to succeed.
And so, when our agents are assigned to a user, the business populates the categories, and then the agent is going through and trying content that indexes to each of these categories, and it creates a weighting system for each category that tells it, here's how good of a bet it is.
And it uses that to then drive its next decisions.
So then we realized that's really what we were trying to do: we, we weren't trying to optimize behavior within a session.
We're trying to get behavior better over days, weeks, months.
Um, an LLM doesn't do that.
An LLM suffers from what's called catastrophic forgetting.
Like, they don't, they don't remember, they can't keep that much in mind.
We needed a, an agent that had a brain that could keep that history in mind, but not the whole raw history, a consolidated, here's-my-lessons-learned history.
Chris : Hmm.
Schaun Wheeler: And that, that just wasn't something the LLMs are built to do; they're the wrong mechanism for it.
Um, and other systems, like traditional ML systems, aren't built for that either.
Bandits aren't built for that either.
That was, that was one of the biggest
challenges when we realized, wow, we
can't actually use nice established,
boring technology to do this.
We have to figure out a
different way to do it.
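A minimal sketch, in Python, of the consolidated, lessons-learned profile idea described above. Every name here (UserProfile, the category labels, the weights) is a hypothetical illustration, not AMP's actual data model; the point is only that the agent keeps running per-category weights for one user rather than the whole raw event history.

```python
# Hypothetical illustration, not AMP's actual data model: a per-user profile
# that stores consolidated positive/negative weights per category instead of
# the raw history of every event.
from collections import defaultdict

class UserProfile:
    def __init__(self):
        # weights[(category, option)] accumulates associative evidence over time
        self.weights = defaultdict(float)

    def record(self, category, option, outcome_weight):
        """outcome_weight > 0 when the experience went well, < 0 when it did not."""
        self.weights[(category, option)] += outcome_weight

    def ranked(self, category):
        """Return this category's options, best bet first."""
        options = {opt: w for (cat, opt), w in self.weights.items() if cat == category}
        return sorted(options, key=options.get, reverse=True)

profile = UserProfile()
profile.record("topic", "shirts", +1.0)     # shirt messages get engagement
profile.record("topic", "shoes", -0.5)      # shoe messages get ignored
profile.record("send_day", "friday", +0.8)
print(profile.ranked("topic"))              # ['shirts', 'shoes']
```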
Chris : Yeah, it, it sounds like you've really tried to create something that, uh, brings as much value to the user as possible, a very, uh, unpredictable user.
So I guess, um, you know, we kind of spoke before the session about how you created your agent-based system architecture.
You know, the, the million dollar question is how, how have you, you know, built that around the, the user?
And, you know, we, we don't expect you to give the, the secret sauce away here, but, um, you know, what, what's, um, what's been your process there?
Schaun Wheeler: Uh, so we're actually
pretty open about, we, we, we are, we
don't, um, I don't so much believe in
secret sauce, so, uh, we, we actually
try to be pretty open about this.
Um, the core idea behind the system
is that instead of trying to predict
what works on average for a large
group of people over some period
of time, like that's what, that's
what A/B testing or bandits do.
Um, we wanna figure out what
works for each individual person
at a very specific moment.
So that's not a, that's not
common in machine learning.
The standard assumption is if
it worked for a group of people,
it'll probably work for you.
Um, but real people don't work that way.
So the first step we take is to look at a stream of behavior.
In this case, it's the, the app event stream, um, that we, we use.
Um, but we don't just define like,
okay, here's my conversion event and
we're gonna try to optimize that.
We actually use the entire event stream.
So every button, click, every screen,
view, every add to cart, the entire thing.
If your, if your engineers instrumented it in order for your app to work, the agent can use it.
Um, and so it goes through and we predict
for each of those events, the probability
that the user doing that event is going
to enter some kind of desired end state.
That's your goal, like
your conversion event.
The agent makes that probability judgment
based on several different factors.
One, it looks at the user's history.
Does this user come every day or have
they not been around for 90 days?
Um, it looks at app dynamics.
Is this a particularly busy
time of day or busy day of week?
Is it a sale day?
So there's extra people on the app anyway.
Um, but then it also looks
at the nature of the event.
Did they just come and view the home
screen or did they come and like
actually add something to their cart?
It uses this to create this probability
of like, here, here's the chance that
this person right now doing this thing
is moving in the direction we want.
And then it drops out the information
about the event itself and gets a
baseline estimate of the probability.
Meaning if someone did nothing at this
moment in time, what are the chances they
were moving in the direction we wanted?
This gives us a baseline, an actual estimate of directionality.
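A hedged Python sketch of the two probabilities described here: one estimate that uses the event itself plus the user's history and the app dynamics, and a baseline that drops the event-specific information. The feature names and the classifier choice are assumptions for illustration only, not the model AMP actually uses.

```python
# Illustrative only: feature names and model choice are assumptions.
import numpy as np

def build_features(event, user_history, app_dynamics, include_event=True):
    """User history + app dynamics, plus (optionally) the nature of the event."""
    features = [
        user_history["days_since_last_visit"],   # comes every day, or gone 90 days?
        app_dynamics["is_busy_period"],          # busy hour, busy day, sale day?
    ]
    if include_event:
        features += [event["is_add_to_cart"], event["screen_depth"]]
    else:
        features += [0.0, 0.0]                   # event information dropped out
    return np.array(features, dtype=float).reshape(1, -1)

def directionality(model, event, user_history, app_dynamics):
    """Actual vs. baseline probability that the user is moving toward the goal.
    `model` is assumed to be any fitted classifier with predict_proba, trained
    on historical (features -> reached desired end state) examples."""
    actual = model.predict_proba(build_features(event, user_history, app_dynamics, True))[0, 1]
    baseline = model.predict_proba(build_features(event, user_history, app_dynamics, False))[0, 1]
    return actual, baseline
```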
This is actually something
we do as humans all the time.
When we're talking to people, we don't,
if I'm trying to convince someone of
something, I don't blindfold myself
and say my whole piece and then say,
okay, now I'm gonna look at you and
see how you, what you thought about it.
I'm watching the entire time to see how you react.
And if you're looking away, if
you're looking at your watch, if
you're hemming and hawing, I'm
gonna change what I'm doing because
it's clear that it's not working.
Whereas if you're leaning forward,
if you're engaged, if you're asking
questions, I will double down on
what I'm doing and, and keep going
because it, it's, it's working.
Um, we needed a way for
agents to be able to do that.
And so that's what this
instrumentation does.
And then, once they have this information to be able to essentially read the room, we, we adapted a, uh, an econometric method, uh, difference in differences.
Very standard method for
doing causal analysis.
Um, not, uh, normally suited for
reaching conclusions based on sparse
data about individual treatments.
So we had to adapt it, but we basically
take all of this signal that we've
created both after we send a message.
So we look at what someone did afterwards,
but we also look behind and say, what were
they already doing at the time we sent?
And then we subtract out
that baseline estimate.
So say what, what, what of all of this
information could have been explained by
just, this was business as usual for them.
And then what we're looking for
is the after information should be
bigger than the before information.
So we should see that the
user was doing some things.
We had an intervention like that,
we engaged with them in some way,
and then they did more things.
And then the agent translates
that, um, into a beta distribution.
So, so not only an estimate of how, how
impactful was this thing, but also how
confident am I that it was impactful.
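A small Python sketch of the update described here: subtract the baseline change from the observed change, then fold the resulting lift into a Beta distribution for that treatment. The mapping from lift to Beta pseudo-counts is my illustrative assumption, not the exact formula AMP uses.

```python
# Illustrative assumption: how one treatment's observed lift might be folded
# into a Beta distribution that tracks both impact and confidence.

def did_lift(pre_signal, post_signal, pre_baseline, post_baseline):
    """Difference in differences: the change observed after the message,
    minus the change we would have expected anyway (business as usual)."""
    return (post_signal - pre_signal) - (post_baseline - pre_baseline)

def update_beta(alpha, beta, lift, scale=1.0):
    """Positive lift adds (scaled) evidence of success, negative lift adds
    evidence of failure, so the distribution encodes impact and confidence."""
    if lift >= 0:
        return alpha + scale * lift, beta
    return alpha, beta + scale * (-lift)

# Example: a small positive lift nudges this treatment's belief upward.
alpha, beta = update_beta(alpha=1.0, beta=1.0,
                          lift=did_lift(0.10, 0.25, 0.10, 0.12))  # lift = 0.13
```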
Um, and then we aggregate those over time.
So in the end, the agents have a profile for each user: every, every action you could take, um, each day of the week, each value proposition gets a beta distribution for that user.
That is, uh, a, a summary of how they have responded to individual treatments over time.
And the agent is picking from those
distributions and going with the
actions that have the largest picks.
So if it says, I need to have a
value proposition, I have five
value propositions, draw from each
of those and pick the largest one.
If the, if the agent has very little
information, if the user hasn't
responded much, those distributions
are all gonna be very flat and
centered around a 50% probability of
success, which means essentially the
agent's going to randomly explore.
It's just gonna move around
and try different things.
The more someone responds to something,
that that distribution will go from being
flat to become peaked and it'll move up.
And so it will get
picked a lot more often.
So the agent will move from exploring to exploiting what it's learned.
But if the user then changes their behavior,
say you used to message them Friday
at six o'clock, that works really
well for them to have them order food.
Um, but then they, they change their life.
Circumstances change.
They get a new job.
Their kid has, uh, a sports program
at five, so they can't eat that time.
The, the agent will recognize that
what they were trying isn't working
anymore, and that distribution
will start to flatten out again and
they'll go back into exploration.
So instead of having to have a human
say, let's explore now more, let's
exploit now more, um, the agents
are actually making that decision
dynamically, um, based on the feedback
they're getting from the user.
I know that's, that's
a fire hose of information.
But that's, that's, that's the core
kind of machine that's driving the
agent's decision making is this, um,
individual assessment of, I did a thing,
did it move the user in a direction?
I'm then going to aggregate that
information over time into these
distributions and use those to make
decisions about what to do next.
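To pull the loop together, here is a hedged Python sketch of the decision mechanism as described: one Beta distribution per option per user, sample from each, act on the largest draw, and let the parameters decay toward a flat prior so an option that stops working drifts back into exploration. Class names, parameter values, and the decay rule are assumptions for illustration, not AMP's actual implementation.

```python
# Illustrative sketch; names and numbers are assumptions, not AMP's parameters.
import random

class OptionBelief:
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha   # evidence this option works for this user
        self.beta = beta     # evidence it does not

    def sample(self):
        return random.betavariate(self.alpha, self.beta)

    def update(self, success, failure, decay=0.98):
        # Decay pulls both parameters back toward the flat prior, so a
        # once-reliable option gradually flattens and re-enters exploration.
        self.alpha = 1.0 + decay * (self.alpha - 1.0) + success
        self.beta = 1.0 + decay * (self.beta - 1.0) + failure

def choose(beliefs):
    """beliefs: dict of option name -> OptionBelief for one user.
    Sample every option and act on the largest draw (Thompson-style)."""
    draws = {name: b.sample() for name, b in beliefs.items()}
    return max(draws, key=draws.get)

# Five value propositions, all unexplored: flat distributions, near-random picks.
beliefs = {f"value_prop_{i}": OptionBelief() for i in range(1, 6)}
print(choose(beliefs))
```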
Chris : Hmm.
I think that ties in well to,
to the next question around
experimentation and model reliability.
How, what's your approach to research? Is it led by, you know, the model performance and how the agents are performing, or, um, how, how do you actually approach that?
Schaun Wheeler: So we don't really have a model, not in the sense that most people use the term.
Like, we, we have some models that underlie some of it.
Like for instance, we use a predictive
model to get those baseline and
actual probability predictions.
But in terms of the, the agents
actually making decisions, um,
there isn't a shared model.
It's not like all the agents are all,
all like referencing the same thing.
They're referencing the one user history.
Um, they will share information if they lack a decision-making criterion.
If they've never tried value proposition A, they will look to other agents and say, how does value proposition A tend to play with users?
Okay, now, now I'll decide whether I should try that one with my user.
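A tiny Python sketch of that fallback, under my assumption (for illustration only) that the shared information is folded in as a softened population-level prior on the untried option.

```python
# Illustrative assumption: seed an untried option's per-user Beta belief from
# population-level counts, shrunk so the user's own responses dominate later.
def seed_from_population(pop_alpha, pop_beta, shrink=0.1):
    return 1.0 + shrink * (pop_alpha - 1.0), 1.0 + shrink * (pop_beta - 1.0)

alpha, beta = seed_from_population(pop_alpha=40.0, pop_beta=10.0)
# alpha = 4.9, beta = 1.9: a mild lean toward "worth trying", easily overturned.
```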
Um, but in the end, user behavior is inherently unpredictable.
Um, in fact, we, we call it prediction, but it's not prediction.
It's guessing.
Uh, we're, we're trying to make good bets here.
Um, so we assess the health of the
system the same way you would assess, um,
the performance of a, a money manager.
So, uh, a money manager can put money in lots of different places, and
if one place is not performing very well
and hasn't been performing well for a
little while, you might withdraw some of
your money from that investment and put
it somewhere that's performing better.
Um, that's really what
the agents are doing.
So, uh, it's not that you can confidently predict that, if I put my money here, I'm getting this much return.
If we could do that, we'd put all of our money into the best basket.
But in fact, what you want to do
is hedge your bets throughout.
So if the user does something
spontaneous, something you didn't,
that you didn't expect, you can
still accommodate that very quickly.
Um, and if the user changes their
behavior from something that was
established before, you don't get locked
into doing that thing over and over
again just because it used to work.
You should recognize pretty
immediately something has changed and
start hedging your bets some more.
Um, so a lot of measures of like model
reliability and, and validity and things
like that are, are not, I, I won't say they're inapplicable in this situation,
but it's a different kind of game because
what you're really doing is, is trying to
find a good betting strategy, not so much
trying to find some sort of, um, accuracy
measure that you're satisfied with.
Chris : Well, uh, yeah.
Thanks for, um, thanks for sharing.
That's super interesting.
Moving on to more about, uh, building AMP and, um, you know, less, um, technically focused questions: you know, lots of startups start, but not everybody, uh, stays in the race.
So, what were some of your challenges while building AMP? And again, it could be technical, it could be operational, strategic.
What has been the real challenge for you guys?
Schaun Wheeler: Good question.
Um, well, I can start
with a technical one.
Um, one of the biggest challenges was that there wasn't an existing method for doing what we wanted to do.
Chris : Hmm.
Schaun Wheeler: So A/B tests aren't adaptive.
Bandits are adaptive, but they're still working at an aggregate level.
ML can be individual, well, if you're willing to invest in it a lot, but ML isn't grounded in counterfactuals.
It's, it's wed to what it's already seen.
It's not actively producing new options.
It took us some time and a lot of trial and error to figure out how to take what we knew was already happening in the human brain, like how, how people make decisions about complex environments, and to make that happen in the agent.
It wasn't something where we
could just take an existing method
off the shelf and apply that.
Um, so that was, that was a challenge, although I think the technical challenges were, were more manageable in retrospect.
Um, one of the biggest challenges we
probably faced was, uh, in the very early
days, we realized we'd underestimated how
much time and attention and resources we
would need to devote to creating a user
interface for people who managed content.
Um, we, we originally assumed like
that there's already lots of systems
out there that manage messages and
we kind of assumed that yeah, people
will like build their thing there
and then we will orchestrate it.
Like our agents will take care
of deciding what to send to whom.
And we learned, we learned
pretty quickly that uh,
a system that's built for a
non-agent context cannot be
easily adapted to an agentic one.
Um, the, a lot of these messaging
systems are so wed to the concept of
campaigns and segments, very static.
I've made a decision and now I
live with it kind of decisions.
Um, and that's, that, that's actually, like, the opposite of what it means to operate agentically.
And so we actually had to invest a lot of, a lot of resources into building a user interface that would allow people to create content in a way that the agents would be able to act on it.
Well, it's not just a matter of
saying, oh yeah, I just have my
CRM team do what they always do.
I throw it to the agent, and the agent
makes, magically, makes it make sense.
Um, the, it's that old garbage in, garbage
out, uh, saw that you have to, you have
to have, uh, the right kinds of inputs.
Chris : Okay.
Um, cool.
And, um, speaking to you in the past, I know you're completely remote.
For those that don't know, um, you've got companies like Amazon, with Andy Jassy pushing everyone back to the office.
Uh, I think some startups are, uh, managing to secure some real hot talent because of that.
But, uh, also a lot of companies who were completely remote have now pivoted to a, a hybrid approach at the very minimum.
I think, um, you, Schaun, and AMP are sticking by your guns.
And as I say, it's something you're really proud of.
You know, how, how do you manage a, a culture?
How do you manage culture at, at a global scale?
Schaun Wheeler: Um, so yeah,
we are all over the place.
I, I, I'd say at most we have four to
five employees in one geographic location.
Um, and that's only true
in two different locations.
Um, in, in most cases we have, like, one employee in a location, uh, one is in Paris, for example.
And that was important to us from the start.
I mean, part of it, we actually
founded, uh, amp right at
the start of the pandemic.
So, uh, we, we had to be remote, uh,
in the beginning, but also, um, my
co-founders, um, at the time we were
all spread out all over the place.
We were in, we were in North
Carolina, Paris, and Singapore.
Um, and so, and none of us
had a great desire to move.
So the, the fact that we wanted,
we really wanted to work together.
We didn't wanna work physically together
because we were happy with where we were.
Um, I think that colored a lot of how we, we structured, uh, this.
But also, we, we knew that the
decision to stay in one place carries a
real cost in terms of getting the team
you want, because it's not just finding
people with the right skills and the right
fit for the team and everything like that.
It's finding people who are willing to live in the place you want them to live.
And being more flexible on that allowed
us to be actually a lot more stringent,
uh, on the other criteria of what
we were looking for in team members.
Um, I, I don't think our team, I mean,
this is the best team I've ever worked
with in my career, and I don't think it
would be, um, as good a team if we had
had to, uh, force people to accommodate
certain geographic preferences.
Um, but it does take, uh, it, it takes some adaptation.
Uh, we, we say this often internally, and I do think it's true:
managing a fully remote team and accommodating that team, um, requires an immense amount of trust.
Like, it's pretty rare, actually, for any of us to do, like, regular, scheduled one-on-ones with our, our team members.
Our, our doors are open all the time,
but we aren't sitting there saying,
okay, give me an update on this project.
If someone takes on a project, we trust that they have that project, and that if there's a problem that's going to change the, the delivery date or anything like that, they're gonna notify us early, and in fact, reach out to other teammates for help.
Um, it, it's, uh, it requires a shift for people who join us.
Um, 'cause they're, they're used to having a lot more oversight, um, and not having as much ownership over what they do.
Um, one guy that we, we just hired
recently, uh, I just had a conversation
with him and, and he, uh, he just
brought it up outta the blue.
We were just talking.
He says, can I just say, I have never worked with an engineering team that was so willing and ready to have strong opinions about how, uh, how the product should be built.
Engineers always have strong opinions
about, no, this is bad coding
practice or something like that.
But they, they were like, no, the
business shouldn't be going in
this direction because of this part
of the system that you don't know
about 'cause you didn't build it.
And he says, normally engineers are like, okay, tell me what your requirements are.
What, what are the constraints?
When do we need this by?
I'll do my best to work within that.
But he said, this felt
like real partnering.
Uh, it took some time.
It, it still takes time whenever
someone joins for us to, to help
them get used to that and recognize
that we don't, we don't, uh,
punish anyone for bad decisions.
Because the, the only way you make good decisions is by making bad decisions and then, and then learning from 'em.
So we are, we are, uh, we are
very, very committed to the concept
of blameless retrospectives.
Like we, we've all done stupid things.
You can't not do stupid things
when you're building a startup
'cause you're moving so fast.
And so, uh, when a stupid thing
happens, the question isn't, why
did you do that stupid thing?
It's why did we as a company
have things set up that that
stupid thing was possible?
And that it, it, it shifts
the conversation a lot.
So yeah, there's, there's lots
of aspects of, of like figuring
out how to deal asynchronously
with tasks and stuff like that.
But I think at the core it's really
a matter of, um, of, of allowing
that level of trust and ownership.
At an individual level, if we bring
someone on and we, we feel, if we feel
good enough about someone to bring
them onto the team, we have to feel
good enough about them to let them
make their own decisions, even if
it's not the decision we would make.
And we will do our best to accommodate
that, and we will adjust, uh,
iteratively as we go forward.
It's not easy, it hasn't been easy, but, um, I actually would have a hard time going back to a different way of doing things.
I, I, I've, I'm very proud
of, of the way we've, we've
built ourselves to operate now.
Chris : Yeah, I, I'm the same.
I worked in recruitment, which is very traditional in the sense that everyone's in the office.
And since I've founded my business, I've been working remotely.
It sounds like you're in a good place, by the way, with a high degree of accountability and, and ownership, and people, people really struggle to go back once, once you make that switch.
Schaun Wheeler: Yeah.
Chris : Um, would you say, I know
you touched on you, you chose to go
remote at the start of the pandemic.
Would you say it's evolved in terms of your approach? You know, you've always been remote first, but has it evolved since the pandemic?
Schaun Wheeler: In a number of ways.
I mean, we always have to.
Um, although I don't, I don't know that it's so much that.
I guess the only part that's really evolved from in-pandemic to post-pandemic is, um,
we, we regularly will bring team members together into the same geographic area for, like, a week.
Um, we usually do, uh, at least
one whole company trip a year.
Um, and it's a, it's a working trip.
Like we do some things where, yeah, we'll
be tourist for a couple days and, and,
and do some stuff like that, but for the
most part, we find a really nice place to be, um, nice as in the geographic location, but also, like, a, a nice hotel, where we're not just in a windowless conference room.
Um, and we will pick things
to hack on for that week.
And there, there are certain problems that really don't get solved nearly as quickly or as easily remotely.
Um, like, there're things that you can have 50 calls about and still not solve.
And then you get together for half an hour
face to face and suddenly it's solved.
And so, um, we've built that in
not only for a whole team, but also
we'll, uh, uh, I just came back a
few weeks ago from Dubai where we
brought all of our engineers together.
Um, we've, we'll sometimes bring the
go to market team together, like we,
we do, uh, that on a regular basis.
Um, and that obviously wasn't,
uh, an option during the pandemic.
Um, other than that, um, I'd say most
of the changes we've made to how we
work remotely are less a function of
we're no longer in a pandemic and more
a function of our team is growing.
So there are certain things that
work really well when you only have
to coordinate three people, but
when you have to coordinate 30, it's a totally different ballgame.
And so, uh, that, that kind of, uh,
growth accommodation, I think is, is
kind of the larger driver of the, of
changes in the way we, we do remote.
Chris : Makes sense.
Sounds like a, a really sensible and,
uh, grown up way to, to, to do things.
So, uh, yeah.
Commend you for it.
Um, okay.
And, and moving on to just the future and, and a bit of fun. In terms of, uh, nobody knows, it's like, how long is a piece of string with AI at the moment?
I guess no one can accurately predict anything, uh, beyond six months.
But in terms of you being in the know
around agents and what you're developing,
um, where do you see agents heading and
where would you like to see them heading?
Schaun Wheeler: So right now I
think agents are, uh, in terms of
how people view them right now and
what the assumptions they make about
them, they're, uh, overwhelmingly
assumed to be LLM powered.
Um, I mean, mostly if you talk to people and you mention
agents, they automatically assume
you're talking about an LLM.
Um, and obviously that sticks in my craw
a little bit because we don't do that.
But, um, I do think that will change.
Um, there's some really important
and interesting work going on,
uh, regarding world models.
Um, neuro-symbolic reasoning.
Um, I guess what I'm really saying is, I, I think agents are increasingly going to have to move past this procedural paradigm, the idea that that's a good basis for making all kinds of decisions.
It's actually a good basis for making only a very small subset of decisions, and I think we're going to have to move beyond that.
But obviously I'm, I'm biased.
Chris : Okay.
Very interesting.
And, um, yeah, what, what tools do you use, Schaun, to, to keep you productive?
I know you kind of touched on, um, your
coding partner with your, your LLM.
Um,
Schaun Wheeler: I'm actually gonna give
you a disappointing answer on this one.
I, I actually, I, I use, uh,
just very basic bare bones tools.
I, I tend to, uh, waffle between ChatGPT and Gemini.
Um, I, I don't use, uh, any
of the specialized ones.
Um, partially, that's just maybe my, my personality.
But, um, I also try to be very
intentional about which tools I bring on.
Um, every time we, we offload part of our decision making to a tool, um, we lose something as, as humans.
And that, that's not a new thing; it's been happening ever since we've been a species.
Um, but, uh, it's, it's, uh, you,
you tend to get better at thinking
in certain ways and worse at thinking
in the ways that you offload.
That's, that's the point of
offloading, um, is you don't
have to think that way anymore.
And so I try to be very intentional about which tools I bring on, and in what situations I even use a basic LLM.
Um, I do use them all the time.
Uh, that I, I mean, I think
it's an invaluable tool.
Um, but yeah, I, I haven't spread out into, uh, a lot of the, the more specialized tools.
That might be me with my startup founder hat on, in that I realize that a huge amount of these tools have no moat.
Even the ones that have large market share right now, the, the cost of, of switching to one of the alternatives is so low that, um, I, I'm waiting to see which winners emerge.
So maybe it's just, I don't wanna,
I don't wanna learn a tool and then
find out that it wasn't the winner
and that I have to switch anyway.
So, uh, I, I tend to be a little more conservative on, on the choice of, of LLM-based tools.
Chris : Nice.
And, uh, yeah, speaking of, uh, you know, doing your own thinking, um, you know, what would be one book, podcast, or even ML paper that's really, you know, challenged the way you think, or that you would recommend to someone?
Um, you know, what's the first one you'd recommend?
Schaun Wheeler: That's a
hard question to pick one.
Um, I, I don't know if this would
really be my first one, but it's
the first one coming to mind.
Um, there's a, there, there's a book.
It's not well known.
It's actually written by a philosopher of science named Peter Achinstein.
It's called The Book of Evidence.
Um, it's, it's a, it's a pretty, like,
there's some formal logic in there.
Like, it, it, it's not a,
it's not, it's a dense book.
Um, but it really changed the
way I thought about evidence.
Um, so much of our conversation about, um, evidence or proof or, or, uh, a, a justifiable basis for decision making is wed to, um, thinking about particular methods or tools, um, or experimentation setups.
Um, and that book really shifted
my thinking to looking at the idea
of competing hypotheses as the core
thing to define, which is actually
something that is pre methodological.
Um, and you can have this really,
really robust method, but if you
have not, um, defined your landscape
of hypotheses correctly, um, and
robustly, um, the method isn't really
proving what you think it's proving.
Um, and I think that opened me up a
lot to, um, considering ways of, um,
approaching evidence that could look at,
at, uh, I mean, essentially what we're
doing is looking at an n of one, uh, when, when our agents are making decisions.
I, I was taught in statistics classes that's a, that's a terrible idea.
It's nonsensical.
Um, but I, I think there is a
basis for doing it if you do the,
the hypothesis work beforehand.
Um, so yeah, it, it's, it's kind of a funny book to recommend.
Like it's not, it's not a particularly
enjoyable read, but it really did
change my thinking quite a bit.
Chris : Nice.
I'll, uh, yeah, made a note.
I'll have a look into that one.
Um, and, agentic infrastructure aside, and, and final question, by the way: um, you know, what excites you about AI in terms of breakthroughs? You know, it could be robotics, or what, what change are you looking forward to the most within AI?
Schaun Wheeler: Um,
I, I, I, I try to follow a
good, uh, diversity of people
on social media in terms of ai.
So I, I, I follow a lot of
the starry-eyed tech bros.
And I also follow a lot of
the, the AI is going to destroy
civilization, uh, skeptics.
And, um, and I, I think, I think there's value in all of those, uh, perspectives.
I mean, whenever there's a new technology that changes how people think, you can tell how influential a technology is by how many people say it's going to portend the end of the world.
Um, like, they did that with television, they did it with airplanes, they did it with bicycles, Socrates did it with writing.
That, that's a common thing.
Um, and I don't wanna use that
to dismiss the skepticism.
I, I, I, I share some of it.
Um, but I, I think what a lot of the,
the skeptics miss in focusing on, oh
look, they're, these companies, they're
taking a whole bunch of data often
without the permission of the people who
generated that data in the first place.
And they are, um, creating these
machines in order to make a lot of money.
And that, that, I mean,
that, that, that's all true.
Um.
I think they're underestimating the
second order impacts of these kinds of
systems, especially on the way they can
influence underprivileged communities,
people with disabilities, like there's
usually not a lot of money in helping
people who did not have equal access to
education to level the playing field.
That's usually, like, an NGO that does that; you don't get a corporation that invests in that.
But that leveling of the playing field was actually, uh, um, according to a, a New York Times article I read a while ago, one of the main motivations behind Microsoft's investment in OpenAI.
Uh, their CTO had come from a background like that and, and saw it that way.
Um, like, if this can work, this is taking someone who maybe doesn't speak the way people expect them to speak, um, and maybe doesn't have the same, uh, kind of procedural skills that someone who had gone to four years of, of some college would, would have.
And this can help even that out.
A lot of, um, technology, I think
in helping people who have physical
disabilities is going to come out
of this same kind of technology.
It's the same, um, methods,
the same processes.
It's just a different application.
I'm, I'm actually very excited to see
the way this moves into those spaces,
um, and that, that touches on robotics, that touches on LLMs, everything.
But I, I, I think there's, there's tons
of potential there that's not going to
get funded directly, but it, but can get
funded, can get supported indirectly by
further developing these technologies.