The NeuralPod AI Podcast
The NeuralPod is all about deep diving into technical machine learning topics and showing their real-world impact, bridging the gap between AI-first companies and adoption.
We chat with subject matter experts across reinforcement learning, recommender systems, deep learning and generative AI, as well as with business leaders, VCs, and operations and strategy experts.
Who are NeuralRec.ai? NeuralRec is a recruitment and staffing agency. We build niche ML teams and represent some of the globe's best ML talent.
Chris : Hi Schaun Wheeler, uh,
Chief Scientist of AMP.
Welcome to the show today.
Schaun Wheeler: Glad to be here.
Chris : So, um, yeah, let's just jump into this.
Do you wanna just start by, uh, talking us through your career history, an introduction to you, and, and how you got to be the, uh, the Chief Scientist at, at AMP?
Schaun Wheeler: Uh, sure.
Um, so let's see where to start.
Uh, I'm a, I'm a cognitive
anthropologist by training.
Um, so that's different from general
anthropology in that it looks
at how people think and organize
knowledge and not just how they live.
Um, and it's different from psychology
in that it's not, it's not based in a lab
trying to understand individual brains.
It's trying to understand
shared ways of thinking.
So, um, my, my earliest experiences, uh, in research were actually doing ethnography in a Central Asian country named Kyrgyzstan.
Um, that's a, that's a folk instrument, uh, from Kyrgyzstan back there.
But, um, my, my research was on, uh, social movements.
Um, people who, uh, joined religious movements. I interacted with and interviewed people who had, uh, engaged in a, a nonviolent, uh, overthrow of their, their corrupt president.
Um, things like that.
I basically studied how people
came to make up their minds about
a subject strongly enough that they
were willing to sacrifice for it.
Um, so that got me a job, uh, with the
US Department of the Army of all places.
Um, at the time the war in Afghanistan
was, uh, was, um, the largest conflict
that the, the military was involved in.
And, uh, they brought in me and several
other people to try to help them treat
the conflict less like a military
problem and more like a people problem.
Um, but the US Army is very, very
good at many things like logistics
and maneuvers and, and taking on huge
tasks on very short notice, but they
aren't all that great with people.
So, um, our group was responsible for kind of making sense of all of the unstructured data coming out of the war.
Um, that's actually, that's where I, I hacked myself into being a, a data scientist. I, I taught myself, uh, how to code, started teaching myself machine learning.
Our team was actually one of the very first to use machine learning for intelligence analysis in the US government.
Chris : Oh, that's cool.
And do you think, um, anthropology, that connection with how people think, has played a part in you going into, uh, agent infrastructure for kind of marketing?
Schaun Wheeler: I always actually say that anthropology, uh, has more to do with me being a good data scientist than anything else.
Uh, data science in general, but agentic, uh, approaches to, to decisions, uh, in particular, those are people problems.
It, it's a matter of human computer interaction.
Um, and if you go into, actually, just
like the Army went in saying, can we
treat people like they're bombs and tanks?
No, you can't, you can't go into a human
computer interaction treating a person as
if they are a computer, as if they're just
another interface, um, with your program.
So that, yeah, I, I, I find
that, uh, for a while I wondered
whether my, my background in
anthropology actually was worth it.
I mean, it was a lot of years to get that
PhD, but, um, uh, I, I find that I use it
more often than I use a lot of my methods,
like my technical methods training.
Chris : Yeah.
It's one, it's one of the pieces of feedback we get from, from our customers: you know, a good data scientist should be able to link it to the, the business problem.
And it sounds like that's
potentially where you started.
So I'm, I'm sure
Schaun Wheeler: Yeah.
Chris : uh, very worthwhile.
Um, okay, I'm, I'm moving on.
Uh, you know, your background in
civil service, education, ad tech,
um, I think that's a unique blend
of seeing machine learning, uh,
problems across different industries.
You know, I know problems can be
very specific to certain industries,
but, um, are there any kind of
common themes to machine learning
that you see across, across those
industries and how it's impacted them?
Schaun Wheeler: Yeah.
Um, I think the, maybe the biggest
theme I've seen is that, uh, data
science is the great unkept promise.
Like it, it was supposed to be
the, the sexiest job of the 21st
century and people were supposed to
gather all this data together and
somehow get tons of value from it.
Um, and there are certainly many
companies that have gotten a lot
of value, uh, from their data.
Um, but that's been very unevenly distributed.
Um, if you look at the companies who pay attention to this, Gartner, VentureBeat, who do surveys on this, they estimate that, like, what were the numbers?
I saw something between 70 and 90% of ML projects never actually get to the point where they yield value.
Um, and that tracks with my experience.
There, there's tons and tons of hope put in, uh, machine learning and data science, but it very often doesn't come to fruition for
Chris : Yeah.
Schaun Wheeler: most companies.
Chris : Yeah.
And do, do you think that's still the case with, like, a GenAI project?
Because I, I read something similar in a Harvard Business Review article a while back, around 80% of projects failed, but it was a, a three-year-old study.
You know, fast forward
Schaun Wheeler: Mm-hmm.
Chris : you know, what's happened in the last year is, um, you know, there's been a bit of a, quote unquote, gold rush with, with AI and a lot of hype, you know, a lot of companies investing into agents.
So do you think lessons have been learned, or is it still around 80%?
Schaun Wheeler: I, I, I think it's probably more true with, with agentic systems than, than, uh, traditional ML systems.
I mean, an agentic system is even less deterministic than an ML system.
You're not just introducing a random seed in there to get your, your, uh, your predictions.
Um, you can literally put in the exact same inputs over and over again and get entirely different responses.
Um, this is especially true if you're talking about, uh, LLMs, um, but any kind of agentic system is more in line with the, the uncertainty that you get from interacting with a human.
A human can interact in different ways depending on lots of different contexts.
Um, and I, I think that's, I think,
uh, I mean from what I've seen so far,
uh, I mean I, I know everyone is, is
rushing for the, the gold in there.
But, uh, most of the examples I've seen have been small proofs of concept.
Um, very fragile, reliant on
things working out just right.
Um, and as far as putting those
things into production, I've
seen very few examples so far
that have, have really worked.
I think we're way off there.
Chris : Yeah.
Interesting.
And, and I think in your, your experience,
just to zoom in on that a little bit,
what do you think the major pitfalls of
developing agents are at, at the moment?
Where, where do you see people,
um, going wrong or where do you
believe people are going wrong?
Schaun Wheeler: I actually think it's the same problem that, uh, I've seen with more traditional ML. Um, I've, I've never seen an ML project fail because you couldn't get the model to converge or because you couldn't get your feature set engineered or anything like that.
It's because you couldn't get
the right interface with the
human part of the organization.
Whether that is as simple as just not being able to get buy-in from the people who need to buy in, to not really understanding the human workflows that this system is supposed to inject itself into.
So, um, there were many times, um, throughout my career that I, or data scientists that I was working with, managing, mentoring, um, got all the way to where they had a beautiful technical product, but it couldn't get used, because they realized they had built it to solve a problem that didn't actually exist.
They had misunderstood the human problem.
And I think that's what I'm seeing in a lot of cases with, with, um, agentic systems as well.
You're, you have people who are, um,
making some pretty, uh, quick and
simplistic assumptions about what
the underlying problem is, and then
trusting that somehow the, the, the,
the data and the algorithms will
make sense of that problem for them.
But data doesn't, data doesn't solve your problems.
Data is a tool.
It's not a, it's not a person.
Um, and I, I think that's true of agentic systems as well as traditional ML.
Chris : Makes sense.
Well, hopefully you can talk us through, uh, AMP and some of the, uh, successful agentic systems you're working on here shortly.
Um, but yeah, just to keep in step with the introduction, I think one thing we like to discuss is, is just some of the lessons you've learned as a data scientist.
I think it's, you know, a hit and miss market for, for junior people with, with some of the AI systems coming out.
And, um, you know, some, some lessons you've learned.
And then with that in mind, how you would, and I appreciate there's no syllabus to this, but how, how you would position yourself in the current market.
Schaun Wheeler: How I would position myself, that's one I don't know I have an answer to, but I
think, I think if I've learned anything
about this over my career, it's that the,
the data science market, the tech industry
in general, but especially data science,
is very much a gamble in terms of, of
where you end up in, in your career,
um, where you're, you're faced with trying to get hired by people who do not actually know what they want you to do.
Um, everyone wants a, a sprinkling of that data science magic.
Um, but unless you are a data scientist or have worked with them extensively at a technical level, most people have only a very vague idea of what's actually going to happen.
Um, and that makes for a very
difficult hiring scenario.
So, um, the way I, I handled
it in my career was, uh, I
embraced that, that ignorance.
So, uh, finding a, a job that isn't doing what you want to do, but where, because they don't know what they want you to do, it actually gives you freedom to steer the ship a little bit.
You have more free time, maybe, than you would if someone actually knew what they wanted you to do.
They can't say, are you doing X, Y, and Z, if they don't know what X, Y, and Z are.
But also, you have more room to go through and suggest and say, I know you want to do this in a very simple way, but I have this fancy tool, and maybe it's a fancy tool that you just want to learn.
Um, but there's, there's value in that
for the business, but there's value in
that for you in terms of building up
your skillset that you can then leverage
to get into a job that's closer fit.
So, um, I went through several jobs
that were not a good fit for me,
um, that I took for a variety of
reasons, but a lot of the benefit I
got from them was, was from using that
as sort of an on the job training.
And I think that's a very valid way to,
to navigate a situation when there isn't
consensus on what, uh, best practices,
uh, look like for the most part.
Chris : Yeah, that's a, a really interesting approach.
And in terms of engineering at the moment, you know, you've got coding assistants like Claude or, you
Schaun Wheeler: Yeah.
Chris : know, insert assistant here.
Would you encourage people to really focus on, on engineering-first approaches, or being really strong engineers?
Um, or, you know, is it a case where you think, uh, software engineers might be replaced in five years, for
Schaun Wheeler: No, no, no.
They, they're not gonna be replaced.
Um, and anyone who has used, uh, anyone outside of, of, like, that rare breed of, like, LinkedIn influencer, anyone who has used an LLM to code, um, knows that they're, uh, they're pretty good at boilerplate.
Basically, I, I saw the graph of, like, Stack Overflow traffic, it has been going down, down, down, down, down.
Um, because it's easier to, to find, uh, the information you need through an LLM than, than by searching through it there.
But if you need to do anything at all, um, off the beaten path, or if you need to do anything that requires a large multi-part code base, um, the LLMs aren't there yet.
And I'm not sure that, I mean, to
get a context window that big and
to get the right attention mechanism
in place to handle a context window
that big, um, that's, that's not a,
I don't view that as a problem that's
going to be solved really soon.
Schaun Wheeler: Uh, in, in the end, LLMs are, in this way, like another person.
It's like pair coding, and no,
I, I don't, I don't think anyone would
ever ask, like, do you think we're
reaching a place where you're gonna
pair code and you're just gonna offload
the entire job to your pair coder?
Like, it's always this
conversation between you and,
and you do some things right?
And then the pair coder, like, catches something you did wrong and says, hey, you should change this.
You do the same with them.
That's a, that kind of back and
forth iterative relationship.
Like I, I use LLMs every time I code.
Um, but it's, it's always a very
collaborative relationship and a lot of
me correcting on scope, on pointing out
obvious things that, that were, are wrong.
Um, if nothing else, um, the more you offload your coding to an LLM, the less maintainable your code base is going to be.
Um, they're incredibly redundant.
They'll create three versions of the same function and name them all differently, because it can't remember that it created it back there and that it does the same thing.
Those are real problems if you're talking
about putting a system in production.
Um, because ultimately the LLM is not going to maintain your code base.
If something breaks, you need to be able to go in and make sense of it.
Um, so yeah, I, I'm a skeptic on the, on the LLMs-are-going-to-replace-engineers front.
Chris : I think, I think
that's a, a sensible approach.
And I'm sure some, uh, software engineers
listening will, will breathe a sigh
of relief after, after hearing that.
Um, okay.
Let, let's move on to, to AMP, which we'll break into two parts.
Um, some about the product that you're building, how you're developing the agent infrastructure, and, as I'd say, uh, from our talks in the past, a really, um, unique approach.
And then more around how you've, you've built AMP and, um, you know, some of the remote culture that, that you've built.
And, um, we'll move on to a bit of a quick fire round at the end.
But yeah, just introduce people to who AMP is and, um, give an introduction to agentic infrastructure,
'cause that may be the first time they've, they've heard that term.
Schaun Wheeler: Sure.
So, um, what AMP is, depends
on who you're talking to.
Um, if you're in product, then AMP is
an adaptive experimentation engine.
It does feature validation.
Um, if you're in lifecycle, it optimizes message copy, timing, and frequency for every user individually.
If you're in engineering,
it's orchestration.
Um, if you're in data science, then
it's a, a deployment layer, uh,
for your experiments and a source
of truth for those experiments.
Um, as, as an architect of the system,
I tend to think of it as, I guess you
could call it iterative alignment.
So every user on any app has things
they want and don't want, things
they like and and don't like.
Um, and we assign a dedicated agentic learner to each individual user on an app.
And that agent's sole purpose in life is to adjust the holistic user experience over time for that one user, to better line up with what that user wants.
So it cuts across the
entire user experience.
It's not just messages, it's not just
app screens, it's the entire interaction
landscape that this agent is managing.
And the idea is not to get the
user to change their behavior,
to fit what the business wants.
It's to get the business's, um, serving of content and interactions to change dynamically to fit what the user wants.
That way the user gets more of what they
want, and that gets into a virtuous cycle
where the business gets more of what they
want because their user base is happy.
Chris : Hmm.
We, we talked about successful agent, um, projects.
So what, what kind of companies or industries are you seeing getting the most, um, benefit from this kind of agentic infrastructure?
Schaun Wheeler: Um, so we, we primarily work with consumer apps.
Um, this works really well in any situation where, uh, the things the user does that really add value for you as a business are things that could happen any day.
Um, so e-commerce apps, it works extremely well with, with e-commerce.
This is a case where, um, yes, some users, like, don't have the money to spend at certain times, or aren't in market for certain things, but plausibly any user, if you get the right thing in front of them at the right time, could say, yeah, okay, I'll, I'll put out some money for that.
Uh, and so in that case, the
agents are very good at taking
users who are already active and
incrementing up their, their activity.
Taking users that aren't active and giving them undivided attention that brings them back, um, and cross-selling, things like that.
But we also, um, we work really well
in, uh, uh, gaming, uh, streaming.
Um, we also do well on, uh, in, uh, lots
of different subscription situations.
Um, finance management, uh, actually
streaming in some ways is, is both a,
a transactional system in that you want
people to watch or listen, but also a
subscription system in that you want
them to, to, um, pay for a membership.
That's, that's a lot of ways that
you, that you, uh, make money there.
So, uh, in all of those situations
we're, we're basically dealing with,
um, users sitting around and they
could do lots of different things
with their time at any given moment.
And what you're trying to do is
find the right moment to remind
them that this particular app is one
of the things they find value in.
It's one of the things
that they like doing.
Chris : Okay.
No, that's, that sounds, uh,
super interesting and we, we
kind of, uh, touched on LLMs at
the, the start of the show here.
Um, you know, AMP doesn't rely on LLMs.
Do you wanna just talk us through, kind
Schaun Wheeler: Sure,
Chris : of, some of the traditional AI methods and, you know, the unique approach that you're taking?
Schaun Wheeler: yeah.
Uh, we don't rely on LLMs, uh, really
at all, but there's the option.
Uh, as a, as a last stage, um, we
actually find that very few customers
take advantage of that option, um,
usually because of concerns around quality
control, brand voice, um, consistency.
Like, you don't want the LLM suddenly becoming racist, or, or offering 90% off or something like that.
Um, and so most, uh, like enterprise
companies are really concerned
about seeing what could go out
to a user and pre-approving it.
So, um, in the cases where they do use LLMs, it's usually, uh, to basically augment their human efforts to pre-populate a content inventory.
Um, so messages or content
that could go on an app screen
that's already been approved.
Um, but we, I mean we, we started this before ChatGPT came on the scene.
Uh, and when it did come on the scene,
I mean, it obviously caught a lot of
people's attention and we, we considered
for a little bit like, should we be
using this kind of technology instead?
And we, um, we pretty quickly decided
that it wasn't, that wasn't a viable
option for what we wanted to do.
Um, LLMs,
uh, not to get too wonky here,
but LLMs replicate what in humans
is called procedural memory.
It, it, it's sequences.
So do A, then B, then C.
Um, shuffling a deck of cards, like, you do it in a certain order, you coordinate your muscles in a certain order.
Uh, riding a bike, writing a sentence.
Um, and that works really well, uh, in certain kinds of situations, where knowing the right behavioral pattern is the key to success.
So, um, think about
like a game like chess.
Success in chess is largely a matter of good tactics.
Like you, you have your pieces here.
Which piece do you move?
Okay, you move this piece.
I have three options.
What's the best place to move?
Uh, that's all very
procedural, very tactical.
Um, there's a psychologist named, uh, Hogarth, who, uh, called this a kind learning environment.
Like there, there's definite rules,
there's stable rules, you can
know what they are ahead of time.
Um, LLMs work really well in
those kinds of situations.
So if you go in and say, Hey, I
want to write this boilerplate
piece of code, um, that, that is
like a pretty common use case.
Like they, it, it can say, okay, what's
the right behavior, the right tokens to
put together to, to satisfy that use case.
Um, but we found that with
customer engagement, um, you
don't have clear rules like that.
It's what, it's what he called a wicked learning environment.
So, um, there aren't rules.
If there are rules, you
can't know them anyway.
Even if you can know
them, they keep changing.
And so in that case, what humans do
isn't to memorize behavior patterns.
'cause the behavior isn't going to
be consistently successful 'cause
your environment keeps changing.
Instead, you remember environmental cues.
You learn features of the environment and
say, when I recognize this kind of thing,
at least I know what I'm dealing with.
And then I will, I will figure out my behavior on the fly.
That uses a different kind of memory, what, what's called semantic memory.
It's categorization.
And humans take these categories of, here's what's important in our environment, and we use, uh, what's called associative learning to basically add positive and negative weights onto those categories, um, over time, from our experience.
So, uh, you learn, okay, you wanna go someplace to work, at a cafe.
And so you have several different cafes, and you know that this cafe has great wifi, but their coffee stinks.
And this cafe doesn't have great wifi, this one doesn't have, uh, outlets, but they have great muffins.
Like you're, you're taking all these
experiences and putting them all
together to kind of get the lay of
the conceptual land, and then you
make a decision based on how you
feel about your different options.
And that's what user engagement is.
You have all of these different
options for engaging users.
You have just basic things like what time of day and what day of week, how often should you contact them?
You have things like, I have,
I'm an e-commerce company.
I sell clothing, and I could, I
could talk to 'em about shirts
or shoes or dresses or pants.
What do I talk to them about?
You have, uh, value propositions,
like why should they engage?
You might have different incentives.
You can offer different tones of voice.
There's like countless things
that, that are ways you can
categorize the experience for them.
And you're not going to be able to learn,
oh, the user looked at this and then did
this, and therefore this is obviously
the way I should interact with them.
That that kind of behavioral stability
doesn't exist in human, in humans, period.
Um, but you can say, over time I have found that, even though I sell 50 different kinds of clothing, this user only uses me for shirts and pants.
Apparently they get their shoes and hats
and everything like that somewhere else.
So that tells me something about
how I can interact with them.
Either are you in the market
for the things you normally get?
Let me try to do that, or I'm gonna
have to make a cross sell attempt, which
is different from approaching someone
who's already in the market for it.
So you have to create different
content for those situations.
And then you could also say, well, this
person doesn't so much care about price
point, so incentives aren't really gonna
make much of a difference for them.
They're willing to pay for quality,
but they want to be assured that it's
quality, so they want social proof.
So that means I need to put together
communication that emphasizes
ratings, um, reviews people have
given, um, popularity of the item.
Is this something that a lot of people
on the, on the system are buying?
Stuff like that, that's going
to be more likely to succeed.
And so, when our agents are assigned to a user, the business populates the categories, and then the agent is going through and trying content that indexes to each of these categories, and it creates a weighting system for each category that tells it, here's how good of a bet it is.
And it uses that to then drive its next decisions.
So then we realized that's really what we were trying to do: we, we weren't trying to optimize behavior within a session.
We're trying to get behavior better over days, weeks, months.
Um, an LLM doesn't do that.
An LLM suffers from what's called catastrophic forgetting.
Like, they don't, they don't remember, they can't keep that much in mind.
We needed a, an agent that had a brain that could keep that history in mind, but not the whole raw history, a consolidated, here's-my-lessons-learned history.
Chris : Hmm.
Schaun Wheeler: And that, that just wasn't something the LLMs are built to do; they're the wrong mechanism for it.
Um, and other systems, like traditional ML systems, aren't built for that either.
Bandits aren't built for that either.
That was, that was one of the biggest
challenges when we realized, wow, we
can't actually use nice established,
boring technology to do this.
We have to figure out a
different way to do it.
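A minimal sketch, in Python, of the consolidated, lessons-learned profile idea described above. Every name here (UserProfile, the category labels, the weights) is a hypothetical illustration, not AMP's actual data model; the point is only that the agent keeps running per-category weights for one user rather than the whole raw event history.

```python
# Hypothetical illustration, not AMP's actual data model: a per-user profile
# that stores consolidated positive/negative weights per category instead of
# the raw history of every event.
from collections import defaultdict

class UserProfile:
    def __init__(self):
        # weights[(category, option)] accumulates associative evidence over time
        self.weights = defaultdict(float)

    def record(self, category, option, outcome_weight):
        """outcome_weight > 0 when the experience went well, < 0 when it did not."""
        self.weights[(category, option)] += outcome_weight

    def ranked(self, category):
        """Return this category's options, best bet first."""
        options = {opt: w for (cat, opt), w in self.weights.items() if cat == category}
        return sorted(options, key=options.get, reverse=True)

profile = UserProfile()
profile.record("topic", "shirts", +1.0)     # shirt messages get engagement
profile.record("topic", "shoes", -0.5)      # shoe messages get ignored
profile.record("send_day", "friday", +0.8)
print(profile.ranked("topic"))              # ['shirts', 'shoes']
```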
Chris : Yeah, it, it sounds like you've really tried to create something that, uh, brings as much value to the user as possible, a very, uh, unpredictable user.
So I guess, um, you know, we kind of spoke before the session about how you created your agent-based system architecture.
You know, the, the million dollar question is how, how have you, you know, built that around the, the user?
And, you know, we, we don't expect you to give the, the secret sauce away here, but, um, you know, what, what's, um, what's been your process there?
Schaun Wheeler: Uh, so we're actually
pretty open about, we, we, we are, we
don't, um, I don't so much believe in
secret sauce, so, uh, we, we actually
try to be pretty open about this.
Um, the core idea behind the system
is that instead of trying to predict
what works on average for a large
group of people over some period
of time, like that's what, that's
what A/B testing or bandits do.
Um, we wanna figure out what
works for each individual person
at a very specific moment.
So that's not a, that's not
common in machine learning.
The standard assumption is if
it worked for a group of people,
it'll probably work for you.
Um, but real people don't work that way.
So the first step we take is to look at a stream of behavior.
In this case, it's the, the app event stream, um, that we, we use.
Um, but we don't just define like,
okay, here's my conversion event and
we're gonna try to optimize that.
We actually use the entire event stream.
So every button, click, every screen,
view, every add to cart, the entire thing.
If your, if your engineers instrumented it in order for your app to work, the agent can use it.
Um, and so it goes through and we predict
for each of those events, the probability
that the user doing that event is going
to enter some kind of desired end state.
That's your goal, like
your conversion event.
The agent makes that probability judgment
based on several different factors.
One, it looks at the user's history.
Does this user come every day or have
they not been around for 90 days?
Um, it looks at app dynamics.
Is this a particularly busy
time of day or busy day of week?
Is it a sale day?
So there's extra people on the app anyway.
Um, but then it also looks
at the nature of the event.
Did they just come and view the home
screen or did they come and like
actually add something to their cart?
It uses this to create this probability
of like, here, here's the chance that
this person right now doing this thing
is moving in the direction we want.
And then it drops out the information
about the event itself and gets a
baseline estimate of the probability.
Meaning if someone did nothing at this
moment in time, what are the chances they
were moving in the direction we wanted?
This gives us a baseline, an actual estimate of directionality.
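A hedged Python sketch of the two probabilities described here: one estimate that uses the event itself plus the user's history and the app dynamics, and a baseline that drops the event-specific information. The feature names and the classifier choice are assumptions for illustration only, not the model AMP actually uses.

```python
# Illustrative only: feature names and model choice are assumptions.
import numpy as np

def build_features(event, user_history, app_dynamics, include_event=True):
    """User history + app dynamics, plus (optionally) the nature of the event."""
    features = [
        user_history["days_since_last_visit"],   # comes every day, or gone 90 days?
        app_dynamics["is_busy_period"],          # busy hour, busy day, sale day?
    ]
    if include_event:
        features += [event["is_add_to_cart"], event["screen_depth"]]
    else:
        features += [0.0, 0.0]                   # event information dropped out
    return np.array(features, dtype=float).reshape(1, -1)

def directionality(model, event, user_history, app_dynamics):
    """Actual vs. baseline probability that the user is moving toward the goal.
    `model` is assumed to be any fitted classifier with predict_proba, trained
    on historical (features -> reached desired end state) examples."""
    actual = model.predict_proba(build_features(event, user_history, app_dynamics, True))[0, 1]
    baseline = model.predict_proba(build_features(event, user_history, app_dynamics, False))[0, 1]
    return actual, baseline
```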
This is actually something
we do as humans all the time.
When we're talking to people, we don't,
if I'm trying to convince someone of
something, I don't blindfold myself
and say my whole piece and then say,
okay, now I'm gonna look at you and
see how you, what you thought about it.
I'm watching the entire time to see how you react.
And if you're looking away, if
you're looking at your watch, if
you're hemming and hawing, I'm
gonna change what I'm doing because
it's clear that it's not working.
Whereas if you're leaning forward,
if you're engaged, if you're asking
questions, I will double down on
what I'm doing and, and keep going
because it, it's, it's working.
Um, we needed a way for
agents to be able to do that.
And so that's what this
instrumentation does.
And then, once they have this information to be able to essentially read the room, we, we adapted a, uh, an econometric method, uh, difference in differences.
Very standard method for
doing causal analysis.
Um, not, uh, normally suited for
reaching conclusions based on sparse
data about individual treatments.
So we had to adapt it, but we basically
take all of this signal that we've
created both after we send a message.
So we look at what someone did afterwards,
but we also look behind and say, what were
they already doing at the time we sent?
And then we subtract out
that baseline estimate.
So say what, what, what of all of this
information could have been explained by
just, this was business as usual for them.
And then what we're looking for
is the after information should be
bigger than the before information.
So we should see that the
user was doing some things.
We had an intervention like that,
we engaged with them in some way,
and then they did more things.
And then the agent translates
that, um, into a beta distribution.
So, so not only an estimate of how, how
impactful was this thing, but also how
confident am I that it was impactful.
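A small Python sketch of the update described here: subtract the baseline change from the observed change, then fold the resulting lift into a Beta distribution for that treatment. The mapping from lift to Beta pseudo-counts is my illustrative assumption, not the exact formula AMP uses.

```python
# Illustrative assumption: how one treatment's observed lift might be folded
# into a Beta distribution that tracks both impact and confidence.

def did_lift(pre_signal, post_signal, pre_baseline, post_baseline):
    """Difference in differences: the change observed after the message,
    minus the change we would have expected anyway (business as usual)."""
    return (post_signal - pre_signal) - (post_baseline - pre_baseline)

def update_beta(alpha, beta, lift, scale=1.0):
    """Positive lift adds (scaled) evidence of success, negative lift adds
    evidence of failure, so the distribution encodes impact and confidence."""
    if lift >= 0:
        return alpha + scale * lift, beta
    return alpha, beta + scale * (-lift)

# Example: a small positive lift nudges this treatment's belief upward.
alpha, beta = update_beta(alpha=1.0, beta=1.0,
                          lift=did_lift(0.10, 0.25, 0.10, 0.12))  # lift = 0.13
```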
Um, and then we aggregate those over time.
So in the end, the agents have a profile for each user: every, every action you could take, um, each day of the week, each value proposition gets a beta distribution for that user.
That is, uh, a, a summary of how they have responded to individual treatments over time.
And the agent is picking from those
distributions and going with the
actions that have the largest picks.
So if it says, I need to have a
value proposition, I have five
value propositions, draw from each
of those and pick the largest one.
If the, if the agent has very little
information, if the user hasn't
responded much, those distributions
are all gonna be very flat and
centered around a 50% probability of
success, which means essentially the
agent's going to randomly explore.
It's just gonna move around
and try different things.
The more someone responds to something,
that that distribution will go from being
flat to become peaked and it'll move up.
And so it will get
picked a lot more often.
So the agent will move from exploring to exploiting what it's learned.
But if the user then changes their behavior,
say you used to message them Friday
at six o'clock, that works really
well for them to have them order food.
Um, but then they, they change their life.
Circumstances change.
They get a new job.
Their kid has, uh, a sports program
at five, so they can't eat that time.
The, the agent will recognize that
what they were trying isn't working
anymore, and that distribution
will start to flatten out again and
they'll go back into exploration.
So instead of having to have a human
say, let's explore now more, let's
exploit now more, um, the agents
are actually making that decision
dynamically, um, based on the feedback
they're getting from the user.
I know that's, that's
a fire hose of information.
But that's, that's, that's the core
kind of machine that's driving the
agent's decision making is this, um,
individual assessment of, I did a thing,
did it move the user in a direction?
I'm then going to aggregate that
information over time into these
distributions and use those to make
decisions about what to do next.
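To pull the loop together, here is a hedged Python sketch of the decision mechanism as described: one Beta distribution per option per user, sample from each, act on the largest draw, and let the parameters decay toward a flat prior so an option that stops working drifts back into exploration. Class names, parameter values, and the decay rule are assumptions for illustration, not AMP's actual implementation.

```python
# Illustrative sketch; names and numbers are assumptions, not AMP's parameters.
import random

class OptionBelief:
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha   # evidence this option works for this user
        self.beta = beta     # evidence it does not

    def sample(self):
        return random.betavariate(self.alpha, self.beta)

    def update(self, success, failure, decay=0.98):
        # Decay pulls both parameters back toward the flat prior, so a
        # once-reliable option gradually flattens and re-enters exploration.
        self.alpha = 1.0 + decay * (self.alpha - 1.0) + success
        self.beta = 1.0 + decay * (self.beta - 1.0) + failure

def choose(beliefs):
    """beliefs: dict of option name -> OptionBelief for one user.
    Sample every option and act on the largest draw (Thompson-style)."""
    draws = {name: b.sample() for name, b in beliefs.items()}
    return max(draws, key=draws.get)

# Five value propositions, all unexplored: flat distributions, near-random picks.
beliefs = {f"value_prop_{i}": OptionBelief() for i in range(1, 6)}
print(choose(beliefs))
```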
Chris : Hmm.
I think that ties in well to,
to the next question around
experimentation and model reliability.
How, what's your approach to research? Is it led by, you know, the model performance and how the agents are performing, or, um, how, how do you actually approach that?
Schaun Wheeler: So we don't really have a model, not in the sense that most people use the term.
Like, we, we have some models that underlie some of it.
Like for instance, we use a predictive
model to get those baseline and
actual probability predictions.
But in terms of the, the agents
actually making decisions, um,
there isn't a shared model.
It's not like all the agents are all,
all like referencing the same thing.
They're referencing the one user history.
Um, they will share information if they lack a decision-making criterion.
If they've never tried value proposition A, they will look to other agents and say, how does value proposition A tend to play with users?
Okay, now, now I'll decide whether I should try that one with my user.
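A tiny Python sketch of that fallback, under my assumption (for illustration only) that the shared information is folded in as a softened population-level prior on the untried option.

```python
# Illustrative assumption: seed an untried option's per-user Beta belief from
# population-level counts, shrunk so the user's own responses dominate later.
def seed_from_population(pop_alpha, pop_beta, shrink=0.1):
    return 1.0 + shrink * (pop_alpha - 1.0), 1.0 + shrink * (pop_beta - 1.0)

alpha, beta = seed_from_population(pop_alpha=40.0, pop_beta=10.0)
# alpha = 4.9, beta = 1.9: a mild lean toward "worth trying", easily overturned.
```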
Um, but in the end, user behavior is inherently unpredictable.
Um, in fact, we, we call it prediction, but it's not prediction.
It's guessing.
Uh, we're, we're trying to make good bets here.
Um, so we assess the health of the
system the same way you would assess, um,
the performance of a, a money manager.
So, uh, a money manager can put money in lots of different places, and
if one place is not performing very well
and hasn't been performing well for a
little while, you might withdraw some of
your money from that investment and put
it somewhere that's performing better.
Um, that's really what
the agents are doing.
So, uh, it's not that you can confidently predict that, if I put my money here, I'm getting this much return.
If we could do that, we'd put all of our money into the best basket.
But in fact, what you want to do
is hedge your bets throughout.
So if the user does something
spontaneous, something you didn't,
that you didn't expect, you can
still accommodate that very quickly.
Um, and if the user changes their
behavior from something that was
established before, you don't get locked
into doing that thing over and over
again just because it used to work.
You should recognize pretty
immediately something has changed and
start hedging your bets some more.
Um, so a lot of measures of like model
reliability and, and validity and things
like that are, are not, I, I won't say they're inapplicable in this situation,
but it's a different kind of game because
what you're really doing is, is trying to
find a good betting strategy, not so much
trying to find some sort of, um, accuracy
measure that you're satisfied with.
Chris : Well, uh, yeah.
Thanks for, um, thanks for sharing.
That's super interesting.
Moving on to more about, uh, building AMP and, um, you know, less, um, technically focused questions: you know, lots of startups start, but not everybody, uh, stays in the race.
So, what were some of your challenges while building AMP? And again, it could be technical, it could be operational, strategic.
What has been the real challenge for you guys?
Schaun Wheeler: Good question.
Um, well, I can start
with a technical one.
Um, one of the biggest challenges was that there wasn't an existing method for doing what we wanted to do.
Chris : Hmm.
Schaun Wheeler: So A/B tests aren't adaptive.
Bandits are adaptive, but they're still working at an aggregate level.
ML can be individual, well, if you're willing to invest in it a lot, but ML isn't grounded in counterfactuals.
It's, it's wed to what it's already seen.
It's not actively producing new options.
It took us some time and a lot of trial and error to figure out how to take what we knew was already happening in the human brain, like how, how people make decisions about complex environments, and to make that happen in the agent.
It wasn't something where we
could just take an existing method
off the shelf and apply that.
Um, so that was, that was a challenge, although I think the technical challenges were, were more manageable in retrospect.
Um, one of the biggest challenges we
probably faced was, uh, in the very early
days, we realized we'd underestimated how
much time and attention and resources we
would need to devote to creating a user
interface for people who managed content.
Um, we, we originally assumed like
that there's already lots of systems
out there that manage messages and
we kind of assumed that yeah, people
will like build their thing there
and then we will orchestrate it.
Like our agents will take care
of deciding what to send to whom.
And we learned, we learned
pretty quickly that uh,
a system that's built for a
non-agent context cannot be
easily adapted to an agentic one.
Um, the, a lot of these messaging
systems are so wed to the concept of
campaigns and segments, very static.
I've made a decision and now I
live with it kind of decisions.
Um, and that's, that, that's actually, like, the opposite of what it means to operate agentically.
And so we actually had to invest a lot of, a lot of resources into building a user interface that would allow people to create content in a way that the agents would be able to act on it.
Well, it's not just a matter of
saying, oh yeah, I just have my
CRM team do what they always do.
I throw it to the agent, and the agent
makes, magically, makes it make sense.
Um, the, it's that old garbage in, garbage
out, uh, saw that you have to, you have
to have, uh, the right kinds of inputs.
Chris : Okay.
Um, cool.
And, um, speaking to you in the past, I know you're completely remote.
For those that don't know, um, you've got companies like Amazon, with Andy Jassy pushing everyone back to the office.
Uh, I think some startups are, uh, managing to secure some real hot talent because of that.
But, uh, also a lot of companies who were completely remote have now pivoted to a, a hybrid approach at the very minimum.
I think, um, you, Schaun, and AMP are sticking by your guns.
And as I say, it's something you're really proud of.
You know, how, how do you manage a, a culture?
How do you manage culture at, at a global scale?
Schaun Wheeler: Um, so yeah,
we are all over the place.
I, I, I'd say at most we have four to
five employees in one geographic location.
Um, and that's only true
in two different locations.
Um, in, in most cases we have, like, one employee in a location, uh, one is in Paris, for example.
And that was important to us from the start.
I mean, part of it, we actually
founded, uh, amp right at
the start of the pandemic.
So, uh, we, we had to be remote, uh,
in the beginning, but also, um, my
co-founders, um, at the time we were
all spread out all over the place.
We were in, we were in North
Carolina, Paris, and Singapore.
Um, and so, and none of us
had a great desire to move.
So the, the fact that we wanted,
we really wanted to work together.
We didn't wanna work physically together
because we were happy with where we were.
Um, I think that colored a lot of how we, we structured, uh, this.
But also, we, we knew that the
decision to stay in one place carries a
real cost in terms of getting the team
you want, because it's not just finding
people with the right skills and the right
fit for the team and everything like that.
It's finding people who are willing to live in the place you want them to live.
And being more flexible on that allowed
us to be actually a lot more stringent,
uh, on the other criteria of what
we were looking for in team members.
Um, I, I don't think our team, I mean,
this is the best team I've ever worked
with in my career, and I don't think it
would be, um, as good a team if we had
had to, uh, force people to accommodate
certain geographic preferences.
Um, but it does take, uh, it, it takes some adaptation.
Uh, we, we say this often internally, and I do think it's true:
managing a fully remote team and accommodating that team, um, requires an immense amount of trust.
Like, it's pretty rare, actually, for any of us to do, like, regular, scheduled one-on-ones with our, our team members.
Our, our doors are open all the time,
but we aren't sitting there saying,
okay, give me an update on this project.
If someone takes on a project, we trust that they have that project, and that if there's a problem that's going to change the, the delivery date or anything like that, they're gonna notify us early, and in fact, reach out to other teammates for help.
Um, it, it's, uh, it requires a shift for people who join us.
Um, 'cause they're, they're used to having a lot more oversight, um, and not having as much ownership over what they do.
Um, one guy that we, we just hired
recently, uh, I just had a conversation
with him and, and he, uh, he just
brought it up outta the blue.
We were just talking.
He says, can I just say, I have never worked with an engineering team that was so willing and ready to have strong opinions about how, uh, how the product should be built.
Engineers always have strong opinions
about, no, this is bad coding
practice or something like that.
But they, they were like, no, the
business shouldn't be going in
this direction because of this part
of the system that you don't know
about 'cause you didn't build it.
And he says, normally engineers are like, okay, tell me what your requirements are.
What, what are the constraints?
When do we need this by?
I'll do my best to work within that.
But he said, this felt
like real partnering.
Uh, it took some time.
It, it still takes time whenever
someone joins for us to, to help
them get used to that and recognize
that we don't, we don't, uh,
punish anyone for bad decisions.
Because the, the only way you make good decisions is by making bad decisions and then, and then learning from 'em.
So we are, we are, uh, we are
very, very committed to the concept
of blameless retrospectives.
Like we, we've all done stupid things.
You can't not do stupid things
when you're building a startup
'cause you're moving so fast.
And so, uh, when a stupid thing
happens, the question isn't, why
did you do that stupid thing?
It's why did we as a company
have things set up that that
stupid thing was possible?
And that it, it, it shifts
the conversation a lot.
So yeah, there's, there's lots
of aspects of, of like figuring
out how to deal asynchronously
with tasks and stuff like that.
But I think at the core it's really
a matter of, um, of, of allowing
that level of trust and ownership.
At an individual level, if we bring
someone on and we, we feel, if we feel
good enough about someone to bring
them onto the team, we have to feel
good enough about them to let them
make their own decisions, even if
it's not the decision we would make.
And we will do our best to accommodate
that, and we will adjust, uh,
iteratively as we go forward.
It's not easy, it hasn't been easy, but, um, I actually would have a hard time going back to a different way of doing things.
I, I, I've, I'm very proud
of, of the way we've, we've
built ourselves to operate now.
Chris : Yeah, I, I'm the same.
I worked in recruitment, which is very traditional in the sense that everyone's in the office.
And since I've founded my business, I've been working remotely.
It sounds like you're in a good place, by the way, with a high degree of accountability and, and ownership, and people, people really struggle to go back once, once you make that switch.
Schaun Wheeler: Yeah.
Chris : Um, would you say, I know
you touched on you, you chose to go
remote at the start of the pandemic.
Would you say it's evolved in terms of your approach? You know, you've always been remote first, but has it evolved since the pandemic?
Schaun Wheeler: In a number of ways.
I mean, we always have to.
Um, although I don't, I don't know that it's so much that.
I guess the only part that's really evolved from in-pandemic to post-pandemic is, um,
we, we regularly will bring team members together into the same geographic area for, like, a week.
Um, we usually do, uh, at least
one whole company trip a year.
Um, and it's a, it's a working trip.
Like we do some things where, yeah, we'll
be tourist for a couple days and, and,
and do some stuff like that, but for the
most part, we find a really nice place to be, um, nice as in the geographic location, but also, like, a, a nice hotel, where we're not just in a windowless conference room.
Um, and we will pick things
to hack on for that week.
And there, there are certain problems that really don't get solved nearly as quickly or as easily remotely.
Um, like, there're things that you can have 50 calls about and still not solve.
And then you get together for half an hour
face to face and suddenly it's solved.
And so, um, we've built that in
not only for a whole team, but also
we'll, uh, uh, I just came back a
few weeks ago from Dubai where we
brought all of our engineers together.
Um, we've, we'll sometimes bring the
go to market team together, like we,
we do, uh, that on a regular basis.
Um, and that obviously wasn't,
uh, an option during the pandemic.
Um, other than that, um, I'd say most
of the changes we've made to how we
work remotely are less a function of
we're no longer in a pandemic and more
a function of our team is growing.
So there are certain things that
work really well when you only have
to coordinate three people, but
when you have to coordinate 30, it's a totally different ballgame.
And so, uh, that, that kind of, uh,
growth accommodation, I think is, is
kind of the larger driver of the, of
changes in the way we, we do remote.
Chris : Makes sense.
Sounds like a, a really sensible and,
uh, grown up way to, to, to do things.
So, uh, yeah.
Commend you for it.
Um, okay.
And, and moving on to just the future and, and a bit of fun. In terms of, uh, nobody knows, it's like, how long is a piece of string with AI at the moment?
I guess no one can accurately predict anything, uh, beyond six months.
But in terms of you being in the know
around agents and what you're developing,
um, where do you see agents heading and
where would you like to see them heading?
Schaun Wheeler: So right now I
think agents are, uh, in terms of
how people view them right now and
what the assumptions they make about
them, they're, uh, overwhelmingly
assumed to be LLM powered.
Um, I mean, mostly if you talk to people and you mention
agents, they automatically assume
you're talking about an LLM.
Um, and obviously that sticks in my craw
a little bit because we don't do that.
But, um, I do think that will change.
Um, there's some really important
and interesting work going on,
uh, regarding world models.
Um, neuro-symbolic reasoning.
Um, I guess what I'm really saying is, I, I think agents are increasingly going to have to move past this procedural paradigm, the idea that that's a good basis for making all kinds of decisions.
It's actually a good basis for making only a very small subset of decisions, and I think we're going to have to move beyond that.
But obviously I'm, I'm biased.
Chris : Okay.
Very interesting.
And, um, yeah, what, what tools do you use, Schaun, to, to keep you productive?
I know you kind of touched on, um, your
coding partner with your, your LLM.
Um,
Schaun Wheeler: I'm actually gonna give
you a disappointing answer on this one.
I, I actually, I, I use, uh,
just very basic bare bones tools.
I, I tend to, uh, waffle between ChatGPT and Gemini.
Um, I, I don't use, uh, any
of the specialized ones.
Um, partially, that's just maybe my, my personality.
But, um, I also try to be very
intentional about which tools I bring on.
Um, every time we, we offload part of our decision making to a tool, um, we lose something as, as humans.
And that, that's not a new thing; it's been happening ever since we've been a species.
Um, but, uh, it's, it's, uh, you,
you tend to get better at thinking
in certain ways and worse at thinking
in the ways that you offload.
That's, that's the point of
offloading, um, is you don't
have to think that way anymore.
And so I try to be very intentional about which tools I bring on, and in what situations I even use a basic LLM.
Um, I do use them all the time.
Uh, that I, I mean, I think
it's an invaluable tool.
Um, but yeah, I, I haven't spread out into, uh, a lot of the, the more specialized tools.
That might be me with my startup founder hat on, in that I realize that a huge amount of these tools have no moat.
Even the ones that have large market share right now, the, the cost of, of switching to one of the alternatives is so low that, um, I, I'm waiting to see which winners emerge.
So maybe it's just, I don't wanna,
I don't wanna learn a tool and then
find out that it wasn't the winner
and that I have to switch anyway.
So, uh, I, I tend to be a little more conservative on, on the choice of, of LLM-based tools.
Chris : Nice.
And, uh, yeah, speaking of, uh, you know, doing your own thinking, um, you know, what would be one book, podcast, or even ML paper that's really, you know, challenged the way you think, or that you would recommend to someone?
Um, you know, what's the first one you'd recommend?
Schaun Wheeler: That's a
hard question to pick one.
Um, I, I don't know if this would
really be my first one, but it's
the first one coming to mind.
Um, there's a, there, there's a book.
It's not well known.
It's actually written by a philosopher of science named Peter Achinstein.
It's called The Book of Evidence.
Um, it's, it's a, it's a pretty, like,
there's some formal logic in there.
Like, it, it, it's not a,
it's not, it's a dense book.
Um, but it really changed the
way I thought about evidence.
Um, so much of our conversation about, um, evidence or proof or, or, uh, a, a justifiable basis for decision making is wed to, um, thinking about particular methods or tools, um, or experimentation setups.
Um, and that book really shifted
my thinking to looking at the idea
of competing hypotheses as the core
thing to define, which is actually
something that is pre methodological.
Um, and you can have this really,
really robust method, but if you
have not, um, defined your landscape
of hypotheses correctly, um, and
robustly, um, the method isn't really
proving what you think it's proving.
Um, and I think that opened me up a
lot to, um, considering ways of, um,
approaching evidence that could look at,
at, uh, I mean, essentially what we're
doing is looking at an n of one, uh, when, when our agents are making decisions.
I, I was taught in statistics classes that's a, that's a terrible idea.
It's nonsensical.
Um, but I, I think there is a
basis for doing it if you do the,
the hypothesis work beforehand.
Um, so yeah, it, it's, it's kind of a funny book to recommend.
Like it's not, it's not a particularly
enjoyable read, but it really did
change my thinking quite a bit.
Chris : Nice.
I'll, uh, yeah, made a note.
I'll have a look into that one.
Um, and, agentic infrastructure aside, and, and final question, by the way: um, you know, what excites you about AI in terms of breakthroughs? You know, it could be robotics, or what, what change are you looking forward to the most within AI?
Schaun Wheeler: Um,
I, I, I, I try to follow a
good, uh, diversity of people
on social media in terms of ai.
So I, I, I follow a lot of
the starry-eyed tech bros.
And I also follow a lot of
the, the AI is going to destroy
civilization, uh, skeptics.
And, um, and I, I think, I think there's value in all of those, uh, perspectives.
I mean, whenever there's a new technology that changes how people think, you can tell how influential a technology is by how many people say it's going to portend the end of the world.
Um, like, they did that with television, they did it with airplanes, they did it with bicycles, Socrates did it with writing.
That, that's a common thing.
Um, and I don't wanna use that
to dismiss the skepticism.
I, I, I, I share some of it.
Um, but I, I think what a lot of the,
the skeptics miss in focusing on, oh
look, they're, these companies, they're
taking a whole bunch of data often
without the permission of the people who
generated that data in the first place.
And they are, um, creating these
machines in order to make a lot of money.
And that, that, I mean,
that, that, that's all true.
Um.
I think they're underestimating the
second order impacts of these kinds of
systems, especially on the way they can
influence underprivileged communities,
people with disabilities, like there's
usually not a lot of money in helping
people who did not have equal access to
education to level the playing field.
That's usually, like, an NGO that does that; you don't get a corporation that invests in that.
But that leveling of the playing field was actually, uh, um, according to a, a New York Times article I read a while ago, one of the main motivations behind Microsoft's investment in OpenAI.
Uh, their CTO had come from a background like that and, and saw it that way.
Um, like, if this can work, this is taking someone who maybe doesn't speak the way people expect them to speak, um, and maybe doesn't have the same, uh, kind of procedural skills that someone who had gone to four years of, of some college would, would have.
And this can help even that out.
A lot of, um, technology, I think
in helping people who have physical
disabilities is going to come out
of this same kind of technology.
It's the same, um, methods,
the same processes.
It's just a different application.
I'm, I'm actually very excited to see
the way this moves into those spaces,
um, and that, that touches on robotics, that touches on LLMs, everything.
But I, I, I think there's, there's tons
of potential there that's not going to
get funded directly, but it, but can get
funded, can get supported indirectly by
further developing these technologies.