How busy professionals stay on top of the React ecosystem. We give you a one-hour recap of the latest news and nuance in React's development and ecosystem, upcoming conferences, and open source releases. New episodes the first week of every month, with live recordings on the last Wednesday of every month on the Reactiflux stage.
Hosted by Mark Erikson (Redux maintainer), Carl Vitullo (startup veteran), and Mo Khazali (head of mobile at Theodo). See something for us? Post to #tech-reads-and-news
Hello, everyone.
Thank you for joining us for this
month in React, which is not going
to be particularly React heavy.
Mark and I have been talking for a
couple of weeks now about doing a,
like, bonus episode of sorts to talk
about AI and how we're using it.
So we are just, instead of a bonus
episode, we're, we're just gonna
do that for this month, for April.
Yeah, and apologies for last month. We had a recording problem, and it was completely unsalvageable, just nothing to save. Big bummer.
But yeah, so I am Carl.
I am joined this month by Mark Erikson and Swizec Teller.
yeah, we're gonna talk about AI because it's been a huge part of each of our workflows for anywhere from three to six months to a year or more.
Let's do some intros first.
I guess Mark and I are reasonably
well known, but I'm Carl.
I am a staff level software engineer
and engineering manager and community
lead here at Reactiflux, where I
do events like this and build code
to keep the community operating.
I'm Mark Erikson.
My day job is Replay.io, where we've built a time-traveling debugger for both humans and agents, with Replay MCP now available.
Please check out our blog.
I just put up a blog post on how Replay found a bug faster than Dan Abramov did. I am still the Redux maintainer.
Honestly, I haven't done much
Redux stuff in the last few months
because all my brain space has
been taken up with day job work.
And also, I'm going around to a
whole bunch of conferences this year.
I'm Swiz.
I work at Plasmidsaurus, we are a
DNA sequencing as a service company.
We do a lot of React for really fancy
data visualizations for stuff like,
"Hey, wh- how do you visualize a few
million data points in the browser and
make it work smooth?" Stuff like that.
And these days, I'm kind of more
of a manager than an IC really.
And I've been thinking a lot about what
kind of engineers get hired these days.
We've been hiring a lot and using
more and more AI to write the code.
Yep.
Believe that.
Cool.
Yeah.
So we were just chatting a
little bit about what shape
this conversation's gonna take.
Just to level set a little
bit for everyone listening.
We're gonna start off kind of at the
point of, like, what convinced us that
AI was a tool worth taking seriously and,
you know, getting AI pilled as it were.
Go from there into how we're using it
now, what problems we're using it to
solve, with what tools, as well as kind
of, like, what aren't we using, what
don't we find useful and compelling?
And go from there to, like, landscape
of, like, what tools are available,
what's out there, where do we think
it's gonna go, and then kinda close out
with what do we think the impacts are
gonna be on the industry more broadly.
Mark, you wanna start us off
talking about what convinced
you that AI was worth using?
Sure.
A year ago, I was dead set that I would
never, ever allow AI to write code for me.
It was a fate worse than death.
It was destroying my career.
I refused to do it.
And in fact, I actually wrote a 15,000
word blog post over the weekend that
I haven't published yet that will give
the long form version of this story.
The short form is, over the summer last
year, I cautiously started using AI to
explain an existing code base to me.
You know, just give me some architecture
docs, walk me through the data flow.
And then there was a three-day period
in late August that blew my mind.
On a Tuesday, I asked it to write some
redux unit tests for me because my
brain was too tired to write actual
code, and it did, and I was stunned.
On Wednesday, there was a Node compression library that I'd been trying to replace, but the alternative didn't have all the features we needed, and it's Rust-based.
And I tried asking the AI to
write the feature for me in
the Rust library, and it did.
It actually didn't quite work right,
and the maintainer had to turn down
the PR, but this was the first time
I saw an agent actually just crank
along and spit out a bunch of code
and happily make a bunch of updates.
And I thought I had a, a
good understanding of what
that process looked like.
And then when I saw it in person
for the first time, my jaw dropped.
And then on a Thursday, I needed to
write some AST-based linting code.
I know what ASTs are.
I've used Babel.
I understand the concepts, but
it's kind of complicated and
fiddly, and we had a custom setup.
I was like, "Could this do it for me?" And it did.
And it did it much faster
than I could as a person.
And my worldview got destroyed.
Yep.
Sounds familiar.
I mean, that, that sounds
like an amazing experience.
For me, the first was way back in
the, uh, Stone Age where you had
to talk to ChatGPT and then copy
paste the output to try to run it.
And it was like, I think it was the
holidays, and I was writing a book feeling
kind of discouraged, and I was like, "I
wonder how many words I'm writing per
day." And it's like, I can write Python.
It's not that interesting to parse a
markdown file, go through, get history and
see how many words you added every day.
Um, so I was like, "Maybe ChatGPT can just write this for me." And I asked it, and the code didn't work, but it ran.
And I thought that was really cool.
So I talked to it a little bit more, and we ended up with a really nice ... It was extremely ugly code that I would never write myself.
I think it ended up still taking
two or three hours, but it was a lot
more interesting than me doing it.
Um, and I ended up with a visualization of how my book is doing, and then I wrote a few more scripts like that.
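The word-counting script described here is mechanical enough to sketch. Here's a hedged TypeScript version of the core step (the original was Python; the git plumbing to pull daily snapshots, e.g. via `git show <rev>:book.md`, is assumed to happen elsewhere, and all names are illustrative, not the actual script):

```typescript
// Count how many words the newer version of a text adds over the older one.
// A crude proxy: compare total word counts; negative deltas clamp to zero.
function wordsAdded(oldText: string, newText: string): number {
  const count = (t: string) => t.split(/\s+/).filter(Boolean).length;
  return Math.max(0, count(newText) - count(oldText));
}

// Walk daily snapshots (e.g. pulled from git history) and report
// the words written per day, keyed by date.
function dailyWordCounts(
  snapshots: { date: string; text: string }[],
): Record<string, number> {
  const out: Record<string, number> = {};
  for (let i = 1; i < snapshots.length; i++) {
    out[snapshots[i].date] = wordsAdded(snapshots[i - 1].text, snapshots[i].text);
  }
  return out;
}
```

Exactly the kind of tedious-but-simple chore that's more interesting to delegate than to write by hand.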
And then I started using it at work to
write my database migrations, because-
Please tell me, was it
still just the ChatGPT UI?
It was.
Uh, this was, this,
this was before Copilot.
I was like, "Hey, ChatGPT, I have this SQL query, please write a migration."
And it wrote the migrations. Or I ended up, uh, later on just copy-pasting table definitions from, like, DBeaver or DataGrip or whatever, going to look at the current tables and being like, "I have these tables. Please write a migration for ... " I think we were using Knex at the time. "Please write a migration to add these columns," and then copy-pasting back, and it worked. And I was doing that for a while and it was amazing. I never wrote a migration again in my life.
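The migration chore being delegated here is mechanical in exactly that way. As a hedged illustration of the shape (framework-free, not the actual Knex migrations from the story), a tiny TypeScript helper that turns a column spec into the ALTER TABLE statements a generated migration would run:

```typescript
// Illustrative column spec: name, SQL type, and optional nullability.
interface ColumnSpec {
  name: string;
  type: string; // SQL type, e.g. "text" or "integer"
  nullable?: boolean;
}

// Generate the ALTER TABLE statements for adding columns to an existing table.
// This is the mechanical translation an LLM does well from pasted definitions.
function addColumnsSql(table: string, columns: ColumnSpec[]): string[] {
  return columns.map(
    (c) =>
      `ALTER TABLE ${table} ADD COLUMN ${c.name} ${c.type}` +
      (c.nullable === false ? " NOT NULL" : "") +
      ";",
  );
}
```

With a migration framework like Knex, the model produces the equivalent `up`/`down` functions instead, but the input and output are just as rote.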
Yeah.
I started doing a lot more
migrations because it just became
so much easier to actually do them.
They're such a pain, they're so mechanical.
That was an early one for me too that
was like ... I guess I actually had been,
like, scoping my work to avoid migrations
because I hate doing them so much.
I had a similar experience of just like, "Oh my God, this works for that. I can do so many more of them now because I don't have to do it."
Yeah, it's a little funny because
actually database migrations are
a big part of why my career looks
the way it does, because I did,
like, one in, you know, my first
year as a software engineer.
I went, "Wow, I hate this. You can't guarantee anything. You just gotta try it." And so then I went more front end, because you don't have to do database migrations on the front end.
Yeah.
I had a similar type
of experience as Mark.
I guess I got on it much later than you did, Swiz.
Sometime around, like, July, August, September of last year, you know, I had done kind of like ChatGPT prototypes or whatever, like Claude, having it write some HTML inline and just, like, prove out a concept or write a first draft of a script, like, prove out, does this work?
Give me off the blank page and give me
something to grow from and diagnose.
And then I'd been hearing people talk about Claude Code, so I finally gave that a shot in, like, I think around July, August for the first time.
And just watching it, you know, like,
Mark, like you said, of just watching
it churn through and do things and
also having it be, like, native
on the command line and watching
it just run exactly the same Bash
commands that I would to diagnose
something and, like, read the files.
It was like, "Oh, this is doing exactly
the same process I would to resolve
this bug, but it's doing it five times
faster than I can." That was my big,
like, light switch moment of, like, "Oh,
I need to take this really seriously."
For me, it's kind of like I
really hate watching it work.
I've been using AI for a while doing, like, "Oh, can you write this function for me? Or can you, like, do little small things?" But my workflow really changed when Cursor launched Slack background agents.
So I started in Slack, like, when you
get ... Partially, I'm a PM, so I get
a lot of requests that are like, " That
is definitely not a priority right now.
We're not gonna work on that.
"But I can now go at Cursor, do the
thing, and I just get a PR with the
implemented small thing that I never
would've taken the time to do myself.
And I, I find that amazing
because watching it work, I think
is, for me, is too distracting.
Yeah.
So I like it when it's fully
somewhere in the cloud.
I don't have to worry.
I just review the code when it's ready.
My workflow is very hands-on
and human in the loop.
Like, I'm sitting there watching that
thing like a hawk and having conversations
with it, which I realize is not how
most people are using AI at this point.
I think a lot of people are very much in the cloud, multi-agents, how many of these things can be run in parallel, which I now have opinions on.
Part of it is ... Well, a lot,
a lot of it's the point that I'm
making the draft blog post, which
is understanding is still critical.
And I'm a very, I'm a very firm
believer in, you know, the fundamentals
and understanding and building
a mental model of the system.
And I personally, for me,
I want my brain engaged.
I want to be thinking through the
problem, and then I'm using the
agent to amplify my own abilities.
Now, don't get me wrong, there's
definitely been a few moments where
I was, like, you know, out and about.
It's like, you know, it actually would be
kind of nice if I could just pull up my
phone and tell an agent, go do this thing.
Like, I, I, don't get me wrong.
I get the appeal.
Yeah.
But in terms of day-to-day
development work and how I approach
programming, I want my brain
active, engaged, and thinking, not
just handing it off to an agent.
Yeah.
I hear that.
I think I use it in both ways.
There are a lot of things that, like
you said, so it's just, like, it's not
quite high priority enough to justify.
And that's where I'll use, like,
more of a background agent.
Like, you know, here's 15 one line
descriptions and, like, just from the one
sentence, it's clear enough what's needed.
Like, you know, adjust the size of this.
Add an element that controls this.
Like, those are mechanical enough
that it's really just, like, getting
it right, making sure it compiles.
I don't currently have a functioning
workflow for that exactly.
What I'm doing right now in the last
four or five weeks just, like, hasn't
really been of that type of development.
So I am still, day-to-day, most
of what I've been doing is very
much, like, the mech suit variant.
You know, people talk about,
like, automated versus mech
suit, like robot or mech suit.
And yeah, so I'm driving it
pretty closely, but I'm not really
reviewing its output very strictly.
I'm, you know, I tell it to use
subagents, and then every so often
I'll, you know, tell it to write me
a report, like, how is this working?
What is it doing?
Ask it some targeted questions to
make sure that the mental model I
have matches what's actually there.
But that's a little different, I
think, than what you described, Mark.
Like, I'm not actually, like, I
don't read almost any of the code.
I try and think of it more from,
like, my engineering manager,
tech lead, product manager hat.
I do really try and sit in engineering
manager/product manager role and
say, like, "Okay, how did those
people talk to me as an engineer?
Like, they were not technical.
They didn't understand anything of
what was being built, and yet it
was their job to make sure it was
functioning as intended." So that's
very much how I try to think about
my work now is from that perspective.
It's like I, instead of reading a line
of code to ensure that patterns are
followed as I intended, like, set up
a lint rule and then, you know, maybe
it's only 80% is good, it's not gonna
catch all the subtleties, but, like,
my experience of working on a team
is, that's kinda how it works anyway.
Like, if it's not captured in the
lint rule, eventually it will break.
Like, eventually, someone will have been onboarded and not be deeply steeped in the history of the project, or just be tired that day, and they'll forget about it or whatever.
So, like, that was very much my
perspective as, uh, you know, when
I've been a tech lead is, like, if bad
code goes out, you know, the blameless
postmortem, it was a process failure.
This should not have been
permitted by the automated checks.
And so that's kind of how I'm thinking
about it is just remove the abstraction
a little bit and guarantee the
outputs and then work on automated
tracking of automated evaluation of
the code quality to make sure that's
at a level that I need it to be.
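The "capture it in a lint rule" idea doesn't require a full ESLint plugin to get that 80%. As a hedged sketch of this kind of automated convention check (the rule and shapes here are hypothetical, not the actual tooling discussed), a small TypeScript function that flags a banned pattern so CI can fail the build:

```typescript
// A minimal convention check: flag files that use a banned API directly
// instead of a sanctioned wrapper. The rule itself is illustrative.
interface Violation {
  file: string;
  line: number;
  text: string;
}

// Scan a map of filename -> source text for lines matching a banned pattern.
function checkConvention(
  files: Record<string, string>,
  banned: RegExp,
): Violation[] {
  const violations: Violation[] = [];
  for (const [file, source] of Object.entries(files)) {
    source.split("\n").forEach((text, i) => {
      if (banned.test(text)) {
        violations.push({ file, line: i + 1, text: text.trim() });
      }
    });
  }
  return violations;
}

// In CI, a wrapper script would exit non-zero when violations are found,
// so bad code is blocked by process rather than by a human re-reading diffs.
```

A real setup would use ESLint's rule API for AST-level precision, but even a text-level check encodes the "process failure, not people failure" stance described above.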
Yeah, same.
I think that's my perspective as well.
I've ... Going up into tech lead really kind of almost broke me as an engineer, because I was like, "Oh my God, all of this terrible, awful code, but it works and it's fine and everything is okay and the team just handles it. And if anything is wrong later, we just fix it. It's totally fine."
And I kind of treat AI the same way.
I, I do a lot of driving from, like,
a product perspective and for critical
features or, like, super gnarly
business requirements, I go into the IDE and I drive it kind of like what Mark was describing, reviewing every line of code, clicking yes, et cetera.
But they do really well with follow-ups.
If Cursor gives you the wrong thing back in a Slack thread, you just say, "@cursor, go fix it." A lot of the times,
those fixes are actually, "Oh, yeah.
Now that I'm holding a working MVP,
it doesn't actually feel right.
I totally didn't even think
of several features that it
needs before it's useful."
I also really like the code review flow.
I don't know exactly what we did,
but we hooked it up so that you can
do a PR review, like a code review
for your cursor agent session.
And in GitHub, you just go "@cursor, fix this," or you give it information basically
the same as you would with a team
member, and then you just get follow-up,
commits, and it fixes all the things.
That's very much been
my experience as well.
One of my biggest struggles as an engineer is, like, I remember hearing about this sort of a stereotype of, like, give this task to a junior engineer because they don't know it's impossible, you know?
The sense of as you get more
experienced, you get to see more of
the complexity, and the complexity
makes it harder to take action because
you're now evaluating trade-offs
instead of just kind of ignoring them
because you don't know they're there.
And I would get so stalled into
analysis paralysis and, like,
what's the correct way to do this?
How do you do this best?
What's gonna have the best maintenance
trade-offs and the lowest, you
know, the tech debt and whatever?
And that's just such a
difficult way to actually solve,
especially a novel problem.
Like, if you understand
a problem well, great.
You can specify everything in advance.
But, like, if you're working at the edge of your knowledge, at the edge of your understanding, then, like, you're gonna go the wrong way at first.
And it's so painful to work on something
for three weeks and then go like, "Oh,
shit. This is just completely the wrong
architecture and I need to start over."
But it's so much less painful
now with AI because, like,
great, what a wonderful learning.
Let me take that, let me have it write
a three-page document about what we
learned, what the new problems are, go
back and forth, interact with it about,
like, designing a new data model and
architectural, you know, process flow.
I actually just did this
in the last, like, week.
I've been benchmarking local LLMs because
I want to better understand which ones
work for what tasks so I can not have
all of my AI usage be based on some
mystery frontier model in the cloud.
And I just, like, you know, I started with just, like, "Hey, give me a script that will run this model. Start up a, you know, LLM server and then run prompts against it." And then,
like, that grows, "Oh, I need, I
wanna be able to kill it and restart.
Oh, I wanna be able to, you know,
version the prompts as I change them.
Oh, I need to have it, you know, run
a code evaluator when I'm giving it,
like, a code generation challenge."
Eventually, it became unmanageable because of the tech debt.
It's, you know, it grew and expanded and
I learned more about what was needed.
And so I just started over and I
said, "Great. This is a Python ball
of mess and it sucks and I hate it."
So let's talk about the architectural flow of it, and now let's add some lint rules, and let's rewrite it in Effect.
So it's actually like a high quality
code base with, like, resumability
and scheduling and whatever.
And now it's working way better
and it's, like, maintainable.
It's been really powerful.
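The benchmarking harness described here reduces to a simple loop. A hedged TypeScript sketch of just the timing core (the real version starts and talks to a local LLM server over HTTP; here the model is abstracted as an async function, and all names are illustrative):

```typescript
// Result of one benchmark run: the prompt, wall-clock latency, and output.
interface BenchResult {
  prompt: string;
  ms: number;
  output: string;
}

// Time each prompt against a model call, sequentially so runs don't contend.
// `model` stands in for a request to a local LLM server (e.g. an
// OpenAI-compatible HTTP endpoint); this shape is an assumption.
async function runBenchmark(
  model: (prompt: string) => Promise<string>,
  prompts: string[],
): Promise<BenchResult[]> {
  const results: BenchResult[] = [];
  for (const prompt of prompts) {
    const start = Date.now();
    const output = await model(prompt);
    results.push({ prompt, ms: Date.now() - start, output });
  }
  return results;
}
```

The features mentioned in the story (versioned prompts, restartable servers, a code evaluator for codegen challenges) layer on top of this loop, which is exactly how the tech debt accumulated.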
I freely admit that the very hands-on style that I've got is intentional; I'm learning. I know it's possible to go faster.
I know that it's possible
to delegate a lot more.
I am treating it much more as an extension
of my IDE and keyboard than I am as a
junior developer delegating the work.
That's fine.
I am good with that.
That is what works best
for me and my brain.
You might start doing that less
and less as you use it more.
I started with doing that a lot more,
and then eventually I realized, wait
a minute, I'm just clicking yes on
everything and then giving it a follow-up
prompt to change the things I didn't
actually like, but click yes on anyway.
I certainly didn't think a year ago
I would be using any of this at all.
I was, I was convinced that was a red
line that I would, I would never, ever
cross on pain of death, and here I am.
But also, like, I mean, I've, I've found a
workflow that I will be hopefully blogging
about in the next day, next couple
days that does actually work for me.
And the hands-on aspect of it
is what I get good results out
of, and it's what fits my brain.
Well, okay, let's take that as a jumping-off point. Like, what tools ... Let's do, like, super concise, just, like, list what tools and what models you're using day-to-day.
So my, my own setup, my intro was the Kilo Code VS Code extension with, you know, probably either Sonnet 3.5 or Sonnet 4, whatever was available around September-ish last fall, because I didn't want to go anything command line at first.
want to go anything command line at first.
I wanted to stay in a
graphical environment.
I tried Claude Code for about a day. Tried the VS Code extension, which didn't work at all, and then I tried the command line tool, and I did not like it.
I tried the OpenCode command
line tool, also did not like it.
How do y'all deal with copy paste
in a command line environment?
So what I have now, I'm using OpenCode.
My personal laptop is Windows.
My work laptop is Windows,
but I work in WSL.
So I actually serve OpenCode from
within the WSL environment, and then I
use a third party web UI for OpenCode.
OpenCode has a very nice
server client distinction.
The text client is just one
of the possible clients.
So I found one called Code Nomad,
which is very good, works great.
So I actually serve Code Nomad plus OpenCode on the WSL side, and then I just hit localhost, whatever port, in my browser on the Windows side, which also avoids any cross-platform file shenanigans as well.
So I've got my chat sessions
and my tabs open in the browser.
That's now my development environment.
And then I still have VS Code open
for looking at some of the diffs and
editing that my ... I much prefer a Git
graphical client called Fork, but it
doesn't work well in Linux, so I've just
stuck with the built-in VS Code stuff.
I actually don't like it, but I've been
too lazy to go find a better alternative.
Model-wise, I've basically been on whatever the latest and greatest Anthropic is.
We have corporate keys,
da, da, da, da, da, da.
Also, I have somewhat intentionally
tried to avoid model hopping.
I don't want to be running a bunch of
evals every other week and saying, "Oh,
this one provides me a 3% increase in
such and such a benchmark. Clearly, I
need to switch my entire workflow to a
different model." And granted, OpenCode
does let you just pick and choose what
model you want, but I'm trying to get
something that's good and consistent
and that I know works, not chasing
the hypothetical maximum performance.
Okay.
So you're, you're on OpenCode
and using latest Claude Anthropic
models just through API usage?
O- Opus 4.6, yeah.
That's the other thing.
Uh, it's, it's API keys, not the various, like, you know, $20, $100, $200 max plans. So I read about people hitting the resets, and it's like, I haven't had to deal with that, because I'm also not paying for it.
Right.
Cool.
Okay.
So, acknowledging that my experience may not be universally shared.
Okay.
Yeah.
Yeah, Swiz, what's your workflow look like?
I'm curious, I'm very curious
about yours because I think you're
by far the most advanced agent
user between the three of us.
Which is really funny, because I was surprised to learn earlier this week that 97% of my code is AI. I really was like, "What?"
But I actually have an
extremely simple setup.
I use Cursor, I keep it updated
to whatever the latest version.
So when it pops up with, "Please
update," I click the button.
I think it's the latest Opus model or whatever it is, like something 4.6. I think it's an Anthropic model.
We have a Team subscription, so the company pays, gives us an infinite budget, but I think I'm actually using, like ... It's actually funny.
I don't know ... We were just
talking about this today.
I don't know how this happened, but on
the Cursor Leaderboard, I have the second
most AI usage of all of the engineers, and
I have the least amount of dollars used.
Fascinating.
Interesting.
I have literally spent $25 this month.
So anyway, I use Cursor, I use
@cursor on Slack a lot, and I
use @cursor on GitHub a lot.
We also use Linear, and I've started more and more delegating my Linear tickets to Cursor.
So I look at the Linear ticket, I'm like, "Eh, I didn't write this well," add a little bit more context so that a dumb bot can know where to go fix things or what to change, and then just click delegate to Cursor, and then I review the PRs.
That's my workflow.
I try to keep it really simple and easy,
because like Mark said, I'm not looking
to super maximize my productivity.
I'm more looking for, you know, as a manager, I'm supposed to stay on side quests anyway, so this is a really good way to code between and during meetings when I'm supposed to be paying just, like, half attention.
Yeah, that's really interesting.
I, I, I like that you two are like
kind of two ends of the spectrum here.
So you use it very much in a managerial capacity, like, you know, the same as you would ping a colleague, you ping Cursor instead and just say, "Hey, go look at this."
Yeah, pretty much.
I, I've been doing that a lot.
And it's all within collaboration
platforms, I guess is like
a big distinction here.
It's all in ...
A lot of it is.
Yeah.
Okay.
We're now experimenting with building, like, a feedback loop, so Sentry sends us errors, and we're thinking of having a bot that looks at those errors, figures out how to fix them, and then just issues a PR when we get one, so that we could have automatic PRs-
Yeah.
for at least some errors. You know, probably not for super critical, crazy, important things, but there's a lot of errors that happen where it's not that important, but it's nice to fix.
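The core of that Sentry-to-PR loop is deciding which errors are worth delegating and turning them into a task prompt. A hedged TypeScript sketch of that shape (the event fields and threshold here are illustrative, not Sentry's actual webhook payload, and no real integration code is implied):

```typescript
// Hypothetical shape of an error-tracker event; Sentry's real payload differs.
interface ErrorEvent {
  title: string;   // e.g. "TypeError: x is undefined"
  culprit: string; // file or function where the error surfaced
  count: number;   // how many times it has occurred
}

// Turn an error into an agent task prompt, or null if it's not worth a PR yet.
// The threshold filters out the noise that isn't important enough to automate.
function toAgentPrompt(event: ErrorEvent, threshold = 10): string | null {
  if (event.count < threshold) return null;
  return [
    "Fix the following production error and open a PR.",
    `Error: ${event.title}`,
    `Location: ${event.culprit}`,
    `Occurrences: ${event.count}`,
  ].join("\n");
}
```

A webhook handler would feed the non-null prompts to a background agent, exactly the "long tail, nice to fix" tier being described.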
Yeah, definitely.
Right, the long tail of Sentry issues.
But when I'm like actually hands-on
coding, I use it a lot to write my tests
because I know what I want to test and
I can describe the situation to set
up, but I hate doing the grunt work of
setting up your database in just the
right way to have 50 models, et cetera,
yeah.
Yeah.
Okay.
Interesting.
I, I kind of split y'all's experience
a little bit, or I'm working
towards ... I feel like I'm closer to
where Mark is right now and I'm trying
to move towards where you are, Swiz.
Yeah, I almost exclusively am using Claude Code with Opus.
I've played around a bit with ... Like
I, I used Copilot, you know, GitHub
Copilot, like once to, like, scaffold a
prototype that I was experimenting with.
I don't like the experience
of it being in a web browser.
Like, something about it ... I don't know.
I also don't like Codespaces or, like, remote development across SSH.
So, like, this may be just my own,
like, biases and preferences, but I
really like just ... It's right here.
It's on my machine.
It's an environment that I have set up.
I know exactly what's
available to it and what's not.
Just, like, I guess I've prioritized, at least where I'm trying to use it like a mech suit, speeding up my own individual productivity, I just want it to be predictable and understandable for myself.
To me, that's very much just been Claude Code with whatever defaults, pretty much.
I've played around a little bit
here and there, but yeah, I'm trying
to work towards ... One of my many
projects that I have in flight is
a personal, like, orchestrator.
So I had started out kind of my foray into, like, earnest AI focus, using it to the extent that I am now, as opposed to more limited use, reading every line of code, going in and editing it myself. That's only really been since, like, February, when I started playing this AI agent game.
And it's been really interesting thinking
about ... Because putting an agent in
a simulated world and, like, having it
autonomously play a game and do so in an
interesting way, you know, effectively
and interestingly, really just, like,
got a bunch of wheels turning in my head.
And so, like, Claude Code works really well for me for, like, pushing the envelope,
like, expanding what a project is, what
it can do, and I have really enjoyed
using background agents more, like, very
much like what you described to us of
just, "Hey, oh, fix this. This is broken.
Here's this bug." I don't currently
have anything like that operational,
but I did get my little personal
orchestrator working for a minute.
It was just working on the GitHub API, and I just said, like, "Here's a repo. Go, you know, read the issues, triage them, prioritize them, take a task, open a PR, and wait for me to review it." And it worked, but I didn't give it
any guardrails, so it ended up, like,
redoing the same GitHub issue multiple
times, or, you know, it would re-review
the same PR over and over again with,
like, a page and a half of a comment.
And so I was like, "All right,
okay, this is proof of concept, it
works, but, like, clearly I need
some additional things set up."
And I guess one, another thing I wanna say, like, kind of what you said, was about, like, using Sentry data as an input for what to work on.
I think that's the frontier.
Like, that's what is
currently being explored.
Like, how do you do that effectively?
Yeah.
You need the feedback loops.
So I was reading a lot of how-to-build-an-agent papers the other day, and the main inputs are basically RAG, memory, and feedback loops, and from there, it can do a lot.
So Cursor, for example, Cursor Cloud agents through Slack, they actually, when they're developing the feature, they will fire up a browser and go test it, see if they can actually do the thing, and if they can't, they will then continue iterating.
And in the end, the PR
doesn't just have code.
It has screenshots and videos of
the working feature, which is what
I require from all of my engineers,
and it makes reviews so much easier.
And there was another ... Oh, yeah. So I think the longest I've managed to have it spin was when I asked it to build an entire feature. So, like, I would write a maybe 200 or 300 word prompt in Slack to @cursor, and it took ... I think it spent 45 minutes to come back with the working feature, and that was amazing, because I was in a meeting for that whole time, and then I just looked at the PR and gave it feedback.
Yeah.
That's incredible.
So my day job, we're coming at it from a similar but also sort of opposite angle. We have built a time-traveling debugger, and originally the premise was
you as a person can go in and do all
the investigating in the lines of code
and the print statements and everything
else so that eventually you as a
person figure out why this was broken.
So we shipped an MCP a couple months
ago, and I've already seen some very
real examples of agents being able
to go in and solve bugs that they
wouldn't have been able to otherwise.
I actually put up a post
about this a week ago.
Dan Abramov had filed an actual React bug saying that the useDeferredValue hook sometimes fails in production; it's stuck, like, a render behind.
And he had a repro and he said, uh,
"I've had my agent try to look at it,
but I can't find the answer." A month
later, he comes back and files like
a four-line fix deep in the guts of
React Scheduler to actually fix it.
And he posted on Bluesky later, and
apparently what he had to do was
rebuild the React Library with a bunch
of console logging added so that his
agent could look at the prod build
and eventually trace what was going
on and figure out how to fix it.
So I'm like, "That would be a great
marketing post comparison." So I took
his example, I made replay recordings
of the working dev build and the failing
prod build, handed them to an agent,
and I said, "Here's a bug report.
Here's the two replay recordings.
The issue is somewhere in React.
Can you find it?" It took 10 minutes.
So I'm like, so then I'm like, okay,
well now let's make it like, you know,
something resembling a proper experiment.
So I took the same two recordings
and I spun up four simultaneous
sessions with differing instructions.
The first one was just a basic, here's a bug report, go investigate, with actually less context than the proof of concept.
The second one, I gave it like an eight-step investigative process to follow.
The third one had a few paragraphs just naming some concepts in React's internals, like not even file structure, but just things like, you know, schedulers, fibers, lanes.
And then the last one also explicitly listed some of the Replay MCP tools we have available.
All four of the sessions found the
same bug and suggested the correct fix.
They took respectively 28,
17, eight, and seven minutes.
So on the one hand, you know, obligatory sales pitch: Replay recordings finding bugs, it's awesome, it's great.
We're building some cool stuff with it.
But it also showed me a lot
about the value of the prompts
and the context that you give.
There was another example I was working on where someone had made an example Next.js app with some example bugs in it and was asking an agent to find them.
And in one of them, there's a, there's a double loading screen that's caused by mixing and matching Suspense and TanStack Query loading state. The AI always suggests the useSuspenseQuery hook, but apparently if you do that, it leads to a hydration mismatch error.
And the real answer is to do
some server pre-fetching instead.
So like you have to think
bigger, think architecturally.
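[Editor's note: the server-prefetching fix described here looks roughly like TanStack Query v5's documented Next.js App Router pattern. This is a sketch, not the app from the example; the route, the `Todos` component, and the `getTodos` fetcher are hypothetical stand-ins.]

```typescript
// app/todos/page.tsx — server component (hypothetical route and fetcher)
import {
  QueryClient,
  dehydrate,
  HydrationBoundary,
} from "@tanstack/react-query";
import { Todos } from "./Todos";

// Placeholder for whatever server-side fetcher the app actually has.
async function getTodos() {
  const res = await fetch("https://example.com/api/todos");
  return res.json();
}

export default async function TodosPage() {
  const queryClient = new QueryClient();

  // Prefetch on the server so the client render starts with data:
  // no second loading screen, and no hydration mismatch from a
  // client-only useSuspenseQuery resolving late.
  await queryClient.prefetchQuery({
    queryKey: ["todos"],
    queryFn: getTodos,
  });

  return (
    <HydrationBoundary state={dehydrate(queryClient)}>
      <Todos /> {/* client component calling useSuspenseQuery(["todos"]) */}
    </HydrationBoundary>
  );
}
```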
So I did the same thing.
I tried, you know, making some
replay recordings and feeding
in the varying instructions.
My agents mostly got the same base-
level useSuspenseQuery answer.
And then I tried one more session
where I fed in the Next.js and
TanStack Query skills files.
And now the agent actually said,
"Well, useSuspenseQuery is the
initial fix, but you really ought
to do server prefetching instead."
So again, like that, that taught
me a lot about the proper context.
That also says a lot as we're trying
to build our own CI debugging agent
that looks at test recordings, so
clearly we need to give it, you
know, good prompts, user-based
context, all that kind of stuff.
So, I mean, I think even something
like that can maybe help explain
some of the differences in
results that different people get.
Yeah, no, the impact of prompting and
available context is, like, so massive.
As we're talking workflow, something I do
very regularly is, you know, start with
a prompt. If I'm gonna
add a new feature or redesign something,
you know, a large task, not a
bug fix, a sprint-sized project,
not a ticket, then I'll start off with,
like, a 200-word prompt of, like,
"Here's what we're trying to build.
Here's how I think we will need to do it.
Here's a couple of parts that I
think are gonna be important." You
know, have it explore,
do the implementation, review, you
know, and then I manually confirm.
And then I'll tell it, like ... And
I guess I'll do this at a couple of
points throughout as it's exploring
and, like, evaluating things.
I'll ask it, like, "How
was the documentation?
Like, did you find everything you
needed?" And that has been, like, totally
transformative, for token
usage as well, because, like, just by
asking that, it'll say, like, "Oh yeah,
it was all ... Here's the three files I
referenced, and I got everything I needed."
Or, like, "Oh, yeah, this file was
out of date. I did the wrong
thing at first and then I had to go
back and rework it, so I'd recommend
that we update this," and it makes those
recommendations just based off its
own experience of reading the code.
And so, like, just by doing that fairly
regularly as the code base evolves,
I manage to keep, you know, the
documentation pretty well up to date.
Are you asking it to write its
findings back into documentation
or are you updating the docs?
I'll read it with a skeptical
eye of, you know, like, does this
sound like it is a real problem or
not, a real area for improvement?
And then I'll just say, like, "Yeah,
hey, can you update those docs?"
Often, I'll come at it with a pretty
clear intention of, like, we just
revamped how scheduling works.
Let's analyze the architectural overview
document that we have and see if
it's still accurate and then tell it
...
You know, so, like, I will do some
documentation-specific projects with that
in mind, but for the most part, yeah,
just, like, ask it how the experience
of familiarizing itself with the
code base was, and then give it
free rein to improve that.
Yeah.
I think one thing I would be worried
about with that: we have some people on the
team experimenting with it, and they
all agree that after a few iterations,
like, a few weeks later or a few months
later, you end up with very spaghettified
documentation that's essentially just AI
slop that humans definitely don't wanna
read, but even the AI barely wants to
look at it and read it, because it just
keeps getting lower and lower signal.
So I think ... I don't have a solution
for that, but it's a thing I've heard.
That's fair.
I haven't really evaluated that recently.
I did come at it with trying to
say, like, "Here's five documents at
these levels of abstraction in the
code base." So I do try to give it a
little bit of overall structural
guidance, but that's a good thing to,
that's a good thing to look out for.
I haven't really evaluated it.
That's also how I get my own
mental model of the code.
So I guess, like, I don't necessarily
write it, but I will read it, and if
it's nonsense spaghetti, then it's like,
"No, we gotta do this again." Like-
That also ties into the
longer term memory question.
So, you know, the problem is
every session is fresh.
It knows nothing; literally
everything is being injected,
you know, AGENTS.md, you know,
whatever rules files, et cetera.
So how does an agent even know that
you have these nice architectural docs
that you've been keeping up to date?
Or how does it know that, you know,
last week, a week earlier, whatever,
you made these particular decisions,
and that's why we ended up here.
Or it has no idea what
any of your code base is.
Let's go read 30 files to re-form
a brand-new, fresh set of
memories that compress the context.
And some of this can
be dealt with partially.
There's ... I highly recommend
tools that will do AST-based scanning
of the code base, and preferring those
for loading chunks of code
rather than just, like, blindly whatever
built-in read-file tools are there.
But that is actually where I'm
running into the limits of my
own personal workflow right now.
I have, you know, dozens of feature
and research docs that I've generated.
I have daily progress docs.
I have sub-task handoff docs.
There's a lot of very valuable information
in there, and the AI has absolutely
no clue that any of that exists.
So I'm now starting to feel the
need for some kind of, you know,
review sweep process, some kind
of tool to, you know, index the markdown
files, something to form that longer-term
memory structure, and I keep bookmarking
hundreds of tools and saying I'm gonna go
investigate them, and haven't done so yet.
If only you had a tool that's really
good at summarizing large pieces of text.
I, no, I, I actually
have done some of that.
Um, I just haven't actually settled
on one and tried installing it yet.
One thing I found also really helps
for context and stuff like that is
actually structuring your code well.
So having a vertically oriented
architecture, small balls of mud that
are self-contained rather than one large
horizontally sliced ball of mud makes
it easier for humans, and it also really
works for the AI when you can say,
"Go add a feature to this directory."
It's, like, basically: use your
directory structure as an index for
where to find different kinds of code.
That sounds suspiciously like software
engineering practices, and I was
told those don't apply anymore.
Yeah, right.
Like, the main thing that I've been trying
to keep in mind here, and this kind of goes
back to your recent blog post
where you talk about how much code
AI is writing for you, is to just talk
to it like ... these models were trained
on things like GitHub discussions and PR
reviews and code comments and whatever.
So, like, it's not some new,
esoteric, crazy, unpredictable thing.
Like, the more you talk to it like
it's a competent engineer, the more it
will behave like a competent engineer.
And so it has all of these assumptions
about, you know, social norms within
the context of, you know, text documents
describing code and documentation.
And so the more you understand the
social norms that are deeply baked
into its training, the better it does.
So, like, yeah, like, I saw these,
you know, memory startups where it's
like, "Oh, we will automatically
generate documentation for your code
base and make sure it's up to date."
And it's like, I'm already doing that with
a README tagged with the git hash
of when it was last updated, for free.
Like, if you have a couple of folders,
a couple of directories full of code
that is, like, pretty well isolated, with
a README in there, like, I
don't need to tell the AI to look for
the README because it knows to do that.
Of course it knows to look for a README.
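[Editor's note: that README-plus-git-hash convention is easy to script. Here's a minimal sketch; `stampReadme` and the marker comment format are made up for illustration, not a standard.]

```typescript
// Sketch of the "README tagged with a git hash" convention described above.
// Pure function: replaces an existing stamp, or prepends one if none exists.
function stampReadme(readme: string, gitHash: string): string {
  const marker = /^<!-- last-verified: \S+ -->$/m;
  const stamp = `<!-- last-verified: ${gitHash} -->`;
  return marker.test(readme)
    ? readme.replace(marker, stamp)
    : `${stamp}\n\n${readme}`;
}

// Wiring it up after a docs refresh might look like:
//   import { execSync } from "node:child_process";
//   import { readFileSync, writeFileSync } from "node:fs";
//   const hash = execSync("git rev-parse --short HEAD").toString().trim();
//   writeFileSync("README.md", stampReadme(readFileSync("README.md", "utf8"), hash));
```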
So there's definitely a lot of
difficulty with keeping the signal
high, but, like, that's not new.
I don't know.
Like, that just sounds so much
to me, like, working on a team
of eight engineers and, like, not
everyone has the same shared context.
Not everyone reads every
email or every Slack message.
And so, like, as far as the challenge
of keeping a team up to date
on best practices and recent
decision-making, versus keeping
an AI at the same level, like,
those feel really similar to me.
It's just engineering best
practices and communication norms.
With the major difference
that AI will actually go read
the documents you ask it to read.
Right, right.
Like, one of my, you know,
war stories was, I was a contractor
somewhere, and I kept
having to do rework because, you
know, QA was overwhelmed, and, like,
you know, they'd finally get around to
reviewing PRs and, oh, it's broken.
Oh, gotta go back.
Oh, there's all these
conflicts now because ...
And so, like, despite being
a contractor, I sat down.
It's like, no, we need to
fix our review process.
We need to fix our merges and whatever.
Spent, like, two weeks doing that,
got everyone on the same page.
And then, the next day, the, you know, tech
lead, an employee at the company, the guy
who should have been doing that
process, is like, "Nah, this is too
much. Like, I'm just gonna force
push." And he took down production.
So it's like, uh, when
people talk about AI doing stuff,
it's like, "I don't know, man. Have
you ever worked with engineers?"
Uh, so this, this does tie into a lot of
the conversations that I had last week.
I, I was at both the AI engineer
and React Miami conferences.
And the general theme of the discussions
there between the talks and the people
and then, you know, all the different
chatter online is: we built ... And
I also ranted about this in my
15,000-word unpublished blog post.
We built a bunch of software development
practices for people: the agile manifesto,
Linear issues, PR reviews that require
multiple people to stamp them as a
way to share knowledge,
standups. Like, these are all people-
based processes, and they made sense when
humans were the limiting factor and you
had to be very intentional about making
sure you were writing the correct code.
Like, not just does the code work, but
are we even spending the time and effort
on the right thing in the first place?
And we now essentially have a "generate
infinite amounts of code" button, and now
we're finding out that we've moved the
bottleneck further down the chain, into all
the verification and QA sides of things.
And so this shows up in: oops, GitHub
is overwhelmed because we've literally
10xed the number of PRs that are being
pushed. Or, wait, we've got a bunch of
senior engineers, but they're literally
spending all their time reviewing PRs
generated by the junior devs with AI. Much
less what happens when you have a dark
factory of agents cranking out code twenty-
four seven and no one ever looks at it.
Like, we haven't figured out what the
resulting processes ought to be, and
one of the points I make in my blog
post is that I think right now we're
all operating under the assumption that
there's no limit for how fast we can
go, and in reality, we're gonna figure
out the actual limit is maybe, like, 3X
of what it used to be, but we need to,
like, accept that and plan for it instead
of thinking that if we shove everything
in one end of the pipe simultaneously,
it all comes out the other end.
I have opinions.
Um-
Yes.
I think what I wanted to say first is: I
really like, Carl, that you mentioned you start
sessions with a really long 200-word
prompt that gives all the context, that doesn't
just tell the agent what to do, but also
why it's doing it, and, like, other context.
Same.
This turns out works really
well with humans as well.
They give you much better work
if you tell them why they're
doing it or what the goal is.
And sometimes if you tell, if you
tell them what the goal is, they
might even come back with, "Your
solution is dumb. We should do this
other thing instead," which, wow, can
you imagine using engineers' brains
for thinking, not just writing code?
But I think the other thing is, we
are generating a lot of code, and we
think the bottleneck is code review.
I think there's kind of two bottlenecks.
One is that when you're moving really,
really fast, you have to actually
be more careful about working on the
right things, because otherwise you're just
digging yourself into a hole faster.
We just had an example recently where
we broke agile. We were like, "Oh yeah,
we know what we need to do.
Let's just have Claude crank out 6,000
lines of code in one gigantic PR."
First of all, we couldn't review that,
so we had to break it up into a bunch of
sub-tasks and then turn that into sub-PRs.
Turns out that took another extra day of
work and everyone was distracted while
doing this, so we weren't focusing on what
we were actually supposed to be doing.
So I think distraction is still bad.
Multitasking is still bad.
We're not as good at
multitasking as we think we are.
And the other thing is, when we
split all of that up, we found that
Claude made a tactical mistake very
early on, chose the wrong architecture
for how to build something with React.
And now you have a lot of code that needs
a lot of rework to actually make work,
and you're wasting reviewing time,
you're wasting a lot of time that
could have been saved if we had started with
a small task to set up a small thing and
talked about it, or maybe even talked
about the architecture before we wrote
six or seven thousand lines of code, and then
just wrote the correct code the first time.
Yeah, 100%.
We're hitting communication
bottlenecks much faster now.
Yeah, like Mark said.
Communication and thinking and
planning was always the job.
Yeah.
Yep.
Except now you're running,
like, 10 times faster.
So if you take the wrong step or as
your first step, you're just running
in the opposite direction of where
you wanna go, but really, really fast.
Right.
Yeah.
Right.
It's very easy to end up in
extremely the wrong place in
exactly the same amount of time.
That touches on something I've
said before in this podcast.
I think that the future of, like,
review is gonna be less code review.
The architectural side of things
is still gonna be important.
Like, it's not just, does it work?
Does it do what you intended?
You can't fully treat it as a black box.
The box has to be translucent, you
know, at least a little bit to avoid
major architectural problems that are
gonna cause a production outage or make
it impossible to scale or whatever.
Like, it's not all going
to be "does this function?"
But I do think "does it function?" is
going to be a much greater part of it.
My sense is that it will rely a
lot less on reviewing the code so
much as getting a plausible mental
model of the architecture, and then
it, you know, and then statically
verifying that it does what's intended.
So, like, my, my theory here is that
we're gonna start seeing a lot more
end-to-end testing and user acceptance.
A thing I did recently that I
haven't fully executed on yet,
but ... Well, I, I did a lot of it.
But so, a project that I have: I
fully rewrote the CI/CD so that instead
of PR checks, review, approve, merge,
ship, instead of, you know, that, like,
trunk-based development where every PR
gets shipped individually, I moved to more
of a, like, Git Flow-type architecture
where it's like, no, here's the version.
This is going to go out.
Uh, here's our release candidate.
And then that single release candidate
gets pushed to a staging environment,
and then I do the user acceptance testing
of does this function as I need it to.
And then once I have that, once I've
done that manual testing, great.
I know this code is
working as I meant it to.
So now I'm going to have it write a bunch
of end-to-end tests to verify that it will
continue to do that when I make changes.
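[Editor's note: one of those locked-in end-to-end tests might look like the sketch below, assuming a Playwright setup. The route, labels, and confirmation text are hypothetical.]

```typescript
// checkout.spec.ts — sketch of locking in manually-verified behavior
// as an end-to-end test. All selectors and copy here are invented.
import { test, expect } from "@playwright/test";

test("user can place an order", async ({ page }) => {
  await page.goto("/products/widget");
  await page.getByRole("button", { name: "Add to cart" }).click();
  await page.getByRole("link", { name: "Checkout" }).click();
  await page.getByLabel("Email").fill("test@example.com");
  await page.getByRole("button", { name: "Place order" }).click();
  // Assert the outcome the human already verified on staging.
  await expect(page.getByText("Order confirmed")).toBeVisible();
});
```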
And I think that's gonna be
kind of the new cycle of, like-
I want that, but on every PR.
Yeah, right.
And I guess to me, like, that's where
manual testing is now the bottleneck.
Like, you cannot ... At the end of
the day, this needs to function,
this needs to solve a problem.
Any code you write needs to solve
a problem that real humans have.
Or I guess, you know, sure,
other agents or whatever.
I've experimented with
automated end-to-end testing.
It doesn't quite work yet.
I think the models are, or at least the
models I was trying were, too slow. But
the idea is that instead of
writing a test as "what
features should be there," it's
more about: can the user do a thing?
And then because you have a computer use
model, it can look at the browser and
it's automatically testing both your UX.
Like, if the model can't figure out
how to do what your user is supposed
to do, drunk users won't figure it out
either, or tired users or whatever.
So you just point it at the browser
and you'd say, "Go, like, go place an
order." And at the end of the day, as
long as it can place the order it wants
to place, you can keep iterating on the
implementation, you can keep iterating
on the design, you can keep iterating
on the UX, you just get tests that are
based on can the agent place an order.
Yeah.
Oh, that's really interesting.
That's a really good variant on
what I was just kind of talking about
because, like, sure, static end-to-end
tests that are verifying, like,
this button with this test ID exists
and the flow completes as expected.
Like, sure, okay, it
functions, but is it usable?
That's interesting: automated
usability testing. Because
that kind of goes back to what
we were talking about with,
like, documentation and code notes.
Like, just make it intuitive.
Can an agent intuit how
to achieve a general task?
Oh, that's really interesting.
If you've ever used VCR in Rails,
that's kind of where I got the idea.
With VCR, you do full integration testing
with calling remote APIs, but you don't
wanna call them every time, so you record
the responses and you keep them locally.
So you then write tests against
those local responses, but if the
API ever changes or anything, you
just regenerate them, and you find all
of the bugs in your code that don't
fit the newly released API.
And you could do something similar
with this where you have browser-based
acceptance testing, and it records
what it's doing so that you don't have
to spend as many tokens every time.
And when you change
the UI, you just delete it and
see if you can do the flow again.
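[Editor's note: a toy TypeScript version of that record-and-replay idea. VCR itself is a Ruby gem; `makeReplayFetch` and the cassette shape here are invented for illustration.]

```typescript
// Toy record/replay wrapper, loosely inspired by Rails' VCR gem.
type Cassette = Record<string, string>;

function makeReplayFetch(
  cassette: Cassette,
  realFetch: (url: string) => Promise<string>
) {
  return async (url: string): Promise<string> => {
    // Replay: a recorded response costs no network call (and no tokens).
    if (url in cassette) return cassette[url];
    // Record: hit the real endpoint once and save the response.
    const body = await realFetch(url);
    cassette[url] = body;
    return body;
  };
}
```

Deleting a cassette entry forces a fresh recording, which is the regenerate-and-see-what-breaks step described above.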
Yeah.
Totally smart.
Interesting.
Doesn't quite work yet, but
that would be really cool.
We've kind of been going into this
direction, but let me, like, sort of,
like, you know, refocus a little bit.
Like, so what do we think the
current landscape of tools,
of AI tools looks like?
We've kind of been discussing that
in, in, we've been dancing around it.
And I guess where do we
think it's going next?
Going or ought to go?
Either, both.
Both.
My general inclination, as the person
who found one tool set and is
sticking with it and is not
actively trying to change what he's doing ...
Certainly what I'm seeing online is
turning towards automate all the things,
which I think is a very understandable
impulse for a software developer.
If you can do things once, you can
do it in a loop, you can run it in
parallel, you can distribute it, you
can have an agent do it, et cetera.
And I think it kind of ties into the
"go as fast as possible at all costs,
for the purpose of going as fast
as possible" direction that I see.
I mean, like this bit about, you know,
the dark factories: the idea that a team
never even looks at the code, that it's
truly all agents all the time, and all you
say is, "Do something that accomplishes
some goal," and you never even look at it,
is sort of an ultimate outcome of that.
I don't think it's a good one,
but I see how it's the idea
taken to a logical extreme.
I, I also have been pretty ... I
found a tool and I'm sticking with it.
I've been noticing Claude Code
pulling in lots of ideas
that I've just been thinking.
Like, as it evolves, it's like,
this is a productized version of
how I've already been using it.
You know, like they added a /btw command
that lets you ask a question; it
cannot use tools and it does
not go into the context history.
So you can just, like, do a little aside.
Like if you wanna ask, like, "What's
this thing doing?" I had already
been doing that by way of, like,
I'd ask it a question and then
I'd go back and just, you know,
restore the conversation to that.
And similarly, two months ago, I had
been saying, like, I would actually spin
up a Claude Code instance that was on
Haiku, because I just need to summarize
this and it's easier and I wanna manage
my usage a little bit more precisely.
And, like, now I don't have to do that,
because I just say "use a subagent," and
Claude Code is smart enough to say, "Oh,
I'm just summarizing. Let me use Haiku."
So I've just been watching, like,
kind of what the agent itself is doing
versus what I need to tell it to do, or
proactively manage, converge a little bit.
So that's had me wondering:
what, outside of the agent, is
not going to live within the agent?
And I think where I've landed on
that is, like, everyone's talking
about memory the last six weeks
or so, and I think that's wrong.
I think that's just gonna be
conventions and documentation.
It's just knowledge.
I think what's going to persistently
live outside of an agent harness, as it
were, is, like, incoming information,
like a data stream: GitHub issues,
text, you know, the Slack chat, a
Sentry alert, a production alarm.
Something that is categorizing those
and prioritizing them, and then giving
that to an autonomous agent that is just
working. Like, those are different things.
I think you cannot have a single
generic agent that is able to connect
to every possible data source.
So that to me feels like a
boundary that's gonna be pretty
strong for the coming months.
I don't know.
I think for me, I feel like going
super deep into what, at least on
the internet, a lot of people are
talking about various MD files and
orchestrations and all the crazy stuff.
I feel like that's a little bit
like IDE skins and Vim shortcuts.
It's like, yeah, sure, it's cool, but
I think you're overcomplicating it.
The models are gonna keep getting
stronger and better, so I would
just think they're gonna be able
to do bigger and bigger tasks.
I think feedback loops, like what you
said, what is outside of the model?
Building really strong
feedback loops is important.
I think memory is documentation inside the
code, or making the code itself easier
to research, or giving it MCP tools to,
like, Notion or Linear or wherever
you keep your organizational memory.
Having access to organizational
memory will help agents as well.
Yeah.
I don't think we're ever gonna come to
full automation of anything, really.
I think there's a strong Gell-Mann
amnesia when it comes to these
things, where agents will fully
automate every job except the ones
I'm, like, super familiar with.
Those are just way too complicated
and there are too many details.
Everyone feels that way about
their job because, newsflash,
they're all too complicated.
I wanna explain that reference briefly
for people who aren't familiar.
Gell-Mann amnesia refers to
the idea of, like, you're reading a
newspaper and you see a story about your
professional industry or something you're
very knowledgeable about, and you see
everything that it got wildly wrong, and
you're like, "Who even wrote this? What
are they doing? These incompetents."
And then you read the next article, and
you're not familiar with it, and you
just take it super credulously, and it's
like, "Oh, yes, look at this," you know.
It's a, like, I think that's a pretty
good way of thinking about how people
are talking about AI right now, yeah.
And I think, from my perspective,
go hard as long as you can.
I think the bubble is going to
collapse in the next one to two years.
Tokens are gonna become super expensive,
and we're not gonna use as many of these
tools long-term as we are right now, but
we're gonna have a lot of infrastructure,
like, just like we did after the dot-com.
That's part of why I've been
really interested in local
models, because I agree with that.
Like, they're being heavily subsidized
right now, and I think in, I don't know,
about two years, I don't know, we'll see.
It's been very ... People are talking
about it; it feels like the wheels
are falling off right now, with, like,
Claude Code just clamping down usage on
the $20-a-month plan aggressively.
They grew from nine billion ARR to
30 billion ARR in a quarter, so I
imagine they have some scaling issues.
Right?
Yeah, clearly.
One of the points I made in that draft
blog post that I'm hoping to publish this
week: the cat's out of the bag. Like, even
if OpenAI and Anthropic were to utterly
and completely collapse and go out of
business today, the technology exists.
And whether it's, you know, boutique
hosting of models or local LLMs, the
technology is not going to go away.
And even if it
never gets better at all, if the
capabilities stay exactly the way they
are today, it's still enough
to upend large parts of our society.
So then the question becomes,
what are you doing with it, and
how do you respond?
Well, that's a good ... We should wrap up
soon, but that is an excellent jumping off
point for our last little subject here.
Uh, what do we think the impacts of LLMs
on engineering as an industry, software
engineering as an industry, and maybe more
broadly, societally, if we wanna go there?
I would say it's raising
the bar for everything.
Like, what counts as a minimum
viable product is gonna get
more fancy and more complicated.
What counts as a junior entry-level
engineer, or what the expectations are for
a junior entry-level engineer are gonna
go up, and I think we're gonna need a lot
more, um, product sense, and a lot less
being just really good at writing code.
Yeah, I'm inclined to agree.
I am honestly worried about the
impacts in society as a whole.
We've seen lots of debate and studies
about, you know, have cell phones
effectively destroyed the minds of
youth, and the ability to focus, and,
like, electronic bullying,
and ... Like, why did teens
get very depressed starting in 2012?
Like, that sort of thing where
a technology possibly had
large-scale societal effects.
And I'm not saying that LLMs are,
you know, inherently destructive or
inherently bad, but a lot of times
the technology gets invented and
then it changes society in ways we
were very unable to predict early on.
In this case, I think we can predict
a number of ways, and there's probably
a whole lot of other stuff, too.
So I genuinely worry about college
students being able to get away
without having to develop critical
thinking skills even less than they
were able to get away with it before.
I have concerns about that sort of thing.
And, you know, I'm towards the tail
end of my career, I've done my thing,
I learned, I gained my experience.
I don't know how junior devs are
going to gain some of that experience.
Maybe it is they need
different experience.
I don't know what the job
pathways look like at that point.
So, like, I mean, don't get me
wrong, I'm excited to be able to
use AI to crank out some of the
things that I've had in my head.
I've found some of the same joy in being
able to have a multi-hour flow session
and build stuff just with AI editing
the files instead of me and my fingers.
But I can also look around and say,
"Yeah, there's a bunch of unexpected
consequences floating around as well."
I mean, in terms of experience,
it's not like I know how to code
without a compiler, and I would
definitely not enjoy coding without
a, without garbage collection.
Yeah, I think that's closer
to my theory on this.
I think, you know, the societal
impacts are gonna be ... a lot.
We'll see.
I think the societal impacts will come
mostly not from software engineering, so
I think I don't understand those quite
well enough to opine on it too much.
I do care a lot about, like,
sociology and economics and whatever.
So I do have thoughts and opinions, but
maybe it's my Gell-Mann
amnesia of, like, I know those well
enough that I'm like, "Nope, that's way
too complicated. I have no opinion." But
I think we take for granted a little bit
how much we're standing on the shoulders
of giants already and like, this is a big
giant to climb the shoulders of for sure.
Like, not trying to discount that.
People talk about how it's gonna
change our brains and like, you
know, destroy humanity and whatever.
And like, I don't know, like, we
kind of did a lot of that already.
Like, can you navigate
in your city without GPS?
There's a lot of stuff that we've
already lost that used to be,
like, canonical. Like, you know,
you used to have a navigator.
If you wanted to do a road trip,
like, you would have spent a week
planning a route and then, like,
God help you if you miss a turn.
Same thing for, like, finance
and, like, the spreadsheet.
I don't know, this is gonna be, this is a
broad, more broadly applicable thing than
any of those past technological leaps.
Another thing that I learned recently,
going on the societal level again:
in 2023, there was, like, an
international literacy
survey done, with levels and whatever.
And I think level one was: can
read, but struggles with, like,
finding knowledge in text or, like,
following multi-step instructions.
And 28% of Americans tested
at that level of literacy.
And, like, it said anything below level
three is considered partially illiterate.
And the stat was:
54% of Americans read at less than
a sixth-grade reading level.
They will not do well with LLMs.
Right.
But, like, you know, we
did that before LLMs.
That was 2023.
Like, okay, sure.
Like, there's gonna be weird shit
going on since then now, but, like,
God, we already ... Like, as far as
destroying society, like, we've done
a pretty good job of that without AI.
And for context, that's
54% of adult Americans.
It doesn't, like, count
children and stuff.
Yeah.
I don't know.
It's just, like, anytime I hear
people talk about, like, the downfall
of, like, civilization or whatever,
it's like, "I don't know. Have
you looked at civilization lately?
It's not doing so great." So, I
don't know, it is gonna be a lot.
It's gonna be challenging and difficult.
I guess going back to engineering,
software engineering: you remember
the '90s and 2000s, 2010s-era,
like, stereotypes of software
engineers? Like, super
autistic, like, zero social skills.
And now we don't have that anymore.
We, you know, democratized software
engineering, and now everyone does it.
We've always tried to get
something basically like an LLM.
Like, that's what people wanted out
of software engineers this whole time.
So, I think it's gonna be ... I
like the analogy of, like, compilers
and garbage collection and stuff,
because it's just a new tool.
It's a new level of abstraction,
and now it's much further
removed from the hardware.
But, yeah, I don't know.
People talk about, like, how it's
gonna destroy, like, the upskilling, it's
gonna destroy junior engineers, and
there's not gonna be any pathway in.
And, like, I don't know, maybe
that's true, but at the end of
the day, I'm gonna end on, like, a
positive thought or, like, advice.
Like, this is all communication.
It's just more and more communication.
If you wanna lock in your software
engineering career, get good at
communicating, get good at deeply
understanding and articulating
a problem, not the code, not the
patterns necessarily, but, like,
understand what an architecture
is good at or bad at or what the
performance profile of this or that are.
And, like, now, you don't even need
to understand it at the same level.
Like, one reason I've been really
excited about AI is that I have such a
broad and shallow expertise, and now
I can use it.
Like, it used to be that I had all
this, like, vague notional expertise
in a wide range of stuff, and it's
like, "Damn, I wish I could use that,
but I would need a team of 15 people
in order to be productive with this."
And now I don't.
So, like, I don't know,
that's exciting for me.
And I think that's just
generally exciting.
So if you're just good at communicating
and you learn a domain and you get good at
articulating problems within that domain,
then, like, I think you're gonna be fine.
I think the superpower right now is being
a domain expert who can kind of code.
There is certainly still value
in, you know, greater engineering
expertise and understanding how
to ship stuff to production.
At some scale, that
becomes a domain expertise.
True.
Yeah.
Extremely true.
So yeah, I don't know.
Uh, I'm excited.
I can build more stuff.
I can build so much ... I
have built so much more.
I've explored so many more things
and gone, "Oh, this is harder than
I thought. I'm gonna set it down."
Instead of just never doing it and,
like, fantasizing about what it
could have been for years, you know?
Yeah.
Same.
All right.
Cool.
Thank you all.
Thank you both for joining me.
Thank you everyone in the
audience for listening.
Appreciate it a lot.
Hopefully this was informative.
Cool.
All right.
Well, this has been This
Month in React, quote unquote.
I think we may be exploring different
formats in the future because, I don't
know, who's following individual software
library development that closely anymore?
But yeah, thanks so much for listening.
If this is a show that you get value
from, please send it to a coworker,
send it to a friend, leave us a review.
And yeah, I'll see you next month.
All right.
Take care.