This Month in React

Transcript

★ Support this podcast ★
Reply on Bluesky

  • (00:00) - This Month in React April
  • (00:26) - Introductions
  • (02:36) - What convinced you AI code tools were worth using?
  • (04:24) - Using early ChatGPT for DB migrations
  • (06:59) - Watching AI use a command-line
  • (08:16) - Background chat agents
  • (08:58) - Staying very hands-on while using AI tools
  • (11:00) - Driving AI closely without reading its code
  • (17:50) - Mark's workflow; OpenCode with CodeNomad UI, plus IDE+git UI. Opus 4.6 on API
  • (20:39) - Swizec's workflow, latest Cursor on Opus
  • (23:58) - Carl's workflow, mostly ClaudeCode but looking at custom orchestrators
  • (25:23) - Exploring fully autonomous agents
  • (28:08) - Mark's AI debugging work in React core
  • (31:42) - Value of providing more context
  • (33:58) - AI-owned documentation
  • (37:09) - Using good engineering practices still matters?
  • (40:47) - How do you know the right code to make?
  • (42:49) - Good communication still matters
  • (45:13) - What will "review" look like in the future?
  • (46:17) - Automating functionality tests with deployment practices
  • (51:49) - What behaviors belong to the agent, and what fundamentally can't be part of the agent?
  • (56:17) - Impacts of LLMs on software engineering?
  • (01:03:22) - A superpower right now is a domain expert who can kind of code

Creators and Guests

Host
Mark Erikson
An engineer maintaining Redux and Redux Toolkit, working at Replay.io to make smarter AI chat bots and debuggers using time travel.
Producer
Carl Vitullo
Solopreneur just vibing, posts are probably bullshit. Community lead at Reactiflux, the largest chat community of React professionals.
Guest
Swizec Teller
Engineer and manager at a DNA sequencing as a service company, does a lot of React for really fancy data visualizations

What is This Month in React?

How busy professionals stay on top of the React ecosystem. We give you a 1 hour recap of the latest news and nuance in React's development and ecosystem, upcoming conferences, and open source releases. New episodes the first week of every month, with live recordings on the last Wednesday of every month in the Reactiflux stage.

Hosted by Mark Erikson (Redux maintainer), Carl Vitullo (startup veteran), and Mo Khazali (head of mobile at Theodo). See something for us? Post to #tech-reads-and-news

Hello, everyone.

Thank you for joining us for this
month in React, which is not going

to be particularly React heavy.

Mark and I have been talking for a
couple of weeks now about doing a,

like, bonus episode of sorts to talk
about AI and how we're using it.

So we are just, instead of a bonus
episode, we're, we're just gonna

do that for this month, for April.

Yeah, and apologies for last month.

We had a recording problem,
and we, it was completely

unsalvageable, just nothing to save.

Big bummer.

But yeah, so I am Carl.

I am joined this month by Mark
Erikson and Swizec Teller.

yeah, we're gonna talk about AI
because it's been a huge part of

each of our workflows for the last,
like, ranging from, like, three

to six months to a year or more.

Let's do some intros first.

I guess Mark and I are reasonably
well known, but I'm Carl.

I am a staff level software engineer
and engineering manager and community

lead here at Reactiflux, where I
do events like this and build code

to keep the community operating.

I'm Mark Erikson.

My day job is Replay.io, where we've built
a time traveling debugger for both humans

and agents, with the Replay MCP now available.

Please check out our blog.

I just put up a blog post on how Replay
found a bug faster than Dan Abramov did.

I am still the Redux maintainer.

Honestly, I haven't done much
Redux stuff in the last few months

because all my brain space has
been taken up with day job work.

And also, I'm going around to a
whole bunch of conferences this year.

I'm Swiz.

I work at Plasmidsaurus, we are a
DNA sequencing as a service company.

We do a lot of React for really fancy
data visualizations for stuff like,

"Hey, wh- how do you visualize a few
million data points in the browser and

make it work smooth?" Stuff like that.

And these days, I'm kind of more
of a manager than an IC really.

And I've been thinking a lot about what
kind of engineers get hired these days.

We've been hiring a lot and using
more and more AI to write the code.

Yep.

Believe that.

Cool.

Yeah.

So we were just chatting a
little bit about what shape

this conversation's gonna take.

Just to level set a little
bit for everyone listening.

We're gonna start off kind of at the
point of, like, what convinced us that

AI was a tool worth taking seriously and,
you know, getting AI pilled as it were.

Go from there into how we're using it
now, what problems we're using it to

solve, with what tools, as well as kind
of, like, what aren't we using, what

don't we find useful and compelling?

And go from there to, like, landscape
of, like, what tools are available,

what's out there, where do we think
it's gonna go, and then kinda close out

with what do we think the impacts are
gonna be on the industry more broadly.

Mark, you wanna start us off
talking about what convinced

you that AI was worth using?

Sure.

A year ago, I was dead set that I would
never, ever allow AI to write code for me.

It was a fate worse than death.

It was destroying my career.

I refused to do it.

And in fact, I actually wrote a 15,000
word blog post over the weekend that

I haven't published yet that will give
the long form version of this story.

The short form is, over the summer last
year, I cautiously started using AI to

explain an existing code base to me.

You know, just give me some architecture
docs, walk me through the data flow.

And then there was a three-day period
in late August that blew my mind.

On a Tuesday, I asked it to write some
redux unit tests for me because my

brain was too tired to write actual
code, and it did, and I was stunned.

On Wednesday, there was a node compression
library that I've been trying to replace,

but the alternative didn't have all the
features we needed, and it's Rust-based.

And I tried asking the AI to
write the feature for me in

the Rust library, and it did.

It actually didn't quite work right,
and the maintainer had to turn down

the PR, but this was the first time
I saw an agent actually just crank

along and spit out a bunch of code
and happily make a bunch of updates.

And I thought I had a, a
good understanding of what

that process looked like.

And then when I saw it in person
for the first time, my jaw dropped.

And then on a Thursday, I needed to
write some AST-based linting code.

I know what ASTs are.

I've used Babel.

I understand the concepts, but
it's kind of complicated and

fiddly, and we had a custom setup.

I was like, "Could this do it for me?"

And it did.

And it did it much faster
than I could as a person.

And my worldview got destroyed.

Yep.

Sounds familiar.

I mean, that, that sounds
like an amazing experience.

For me, the first was way back in
the, uh, Stone Age where you had

to talk to ChatGPT and then copy
paste the output to try to run it.

And it was like, I think it was the
holidays, and I was writing a book feeling

kind of discouraged, and I was like, "I
wonder how many words I'm writing per

day." And it's like, I can write Python.

It's not that interesting to parse a
markdown file, go through, get history and

see how many words you added every day.

Um, so I was like, "Maybe ChatGPT
can just write this for me."

And I asked it, and the code,
the code didn't work, but it ran.

And I thought that was really cool.

So I talked to it a little bit
more, and we ended up with a really

nice ... It was extremely ugly
code that I would never write myself.

I think it ended up still taking
two or three hours, but it was a lot

more interesting than me doing it.

Um, and I ended up with a visualization
of how my, how my book is doing, and then

I wrote a few more scripts like that.
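The script described here, counting words added per day from a markdown file's git history, roughly boils down to parsing per-commit diffs. This is a hypothetical sketch of that idea, not the actual script from the episode; the `@@DATE` marker and `book.md` filename are assumptions for illustration.

```python
import subprocess
from collections import defaultdict

def words_per_day(log_output: str) -> dict:
    """Group words-added counts by commit date.

    Expects `git log --format=@@DATE %ad --date=short -p` style output:
    each commit starts with an '@@DATE YYYY-MM-DD' marker line, followed
    by its unified diff; '+' lines (but not '+++' headers) are additions.
    """
    counts = defaultdict(int)
    current_date = None
    for line in log_output.splitlines():
        if line.startswith("@@DATE "):
            current_date = line[len("@@DATE "):].strip()
        elif current_date and line.startswith("+") and not line.startswith("+++"):
            counts[current_date] += len(line[1:].split())
    return dict(counts)

def main(book_path: str = "book.md"):
    # Call this inside the book's git repo; prints words added per day.
    out = subprocess.run(
        ["git", "log", "--format=@@DATE %ad", "--date=short", "-p", "--", book_path],
        capture_output=True, text=True, check=True,
    ).stdout
    for day, words in sorted(words_per_day(out).items()):
        print(day, words)
```

The parsing is deliberately separated from the `git` call so it can be checked against a sample diff without a repo.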

And then I started using it at work to
write my database migrations, because-

Please tell me, was it
still just the ChatGPT UI?

It was.

Uh, this was, this,
this was before Copilot.

I was like, "Hey, ChatGPT, I have this
SQL query, please write a migration."

And it wrote the migrations. Or I ended
up, uh, later on just copy pasting table

definitions from, like, DBeaver or DataGrip
or whatever, go look at the current tables

and be like, "I have these tables. Please
write a migration for ..." I think we

were using Knex at the time. "Please
write a migration to add these columns,"

and then copy paste back, and it worked.

And I was doing that for
a while and it was amazing.

I never wrote a migration
again in my life.

Yeah.

I started doing a lot more
migrations because it just became

so much easier to actually do them.

They're such a pain,
they're so mechanical.

That was an early one for me too that
was like ... I guess I actually had been,

like, scoping my work to avoid migrations
because I hate doing them so much.

I had a similar experience of just
like, "Oh my God, this works for that.

I can do so many more of them now
because I don't have to do it."

Yeah, it's a little funny because
actually database migrations are

a big part of why my career looks
the way it does, because I did,

like, one in, you know, my first
year as a software engineer.

I went, "Wow, I hate this.

You can't guarantee anything.

There's just, you just gotta try it."

And so then I went more front
end because you don't have to do

database migrations on the front end.

Yeah.

I had a similar type
of experience as Mark.

I guess I, I got on it much
later than you did, Swiz.

Sometime around, like, August, July,
August, September of last year, you

know, I, I had done kind of like
ChatGPT prototypes or whatever, like

Claude, having it write some HTML
in line and just, like, prove out a

concept or write a first draft of a
script, like, prove out does this work?

Give me off the blank page and give me
something to grow from and diagnose.

And then I'd been hearing people talk
about Claude code, so I finally gave

that a shot in, like, I think around
July, August for the first time.

And just watching it, you know, like,
Mark, like you said, of just watching

it churn through and do things and
also having it be, like, native

on the command line and watching
it just run exactly the same Bash

commands that I would to diagnose
something and, like, read the files.

It was like, "Oh, this is doing exactly
the same process I would to resolve

this bug, but it's doing it five times
faster than I can." That was my big,

like, light switch moment of, like, "Oh,
I need to take this really seriously."

For me, it's kind of like I
really hate watching it work.

I've been using AI for a while
doing, like, "Oh, can you

write this function for me?

Or can you, like, do little small things?"

But my workflow really changed when
Cursor launched Slack background agents.

So I started in Slack, like, when you
get ... Partially, I'm a PM, so I get

a lot of requests that are like, "That
is definitely not a priority right now.

We're not gonna work on that."

But I can now go "@cursor, do the
thing," and I just get a PR with the

implemented small thing that I never
would've taken the time to do myself.

And I, I find that amazing
because watching it work, I think

is, for me, is too distracting.

Yeah.

So I like it when it's fully
somewhere in the cloud.

I don't have to worry.

I just review the code when it's ready.

My workflow is very hands-on
and human in the loop.

Like, I'm sitting there watching that
thing like a hawk and having conversations

with it, which I realize is not how
most people are using AI at this point.

I think a lot of people are very much
on the, in the cloud, multi-agents, how

many of these th- things can be run in
parallel, which I now have opinions on.

Part of it is ... Well, a lot,
a lot of it's the point that I'm

making in the draft blog post, which
is understanding is still critical.

And I'm a very, I'm a very firm
believer in, you know, the fundamentals

and understanding and building
a mental model of the system.

And I personally, for me,
I want my brain engaged.

I want to be thinking through the
problem, and then I'm using the

agent to amplify my own abilities.

Now, don't get me wrong, there's
definitely been a few moments where

I was, like, you know, out and about.

It's like, you know, it actually would be
kind of nice if I could just pull up my

phone and tell an agent, go do this thing.

Like, I, I, don't get me wrong.

I get the appeal.

Yeah.

But in terms of day-to-day
development work and how I approach

programming, I want my brain
active, engaged, and thinking, not

just handing it off to an agent.

Yeah.

I hear that.

I think I use it in both ways.

There are a lot of things that, like
you said, so it's just, like, it's not

quite high priority enough to justify.

And that's where I'll use, like,
more of a background agent.

Like, you know, here's 15 one line
descriptions and, like, just from the one

sentence, it's clear enough what's needed.

Like, you know, adjust the size of this.

Add an element that controls this.

Like, those are mechanical enough
that it's really just, like, getting

it right, making sure it compiles.

I don't currently have a functioning
workflow for that exactly.

What I'm doing right now in the last
four or five weeks just, like, hasn't

really been of that type of development.

So I am still, day-to-day, most
of what I've been doing is very

much, like, the mech suit variant.

You know, people talk about,
like, automated versus mech

suit, like robot or mech suit.

And yeah, so I'm driving it
pretty closely, but I'm not really

reviewing its output very strictly.

I'm, you know, I tell it to use
subagents, and then every so often

I'll, you know, tell it to write me
a report, like, how is this working?

What is it doing?

Ask it some targeted questions to
make sure that the mental model I

have matches what's actually there.

But that's a little different, I
think, than what you described, Mark.

Like, I'm not actually, like, I
don't read almost any of the code.

I try and think of it more from,
like, my engineering manager,

tech lead, product manager hat.

I do really try and sit in engineering
manager/product manager role and

say, like, "Okay, how did those
people talk to me as an engineer?

Like, they were not technical.

They didn't understand anything of
what was being built, and yet it

was their job to make sure it was
functioning as intended." So that's

very much how I try to think about
my work now is from that perspective.

It's like I, instead of reading a line
of code to ensure that patterns are

followed as I intended, like, set up
a lint rule and then, you know, maybe

it's only 80% is good, it's not gonna
catch all the subtleties, but, like,

my experience of working on a team
is, that's kinda how it works anyway.

Like, if it's not captured in the
lint rule, eventually it will break.

Like, eventually, someone will
have been onboarded and not ha- be

deeply steeped in the history of the
project or just be tired that day

and they forget about it or whatever.

So, like, that was very much my
perspective as, uh, you know, when

I've been a tech lead is, like, if bad
code goes out, you know, the blameless

postmortem, it was a process failure.

This should not have been
permitted by the automated checks.

And so that's kind of how I'm thinking
about it is just remove the abstraction

a little bit and guarantee the
outputs and then work on automated

tracking of automated evaluation of
the code quality to make sure that's

at a level that I need it to be.
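The "capture it in a check instead of reading every line" idea above can be made concrete with a small repo-wide pattern scan that fails CI on a hit. This is a hypothetical grep-style sketch, not any real team's rule set; a real setup would more likely use custom ESLint or similar lint rules, and the banned patterns here are made up for illustration.

```python
from pathlib import Path

# Hypothetical banned patterns -> guidance. A real setup would use
# custom ESLint rules; this is the "80% is good" grep-style version.
BANNED = {
    "console.log(": "use the shared logger instead of console.log",
    ": any": "avoid bare `any` type annotations",
}

def find_violations(source: str, filename: str = "<memory>") -> list:
    """Return (filename, line_number, message) for each banned-pattern hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, message in BANNED.items():
            if pattern in line:
                hits.append((filename, lineno, message))
    return hits

def check_tree(root: str, glob: str = "**/*.ts") -> list:
    """Scan a source tree; a CI wrapper would fail the build if this is non-empty."""
    hits = []
    for path in Path(root).glob(glob):
        hits.extend(find_violations(path.read_text(), str(path)))
    return hits
```

The point is the process framing from the conversation: if the pattern matters, it lives in the automated check, not in a human reviewer's memory.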

Yeah, same.

I think that's my perspective as well.

I've ... Going up into tech lead
really kind of almost broke me as an

engineer because I was like, "Oh my God,
all of this terrible, awful code, but

it works and it's fine and everything
is okay and the team just handles

it. And if anything is wrong later,
we just fix it. It's totally fine."

And I kind of treat AI the same way.

I, I do a lot of driving from, like,
a product perspective and for critical

features or, like, super gnarly
business requirements, I go into the

IDE and I drive it kind of like what
Mark was describing, reviewing every

line of code, clicking yes, et cetera.

But they do really well with follow-ups.

Cursor gives you the wrong
thing back in a Slack thread.

You just say, "@cursor,
go fix it." A lot of the times,

those fixes are actually, "Oh, yeah.

Now that I'm holding a working MVP,
it doesn't actually feel right.

I totally didn't even think
of several features that it

needs before it's useful."

I also really like the code review flow.

I don't know exactly what we did,
but we hooked it up so that you can

do a PR review, like a code review
for your cursor agent session.

And in GitHub, you just go "@cursor, fix
this," or you give it information basically

the same as you would with a team
member, and then you just get follow-up,

commits, and it fixes all the things.

That's very much been
my experience as well.

One of my biggest struggles as a,
as an engineer is, like, I remember

hearing about this sort of a stereotype
of, like, give it, give this task

to a junior engineer because they
don't know it's impossible, you know?

The sense of as you get more
experienced, you get to see more of

the complexity, and the complexity
makes it harder to take action because

you're now evaluating trade-offs
instead of just kind of ignoring them

because you don't know they're there.

And I would get so stalled into
analysis paralysis and, like,

what's the correct way to do this?

How do you do this best?

What's gonna have the best maintenance
trade-offs and the lowest, you

know, the tech debt and whatever?

And that's just such a
difficult way to actually solve,

especially a novel problem.

Like, if you understand
a problem well, great.

You can specify everything in advance.

But, like, if you're working at the
edge of your knowledge, at the e- edge

of your understanding, then, like,
you're gonna go the wrong way at first.

And it's so painful to work on something
for three weeks and then go like, "Oh,

shit. This is just completely the wrong
architecture and I need to start over."

But it's so much less painful
now with AI because, like,

great, what a wonderful learning.

Let me take that, let me have it write
a three-page document about what we

learned, what the new problems are, go
back and forth, interact with it about,

like, designing a new data model and
architectural, you know, process flow.

I actually just did this
in the last, like, week.

I've been benchmarking local LLMs because
I want to better understand which ones

work for what tasks so I can not have
all of my AI usage be based on some

mystery frontier model in the cloud.

And I just, like, you know, I started
with just, like, "Hey, give me a

script that will run this model. Start
up a, you know, LLM server and then

run prompts against it." And then,
like, that grows: "Oh, I need, I

wanna be able to kill it and restart.

Oh, I wanna be able to, you know,
version the prompts as I change them.

Oh, I need to have it, you know, run
a code evaluator when I'm giving it,

like, a code generation challenge."

Eventually, it became un-
unmanageable because of the tech debt.

It's, you know, it grew and expanded and
I learned more about what was needed.

And so I just started over and I
said, "Great. This is a Python ball

of mess and it sucks and I hate it."
So let's talk about the architectural

flow of it and now let's add some lint
rules and let's rewrite it in Effect.

So it's actually like a high quality
code base with, like, resumability

and scheduling and whatever.

And now it's working way better
and it's, like, maintainable.

It's been really powerful.
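The benchmarking harness described here (spin up a local LLM server, run versioned prompts against it, time the results) usually reduces to hitting an OpenAI-compatible endpoint, which llama.cpp, Ollama, and vLLM all expose. This is a minimal hypothetical sketch, not the actual project; the port, model name, and endpoint path are assumptions.

```python
import json
import time
import urllib.request

def build_request(model: str, prompt: str,
                  base_url: str = "http://localhost:8080/v1"):
    """Build a chat-completions request for an OpenAI-compatible local server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # deterministic-ish, better for comparisons
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def run_benchmark(model, prompts, send=None):
    """Time each prompt; `send` is injectable so the harness is testable offline."""
    send = send or (lambda req: urllib.request.urlopen(req).read())
    results = []
    for prompt in prompts:
        start = time.perf_counter()
        raw = send(build_request(model, prompt))
        results.append({
            "prompt": prompt,
            "seconds": time.perf_counter() - start,
            "response": raw,
        })
    return results
```

From here, the growth described in the conversation (kill and restart the server, version the prompts, attach a code evaluator) layers on top of this loop.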

I freely admit that the very
hands-on style that I've got

is intentionally learning.

I know it's possible to go faster.

I know that it's possible
to delegate a lot more.

I am treating it much more as an extension
of my IDE and keyboard than I am as a

junior developer delegating the work.

That's fine.

I am good with that.

That is what works best
for me and my brain.

You might start doing that less
and less as you use it more.

I started with doing that a lot more,
and then eventually I realized, wait

a minute, I'm just clicking yes on
everything and then giving it a follow-up

prompt to change the things I didn't
actually like, but click yes on anyway.

I certainly didn't think a year ago
I would be using any of this at all.

I was, I was convinced that was a red
line that I would, I would never, ever

cross on pain of death, and here I am.

But also, like, I mean, I've, I've found a
workflow that I will be hopefully blogging

about in the next day, next couple
days that does actually work for me.

And the hands-on aspect of it
is what I get good results out

of, and it's what fits my brain.

Well, okay, let's take
that as jumping off point.

Like, what tools

Let's do, like, super concise,
just, like, list what tools and

what models you're using day-to-day.

So my, my own setup, my intro was the
Kilo Code VS Code extension with, you

know, it was probably either Sonnet 3.5
or Sonnet 4, whatever was available around

September-ish last fall because I didn't
want to go anything command line at first.

I wanted to stay in a
graphical environment.

I tried Claude Code for about a day.

Tried the VS Code extension, which didn't
work at all, and then I tried the command

line tool, and I, I did not like it.

I tried the OpenCode command
line tool, also did not like it.

How do y'all deal with copy paste
in a command line environment?

So what I have now, I'm using OpenCode.

My personal laptop is Windows.

My work laptop is Windows,
but I work in WSL.

So I actually serve OpenCode from
within the WSL environment, and then I

use a third party web UI for OpenCode.

OpenCode has a very nice
server client distinction.

The text client is just one
of the possible clients.

So I found one called Code Nomad,
which is very good, works great.

So I actually serve Code Nomad
plus OpenCode on the WSL side,

and then I just hit localhost, whatever
port, in my browser on the

Windows side, which also avoids any
cross-platform file shenanigans as well.

So I've got my chat sessions
and my tabs open in the browser.

That's now my development environment.

And then I still have VS Code open
for looking at some of the diffs and

editing that my ... I much prefer a Git
graphical client called Fork, but it

doesn't work well in Linux, so I've just
stuck with the built-in VS Code stuff.

I actually don't like it, but I've been
too lazy to go find a better alternative.

Model-wise, I've basically
been on whatever the latest

and greatest Anthropic is.

We have corporate keys,
da, da, da, da, da, da.

Also, I have somewhat intentionally
tried to avoid model hopping.

I don't want to be running a bunch of
evals every other week and saying, "Oh,

this one provides me a 3% increase in
such and such a benchmark. Clearly, I

need to switch my entire workflow to a
different model." And granted, OpenCode

does let you just pick and choose what
model you want, but I'm trying to get

something that's good and consistent
and that I know works, not chasing

the hypothetical maximum performance.

Okay.

So you're, you're on OpenCode
and using latest Claude Anthropic

models just through API usage?

O- Opus 4.6, yeah.

That's the other thing.

Uh, it's, it's API keys, not the various,
like, you know, $20, $100, $200 max plans.

So I, I read about people getting the
resets and it's like, I haven't do with

that because I'm also not paying for it.

Right.

Cool.

Okay.

It's, like, acknowledged that my experience
may not be universally shared.

Okay.

Yeah.

Yeah, Swiz, what's
your workflow look like?

I'm curious, I'm very curious
about yours because I think you're

by far the most advanced agent
user between the three of us.

Which is really funny because I
was surprised to learn earlier

this week that 97% of my code
is AI, which really shocked me.

I was like, "What?"

But I actually have an
extremely simple setup.

I use Cursor, I keep it updated
to whatever the latest version.

So when it pops up with, "Please
update," I click the button.

I think the latest Opus model or
whatever it is, it's like something 4.6.

I think it's an Anthropic model.

We have a Team subscription, so the company
pays, which gives us an infinite budget,

but I think I'm actually using, like

It's actually funny.

I don't know ... We were just
talking about this today.

I don't know how this happened, but on
the Cursor Leaderboard, I have the second

most AI usage of all of the engineers, and
I have the least amount of dollars used.

Fascinating.

Interesting.

I have literally spent $25 this month.

So anyway, I use Cursor, I use
@cursor on Slack a lot, and I

use @cursor on GitHub a lot.

We also use Linear, and I've
started more and more delegating

my Linear tickets to Cursor.

So I look at the Linear ticket, be like,
"Eh, I didn't write this well," add a

little bit more context so that a dumb bot
can know where to go fix things or what

to change, and then just click Delegate
to Cursor, and then I review the PRs.

That's my workflow.

I try to keep it really simple and easy,
because like Mark said, I'm not looking

to super maximize my productivity.

I'm more looking for, you know, as a
manager, I'm supposed to stay on side

quests anyway, so this is a really good
way to code, uh, between and during

meetings when I'm supposed to be
paying just, like, half attention.

Yeah, that's really interesting.

I, I, I like that you two are like
kind of two ends of the spectrum here.

So as you use it as like very much
in a managerial capacity, like, you

know, the same as you would ping a
colleague, you ping Cursor instead

and just say, "Hey, look at this."
Yeah, pretty much.

I, I've been doing that a lot.

And it's all within collaboration
platforms, I guess is like

a big distinction here.

It's all in space- A lot of it is.

Yeah.

Okay.

We're now experimenting with building,
like, a feedback loop. Sentry sends us

errors, and we're thinking of having a bot
that looks at those errors, figures out how

to fix them, and then just issues a PR, so
that we could have automatic PRs-

Yeah.

for, at least for some er- no, like, you
know, probably not for super critical,

crazy, important things, but there's a
lot of errors that happen where it's not

that important, but it's nice to fix.

Yeah, definitely.

Right, the long tail of Sentry issues.
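One way to scope a bot like that to the "not that important, but nice to fix" long tail is a triage filter that runs before anything is handed to an agent. This is a hypothetical sketch: the thresholds and the issue fields (`level`, `count`, `assigned`) are made-up stand-ins loosely modeled on Sentry-style issue data, not Sentry's actual API shape.

```python
def long_tail(issues, max_events=50, exclude_levels=("fatal",)):
    """Pick low-frequency, non-critical issues worth handing to a fix bot.

    `issues` is a list of dicts with hypothetical Sentry-like fields:
    {"id": ..., "level": "error"|"warning"|"fatal", "count": int, "assigned": bool}
    """
    picked = []
    for issue in issues:
        if issue["level"] in exclude_levels:
            continue  # critical things still go to a human
        if issue["count"] > max_events:
            continue  # high-volume issues deserve real attention too
        if issue.get("assigned"):
            continue  # someone is already on it
        picked.append(issue)
    # least-frequent first: cheapest wins for an automatic PR
    return sorted(picked, key=lambda i: i["count"])
```

Each picked issue's title and stack trace would then become the prompt for the agent that opens the PR.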

But when I'm like actually hands-on
coding, I use it a lot to write my tests

because I know what I want to test and
I can describe the situation to set

up, but I hate doing the grunt work of
setting up your database in just the

right way to have 50 models, et cetera,

yeah.

Yeah.

Okay.

Interesting.

I, I kind of split y'all's experience
a little bit, or I'm working

towards ... I feel like I'm closer to
where Mark is right now and I'm trying

to move towards where you are, Swiz.

Yeah, I almost exclusively am
using Claude Code with Opus.

I've played around a bit with ... Like
I, I used Copilot, you know, GitHub

Copilot, like once to, like, scaffold a
prototype that I was experimenting with.

I don't like the experience
of it being in a web browser.

Like, something about it ... I don't know.

I also don't like code spaces or,
like, remote development across SSH.

So, like, this may be just my own,
like, biases and preferences, but I

really like just ... It's right here.

It's on my machine.

It's an environment that I have set up.

I know exactly what's
available to it and what's not.

Just, like, I guess I've prioritized, at
least in that, where I'm trying to

use it like a mech suit for speeding up
my own pr- individual productivity,

I just want it to be predictable
and understandable for myself.

To me, that's very much just been Claude
Code with whatever defaults, pretty much.

I've played around a little bit
here and there, but yeah, I'm trying

to work towards ... One of my many
projects that I have in flight is

a personal, like, orchestrator.

So I had started out kind of my foray
into, like, earnest AI-focused use, using it

to the extent that I am now, as opposed
to more limited reading every line of

code, going in and editing it myself.

That's only really been since,
like, February when I started

playing this AI agent game.

And it's been really interesting thinking
about ... Because putting an agent in

a simulated world and, like, having it
autonomously play a game and do so in an

interesting way, you know, effectively
and interestingly, really just, like,

got a bunch of wheels turning in my head.

And so, like, Claude Code works really well
for me for, like, pushing the envelope,

like, expanding what a project is, what
it can do, and I have really enjoyed

using background agents more, like, very
much like what you described to us of

just, "Hey, oh, fix this. This is broken.
Here's this bug." I don't currently

have anything like that operational,
but I did get my little personal

orchestrator working for a minute.

It was just wor- working on the GitHub
API and I just said, like, "Here's a

repo, like, go, you know, read the issues,
triage them, prioritize them, take a task,

open a PR, and wait for me to review it."

And it worked, but I didn't give it
any guardrails, so it ended up, like,

redoing the same GitHub issue multiple
times, or, you know, it would re-review

the same PR over and over again with,
like, a page and a half of a comment.

And so I was like, "All right,
okay, this is proof of concept, it

works, but, like, clearly I need
some additional things set up."

And I guess one, another thing I
wanna say, like, kind of what you

said was about, like, using Sentry
Data as an input for what to work on.

I think that's the frontier.

Like, that's what is
currently being explored.

Like, how do you do that effectively?

Yeah.

You need the feedback loops.

So I was reading a lot of "how to
build an agent" papers the other day,

and the main inputs are basically
rag, memory, and feedback loops,

and from there, it can do a lot.

So Cursor, for example, Cursor Cloud
agents through Slack, they actually,

when they're developing the feature,
they will fire up a browser and go

test it, see if they can actually do
the thing, and if they can't, they

will then c- continue iterating.

And in the end, the PR
doesn't just have code.

It has screenshots and videos of
the working feature, which is what

I require from all of my engineers,
and it makes reviews so much easier.

And there was another ... Oh, yeah.

So I think the longest I've managed
to have it spin was when I asked

it to build an entire feature.

So, like, I would write a maybe 200
word, two or 300 word prompt in Slack

to @cursor, and it took ... I think
it spent 45 minutes to come back with
it spent 45 minutes to come back with

the working feature, and that was
amazing, because I was in a meeting

for that whole time, and then I just
looked at the PR and gave it feedback.

Yeah.

That's incredible.

So my day job, we're coming at it from a
similar but also sort of opposite angle.

We, we have built a time travel
debugger, and originally the premise was

that by making a DVR style recording,
you as a person can go in and do all

the investigating in the lines of code
and the print statements and everything

else so that eventually you as a
person figure out why this was broken.

So we shipped an MCP a couple months
ago, and I've already seen some very

real examples of agents being able
to go in and solve bugs that they

wouldn't have been able to otherwise.

I actually put up a post
about this a week ago.

Dan Abramov had filed an actual React
bug saying that the useDeferredValue

hook sometimes fails in production,
it's stuck, like, a render behind.

And he had a repro and he said, uh,
"I've had my agent try to look at it,

but I can't find the answer." A month
later, he comes back and files like

a four-line fix deep in the guts of
React Scheduler to actually fix it.

And he posted on Bluesky later, and
apparently what he had to do was

rebuild the React library with a bunch
of console logging added so that his

agent could look at the prod build
and eventually trace what was going

on and figure out how to fix it.

So I'm like, "That would be a great
marketing post comparison." So I took

his example, I made Replay recordings
of the working dev build and the failing

prod build, handed them to an agent,
and I said, "Here's a bug report.

Here's the two Replay recordings.

The issue is somewhere in React.

Can you find it?" It took 10 minutes.

So I'm like, so then I'm like, okay,
well now let's make it like, you know,

something resembling a proper experiment.

So I took the same two recordings
and I spun up four simultaneous

sessions with differing instructions.

The first one was just a basic, here's
a bug report, go investigate, actually

less context than the proof of concept.

The second one, I gave it like an eight-
step investigative process to follow.

The third one had a few paragraphs
just naming some concepts and

React's internals, like not even file
structure, but just things like,

you know, schedulers, fibers, lanes.

And then the last one also
explicitly listed some of the

Replay MCP tools we have available.

All four of the sessions found the
same bug and suggested the correct fix.

They took respectively 28,
17, eight, and seven minutes.

So on the one hand, you know, ob-
obligatory sales pitch, Replay recording,

finding bugs, it's awesome, it's great.

We're building some cool stuff with it.

But it also showed me a lot
about the value of the prompts

and the context that you give.

There was another example I was working
on where someone had made a, an example

NextJS app with some example bugs in it
and was asking an agent to find them.

And in one of them, there's a, there's
a double loading screen that's caused

by mixing and matching Suspense
and TanStack Query loading state.

The AI always suggests the useSuspenseQuery
hook, but apparently if you do that,

it leads to a hydration mismatch error.

And the real answer is to do
some server pre-fetching instead.

So like you have to think
bigger, think architecturally.

So I did the same thing.

I tried, you know, making some
re- Replay recordings and feeding

in the, the varying instructions.

My agents mostly got the same base-
level useSuspenseQuery answer.

And then I tried one more session
where I fed in the NextJS and

TanStack Query skills files.

And now the agent actually said,
"Well, useSuspenseQuery is the

initial fix, but you really ought
to do server pre-fetching instead."

So again, like that, that taught
me a lot about the proper context.

That also says a lot as we're trying
to build our own CI debugging agent

that looks at test recordings, so
clearly we need to give it, you

know, good prompts, user-based
context, all that kind of stuff.

So, I mean, I think even something
like that can maybe help explain

some of the differences and
results that different people get.

Yeah, no, the impact of prompting and
available context is, like, so massive.

As we're talking workflow, something I do
very regularly is, you know, start with

a prompt, start with, like, if I'm gonna
add a new feature or redesign something

or what, you know, a large task, not a
bug fix, you know, a sprint-sized project,

not a ticket, then I'll start off with,
like, a 200-word, like, prompt of, like,

"Here's what we're trying to build.

Here's how I think we will need to do it.

Here's a couple of parts that I
think are gonna be important." You

know, you know, have it explore,
do the implementation, review, you

know, and then I manually confirm.

And then I'll tell it, like ... And
I guess I'll do this at a couple of

points throughout as it's exploring
and, like, evaluating things.

I'll ask it, like, "How
was the documentation?

Like, did you find everything you
needed?" And that has been, like, totally

transformative for, I think for token
usage as well, because, like, just by

asking that, it'll say, like, "Oh yeah,
it was all ... Here's the three files I

referenced and I got everything I needed."

Or, like, "Oh, yeah, this file was
out of date. You know, I did the wrong

thing at first and then I had to go
back and rework it, so I'd recommend

that we update this," and it makes those
recommendations just based off its

own experience of reading the code.

And so, like, just by doing that fairly
regularly as the code base evolves,

I manage to keep, you know, the
documentation pretty well up to date.

Are you asking it to write its
findings back into documentation

or are you updating the docs?

I'll read it with a skeptical
eye of, you know, like, does this

sound like it is a real problem or
not, a real area for improvement?

And then I'll just say, like, "Yeah,
hey, can you update those docs?"

Often, I'll come at it with a pretty
clear intention of, like, we just

revamped how scheduling works.

Let's analyze the architectural overview
document that we have and see if

it's still accurate and then tell it

...
You know, so, like, I will do some
documentation specific projects with that

in mind, but for the most part, yeah,
just, like, ask it how the experience

of familiarizing itself with the
code base is, and then give it free

rein to improve that for the most part.

Yeah.

I think one thing I would be worried
about with that ... We have some people on the

team experimenting with that, and they
all agree that after a few iterations,

like, a few weeks later or a few months
later, you end up with very spaghettified

documentation that's essentially just AI
slop that humans definitely don't wanna

read, but even the AI barely wants to
look at it and read it because it just

keeps getting low, lower and lower signal.

So I think ... I don't have a solution
for that, but it's a thing I've heard.

That's fair.

I haven't really evaluated that recently.

I did come at it with trying to
say, like, "Here's five documents at

these levels of abstraction in the
code base." So I do try to give it a

little bit of overall stru- structural
guidance, but that's a good thing to,

that's a good thing to look out for.

I haven't really evaluated it.

That's also how I get my own
mental model of the code.

So I guess, like, I don't necessarily
write it, but I will read it, and if

it's nonsense spaghetti, then it's like,
"No, we gotta do this again." Like-

That also ties into the
longer term memory question.

So, you know, the problem is
ev- every session is fresh.

It knows nothing, you know, every,
literally everything is being injected,

you know, agents, MD, you know,
whatever rules, files, et cetera.

So how does an agent even know that
you have these nice architectural docs

that you've been keeping up to date?

Or how does it know that, you know,
last week, a week earlier, whatever,

you made these particular decisions,
and that's why we ended up here.

Or it has no idea what
any of your code base is.

Let's go read 30 files to re-
form a brand new, fresh set of

memories that compress the context.

And some of this can be,
can be dealt with partially.

There's ... I, I, I highly recommend
tools that will do AST-based scanning

of the code base and preferring those
for what, for loading chunks of code

rather than just, like, blindly whatever
built-in read file tools are there.

But that is actually where I'm
running into the limits of my

own personal workflow right now.

I have, you know, dozens of feature
and research docs that I've generated.

I have daily progress docs.

I have sub-task handoff docs.

There's a lot of very valuable information
in there, and the AI has absolutely

no clue that any of that exists.

So I'm n- I'm now f- starting to feel the
need that I need to have some kind of,

you know, review sweep process, some kind
of tool to, you know, index the markdown

file, something to form that longer term
memory structure, and I keep bookmarking

hundreds of tools and saying I'm gonna go
investigate them and haven't done so yet.

If only you had a tool that's really
good at summarizing large pieces of text.

I, no, I, I actually
have done some of that.

Um, I just haven't actually settled
on one and tried installing it yet.

One thing I found also really helps
for context and stuff like that is

actually structuring your code well.

So having a vertically oriented
architecture, small balls of mud that

are self-contained rather than one large
horizontally sliced ball of mud makes

it easier for humans, and it also really
works for the AI when you can say,

"Go add a feature to this directory."

And it's the, like, basically use your
directory structure as an index for

where to find different kinds of code.

That sounds suspiciously like software
engineering practices, and I was

told those don't apply anymore.

Yeah, right.

Like, the main thing that I've been trying
to keep in mind here of using it, and,

and this was, this kind of talk goes back
to your recent blog post of, you know,

where you talk about how much code A-
AI is writing for you, and to just talk

to it like, these models were trained
on things like GitHub discussions and PR

reviews and code comments and whatever.

So, like, it's not some new,
esoteric, crazy, unpredictable thing.

Like, the more you talk to it, just like
it's a competent engineer, the more it

will behave like a competent engineer.

And so it has all of these assumptions
about, you know, social norms within

the context of, you know, text documents
describing code and documentation.

And so the, the more you understand the
social norms that are deeply baked

into its training, the better it does.

So, like, yeah, like, I saw these,
you know, memory startups where it's

like, "Oh, we will automatically
generate documentation for your code

base and make sure it's up to date."

And it's like, I'm already doing that with
a README tagged with a git hash

of when it was last updated, for free.
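The git-hash tagging being described could look something like this. A hedged sketch: the marker format is made up for illustration, and in practice the commit would come from something like `git rev-parse HEAD`.

```python
# Stamp (or refresh) a last-updated marker in a README so an agent can
# tell how stale the doc is relative to the code. The marker format is
# an illustrative convention, not a standard.
import re

MARKER = re.compile(r"^<!-- last verified at commit: [0-9a-f]+ -->$", re.M)

def stamp_readme(text: str, commit: str) -> str:
    line = f"<!-- last verified at commit: {commit} -->"
    if MARKER.search(text):
        return MARKER.sub(line, text)   # refresh the existing marker
    return f"{line}\n{text}"            # or prepend a new one

doc = "# Scheduling module\n\nHow jobs are queued.\n"
stamped = stamp_readme(doc, "a1b2c3d")
print(stamped.splitlines()[0])
# -> <!-- last verified at commit: a1b2c3d -->
```

Because the marker is idempotent, re-running the stamp on every docs-touching commit keeps exactly one freshness line per README.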

Like, if you have a couple of folders,
a couple of directories full of code

that is, like, pretty well isolated with
a README file in there, like, I

don't need to tell the AI to look for
the README because it knows to do that.

Of course, it knows to look for a README.

So there's definitely a lot of
difficulty with keeping the signal

high, but, like, that's not new.

I don't know.

Like, that just sounds so much
to me, like, working on a team

of eight engineers and, like, not
everyone has the same shared context.

Not everyone reads every
email or every Slack message.

And so, like, as far as the challenge
of, the, the challenge of keeping

a team up to date on best practices
and recent decision making, versus

keeping an AI at the same level, like,
that, those feel really similar to me.

It's just engineering best
practices and communication norms.

With, with the major difference
that AI will actually go read the

documents you tell, you ask it to read.

Right, right.

Like, one of my, like, you know,
war stories was I was a contractor

somewhere and they kept having, I kept
having to do a rework because, you

know, QA was overwhelmed and, like,
you know, they finally get around to

reviewing PRs and, oh, it's broken.

Oh, gotta go back.

Oh, there's all these
conflicts now because

And so, like, despite being
a contractor, I sat down.

It's like, no, we need to
fix our review process.

We need to fix our merges and whatever.

Spent, like, two weeks doing that,
get everyone on the same page, man.

Next day, the, you know, tech lead
employee at the company, the guy

who should have been doing that
process is like, "Nah, this is too

much. Like, I'm just gonna force
push." And he took down production.

So it's so it's like, uh, when
people talk about AI doing stuff,

it's like, "I don't know, man. Have
you ever worked with engineers?"

Uh, so this, this does tie into a lot of
the conversations that I had last week.

I, I was at both the AI engineer
and React Miami conferences.

And the general theme of the discussions
there between the talks and the people

and then, you know, all the different
chatter online is we built ... And

I also ranted about this in my
15,000-word unpublished blog post.

We built a bunch of software development
practices for people, the Agile Manifesto,

Linear issues, PR reviews that require
multiple people to stamp them or, you

know, as a way to share knowledge,
standups, like these are all people-

based processes, and they made sense when
humans were the limiting factor and you

have to be very intentional about making
sure you're writing the correct code.

Like, not just does the code work, but
are we even spending the time and effort

on the right thing in the first place?

And we now essentially have a generate
infinite amounts of code button, and now

we're finding out that we've moved the
bottleneck further down the chain into all

the verification and QA sides of things.

And so this shows up in, oops, GitHub
is overwhelmed because we've literally

10Xed the number of PRs that are being
pushed or, wait, we've got a bunch of

senior engineers, but they're literally
spending all their time reviewing PRs

generated by the junior devs with AI, much
less what happens when you have a dark

factory of agents cranking out code
twenty-four seven and no one ever looks at it.

Like, we haven't figured out what the
resulting processes ought to be, and

one of the points I make in my blog
post is that I think right now we're

all operating under the assumption that
there's no limit for how fast we can

go, and in reality, we're gonna figure
out the actual limit is maybe, like, 3X

of what it used to be, but we need to,
like, accept that and plan for it instead

of thinking that if we shove everything
in one end of the pipe simultaneously,

it all comes out the other end.

I have opinions.

Um-

Yes.

I think I wanted to say first is I really
like, Carl, you mentioned that you start

sessions with a really long 200-word
prompt that gives all the context, doesn't

just tell the agent what to do, but also
why it's doing it in, like, other context.

Same.

This turns out works really
well with humans as well.

They give you much better work
if you tell them why they're

doing it or what the goal is.

And sometimes if you tell, if you
tell them what the goal is, they

might even come back with, "Your
solution is dumb. We should do this

other thing instead," which, wow, can
you imagine using engineers' brains

for thinking, not just writing code?

But I think the other thing is, we
are generating a lot of code, and we

think the bottleneck is code review.

I think there's kind of two bottlenecks.

One is that when you're moving really,
really fast, you have to actually

be more careful about working on the
right things because if you're just

digging yourself into a hole faster,
we just had an example recently where

we broke Agile, we were like, "Oh yeah,
we know what we, what we need to do.

Let's just have Claude crank out 6,000
lines of code in one gigantic PR."

First of all, we couldn't review that,
so we had to break it up into a bunch of

sub-tasks and then turn that into sub-PRs.

Turns out that took another extra day of
work and everyone was distracted while

doing this, so we weren't focusing on what
we were actually supposed to be doing.

So I think distraction is still bad.

Multitasking is still bad.

We're not as good at
multitasking as we think we are.

And the other thing is we, when we
split all of that up, we found that

Claude made a tactical mistake very
early on, chose the wrong architecture

for how to build something with React.

And now you have a lot of code that needs
a lot of rework to actually make work,

and you're wasting reviewing time, you're
wa- you're wasting a lot of time that

could have been solved if we started with
a small task to set up a small thing and

talked about it, or maybe even talked
about the architecture before we wrote

six, 7,000 lines of code, and then just
wrote the correct code the first time.

Yeah, 100%.

We're hitting communication
bottlenecks much faster now.

Yeah, like Mark said.

Communication and thinking and
planning was always the job.

Yeah.

Yep.

Except now you're running,
like, 10 times faster.

So if you take the wrong step or as
your first step, you're just running

in the opposite direction of where
you wanna go, but really, really fast.

Right.

Yeah.

Right.

It's very easy to get in
extremely the wrong place in

exactly the same amount of time.

That touches on something I've
said before in this podcast.

I think that the future of, like,
review is gonna be less code review.

The architectural side of things
is still gonna be important.

Like, it's not just, does it work?

Does it do what you intended?

You can't fully treat it as a black box.

The box has to be translucent, you
know, at least a little bit to avoid

major architectural problems that are
gonna cause a production outage or make

it impossible to scale or whatever.

Like, it's not all going
to be, does this function.

But I do think, does it function, is
going to be a much greater part of this.

My sense is that it will rely a
lot less on reviewing the code so

much as getting a plausible mental
model of the architecture, and then

it, you know, and then statically
verifying that it does what's intended.

So, like, my, my theory here is that
we're gonna start seeing a lot more

end-to-end testing and user acceptance.

A thing I did recently that I
haven't fully executed on yet,

but ... Well, I, I did a lot of it.

But so a, a project that I have, I
fully rewrote the CI/CD so that instead

of PR checks, review, approve, merge,
ship, instead of, you know, that, like,

trunk-based development, where every PR
gets shipped individually, I move to more

of a, like, Git Flow type architecture
where it's like, no, here's the version.

This is going to go out.

Uh, here's our release candidate.

And then that single release candidate
gets pushed to a staging environment,

and then I do the user acceptance testing
of does this function as I need it to.

And then once I have that, once I've
done that manual testing, great.

I know this code is
working as I meant it to.

So now I'm going to have it write a bunch
of end-to-end tests to verify that it will

continue to do that when I make changes.

And I think that's gonna be
kind of the new cycle of, like-

I want that, but on every PR.

Yeah, right.

And I guess to me, like, that's where
manual testing is now the bottleneck.

Like, you cannot ... At the end of
the day, this needs to function,

this needs to solve a problem.

Any code you write needs to solve
a problem that real humans have.

Or I guess, you know, sure,
other agents or whatever.

I've experimented with
automated end-to-end testing.

It doesn't quite work yet.

I think the models are, or at least the
models I was trying were too slow, but

the idea is that instead of telling that

Instead of writing a test as what
features should be there, it's

more about, can the user do a thing?

And then because you have a computer use
model, it can look at the browser and

it's automatically testing both your UX.

Like, if the model can't figure out
how to do what your user is supposed

to do, drunk users won't figure it out
either, or tired users or whatever.

So you just point it at the browser
and you'd say, "Go, like, go place an

order." And at the end of the day, as
long as it can place the order it wants

to place, you can keep iterating on the
implementation, you can keep iterating

on the design, you can keep iterating
on the UX, you just get tests that are

based on can the agent place an order.

Yeah.

Oh, that's really interesting.

That's, that's a really good variant on
what I was just kind of talking about

because, like, sure, static end-to-end
tests that are verifying, like,

this button with this test ID exists
and the flow completes as expected.

Like, sure, okay, it
functions, but is it usable?

That's an interesting idea,

automated usability testing, because
that kind of goes back to what
we were talk- talking about with,

like, documentation and code notes.

Like, just make it intuitive.

Can an agent intuit how
to achieve a general task?

Oh, that's really interesting.

If you've ever used VCR in Rails,
that's kind of where I got the idea.

With VCR, you do full integration testing
with calling remote APIs, but you don't

wanna call them every time, so you record
the responses and you keep them locally.

So you then write tests against
those local responses, but if the

API ever changes or anything, you
just regenerate them and you find all

of the bugs in your code that don't
fit the new, newly released API.

And you could do something similar
with this where you have browser-based

acceptance testing, and it records
what it's doing so that you don't have

to spend as many tokens every time.

And you just d- when you change
the UI, you delete it and you

see if you can do the flow again.
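The VCR-style idea above reduces to a small record-or-replay wrapper. A hedged sketch under made-up names: the "live" call stands in for anything expensive, whether a remote API or an agent driving a browser, and deleting the cassette file forces a fresh recording.

```python
# Record an expensive interaction the first time, replay the cassette on
# later runs; delete the file when the UI changes to re-record the flow.
import json
import tempfile
from pathlib import Path

def with_cassette(path: Path, run_live):
    """Replay a recorded result if present; otherwise record it."""
    if path.exists():
        return json.loads(path.read_text()), "replayed"
    result = run_live()                  # the expensive live interaction
    path.write_text(json.dumps(result))  # persist the cassette
    return result, "recorded"

cassette = Path(tempfile.mkdtemp()) / "place_order.json"

first, how1 = with_cassette(cassette, lambda: {"order_id": 42, "status": "placed"})
second, how2 = with_cassette(cassette, lambda: {"never": "called"})
print(how1, how2)  # -> recorded replayed
```

The second call never invokes its live function, which is exactly the token-saving property being described: you only pay for the agent run when the cassette is missing or deliberately invalidated.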

Yeah.

Totally smart.

Interesting.

Doesn't quite work yet, but
that would be really cool.

We've kind of been going into this
direction, but let me, like, sort of,

like, you know, refocus a little bit.

Like, so what do we think the
current landscape of tools,

of AI tools looks like?

We've kind of been discussing that
in, in, we've been dancing around it.

And I guess where do we
think it's going next?

Going or ought to go?

Either, both.

Both.

My general inclination as the person
who found one tool set and is stuck

with, and is sticking with it and is not
actively trying to change what he's doing.

Certainly what I'm seeing online is
turning towards automate all the things,

which I think is a very understandable
impulse for a software developer.

If you can do things once, you can
do it in a loop, you can run it in

parallel, you can distribute it, you
can have an agent do it, et cetera.

And I think it kind of ties into the
go as fast as possible at all costs

for the purpose of going as fast
as possible, direction that I see.

I mean, like this bit about, you know,
the dark factories, the idea that a team

never even looks at the code, that it's
truly all agents all the time, and all you

say is, "Do something that accomplishes
some goal," and you never even look at it,

is sort of an ultimate outcome of that.

I don't think it's a good one,
but I see how it's the idea

taken to a logical extreme.

I, I also have been pretty ... I
found a tool and I'm sticking with it.

I've been noticing Claude Code
pulling in lots of ideas

that I've just been thinking.

Like, as it evolves, it's like,
this is a productized version of

how I've already been using it.

You know, like they added a /btw command
that lets you ask a question and it

will ... It cannot use tools and it
does not go into the context history.

So it's just, like, a little aside.

Like if you wanna ask, like, "What's
this thing doing?" I had already

been doing that by way of, like,
I'd ask it a question and then

I'd go back and just, you know,
restore the conversation to that.

And similarly, two months ago, I had
been saying, like, I would actually spin

up a Claude code instance that was on
Haiku because I just need to summarize

this and it's easier and I wanna manage
my usage a little bit more precisely.

And, like, now I don't have to do that
because I just say, use a subagent and

Claude Code is smart enough to say, "Oh,
I'm just summarizing. Let me use Haiku."

So I've just been watching, like,
kind of what the agent itself is doing

versus what I need to tell it to do, or
proactively manage, converge a little bit.

So that's, that's had me wondering,
what outside of the agent, what is

not going to live within the agent.

And I think where I've landed on
that is, like, everyone's talking

about memory the last six weeks
or so, and I think that's wrong.

I think that's just gonna be
conventions and documentation.

It's just knowledge.

I think what's going to persistently
live outside of an agent harness, as it

were, is, like, incoming information,
like a data stream, like GitHub issues,

text, you know, the Slack chat, a
Sentry alert, a production alarm.

If you have a data stream that is
categorizing those and prioritizing

them, and then giving that to an agent,
to an autonomous agent that is just

working, like, those are different things.

I think you cannot have a single
generic agent that is able to connect

to every possible data source.

So that to me feels like a
boundary that's gonna be pretty

strong for the coming months.

I don't know.

I think for me, I feel like going
super deep into what, at least on

the internet, a lot of people are
talking about various MD files and

orchestrations and all the crazy stuff.

I feel like that's a little bit
like IDE skins and vim shortcuts.

It's like, yeah, sure, it's cool, but
I, I think you're overcomplicating it.

The models are gonna keep getting
stronger and better, so I would

just think they're gonna be able
to do bigger and bigger tasks.

I think feedback loops, like what you
said, what is outside of the model?

Building really strong
feedback loops is important.

I think memory documentation inside the
code or making the code itself easier

to research or giving it MCP tools to,
like, Notion or Linear or wherever

you keep your organizational memory.

Having access to organizational
memory will help agents as well.

Yeah.

I don't think we're ever gonna come to
full automation of anything, really.

I think there's a strong Gell-Mann
amnesia when it comes to th- these

things where agents will fully
automate every job except the ones

I'm, like, super familiar with.

That is just way too complicated
and there's too many details.

Everyone feels that way about
their job because newsflash,

they're all too complicated.

I wanna explain that reference briefly
for people who aren't familiar.

Gell-Mann amnesia, that refers to
the idea of, like, you're reading a

newspaper and you see a story about your
professional industry or something you're

very knowledgeable about, and you see
everything that it got wildly wrong, and

you're like, "Who even wrote this? What
are they doing? These incompetents."

And then you read the next article, and
you're not familiar with it, and you

just take it super credulously, and it's
like, "Oh, yes, look at these," you know.

It's a, like, I think that's a pretty
good way of thinking about how people

are talking about AI right now, yeah.

And I think, from my perspective,
go hard as long as you can.

I think the bubble is going to
collapse in the next one to two years.

Tokens are gonna become super expensive,
and we're not gonna use as many of these

tools long-term as we are right now, but
we're gonna have a lot of infrastructure,

like, just like we did after the dot-com.

That's part of why I've been
really interested in local

models, because I agree with that.

Like, they're being heavily subsidized
right now, and I think in, I don't know,

about two years, I don't know, we'll see.

It's been very ... People are talking
about it, it feels like the wheels

are falling off right now with, like,
Claude Code has just clamped down usage on

the 20 d- $20-a-month plan aggressively.

They grew from nine billion ARR to
30 billion ARR in a quarter, so I

imagine they have some scaling issues.

Right?

Yeah, clearly.

One of the points I made in that draft
blog post that I'm hoping to publish this

week, the cat's out of the bag, like, even
if OpenAI and Anthropic were to utterly

and completely collapse and go out of
business today, the technology exists.

And whether it's, you know, boutique
hosting of models or local LLMs, the

technology is not going to go away.

And even if it only, even if it
never gets better at all, the t- the

capabilities stay exactly the way they
are today, it's still clearly enough

to upend large parts of our society.

So then the question becomes, w-
what are you doing with it and

how do you, how do you respond?

Well, that's a good ... We should wrap up
soon, but that is an excellent jumping off

point for our last little subject here.

Uh, what do we think the impacts of LLMs
on engineering as an industry, software

engineering as an industry, and maybe more
broadly, societally, if we wanna go there?

I would say it's raising
the bar for everything.

Like, what counts as a minimum
viable product is gonna get

more fancy and more complicated.

What counts as a junior entry-level
engineer, or what the expectations are for

a junior entry-level engineer are gonna
go up, and I think we're gonna need a lot

more, um, product sense, and a lot less
being just really good at writing code.

Yeah, I'm inclined to agree.

I am honestly worried about the
impacts in society as a whole.

We've seen lots of debate and studies
about, you know, have, have cell phones

effectively destroyed the minds of
youth and the ability to focus and

social bul- like electronic bullying
and, like, very ... Like, why did teens

get very depressed starting in 2012?

Like, that sort of thing where
a technology possibly had

large-scale societal effects.

And I'm not saying that LLMs are,
you know, inherently destructive or

inherently bad, but a lot of times
the technology gets invented and

then it changes society in ways we
were very unable to predict early on.

In this case, I think we can predict
a number of ways, and there's probably

a whole lot of other stuff, too.

So I genuinely worry about college
students being able to get away

without having to develop critical
thinking skills even less than they

were able to get away with it before.

I have concerns about that sort of thing.

And, you know, I'm towards the tail
end of my career, I've done my thing,

I learned, I gained my experience.

I don't know how junior devs are
going to gain some of that experience.

Maybe it is they need
different experience.

I don't know what the job
pathways look like at that point.

So, like, I mean, don't get me
wrong, I'm excited to be able to

use AI to crank out some of the
things that I've had in my head.

I've found some of the same joy in being
able to have a multi-hour flow session

and build stuff just with AI editing
the files instead of me and my fingers.

But I can also look around and say,
"Yeah, there's a bunch of unexpected

consequences floating around as well."

I mean, in terms of experience,
it's not like I know how to code

without a compiler, and I would
definitely not enjoy coding without

a, without garbage collection.

Yeah, I think that's closer
to my theory on this.

I think it will ... You know, the societal
impacts are gonna be a lot.

We'll see.

I think the societal impacts will come
mostly not from software engineering, so

I think I don't understand those quite
well enough to opine on it too much.

I do care a lot about, like,
sociology and economics and whatever.

So I do have thoughts and opinions, but
I th- maybe it's a, maybe it's my Gell-Mann

amnesia of like, I know those well
enough that I'm like, "Nope, that's way

too complicated. I've got no opinion." But
I think we take for granted a little bit

how much we're standing on the shoulders
of giants already and like, this is a big

giant to climb the shoulders of for sure.

Like, not trying to discount that.

People talk about how it's gonna
change our brains and like, you

know, destroy humanity and whatever.

And like, I don't know, like, we
kind of did a lot of that already.

Like, can you navigate
in your city without GPS?

There's a lot of stuff that we've
already lost that used to be

like, canonically like, you know,
you used to have a navigator.

If you wanted to do a road trip,
like, you would have spent a week

planning a route and then, like,
God help you if you miss a turn.

Same thing for, like, finance
and, like, the spreadsheet.

I don't know, this is gonna be a more broadly applicable thing than any of those past technological leaps.

Another thing that I learned recently,
going on the societal level again,

in 2023, there was, like, an international literacy survey done, with levels and whatever.

And I think level one was can
read, but struggles with, like,

finding knowledge in text or, like,
following multi-step instructions.

And 28% of Americans, you know, tested at that level of literacy.

And, like, it said anything below level
three is considered partially illiterate.

And the stats were: 54% of Americans read at less than a sixth grade reading level.

They will not do well with LLMs.

Right.

But, like, you know, we
did that before LLMs.

That was 2023.

Like, okay, sure.

Like, there's gonna be weird shit
going on since then now, but, like,

God, we already ... Like, as far as
destroying society, like, we've done

a pretty good job of that without AI.

And for context, that's
54% of adult Americans.

It doesn't, like, count
children and stuff.

Yeah.

I don't know.

It's just, like, anytime I hear
people talk about, like, the downfall

of, like, civilization or whatever,
it's like, "I don't know. Have

you looked at civilization lately?
It's not doing so great." So, I

don't know, it is gonna be a lot.

It's gonna be challenging and difficult.

I guess going back to engineering,
software engineering, you remember

the '90s and 2000s, 2010s era,
like, stereotypes of software

engineers, like, it's, like, super
autistic, like, zero social skills.

They are a human computer.

And now we don't have that anymore.

We, you know, democratized software engineering and now everyone does it.

We've always tried to get
something basically like an LLM.

Like, that's what people wanted out
of software engineers this whole time.

So, I think it's gonna be ... I
like the analogy of, like, compilers

and garbage collection and stuff,
because it's just a new tool.

It's a new level of abstraction,
and now it's much further

removed from the hardware.

But, yeah, I don't know.

People talk about how it's gonna destroy, like, the upskilling, it's gonna destroy junior engineers, and there's not gonna be any pathway in.

And, like, I don't know, maybe
that's true, but at the end of

the day, I'm gonna end on, like, a
positive thought or, like, advice.

Like, this is all communication.

It's just more and more communication.

If you wanna lock in your software
engineering career, get good at

communicating, get good at deeply
understanding and articulating

a problem, not the code, not the
patterns necessarily, but, like,

understand what an architecture
is good at or bad at or what the

performance profile of this or that are.

And, like, now, you don't even need
to understand it at the same level.

Like, part of why I've been really excited about AI is because I have such broad and shallow expertise, and now I can use it.

Like, it used to be that I had all this, like, vague notional expertise in a wide range of stuff, and it's like, "Damn, I wish I could use that, but I would need a team of 15 people in order to be productive with this."

And now I don't.

So, like, I don't know,
that's exciting for me.

And I think that's just
generally exciting.

So if you're just good at communicating
and you learn a domain and you get good at

articulating problems within that domain,
then, like, I think you're gonna be fine.

I think a superpower right now is being a domain expert who can kind of code.

There is certainly still value
in, you know, greater engineering

expertise and understanding how
to ship stuff to production.

At some scale, that
becomes a domain expertise.

True.

Yeah.

Extremely true.

So yeah, I don't know.

Uh, I'm excited.

I can build more stuff.

I can build so much ... I
have built so much more.

I've explored so many more things
and gone, "Oh, this is harder than

I thought. I'm gonna set it down."
Instead of just never doing it and,

like, fantasizing about what it
could have been for years, you know?

Yeah.

Same.

All right.

Cool.

Thank you all.

Thank you both for joining me.

Thank you everyone in the
audience for listening.

Appreciate it a lot.

Hopefully this was informative.

Cool.

All right.

Well, this has been This Month in React, quote unquote.

I think we may be exploring different formats in the future because, I don't know, who's following individual software library development that closely anymore?

But yeah, thanks so much for listening.

If this is a show that you get value
from, please send it to a coworker,

send it to a friend, leave us a review.

And yeah, I'll see you next month.

All right.

Take care.