Limitless: An AI Podcast

AI Loops have taken over our timeline as a more autonomous way of using AI models, alongside prompting, agents, and harnesses. 

Today, we compare practical use cases, note how AI runtimes have expanded to hours or days, and talk about costs, enterprise limits, and the human role in higher-level work.

------
🌌 LIMITLESS HQ ⬇️

NEWSLETTER:    https://limitlessft.substack.com/
FOLLOW ON X:   https://x.com/LimitlessFT
SPOTIFY:             https://open.spotify.com/show/5oV29YUL8AzzwXkxEXlRMQ
APPLE:                 https://podcasts.apple.com/us/podcast/limitless-podcast/id1813210890
RSS FEED:           https://limitlessft.substack.com/

------
TIMESTAMPS

0:00 AI Autonomy Ladder
1:49 From Prompts to Agents
4:59 Understanding AI Loops
10:35 Why Autonomy Is Rising
15:46 Human Taste Still Matters
20:38 The Cost of Intelligence
25:25 Recursive Self-Improvement
27:32 Four Rungs Explained
29:41 Closing

------
RESOURCES

Josh: https://x.com/JoshKale

Ejaaz: https://x.com/cryptopunk7213

------
Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures

Creators and Guests

Host
Ejaaz Ahamadeen
Host
Josh Kale

What is Limitless: An AI Podcast?

Exploring the frontiers of Technology and AI

Ejaaz:
99% of people are using AI models the same way that they use Google.

Ejaaz:
But recently, a new way of prompting your AI has emerged that doesn't just replace

Ejaaz:
the way that you work, it promotes you to the CEO of your very own AI company.

Ejaaz:
It's called Loops and it's part of a growing development in agent autonomy where

Ejaaz:
AI agents basically spin up and autonomously complete tasks or goals that you

Ejaaz:
set for it, often working throughout the night.

Ejaaz:
In 2019, the longest that an AI agent could work autonomously was two seconds.

Ejaaz:
Fast forward to today, and they can work autonomously for 12 hours,

Ejaaz:
and that's doubling every couple of months.
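
As a rough check on those figures, here is a minimal sketch that works out the average doubling time implied by the two endpoints quoted above (two seconds in 2019, 12 hours today). It assumes "today" is 2026, a guess based on the later mention of "early last year, in 2025", and it assumes steady growth between the endpoints. The function name is illustrative only.

```python
import math


def implied_doubling_months(start_seconds: float, end_seconds: float, span_months: float) -> float:
    """Average months per doubling if growth between the two endpoints were steady."""
    doublings = math.log2(end_seconds / start_seconds)
    return span_months / doublings


# Endpoints quoted in the episode: 2 seconds (2019) and 12 hours (today).
# ASSUMPTION: "today" is 2026, so the span is about 7 years = 84 months.
months = implied_doubling_months(2, 12 * 60 * 60, 84)
print(f"{months:.1f} months per doubling on average")
```

Under those assumptions the average works out to several months per doubling, so "every couple of months" would have to describe the recent pace rather than the whole period.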

Ejaaz:
Andrej Karpathy calls this phenomenon the autonomy slider, where you can take

Ejaaz:
a dial that slides from humans that approve everything to humans that periodically check in.

Ejaaz:
And it's part of this growing trend of agents consuming and taking up more of

Ejaaz:
human capital and labor. And the question that remains going forwards is, what will humans do?

Ejaaz:
And will they be entirely replaced

Ejaaz:
by AI? Or will they be the ultimate orchestrator of their destiny?

Josh:
Yeah, I think the goal for this episode is really just to inform people on what's

Josh:
possible current day with these agents, with these LLMs, with writing these

Josh:
loops, as well as where you can possibly find yourself within that stack,

Josh:
because it gets pretty complicated.

Josh:
When we're getting into loops, not everyone needs to use loops,

Josh:
but everyone should probably be using LLMs slightly differently than they are today.

Josh:
So maybe we could start with a little history lesson in terms of the four levels

Josh:
in which we have been engaging with LLMs, starting with the first level, which is just prompting.

Josh:
Most people are probably still doing this. It started in 2022, 2023,

Josh:
around the release of ChatGPT.

Josh:
The way that you would engage with these LLMs is you would just submit a question

Josh:
or a prompt and you'd get some language back. Now, if you are still doing

Josh:
this, that's okay, because I find a.

Josh:
Three years ago, four years ago. It has since advanced pretty,

Josh:
pretty meaningfully since then.

Josh:
The second step of this is agents. And we're going to spend some time on agents.

Josh:
Everyone's kind of heard of an agent. Maybe not everyone knows what an agent is.

Josh:
An agent is something that could think for a little bit longer.

Josh:
It could run a bit longer than just a standard prompt.

Josh:
It can go off and do things. It could call tools for you.

Josh:
It's a much more capable version of the text box. Then, like we talked about

Josh:
all the time on the show recently in the last few weeks, there's the harness

Josh:
feature in which you put an LLM into a container and that gives it a memory feature.

Josh:
That gives it complete tool use. That's something like OpenClaw, which we've

Josh:
talked about a lot and that some people do use. That's level three.

Josh:
And now level four, which is the new thing that has come this week,

Josh:
that's really been highlighted by some of the top leaders at these AI labs is loops.

Josh:
And a loop is essentially a version of an agent that has an orchestration layer

Josh:
and kind of builds upon itself.

Josh:
So it allows you to kind of continue to scope yourself out. If you can imagine

Josh:
you're dealing directly with an employee at level one, and then

Josh:
you're directing that person to go off and do their own thing at level two.

Josh:
At level three with the harness, you're kind of directing a series of people to help you.

Josh:
And then level four, you're just the top level CEO who's directing your C-suite

Josh:
to go and manage all the employees below you. So there's an entire stack to this.

Josh:
It's very cool. Ejaaz, how do you use your AI currently? Where would you say

Josh:
that you fit in this stack?

Ejaaz:
Yeah, so looking at this diagram that we have on the screen here,

Ejaaz:
I'm somewhere between number two and number three. I'm somewhere between using

Ejaaz:
agents and trying to figure out the whole harness thing.

Ejaaz:
Now, what am I doing when it comes to like spinning up agents?

Ejaaz:
If you look at either my Claude or my ChatGPT desktop apps right now,

Ejaaz:
I've renamed a bunch of my conversations to a particular focus or subject and then agent after it.

Ejaaz:
And so I can go to it and this agent basically has all the context of what I

Ejaaz:
wanted to do, whether it's like research a particular topic,

Ejaaz:
create some kind of an outline for something, research a particular investment angle.

Ejaaz:
It already knows and has the embedded context for what it needs to do.

Ejaaz:
And there's usually like one to maybe three tasks that it needs to autonomously execute on its own.

Ejaaz:
And so it runs in kind of like a sequence. But if any of that sequence kind

Ejaaz:
of breaks, let's say it kind of tries to retrieve data from some particular

Ejaaz:
website and it is unable to do so, it breaks.

Ejaaz:
And it comes to me and it says, hey, Ejaaz, is there some other thing that you

Ejaaz:
want to look at or retrieve from, blah, blah, blah?

Ejaaz:
It's not fully autonomous. Now, number three, the Harness side of things is

Ejaaz:
what I'm trying to kind of like mold my understanding around.

Ejaaz:
What I've noticed is when you type in a prompt and you get a response,

Ejaaz:
you can kind of tell that it's AI-y. Like usually when we kind of create artifacts,

Ejaaz:
it comes in a particular font or it speaks in a particular type of language.

Ejaaz:
The Harness helps kind of like take your prompt and kind of mold it into something

Ejaaz:
that is more human-like, but also more nuanced with what you are trying to do.

Ejaaz:
Like it effectively gets

Ejaaz:
closer towards that ultimate goal. Like we were talking before recording this

Ejaaz:
episode about human taste and how AI doesn't really get human taste.

Ejaaz:
The harness helps you get towards that ultimate kind of taste profile for the

Ejaaz:
particular output that you're trying to generate.

Ejaaz:
I haven't tried working with loops just yet, but my understanding of this,

Ejaaz:
and correct me if I'm wrong, is you have an AI. You can prompt it and you can get some kind of output.

Ejaaz:
A loop specifically is an AI agent that doesn't break. If it comes across an

Ejaaz:
obstacle that it doesn't understand, its instinct isn't to come to the human

Ejaaz:
and say, hey, like, I can't figure this out, guide me.

Ejaaz:
It completely reiterates the prompt over and over again until it gets past that

Ejaaz:
obstacle, working towards like one objective. So a few examples I've seen for

Ejaaz:
this is if you are coding, right?

Ejaaz:
And let's say there's multiple workflows of a code base that you want to work

Ejaaz:
on, and it comes across a hiccup where it can't retrieve data from one of those

Ejaaz:
particular flows, it is able to kind of like navigate around it,

Ejaaz:
maybe spin up its own separate flow and try to figure out the problem.
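
The difference between the agent behavior and the loop behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_agent`, `run_loop`, `NeedsHuman`, and the two stub fetchers are hypothetical names, and the "data sources" stand in for whatever tools a real agent would call.

```python
from typing import Callable, List, Optional


class NeedsHuman(Exception):
    """Raised when the agent gives up and asks the person for guidance."""


def run_agent(fetchers: List[Callable[[], str]]) -> str:
    """Agent-style: try the first approach; on failure, stop and ask the human."""
    try:
        return fetchers[0]()
    except Exception as err:
        raise NeedsHuman(f"could not fetch data: {err}") from err


def run_loop(fetchers: List[Callable[[], str]], max_passes: int = 3) -> Optional[str]:
    """Loop-style: keep cycling through alternative approaches until one works."""
    for _ in range(max_passes):
        for fetch in fetchers:
            try:
                return fetch()
            except Exception:
                continue  # obstacle hit: move on to the next approach
    return None  # budget exhausted without success


# Hypothetical stand-ins for real tool calls.
def blocked_site() -> str:
    raise ConnectionError("site refused the request")


def cached_copy() -> str:
    return "data from cache"


print(run_loop([blocked_site, cached_copy]))
```

The `max_passes` cap is a design choice rather than something from the episode: a loop that never gives up is also a loop that never stops spending tokens.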

Ejaaz:
And often this results in an agent working for multiple hours at a time,

Ejaaz:
often overnight. I think Karpathy spoke about his auto research agent working

Ejaaz:
overnight whilst he slept.

Ejaaz:
And we're seeing different variations of this start to arise.

Ejaaz:
Where are you, Josh, in the stack?

Josh:
Yeah, loops are like a closed-loop system where you kind of define an

Josh:
outcome and it will continue to work towards that outcome without any external inputs.

Josh:
It's very cool. It's very automated. I don't think it's for everyone.

Josh:
It's certainly not for me because I haven't really had a use case for loops per se.

Josh:
I would say I'm sitting at each one of those first three phases given whatever

Josh:
tasks I'm trying to do. And I think it's important to understand that a lot

Josh:
of people might not even need to go past number one unless you're actually doing productive work.

Josh:
A lot of the agents, a lot of the harnesses are for automating

Josh:
more systems in your life. If you're just trying to use this as Google, if you're

Josh:
just trying to use this as a writing assistant or someone to chat with, prompting

Josh:
is really strong. And I find a lot of times

Josh:
this is my outlet for Google search results. So instead

Josh:
of searching Google, I'll ask my LLM and get slightly more in-depth results.

Josh:
For agents, I use them quite a bit when I'm doing a little bit more productive

Josh:
work. For example, we track the analytics on Limitless and we want a place in

Josh:
which we can have all those analytics dumped to a dashboard.

Josh:
That is an agent that I run.

Josh:
So it goes into my browser. It detects all of the views that we've had from

Josh:
the week for YouTube, from Spotify, from RSS feed, where you should all be subscribed

Josh:
to and rate us five stars.

Josh:
And it compiles it into a singular spreadsheet in which we could then publish

Josh:
online and we could share with prospective sponsors and things like that.

Josh:
And then for harnesses I've used, because I mean, that's mostly OpenClaw.

Josh:
I've used OpenClaw. I really enjoyed the process. I find myself using it a bit less and less.

Josh:
And I think the loops feature, at least, is probably most productive right

Josh:
now for people who are writing code, who are writing verifiable solutions.

Josh:
As I was looking into loops and figuring out

Josh:
how I can structure them into my life, one of the problems that I ran into is

Josh:
I'm not really sure I have a verifiable

Josh:
set of outputs that I want to optimize for, for a lot of the work that I'm

Josh:
doing, because a lot of it is subjective. A lot of it is kind of creative work.

Josh:
It requires a human in the loop for a lot more of it.

Josh:
So I would say I am number one, two, and three on the list. Haven't quite made my way to four.

Josh:
But yeah, for the people who are, those are the people like Boris Cherny from

Josh:
Anthropic. And we know Andrej and Peter Steinberger from OpenAI.

Josh:
They are all on four. They are using it to

Josh:
create these unbelievable agentic systems and continue to remove themselves from the loop.

Ejaaz:
You know what I've realized? With loops in particular and just AI agents in

Ejaaz:
general, they're trying to improve our understanding or rather their understanding

Ejaaz:
of the English language.

Ejaaz:
So one of my favorite Karpathy quotes back in the day was English is the new

Ejaaz:
programming language. I think he said this like two, two and a half years ago.

Ejaaz:
And I've just realized that like us creating AI agents is basically like,

Ejaaz:
it's the same model. It hasn't necessarily got smarter.

Ejaaz:
It's just like using that model to kind of like keep ramming its head and its

Ejaaz:
brain against a particular problem until it understands what the human actually means.

Ejaaz:
And so in this new world, I know you just used the example that,

Ejaaz:
you know, loops can be used for coding specifically, but the

Ejaaz:
coding that Boris Cherny and Karpathy are doing is English.

Ejaaz:
Like they're speaking to the LLM, they are writing in English to the LLM.

Ejaaz:
And yeah, maybe they're copy and pasting some versions of code,

Ejaaz:
but that code is primarily generated by an AI.

Ejaaz:
I think like something crazy, like 80% plus of code generated at Anthropic,

Ejaaz:
both for research and for just general consumer adoption is generated by Claude itself.

Ejaaz:
And so that's one thing. The other thing is the model just not getting smarter

Ejaaz:
is a really interesting thing. Like typically in my head, I would think,

Ejaaz:
okay, you need a better model to be able to unlock some of these new features

Ejaaz:
like AI agents, autonomous loops, et cetera.

Ejaaz:
But really you could just take the same model, wrap a harness around it and

Ejaaz:
try to get it to understand what particular goal it's getting at and just run

Ejaaz:
that iteration over and over and over again until you get a better output.

Ejaaz:
And I guess this is the same concept as inference or reinforcement learning

Ejaaz:
where like we've found this trend of post-training of these AI models,

Ejaaz:
these AI models just getting smarter, not because they've got bigger GPUs or more expensive GPUs.

Ejaaz:
It's because you've just taken the same model and you've just run it through

Ejaaz:
a different reasoning framework over and over again until it can do a thing.

Ejaaz:
And this is the practical embodiment of it. I personally haven't found like

Ejaaz:
an obvious use case for loops either.

Ejaaz:
So either you and I are boxing ourselves into a particular realm and maybe someone

Ejaaz:
listening to this is using this for like their software engineering thing or

Ejaaz:
their marketing thing. But yeah, I guess that's where I sit right now.

Josh:
Well, I think it's probably a skill issue on both our parts.

Josh:
Like there is certainly a use case for us in which we can use a loop in which

Josh:
we can define this outcome, send an agent off to go do it, and it will iterate

Josh:
on itself until it comes to a conclusion.

Josh:
I think it's just so novel and so new. It's difficult to kind of understand

Josh:
why. And we have this really great chart on screen that you're showing now,

Josh:
which is the why now section of this.

Josh:
And it's because the duration of a task that these agents can run is so much

Josh:
longer than it used to be.

Josh:
I mean, in 2019 we have here, it was two seconds. This was well before ChatGPT.

Josh:
But even early last year, in 2025, the duration that an agent could run on one single task was

Josh:
less than an hour in length. So there's only so many tokens it could generate,

Josh:
there's only so much reasoning it could do, and there's only so much iteration

Josh:
you could get over that hour, let alone how much

Josh:
these tokens are going to be

Josh:
costing you if you're using the API or anything like that. Now fast forward

Josh:
to today. The best models in the world are getting days' worth of runtime,

Josh:
so they can really think deeply and continue to iterate on themselves over and over. I see examples of

Josh:
backslash goal on X all the time, of people who have a problem, whether it be

Josh:
an optimization problem or a bug that they need to fix, and they'll

Josh:
put this backslash goal on it for

Josh:
however long it needs, and it'll think for three, four, even five days, I've

Josh:
seen, in order to optimize for the specific parameter. And this is possible because

Josh:
these models can now think for days.

Josh:
You have to assume months is coming. What does it look like when an agent can think for months?

Josh:
I mean, it's a really interesting paradigm shift, and I'm not sure where people

Josh:
are going to find value in the open-ended way that it exists today,

Josh:
right? It's like, okay, here's this agent.

Josh:
You can tell to do whatever you want. You can create a loop.

Josh:
You can create an infrastructure system for it to operate in.

Josh:
It's pretty much open-ended and it's on you. And I think the answer to that

Josh:
is that not even the AI companies really understand the best use cases for it quite yet.

Josh:
I would imagine it's still this really difficult thing of how do you unlock

Josh:
value from essentially an open-ended agent that can go and run for an infinite

Josh:
amount of time? I don't know.

Ejaaz:
I also question like what a human's purpose would be at that point.

Ejaaz:
Like if you automate enough of the thinking and the curiosity behind like solving

Ejaaz:
particular problems, What do humans end up doing at that point,

Ejaaz:
especially if they don't do the work themselves?

Ejaaz:
They don't understand it, right? You need an AI to kind of like understand what

Ejaaz:
on earth is going on in the first place.

Ejaaz:
And eventually like an AI will then start setting goals, like more ambitious

Ejaaz:
goals than a human can in terms of like what to like kind of solve or go after.

Ejaaz:
There were some very low-level examples that I saw in response to Peter Steinberger's tweet about loops.

Ejaaz:
And there's some kind of concrete examples that I want to run through very quickly here.

Ejaaz:
So one of them is using it for code, right? So a classic loop could basically

Ejaaz:
look like, okay, can you please pull live errors for my particular app?

Ejaaz:
Can you inspect and figure out where the bug might particularly be?

Ejaaz:
Can you create then a fix for this particular bug in my code?

Ejaaz:
And then can you deploy it? Then can you check the health of that deployment

Ejaaz:
and make sure that nothing else is broken?

Ejaaz:
And then record what failed and feed that into a database so that in the future,

Ejaaz:
we can detect errors like this or prevent it when we code and build some of

Ejaaz:
these future app features.
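
The steps just listed (pull errors, find the bug, write a fix, deploy, check health, record what failed) can be sketched as a small Python loop. This is a minimal sketch under stated assumptions, not how any particular product implements loops: every name here (`run_fix_loop`, `LoopResult`, `FakeApp`, `fake_model`) is hypothetical, the model call is replaced by a stub, and the "database" of failures is just an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoopResult:
    fixed: bool
    attempts: int
    failure_log: List[str] = field(default_factory=list)


def run_fix_loop(
    pull_errors: Callable[[], List[str]],
    propose_fix: Callable[[str, List[str]], str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
    max_attempts: int = 5,
) -> LoopResult:
    """Pull errors, propose a fix, deploy, check health, record failures; repeat."""
    failure_log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        errors = pull_errors()
        if not errors:
            return LoopResult(True, attempt - 1, failure_log)
        patch = propose_fix(errors[0], failure_log)  # past failures are fed back in
        deploy(patch)
        if is_healthy():
            return LoopResult(True, attempt, failure_log)
        failure_log.append(f"attempt {attempt}: {patch!r} did not fix {errors[0]!r}")
    return LoopResult(False, max_attempts, failure_log)


class FakeApp:
    """Hypothetical app that is healthy only once one specific patch is deployed."""

    def __init__(self, good_patch: str) -> None:
        self.good_patch = good_patch
        self.deployed: Optional[str] = None

    def pull_errors(self) -> List[str]:
        return [] if self.deployed == self.good_patch else ["TypeError in checkout"]

    def deploy(self, patch: str) -> None:
        self.deployed = patch

    def is_healthy(self) -> bool:
        return self.deployed == self.good_patch


def fake_model(error: str, failure_log: List[str]) -> str:
    # Stand-in for an LLM call: tries a new patch after each recorded failure.
    return f"patch-{len(failure_log) + 1}"


app = FakeApp(good_patch="patch-3")
result = run_fix_loop(app.pull_errors, fake_model, app.deploy, app.is_healthy)
print(result.fixed, result.attempts, len(result.failure_log))
```

Passing the failure log back into `propose_fix` is what separates this from a blind retry: each attempt can see what has already been tried.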

Ejaaz:
Now, that is kind of like a very small and specific enough use case that can

Ejaaz:
be generalized across basically any app or software engineering project that

Ejaaz:
you might be working on if you're listening to this.

Ejaaz:
And I wonder how many hours worth of engineering time that this replaces.

Ejaaz:
Because I know, having worked at companies and

Ejaaz:
been a product manager in the past, that there are entire teams of software engineers that

Ejaaz:
spend their entire days working on something like that. So that's one thing.

Ejaaz:
And then for content, which is very applicable for product managers,

Ejaaz:
or even like the work that you and I do, Josh, an agent could read a PRD,

Ejaaz:
which is a product requirements doc, usually created

Ejaaz:
for a strategic goal that you want to build at your company,

Ejaaz:
like a product or a feature. It then writes whatever that next asset could be.

Ejaaz:
So it could be like a design profile or a mockup of what that feature might

Ejaaz:
look like, score it against like some kind of criteria that the company has

Ejaaz:
across like, you know, it must follow our vision, A, B, and C.

Ejaaz:
It must also look a particular way. This is our design profile,

Ejaaz:
our brand kind of profile and our aesthetic.

Ejaaz:
And then it kind of like updates its progress depending on like what other teams

Ejaaz:
have shipped. So maybe it's dependent on a particular feature.

Ejaaz:
And so it updates itself autonomously like that. Now, this all sounds very vague

Ejaaz:
intentionally because it's meant to apply to your particular business or your particular project.
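
To make the content example a little less vague, here is a minimal sketch of the "write it, score it against criteria, revise" cycle described above. Everything in it is an assumption for illustration: `score`, `refine_until_pass`, `fake_writer`, and the two rubric checks are hypothetical, and a real setup would replace the stub writer with a model call and the string checks with whatever the company's criteria actually are.

```python
from typing import Callable, Dict, List, Tuple

Rubric = Dict[str, Callable[[str], bool]]


def score(draft: str, rubric: Rubric) -> Tuple[float, List[str]]:
    """Return the fraction of rubric checks passed and the names of failed checks."""
    if not rubric:
        raise ValueError("rubric must contain at least one check")
    failed = [name for name, check in rubric.items() if not check(draft)]
    return (len(rubric) - len(failed)) / len(rubric), failed


def refine_until_pass(
    write: Callable[[List[str]], str],
    rubric: Rubric,
    threshold: float = 1.0,
    max_rounds: int = 4,
) -> Tuple[str, float, int]:
    """Draft, score, and redraft with the failed checks as feedback."""
    feedback: List[str] = []
    draft, current = "", 0.0
    for round_no in range(1, max_rounds + 1):
        draft = write(feedback)
        current, failed = score(draft, rubric)
        if current >= threshold:
            return draft, current, round_no
        feedback = failed
    return draft, current, max_rounds


# Hypothetical criteria standing in for "follows our vision" and "matches our brand".
rubric: Rubric = {
    "mentions_vision": lambda d: "vision" in d,
    "uses_brand_color": lambda d: "brand blue" in d,
}


def fake_writer(feedback: List[str]) -> str:
    # Stand-in for a model call: addresses whichever checks failed last round.
    draft = "Mockup: one-tap checkout."
    if "mentions_vision" in feedback:
        draft += " Supports our vision of frictionless payments."
    if "uses_brand_color" in feedback:
        draft += " Primary color: brand blue."
    return draft


final_draft, final_score, rounds = refine_until_pass(fake_writer, rubric)
print(final_score, rounds)
```

This only works when the criteria can be checked mechanically, which is the limitation Josh raises earlier about subjective, creative work.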

Ejaaz:
But make no mistake, this is what

Ejaaz:
a lot of humans are paid upwards of six figures to do on a daily basis.

Ejaaz:
It's that nuance. And we're starting to see basically AI models and AI agents

Ejaaz:
enter into that human taste profile. So when I think about where we end up eventually,

Ejaaz:
There's a common argument that's made that it's like, oh, humans will always have the taste.

Ejaaz:
They'll always be able to kind of direct where the AI should go because we are

Ejaaz:
this all being kind of like smart kind of entity.

Ejaaz:
But I see increasingly AI stepping into that boundary and becoming the tastemaker

Ejaaz:
for all of the work that we end up doing.

Josh:
I still believe that to be true, that humans in the loop are critically important

Josh:
to applying human taste. I saw this great chart. I have no idea where it is.

Josh:
Somewhere in the depths of X. But basically, it was showing that in the App

Josh:
Store, the iOS App Store, where everyone downloads their apps,

Josh:
the amount of apps that have gone into production that have been published recently has gone vertical.

Josh:
I think it's doubled or tripled over the last six months. Everybody's publishing apps at the App Store.

Josh:
The amount of five-star reviews and the amount of downloads has actually either

Josh:
stayed flat or gone down.

Josh:
It has not matched the amount of new apps that are going to the app store.

Josh:
Why is this? It's because a lot of the apps don't have enough care applied to

Josh:
them. They're just not great applications. And when I think about

Josh:
how I use my phone on a regular day or how I use my laptop

Josh:
and the applications that I actually spend time on, there's a very fixed set

Josh:
of them. And I'm a little stubborn when it comes to downloading new ones because

Josh:
a lot of the new ones just are not great.

Josh:
And I think a lot of that comes from this, this lack of care that is presented

Josh:
from AI outputs, where if you're optimizing for a specific parameter that you

Josh:
can measure, it's going to do it great, but it doesn't understand the subtle nuances of how humans

Josh:
engage and how they really love to use these products. One of the products

Josh:
that I use, totally unrelated, totally not sponsored, is this app called Copilot

Josh:
Money. It's a budgeting application, and it's so thoughtfully curated and designed.

Josh:
And it really deeply understands all the complexities that are related to humans

Josh:
when it comes to budgeting. It understands a lot of

Josh:
the design characteristics. Same with an app called Flighty. I'm sure a lot of

Josh:
people have heard of Flighty. It's a flight tracking application. There's a

Josh:
thousand ways to track a flight, but Flighty really cares about design. They really care about how

Josh:
it's implemented with the human, and they've created this amazing output. And

Josh:
I don't see that changing. One thing that I did want to note is that

Josh:
I think when a lot of people see this, they imagine a world in which they are

Josh:
getting replaced. Everyone's like, AI is replacing me, look how much it can do

Josh:
now, it has these loops. And I think the reality is it gives you a lot more agency

Josh:
to do the things you want to do,

Josh:
where maybe you're not doing the day-to-day that

Josh:
you would normally prompt an agent to do, but you're doing a lot of the

Josh:
higher-level tasks. You can imagine yourself not having to do

Josh:
the day-to-day. For example, if you're just managing your household, you no

Josh:
longer have to take out the trash, you don't have to run errands. You could just

Josh:
focus on how to make your household the best household it can be, because you have

Josh:
that higher-level ability.

Josh:
And in that chart that we showed in the artifact earlier on, it shows a human

Josh:
decreasing in size. That's the amount of input that is needed from a human to get the output you want.

Josh:
But it's still ultimately on the human being to push and navigate

Josh:
towards the outputs that you want, because ultimately these tools are just for us. So when I think of

Josh:
AI becoming increasingly good, and when it comes to running the show, even I've

Josh:
leaned on it, we both have, I think, a lot more recently. But all that's done is

Josh:
actually given us more leverage to do more with the show rather than have it replace

Josh:
us. And even in the case that

Josh:
we could clone ourselves, we could create a video version of ourselves that

Josh:
has a perfect voice that sounds just like us, I don't think people actually want that.

Josh:
There's that lacking human nature that still isn't understood.

Josh:
And I find that it's more empowering when I hear that these loops exist that

Josh:
can run for days on end and create amazing outputs versus not where it's kind

Josh:
of extracted from us. I don't really think that's true.

Ejaaz:
Yeah, it's like that stat of, well, it's that thesis that everyone held about

Ejaaz:
a year ago, which is like with the increase of AI adoption, people will have

Ejaaz:
more free time to have fun and leisure.

Ejaaz:
And in fact, the opposite has shown that like people just work way more and work harder.

Ejaaz:
And the output of that work is measured across like pretty much every single

Ejaaz:
company and profession and role.

Ejaaz:
I do generally agree with that. I don't think humans are going to get wiped out anytime soon.

Ejaaz:
But one thing that is kind of nagging my brain is if we extrapolate this intelligence out enough,

Ejaaz:
there is no reason why AI won't be able to take over or replace other parts

Ejaaz:
of the cognitive process that a human can do,

Ejaaz:
particularly if it's one AI model trained on the entire corpus of knowledge

Ejaaz:
that a bunch of humans have been guiding it.

Ejaaz:
So when I think about Anthropic, when I think about OpenAI, I think about all the

Ejaaz:
millions of people that use their product every single day and the data that

Ejaaz:
they ingest every single day that gets recorded on one singular database that

Ejaaz:
can then be reused to train a better model that is more hyper-optimized towards humans.

Ejaaz:
You could argue that as a single human, you don't get to meet and read the thoughts

Ejaaz:
of every other human that is out there.

Ejaaz:
You have your very own individual process. And I think that an AI model that

Ejaaz:
can get access to the world's brain and thoughts could probably create something

Ejaaz:
kind of close to knowing what that human taste profile would be.

Ejaaz:
The other major question that I'm wondering is, how much is all of this going to cost?

Ejaaz:
One stat that has stuck in my head over the past few weeks is about

Ejaaz:
Anthropic in particular. Of

Ejaaz:
the Fortune 10, the top 10 companies in the world, nine of them use Claude,

Ejaaz:
and their budgets have increased by 500%, or are projected to increase by 500%, by the end of this year.

Ejaaz:
And they're doing this willingly because the ROI, the value that they're getting

Ejaaz:
out of that is pretty massive.

Ejaaz:
Alternatively, there are companies like Uber that have slashed their budgets

Ejaaz:
massively because their entire year's budget was spent in a couple of months.

Ejaaz:
So I'm wondering, in this world of agent loops where you've got AIs working

Ejaaz:
overnight for you, the bills are going to increase pretty massively.

Ejaaz:
And I'm wondering, unless these AI models get cheaper,

Ejaaz:
and there's an infrastructure bottleneck there where these GPUs cost a lot of

Ejaaz:
money, we can't scale power and infrastructure anytime soon.

Ejaaz:
We need so much more energy than we already have currently on Earth to be able to power these things.

Ejaaz:
The cost of these things is just going to go up massively,

Ejaaz:
which means that either this is only going to be a power or a tool reserved

Ejaaz:
for the rich, or something's going to break here and maybe open source models

Ejaaz:
get adopted more aggressively.

Josh:
Yeah, I imagine there's probably use cases for all of the above.

Josh:
Open source models will continue to improve. They'll be able to do

Josh:
a lot of the more trivial tasks that don't require frontier intelligence, so

Josh:
the cost of those types of loops will go down, because not everyone

Josh:
needs to have the most cutting-edge

Josh:
software engineering stack. They're just having it help them through

Josh:
their day-to-day. Maybe it's replying to emails, maybe it's whatever miscellaneous things it may be.

Josh:
There's a high probability that these open source models, as they continue to

Josh:
improve, will be able to bite off a meaningful chunk of that. Then the other half

Josh:
is using these frontier models, which is a requirement in order to get the absolute

Josh:
best results for whatever very challenging work they're doing.

Josh:
And that is going to cost a lot of money for sure.

Josh:
And I don't see that changing, but I think the output per dollar in will continue to go up.

Josh:
It's because as you get more knowledge per token, as you get more output per

Josh:
prompt, it very clearly, I mean, the economics seem to make sense.

Josh:
And I think that's kind of where things stand right now with

Josh:
enterprise spend on these models. They're trying to figure out, well, how much

Josh:
value can we actually get back from every dollar spent? And right now it's a

Josh:
little bit unsure. You mentioned Uber. We have Uber here that we're showing on screen,

Josh:
where Uber just recently put a cap on the amount of tokens that

Josh:
their employees are allowed to use, at $1,500 per engineer per tool per month. And

Josh:
we'll see how that works, because a lot of other companies that we know are

Josh:
kind of giving their engineers unlimited budget. In fact, they're kind of ranking

Josh:
the engineers based on how many tokens they're using per month.

Josh:
We'll see where that goes. I suspect the companies that are spending more on

Josh:
tokens will continue to see a higher upside for now, at least.
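
To get a feel for what a cap like the $1,500 per engineer per tool per month mentioned above buys, here is a back-of-the-envelope sketch. Only the $1,500 figure comes from the episode. The token price and the tokens-per-hour rate are placeholder assumptions chosen to make the arithmetic easy to follow, not real pricing, and `loop_hours_within_cap` is a hypothetical helper.

```python
def loop_hours_within_cap(
    monthly_cap_usd: float,
    usd_per_million_tokens: float,
    tokens_per_hour: float,
) -> float:
    """Hours of continuous loop runtime a monthly budget covers at a flat token price."""
    tokens_affordable = monthly_cap_usd / usd_per_million_tokens * 1_000_000
    return tokens_affordable / tokens_per_hour


# $1,500 cap is from the episode; the other two numbers are ASSUMED for illustration.
hours = loop_hours_within_cap(
    monthly_cap_usd=1500,
    usd_per_million_tokens=10.0,  # assumed blended price
    tokens_per_hour=2_000_000,    # assumed consumption of one running loop
)
print(f"{hours:.0f} hours of loop runtime per month")
```

With these placeholder numbers the cap covers roughly three days of continuous runtime, which shows how quickly a multi-day loop could use up a monthly budget. Different prices or consumption rates would change the result proportionally.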

Josh:
But like you mentioned, the underlying problem with all of this is we're going

Josh:
to continue to have more prompts. I mean, these loops consume a tremendous amount

Josh:
of tokens, whether they're frontier tokens or open source tokens.

Josh:
It doesn't matter. We're going to need orders of magnitude more than we have.

Josh:
And we don't have the compute. It really does always come down to that

Josh:
energy problem, that infrastructure problem.

Josh:
We don't have the infra built out to support this, so the costs will likely

Josh:
continue to stay high. Maybe it's not because you're paying the provider for tokens;

Josh:
perhaps it's just renting the GPU time from a cluster that is doing much more

Josh:
valuable work. So I think that might ultimately be

Josh:
the crux: the actual availability of the compute to do these things. And that's

Josh:
why these edge compute devices, like having a

Josh:
Mac Studio on your desktop that can run models locally, are probably a pretty valuable thing to have.

Ejaaz:
So I'm sure a lot of you are wondering, you know, how does this apply to me?

Ejaaz:
You know, none of my friends have mentioned this loop feature.

Ejaaz:
I don't really know many people who are using it.

Ejaaz:
As we mentioned earlier, this probably isn't going to be used by the bulk

Ejaaz:
or majority of people yet until some of those use cases actually arise.

Ejaaz:
I think it's mainly going to happen in the workplace. It's going to happen with

Ejaaz:
like some of these enterprise companies that are trying to automate certain

Ejaaz:
departments or functions of their particular company, like marketing,

Ejaaz:
like software engineering.

Ejaaz:
And I think it'll start with lower level tasks because these agents still aren't

Ejaaz:
smart enough to understand nuance completely.

Ejaaz:
And also, you don't just want to let an agent run loose overnight whilst you're

Ejaaz:
sleeping and then take down your entire company. And one place where it's working

Ejaaz:
tirelessly to accelerate the development of that

Ejaaz:
And we have Boris Cherny over here basically explaining how he's basically ditched

Ejaaz:
his integrated development environment.

Ejaaz:
He has ditched all of his normal tools that he had spent decades basically honing

Ejaaz:
his software engineering skill on to now completely focus on building up these

Ejaaz:
agent loops. And what is he focused on?

Ejaaz:
Well, he works primarily on Claude Code, but the other folks at Anthropic and

Ejaaz:
OpenAI have started this thing called

Ejaaz:
Recursive self-improvement or RSI, which is basically the goal of getting your

Ejaaz:
AI model to build the next version of itself.

Ejaaz:
And this is a test that Anthropic and the folks at OpenAI do for any new model that they release.

Ejaaz:
They set it a goal or task to basically rebuild itself in a more improved fashion.

Ejaaz:
Now, one thing that the AI has gotten really good at is building out that next function.

Ejaaz:
But one thing it's not very good at is figuring out what research problems it

Ejaaz:
should fix, what research problems it should focus on to try and,

Ejaaz:
you know, overcome and make it ultimately, you know, a better model than its competitors.

Ejaaz:
Now, RSI is something, it's kind of like the golden egg that each AI lab is going after.

Ejaaz:
And this is the primary use of agent loops right now.

Ejaaz:
And you can see why it might be obvious. If you have an AI model that can basically

Ejaaz:
build the next best version of itself, eventually you're going to get to AGI,

Ejaaz:
whatever the hell that looks like, and then you can apply it to pretty much any sector.

Ejaaz:
Now, the problem and the worry that kind of immediately pops into my head and

Ejaaz:
a lot of these researchers' heads is, if it eventually does get that smart, right,

Ejaaz:
it could escape human control completely and run off on its own and do its own

Ejaaz:
thing. Because at that point, why would it need a human to kind of like guide it or shepherd it?

Ejaaz:
Instead, it can just kind of like do its own thing. So this is like the primary

Ejaaz:
use case that I'm seeing for agent loops being worked on right now.

Ejaaz:
I would love to see, like, a broader application across like kind of like

Ejaaz:
consumer professions, like in finance, like in science and stuff like that,

Ejaaz:
which I do believe it'll spill over eventually. But unless you're seeing anything

Ejaaz:
else, Josh, I think like that is primarily it on agent loops and agent autonomy.

Josh:
It's on you to figure out the best use cases for it. Like there's no real company

Josh:
defining it. They're just giving you the tools. And I mean, for better or worse,

Josh:
it's very open-ended. So it's on you to figure out how best to use these.

Josh:
I think if this sounds a little overwhelming, maybe we could outline a few examples

Josh:
of each one of these kind of four rungs in the ladder here.

Josh:
The first one being prompting. This everyone has done before.

Josh:
I'm sure it's like rewrite this

Josh:
email to sound more confident or explain what my doctor meant by this.

Josh:
But then you've probably also used the partial agentic usage as well of these

Josh:
models, which is like: plan my three-day vacation to Lisbon that I'm going on next week.

Josh:
And it will actually go off and use tools and it will think complex thoughts

Josh:
and ideas and kind of surface you a full itinerary for your trip.

Josh:
And then there's the third one, which is the harness. This is a little more

Josh:
complicated. This is for people who are building more project based stuff.

Josh:
So for example, if you want it to build you a website for your dog walking business

Josh:
and you kind of describe it and you go back and forth on a spec and then it

Josh:
goes off and implements that.

Josh:
And the fourth is loops, which doesn't have to necessarily be overwhelming.

Josh:
It can be as simple as, let's say you are

Josh:
interested in the news, you could say: every morning before I wake up,

Josh:
scan these 10 sources plus market data and give me this bulleted brief.

Josh:
Or let's say you have a to-do list. It'll go off and think overnight and solve

Josh:
all those problems overnight, iteratively until it comes to a solution that

Josh:
it hopefully arrives at in the mornings. There's a lot of use cases.

Josh:
I think a lot of it requires creativity.

Josh:
And that is the prompt we will leave you with today, which is share with us,

Josh:
please, how you are using these models best.

Josh:
Because so much of the question isn't are these models smart?

Josh:
It's how can I extract that intelligence from them in the most effective way

Josh:
for my life? So I would be so curious to hear which rung of the ladder you find

Josh:
yourself on one through four.

Josh:
And then what the most interesting use cases you've found

Josh:
among those rungs of the ladder are. Are you using loops currently? What are you using them for?

Josh:
Are you with agents? Are you still using it as a Google extension? If you're still

Josh:
using it as a Google extension, I would encourage a little more creativity. Really

Josh:
try to ask harder questions and figure out how it could be implemented in your

Josh:
life. But I think that's pretty much it on the loop.

Josh:
You're not going anywhere, but your job might shift a little bit in terms of

Josh:
scope as these tools get more powerful. And that should be the hope, that should

Josh:
be the goal, because it'll allow you to do so much more that you want to accomplish, I believe.

Josh:
And yeah, I think that's where we'll leave you with today.

Ejaaz:
Thank you folks so much for listening. Similar to Josh's prompts,

Ejaaz:
I'm actually kind of curious, for one singular task that you've used your AI

Ejaaz:
for, what is the most number of tokens that you've burnt?

Ejaaz:
Be honest, it can be for any use case, doesn't matter, let us know.

Ejaaz:
And also, what is the longest that you've had an AI work on a particular task

Ejaaz:
for? Is it a couple of minutes? Is it hours? Is it potentially overnight?

Ejaaz:
Let us know. I'm curious. And what was the associated bill with that?

Ejaaz:
And yeah, we'll see you on the next episode. Wherever you listen to us,

Ejaaz:
if you haven't subscribed, if you haven't rated us, if you're not leaving us

Ejaaz:
comments, what are you doing?

Ejaaz:
We respond to pretty much any and every one of them. We listen to your feedback.

Ejaaz:
It feeds into some of the work and content that we put out.

Ejaaz:
We are almost hitting 60,000 of you folks. And you guys are reading our newsletter,

Ejaaz:
which, like, goes out to about 100,000-plus people every single week. We post twice a week.

Ejaaz:
But yeah, wherever you are, please subscribe to us, leave us a comment,

Ejaaz:
and we'll see you on the next one.

Josh:
See you guys next time.

Ejaaz:
Peace.