1
00:00:00,020 --> 00:00:03,990
Ejaaz:
99% of people are using AI models the same way that they use Google.

2
00:00:03,990 --> 00:00:08,140
Ejaaz:
But recently, a new way of prompting your AI has emerged that doesn't just replace

3
00:00:08,140 --> 00:00:12,530
Ejaaz:
the way that you work, it promotes you to the CEO of your very own AI company.

4
00:00:12,730 --> 00:00:16,520
Ejaaz:
It's called Loops and it's part of a growing development in agent autonomy where

5
00:00:16,520 --> 00:00:21,160
Ejaaz:
AI agents basically spin up and autonomously complete tasks or goals that you

6
00:00:21,160 --> 00:00:23,080
Ejaaz:
set for them, often working throughout the night.

7
00:00:23,200 --> 00:00:28,610
Ejaaz:
In 2019, the longest that an AI agent could work autonomously was two seconds.

8
00:00:28,800 --> 00:00:32,150
Ejaaz:
Fast forward to today, and they can work autonomously for 12 hours,

9
00:00:32,150 --> 00:00:34,200
Ejaaz:
and that's doubling every couple of months.
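
As a rough check on the figures just quoted, the arithmetic can be sketched as below. The start and end durations come from the episode; the end year is an assumption made only for illustration, and the function names are hypothetical.

```python
import math

# Durations quoted in the episode.
START_SECONDS = 2             # 2019: about two seconds of autonomous work
END_SECONDS = 12 * 60 * 60    # "today": about 12 hours
START_YEAR = 2019
END_YEAR = 2026               # ASSUMPTION: year of recording, not stated at this point

def doublings(start: float, end: float) -> float:
    """Number of times `start` must double to reach `end`."""
    return math.log2(end / start)

def implied_doubling_months(start: float, end: float, months_elapsed: float) -> float:
    """Average months per doubling over the whole period."""
    return months_elapsed / doublings(start, end)

n = doublings(START_SECONDS, END_SECONDS)
months = (END_YEAR - START_YEAR) * 12
avg = implied_doubling_months(START_SECONDS, END_SECONDS, months)
print(f"{n:.1f} doublings, one every {avg:.1f} months on average")
```

Under these assumptions the average comes out to roughly one doubling every six months across the whole period, so "every couple of months" would have to describe the recent pace rather than the long-run average.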

10
00:00:34,200 --> 00:00:38,290
Ejaaz:
Andrej Karpathy calls this phenomenon the autonomy slider, where you can take

11
00:00:38,290 --> 00:00:43,010
Ejaaz:
a dial that slides from humans that approve everything to humans that periodically check in.

12
00:00:43,010 --> 00:00:47,300
Ejaaz:
And it's part of this growing trend of agents consuming and taking up more of

13
00:00:47,300 --> 00:00:52,480
Ejaaz:
human capital and labor. And the question that remains going forwards is, what will humans do?

14
00:00:52,690 --> 00:00:54,510
Ejaaz:
And will they be entirely replaced

15
00:00:54,510 --> 00:00:58,000
Ejaaz:
by AI? Or will they be the ultimate orchestrator of their destiny?

16
00:00:58,300 --> 00:01:02,030
Josh:
Yeah, I think the goal for this episode is really just to inform people on what's

17
00:01:02,030 --> 00:01:06,030
Josh:
possible current day with these agents, with these LLMs, with writing these

18
00:01:06,030 --> 00:01:10,360
Josh:
loops, as well as where you can possibly find yourself within that stack,

19
00:01:10,360 --> 00:01:11,960
Josh:
because it gets pretty complicated.

20
00:01:11,960 --> 00:01:14,600
Josh:
When we're getting into loops, not everyone needs to use loops,

21
00:01:14,600 --> 00:01:19,040
Josh:
but everyone should probably be using LLMs slightly differently from how they're using them today.

22
00:01:19,080 --> 00:01:23,810
Josh:
So maybe we could start with a little history lesson in terms of the four levels

23
00:01:23,810 --> 00:01:28,990
Josh:
in which we have been engaging with LLMs, starting with the first level, which is just prompting.

24
00:01:29,480 --> 00:01:33,800
Josh:
Generally, most people are probably still doing this. It started in 2022, 2023,

25
00:01:34,070 --> 00:01:35,480
Josh:
around the release of ChatGPT.

26
00:01:35,840 --> 00:01:39,160
Josh:
The way that you would engage with these LLMs is you would just submit a question

27
00:01:39,160 --> 00:01:43,020
Josh:
or you submit a prompt and you get some language back. Now, if you are still doing

28
00:01:43,020 --> 00:01:45,750
Josh:
this, that's okay, because I find a.

29
00:01:50,260 --> 00:01:53,640
Josh:
Three years ago, four years ago. It has advanced pretty,

30
00:01:54,070 --> 00:01:55,170
Josh:
pretty meaningfully since then.

31
00:01:55,450 --> 00:01:58,520
Josh:
The second step of this is agents. And we're going to spend some time on agents.

32
00:01:59,110 --> 00:02:01,970
Josh:
Everyone's kind of heard of an agent. Maybe not everyone knows what an agent is.

33
00:02:02,190 --> 00:02:05,170
Josh:
An agent is something that can think for a little bit longer.

34
00:02:05,170 --> 00:02:08,180
Josh:
It can run a bit longer than just a standard prompt.

35
00:02:08,460 --> 00:02:10,980
Josh:
It can go off and do things. It can call tools for you.

36
00:02:11,630 --> 00:02:15,380
Josh:
It's a much more capable version of the text box. Then, like we talked about

37
00:02:15,380 --> 00:02:17,840
Josh:
all the time on the show in the last few weeks, there's the harness

38
00:02:18,050 --> 00:02:23,170
Josh:
feature in which you put an LLM into a container and that gives it a memory feature.

39
00:02:23,170 --> 00:02:26,450
Josh:
That gives it complete tool use. That's something like OpenClaw, which we've

40
00:02:26,450 --> 00:02:29,160
Josh:
talked about a lot and which some people do use, and that's level three.

41
00:02:29,160 --> 00:02:32,140
Josh:
And now level four, which is the new thing that has come this week,

42
00:02:32,410 --> 00:02:36,200
Josh:
that's really been highlighted by some of the top leaders at these AI labs, is loops.

43
00:02:36,460 --> 00:02:41,770
Josh:
And a loop is essentially a version of an agent that has an orchestration layer

44
00:02:41,770 --> 00:02:43,570
Josh:
and kind of builds upon itself.

45
00:02:43,570 --> 00:02:47,390
Josh:
So it allows you to kind of continue to scope yourself out. If you can imagine

46
00:02:47,390 --> 00:02:50,950
Josh:
you're kind of dealing directly with an employee at level one, and then

47
00:02:50,950 --> 00:02:54,280
Josh:
you're kind of directing that person to go off and do their own thing in level two.

48
00:02:54,770 --> 00:02:58,770
Josh:
At level three with the harness, you're kind of directing a series of people to help you.

49
00:02:59,020 --> 00:03:02,920
Josh:
And then level four, you're just the top level CEO who's directing your C-suite

50
00:03:02,920 --> 00:03:06,280
Josh:
to go and manage all the employees below you. So there's an entire stack to this.

51
00:03:06,510 --> 00:03:10,420
Josh:
It's very cool. Ejaaz, how do you use your AI currently? Where would you say

52
00:03:10,420 --> 00:03:11,750
Josh:
that you fit in this stack?

53
00:03:12,000 --> 00:03:16,020
Ejaaz:
Yeah, so looking at this diagram that we have on the screen here,

54
00:03:16,020 --> 00:03:19,850
Ejaaz:
I'm somewhere between number two and number three. I'm somewhere between using

55
00:03:19,850 --> 00:03:22,870
Ejaaz:
agents and trying to figure out the whole harness thing.

56
00:03:23,210 --> 00:03:26,820
Ejaaz:
Now, what am I doing when it comes to like spinning up agents?

57
00:03:27,070 --> 00:03:31,250
Ejaaz:
If you look at either my Claude or my ChatGPT desktop apps right now,

58
00:03:31,910 --> 00:03:37,830
Ejaaz:
I've renamed a bunch of my conversations to a particular focus or subject and then agent after it.

59
00:03:38,170 --> 00:03:42,480
Ejaaz:
And so I can go to it and this agent basically has all the context of what I

60
00:03:42,480 --> 00:03:44,950
Ejaaz:
wanted to do, whether it's like research a particular topic,

61
00:03:45,230 --> 00:03:48,870
Ejaaz:
create some kind of an outline for something, research a particular investment angle.

62
00:03:49,010 --> 00:03:52,290
Ejaaz:
It already knows and has the embedded context for what it needs to do.

63
00:03:52,530 --> 00:03:58,030
Ejaaz:
And there's usually like one to maybe three tasks that it needs to autonomously execute on its own.

64
00:03:58,310 --> 00:04:02,230
Ejaaz:
And so it runs in kind of like a sequence. But if any of that sequence kind

65
00:04:02,230 --> 00:04:06,210
Ejaaz:
of breaks, let's say it kind of tries to retrieve data from some particular

66
00:04:06,210 --> 00:04:09,130
Ejaaz:
website and it is unable to do so, it breaks.

67
00:04:09,130 --> 00:04:13,080
Ejaaz:
And it comes to me and it says, hey, Ejaaz, is there some other thing that you

68
00:04:13,080 --> 00:04:15,030
Ejaaz:
want to look at or retrieve from, blah, blah, blah?

69
00:04:15,430 --> 00:04:19,930
Ejaaz:
It's not fully autonomous. Now, number three, the harness side of things, is

70
00:04:19,930 --> 00:04:22,540
Ejaaz:
what I'm trying to kind of like mold my understanding around.

71
00:04:22,860 --> 00:04:26,430
Ejaaz:
What I've noticed is when you type in a prompt and you get a response,

72
00:04:26,750 --> 00:04:30,770
Ejaaz:
you can kind of tell that it's AI-y. Like usually when we kind of create artifacts,

73
00:04:30,770 --> 00:04:34,490
Ejaaz:
it comes in a particular font or it speaks in a particular type of language.

74
00:04:34,780 --> 00:04:38,640
Ejaaz:
The harness helps take your prompt and kind of mold it into something

75
00:04:38,640 --> 00:04:43,580
Ejaaz:
that is more human-like, but also more nuanced with what you are trying to do.

76
00:04:43,580 --> 00:04:44,730
Ejaaz:
Like it effectively gets

77
00:04:45,090 --> 00:04:49,010
Ejaaz:
closer towards that ultimate goal. Like we were talking before recording this

78
00:04:49,010 --> 00:04:53,130
Ejaaz:
episode about human taste and how AI doesn't really get human taste.

79
00:04:53,130 --> 00:04:57,690
Ejaaz:
The harness helps you get towards that ultimate kind of taste profile for the

80
00:04:57,690 --> 00:04:59,200
Ejaaz:
particular output that you're trying to generate.

81
00:04:59,970 --> 00:05:04,200
Ejaaz:
I haven't tried working with loops just yet, but my understanding of this,

82
00:05:04,200 --> 00:05:10,110
Ejaaz:
and correct me if I'm wrong, is you have an AI. You can prompt it and you can get some kind of output.

83
00:05:10,480 --> 00:05:15,740
Ejaaz:
A loop specifically is an AI agent that doesn't break. If it comes across an

84
00:05:15,740 --> 00:05:19,980
Ejaaz:
obstacle that it doesn't understand, its instinct isn't to come to the human

85
00:05:19,980 --> 00:05:22,930
Ejaaz:
and say, hey, like, I can't figure this out, guide me.

86
00:05:23,260 --> 00:05:28,070
Ejaaz:
It completely reiterates the prompt over and over again until it gets past that

87
00:05:28,070 --> 00:05:32,100
Ejaaz:
obstacle, working towards like one objective. So a few examples I've seen for

88
00:05:32,100 --> 00:05:34,730
Ejaaz:
this is if you are coding, right?

89
00:05:34,980 --> 00:05:38,860
Ejaaz:
And let's say there's multiple workflows of a code base that you want to work

90
00:05:38,860 --> 00:05:42,730
Ejaaz:
on, and it comes across a hiccup where it can't retrieve data from one of those

91
00:05:42,730 --> 00:05:45,860
Ejaaz:
particular flows, it is able to kind of navigate around it,

92
00:05:46,070 --> 00:05:49,000
Ejaaz:
maybe spin up its own separate flow and try to figure out the problem.

93
00:05:49,240 --> 00:05:53,040
Ejaaz:
And often this results in an agent working for multiple hours at a time,

94
00:05:53,040 --> 00:05:57,590
Ejaaz:
often overnight. I think Karpathy spoke about his auto research agent working

95
00:05:57,590 --> 00:05:58,730
Ejaaz:
overnight whilst he slept.

96
00:05:59,020 --> 00:06:02,280
Ejaaz:
And we're seeing different variations of this start to arise.

97
00:06:02,840 --> 00:06:04,430
Ejaaz:
Where are you, Josh, in the stack?

98
00:06:04,980 --> 00:06:09,340
Josh:
Yeah, loops are like a closed-loop system where you kind of define an

99
00:06:09,340 --> 00:06:13,590
Josh:
outcome and it will continue to work towards that outcome without any external inputs.
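
The idea just described (define an outcome, then keep working toward it with no external input) can be sketched as a small control loop. This is a minimal illustration, not any particular product's implementation: `run_loop`, `toy_attempt`, and `toy_verify` are hypothetical names, and the toy functions stand in for a real model call and a real check such as a test suite.

```python
from typing import Callable, Optional

def run_loop(goal: str,
             attempt: Callable[[str, list[str]], str],
             verify: Callable[[str], bool],
             max_iterations: int = 10) -> Optional[str]:
    """Keep attempting `goal` until `verify` accepts a result or the budget runs out."""
    notes: list[str] = []  # what was rejected so far, fed back into the next attempt
    for i in range(max_iterations):
        result = attempt(goal, notes)
        if verify(result):
            return result
        notes.append(f"attempt {i + 1} rejected: {result}")
    return None  # budget exhausted; only at this point would a human be asked to step in

# Toy stand-ins (hypothetical) for a model call and a verifiable check.
def toy_attempt(goal: str, notes: list[str]) -> str:
    return f"draft-{len(notes) + 1}"

def toy_verify(result: str) -> bool:
    return result == "draft-3"

outcome = run_loop("make the tests pass", toy_attempt, toy_verify)
print(outcome)  # draft-3
```

The important design choice is the `verify` function: the loop only makes sense when there is some check the agent can run on its own output, and the iteration cap keeps an unverifiable goal from running forever.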

100
00:06:13,810 --> 00:06:16,090
Josh:
It's very cool. It's very automated. I don't think it's for everyone.

101
00:06:16,090 --> 00:06:21,430
Josh:
It's certainly not for me because I haven't really had a use case for loops per se.

102
00:06:21,430 --> 00:06:26,290
Josh:
I would say I'm sitting at each one of those first three phases given whatever

103
00:06:26,290 --> 00:06:28,980
Josh:
tasks I'm trying to do. And I think it's important to understand that a lot

104
00:06:28,980 --> 00:06:33,140
Josh:
of people might not even need to go past number one unless you're actually doing productive work.

105
00:06:33,310 --> 00:06:37,780
Josh:
A lot of the agents, a lot of the harnesses are for kind of automating more

106
00:06:38,640 --> 00:06:41,920
Josh:
systems in your life. If you're just trying to use this as Google, if you're

107
00:06:41,920 --> 00:06:46,570
Josh:
just trying to use this as a writing assistant or someone to chat with, the prompting

108
00:06:46,570 --> 00:06:48,500
Josh:
is really strong. And I find a lot of times

109
00:06:48,900 --> 00:06:53,150
Josh:
this is my outlet for, like, Google search results. So instead

110
00:06:53,150 --> 00:06:56,700
Josh:
of searching Google, I'll get a little more in-depth results. I'll ask my LLM.

111
00:06:56,930 --> 00:07:00,090
Josh:
For agents, I use them quite a bit when I'm doing a little bit more productive

112
00:07:00,090 --> 00:07:04,050
Josh:
work. For example, we track the analytics on Limitless, and we want a place in

113
00:07:04,050 --> 00:07:06,470
Josh:
which we can have all those analytics dumped to a dashboard.

114
00:07:06,800 --> 00:07:08,100
Josh:
That is an agent that I run.

115
00:07:08,390 --> 00:07:12,850
Josh:
So it goes into my browser. It detects all of the views that we've had from

116
00:07:12,850 --> 00:07:16,660
Josh:
the week from YouTube, from Spotify, from the RSS feed, where you should all be subscribed

117
00:07:16,660 --> 00:07:18,010
Josh:
to and rate us five stars.

118
00:07:18,230 --> 00:07:21,810
Josh:
And it compiles it into a single spreadsheet, which we can then publish

119
00:07:21,810 --> 00:07:25,190
Josh:
online and we could share with prospective sponsors and things like that.

120
00:07:25,410 --> 00:07:28,410
Josh:
And then for harnesses I've used, because I mean, that's mostly OpenClaw.

121
00:07:28,590 --> 00:07:34,040
Josh:
I've used OpenClaw. I really enjoyed the process. I find myself using it a bit less and less.

122
00:07:34,460 --> 00:07:39,950
Josh:
And I think the loops feature, at least, is probably most productive right

123
00:07:39,950 --> 00:07:43,000
Josh:
now for people who are writing code, who are writing verifiable solutions.

124
00:07:43,260 --> 00:07:46,180
Josh:
One of the difficult things, as I was looking into loops and figuring out

125
00:07:46,180 --> 00:07:49,460
Josh:
how I can structure them into my life, one of the problems that I run into is

126
00:07:49,460 --> 00:07:51,430
Josh:
I'm not really sure I have a verifiable

127
00:07:51,910 --> 00:07:55,500
Josh:
set of outputs that I want to optimize for, for a lot of the work that I'm

128
00:07:55,500 --> 00:07:59,240
Josh:
doing, because a lot of it is subjective. A lot of it is kind of creative work.

129
00:07:59,440 --> 00:08:01,540
Josh:
It requires a human in the loop for a lot more of it.

130
00:08:01,920 --> 00:08:06,400
Josh:
So I would say I am number one, two, and three on the list. Haven't quite made my way to four.

131
00:08:07,300 --> 00:08:11,010
Josh:
But yeah, for the people who are, those are the people like Boris Cherny from

132
00:08:11,010 --> 00:08:16,020
Josh:
Anthropic. And we know Andrej and Peter Steinberger from OpenAI.

133
00:08:16,020 --> 00:08:18,640
Josh:
They are all on four. They are using it to

134
00:08:19,390 --> 00:08:23,250
Josh:
create these, like, unbelievable agentic systems and continue to remove themselves from the loop.

135
00:08:23,580 --> 00:08:28,150
Ejaaz:
You know what I've realized? With loops in particular and just AI agents in

136
00:08:28,150 --> 00:08:35,050
Ejaaz:
general, they're trying to improve our understanding or rather their understanding

137
00:08:35,050 --> 00:08:36,220
Ejaaz:
of the English language.

138
00:08:36,220 --> 00:08:40,710
Ejaaz:
So one of my favorite Karpathy quotes back in the day was English is the new

139
00:08:41,110 --> 00:08:44,020
Ejaaz:
programming language. I think he said this like two, two and a half years ago.

140
00:08:44,780 --> 00:08:49,180
Ejaaz:
And I've just realized that like us creating AI agents is basically like,

141
00:08:49,180 --> 00:08:52,170
Ejaaz:
it's the same model. It hasn't necessarily got smarter.

142
00:08:52,380 --> 00:08:56,260
Ejaaz:
It's just like using that model to kind of like keep ramming its head and its

143
00:08:56,260 --> 00:09:00,670
Ejaaz:
brain against a particular problem until it understands what the human actually means.

144
00:09:00,900 --> 00:09:04,790
Ejaaz:
And so like in this new world, like I know you just used the example of like,

145
00:09:04,790 --> 00:09:07,720
Ejaaz:
you know, loops can be used for coding specifically,

146
00:09:08,360 --> 00:09:12,100
Ejaaz:
the coding that Boris Cherny and Karpathy are doing is in English.

147
00:09:12,100 --> 00:09:16,070
Ejaaz:
Like they're speaking to the LLM, they are writing in English to the LLM.

148
00:09:16,070 --> 00:09:18,580
Ejaaz:
And yeah, maybe they're copying and pasting some versions of code,

149
00:09:18,580 --> 00:09:21,450
Ejaaz:
but that code is primarily generated by an AI.

150
00:09:21,450 --> 00:09:25,860
Ejaaz:
I think like something crazy, like 80% plus of code generated at Anthropic,

151
00:09:25,860 --> 00:09:31,430
Ejaaz:
both for research and for just general consumer adoption is generated by Claude itself.

152
00:09:31,430 --> 00:09:36,010
Ejaaz:
And so that's one thing. The other thing is the model just not getting smarter

153
00:09:36,240 --> 00:09:39,480
Ejaaz:
is a really interesting thing. Like typically in my head, I would think,

154
00:09:39,480 --> 00:09:43,000
Ejaaz:
okay, you need a better model to be able to unlock some of these new features

155
00:09:43,000 --> 00:09:45,650
Ejaaz:
like AI agents, autonomous loops, et cetera.

156
00:09:45,890 --> 00:09:49,530
Ejaaz:
But really you could just take the same model, wrap a harness around it and

157
00:09:49,530 --> 00:09:53,980
Ejaaz:
try to get it to understand what particular goal it's getting at and just run

158
00:09:53,980 --> 00:09:57,870
Ejaaz:
that iteration over and over and over again until you get a better output.

159
00:09:58,610 --> 00:10:03,680
Ejaaz:
And I guess this is the same concept as inference or reinforcement learning

160
00:10:03,680 --> 00:10:07,650
Ejaaz:
where like we've found this trend of post-training of these AI models,

161
00:10:07,650 --> 00:10:12,110
Ejaaz:
these AI models just getting smarter, not because they've got bigger GPUs or more expensive GPUs.

162
00:10:12,110 --> 00:10:15,130
Ejaaz:
It's because you've just taken the same model and you've just run it through

163
00:10:15,130 --> 00:10:18,260
Ejaaz:
a different reasoning framework over and over again until it can do a thing.

164
00:10:18,540 --> 00:10:23,190
Ejaaz:
And this is the practical embodiment of it. I personally haven't found like

165
00:10:23,190 --> 00:10:25,180
Ejaaz:
an obvious use case for loops either.

166
00:10:25,370 --> 00:10:29,910
Ejaaz:
So either you and I are boxing ourselves into a particular realm, or maybe someone

167
00:10:29,910 --> 00:10:32,550
Ejaaz:
listening to this is using this for like their software engineering thing or

168
00:10:32,550 --> 00:10:35,530
Ejaaz:
their marketing thing. But yeah, I guess that's where I sit right now.

169
00:10:35,790 --> 00:10:37,540
Josh:
Well, I think it's probably a skill issue on both our parts.

170
00:10:37,540 --> 00:10:40,860
Josh:
Like there is certainly a use case for us in which we can use a loop in which

171
00:10:40,860 --> 00:10:44,210
Josh:
we can define this outcome, send an agent off to go do it, and it will iterate

172
00:10:44,210 --> 00:10:46,140
Josh:
on itself until it comes to a conclusion.

173
00:10:46,430 --> 00:10:50,170
Josh:
I think it's just so novel and so new. It's difficult to kind of understand

174
00:10:50,460 --> 00:10:53,070
Josh:
why. And we have this really great chart on screen that you're showing now,

175
00:10:53,070 --> 00:10:55,260
Josh:
which is the why now section of this.

176
00:10:55,550 --> 00:10:59,850
Josh:
And it's because the duration of a task that these agents can run is so much

177
00:10:59,850 --> 00:11:01,460
Josh:
longer than it used to be.

178
00:11:01,670 --> 00:11:06,220
Josh:
I mean, in 2019 we have here, it was two seconds. This was well before ChatGPT.

179
00:11:06,470 --> 00:11:13,050
Josh:
But even early last year, in 2025, the duration that an agent could run on one single task was

180
00:11:13,440 --> 00:11:17,020
Josh:
less than an hour in length. So there's only so many tokens it could generate,

181
00:11:17,020 --> 00:11:19,990
Josh:
there's only so much reasoning it could do, and there's only so much iteration

182
00:11:19,990 --> 00:11:23,680
Josh:
you could get over that hour-long period, let alone the amount that

183
00:11:23,680 --> 00:11:24,640
Josh:
these tokens are going to be

184
00:11:25,100 --> 00:11:28,930
Josh:
costing you if you're using, like, the API or anything like that. Now, fast forward

185
00:11:28,930 --> 00:11:33,310
Josh:
to today. I mean, the best models in the world, they're getting days' worth of runtime,

186
00:11:33,540 --> 00:11:38,070
Josh:
so they can really think deeply and continue to iterate on themselves over and over. I see examples of

187
00:11:38,680 --> 00:11:43,970
Josh:
backslash goal on X all the time, of people who have a problem, whether it be

188
00:11:43,970 --> 00:11:47,130
Josh:
an optimization problem or a bug that they need to fix, and they'll

189
00:11:47,130 --> 00:11:49,200
Josh:
put this backslash goal on it for

190
00:11:49,590 --> 00:11:53,830
Josh:
however long it needs, and it'll think for three, four, even five days, I've

191
00:11:53,830 --> 00:11:57,960
Josh:
seen, in order to optimize for the specific parameter. And this is possible because

192
00:11:57,960 --> 00:11:59,730
Josh:
these models can now think for days.

193
00:12:00,290 --> 00:12:04,940
Josh:
You have to assume months is coming. What does it look like when an agent can think for months?

194
00:12:04,940 --> 00:12:10,770
Josh:
I mean, it's a really interesting paradigm shift, and I'm not sure where people

195
00:12:10,770 --> 00:12:15,090
Josh:
are going to find value in the open-ended way that it exists today,

196
00:12:15,090 --> 00:12:17,650
Josh:
right? It's like, okay, here's this agent.

197
00:12:17,880 --> 00:12:19,960
Josh:
You can tell it to do whatever you want. You can create a loop.

198
00:12:19,960 --> 00:12:22,040
Josh:
You can create an infrastructure system for it to operate in.

199
00:12:23,000 --> 00:12:25,760
Josh:
It's pretty much open-ended and it's on you. And I think the answer to that

200
00:12:25,760 --> 00:12:30,600
Josh:
is that not even the AI companies really understand the best use cases for it quite yet.

201
00:12:30,890 --> 00:12:33,700
Josh:
I would imagine it's still this really difficult thing of how do you unlock

202
00:12:33,700 --> 00:12:37,350
Josh:
value from essentially an open-ended agent that can go and run for an infinite

203
00:12:37,350 --> 00:12:38,700
Josh:
amount of time? I don't know.

204
00:12:39,640 --> 00:12:45,180
Ejaaz:
I also question like what a human's purpose would be at that point.

205
00:12:45,310 --> 00:12:51,210
Ejaaz:
Like if you automate enough of the thinking and the curiosity behind like solving

206
00:12:51,210 --> 00:12:54,280
Ejaaz:
particular problems, what do humans end up doing at that point,

207
00:12:54,280 --> 00:12:56,330
Ejaaz:
especially if they don't do the work themselves?

208
00:12:56,330 --> 00:12:59,570
Ejaaz:
They don't understand it, right? You need an AI to kind of like understand what

209
00:12:59,700 --> 00:13:01,300
Ejaaz:
on earth is going on in the first place.

210
00:13:01,540 --> 00:13:05,150
Ejaaz:
And eventually like an AI will then start setting goals, like more ambitious

211
00:13:05,150 --> 00:13:09,250
Ejaaz:
goals than a human can, in terms of what to solve or go after.

212
00:13:09,980 --> 00:13:15,950
Ejaaz:
There were some very low-level examples that I saw in response to Peter Steinberger's tweet about loops.

213
00:13:16,260 --> 00:13:19,650
Ejaaz:
And there's some kind of concrete examples that I want to run through very quickly here.

214
00:13:19,930 --> 00:13:25,530
Ejaaz:
So one of them is using it for code, right? So a classic loop could basically

215
00:13:25,530 --> 00:13:30,360
Ejaaz:
look like, okay, can you please pull live errors for my particular app?

216
00:13:30,630 --> 00:13:34,260
Ejaaz:
Can you inspect and figure out where the bug might particularly be?

217
00:13:34,500 --> 00:13:38,760
Ejaaz:
Can you create then a fix for this particular bug in my code?

218
00:13:39,010 --> 00:13:41,870
Ejaaz:
And then can you deploy it? Then can you check the health of that deployment

219
00:13:41,870 --> 00:13:43,550
Ejaaz:
and make sure that nothing else is broken?

220
00:13:43,810 --> 00:13:47,390
Ejaaz:
And then record what failed and feed that into a database so that in the future,

221
00:13:47,390 --> 00:13:51,380
Ejaaz:
we can detect errors like this or prevent them when we code and build some of

222
00:13:52,480 --> 00:13:53,670
Ejaaz:
these future app features.
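
The six steps listed here (pull live errors, inspect, write a fix, deploy, check health, record what failed) can be sketched as one loop. Everything below is a toy under stated assumptions: `FakeApp`, `locate_bug`, and `write_fix` are hypothetical stand-ins for a real error tracker, a model call, and a deploy pipeline, none of which are specified in the episode.

```python
from dataclasses import dataclass, field

@dataclass
class FakeApp:
    """In-memory stand-in (hypothetical) for an app, its error tracker, and its deploys."""
    errors: list[str] = field(
        default_factory=lambda: ["TypeError in checkout", "Timeout in search"])
    deployed_fixes: list[str] = field(default_factory=list)

    def pull_live_errors(self) -> list[str]:
        return list(self.errors)

    def deploy(self, fix: str, error: str) -> None:
        # Toy behavior: deploying a fix makes the matching error go away.
        self.deployed_fixes.append(fix)
        self.errors.remove(error)

def locate_bug(error: str) -> str:
    """Toy inspection step: treat the text after ' in ' as the faulty component."""
    return error.split(" in ")[1]

def write_fix(cause: str) -> str:
    """Toy fix step: a real loop would ask a model to write a patch here."""
    return f"patch for {cause}"

def bug_fix_loop(app: FakeApp, max_rounds: int = 5) -> list[dict]:
    incident_log: list[dict] = []
    for _ in range(max_rounds):
        errors = app.pull_live_errors()             # 1. pull live errors
        if not errors:
            break                                   # nothing left to fix
        error = errors[0]
        cause = locate_bug(error)                   # 2. inspect and locate the bug
        fix = write_fix(cause)                      # 3. create a fix
        app.deploy(fix, error)                      # 4. deploy it
        healthy = error not in app.pull_live_errors()  # 5. check deployment health
        incident_log.append(                        # 6. record what failed
            {"error": error, "cause": cause, "fix": fix, "healthy": healthy})
    return incident_log

log = bug_fix_loop(FakeApp())
for entry in log:
    print(entry)
```

The incident log is the piece that feeds future work: in a real setup it would go to a database so later runs can recognize or prevent the same class of error.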

223
00:13:53,930 --> 00:13:58,490
Ejaaz:
Now, that is kind of like a very small and specific enough use case that can

224
00:13:58,490 --> 00:14:02,800
Ejaaz:
be generalized across basically any app or software engineering project that

225
00:14:02,800 --> 00:14:04,690
Ejaaz:
you might be working on if you're listening to this.

226
00:14:05,380 --> 00:14:11,220
Ejaaz:
And I wonder how many hours' worth of engineering time this replaces.

227
00:14:11,220 --> 00:14:13,990
Ejaaz:
Because I know that there are entire teams, having worked at companies and

228
00:14:14,420 --> 00:14:18,650
Ejaaz:
been a product manager in the past, entire teams of software engineers that

229
00:14:18,650 --> 00:14:21,840
Ejaaz:
spend their entire days working on something like that. So that's one thing.

230
00:14:22,180 --> 00:14:25,720
Ejaaz:
And then for content, which is very applicable for product managers,

231
00:14:25,720 --> 00:14:29,700
Ejaaz:
or even like the work that you and I do, Josh, an agent could read a PRD,

232
00:14:29,700 --> 00:14:32,950
Ejaaz:
which is a product requirements doc, which is usually kind of like created

233
00:14:33,120 --> 00:14:36,320
Ejaaz:
for a strategic goal that you want to kind of like build at your company,

234
00:14:36,320 --> 00:14:40,610
Ejaaz:
like a product or a feature, it then writes whatever that next asset could be.

235
00:14:40,610 --> 00:14:43,470
Ejaaz:
So it could be like a design profile or a mockup of what that feature might

236
00:14:43,470 --> 00:14:47,600
Ejaaz:
look like, score it against like some kind of criteria that the company has

237
00:14:47,600 --> 00:14:50,960
Ejaaz:
across like, you know, it must follow our vision, A, B, and C.

238
00:14:51,130 --> 00:14:53,980
Ejaaz:
It must also look a particular way. This is our design profile,

239
00:14:53,980 --> 00:14:56,200
Ejaaz:
our brand kind of profile and our aesthetic.

240
00:14:56,440 --> 00:15:00,510
Ejaaz:
And then it kind of like updates its progress depending on like what other teams

241
00:15:00,510 --> 00:15:03,260
Ejaaz:
have shipped. So maybe it's dependent on a particular feature.

242
00:15:03,470 --> 00:15:07,810
Ejaaz:
And so it updates itself autonomously like that. Now, this all sounds very vague

243
00:15:07,810 --> 00:15:12,370
Ejaaz:
intentionally because it's meant to apply to your particular business or your particular project.
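
The content loop described here (read a PRD, write the next asset, score it against company criteria, update) might look like the sketch below. The rubric entries, the `toy_write` function, and the "Acme" brand name are all hypothetical, chosen only to make the example runnable; a real version would call a model and use the company's own criteria.

```python
from typing import Callable

Criterion = Callable[[str], bool]

def score(draft: str, rubric: dict[str, Criterion]) -> dict[str, bool]:
    """Check a draft against each named criterion."""
    return {name: check(draft) for name, check in rubric.items()}

def content_loop(prd: str,
                 write: Callable[[str, list[str]], str],
                 rubric: dict[str, Criterion],
                 max_drafts: int = 5) -> tuple[str, list[str]]:
    """Redraft until every criterion passes or the draft budget runs out."""
    feedback: list[str] = []   # names of criteria the last draft failed
    draft = ""
    for _ in range(max_drafts):
        draft = write(prd, feedback)
        results = score(draft, rubric)
        feedback = [name for name, passed in results.items() if not passed]
        if not feedback:
            break
    return draft, feedback

# Hypothetical rubric: "must follow our vision" and "must match our brand".
rubric: dict[str, Criterion] = {
    "follows vision": lambda d: "vision" in d,
    "on brand": lambda d: "Acme" in d,
}

def toy_write(prd: str, feedback: list[str]) -> str:
    """Toy writer: addresses whichever criteria the previous draft failed."""
    draft = f"Mockup for: {prd}"
    if "follows vision" in feedback:
        draft += " | aligned with vision A, B, and C"
    if "on brand" in feedback:
        draft += " | styled with the Acme brand kit"
    return draft

final_draft, remaining = content_loop("one-click checkout", toy_write, rubric)
print(final_draft)
print(remaining)
```

Simple string checks are doing the scoring here; whether taste-like criteria can be scored this mechanically is exactly the open question raised in the conversation.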

244
00:15:12,370 --> 00:15:14,260
Ejaaz:
But make no mistake, this is what

245
00:15:14,260 --> 00:15:18,770
Ejaaz:
a lot of humans are paid upwards of six figures to do on a daily basis.

246
00:15:18,770 --> 00:15:22,960
Ejaaz:
It's that nuance. And we're starting to see basically AI models and AI agents

247
00:15:23,230 --> 00:15:28,580
Ejaaz:
enter into that human taste profile. So when I think about where we end up eventually,

248
00:15:29,860 --> 00:15:33,390
Ejaaz:
there's a common argument that's made that's like, oh, humans will always have the taste.

249
00:15:33,650 --> 00:15:36,760
Ejaaz:
They'll always be able to kind of direct where the AI should go because we are

250
00:15:36,760 --> 00:15:39,430
Ejaaz:
this all being kind of like smart kind of entity.

251
00:15:39,780 --> 00:15:44,800
Ejaaz:
But I see increasingly AI stepping into that boundary and becoming the tastemaker

252
00:15:45,000 --> 00:15:46,320
Ejaaz:
for all of the work that we end up doing.

253
00:15:47,070 --> 00:15:51,810
Josh:
I still believe that to be true, that humans in the loop are critically important

254
00:15:51,810 --> 00:15:55,520
Josh:
to applying human taste. I saw this great chart. I have no idea where it is.

255
00:15:55,810 --> 00:15:59,650
Josh:
Somewhere in the depths of X. But basically, it was showing that in the App

256
00:15:59,650 --> 00:16:01,810
Josh:
Store, the iOS App Store, where everyone downloads their apps,

257
00:16:02,140 --> 00:16:06,090
Josh:
the number of apps that have gone into production, that have been published recently, has gone vertical.

258
00:16:06,330 --> 00:16:10,140
Josh:
I think it's doubled or tripled over the last six months. Everybody's publishing apps to the App Store.

259
00:16:10,590 --> 00:16:15,580
Josh:
The number of five-star reviews and the number of downloads have actually either

260
00:16:15,580 --> 00:16:17,440
Josh:
stayed flat or gone down.

261
00:16:17,760 --> 00:16:20,820
Josh:
It has not matched the number of new apps that are going to the App Store.

262
00:16:21,000 --> 00:16:26,520
Josh:
Why is this? It's because a lot of the apps don't have enough care applied to

263
00:16:26,520 --> 00:16:28,910
Josh:
them. They're just not great applications. And when I think about

264
00:16:29,670 --> 00:16:33,670
Josh:
how I use my phone on a regular day or how I use my laptop

265
00:16:33,670 --> 00:16:37,100
Josh:
and the applications that I actually spend time on, there's a very fixed set

266
00:16:37,100 --> 00:16:41,130
Josh:
of them. And I'm a little stubborn when it comes to downloading new ones because

267
00:16:41,130 --> 00:16:43,010
Josh:
a lot of the new ones just are not great.

268
00:16:43,350 --> 00:16:47,330
Josh:
And I think a lot of that comes from this lack of care that is presented

269
00:16:47,330 --> 00:16:51,360
Josh:
from AI outputs, where if you're optimizing for a specific parameter that you

270
00:16:51,360 --> 00:16:56,700
Josh:
can measure, it's going to do it great, but it doesn't understand the subtle nuances of how humans

271
00:16:57,130 --> 00:17:00,800
Josh:
engage and how they really love to use these products. Like, one of the products

272
00:17:00,800 --> 00:17:04,340
Josh:
that I use, totally unrelated, totally not sponsored, is this app called Copilot

273
00:17:04,340 --> 00:17:08,750
Josh:
Money. It's like a budgeting application, and it's so thoughtfully curated and designed.

274
00:17:09,510 --> 00:17:13,920
Josh:
And it really deeply understands all the complexities that are related to humans

275
00:17:13,920 --> 00:17:16,420
Josh:
when it comes to budgeting. It understands a lot of

276
00:17:16,770 --> 00:17:19,910
Josh:
the design characteristics. Same with an app called Flighty. I'm sure a lot of

277
00:17:19,910 --> 00:17:22,930
Josh:
people have heard of Flighty. It's like a flight tracking application. There's a

278
00:17:22,930 --> 00:17:27,350
Josh:
thousand ways to track a flight, but Flighty really cares about design. They really care about how

279
00:17:28,070 --> 00:17:31,480
Josh:
it's implemented with the human, and they've created this amazing output, and

280
00:17:31,480 --> 00:17:35,010
Josh:
I don't see that changing. One thing that I did want to note is that

281
00:17:35,640 --> 00:17:39,550
Josh:
I think when a lot of people see this, they imagine a world in which they are

282
00:17:39,550 --> 00:17:42,690
Josh:
getting replaced. Everyone's like, AI is replacing me, look how much I could do

283
00:17:42,690 --> 00:17:47,080
Josh:
now, it has these loops. And I think the reality is it gives you a lot more agency

284
00:17:47,080 --> 00:17:48,410
Josh:
to do the things you want to do,

285
00:17:48,880 --> 00:17:52,050
Josh:
where maybe you're not doing the day-to-day where,

286
00:17:52,680 --> 00:17:55,580
Josh:
you would normally prompt an agent to do this, but you're doing a lot of the

287
00:17:55,580 --> 00:17:58,060
Josh:
higher-level tasks. You can imagine yourself not having to do

288
00:17:58,380 --> 00:18:02,160
Josh:
the day-to-day. Like, for example, if you're just managing your household, you no

289
00:18:02,160 --> 00:18:05,490
Josh:
longer have to take out the trash, you don't have to run errands. You could just

290
00:18:05,490 --> 00:18:08,520
Josh:
focus on how to make your household the best household it can be, because you have

291
00:18:08,520 --> 00:18:09,530
Josh:
that higher-level ability.

292
00:18:09,800 --> 00:18:13,740
Josh:
And in that chart that we showed in the artifact earlier on, it shows a decreasing

293
00:18:13,740 --> 00:18:16,820
Josh:
sized human. It's the amount of input that is needed from a human to get the output you want,

294
00:18:17,280 --> 00:18:21,410
Josh:
but it's still ultimately on the human being to push and navigate

295
00:18:21,410 --> 00:18:25,710
Josh:
towards the outputs that you want, because ultimately these tools are just for us. So when I think of

296
00:18:26,500 --> 00:18:30,120
Josh:
AI becoming increasingly good, and when it comes to running the show, even I've

297
00:18:30,120 --> 00:18:34,210
Josh:
leaned on it, we both have, I think, a lot more recently. But all that's done is

298
00:18:34,210 --> 00:18:38,360
Josh:
actually given us more leverage to do more with the show, rather than have it replace

299
00:18:38,360 --> 00:18:39,710
Josh:
us. And even in the case that

300
00:18:41,060 --> 00:18:43,810
Josh:
we could clone ourselves, we could create a video version of ourselves that

301
00:18:44,010 --> 00:18:48,000
Josh:
has a perfect voice that sounds just like us, I don't think people actually want that.

302
00:18:48,600 --> 00:18:52,110
Josh:
There's that lacking human nature that still isn't understood.

303
00:18:52,400 --> 00:18:55,910
Josh:
And I find that it's more empowering when I hear that these loops exist that

304
00:18:55,910 --> 00:19:01,080
Josh:
can run for days on end and create amazing outputs versus not where it's kind

305
00:19:01,080 --> 00:19:03,370
Josh:
of extracted from us. I don't really think that's true.

306
00:19:03,830 --> 00:19:08,830
Ejaaz:
Yeah, it's like that stat of, well, it's that thesis that everyone held about

307
00:19:08,830 --> 00:19:13,310
Ejaaz:
a year ago, which is like with the increase of AI adoption, people will have

308
00:19:13,310 --> 00:19:15,930
Ejaaz:
more free time to have fun and leisure.

309
00:19:16,300 --> 00:19:20,900
Ejaaz:
And in fact, the opposite has been shown: people just work way more and work harder.

310
00:19:21,170 --> 00:19:24,500
Ejaaz:
And the output of that work is measured across like pretty much every single

311
00:19:24,500 --> 00:19:26,210
Ejaaz:
company and profession and role.

312
00:19:28,200 --> 00:19:32,590
Ejaaz:
I do generally agree with that. I don't think humans are going to get wiped out anytime soon.

313
00:19:33,520 --> 00:19:41,270
Ejaaz:
But one thing that is kind of nagging my brain is if we extrapolate this intelligence out enough,

314
00:19:42,080 --> 00:19:48,760
Ejaaz:
there is no reason why AI won't be able to take over or replace other parts

315
00:19:48,760 --> 00:19:51,530
Ejaaz:
of the cognitive process that a human can do,

316
00:19:52,410 --> 00:19:57,700
Ejaaz:
particularly if it's one AI model trained on the entire corpus of knowledge

317
00:19:57,700 --> 00:19:59,440
Ejaaz:
that a bunch of humans have been guiding it.

318
00:19:59,640 --> 00:20:03,980
Ejaaz:
So when I think about Anthropic, when I think about OpenAI, I think about all the

319
00:20:05,300 --> 00:20:08,720
Ejaaz:
millions of people that use their product every single day and the data that

320
00:20:08,720 --> 00:20:12,930
Ejaaz:
they ingest every single day that gets recorded on one singular database that

321
00:20:12,930 --> 00:20:17,940
Ejaaz:
can then be reused to train a better model that is more hyper-optimized towards humans.

322
00:20:18,150 --> 00:20:22,720
Ejaaz:
You could argue that as a single human, you don't get to meet and read the thoughts

323
00:20:22,720 --> 00:20:24,370
Ejaaz:
of every other human that is out there.

324
00:20:25,860 --> 00:20:29,550
Ejaaz:
You have your very own individual process. And I think that an AI model that

325
00:20:29,550 --> 00:20:34,720
Ejaaz:
can get access to the world's brain and thoughts could probably create something

326
00:20:34,720 --> 00:20:38,550
Ejaaz:
kind of close to knowing what that human taste profile would be.

327
00:20:39,240 --> 00:20:44,440
Ejaaz:
The other major question that I'm wondering is, how much is all of this going to cost?

328
00:20:46,210 --> 00:20:49,890
Ejaaz:
One, like, stat that has stuck in my head over the recent few weeks is that

329
00:20:50,190 --> 00:20:53,010
Ejaaz:
Anthropic particularly, they service, or like,

330
00:20:53,890 --> 00:20:58,290
Ejaaz:
the Fortune 10, the top 10 companies in the world, nine of them use Claude,

331
00:20:58,560 --> 00:21:04,380
Ejaaz:
and their budget's increased by 500%, or is projected to increase by 500% by the end of this year.

332
00:21:04,600 --> 00:21:07,400
Ejaaz:
And they're doing this willingly because the ROI, the value that they're getting

333
00:21:07,400 --> 00:21:09,300
Ejaaz:
out of that is pretty massive.

334
00:21:09,970 --> 00:21:13,490
Ejaaz:
Alternatively, there are companies like Uber that have slashed their budgets

335
00:21:13,490 --> 00:21:17,210
Ejaaz:
massively because their entire year's budget was spent in a couple of months.

336
00:21:17,440 --> 00:21:22,140
Ejaaz:
So I'm wondering, in this world of agent loops where you've got AIs working

337
00:21:22,140 --> 00:21:25,610
Ejaaz:
overnight for you, the bills are going to increase pretty massively.

338
00:21:25,610 --> 00:21:29,810
Ejaaz:
And I'm wondering, unless these AI models get cheaper,

339
00:21:29,810 --> 00:21:33,320
Ejaaz:
and there's an infrastructure bottleneck there where these GPUs cost a lot of

340
00:21:33,320 --> 00:21:36,660
Ejaaz:
money, we can't scale power and infrastructure anytime soon.

341
00:21:36,660 --> 00:21:42,310
Ejaaz:
We need so much more energy than we already have currently on Earth to be able to power these things.

342
00:21:42,640 --> 00:21:46,070
Ejaaz:
The cost of these things is just going to go up a lot more,

343
00:21:46,320 --> 00:21:51,140
Ejaaz:
which means that either this is only going to be a power or a tool reserved

344
00:21:51,140 --> 00:21:54,890
Ejaaz:
for the rich, or something's going to break here and maybe open source models

345
00:21:54,890 --> 00:21:56,510
Ejaaz:
get adopted more aggressively.

346
00:21:56,790 --> 00:22:00,700
Josh:
Yeah, I imagine there's probably use cases for all of the above.

347
00:22:01,220 --> 00:22:04,260
Josh:
It's like open source models will continue to improve. They'll be able to do

348
00:22:04,260 --> 00:22:08,020
Josh:
a lot of the more trivial tasks that don't require frontier intelligence, so

349
00:22:08,020 --> 00:22:11,050
Josh:
therefore the cost of those types of loops will go down, because not everyone

350
00:22:11,050 --> 00:22:13,070
Josh:
needs to have the most cutting-edge

351
00:22:13,800 --> 00:22:16,930
Josh:
software stack engineering. Like, they're just kind of having it help them through

352
00:22:16,930 --> 00:22:20,630
Josh:
their day-to-day. Maybe it's replying to emails, maybe it's whatever miscellaneous things it may be.

353
00:22:21,010 --> 00:22:23,920
Josh:
There's a high probability that these open source models, as they continue to

354
00:22:23,920 --> 00:22:27,390
Josh:
improve, will be able to bite off a meaningful chunk of that. Then the other half

355
00:22:27,390 --> 00:22:30,840
Josh:
is using these frontier models, which is a requirement in order to get the absolute

356
00:22:30,840 --> 00:22:34,260
Josh:
best results for whatever very challenging work they're doing.

357
00:22:34,560 --> 00:22:38,740
Josh:
And that is going to cost a lot of money for sure.

358
00:22:39,120 --> 00:22:46,330
Josh:
And I don't see that changing, but I think the output of the dollars in will continue to go up.

359
00:22:46,330 --> 00:22:50,920
Josh:
It's because as you get more knowledge per token, as you get more output per

360
00:22:51,150 --> 00:22:55,030
Josh:
prompt, it very clearly, I mean, the economics seem to make sense.

361
00:22:55,200 --> 00:22:56,980
Josh:
And I think that's kind of right now.

362
00:22:58,960 --> 00:23:02,930
Josh:
Enterprise spend on these models: they're trying to figure out, well, how much

363
00:23:02,930 --> 00:23:06,760
Josh:
value can we actually get back from every dollar spent? And right now it's a

364
00:23:06,760 --> 00:23:10,210
Josh:
little bit unsure. You mentioned Uber. We have Uber here that we're showing on screen,

365
00:23:10,480 --> 00:23:14,680
Josh:
where Uber just recently put a cap on the amount of tokens that

366
00:23:15,070 --> 00:23:20,900
Josh:
their employees are allowed to use, at $1,500 per engineer per tool per month. And

367
00:23:21,300 --> 00:23:24,940
Josh:
we'll see how that works, because a lot of other companies that we know, they're

368
00:23:24,940 --> 00:23:29,170
Josh:
kind of giving their engineers unlimited budget. In fact, they're kind of ranking

369
00:23:29,170 --> 00:23:33,710
Josh:
the engineers based on how many tokens they're using per month.

370
00:23:34,620 --> 00:23:38,410
Josh:
We'll see where that goes. I suspect the companies that are spending more on

371
00:23:38,410 --> 00:23:41,900
Josh:
tokens will continue to see a higher upside for now, at least.

372
00:23:42,180 --> 00:23:46,140
Josh:
But like you mentioned, the underlying problem with all of this is we're going

373
00:23:46,140 --> 00:23:48,880
Josh:
to continue to have more prompts. I mean, these loops consume a tremendous amount

374
00:23:48,880 --> 00:23:51,180
Josh:
of tokens, whether they're frontier tokens or open source tokens.

375
00:23:51,410 --> 00:23:54,940
Josh:
It doesn't matter. We're going to need orders of magnitude more than we have.

376
00:23:54,940 --> 00:23:59,460
Josh:
And we don't have the compute. It really does always come down to that

377
00:23:59,460 --> 00:24:01,200
Josh:
energy problem, that infrastructure problem.

378
00:24:01,200 --> 00:24:06,030
Josh:
We don't have the infra built out to support this, so therefore the costs likely

379
00:24:06,030 --> 00:24:10,870
Josh:
continue to stay high. Maybe it's not because you're paying the provider for tokens;

380
00:24:11,170 --> 00:24:15,810
Josh:
perhaps it's just renting the GPU time from a cluster that is doing much more

381
00:24:15,810 --> 00:24:18,190
Josh:
valuable work. So I think that might ultimately be

382
00:24:18,560 --> 00:24:23,060
Josh:
the crux: the actual availability of the compute to do these things. And that's

383
00:24:23,060 --> 00:24:25,520
Josh:
why these edge compute devices, like having your

384
00:24:26,240 --> 00:24:30,110
Josh:
Mac Studio on your desktop that can run locally, it's probably a pretty valuable thing to have.

385
00:24:30,290 --> 00:24:33,630
Ejaaz:
So I'm sure a lot of you are wondering, you know, how does this apply to me?

386
00:24:34,490 --> 00:24:37,920
Ejaaz:
You know, none of my friends have mentioned this loop feature.

387
00:24:37,920 --> 00:24:39,710
Ejaaz:
I don't really know many people who are using it.

388
00:24:40,490 --> 00:24:43,850
Ejaaz:
As we mentioned earlier, like this isn't probably going to be used by the bulk

389
00:24:43,850 --> 00:24:47,570
Ejaaz:
or majority of people yet until some of those use cases actually arise.

390
00:24:47,570 --> 00:24:51,840
Ejaaz:
I think it's mainly going to happen in the workplace. It's going to happen with

391
00:24:51,840 --> 00:24:54,540
Ejaaz:
like some of these enterprise companies that are trying to automate certain

392
00:24:54,540 --> 00:24:57,610
Ejaaz:
departments or functions of their particular company, like marketing,

393
00:24:57,840 --> 00:24:58,910
Ejaaz:
like software engineering.

394
00:24:58,910 --> 00:25:02,740
Ejaaz:
And I think it'll start with lower level tasks because these agents still aren't

395
00:25:02,740 --> 00:25:05,060
Ejaaz:
smart enough to understand nuance completely.

396
00:25:05,310 --> 00:25:09,180
Ejaaz:
And also, you don't just want to let an agent run loose overnight whilst you're

397
00:25:09,180 --> 00:25:12,450
Ejaaz:
sleeping and then take down your entire company. And one place where it's working

398
00:25:12,640 --> 00:25:15,390
Ejaaz:
tirelessly to accelerate the development of that

399
00:25:18,190 --> 00:25:23,380
Ejaaz:
And we have Boris Cherny over here basically explaining how he's basically ditched

400
00:25:23,380 --> 00:25:25,470
Ejaaz:
his integrated development environment.

401
00:25:25,730 --> 00:25:30,440
Ejaaz:
He has ditched all of his normal tools that he had spent decades basically honing

402
00:25:30,440 --> 00:25:34,790
Ejaaz:
his software engineering skill on to now completely focus on building up these

403
00:25:34,790 --> 00:25:36,530
Ejaaz:
agent loops. And what is he focused on?

404
00:25:36,760 --> 00:25:40,520
Ejaaz:
Well, he works primarily on Claude Code, but the other folks at Anthropic and

405
00:25:40,520 --> 00:25:43,020
Ejaaz:
OpenAI have started this thing called

406
00:25:43,660 --> 00:25:48,870
Ejaaz:
recursive self-improvement, or RSI, which is basically the goal of getting your

407
00:25:48,870 --> 00:25:52,380
Ejaaz:
AI model to build the next version of itself.

408
00:25:52,590 --> 00:25:57,390
Ejaaz:
And this is a test that Anthropic and the folks at OpenAI do for any new model that they release.

409
00:25:57,660 --> 00:26:02,560
Ejaaz:
They set it a goal or task to basically rebuild itself in a more improved fashion.

410
00:26:02,730 --> 00:26:07,640
Ejaaz:
Now, one thing that the AI has gotten really good at is building out that next function.

411
00:26:07,640 --> 00:26:12,780
Ejaaz:
But one thing it's not very good at is figuring out what research problems they

412
00:26:12,780 --> 00:26:16,400
Ejaaz:
should fix, what research problems it should focus on to try and,

413
00:26:16,860 --> 00:26:21,120
Ejaaz:
you know, overcome and make it ultimately, you know, a better model than its competitors.

414
00:26:21,400 --> 00:26:27,010
Ejaaz:
Now, RSI is something, it's kind of like the golden egg that each AI lab is going after.

415
00:26:27,280 --> 00:26:30,270
Ejaaz:
And this is the primary use of agent loops right now.

416
00:26:30,450 --> 00:26:34,440
Ejaaz:
And you can see why it might be obvious. If you have an AI model that can basically

417
00:26:34,440 --> 00:26:38,210
Ejaaz:
build the next best version of itself, eventually you're going to get to AGI,

418
00:26:38,210 --> 00:26:41,950
Ejaaz:
whatever the hell that looks like, and then you can apply it to pretty much any sector.

419
00:26:42,180 --> 00:26:45,790
Ejaaz:
Now, the problem and the worry that kind of immediately pops into my head and

420
00:26:45,790 --> 00:26:50,410
Ejaaz:
a lot of these researchers' heads is, if it eventually does get that smart, right?

421
00:26:51,190 --> 00:26:55,850
Ejaaz:
It could escape human control completely and run off on its own and do its own

422
00:26:55,850 --> 00:27:00,220
Ejaaz:
thing. Because at that point, why would it need a human to kind of like guide it or shepherd it?

423
00:27:00,380 --> 00:27:03,930
Ejaaz:
Instead, it can just kind of like do its own thing. So this is like the primary

424
00:27:03,930 --> 00:27:06,770
Ejaaz:
use case that I'm seeing for agent loops being worked on right now.

425
00:27:06,770 --> 00:27:09,720
Ejaaz:
I would love to see, like, a broader application across, like, kind of like

426
00:27:10,110 --> 00:27:12,960
Ejaaz:
consumer professions, like in finance, like in science and stuff like that,

427
00:27:12,960 --> 00:27:16,870
Ejaaz:
which I do believe it'll spill over eventually. But unless you're seeing anything

428
00:27:16,870 --> 00:27:21,610
Ejaaz:
else, Josh, I think like that is primarily it on agent loops and agent autonomy.

429
00:27:21,720 --> 00:27:25,790
Josh:
It's on you to figure out the best use cases for it. Like there's no real company

430
00:27:25,790 --> 00:27:29,690
Josh:
defining it. They're just giving you the tools. And I mean, for better or worse,

431
00:27:29,690 --> 00:27:32,890
Josh:
it's very open-ended. So it's on you to figure out how best to use these.

432
00:27:33,130 --> 00:27:36,940
Josh:
I think if this sounds a little overwhelming, maybe we could outline a few examples

433
00:27:36,940 --> 00:27:39,420
Josh:
of each one of these kind of four rungs in the ladder here.

434
00:27:39,740 --> 00:27:42,760
Josh:
The first one being prompting. This everyone has done before.

435
00:27:42,760 --> 00:27:44,650
Josh:
I'm sure. It's like, rewrite this

436
00:27:44,970 --> 00:27:49,140
Josh:
email to sound more confident or explain what my doctor meant by this.

437
00:27:49,400 --> 00:27:53,760
Josh:
But then you've probably also used the partial agentic usage as well of these

438
00:27:53,760 --> 00:27:58,250
Josh:
models, which is like, plan my three-day vacation to Lisbon that I'm going on next week.

439
00:27:58,500 --> 00:28:02,010
Josh:
And it will actually go off and use tools and it will think complex thoughts

440
00:28:02,010 --> 00:28:05,220
Josh:
and ideas and kind of surface you a full itinerary for your trip.

441
00:28:05,530 --> 00:28:08,580
Josh:
And then there's the third one, which is the harness. This is a little more

442
00:28:08,580 --> 00:28:11,430
Josh:
complicated. This is for people who are building more project based stuff.

443
00:28:11,430 --> 00:28:15,800
Josh:
So for example, if you want it to build you a website for your dog walking business

444
00:28:15,800 --> 00:28:19,220
Josh:
and you kind of describe it and you go back and forth on a spec and then it

445
00:28:19,220 --> 00:28:20,410
Josh:
goes off and implements that.

446
00:28:20,630 --> 00:28:24,060
Josh:
And the fourth is loops, which doesn't have to necessarily be overwhelming.

447
00:28:24,060 --> 00:28:26,810
Josh:
It can be as simple as, let's say you are

448
00:28:27,660 --> 00:28:30,420
Josh:
interested in the news, you could say, every morning before I wake up,

449
00:28:30,600 --> 00:28:35,550
Josh:
scan these 10 sources plus market data and give me this bulleted brief.

450
00:28:35,550 --> 00:28:38,910
Josh:
Or let's say you have a to-do list. It'll go off and think overnight and solve

451
00:28:38,910 --> 00:28:42,370
Josh:
all those problems overnight, iteratively until it comes to a solution that

452
00:28:42,370 --> 00:28:45,810
Josh:
it hopefully arrives at in the mornings. There's a lot of use cases.

453
00:28:45,810 --> 00:28:47,740
Josh:
I think a lot of it requires creativity.

454
00:28:47,980 --> 00:28:51,970
Josh:
And that is the prompt we will leave you with today, which is share with us,

455
00:28:51,970 --> 00:28:54,220
Josh:
please, how you are using these models best.

456
00:28:54,220 --> 00:28:57,340
Josh:
Because so much of the question isn't are these models smart?

457
00:28:57,340 --> 00:29:00,540
Josh:
It's how can I extract that intelligence from them in the most effective way

458
00:29:00,540 --> 00:29:04,330
Josh:
for my life? So I would be so curious to hear which rung of the ladder you find

459
00:29:04,330 --> 00:29:05,380
Josh:
yourself on one through four.

460
00:29:05,600 --> 00:29:07,990
Josh:
And then what the most interesting use cases you've found

461
00:29:08,690 --> 00:29:12,650
Josh:
among those rungs of the ladder are. Are you using loops currently? What are you using them for?

462
00:29:13,130 --> 00:29:16,090
Josh:
Are you with agents? Are you still using it as a Google extension? If you're still

463
00:29:16,090 --> 00:29:19,540
Josh:
using it as a Google extension, I would encourage a little more creativity. Really

464
00:29:19,540 --> 00:29:22,580
Josh:
try to ask harder questions and figure out how it could be implemented in your

465
00:29:22,580 --> 00:29:26,230
Josh:
life. But I think that's pretty much it on the loop. Um,

466
00:29:26,990 --> 00:29:31,310
Josh:
you're not going anywhere, but your job might shift a little bit in terms of

467
00:29:31,310 --> 00:29:34,710
Josh:
scope as these tools get more powerful. And that should be the hope, that should

468
00:29:34,710 --> 00:29:39,540
Josh:
be the goal because it'll allow you to do so much more that you want to accomplish, I believe.

469
00:29:39,540 --> 00:29:41,770
Josh:
And yeah, I think that's what we'll leave you with today.

470
00:29:42,040 --> 00:29:46,580
Ejaaz:
Thank you folks so much for listening. Similar to Josh's prompts,

471
00:29:46,580 --> 00:29:51,130
Ejaaz:
I'm actually kind of curious, for one singular task that you've used your AI

472
00:29:51,130 --> 00:29:53,690
Ejaaz:
for, what is the most number of tokens that you've burnt?

473
00:29:53,810 --> 00:29:56,860
Ejaaz:
Be honest, it can be for any use case, doesn't matter, let us know.

474
00:29:57,160 --> 00:30:01,390
Ejaaz:
And also, what is the longest that you've had an AI work on a particular task

475
00:30:01,390 --> 00:30:05,880
Ejaaz:
for? Is it a couple of minutes? Is it hours? Is it potentially overnight?

476
00:30:05,880 --> 00:30:08,750
Ejaaz:
Let us know. I'm curious. And what was the associated bill with that?

477
00:30:09,030 --> 00:30:12,320
Ejaaz:
And yeah, we'll see you on the next episode. Wherever you listen to us,

478
00:30:12,520 --> 00:30:16,000
Ejaaz:
if you haven't subscribed, if you haven't rated us, if you're not leaving us

479
00:30:16,000 --> 00:30:17,180
Ejaaz:
comments, what are you doing?

480
00:30:17,410 --> 00:30:20,910
Ejaaz:
We respond to pretty much any and every one of them. We listen to your feedback.

481
00:30:20,910 --> 00:30:23,540
Ejaaz:
It feeds into some of the work and content that we put out.

482
00:30:23,770 --> 00:30:28,360
Ejaaz:
We are almost hitting 60,000 of you folks. And you guys are reading our newsletter,

483
00:30:28,360 --> 00:30:33,390
Ejaaz:
which is, like, sent out to about 100,000-plus people every single week. We post twice a week.

484
00:30:34,030 --> 00:30:37,840
Ejaaz:
But yeah, wherever you are, please subscribe to us, leave us a comment,

485
00:30:37,840 --> 00:30:38,900
Ejaaz:
and we'll see you on the next one.

486
00:30:39,290 --> 00:30:40,080
Josh:
See you guys next time.

487
00:30:40,080 --> 00:30:40,450
Ejaaz:
Peace.