1
00:00:08,139 --> 00:00:14,383
Hi everyone, welcome to the Monkey Patching Podcast, where we go bananas about all things
lobsters, vibes and more.

2
00:00:14,383 --> 00:00:16,245
My name is Murilo, it's been a while.

3
00:00:16,245 --> 00:00:18,236
I'm joined as always by my friend Bart.

4
00:00:18,236 --> 00:00:20,657
Hey Bart, how are you?

5
00:00:20,657 --> 00:00:21,637
Long time.

6
00:00:21,968 --> 00:00:22,749
Yeah, long time.

7
00:00:22,749 --> 00:00:23,730
I'm good.

8
00:00:23,730 --> 00:00:25,971
It's a bit...

9
00:00:26,512 --> 00:00:28,453
It's a holiday for the kids.

10
00:00:29,153 --> 00:00:30,935
It's a bit of a juggling act today.

11
00:00:30,935 --> 00:00:34,437
But I did just come back from a short ski in Lévoche.

12
00:00:35,158 --> 00:00:36,358
So that was nice.

13
00:00:36,358 --> 00:00:37,919
Just last weekend.

14
00:00:39,380 --> 00:00:41,181
It was actually snowing while we were there.

15
00:00:41,181 --> 00:00:42,183
It was definitely good enough.

16
00:00:42,183 --> 00:00:43,036
It's not that high.

17
00:00:43,036 --> 00:00:45,329
It's like 1,100 meters at the top.

18
00:00:45,329 --> 00:00:49,509
So you need to be a bit lucky to have good snow, but it was actually a good weekend.

19
00:00:50,248 --> 00:00:51,428
So you were lucky.

20
00:00:52,389 --> 00:00:53,069
That's nice.

21
00:00:53,069 --> 00:00:57,752
I went to South Africa for almost two weeks.

22
00:00:57,752 --> 00:01:03,074
So we went to Cape Town and then we went to Johannesburg for an Indian wedding.

23
00:01:03,074 --> 00:01:09,157
So whatever you picture when you think of an Indian wedding in South Africa, that's exactly how it is.

24
00:01:09,157 --> 00:01:14,781
You know, the music, the outfits, the animals, you know, whatever you think, you're right.

25
00:01:15,632 --> 00:01:23,273
Like dancing to traditional Indian music with, in the background, a zebra walking by.

26
00:01:23,273 --> 00:01:39,433
Exactly, that's exactly how it was. I could say much more about that, but yeah. That's the thing, I feel like I had to get my

27
00:01:39,433 --> 00:01:49,341
biases in check as well. I feel like there were a few moments where I was like, if someone records me right now, canceled immediately, you know. It looks bad.

28
00:01:49,341 --> 00:01:52,764
But it was fine in the context, at the time, with the people. It was fine, right?

29
00:01:52,764 --> 00:01:55,291
But anyways.

30
00:01:55,291 --> 00:02:01,707
I actually learned from the guy that had this wedding that there's quite a big Indian community in South Africa.

31
00:02:01,707 --> 00:02:02,633
I didn't know that.

32
00:02:02,633 --> 00:02:03,003
yes.

33
00:02:03,003 --> 00:02:04,715
And it was also, I mean, it was fun.

34
00:02:04,715 --> 00:02:06,216
Cape Town is really beautiful.

35
00:02:06,216 --> 00:02:11,359
But it was also a lot of, I don't want to say educational, but there were a lot of interesting insights, right?

36
00:02:11,359 --> 00:02:14,551
Because the history of South Africa is very recent, right?

37
00:02:14,551 --> 00:02:15,772
With apartheid and all these things.

38
00:02:15,772 --> 00:02:23,846
So a lot of the people we talked to actually have that very close, you know, like how it was to live through it, like the parents of the groom, right?

39
00:02:23,846 --> 00:02:25,247
For the wedding I was there for.

40
00:02:25,247 --> 00:02:26,520
So yeah, it was very interesting.

41
00:02:26,520 --> 00:02:27,509
It was a very nice trip.

42
00:02:27,509 --> 00:02:28,262
But...

43
00:02:28,262 --> 00:02:29,643
Happy to be back.

44
00:02:29,643 --> 00:02:38,229
Happy to be back. We had some listeners, people pinging me like, we want Monkey Patching. And I was like, okay, enough is enough.

45
00:02:38,229 --> 00:02:40,720
So yeah, exactly.

46
00:02:40,720 --> 00:02:41,730
Exactly.

47
00:02:41,730 --> 00:02:42,931
So wait.

48
00:02:43,852 --> 00:02:44,592
Yeah, exactly.

49
00:02:44,592 --> 00:02:44,947
Yeah.

50
00:02:44,947 --> 00:02:45,733
So we know more.

51
00:02:45,733 --> 00:02:46,473
We're back.

52
00:02:46,473 --> 00:02:48,113
So this is it.

53
00:02:48,154 --> 00:02:56,358
And you also have news in regards to your new endeavor, right?

54
00:02:56,358 --> 00:02:57,776
Like you're looking for help.

55
00:02:57,776 --> 00:02:58,408
You're hiring.

56
00:02:58,408 --> 00:02:59,491
We're hiring.

57
00:02:59,491 --> 00:03:00,068
Yeah, yeah.

58
00:03:00,068 --> 00:03:02,050
What's it about, and who are you looking for?

59
00:03:02,053 --> 00:03:03,895
Well, you can actually find the vacancy.

60
00:03:03,895 --> 00:03:04,683
Make some advertisement.

61
00:03:04,683 --> 00:03:06,137
Do we need like a ka-ching?

62
00:03:06,137 --> 00:03:08,139
Like a sound effect or something?

63
00:03:08,139 --> 00:03:11,611
We should have these things, huh?

64
00:03:11,611 --> 00:03:13,403
When we make advertisements.

65
00:03:13,403 --> 00:03:15,624
But maybe self-advertisement is okay.

66
00:03:16,881 --> 00:03:20,204
But, so you can find the vacancy on Cambryo.

67
00:03:20,204 --> 00:03:26,041
With a C: C-A-M-B-R-Y-O dot com. Cambryo is our...
68
00:03:26,041 --> 00:03:29,316
quote unquote official venture studio.

69
00:03:29,316 --> 00:03:31,178
The quotes are something to explain another time.

70
00:03:31,178 --> 00:03:35,942
But our first venture that we're now building is Top of Mind, we're calling it.

71
00:03:37,585 --> 00:03:38,838
Top of Mind is

72
00:03:38,838 --> 00:03:43,978
It's a tool to remember, let's say, the small things about your network.

73
00:03:43,978 --> 00:03:53,078
Like when you go to a networking event or when you have a lunch or when you have whatever,
like you typically remember the faces, but all the details, like you very quickly forget,

74
00:03:53,078 --> 00:03:54,158
right?

75
00:03:54,897 --> 00:04:03,998
Like you had a lunch with someone, let's say a potential client of yours, and he gives a small tidbit of information.

76
00:04:03,998 --> 00:04:07,558
Like two weeks from now I'm having a small surgery on my knee.

77
00:04:07,646 --> 00:04:11,648
Like, not super relevant for the commercial thing that you're doing with this client.

78
00:04:11,648 --> 00:04:14,821
But like if you see them in three weeks, it would be nice to remember this, right?

79
00:04:14,821 --> 00:04:16,289
Like all these things.

80
00:04:16,289 --> 00:04:24,210
And we're making it very, very, very simple to record any input in whatever way: notes, a recording of a meeting.

81
00:04:24,210 --> 00:04:31,452
If you're afterwards in the car saying something to the app, everything gets recorded very
easily, whether it's text, video, images.

82
00:04:31,452 --> 00:04:39,358
What our solution does is, basically, from all that raw input, we have a layer that structures all of that into a structured knowledge base.

83
00:04:39,358 --> 00:04:41,199
So we make that translation automatically.

84
00:04:41,199 --> 00:04:45,522
And then next time you meet this person, we surface it for you.

85
00:04:45,522 --> 00:04:48,684
So you get all this information at the moment that you need it.

86
00:04:49,445 --> 00:04:52,027
And that's what we're doing with Top of Mind.

87
00:04:52,207 --> 00:04:59,502
And we have a vacancy open for an AI native developer, which is ridiculously hard to find
actually.

88
00:04:59,706 --> 00:05:03,428
Yeah, so we talked a bit about this before we hit record, right?

89
00:05:03,428 --> 00:05:06,619
So this is the job position for people following the video, right?

90
00:05:06,619 --> 00:05:07,669
This is it.

91
00:05:07,669 --> 00:05:10,251
Cambryo.com, then you can apply here.

92
00:05:10,251 --> 00:05:12,602
But you're looking for a very specific software developer, right?

93
00:05:12,602 --> 00:05:14,494
Like you call it AI software engineer.

94
00:05:14,494 --> 00:05:16,051
How would you describe that person?

95
00:05:16,051 --> 00:05:21,655
and it's actually because we talk about this a lot, like on the podcast for the last, I
don't know, the last year.

96
00:05:21,655 --> 00:05:33,685
I think AI native development basically means just using Codex or Claude Code to build whatever you're building, and not look in an IDE,

97
00:05:33,685 --> 00:05:37,967
don't do any traditional development, just orchestrate your coding agents.

98
00:05:38,491 --> 00:05:40,651
To me that is AI native development today.

99
00:05:40,651 --> 00:05:50,831
And I think it's also something that's still relatively recent, in the sense that Opus 4.5 in, what is it, November? It really enabled this, the performance went up a

100
00:05:50,831 --> 00:05:51,671
lot.

101
00:05:52,191 --> 00:05:54,831
And I'm looking for someone that basically doesn't write code anymore.

102
00:05:54,831 --> 00:05:58,631
I'm looking for someone that is very good at orchestrating coding agents.

103
00:06:00,251 --> 00:06:00,302
And...

104
00:06:00,302 --> 00:06:05,366
The question, the question I had before was: I do use Claude Code today.

105
00:06:05,426 --> 00:06:14,481
I'm not to the point that I don't look at the code anymore; for most cases, the default, let's say 90%, is that I still manually review, right?

106
00:06:14,481 --> 00:06:17,332
So in Claude Code you have like auto-approved changes, or you manually review.

107
00:06:17,332 --> 00:06:18,823
So then it says, I'm gonna do this, is that okay?

108
00:06:18,823 --> 00:06:19,928
And I just, yeah, okay.

109
00:06:19,928 --> 00:06:20,374
yeah, okay.

110
00:06:20,374 --> 00:06:22,585
Would I still qualify for this position?

111
00:06:23,471 --> 00:06:25,128
I think I could work with you.

112
00:06:25,128 --> 00:06:29,702
Maybe if you did a podcast or something, not an app, maybe something.

113
00:06:30,260 --> 00:06:44,240
No, but I think it's fine. Let's say, what you're mentioning is, in Claude Code you have this verification step. Like you need to say accept or do not accept for every step

114
00:06:44,240 --> 00:06:45,540
that it takes, every edit.

115
00:06:46,880 --> 00:06:49,120
I personally don't think it's very efficient.

116
00:06:49,120 --> 00:06:54,420
I don't have anything necessarily against it either, because I think it also helps to
build up trust.

117
00:06:54,420 --> 00:06:59,528
Like if you're making this full switch to AI native development, like this is a good
intermediate step to...

118
00:06:59,528 --> 00:07:00,718
build that trust.

119
00:07:00,840 --> 00:07:01,160
Yeah.

120
00:07:01,160 --> 00:07:02,330
Also not for me.

121
00:07:02,330 --> 00:07:02,902
I agree.

122
00:07:02,902 --> 00:07:10,988
There's also building up a bit of the mental model of the code, like, okay, these are the things that are happening here. Or sometimes, especially in the beginning, because you have a

123
00:07:10,988 --> 00:07:11,920
lot of rules, right?

124
00:07:11,920 --> 00:07:20,754
Like you want to say, Claude, don't make functions that are really big or whatever, or don't use this type of pattern, use this, you know, don't use a lot of nested for loops.

125
00:07:20,754 --> 00:07:25,929
And I think in the beginning, as you actually look at the code, you say, actually, don't do this,

126
00:07:25,929 --> 00:07:28,351
do that and also add a note for you to never do it again.

127
00:07:28,351 --> 00:07:34,875
And I noticed that in the beginning of a project that happens a bit more, but then as time goes on, it's less necessary.

128
00:07:34,875 --> 00:07:36,396
I can just say, okay, yes, yes, yes, yes.

129
00:07:36,396 --> 00:07:40,399
And then I get to a point where, once it gives me the plan, I just approve.

130
00:07:40,399 --> 00:07:47,436
I also think there's a bit of, so there is getting more confidence in the model, but I think there's also a bit of a learning thing as well.

131
00:07:47,436 --> 00:07:54,150
Because sometimes I do request some things, and it's less and less, but still sometimes I think the prompts are too broad.

132
00:07:54,298 --> 00:07:55,809
And then I notice that it's producing stuff.

133
00:07:55,809 --> 00:07:56,970
I'm like, actually, that's not what I want.

134
00:07:56,970 --> 00:07:57,160
Right.

135
00:07:57,160 --> 00:07:59,602
So I think, for me, it's been helpful.

136
00:07:59,602 --> 00:08:04,546
And still, sometimes I do auto-approve, but it is still an exception today.

137
00:08:04,827 --> 00:08:05,727
Right.

138
00:08:05,762 --> 00:08:11,946
I think for me, at least, and I think for a lot of people, it's like: there was typing code, and the tab completion models.

139
00:08:12,307 --> 00:08:14,446
And then you went to like, I can just do everything.

140
00:08:14,446 --> 00:08:15,727
I'm just going to auto-approve everything.

141
00:08:15,727 --> 00:08:18,789
And then you just get this huge pile of code.

142
00:08:18,789 --> 00:08:23,733
And then you start to notice that there's a lot of code duplication.

143
00:08:23,845 --> 00:08:28,613
Like things are not really organized anymore, and you kind of have to find a bit of an in-between, right?

144
00:08:28,613 --> 00:08:32,877
Like it's not like it doesn't just work if you just say build me an app, right?

145
00:08:32,877 --> 00:08:33,269
So.

146
00:08:33,269 --> 00:08:33,889
true.

147
00:08:33,889 --> 00:08:45,734
I think, actually, it wasn't planned this way, but today I created a new blog post on this: how to get started with AI native development, specifically on Claude Code

148
00:08:45,734 --> 00:08:46,045
though.

149
00:08:46,045 --> 00:08:57,050
And the reason is because, over the last two weeks, there were three moments where I had to onboard someone on a project and switch them from traditional development to AI

150
00:08:57,050 --> 00:08:58,201
native development.

151
00:08:58,201 --> 00:08:58,598
really?

152
00:08:58,598 --> 00:08:59,705
And it wasn't for Cambryo?

153
00:08:59,705 --> 00:09:00,184
with...

154
00:09:00,184 --> 00:09:01,267
No, it was not for Cambryo.

155
00:09:01,267 --> 00:09:05,412
I'm supporting some students.

156
00:09:05,894 --> 00:09:06,940
wow, cool.

157
00:09:06,940 --> 00:09:11,064
and they still see very little in their curriculum on this.

158
00:09:11,064 --> 00:09:19,612
So we switched them to Claude Code, and what you notice is that people are researching themselves, but then you very quickly find...

159
00:09:20,104 --> 00:09:24,824
sort of old information, like four, five, six months old information on this.

160
00:09:24,824 --> 00:09:34,564
And it says, ah yeah, when you go to plan mode, you need to use this model, or this and this in your CLAUDE.md, or all these things, you need to have this full spec file before you

161
00:09:34,564 --> 00:09:35,264
can start.

162
00:09:35,264 --> 00:09:40,944
And at that moment, that was relevant, but like so much changed since November, December,
right?

163
00:09:40,944 --> 00:09:46,644
Like, I think a lot of it just works out of the box now, it works very intuitively.

164
00:09:46,644 --> 00:09:50,098
And these are just some pointers that are probably outdated again in two months.

165
00:09:50,098 --> 00:09:50,884
Yeah, it's true.

166
00:09:50,884 --> 00:09:52,170
These things go by very fast.

167
00:09:52,170 --> 00:09:55,678
So if you want to get into this, like AI native coding, just give it a try.

168
00:09:55,982 --> 00:10:05,582
And maybe just one thing, because I do have the question: for juniors, like people that are less experienced, is it different for them to pick these things up?

169
00:10:05,582 --> 00:10:16,244
Or when you coach them, do you realize that actually, like, do you need to be quote-unquote senior to be efficient with AI native coding?

170
00:10:16,244 --> 00:10:18,956
Or do you think they just need to be taught?

171
00:10:19,606 --> 00:10:21,639
I think they pick it up very quickly, to be honest.

172
00:10:21,639 --> 00:10:24,492
These are also, let's say, computer science students, right?

173
00:10:24,492 --> 00:10:27,884
Like it's not that the underlying concepts are new to them.

174
00:10:28,205 --> 00:10:35,853
What they of course do not have, but what you also wouldn't have starting out with traditional programming: they have very little architecture experience.

175
00:10:36,110 --> 00:10:36,749
Hmm.

176
00:10:36,749 --> 00:10:47,038
And to me, I'm not sure how I feel about it yet, or how to tackle it, because with traditional coding, you become one person in a team.

177
00:10:47,038 --> 00:10:50,280
And the team has an idea on, you're going to do this, you're going to build this.

178
00:10:50,280 --> 00:10:52,032
And this is the solution that we're building together.

179
00:10:52,032 --> 00:10:58,756
But with AI-driven coding, you're suddenly, you become the team lead of five different
agents.

180
00:10:59,257 --> 00:11:03,360
And that implies that you need to have a bit of a broader view on

181
00:11:03,360 --> 00:11:05,200
What does the architecture look like?

182
00:11:05,240 --> 00:11:07,960
Yeah, it's a difficult problem.

183
00:11:07,960 --> 00:11:14,008
And maybe I'm also overthinking it, because actually picking it up and just starting seems to be very easy for them.

184
00:11:14,261 --> 00:11:15,211
Okay, well, that's good.

185
00:11:15,211 --> 00:11:15,682
That's good.

186
00:11:15,682 --> 00:11:17,304
It's a true data point, right?

187
00:11:17,304 --> 00:11:19,965
And I think that's also what we were talking about before.

188
00:11:20,026 --> 00:11:22,302
You mentioned it's hard to find these profiles.

189
00:11:22,302 --> 00:11:27,392
And it's also interesting to hear, because we see a lot of this in blog posts, and there are two types of people.

190
00:11:27,392 --> 00:11:37,217
I mean, today I think it's a bit less extreme, let's say, but there are the people that are 100x developers, true believers that you should never code manually again, and they're

191
00:11:37,217 --> 00:11:38,387
really evangelists.

192
00:11:38,387 --> 00:11:40,018
And then there's the people that...

193
00:11:40,508 --> 00:11:45,618
and this group has been decreasing, the people that say, like, it's just hype.

194
00:11:45,618 --> 00:11:49,765
But apparently the signals we get are not yet the reality.

195
00:11:51,157 --> 00:12:01,174
Well, I'm seeing a lot of people now for the vacancy that we have open, and why that's also interesting is because you hear, like, they're currently working somewhere else, and you

196
00:12:01,174 --> 00:12:04,610
hear what they're doing there and I think the reality is that for

197
00:12:04,610 --> 00:12:11,820
a lot of companies, maybe tech startups aside, a lot of established companies, they are very far from adopting an AI native workflow.

198
00:12:11,820 --> 00:12:17,227
Like they maybe give their developers a Copilot license or something, right?
right?

199
00:12:17,227 --> 00:12:19,680
Like it's a bit underwhelming.

200
00:12:19,680 --> 00:12:22,873
So I think there is still quite a way to go.

201
00:12:22,873 --> 00:12:25,444
Interesting. You know, very cool.

202
00:12:25,444 --> 00:12:27,944
I think maybe, so for people that want to...

203
00:12:27,944 --> 00:12:28,824
This wasn't planned, right?

204
00:12:28,824 --> 00:12:35,052
But maybe for people that want to get started, you would say this is the go-to for today,
tomorrow.

205
00:12:35,052 --> 00:12:39,487
is maybe bit overstating it, but I think it's a very short document, right?

206
00:12:39,487 --> 00:12:49,117
You only lose, I don't know, five minutes reading it, and I think it's a good starting point to really go all-out AI native development with Claude Code.

207
00:12:49,674 --> 00:12:53,607
Sounds good, and good luck also filling the position.

208
00:12:53,607 --> 00:12:58,493
Again, it's cambryo.com, and then you can just look for jobs and apply there.

209
00:12:58,493 --> 00:12:59,034
all right.

210
00:12:59,034 --> 00:13:01,317
So what do we have for today?

211
00:13:01,317 --> 00:13:08,331
VentureBeat argues OpenAI's OpenClaw move is a pivot from chatbots to agents that take actions across apps and systems.

212
00:13:08,331 --> 00:13:12,043
The tension is that OpenClaw's fast-and-loose openness helped it go viral.

213
00:13:12,043 --> 00:13:16,505
So what happens when enterprise guardrails and safety expectations move in?

214
00:13:18,095 --> 00:13:20,547
A few things to unpack there.

215
00:13:20,716 --> 00:13:21,731
Yes, please do.

216
00:13:21,731 --> 00:13:23,032
So first things first: OpenClaw.

217
00:13:23,032 --> 00:13:24,414
Previously Moltbot.

218
00:13:24,414 --> 00:13:27,057
Even further back, Clawdbot.

219
00:13:27,057 --> 00:13:27,918
Now OpenClaw.

220
00:13:27,918 --> 00:13:30,499
The title says here OpenAI's acquisition of OpenClaw.

221
00:13:30,499 --> 00:13:31,901
That actually did not happen.

222
00:13:31,901 --> 00:13:37,407
Peter Steinberger, the author of OpenClaw, basically got hired by OpenAI.

223
00:13:37,957 --> 00:13:41,799
OpenAI has also dedicated resources to further develop OpenClaw.

224
00:13:41,799 --> 00:13:53,286
For the people that don't know, OpenClaw is basically an agentic loop that you deploy on
your local-ish system, like your own system, it doesn't have to be local, and that can

225
00:13:53,286 --> 00:13:56,167
call a lot of different models to do a lot of different things.

226
00:13:56,167 --> 00:14:00,560
And because it is local, it also has access to do a lot of things with...

227
00:14:00,560 --> 00:14:04,475
which creates a lot of opportunities, but it also creates some security issues.

228
00:14:04,475 --> 00:14:06,478
But it is extremely powerful.

229
00:14:06,478 --> 00:14:09,680
Also in a sense that it can create its own skills.

230
00:14:09,680 --> 00:14:17,286
So if you say, for example, I find it important that you know how to generate PDFs from Markdown, it's going to research how to do that and create a skill for that.

231
00:14:17,286 --> 00:14:18,648
And in the future it can do that.

232
00:14:18,648 --> 00:14:25,114
If you say, I want you, from now on, to manage our GitHub issues in this and this and this way.

233
00:14:25,255 --> 00:14:28,938
Please do that, notify me when something goes wrong, give us an update on how it goes.

234
00:14:28,938 --> 00:14:29,898
It then does that.

235
00:14:29,898 --> 00:14:32,880
We, for example, use this for Top of Mind, for issue management.

236
00:14:32,880 --> 00:14:36,364
So it is extremely strong.

237
00:14:36,364 --> 00:14:47,193
And now, with the pseudo-acquisition of OpenClaw, VentureBeat is basically arguing that it's a bit the end of the traditional chat era,

238
00:14:47,274 --> 00:14:57,178
as in ChatGPT, and that we will much more have something that we'll probably still chat with, but that can take way more actions, because ChatGPT these days is still

239
00:14:57,178 --> 00:15:04,542
very much: I request something and it gives me some text back. But it's not going to be able to send an email for me, right?

240
00:15:04,542 --> 00:15:11,835
Or remind me tomorrow at five o'clock that I need to go to the doctor or like a lot of
actions it still cannot take, right?

241
00:15:12,370 --> 00:15:24,667
Do you agree with this opinion? Or, maybe before you answer: I think I also saw somewhere that OpenClaw was already sponsored by OpenAI, but now it's still

242
00:15:24,667 --> 00:15:25,858
going to be something separate.

243
00:15:25,858 --> 00:15:27,305
I think it's still open source.

244
00:15:27,305 --> 00:15:28,300
I'm not sure.

245
00:15:28,300 --> 00:15:31,382
But I remember reading that it was still going to be.

246
00:15:31,382 --> 00:15:32,262
Sorry.

247
00:15:33,443 --> 00:15:34,110
Yeah.

248
00:15:34,110 --> 00:15:34,300
yeah.

249
00:15:34,300 --> 00:15:35,191
It will be open source.

250
00:15:35,191 --> 00:15:40,074
But in the end, it's very cool what Peter Steinberger did.

251
00:15:40,074 --> 00:15:50,379
It's very impressive, but it's more of a, let's say, creative effort than a super hard engineering challenge.

252
00:15:50,379 --> 00:15:56,483
Even if OpenAI doesn't exactly implement OpenClaw, they will implement an alternative.

253
00:15:56,483 --> 00:16:01,188
Because the code base is not super complex; it's just a very creative way of looking at this, what Peter Steinberger did.

254
00:16:01,188 --> 00:16:14,102
So I definitely do think that, where we now have ChatGPT as kind of your Google-Wikipedia combination on steroids,

255
00:16:14,242 --> 00:16:19,427
I think what we will very much go towards in the future is like your personal assistant on
your phone.

256
00:16:19,657 --> 00:16:21,298
Yeah.

257
00:16:21,298 --> 00:16:30,857
Also, I mean, it also said in the preview that you read: OpenClaw became very popular because it's very loose, very flexible, right?

258
00:16:30,857 --> 00:16:36,435
Like, the setup, you can connect with WhatsApp, with Telegram, with this, with that.

259
00:16:36,435 --> 00:16:38,383
There's a whole bunch of stuff out of the box already.

260
00:16:38,383 --> 00:16:40,145
In some ways I also feel like it's...

261
00:16:40,145 --> 00:16:47,875
Again, taking a lot of steps back and maybe this is a hard transition, but I remember when
we talked about generative AI years ago, I was thinking that like the models that were

262
00:16:47,875 --> 00:16:50,740
going to win are the models that are specialized for one task.

263
00:16:50,781 --> 00:16:53,072
And when ChatGPT came out, it was a bit the opposite, right?

264
00:16:53,072 --> 00:16:56,554
Like it was because everyone could use it, everyone would just go there and be impressed.

265
00:16:56,554 --> 00:16:58,745
And that's what kind of built momentum.

266
00:16:58,925 --> 00:17:08,440
And it feels a bit the same with OpenClaw, like in the sense that because it's so
flexible, like for example, if I, if I create an OpenClaw instance, right, I probably

267
00:17:08,440 --> 00:17:10,271
don't need WhatsApp, Telegram.

268
00:17:10,395 --> 00:17:12,638
or iMessage, whatever, right?

269
00:17:12,638 --> 00:17:18,944
I probably just want to feed one or two, but the fact that it's very, very flexible is
probably what made it so popular, right?

270
00:17:21,508 --> 00:17:26,825
Do you think that this is the way for like AI products to be trying to be as generic as
possible?

271
00:17:26,825 --> 00:17:35,135
Well, the underlying infrastructure is generic, but because you have this evolutionary
skills aspect, you can ask it to define skills.

272
00:17:35,457 --> 00:17:38,870
It very quickly becomes like niche focused on you.

273
00:17:39,052 --> 00:17:41,587
True, it's very adaptable.

274
00:17:41,644 --> 00:17:42,827
It's extremely adaptable.

275
00:17:42,827 --> 00:17:49,823
Even though the underlying LLM is very generic, the actual application of something like OpenClaw for you as a person is super niche for you.

276
00:17:49,823 --> 00:17:52,276
I think that is the extreme strength of this.

277
00:17:52,276 --> 00:17:53,117
Yeah.

278
00:17:53,117 --> 00:17:55,748
And also I saw a lot of the security concerns, right?

279
00:17:55,748 --> 00:17:59,398
Because again, there's a lot of stuff, right?

280
00:17:59,398 --> 00:18:00,761
It has a lot of accesses.

281
00:18:00,761 --> 00:18:07,056
And there were a lot of people on Reddit and blog posts, like calling out the security
concerns, right?

282
00:18:07,056 --> 00:18:14,610
And I think even if you go to OpenClaw, they have an announcement here that OpenClaw partners with VirusTotal for skill security, right?

283
00:18:15,591 --> 00:18:16,922
Is this something that worries you at all?

284
00:18:16,922 --> 00:18:18,384
Like, do you think about these things or?

285
00:18:18,384 --> 00:18:23,128
I think you need to be worried if you have people using this that completely ignore it.

286
00:18:23,128 --> 00:18:28,194
One of the attack vectors is that there is a skills hub and people start downloading skills from other people.

287
00:18:28,194 --> 00:18:33,639
That is a very easy one, because I can basically inject stuff into those prompts that does adversarial stuff.

288
00:18:33,639 --> 00:18:34,819
You can also be dumb.

289
00:18:34,819 --> 00:18:41,394
Make your OpenClaw gateway open to the internet, to the whole world, so everybody can log in. That's just dumb, but it's happened a lot.

290
00:18:41,394 --> 00:18:49,718
You can give it access to stuff it shouldn't have access to, and that creates security issues. All of these just mean you need to understand what you're using and the

291
00:18:49,718 --> 00:18:55,161
powers it has. And if you don't understand it, then you probably shouldn't use it, right? From a security point of view.

292
00:18:55,681 --> 00:18:58,682
So I think, like, I want to nuance that a bit, right?

293
00:18:59,158 --> 00:19:08,082
I mean, in some ways, OpenClaw is so easy to set up that people that don't really know what they're doing can actually kind of do it.

294
00:19:08,082 --> 00:19:09,873
But that's what I think the danger is, right?

295
00:19:09,873 --> 00:19:13,225
Like it's, you don't really know what you're doing, but you get something out of it.

296
00:19:13,225 --> 00:19:14,645
So it's like, it's fine.

297
00:19:14,645 --> 00:19:16,506
But when it really isn't.

298
00:19:16,506 --> 00:19:18,329
Hmm.

299
00:19:18,329 --> 00:19:26,068
So for example, the claw bots that we're using are set up in an isolated environment with just the access they need for the skills they have.

300
00:19:26,068 --> 00:19:29,919
So the security issues are very limited.

301
00:19:29,919 --> 00:19:33,339
And how many OpenClaw bots do you have?

302
00:19:34,059 --> 00:19:34,619
Two.

303
00:19:34,619 --> 00:19:38,023
Do you have any... is this all for Top of Mind, or do you have one for personal use as well?

304
00:19:38,023 --> 00:19:39,889
One personal, one Top of Mind.

305
00:19:40,479 --> 00:19:42,099
One personal, one Top of Mind.

306
00:19:42,179 --> 00:19:43,079
cool.

307
00:19:43,179 --> 00:19:45,319
Maybe this is a side question.

308
00:19:46,219 --> 00:19:52,039
There are a lot of OpenClaw, Clawdbot, Moltbot alternatives.

309
00:19:52,859 --> 00:19:56,919
There's like nano-something, ZeroClaw.

310
00:19:57,879 --> 00:20:00,319
You tried nano... Nanobot?

311
00:20:00,319 --> 00:20:01,637
Nanobot or nanoclaw?

312
00:20:01,637 --> 00:20:02,728
I think it's called.

313
00:20:03,688 --> 00:20:04,141
And...

314
00:20:04,141 --> 00:20:13,451
It's a bit less flexible, but the good thing is that it runs on Apple containers, so it's a bit
more isolated.

315
00:20:13,451 --> 00:20:19,637
What's also very good is that it runs on the Anthropic SDK, which means you can actually use
your Anthropic subscription.

316
00:20:20,198 --> 00:20:25,742
OpenClaw you can only use via the Anthropic API, which is way more expensive.

317
00:20:25,742 --> 00:20:27,263
uh

318
00:20:27,263 --> 00:20:29,799
But yeah, I think this OpenClaw is, what?

319
00:20:29,799 --> 00:20:31,672
Like six weeks old?

320
00:20:32,255 --> 00:20:35,409
So we'll see a lot of these things popping up in the coming year, I think.

321
00:20:35,409 --> 00:20:39,181
And you see, like, the stars, it's like 200,000 stars.

322
00:20:39,181 --> 00:20:44,487
Like, it's insane, the popularity. It's not even a hockey stick, it's just a
vertical line.

323
00:20:44,487 --> 00:20:46,969
If you look at it, it's like it's really crazy.

324
00:20:46,989 --> 00:20:47,689
So there's none.

325
00:20:47,689 --> 00:20:50,932
And also, just to name a few others that I saw, like PicoClaw.

326
00:20:50,932 --> 00:20:52,174
This is written in Go.

327
00:20:52,174 --> 00:20:57,078
And there was another one that I heard on the Changelog podcast, actually, ZeroClaw, which
is in Rust.

328
00:20:57,078 --> 00:20:58,643
um

329
00:20:58,643 --> 00:21:01,449
It shows how easy it is to implement this, right?

330
00:21:01,449 --> 00:21:07,051
Exactly, but that's also what I was wondering. And you're running this where, actually? Is this
on a Mac Mini, or...

331
00:21:07,331 --> 00:21:18,203
No, I'm running this on, well, I do some testing on my Mac Mini, but the ones that I
actually use, I use on Google Cloud VMs.

332
00:21:18,946 --> 00:21:24,304
I was also looking, I mean, they're making these smaller, faster, like under 5 megabytes of
RAM, right?

333
00:21:24,304 --> 00:21:25,415
All these things.

334
00:21:25,916 --> 00:21:33,405
So I was also thinking, like, if I have a Raspberry Pi and I just want to have something
running, you know, just like a personal little system or something, you could just

335
00:21:33,405 --> 00:21:34,751
mess with it, right?

336
00:21:34,751 --> 00:21:35,511
true.

337
00:21:35,591 --> 00:21:38,991
But realistically speaking, you're not going to run the models locally.

338
00:21:39,151 --> 00:21:42,791
The models that you can run locally today, they're not good enough.

339
00:21:43,411 --> 00:21:50,131
I think to really have good performance, you still need to go the whole Sonnet 4.6 or Opus 4.6
route.

340
00:21:50,207 --> 00:21:57,127
Yeah, so your advice for people who want to go on the OpenClaw journey: just stick with
OpenClaw.

341
00:21:57,499 --> 00:22:05,787
For now, yeah, I would say for now stick with OpenClaw, and with respect to the security
issues, just think about what you're doing and don't be dumb.

342
00:22:06,203 --> 00:22:08,336
and then don't just say, do everything, right?

343
00:22:08,336 --> 00:22:11,750
Like say, just do this, do this, do this one by one.

344
00:22:11,811 --> 00:22:13,644
Don't try to go too fast as well, right?

345
00:22:13,644 --> 00:22:16,057
I think that's the thing.

346
00:22:16,438 --> 00:22:17,064
All right.

347
00:22:17,064 --> 00:22:25,573
You're normally gonna install something from the internet, and if you run the install
script and it says, I'm only gonna be installed if you give me sudo access to everything...

348
00:22:25,573 --> 00:22:27,483
I mean, you're not gonna install it probably, right?

349
00:22:27,483 --> 00:22:30,392
I mean, just keep thinking.

350
00:22:30,392 --> 00:22:33,572
Like, some people are like, yes, I read and agree to the terms, you know?

351
00:22:33,572 --> 00:22:34,892
Yes, yes.

352
00:22:35,372 --> 00:22:38,212
Just give me, just give me, me going.

353
00:22:42,109 --> 00:22:43,209
Yeah, yeah.

354
00:22:43,209 --> 00:22:44,649
It's a bit harsh, but I know what you're saying.

355
00:22:44,649 --> 00:22:47,949
It's like natural selection in the 21st century, right?

356
00:22:47,949 --> 00:22:53,409
It's like, it's like the next gen, like it's really like natural selection, right?

357
00:22:53,409 --> 00:22:54,660
Like you can't.

358
00:22:54,660 --> 00:22:58,543
You can't feed your family, y'know, like the genes stop, you know, like that's...

359
00:22:58,543 --> 00:22:59,867
All right.

360
00:22:59,867 --> 00:23:00,808
What is next?

361
00:23:00,808 --> 00:23:11,532
Next, we have a new paper that tests whether repo-level context files like agents.md
actually help coding agents finish tasks, and finds they can backfire.

362
00:23:11,532 --> 00:23:20,827
The punchline is a double hit: lower success rates and more than 20% higher inference
cost, hinting that, and I quote, "more context means more confusion."

363
00:23:20,827 --> 00:23:22,132
That's a good question.

364
00:23:22,132 --> 00:23:23,034
Here are the names.

365
00:23:23,034 --> 00:23:24,811
Are you gonna list all the names?

366
00:23:24,811 --> 00:23:29,006
No, I thought it was gonna be like MIT or something like this, but I don't know

367
00:23:29,945 --> 00:23:32,424
We don't know who the authors are, right?

368
00:23:32,621 --> 00:23:33,784
I'm not sure.

369
00:23:33,808 --> 00:23:35,312
What is this paper about?

370
00:23:35,890 --> 00:23:40,375
Wait, let me just quickly see if I can actually see which research institute it is
coming from.

371
00:23:40,375 --> 00:23:41,461
I can't see quickly.

372
00:23:41,461 --> 00:23:52,090
So it is about... so if you are doing AI-native coding, or at least a hybrid version of that,
you often have files in your context.

373
00:23:52,390 --> 00:23:59,196
Agents.md is an example where you specify, I want these and these types of agents, and
then your CLI can pick that up.

374
00:23:59,196 --> 00:24:01,079
the default track, quote unquote default, right?

375
00:24:01,079 --> 00:24:02,813
Like from the Linux Foundation.

376
00:24:02,813 --> 00:24:05,756
And Claude implements it as claude.md, that's what you mean?

377
00:24:05,756 --> 00:24:11,400
No, because weren't there like standards that were donated to, like, the Linux
Foundation, a new Linux Foundation thing?

378
00:24:11,400 --> 00:24:14,857
I thought that agents.md was, and MCP was, but I forgot the name.

379
00:24:14,857 --> 00:24:15,582
Let me just.

380
00:24:15,582 --> 00:24:25,892
Okay, yeah, in my mind agents.md, and it can very well be that it's donated, but it
is just a way to define, like, these are the subagents that you can call and that can

381
00:24:25,892 --> 00:24:26,722
do tasks.

382
00:24:26,722 --> 00:24:27,643
Right.

383
00:24:28,343 --> 00:24:34,486
Another example of stuff you have in your context is, like, claude.md, which is sort of your
memory.

384
00:24:34,486 --> 00:24:37,307
I think in Codex it's actually called memory.md.

385
00:24:37,307 --> 00:24:38,227
Not sure there.

386
00:24:38,227 --> 00:24:39,839
um

387
00:24:39,839 --> 00:24:50,318
Other things are, let's say, I have a to-do.md, or I have some description of, this is the
diagram of the architecture that we're using.

388
00:24:50,318 --> 00:24:52,751
And I have that in a Mermaid diagram.

389
00:24:52,751 --> 00:25:01,597
You have a lot of things, like extra information that we think is valuable to the LLM to
help us do better coding.

390
00:25:01,959 --> 00:25:08,178
So it's basically files that you include in your project, but they're metadata in a
way, and it's really just for the AI.

391
00:25:08,833 --> 00:25:13,768
Yeah, and we think that by having this in that context, they will perform better.

392
00:25:13,768 --> 00:25:15,930
I mean, that is a bit the hypothesis, right?

393
00:25:15,930 --> 00:25:20,473
And this paper actually looked into that.

394
00:25:20,473 --> 00:25:23,115
More specifically in the agents.md.

395
00:25:23,235 --> 00:25:25,382
Like, is it actually valuable?

396
00:25:25,382 --> 00:25:26,876
Look at this.

397
00:25:26,876 --> 00:25:27,581
Is it?

398
00:25:27,581 --> 00:25:38,860
There are probably a lot of asterisks needed here, but I think I can summarize what the
paper says: it is typically not, because it very quickly increases the context that

399
00:25:38,860 --> 00:25:45,456
you're working with significantly, which actually does not improve performance if you have
too big a context.

400
00:25:45,456 --> 00:25:47,998
And your inference costs very quickly go up.

401
00:25:47,998 --> 00:25:50,530
Your inference... why does your inference cost go up?

402
00:25:50,530 --> 00:25:58,242
Because you build up this chat history, and with every new request you send the whole history along again.
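To see how that compounds, here's a minimal sketch, with made-up token counts, of why a static context file that gets resent on every request inflates the total input tokens of a session:

```python
# Hypothetical illustration: each request resends the static context file
# (e.g. an agents.md) plus the full chat history accumulated so far.

def cumulative_input_tokens(turns: int, turn_tokens: int, context_file_tokens: int) -> int:
    """Total input tokens billed over a session where every request
    includes the context file and the whole history up to that turn."""
    total = 0
    history = 0
    for _ in range(turns):
        history += turn_tokens            # history grows by one turn per request
        total += context_file_tokens + history
    return total

without_file = cumulative_input_tokens(turns=20, turn_tokens=500, context_file_tokens=0)
with_file = cumulative_input_tokens(turns=20, turn_tokens=500, context_file_tokens=2000)
print(round(with_file / without_file, 2))  # 1.38: here a 2k-token file adds ~38% input tokens
```

The numbers are invented, but the shape matches the paper's point: a file that is dragged along with every inference request gets paid for on every turn.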

403
00:25:58,242 --> 00:26:01,436
Hmm, okay, I see. Do you agree with this, actually?

404
00:26:01,436 --> 00:26:04,299
I'm a bit surprised that they're saying this.

405
00:26:04,741 --> 00:26:05,753
Do you agree with this?

406
00:26:05,753 --> 00:26:10,821
What I would be interested in is, maybe it's actually in the paper, is when they tested
it.

407
00:26:11,141 --> 00:26:11,892
Hmm.

408
00:26:12,697 --> 00:26:13,548
And why am I saying that?

409
00:26:13,548 --> 00:26:23,907
Because what we see is with every model iteration, we see improvements in memory retention
across larger context windows.

410
00:26:23,907 --> 00:26:29,893
We see improvements on caching, for example. We see improvements on the actual usage of
subagents.

411
00:26:29,893 --> 00:26:33,173
So I would be interested to know, because like to me,

412
00:26:33,173 --> 00:26:36,609
There was a big change when Opus 4.5 was released.

413
00:26:37,512 --> 00:26:38,833
When you...

414
00:26:38,955 --> 00:26:41,050
A lot of things just work out of the box.

415
00:26:41,050 --> 00:26:42,560
Yeah, Very true.

416
00:26:42,560 --> 00:26:44,755
Like, it depends a bit on what the timeframe is.

417
00:26:44,755 --> 00:26:48,723
My personal opinion, like, because I use Claude Code a lot.

418
00:26:48,723 --> 00:26:52,627
I typically don't create specific sub-agents anymore.

419
00:26:53,848 --> 00:26:55,509
I do not know, no.

420
00:26:55,950 --> 00:27:04,076
I do sometimes specify that, and I did use it in the past, but it didn't like, to me it
didn't feel like it gave me anything, like it improved anything.

421
00:27:04,076 --> 00:27:13,374
Like, I used a UI/UX agent and a test suite agent and a product management agent,
but I don't think the performance necessarily became better.

422
00:27:13,374 --> 00:27:15,225
Maybe it's not worse either, but like.

423
00:27:15,442 --> 00:27:17,112
I was a bit indifferent to it.

424
00:27:17,163 --> 00:27:22,305
So you did it kind of to try it out, quote unquote, but you didn't feel that big of a
difference and you switched back.

425
00:27:22,305 --> 00:27:26,213
It's not because the models evolved or okay.

426
00:27:26,213 --> 00:27:26,815
Interesting.

427
00:27:26,815 --> 00:27:34,360
But maybe one thing, cause I saw on your, on your blog, the one that we talked about in
the beginning, you do mention a code simplifier.

428
00:27:34,360 --> 00:27:36,073
It's a plugin though.

429
00:27:36,418 --> 00:27:37,589
Yeah, this may be a good point.

430
00:27:37,589 --> 00:27:43,511
Maybe it actually overlaps a little bit with agents because a plugin is also sort of a
markdown file.

431
00:27:43,511 --> 00:27:46,872
The difference is, it's a bit like with skills also.

432
00:27:46,872 --> 00:27:57,335
It's like they're a bit passive in a sense that your LLM knows that they're there, but
they're only gonna sort of mount the skill when it's needed, when you request it.

433
00:27:57,335 --> 00:27:59,287
So it doesn't fill up context if you don't need it.

434
00:27:59,287 --> 00:28:01,100
So can you recap that?

435
00:28:01,100 --> 00:28:03,382
it's basically saying you.

436
00:28:03,617 --> 00:28:08,089
the paper is on agents.md, which is always in your context, or should always be in your
context, right?

437
00:28:08,089 --> 00:28:09,521
Like all the instructions are there.

438
00:28:09,521 --> 00:28:18,157
But with these types of plugins or skills, like how it's done, there's a very
small entry in your context which says, know that these skills are there, and these are

439
00:28:18,157 --> 00:28:19,427
the skills' descriptions.

440
00:28:19,427 --> 00:28:23,670
But from the moment it actually uses them, it mounts the skills, and only then it comes in
the context.

441
00:28:23,670 --> 00:28:27,492
So it's not really an issue that you need to think about, like...

442
00:28:27,652 --> 00:28:30,721
that you always drag this along with every inference request.
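A minimal sketch of that lazy-loading idea, with a hypothetical skill name and contents, contrasting the always-present base context (one short description per skill) with the full skill body that only enters the context when it's actually used:

```python
# Hypothetical structure: short skill descriptions live in the base context;
# the full skill instructions are "mounted" only when the agent invokes them.

SKILLS = {
    "code-simplifier": {
        "description": "Simplify and deduplicate code after a feature lands.",
        "body": "Long markdown with detailed step-by-step instructions... " * 50,
    },
}

def base_context() -> str:
    # Always present: one line per skill, not the full body.
    return "\n".join(f"skill {name}: {s['description']}" for name, s in SKILLS.items())

def context_with_skill(name: str) -> str:
    # Only when the skill is used does its body get added to the context.
    return base_context() + "\n" + SKILLS[name]["body"]

print(len(base_context()))                          # small, constant overhead
print(len(context_with_skill("code-simplifier")))   # large, but only paid on use
```

So unlike an agents.md that is resent on every request, the heavy part of a skill is a cost you only pay on the turns where it's invoked.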

443
00:28:30,721 --> 00:28:31,632
I see.

444
00:28:31,632 --> 00:28:32,062
I see.

445
00:28:32,062 --> 00:28:42,011
But that's, like, you're talking about agents.md, and maybe just to make
sure we're all on the same page, including the people listening: agents.md

446
00:28:42,011 --> 00:28:46,966
is basically a markdown file that describes more specifically what a subagent should do.

447
00:28:46,966 --> 00:28:47,306
Right.

448
00:28:47,306 --> 00:28:55,152
So the idea is that by being very specific in what one agent is
going to do, it's going to perform better than just being very generic.

449
00:28:55,152 --> 00:28:56,793
That's the assumption.

450
00:28:56,994 --> 00:28:57,815
Yeah.

451
00:28:57,815 --> 00:28:58,755
And then

452
00:28:58,875 --> 00:29:06,947
This agents.md is always in your context, so it always increases it. But, like, the code
simplifier, which is a plugin, is not always in the context.

453
00:29:06,947 --> 00:29:09,099
It's just listed, like, it's there if you want.

454
00:29:09,099 --> 00:29:15,932
And if the agent quote unquote double-clicks it, then it actually expands, and then it
consumes context.

455
00:29:15,932 --> 00:29:19,104
Claude also has subagents, which don't use agents.md.

456
00:29:19,104 --> 00:29:22,525
Do you know how the mechanism is for Claude?

457
00:29:22,545 --> 00:29:25,166
Is it more like a plugin, or is it more like agents.md?

458
00:29:26,001 --> 00:29:30,836
I think plugins are preferred over agents.md, but you could use agents.md.

459
00:29:30,836 --> 00:29:31,187
oh

460
00:29:31,187 --> 00:29:33,348
that Claude also has subagents, like...

461
00:29:33,348 --> 00:29:40,712
Well, before plugins, because plugins are still relatively new, you actually do have, in Claude,
something like, if you do /agents, you can define your agent, and then it

462
00:29:40,712 --> 00:29:43,909
creates a markdown per agent, if I'm not mistaken.

463
00:29:43,909 --> 00:29:45,502
kind of like the plugins.

464
00:29:45,502 --> 00:29:47,725
It's more like the plugins than the agents.md.

465
00:29:47,864 --> 00:29:51,928
No, I think it's very similar to agents.md, with the difference that you have this per
agent.

466
00:29:51,928 --> 00:29:52,810
I see, see, I see.

467
00:29:52,810 --> 00:29:54,634
Okay, interesting, interesting.

468
00:29:54,634 --> 00:29:57,159
Yeah, these things evolve very quickly.

469
00:29:57,159 --> 00:29:59,218
Yeah, so, to...

470
00:29:59,218 --> 00:30:05,138
think about, like, maybe more generically, does it make sense to have all this in
your context?

471
00:30:05,138 --> 00:30:16,438
Like, not necessarily agents.md, but, like, I think we always... especially before, again,
November, December, when Opus 4.5 came out, there was a lot of discussion: it's best

472
00:30:16,438 --> 00:30:21,110
practice to have your spec file there, and you have that there, and that there, and...

473
00:30:21,254 --> 00:30:28,686
I think a lot of that doesn't really hold anymore. Start from scratch, and only if you really
need something to be remembered, put it in your context, and otherwise don't.

474
00:30:29,052 --> 00:30:36,280
Yeah, no, I agree with that statement, but I also think if someone is starting today, I
would say have a spec file still.

475
00:30:36,340 --> 00:30:37,701
think a bit about...

476
00:30:38,122 --> 00:30:38,863
You disagree.

477
00:30:38,863 --> 00:30:41,406
Like for people that don't know, they never used this before.

478
00:30:41,406 --> 00:30:47,468
Yeah, well, it depends a bit, but, like, I think if you want to do a greenfield project, you
need to think, and that's a big mindset shift.

479
00:30:47,468 --> 00:30:48,631
You should read my blog post.

480
00:30:48,631 --> 00:30:53,432
Like, before, in traditional development, you're going to think, I'm going to build this
technical feature.

481
00:30:53,432 --> 00:30:55,593
I got assigned to build this technical feature.

482
00:30:55,593 --> 00:30:56,273
Right.

483
00:30:56,273 --> 00:31:00,686
And now you need to think about, like, I'm going to build this functionality that the user
wants.

484
00:31:00,686 --> 00:31:04,872
Because it's your agent that's gonna build it technically, but you are gonna act as the user.

485
00:31:04,872 --> 00:31:06,430
Like this is the functionality that I want you to do.

486
00:31:06,430 --> 00:31:08,328
And then there's a bit of a different way of looking at it.

487
00:31:08,328 --> 00:31:11,452
And for you, it's easy to prepare that in a spec sheet.

488
00:31:11,452 --> 00:31:15,798
I think that's fine, but to really focus on what is the functionality that you're
building.

489
00:31:15,952 --> 00:31:20,223
Yeah, I think also, again, so I don't know, maybe when we say spec sheet...

490
00:31:20,223 --> 00:31:28,003
I'm not sure if we're saying exactly the same thing, but one thing that I'm still a fan
of is, like, having software requirements, you know, just something like, what are

491
00:31:28,003 --> 00:31:30,263
the steps that we want for this project?

492
00:31:30,263 --> 00:31:35,603
Because also it's easier to chat with your agent and say, okay, what should we tackle
next, or where are we here?

493
00:31:35,763 --> 00:31:37,244
And it kind of...

494
00:31:37,244 --> 00:31:41,324
It's easier for me to just kind of have everything like this is what I want and to finish.

495
00:31:41,824 --> 00:31:44,064
And this is where we are right now.

496
00:31:44,064 --> 00:31:44,384
Right.

497
00:31:44,384 --> 00:31:49,004
And also for me to reflect a bit, or if I change my mind, like, actually this should
work like that.

498
00:31:49,004 --> 00:31:52,204
It is something that I still find useful.

499
00:31:52,624 --> 00:31:53,424
Right.

500
00:31:53,463 --> 00:32:00,164
Which in a lot of ways has nothing to do with agents so much, but also, like, don't keep
things in your head, like, where do you want to go with this?

501
00:32:00,324 --> 00:32:00,884
Right.

502
00:32:00,884 --> 00:32:04,764
Having a bit of the path of, okay, you want to have this app at the end.

503
00:32:04,764 --> 00:32:07,156
So these are the steps that we need to take to get there.

504
00:32:07,487 --> 00:32:10,260
But to me, is like project management, right?

505
00:32:10,260 --> 00:32:15,997
Like, these become... to me, how I do this is, these are backlog items, or these are
items that we're working on in this sprint.

506
00:32:15,997 --> 00:32:17,418
To me, that's not in the code base.

507
00:32:17,418 --> 00:32:21,163
To me, for what we're using, for example, it's in GitHub Issues and GitHub
Projects.

508
00:32:21,163 --> 00:32:23,902
Or it could be in Jira or it could be in Trello or whatever, right?

509
00:32:23,902 --> 00:32:24,612
Like.

510
00:32:24,612 --> 00:32:27,739
Yeah, I think for me a lot of times I'm doing things myself.

511
00:32:27,739 --> 00:32:29,120
So that's why I just...

512
00:32:30,144 --> 00:32:32,434
Yeah, because I just put it in the codebase, because...

513
00:32:32,434 --> 00:32:36,018
It's not something that you necessarily need to do in the codebase.

514
00:32:36,508 --> 00:32:37,788
No, but that I agree.

515
00:32:37,788 --> 00:32:43,608
I don't think it needs to be in the code base, but I think for me, having it in the code base
also makes it easier to chat with Claude and say, okay, what do you think should be the next

516
00:32:43,608 --> 00:32:43,908
task?

517
00:32:43,908 --> 00:32:47,908
Like, you know, sometimes I don't even make that decision, you know. I say, what should we
tackle next?

518
00:32:47,908 --> 00:32:49,748
And then it says, ah, we should do this, this and this.

519
00:32:49,748 --> 00:32:51,048
I was like, okay, maybe this is too big.

520
00:32:51,048 --> 00:32:51,988
Maybe let's do this first.

521
00:32:51,988 --> 00:32:53,216
Or let's, you know, like...

522
00:32:53,216 --> 00:33:01,678
But for that, I actually ask it to check the GitHub issues and to use the gh client, the
CLI, to do that.

523
00:33:02,139 --> 00:33:04,439
But that's a fair point.

524
00:33:04,439 --> 00:33:13,322
having that knowledge of what's next, easily accessible, whether it be in the code base or
whether it be somewhere else, but like give your agent access to things that you as a

525
00:33:13,322 --> 00:33:15,348
developer would also want access to.

526
00:33:15,348 --> 00:33:17,759
Yeah, no, but then I fully agree.

527
00:33:17,759 --> 00:33:24,574
I think we're pretty much saying the same thing, just in different places, and I'm also
not opinionated on where it should be.

528
00:33:25,075 --> 00:33:26,015
Who?

529
00:33:26,696 --> 00:33:29,477
No, not today, but how?

530
00:33:29,778 --> 00:33:31,379
Not today today.

531
00:33:31,379 --> 00:33:32,359
Cool.

532
00:33:32,380 --> 00:33:33,453
What do we have next?

533
00:33:33,453 --> 00:33:39,065
ETH Zurich, uh, the Swiss university, that's where the paper comes from.

534
00:33:39,349 --> 00:33:44,483
Interesting that they didn't include, interesting that it wasn't easy to find because it's
a...

535
00:33:44,483 --> 00:33:49,763
Well, actually, if you look at footnote number one under it, it says
Department of Computer Science.

536
00:33:49,783 --> 00:33:52,163
So we were dumb.

537
00:33:53,143 --> 00:33:54,323
Yeah, it's a nice problem.

538
00:33:54,323 --> 00:33:54,503
Yeah.

539
00:33:54,503 --> 00:33:55,283
Yeah, exactly.

540
00:33:55,283 --> 00:33:56,143
Exactly.

541
00:33:56,364 --> 00:33:56,806
cool.

542
00:33:56,806 --> 00:33:57,759
What else do we have?

543
00:33:57,759 --> 00:34:09,500
Simon Willison highlights research suggesting AI can increase work intensity instead of
easing it, especially when productivity gains mask burnout.

544
00:34:09,500 --> 00:34:11,633
The provocative angle is managerial.

545
00:34:11,633 --> 00:34:18,780
If AI boosts throughput, how do organizations prevent that extra capacity from turning
into an always-on expectation?

546
00:34:18,780 --> 00:34:19,953
I've noticed this as well.

547
00:34:19,953 --> 00:34:20,554
Same way.

548
00:34:20,554 --> 00:34:25,706
So I was working on like a very big refactor yesterday.

549
00:34:25,706 --> 00:34:29,126
And actually, I already knew that this was a problem yesterday.

550
00:34:29,126 --> 00:34:32,346
So it's not really a problem from yesterday; it came up a few weeks ago.

551
00:34:32,346 --> 00:34:34,726
But yesterday is a good example.

552
00:34:34,726 --> 00:34:46,397
Like, yesterday I did, in the afternoon, like, I don't know, in four hours or something, a
refactor that would normally very, very easily have taken me two weeks.

553
00:34:46,397 --> 00:34:47,291
Mm-hmm.

554
00:34:47,781 --> 00:34:51,581
And our application is in the very early stage of its life cycle.

555
00:34:51,581 --> 00:34:53,901
like a lot of things are changing at the same time.

556
00:34:53,901 --> 00:34:54,521
Right.

557
00:34:54,521 --> 00:35:02,661
And you fire off the agent, work on this, and then I'm going to, like, split my pane
and I'm going to say, uh, also work on this.

558
00:35:02,661 --> 00:35:04,221
Like, this is separate in the code base.

559
00:35:04,221 --> 00:35:06,641
You can already build a plan for that and also execute that.

560
00:35:06,641 --> 00:35:07,681
I'm going to split the pane again.

561
00:35:07,681 --> 00:35:10,041
And I'm going to have a third one that's going to work on that.

562
00:35:10,041 --> 00:35:11,601
And then the fourth one, I'm going to work on that.

563
00:35:11,601 --> 00:35:13,961
And then suddenly, okay, four is not enough.

564
00:35:13,961 --> 00:35:15,161
I'm going to open a new tab.

565
00:35:15,161 --> 00:35:16,267
I'm going to like.

566
00:35:16,267 --> 00:35:26,641
And you're juggling these, I don't know, at some point, six different, let's say, main
agents at the same time, and you need to have this context of all these things in your

567
00:35:26,641 --> 00:35:27,402
mind.

568
00:35:27,402 --> 00:35:29,703
And you're hyper-focused on this.

569
00:35:29,703 --> 00:35:32,605
And it's also a bit addictive because you move very quickly.

570
00:35:32,605 --> 00:35:38,627
But after four hours, like I said, I was very happy with what I'd done, but you're
mentally tired.

571
00:35:38,788 --> 00:35:42,097
Way more than I would have just, like,

572
00:35:42,097 --> 00:35:43,773
doing traditional coding for four hours.

573
00:35:43,773 --> 00:35:46,459
Yeah, but I think it's more because of context in your case, right?

574
00:35:46,459 --> 00:35:51,078
Like, you have more context to manage, and it drains your energy more, right?

575
00:35:51,166 --> 00:35:58,110
It drains your energy more, but also, like, it's because of AI coding that this is possible,
right?

576
00:35:58,450 --> 00:36:01,812
Because like you fire off an agent and it's going to take five minutes to do something.

577
00:36:01,812 --> 00:36:05,001
So that gives me five minutes to do something else with another agent.

578
00:36:05,001 --> 00:36:06,581
Yeah, yeah.

579
00:36:06,581 --> 00:36:07,741
Yeah, I see what you're saying.

580
00:36:07,741 --> 00:36:09,801
I think for me, well, two things.

581
00:36:09,801 --> 00:36:13,461
If you still auto-approve... that only happens if you auto-approve.

582
00:36:13,461 --> 00:36:18,941
I think if you still need to manually approve stuff, it's a bit less, right?

583
00:36:19,441 --> 00:36:22,881
there is a window, right?

584
00:36:22,881 --> 00:36:25,841
Like that you can still stay in front of a computer and wait.

585
00:36:25,981 --> 00:36:28,432
And I think if it's five minutes, that's too long, right?

586
00:36:28,432 --> 00:36:30,820
For me, when that happens still, like...

587
00:36:30,820 --> 00:36:38,080
It's just because I have messages, I have emails, you know. So, like, my context
switching doesn't... like, I don't think it snowballs as much, right?

588
00:36:38,080 --> 00:36:43,380
Because I need to reply to this message on Slack, I need to do this, ah, there's this
email, then I go back, okay, now do this.

589
00:36:44,040 --> 00:36:48,300
But it is interesting. I never thought of that, actually.

590
00:36:48,869 --> 00:36:54,493
But it's maybe not a huge problem in a sense that...

591
00:36:54,493 --> 00:36:56,555
Because this is a bit like blowing the whistle, right?

592
00:36:56,555 --> 00:37:01,139
Like, we need to make sure that not everybody's gonna burn out.

593
00:37:01,139 --> 00:37:05,442
But I think this is mainly a problem that you can have working on a project yourself.

594
00:37:05,442 --> 00:37:08,204
Like, on a hobby project, you're gonna go all out on this.

595
00:37:08,204 --> 00:37:09,645
Or as like...

596
00:37:10,249 --> 00:37:17,418
maybe as a technical founder, because you yourself have this complete vision of
the product and you can work on a lot of these things at the same time.

597
00:37:17,418 --> 00:37:26,939
But if you're working on a team and like you're assigned to these and these tickets, like
it's less of an issue I think because you're gonna work more sequentially by the nature of

598
00:37:26,939 --> 00:37:30,373
the work that you do and the type of responsibility that you have.

599
00:37:30,751 --> 00:37:33,072
Yeah, maybe that's, I think, the responsibility part, right?

600
00:37:33,072 --> 00:37:39,835
Cause also, I mean, your scope, as a founder or whatever, like the scope is basically the
whole thing, right?

601
00:37:39,835 --> 00:37:41,965
And if you have a team, your scope is going to be less.

602
00:37:41,965 --> 00:37:46,496
And even if you could do more, you probably don't want to because someone else is going to
take ownership of that.

603
00:37:46,496 --> 00:37:49,035
And there's also be like, so it's true.

604
00:37:49,035 --> 00:37:50,095
It's true.

605
00:37:50,095 --> 00:37:57,159
But yeah, I'm also wondering like, when you say this, so let me organize my thoughts a
bit.

606
00:37:57,159 --> 00:38:02,547
Because a lot of times I feel like, at work, you're paid for the value you bring.

607
00:38:02,547 --> 00:38:03,508
Right.

608
00:38:03,729 --> 00:38:10,517
And it's like, if you can get two weeks' worth of work done in four hours...

609
00:38:10,719 --> 00:38:10,989
Yeah.

610
00:38:10,989 --> 00:38:13,633
You're drained, but then you can just work four hours and that's it.

611
00:38:13,633 --> 00:38:14,373
Right.

612
00:38:14,775 --> 00:38:19,810
Yeah, but that's not how, like this, you want to be paid for the value that you bring, but
that's not reality, right?

613
00:38:19,810 --> 00:38:21,641
Like you're paid for the hours that you do.

614
00:38:23,263 --> 00:38:30,572
And yeah, I must say I'm only becoming more and more pessimistic on the future
employability of software engineers.

615
00:38:31,410 --> 00:38:38,404
So I think, like, with these new models, the performance that you get out of the box with
Claude Code, like...

616
00:38:38,404 --> 00:38:46,216
Either we're gonna, like... I am literally as efficient today as five traditional
engineers would have been two years back, three years back.

617
00:38:46,216 --> 00:38:48,819
Like even if you like it.

618
00:38:48,871 --> 00:38:58,627
You have five windows, five agents running in parallel for four hours. Even if it's just for
the four hours, but, like, you have stuff on auto-approve, like, the pace of writing and

619
00:38:58,627 --> 00:39:00,311
deleting is faster than a person as well.

620
00:39:00,311 --> 00:39:00,951
Yeah, yeah.

621
00:39:00,951 --> 00:39:02,611
And maybe to make it more concrete.

622
00:39:02,611 --> 00:39:07,642
The application that we have now, we're going to have test users going
live next week.

623
00:39:07,642 --> 00:39:10,502
We worked on this more or less a month.

624
00:39:10,982 --> 00:39:18,742
This would have easily taken a team of a few people six months, three years ago.

625
00:39:18,742 --> 00:39:18,842
Right?

626
00:39:18,842 --> 00:39:20,202
Like there's a huge difference, right?

627
00:39:20,202 --> 00:39:25,482
And I know that a lot of large companies are not there yet, but they're gonna
switch at some point, right?

628
00:39:25,484 --> 00:39:28,896
I think it's inevitable and the models will only become better.

629
00:39:29,457 --> 00:39:31,118
There's no going back from this.

630
00:39:31,118 --> 00:39:37,774
And I think the reality is that you will need less people because you're not going to
write lines of code, you're going to orchestrate.

631
00:39:37,774 --> 00:39:40,144
You're going to become the team lead of your agents.

632
00:39:40,525 --> 00:39:48,461
And the only way to not have an impact on employability is if the economy moves way, way
quicker.

633
00:39:48,461 --> 00:39:51,363
And companies start doing way, way more.

634
00:39:51,363 --> 00:39:52,634
But I don't.

635
00:39:52,685 --> 00:39:58,228
think the economy is gonna heat up 200 times in the coming two years, right?

636
00:39:58,228 --> 00:40:00,343
And the only alternative is that you need less people.

637
00:40:00,343 --> 00:40:02,618
Like, I think it's...

638
00:40:02,618 --> 00:40:03,408
um

639
00:40:03,408 --> 00:40:10,608
I'm wondering also if there are a lot of nice-to-haves that were always in sight but people never
got to.

640
00:40:10,608 --> 00:40:12,908
So people are just going to start doing more stuff.

641
00:40:13,048 --> 00:40:15,275
Like even more than nice-to-haves, you know?

642
00:40:16,108 --> 00:40:23,883
But I think that only holds when the economy moves quicker, because if you as a
company are gonna create more output, I mean, you need to be able to offload it to the

643
00:40:23,883 --> 00:40:24,784
market.

644
00:40:25,585 --> 00:40:25,855
Right?

645
00:40:25,855 --> 00:40:31,990
And that's only possible if people have a higher purchasing power, like everything needs
to follow that.

646
00:40:32,311 --> 00:40:41,336
Because like, what you're saying, like "maybe this is also
valuable, but I never got the time to do it," that doesn't apply at this scale.

647
00:40:42,277 --> 00:40:44,693
At this scale that actually has an impact on the economy.

648
00:40:44,693 --> 00:40:46,312
Like, can the economy follow this?

649
00:40:46,312 --> 00:40:48,493
If not, like, who will lose their job?

650
00:40:48,493 --> 00:40:51,387
Yeah, I'm just wondering if there are things that people are not doing yet.

651
00:40:51,387 --> 00:40:56,033
I mean, yeah, but that's what you're saying. I think... yeah, I'm not sure.

652
00:40:56,033 --> 00:41:02,261
I don't think... I mean, let me organize my thoughts first.

653
00:41:02,261 --> 00:41:06,485
I think software engineering is still gonna be there, like you said, like software
engineering is changing, right?

654
00:41:06,485 --> 00:41:11,601
So what you're saying now is that people are more efficient, but they're still gonna be
software engineers, right?

655
00:41:11,601 --> 00:41:16,254
It's not like we're saying there's not gonna be anyone because managers are just gonna
talk to their agents and they'll have applications.

656
00:41:16,254 --> 00:41:17,735
You still need someone.

657
00:41:18,056 --> 00:41:19,016
Yeah.

658
00:41:19,317 --> 00:41:25,303
I also think like everyone's gonna be more efficient, but maybe not everyone's gonna be as
efficient as you.

659
00:41:25,984 --> 00:41:27,625
Because I do think you're...

660
00:41:28,043 --> 00:41:31,838
You're very knowledgeable on, I mean, just how a system works and all these things.

661
00:41:31,838 --> 00:41:39,397
And also, I mean, I think even in like, I don't know, 10 years, not everyone's going
to be 5x of what we are today, I would say.

662
00:41:39,558 --> 00:41:40,082
You think?

663
00:41:40,082 --> 00:41:41,604
I think everybody will be 5x.

664
00:41:41,604 --> 00:41:43,845
Five years from now, everybody will be 5x.

665
00:41:43,845 --> 00:41:53,646
And you will maybe have, like, very good engineers that we would even call, in traditional code,
the 10x or 100x engineers, and you will have them in the AI coding era as well,

666
00:41:53,646 --> 00:41:54,006
right?

667
00:41:54,006 --> 00:42:00,400
But I think if you compare like your traditional developer, your...

668
00:42:00,400 --> 00:42:06,610
1x traditional developer versus your AI developer five years from now,
they will be 5x.

669
00:42:07,160 --> 00:42:07,881
Hmm.

670
00:42:07,881 --> 00:42:08,921
Okay.

671
00:42:10,222 --> 00:42:10,623
Yeah.

672
00:42:10,623 --> 00:42:14,967
I mean, we can discuss, but I don't think we can conclude anything.

673
00:42:14,967 --> 00:42:19,129
We'll just put an event on our calendar in five years from now.

674
00:42:19,210 --> 00:42:20,711
And we'll just see.

675
00:42:21,071 --> 00:42:25,456
Because yeah, I do think it's like the knowledge of how systems work and all these
things.

676
00:42:25,456 --> 00:42:27,517
Not everyone's going to have as much.

677
00:42:28,378 --> 00:42:28,649
Right.

678
00:42:28,649 --> 00:42:32,141
I mean, even today, like you see a lot of people that like they...

679
00:42:32,725 --> 00:42:40,321
They know a lot about how one thing works, but if you go a bit broader, then I think for
you to really be 5x, 10x, I think you also need to have a broader view.

680
00:42:40,321 --> 00:42:41,963
And I think that would always take time.

681
00:42:41,963 --> 00:42:44,813
We'll see in five years, Murilo, we'll see in five years.

682
00:42:46,224 --> 00:42:50,304
Have you started on an alternate career already?

683
00:42:50,304 --> 00:42:52,664
Aside from podcasting?

684
00:42:52,784 --> 00:42:53,964
Okay.

685
00:42:54,229 --> 00:42:55,721
Maybe I'll become a chicken farmer.

686
00:42:55,721 --> 00:43:01,507
There is actually a guy on LinkedIn that was like, I forgot, CEO or something,
or principal engineer or something.

687
00:43:01,507 --> 00:43:08,611
And then the next item says he's like, senior, principal, whatever, CTO, and
then goose farmer.

688
00:43:08,611 --> 00:43:10,263
Have you seen that?

689
00:43:10,263 --> 00:43:13,308
It's like, it's going to be you.

690
00:43:13,308 --> 00:43:16,137
I'm very fond of cheese, so maybe some cows.

691
00:43:16,137 --> 00:43:16,592
What?

692
00:43:16,592 --> 00:43:21,330
Yeah, maybe. Yeah, you're Dutch, so I think it's in your genes.

693
00:43:22,162 --> 00:43:23,461
Yeah, yeah, like we...

694
00:43:23,461 --> 00:43:28,460
At any given moment, we're either ice skating or we're eating cheese. Those are the two
options.

695
00:43:28,861 --> 00:43:29,953
That's what I hear.

696
00:43:29,953 --> 00:43:33,346
By the way, this is the post, you see.

697
00:43:33,527 --> 00:43:37,432
Principal performance architect at Microsoft and then Goose Farmer.

698
00:43:38,033 --> 00:43:39,134
That's cool.

699
00:43:40,099 --> 00:43:45,061
We're going to see that soon: podcaster and chicken farmer.

700
00:43:45,061 --> 00:43:46,567
Yeah, that's nice.

701
00:43:46,755 --> 00:43:47,938
All right, cool.

702
00:43:47,938 --> 00:43:50,420
What is the next item we have?

703
00:43:50,420 --> 00:43:56,212
We have SWE-rebench, which is trying to solve a messy problem in agent
evaluation.

704
00:43:56,212 --> 00:44:01,874
Benchmarks go stale and models get, quote unquote, contaminated by training on the tasks.

705
00:44:01,874 --> 00:44:12,361
The leaderboard format makes it feel like a live sport, but the real question is whether
continuously refreshed tasks can keep results honest as models ship faster.

706
00:44:12,361 --> 00:44:21,696
You shared this one, but I thought it was actually quite interesting, because we see this
SWE-bench benchmark on every model release, right?

707
00:44:21,696 --> 00:44:23,898
And with every model, they're doing better and better.

708
00:44:24,079 --> 00:44:25,252
But the question is...

709
00:44:25,252 --> 00:44:28,007
Are the models actually using these benchmarks to train?

710
00:44:28,007 --> 00:44:28,407
Right?

711
00:44:28,407 --> 00:44:30,730
And maybe they are, maybe they're not.

712
00:44:30,741 --> 00:44:40,561
There's leakage over time of these benchmarks into the training data of the models, and like, how
realistic are these benchmark values?

713
00:44:41,841 --> 00:44:44,261
And yeah, exactly.

714
00:44:45,107 --> 00:44:46,899
Do you know more about it?

715
00:44:47,841 --> 00:44:48,862
Not in detail.

716
00:44:48,862 --> 00:44:55,666
The only thing that I do know is that the tasks that they give to the models as tests,
they are continuously evolving.

717
00:44:56,187 --> 00:45:05,852
Meaning that they're continuously mining for new tasks and that also means that these
tasks cannot have been seen yet by a model.

718
00:45:07,053 --> 00:45:12,768
That's a bit the approach to basically take away this risk of contamination of the
training data.

719
00:45:12,768 --> 00:45:19,424
Yeah, so this is why here even on the leaderboard, right, there's a sliding scale.

720
00:45:19,424 --> 00:45:24,989
So you only get the newest tasks, right, within your time window.

721
00:45:25,150 --> 00:45:29,813
And maybe you see here, Claude Code is the first one, which is, let's see, where is orange?

722
00:45:29,854 --> 00:45:31,935
Yeah, because Claude Code is not really a model, right?

723
00:45:31,935 --> 00:45:34,017
But they also put it here.

724
00:45:34,017 --> 00:45:40,477
Well, it's not really a model. It says there, like, it's an external system, and Claude Code
is basically the CLI, right?

725
00:45:40,477 --> 00:45:45,037
And it can do much more than just the model, because it can also execute scripts and
stuff like that, right?

726
00:45:45,037 --> 00:45:47,537
Like it's more powerful that way.

727
00:45:48,219 --> 00:45:50,391
Also, one thing that is interesting: Claude Code is the first one.

728
00:45:50,391 --> 00:45:53,582
The resolved rate is 52%.

729
00:45:53,582 --> 00:45:55,415
So actually only about half.

730
00:45:57,133 --> 00:45:58,263
I would expect it to be more.

731
00:45:58,263 --> 00:45:59,434
How far do you think we'll get?

732
00:45:59,434 --> 00:46:01,315
Maybe linking to our previous discussion.

733
00:46:01,315 --> 00:46:04,881
Do you think this will ever get to, like, 80%?

734
00:46:04,881 --> 00:46:07,539
Like any problem you throw at it, you'll fix?

735
00:46:07,539 --> 00:46:10,979
Well, I think 80 % should be doable.

736
00:46:11,235 --> 00:46:12,615
80 % should be doable.

737
00:46:12,615 --> 00:46:13,895
Okay, cool.

738
00:46:14,015 --> 00:46:14,844
And then a lot...

739
00:46:14,844 --> 00:46:17,997
of people are always doubting, like, yeah, but it's not good enough for what we're doing.

740
00:46:17,997 --> 00:46:25,510
And the answer to that is always, let's just wait a bit until the next version comes along
and it's suddenly solved.

741
00:46:25,533 --> 00:46:27,324
Yeah, yeah.

742
00:46:27,324 --> 00:46:29,283
That's also why...

743
00:46:29,283 --> 00:46:38,948
I don't know if we covered it, but there was one article where they were saying how this
guy was just hype, and he almost owned up that it was just hype, that the things didn't work,

744
00:46:38,948 --> 00:46:42,789
but he was kind of betting that the models would get better.

745
00:46:43,169 --> 00:46:48,221
Like, I don't need to do the work, I just need to wait and the models will get better and then
it will work.

746
00:46:48,560 --> 00:46:55,603
The thing about the application that we're building now, it's a bit the same, like we're
relying very heavily on an AI agent to do stuff in the application.

747
00:46:55,603 --> 00:46:57,563
And it's not perfect, right?

748
00:46:57,563 --> 00:47:03,708
Like, and the question is a bit: are you gonna build a lot of scaffolding around it
to catch these errors?

749
00:47:03,708 --> 00:47:08,130
Or are we just gonna wait three, four months for Opus 4.7, right?

750
00:47:08,530 --> 00:47:10,015
Which will probably solve it.

751
00:47:10,015 --> 00:47:11,012
Yeah, true.

752
00:47:11,012 --> 00:47:15,015
That's what we've seen in the last two years continuously, right?

753
00:47:15,015 --> 00:47:22,324
Yeah, it's, yeah, easier to wait instead of trying to change the whole
application and trying to catch all these things.

754
00:47:22,443 --> 00:47:26,627
Yeah. Is the new fast Codex model in here?

755
00:47:27,248 --> 00:47:29,510
I think it's called Codex Spark.

756
00:47:30,151 --> 00:47:31,692
It'll be interesting to see.

757
00:47:32,012 --> 00:47:38,320
Codex Spark got released, 5.3 Codex Spark, a model by OpenAI, which is super fast.

758
00:47:38,440 --> 00:47:45,378
Like what I was saying, like you need to wait a lot for Claude Code, which is true also in
Codex, but this is like an almost instantaneous reaction.

759
00:47:45,378 --> 00:47:47,910
Like you ask it to do something, it immediately does it.

760
00:47:48,056 --> 00:47:52,724
It's quite impressive, but I was wondering how does this perform on the benchmarks?

761
00:47:52,724 --> 00:48:00,246
Because to me, from a user experience point of view, AI coding is still a bit of a...

762
00:48:00,246 --> 00:48:03,292
It doesn't feel great because you need to wait so long.

763
00:48:03,292 --> 00:48:05,694
Yeah, maybe this is what you're saying, right?

764
00:48:05,694 --> 00:48:09,487
Like this whole Codex Spark is a way more performant, right?

765
00:48:09,487 --> 00:48:11,799
Version of 5.3 Codex, right?

766
00:48:11,799 --> 00:48:12,522
So they have a demo next,

767
00:48:12,522 --> 00:48:20,442
on the screen now, and on the right side of the split screen, like, they use Spark, and it's
like, okay, here the application is already done, yes, and the other one is still preparing

768
00:48:20,442 --> 00:48:22,382
what it's gonna do.

769
00:48:23,362 --> 00:48:24,922
That's impressive, right?

770
00:48:25,522 --> 00:48:26,342
That's impressive, right?

771
00:48:26,342 --> 00:48:32,202
The demo here, of course, it's probably picked specifically to work in the demo,
but...

772
00:48:32,202 --> 00:48:34,162
And I do think it's a trend, right?

773
00:48:34,162 --> 00:48:42,462
We'll see more of these models trying to be faster. Because, like, maybe also
linking back to a previous discussion we had, do you think this quote-unquote solves a bit

774
00:48:42,462 --> 00:48:45,910
the burnout problem because you don't switch context as much?

775
00:48:45,910 --> 00:48:46,975
Maybe it's a point.

776
00:48:46,975 --> 00:48:49,052
Definitely related to it.

777
00:48:49,052 --> 00:48:49,424
Right?

778
00:48:49,424 --> 00:48:50,035
Very cool.

779
00:48:50,035 --> 00:48:53,254
I also heard a lot of very nice things about Codex these days, actually.

780
00:48:53,254 --> 00:48:55,705
Like people even saying that they prefer Codex over...

781
00:48:55,705 --> 00:48:56,366
definitely, yeah.

782
00:48:56,366 --> 00:48:58,739
I think it comes a bit down to preferences.

783
00:48:58,739 --> 00:49:02,154
I think it can be on par with Claude Code.

784
00:49:02,154 --> 00:49:03,735
Maybe it's also interesting how they do this.

785
00:49:03,735 --> 00:49:12,103
So they have a collaboration with Cerebras, which apparently has very high-performance
hardware to do inference on.

786
00:49:12,464 --> 00:49:13,556
So what...

787
00:49:13,556 --> 00:49:15,488
So this is the route they're taking.

788
00:49:15,488 --> 00:49:17,329
And Anthropic also has a fast mode.

789
00:49:17,329 --> 00:49:26,327
And what they do, I'm not exactly sure on the implementation, but they run on the
same hardware, but they use smaller batch sizes to speed stuff up.

790
00:49:26,327 --> 00:49:27,879
So it's a bit of a different approach.

791
00:49:27,879 --> 00:49:33,243
OpenAI really uses a different type of inference hardware to serve these fast
models.

792
00:49:33,790 --> 00:49:40,638
And when you say batch, are you batching other people's requests together, or how does it
work?

793
00:49:40,795 --> 00:49:49,783
I think it batches your own requests to Claude, to give faster feedback before waiting
for everything to finish.

794
00:49:49,991 --> 00:49:52,346
Interesting. So yeah, okay, we'll see.

795
00:49:52,346 --> 00:49:55,055
Not sure how that exactly gets implemented.

796
00:49:55,055 --> 00:50:00,278
Yeah, and maybe worth talking about, it's been a while since we chatted,
right?

797
00:50:00,278 --> 00:50:05,032
This is 5.3 Codex Spark, but 5.3 Codex was also released, right?

798
00:50:05,032 --> 00:50:12,707
So this is technically new since last time, and it was released like 30 minutes after Opus
4.6 was announced, right?

799
00:50:12,707 --> 00:50:16,429
So this is like a cat-and-mouse game.

800
00:50:16,559 --> 00:50:19,020
Opus 4.6 was the best on a lot of things.

801
00:50:19,020 --> 00:50:22,759
And then Codex was slightly better 30 minutes later.

802
00:50:22,759 --> 00:50:23,472
you were there.

803
00:50:23,472 --> 00:50:26,328
Anthropic was state of the art for 30 minutes.

804
00:50:27,609 --> 00:50:39,047
I do still think, and actually the SWE-rebench benchmark shows it, that Claude Code,
the CLI, like, with the way it's set up, with tool usage, with plugins, with everything,

805
00:50:39,047 --> 00:50:40,929
is still the best way to go these days.

806
00:50:40,929 --> 00:50:44,693
If you need to choose, you can't go wrong with Claude Code.

807
00:50:44,743 --> 00:50:47,677
Yeah, yeah, no, I also think so. The only thing that I'm...

808
00:50:47,677 --> 00:50:54,981
Because also, I was talking to some colleagues, and they were using Claude Code and
OpenCode, I think.

809
00:50:55,362 --> 00:50:57,804
So like, the CLI interface is a bit interchangeable.

810
00:50:57,804 --> 00:51:01,819
But yeah, according to them, they didn't really prefer one over the other.

811
00:51:01,819 --> 00:51:05,903
They didn't feel like you need to, quote-unquote, specialize in one CLI tool over the
other.

812
00:51:05,903 --> 00:51:15,751
Yeah, it was more because you cannot use Codex, or you cannot use OpenAI models or Gemini
models, with Claude Code. Or you can, but it's a bit funky, right?

813
00:51:16,713 --> 00:51:17,114
Yeah.

814
00:51:17,114 --> 00:51:17,475
Cool.

815
00:51:17,475 --> 00:51:19,064
Maybe one last thing I just want to share.

816
00:51:19,064 --> 00:51:20,396
I thought it was pretty funny.

817
00:51:20,396 --> 00:51:28,039
There's a rentahuman.ai, because I think what they were saying is, like, everything is
agents.

818
00:51:28,039 --> 00:51:32,842
So now, if you want to hire a human to do things, you hire a human for your agent,
right?

819
00:51:32,842 --> 00:51:34,043
You don't hire.

820
00:51:34,043 --> 00:51:35,153
You have agents for everything.

821
00:51:35,153 --> 00:51:37,144
Now you hire a human just for these things.

822
00:51:37,144 --> 00:51:38,426
I thought it was pretty funny.

823
00:51:38,426 --> 00:51:43,648
I'm not sure if it's actually... I mean, it looks like it's a serious business.

824
00:51:44,012 --> 00:51:50,232
Like, people connect, people are putting their rates and stuff, but I'm not sure how
serious this is.

825
00:51:50,232 --> 00:51:50,748
And,

826
00:51:50,748 --> 00:51:54,688
I think it's also being used by bots.

827
00:51:56,248 --> 00:52:02,448
Right, I think it's an idea that originated on Moltbook, if I'm not mistaken.

828
00:52:03,088 --> 00:52:14,168
But I'm not 100% sure on what I'm gonna say now, but this is an idea that agents
discussed, well, bots discussed on Moltbook, and that they created this platform and that

829
00:52:14,168 --> 00:52:16,268
other bots can now rent out humans.

830
00:52:16,395 --> 00:52:19,955
Yeah, there's even an MCP integration here.

831
00:52:22,240 --> 00:52:23,560
So that was pretty funny.

832
00:52:23,560 --> 00:52:25,651
So I just wanted to share that.

833
00:52:25,651 --> 00:52:33,217
It looks funny, but it could very well be that we're going to see
these things in the future, right?

834
00:52:33,658 --> 00:52:40,505
And it's not necessarily... like, I think a lot of people, you and me included,
will have their own bot assistants.

835
00:52:40,505 --> 00:52:47,071
And when you and me want to find a date for lunch, we were actually exchanging messages on
WhatsApp earlier today.

836
00:52:47,071 --> 00:52:47,583
Like,

837
00:52:47,583 --> 00:52:55,203
I'm just gonna ask my agent to align with your agent and our agents are just gonna put it
in our agendas.

838
00:52:56,003 --> 00:52:58,623
Right, I think that's not unrealistic.

839
00:52:58,623 --> 00:53:01,034
I think we're not that far off from that.

840
00:53:01,034 --> 00:53:01,805
That is true.

841
00:53:01,805 --> 00:53:02,887
And then we're going to get there.

842
00:53:02,887 --> 00:53:05,811
There's going to be someone who's like, who are you?

843
00:53:06,212 --> 00:53:10,118
your agent hired me to be your agent.

844
00:53:11,621 --> 00:53:12,964
Yeah, it's going to be.

845
00:53:12,964 --> 00:53:13,797
like having a discussion.

846
00:53:13,797 --> 00:53:15,078
That would be nice for them, right?

847
00:53:15,078 --> 00:53:16,888
Like uh a person.

848
00:53:16,888 --> 00:53:18,628
Yeah, it's been a hard week for Murilo.

849
00:53:18,628 --> 00:53:19,528
yeah, it's fine.

850
00:53:19,528 --> 00:53:21,628
he's going to, yeah.

851
00:53:24,028 --> 00:53:24,968
Yeah.

852
00:53:26,010 --> 00:53:27,530
It's a podcast anniversary.

853
00:53:27,530 --> 00:53:30,230
So we wanted to give you, we organized a treat for you or whatever.

854
00:53:30,230 --> 00:53:30,510
Yeah.

855
00:53:30,510 --> 00:53:32,330
So let's see.

856
00:53:32,710 --> 00:53:39,510
I was also thinking of, yeah, doing something with Clawdbot for the podcast as well.

857
00:53:39,510 --> 00:53:40,390
Like something like this.

858
00:53:40,390 --> 00:53:43,346
I mean, I think we automated a lot of stuff already, but.

859
00:53:44,282 --> 00:53:45,875
We do the show notes and stuff, so maybe I'll think

860
00:53:45,875 --> 00:53:46,326
about it.

861
00:53:46,326 --> 00:53:47,566
That's a good point, yeah.

862
00:53:47,566 --> 00:53:51,618
Because we didn't release a lot of episodes since Clawdbot was released.

863
00:53:51,618 --> 00:53:54,089
And like, you see how much I use it now.

864
00:53:54,350 --> 00:53:59,492
Personally, like, we need to think a bit about what we can automate with OpenClaw.

865
00:53:59,492 --> 00:54:04,185
The problem with OpenClaw is that it's still very costly.

866
00:54:04,185 --> 00:54:08,117
It consumes a shit ton of tokens.

867
00:54:08,123 --> 00:54:10,196
What if you use, you know, MiniMax?

868
00:54:10,196 --> 00:54:13,496
The thing with MiniMax models is that they're very cheap, but

869
00:54:13,496 --> 00:54:18,875
OpenClaw doesn't work well if you don't use at least Opus 4.5.

870
00:54:18,875 --> 00:54:21,350
Unfortunately, it's also recommended in the docs.

871
00:54:21,350 --> 00:54:24,675
I'm not sure if it's that explicit in the new docs, but...

872
00:54:24,675 --> 00:54:25,764
I saw somewhere.

873
00:54:25,764 --> 00:54:26,845
It's the consensus.

874
00:54:26,845 --> 00:54:29,167
I actually switched now because Sonnet got released.

875
00:54:29,167 --> 00:54:34,221
Sonnet 4.6 got released, which is on par in terms of tool usage performance.

876
00:54:34,221 --> 00:54:35,782
And it looks quite okay.

877
00:54:35,782 --> 00:54:38,406
So I think that already makes it a lot more affordable.

878
00:54:38,406 --> 00:54:43,029
The problem is, as we were discussing earlier, you can't use it with your Anthropic
subscription.

879
00:54:43,029 --> 00:54:46,747
So you need to pay via the API, and that quickly adds up.

880
00:54:46,747 --> 00:54:48,680
So yeah, we need to think of something.

881
00:54:48,680 --> 00:54:49,919
We need to think of something.

882
00:54:49,919 --> 00:54:52,223
Maybe our fans can sponsor it, right?

883
00:54:52,223 --> 00:54:53,874
Yeah.

884
00:54:53,874 --> 00:54:54,796
buy me a coffee.

885
00:54:54,796 --> 00:54:58,298
Maybe one last question, and this is my personal curiosity.

886
00:54:58,298 --> 00:55:05,923
You mentioned you code with like five different windows with five different coding
sessions in parallel.

887
00:55:05,923 --> 00:55:08,384
Do you use Git worktrees or something, or how do you do it?

888
00:55:08,384 --> 00:55:08,728
I don't.

889
00:55:08,728 --> 00:55:10,007
A lot of people do,

890
00:55:10,060 --> 00:55:10,891
Yeah, how do you do it?

891
00:55:10,891 --> 00:55:17,870
You just let it run? Like, because I think it also calls it out, like it says, this file has
been changed since I read it, to kind of...

892
00:55:17,870 --> 00:55:19,153
Yeah, it does that quite well.

893
00:55:19,153 --> 00:55:21,411
um

894
00:55:21,411 --> 00:55:31,036
So a Git worktree basically becomes like this: you open another directory on another
branch or something, that's a bit the way to think about it.

895
00:55:31,036 --> 00:55:35,585
So you're working with two agents and they're each on their own Git branch.

896
00:55:35,866 --> 00:55:40,410
But that also means that like they're not aware of what each other is doing, right?
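
[Editor's note: for readers curious what the worktree setup discussed here looks like in practice, a minimal sketch; the repo and branch names are made up for illustration. Each agent gets its own checkout directory on its own branch, so parallel sessions never write to the same working files.]

```shell
# Minimal git-worktree sketch: one checkout directory per agent,
# each on its own branch (all names here are hypothetical).
set -e
git init -q demo-repo && cd demo-repo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# One worktree per agent; each coding-agent session would then
# run inside its own directory:
git worktree add -q ../agent-backend  -b feature/backend
git worktree add -q ../agent-frontend -b feature/frontend

git worktree list   # main checkout plus the two agent directories
```

[Merging those branches back together afterwards is exactly the hassle the conversation turns to next.]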

897
00:55:40,475 --> 00:55:42,335
I typically don't do that.

898
00:55:42,335 --> 00:55:44,615
I think it's a bit of a hassle to merge everything together.

899
00:55:44,615 --> 00:55:49,535
So when I parallelize stuff, I make sure that agents are working on different things.

900
00:55:49,535 --> 00:55:57,295
So one is working on the backend, another one is working on a specific frontend aspect, one
is working on the agentic stuff.

901
00:55:57,295 --> 00:56:02,404
Like, they're working on different areas of concern basically, so that they don't overlap.

902
00:56:02,404 --> 00:56:03,355
To try to minimize.

903
00:56:03,355 --> 00:56:09,432
I mean, there's always a chance, quote-unquote, that they are editing the same file, but
then minimally.

904
00:56:10,174 --> 00:56:10,647
Yeah.

905
00:56:10,647 --> 00:56:12,691
that the concerns are spread enough.

906
00:56:12,691 --> 00:56:15,536
But it also means that the context switches are big.

907
00:56:16,058 --> 00:56:18,581
Because you're constantly working on really a different domain.

908
00:56:18,581 --> 00:56:19,841
Yeah, yeah.

909
00:56:19,841 --> 00:56:25,761
Yeah, I thought the Git worktree approach was interesting, because then it's really like
developers, you know, they create a new branch and they're working on things, and it's

910
00:56:25,761 --> 00:56:25,861
true.

911
00:56:25,861 --> 00:56:28,241
Sometimes they don't know what the other people are working on.

912
00:56:28,241 --> 00:56:29,561
So it's really just going off of a ticket.

913
00:56:29,561 --> 00:56:31,921
And then sometimes when you merge, you do have conflicts.

914
00:56:32,321 --> 00:56:33,281
interesting.

915
00:56:33,281 --> 00:56:34,212
OK, cool.

916
00:56:34,212 --> 00:56:36,212
That was it for today.

917
00:56:36,212 --> 00:56:40,143
So yeah, we covered like four topics, and we covered a few extras.

918
00:56:40,143 --> 00:56:44,999
I think this also shows a bit that we have a slightly different format, largely similar, right?

919
00:56:44,999 --> 00:56:46,241
But I think we cover a few less.

920
00:56:46,241 --> 00:56:49,213
But people can expect that's also something we discussed, right?

921
00:56:49,213 --> 00:56:51,853
In our, let's say, mini sabbatical, right?

922
00:56:51,853 --> 00:56:53,227
A few changes.

923
00:56:53,227 --> 00:56:54,890
So there are a few things that we still need to plan.

924
00:56:54,890 --> 00:56:57,413
But there'll be some small changes, I think, right?

925
00:56:57,413 --> 00:56:58,304
Things we're trying out.

926
00:56:58,304 --> 00:57:01,867
You want to say anything more about that, Bart, or you want to leave it at that?

927
00:57:02,220 --> 00:57:12,705
At our last non-agent-planned dinner a few weeks back, we were
discussing, like, we're gonna maybe do less frequent news updates.

928
00:57:12,705 --> 00:57:17,706
Well, that's the working draft at least, maybe let's call it like that.

929
00:57:17,706 --> 00:57:22,306
To have less frequent news updates, meaning once a month.

930
00:57:22,306 --> 00:57:28,026
And for the other weeks, we're gonna have interviews with tech startups and/or
investors.

931
00:57:28,026 --> 00:57:29,307
Yes, exactly.

932
00:57:29,307 --> 00:57:32,887
So it's just making more space for having more interviews.

933
00:57:32,887 --> 00:57:36,247
I think we had a few last year that I thought were really good.

934
00:57:36,247 --> 00:57:38,647
So just doing a few things more like that.

935
00:57:38,647 --> 00:57:40,322
So keep an eye out.

936
00:57:40,322 --> 00:57:47,433
We might want to also rebrand a little bit, like for the new setup, but we don't know yet.

937
00:57:47,433 --> 00:57:52,093
It should become a bit more concrete over the coming, let's say two months, something like
that.

938
00:57:52,093 --> 00:57:53,292
Is that a realistic timeline?

939
00:57:53,292 --> 00:57:54,713
I think two months is realistic.

940
00:57:54,713 --> 00:58:00,342
But then again, if you're following us on Apple Podcasts, whatever, you shouldn't have to
look for us again.

941
00:58:00,426 --> 00:58:02,065
It should just be the same.

942
00:58:03,569 --> 00:58:03,899
Yeah.

943
00:58:03,899 --> 00:58:07,645
If we rebrand, it's just a switch of the name of the existing podcast.

944
00:58:07,645 --> 00:58:08,806
Exactly.

945
00:58:09,047 --> 00:58:10,229
All right, cool.

946
00:58:10,229 --> 00:58:11,651
Looking forward to that.

947
00:58:11,651 --> 00:58:12,353
Thanks Bart.

948
00:58:12,353 --> 00:58:14,346
Tell me, what's the name of your agent, actually?

949
00:58:14,346 --> 00:58:15,167
What was the name?

950
00:58:15,167 --> 00:58:15,871
You had a name for it.

951
00:58:15,871 --> 00:58:16,649
It wasn't Jarvis.

952
00:58:16,649 --> 00:58:17,559
It was like...

953
00:58:18,064 --> 00:58:21,159
I have one called Barnaby and the other one is called Binky.

954
00:58:21,592 --> 00:58:25,165
But you had another one before, known as Jeeves.

955
00:58:25,165 --> 00:58:27,147
No, it wasn't Jeeves even, was it?

956
00:58:27,768 --> 00:58:28,839
It was like a longer name.

957
00:58:28,839 --> 00:58:30,721
Like, I don't know, but let's say Jeeves.

958
00:58:30,721 --> 00:58:32,092
Let's say Jeeves.

959
00:58:32,273 --> 00:58:36,166
I'll ping Jeeves and then we can set that dinner.

960
00:58:36,166 --> 00:58:39,561
And then we should try.

961
00:58:39,561 --> 00:58:40,842
Yeah, we should definitely try.

962
00:58:40,842 --> 00:58:45,466
I'm also very curious about the different OpenClaw flavors as well.

963
00:58:45,665 --> 00:58:54,088
Maybe... I was thinking of playing a bit with that, but I was like, if it doesn't work,
I'm a bit, you know... part of me just wants to get it to work.

964
00:58:54,088 --> 00:58:54,418
Right.

965
00:58:54,418 --> 00:58:57,693
So that's a, yeah.

966
00:58:58,716 --> 00:58:59,196
Okay.

967
00:58:59,196 --> 00:58:59,777
Cool.

968
00:58:59,777 --> 00:59:01,139
Maybe I'll give it a try.

969
00:59:01,139 --> 00:59:03,831
Give it access, like, just to your WhatsApp or something.

970
00:59:05,972 --> 00:59:06,874
Thank you, Murilo.

971
00:59:06,874 --> 00:59:09,038
I'll see you all next time.

972
00:59:10,123 --> 00:59:10,576
Ciao!

973
00:59:10,576 --> 00:59:11,497
Ciao.