1
00:00:03,560 --> 00:00:08,120
Ejaaz:
A bunch of AI researchers from China just released a brand new AI model called

2
00:00:08,120 --> 00:00:12,840
Ejaaz:
Kimi K2, which is not only as good as any other top model like Claude,

3
00:00:13,120 --> 00:00:17,080
Ejaaz:
but it is also 100% open source, which means it's free to take,

4
00:00:17,380 --> 00:00:20,300
Ejaaz:
customize and create into your own brand new AI model.

5
00:00:20,560 --> 00:00:24,480
Ejaaz:
This thing is amazing at coding, it beats any other model at creative writing,

6
00:00:24,620 --> 00:00:27,260
Ejaaz:
and it also has a pretty insane voice mode.

7
00:00:27,460 --> 00:00:32,700
Ejaaz:
Oh, and I should probably mention that it is one trillion parameters in size,

8
00:00:32,760 --> 00:00:35,940
Ejaaz:
which makes it one of the largest models ever created.

9
00:00:36,480 --> 00:00:41,940
Ejaaz:
Josh, we were winding down on a Friday night and this news broke that this team

10
00:00:41,940 --> 00:00:42,920
Ejaaz:
had released this model.

11
00:00:43,680 --> 00:00:48,700
Ejaaz:
Absolutely crazy bomb, especially with like OpenAI rumored to release their

12
00:00:48,700 --> 00:00:50,140
Ejaaz:
open source model this week.

13
00:00:50,540 --> 00:00:53,700
Ejaaz:
You've been jumping into this. What's your take?

14
00:00:54,460 --> 00:00:59,300
Josh:
Yeah. So last week we crowned Grok 4 as the new leading private model, closed source model.

15
00:00:59,300 --> 00:01:02,260
Josh:
This week we got to give the crown to Kimi K2. We got another crown

16
00:01:02,260 --> 00:01:05,360
Josh:
going for the open source team. They are winning. I mean, this is

17
00:01:05,360 --> 00:01:08,520
Josh:
better than DeepSeek and DeepSeek R2. This is basically DeepSeek R3,

18
00:01:08,520 --> 00:01:12,500
Josh:
I would imagine, um, and if you remember back a couple of months, DeepSeek really

19
00:01:12,500 --> 00:01:16,640
Josh:
flipped the world on its head because of how efficient it was and the algorithmic

20
00:01:16,640 --> 00:01:20,080
Josh:
upgrades it made. And I think what we see with Kimi K2 is a lot of the same thing.

21
00:01:20,080 --> 00:01:24,600
Josh:
It's these novel breakthroughs that come as a downstream effect of their

22
00:01:24,600 --> 00:01:25,780
Josh:
needing to be resourceful.

23
00:01:26,400 --> 00:01:30,100
Josh:
China, they don't have the mega GPU clusters we have, they don't have all the

24
00:01:30,100 --> 00:01:34,340
Josh:
cutting edge hardware, but they do have the software prowess to find these efficiencies.

25
00:01:34,520 --> 00:01:37,100
Josh:
I think that's what makes this model so special. And that's what we're going

26
00:01:37,100 --> 00:01:40,820
Josh:
to get into here is specifically what they did to make this model so special.

27
00:01:41,020 --> 00:01:45,840
Ejaaz:
Yeah, I mean, look at these stats here, Josh, like 1 trillion parameters in total.

28
00:01:46,160 --> 00:01:49,980
Ejaaz:
It's a mixture-of-experts model with 32 billion active parameters. So what this means is,

29
00:01:50,360 --> 00:01:54,420
Ejaaz:
although it's really large in size, typically these AI models can become pretty

30
00:01:54,420 --> 00:01:58,220
Ejaaz:
inefficient if it's large in size, it uses this technique called mixture of

31
00:01:58,220 --> 00:02:01,240
Ejaaz:
experts, which means that whenever someone queries a model,

32
00:02:01,440 --> 00:02:07,080
Ejaaz:
it only uses or activates a number of parameters that are relevant for the query itself.

33
00:02:07,340 --> 00:02:12,480
Ejaaz:
So it's smarter, it's much more efficient, and it doesn't use or consume

34
00:02:12,480 --> 00:02:16,300
Ejaaz:
as much energy as you would if you wanted to run it locally at home or whatever

35
00:02:16,300 --> 00:02:18,140
Ejaaz:
that might be. It's also super cheap.

36
00:02:18,380 --> 00:02:21,780
Ejaaz:
I think I saw somewhere that this was 20% the cost of Claude,

37
00:02:21,780 --> 00:02:25,500
Ejaaz:
Josh, which, uh, we love. That's insane. Uh,

38
00:02:25,500 --> 00:02:28,200
Ejaaz:
for all the nerds that kind of want to run, you know,

39
00:02:28,200 --> 00:02:31,040
Ejaaz:
really long tasks or, you know, just set and

40
00:02:31,040 --> 00:02:34,540
Ejaaz:
forget the AI to run on, like, your coding log or whatever that might mean,

41
00:02:34,540 --> 00:02:39,440
Ejaaz:
you can now do it at a much more affordable rate, at one-fifth the cost, uh, of

42
00:02:39,440 --> 00:02:42,500
Ejaaz:
some of the top models that are out there, and it is as good as those models.

43
00:02:42,500 --> 00:02:46,220
Ejaaz:
So just insane kinds of things. Josh, I know there's a bunch of things that you

44
00:02:46,220 --> 00:02:50,000
Ejaaz:
wanted to point out here on benchmarks. Um, what do you want to get into?

45
00:02:50,100 --> 00:02:54,440
Josh:
Yeah, it's really amazing. So they took 15 and a half trillion tokens and they

46
00:02:54,440 --> 00:02:56,480
Josh:
condensed those down into a one trillion parameter model.

47
00:02:56,640 --> 00:02:59,240
Josh:
And then what's amazing is when you use this model, like you said,

48
00:02:59,460 --> 00:03:01,500
Josh:
it uses a thing called mixture of experts.

49
00:03:01,780 --> 00:03:05,120
Josh:
So it has, I believe, 384 experts.

50
00:03:05,340 --> 00:03:08,900
Josh:
And each expert is good at a specific thing. So let's say in the case you want

51
00:03:08,900 --> 00:03:13,440
Josh:
to do a math problem, it will take a 32 billion parameter subset of the one

52
00:03:13,440 --> 00:03:16,980
Josh:
trillion total parameters, and it will choose eight of these different

53
00:03:17,580 --> 00:03:21,100
Josh:
experts in a specific thing. So in the case of math, it'll find an expert that

54
00:03:21,100 --> 00:03:22,440
Josh:
has the calculator tool.

55
00:03:22,540 --> 00:03:27,720
Josh:
It'll find an expert that has a fact, like a fact checking tool or a proof tool

56
00:03:27,720 --> 00:03:28,960
Josh:
to make sure that the math is accurate.

57
00:03:29,360 --> 00:03:32,420
Josh:
It'll have just a series of tools to help itself. And that's kind of how it

58
00:03:32,420 --> 00:03:35,720
Josh:
works so efficiently is instead of using a trillion parameters at once,

59
00:03:35,840 --> 00:03:41,940
Josh:
it uses just 32 billion and it uses the eight best specialists out of the 384

60
00:03:41,940 --> 00:03:43,980
Josh:
that it has available to it. It's really impressive.
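
Editor's sketch: the "eight out of 384" selection Josh describes can be pictured as top-k routing. The snippet below is a minimal illustration under stated assumptions, not Moonshot's actual router: the random gate scores stand in for a learned gating network, `route` is a hypothetical helper, and normalizing with a softmax over the selected scores is just one common choice. In real mixture-of-experts models the routing is learned and applied per token, so experts do not map cleanly to topics such as "math" or "fact checking".

```python
import math
import random

NUM_EXPERTS = 384  # expert count quoted in the episode
TOP_K = 8          # experts activated per token, as quoted in the episode


def route(gate_scores, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the k weights sum to 1.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


random.seed(0)
# Hypothetical gate scores for one token; a real model computes these
# from the token's hidden state with learned weights.
scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
chosen = route(scores)

# Share of parameters active per token: 32 billion out of 1 trillion.
active_fraction = 32e9 / 1e12
print(len(chosen), f"{active_fraction:.1%}")
```

The point of the sketch is the ratio: only a small slice of the total parameters does work for any given token, which is where the efficiency the hosts describe comes from.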

61
00:03:44,160 --> 00:03:46,440
Josh:
And what we see here is the benchmarks that we're showing on screen.

62
00:03:46,440 --> 00:03:48,380
Josh:
And the benchmarks are really good.

63
00:03:48,560 --> 00:03:52,420
Josh:
It's up there in line with just about any other top model, with the exception

64
00:03:52,420 --> 00:03:53,500
Josh:
that this is open source.

65
00:03:53,680 --> 00:03:57,300
Josh:
And there was another breakthrough that we had, which was the actual way that

66
00:03:57,300 --> 00:03:59,040
Josh:
they handled the training of this.

67
00:03:59,160 --> 00:04:02,340
Josh:
And yeah, this is the loss curve. So what you're looking at on screen for the

68
00:04:02,340 --> 00:04:05,800
Josh:
people who are listening, it's this really pretty smooth curve that kind of

69
00:04:05,800 --> 00:04:09,520
Josh:
starts at the top and it trends down in a very predictable and smooth way.

70
00:04:09,840 --> 00:04:13,740
Josh:
And most curves don't look like this. And if they do look like this,

71
00:04:13,800 --> 00:04:17,920
Josh:
it's because the company has spent tons and tons of money on error correction

72
00:04:17,920 --> 00:04:19,400
Josh:
to make sure this curve is so smooth.

73
00:04:19,500 --> 00:04:22,580
Josh:
So basically what you're seeing is the training run of the model.

74
00:04:22,640 --> 00:04:26,280
Josh:
And a lot of times what happens is you get these very sharp spikes and it starts

75
00:04:26,280 --> 00:04:29,020
Josh:
to deviate from the normal training run.

76
00:04:29,060 --> 00:04:33,800
Josh:
And it takes a lot of compute to kind of recalibrate and push that back into the right way.

77
00:04:34,260 --> 00:04:37,360
Josh:
What they've managed to do is really make it very smooth.
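
Editor's sketch: to make "sharp spikes" in a loss curve concrete, here is a small, hypothetical spike detector. It is not Moonshot's tooling and says nothing about how MuonClip works internally; the `find_spikes` helper, the window size, the threshold factor and the synthetic loss values are all assumptions chosen for illustration.

```python
def find_spikes(losses, window=5, factor=1.5):
    """Flag steps whose loss exceeds `factor` times the mean of the previous `window` steps."""
    spikes = []
    for step in range(window, len(losses)):
        baseline = sum(losses[step - window:step]) / window
        if losses[step] > factor * baseline:
            spikes.append(step)
    return spikes


# Synthetic curves: one steadily decreasing, one with a single injected spike.
smooth = [4.0 * 0.97 ** t for t in range(50)]
spiky = list(smooth)
spiky[30] *= 3.0

print(find_spikes(smooth))  # no step is flagged on the smooth curve
print(find_spikes(spiky))   # only the injected spike at step 30 is flagged
```

A curve like `smooth` is what the hosts are praising: nothing for a monitor like this to flag, so no compute is spent recovering from divergence.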

78
00:04:37,540 --> 00:04:40,300
Josh:
And they've done this by increasing these efficiencies. So if you can think

79
00:04:40,300 --> 00:04:43,060
Josh:
about it, there's this analogy I was thinking of right before we hit the record button.

80
00:04:43,280 --> 00:04:46,220
Josh:
And it's if you were teaching a chef how to cook, right?

81
00:04:46,440 --> 00:04:50,680
Josh:
So we have Chef Ejaaz here. I am teaching him how to cook. I am an expert chef.

82
00:04:50,860 --> 00:04:54,920
Josh:
And instead of telling him every ingredient and every step for every single

83
00:04:54,920 --> 00:04:59,300
Josh:
dish, what I tell him is like, hey, if you're making this amazing dinner recipe,

84
00:04:59,520 --> 00:05:03,940
Josh:
all that matters is this amount of salt applied at this time,

85
00:05:04,200 --> 00:05:08,160
Josh:
this amount of heat applied for this length of time, and the other stuff doesn't matter as much.

86
00:05:08,260 --> 00:05:11,160
Josh:
So just put in whatever you think is appropriate, but you'll get the same answer.

87
00:05:11,160 --> 00:05:16,180
Josh:
And that's what we see with this model is just an increased amount of efficiency by being

88
00:05:16,840 --> 00:05:20,080
Josh:
direct, by being intentional about the data that they used to train it on,

89
00:05:20,280 --> 00:05:24,020
Josh:
the data that they used to fetch in order to give you high quality queries.

90
00:05:24,200 --> 00:05:28,400
Josh:
And it's a really novel breakthrough. They call it the MuonClip optimizer,

91
00:05:28,560 --> 00:05:32,020
Josh:
which, I mean, it's a Chinese company, maybe it means something special there,

92
00:05:32,200 --> 00:05:33,920
Josh:
but it is a new type of optimizer.

93
00:05:34,040 --> 00:05:37,260
Josh:
And what you're seeing in this curve is that it's working really well and it's

94
00:05:37,260 --> 00:05:37,860
Josh:
working really efficiently.

95
00:05:37,980 --> 00:05:41,000
Josh:
And that's part of the benefit of having this open source is now we have this

96
00:05:41,000 --> 00:05:44,240
Josh:
novel breakthrough and we could take this and we could use this for even more

97
00:05:44,240 --> 00:05:48,420
Josh:
breakthroughs, even more open source models. And that's the part that's been really cool to see.

98
00:05:48,420 --> 00:05:52,120
Ejaaz:
I mean, this is just, um, time

99
00:05:52,120 --> 00:05:57,720
Ejaaz:
and again from China. Uh, so amazing from their research team. So, like, just

100
00:05:57,720 --> 00:06:02,000
Ejaaz:
to kind of, like, um, pick up your comment on DeepSeek: at the end of last year

101
00:06:02,000 --> 00:06:07,000
Ejaaz:
we were utterly convinced that the only way to create a breakthrough model was

102
00:06:07,000 --> 00:06:10,480
Ejaaz:
to spend billions of dollars on compute clusters.

103
00:06:10,840 --> 00:06:14,660
Ejaaz:
And so therefore it was a pay-to-play game. And then DeepSeek,

104
00:06:14,960 --> 00:06:19,420
Ejaaz:
a team out of China, released their model and completely open-sourced it as well.

105
00:06:19,700 --> 00:06:24,320
Ejaaz:
And it was as good as OpenAI's frontier model, which was the top model at the time.

106
00:06:24,600 --> 00:06:30,860
Ejaaz:
And the revelation there was, oh, you don't actually just need to chuck a bunch of compute at this.

107
00:06:31,100 --> 00:06:35,540
Ejaaz:
There are different techniques and different methods if you get creative about

108
00:06:35,540 --> 00:06:37,980
Ejaaz:
how you design your model and how you run the training cluster,

109
00:06:38,240 --> 00:06:41,980
Ejaaz:
the training run, which is basically what you need to do to make your model smart,

110
00:06:42,540 --> 00:06:47,100
Ejaaz:
you can run it in a different way that is more efficient, consumes less energy,

111
00:06:47,300 --> 00:06:51,880
Ejaaz:
and therefore less money, but is as smart, if not smarter,

112
00:06:52,100 --> 00:06:55,000
Ejaaz:
than the frontier models that American AI companies are making.

113
00:06:55,180 --> 00:06:57,220
Ejaaz:
And this is just a repeat of that, Josh.

114
00:06:57,520 --> 00:07:02,740
Ejaaz:
I mean, look at this curve. For those who are looking at this episode on video.

115
00:07:03,420 --> 00:07:06,220
Ejaaz:
It is just so clean. Yeah, it's beautiful.

116
00:07:06,220 --> 00:07:09,020
Ejaaz:
The craziest part about this is when DeepSeek

117
00:07:09,020 --> 00:07:12,300
Ejaaz:
was released, they pioneered something called, uh, reasoning

118
00:07:12,300 --> 00:07:15,160
Ejaaz:
or reinforcement learning, uh, which are two separate

119
00:07:15,160 --> 00:07:19,600
Ejaaz:
techniques that made the model super smart, um, with less energy and less compute

120
00:07:19,600 --> 00:07:24,580
Ejaaz:
spend. Um, with this model, they didn't even implement that technique at all. So

121
00:07:24,580 --> 00:07:29,580
Ejaaz:
theoretically, this model can get so much smarter than it already is, um,

122
00:07:29,580 --> 00:07:33,700
Ejaaz:
and they just kind of leveraged a new method to make it as smart as it already is right now.

123
00:07:34,060 --> 00:07:39,980
Ejaaz:
So just such a fascinating kind of like progress in research from China.

124
00:07:39,980 --> 00:07:42,460
Ejaaz:
And it just keeps on coming out. It's so impressive.

125
00:07:42,680 --> 00:07:46,200
Josh:
Yeah, this was the exciting part to me, is that we're seeing so many

126
00:07:46,200 --> 00:07:49,600
Josh:
algorithmic or exponential improvements in so many different categories.

127
00:07:49,780 --> 00:07:53,680
Josh:
So this was considered a breakthrough by all means. And this wasn't even the

128
00:07:53,680 --> 00:07:55,260
Josh:
same type of breakthrough that DeepSeek had.

129
00:07:55,380 --> 00:08:00,300
Josh:
So we get this now compounding effect where we have this new training breakthrough

130
00:08:00,300 --> 00:08:03,940
Josh:
and then we have DeepSeek who has the reinforcement learning and that hasn't

131
00:08:03,940 --> 00:08:05,440
Josh:
even yet been applied to this new model.

132
00:08:05,620 --> 00:08:09,820
Josh:
So we get the exponential growth on one end, the exponential growth on the reasoning end,

133
00:08:10,060 --> 00:08:13,120
Josh:
those come together and then you get the exponential growth on the hardware

134
00:08:13,120 --> 00:08:17,040
Josh:
stack where the GPUs are getting much faster and there's all of these different

135
00:08:17,040 --> 00:08:21,940
Josh:
subsets of AI that are compounding on each other and growing and accelerating

136
00:08:21,940 --> 00:08:25,380
Josh:
quicker and quicker and what you get is this unbelievable rate of progress and

137
00:08:25,380 --> 00:08:26,160
Josh:
that's what we're seeing. So

138
00:08:26,520 --> 00:08:29,620
Josh:
reasoning isn't even here yet, and we're going to see it soon because it is open

139
00:08:29,620 --> 00:08:33,060
Josh:
source, so people can apply their own reasoning on top of it. I'm sure the Moonshot

140
00:08:33,060 --> 00:08:37,060
Josh:
team is going to be doing their own reasoning version of this model, and I'm

141
00:08:37,060 --> 00:08:40,320
Josh:
sure we're going to be getting even more impressive results soon. I see you have

142
00:08:40,320 --> 00:08:46,040
Josh:
a post up here, um, about the testing and overall performance. Can you please share? Yeah.

143
00:08:46,040 --> 00:08:51,720
Ejaaz:
Yeah, so, um, this is a tweet that summarizes really well how this model performs

144
00:08:51,720 --> 00:08:53,500
Ejaaz:
in relation to other frontier models.

145
00:08:53,740 --> 00:08:59,300
Ejaaz:
And the popular comparison that's taken for Kimi K2 is against Claude.

146
00:08:59,540 --> 00:09:01,360
Ejaaz:
So Claude has a bunch of models out.

147
00:09:01,680 --> 00:09:04,580
Ejaaz:
Claude 3.5 is its earlier model, and then Claude 4 is its latest.

148
00:09:05,220 --> 00:09:10,540
Ejaaz:
And the general take is that this model is just better than those models,

149
00:09:10,640 --> 00:09:15,340
Ejaaz:
which is just insane to say, because for so long, Josh, we've said that Claude

150
00:09:15,340 --> 00:09:16,680
Ejaaz:
was the best coding model.

151
00:09:16,880 --> 00:09:20,880
Ejaaz:
And indeed it was. And then within the span of, what is it, five days?

152
00:09:20,880 --> 00:09:25,700
Ejaaz:
Grok 4 released and it just completely blew Claude 4 out of the water in terms of coding.

153
00:09:26,280 --> 00:09:30,220
Ejaaz:
Now Kimi K2, an open source model out of China who doesn't even have access

154
00:09:30,220 --> 00:09:34,200
Ejaaz:
to the research and kind of proprietary knowledge that a lot of American AI

155
00:09:34,200 --> 00:09:36,540
Ejaaz:
companies have also beat it as well, right?

156
00:09:36,620 --> 00:09:41,360
Ejaaz:
So it kind of beats Claude at its own game, but it's also cheaper.

157
00:09:41,540 --> 00:09:45,860
Ejaaz:
It's 20% the cost of Claude 3.5, which is just an insane thing to say,

158
00:09:45,980 --> 00:09:49,120
Ejaaz:
which means that if you are a developer out there that

159
00:09:49,120 --> 00:09:51,900
Ejaaz:
wants to try your hand at kind of like vibe coding

160
00:09:51,900 --> 00:09:55,140
Ejaaz:
a bunch of things, or actually seriously coding something, you

161
00:09:55,140 --> 00:09:58,340
Ejaaz:
know, that's quite novel, but you don't have the hands on deck to do that, you

162
00:09:58,340 --> 00:10:05,220
Ejaaz:
can now spin up a Kimi K2 AI agent, actually multiple of them, for a very cost-efficient,

163
00:10:05,220 --> 00:10:09,400
Ejaaz:
reasonable, you know, salary. You don't have to pay, like, hundreds of thousands

164
00:10:09,400 --> 00:10:12,680
Ejaaz:
of dollars or, you know, hundreds of millions of dollars, which is what Meta is

165
00:10:12,680 --> 00:10:14,680
Ejaaz:
doing to kind of buy a bunch of these software engineers,

166
00:10:14,980 --> 00:10:19,140
Ejaaz:
you can spend, you know, the equivalent of maybe a Netflix subscription or $500

167
00:10:19,140 --> 00:10:23,080
Ejaaz:
to $1,000 a month and spin up your own app. So super, super cool.

168
00:10:23,480 --> 00:10:28,080
Josh:
And also one added perk that's there is that if you have a lot of

169
00:10:28,080 --> 00:10:31,160
Josh:
GPUs sitting around, you can actually run this model for free.

170
00:10:31,340 --> 00:10:34,020
Josh:
So that's the cost if you actually query it from the servers.

171
00:10:34,060 --> 00:10:36,640
Josh:
But I'm sure there's going to be companies that have access to excess GPUs.

172
00:10:36,740 --> 00:10:39,460
Josh:
They can actually just download the model because it's open source,

173
00:10:39,620 --> 00:10:41,320
Josh:
open weights, and they could run it on their own.

174
00:10:41,400 --> 00:10:44,420
Josh:
And that brings the cost of compute down to the cost per kilowatt of the energy

175
00:10:44,420 --> 00:10:45,960
Josh:
required to run the GPUs.

176
00:10:46,140 --> 00:10:49,140
Josh:
So because it's open source, you really start to see these costs decline,

177
00:10:49,200 --> 00:10:50,580
Josh:
but the quality doesn't.

178
00:10:50,720 --> 00:10:55,160
Josh:
And every time we see this, we see a huge productivity unlock in coding

179
00:10:55,160 --> 00:10:57,940
Josh:
output and amount of queries used. It's like, this is freaking awesome.

180
00:10:58,470 --> 00:11:04,070
Ejaaz:
Yeah, Josh, I saw something else come up as well. So do you remember when Claude

181
00:11:04,070 --> 00:11:09,590
Ejaaz:
first released, um, their frontier model? I think it was 3.5 or maybe it was 4.

182
00:11:09,590 --> 00:11:16,370
Ejaaz:
One of their bragging rights was it had a one million, uh, token context window, which...

183
00:11:16,370 --> 00:11:17,410
Josh:
Oh yes, which was huge.

184
00:11:17,410 --> 00:11:23,710
Ejaaz:
Yeah, which for listeners of the show is huge. It's like several, uh, novels'

185
00:11:23,710 --> 00:11:28,510
Ejaaz:
worth, um, of words or characters you could just bung into one single prompt.

186
00:11:28,650 --> 00:11:32,770
Ejaaz:
And the reason why that was such an amazing thing was for a while,

187
00:11:32,930 --> 00:11:37,730
Ejaaz:
people struggled to kind of communicate with these AIs because they couldn't set the context.

188
00:11:37,930 --> 00:11:42,190
Ejaaz:
There wasn't enough bandwidth within their chat log window for them to say,

189
00:11:42,570 --> 00:11:44,330
Ejaaz:
you know, and don't forget this. And then there was this.

190
00:11:44,450 --> 00:11:47,510
Ejaaz:
And then, you know, this detail and that detail, there just wasn't enough space.

191
00:11:47,750 --> 00:11:51,070
Ejaaz:
And models weren't performant enough to kind of consume all of this in one go.

192
00:11:51,410 --> 00:11:54,270
Ejaaz:
And then Claude came out and was like, hey, we have a one million token context window.

193
00:11:54,590 --> 00:11:57,230
Ejaaz:
Don't worry about it. Chuck in all the research papers that you want, chuck in

194
00:11:57,230 --> 00:12:01,750
Ejaaz:
your essay, chuck in reference books, and we got you. Um, I saw this tweet that

195
00:12:01,750 --> 00:12:04,690
Ejaaz:
was, uh, deleted. I think you sent this to me, um.

196
00:12:04,690 --> 00:12:07,650
Josh:
We got the screenshots. We always come with receipts.

197
00:12:07,650 --> 00:12:11,610
Ejaaz:
Yeah, I wonder why they deleted it, but, uh, good catch from you. Um, yeah, let's get into this.

198
00:12:11,610 --> 00:12:15,010
Josh:
What's your take on it? It was first posted, I think,

199
00:12:15,010 --> 00:12:19,790
Josh:
earlier today, yeah, like an hour ago, and then deleted pretty shortly afterwards.

200
00:12:19,790 --> 00:12:23,930
Josh:
And this is from a woman named Crystal. Crystal works with the Moonshot team. She

201
00:12:23,930 --> 00:12:28,730
Josh:
is part of the team that released Kimi K2. Um, and in this post it says,

202
00:12:28,730 --> 00:12:32,530
Josh:
Kimi isn't just another AI. It went viral in China as the first to support

203
00:12:32,960 --> 00:12:36,320
Josh:
a 2 million token context window. And then she goes on to say,

204
00:12:36,640 --> 00:12:41,300
Josh:
we're an AI lab with just 200 people, which is minusculely small compared

205
00:12:41,300 --> 00:12:42,720
Josh:
to a lot of the other labs they're competing with.

206
00:12:42,920 --> 00:12:46,320
Josh:
And it was an acknowledgement that they had a 2 million token context window.

207
00:12:46,420 --> 00:12:49,400
Josh:
And for those who, just a quick refresher on the context window stuff,

208
00:12:49,600 --> 00:12:53,960
Josh:
imagine you have, like, a gigantic textbook and you've read it once and you

209
00:12:53,960 --> 00:12:56,180
Josh:
close it and you kind of have a fuzzy memory of all the pages.

210
00:12:56,400 --> 00:12:59,300
Josh:
The context window allows you to lay all of those out in clear view

211
00:12:59,300 --> 00:13:02,180
Josh:
and directly reference every single page. So when

212
00:13:02,180 --> 00:13:05,220
Josh:
you have two million tokens, which is roughly two million words

213
00:13:05,220 --> 00:13:08,280
Josh:
of context, we're talking about, like, hundreds and hundreds

214
00:13:08,280 --> 00:13:11,140
Josh:
of books and textbooks and knowledge, and you could really dump a

215
00:13:11,140 --> 00:13:13,980
Josh:
lot of information in this for the AI to readily access. And

216
00:13:13,980 --> 00:13:17,160
Josh:
if they release that, a two million token

217
00:13:17,160 --> 00:13:20,480
Josh:
open source model, that's a huge

218
00:13:20,480 --> 00:13:23,400
Josh:
deal. I mean, even Grok 4 recently, I believe,

219
00:13:23,400 --> 00:13:27,540
Josh:
what did we say it was? It was a 256,000, uh, token context window, something like

220
00:13:27,540 --> 00:13:32,420
Josh:
that. So Grok 4 is one eighth of what they supposedly have accessible right now,

221
00:13:32,420 --> 00:13:37,340
Josh:
which is a really, really big deal. Um, so I'm hoping it was deleted because they

222
00:13:37,340 --> 00:13:39,860
Josh:
just don't want to share that, not because it's not true. I would like to believe

223
00:13:39,860 --> 00:13:42,820
Josh:
that it's true because, man, that'd be pretty epic. Yeah.
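
Editor's sketch: the "one eighth" comparison is simple arithmetic on the figures quoted in the episode. Both numbers are the hosts' recollections (the 2 million figure comes from a deleted post and is unconfirmed), the `fits` helper is hypothetical, and the words-per-token ratio is a rough rule of thumb for English text, not a measured value for any of these models.

```python
GROK4_CONTEXT = 256_000      # tokens, as recalled in the episode
CLAIMED_CONTEXT = 2_000_000  # tokens, from the deleted post (unconfirmed)

ratio = GROK4_CONTEXT / CLAIMED_CONTEXT
print(f"{ratio:.3f}")  # 0.128, close to 1/8 = 0.125

# Assumption: roughly 0.75 English words per token, so the word count
# is somewhat lower than the token count.
WORDS_PER_TOKEN = 0.75
approx_words = int(CLAIMED_CONTEXT * WORDS_PER_TOKEN)
print(approx_words)


def fits(token_count, window):
    """True if a prompt of `token_count` tokens fits inside a context window."""
    return token_count <= window


# A hypothetical 300,000-token prompt fits the claimed window but not the smaller one.
print(fits(300_000, CLAIMED_CONTEXT), fits(300_000, GROK4_CONTEXT))
```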

224
00:13:42,820 --> 00:13:45,640
Ejaaz:
And the people are loving it, Josh. Um, check out this

225
00:13:45,640 --> 00:13:49,160
Ejaaz:
graph from, uh, OpenRouter, which basically shows,

226
00:13:49,160 --> 00:13:53,200
Ejaaz:
uh, the split of usage between everyone

227
00:13:53,200 --> 00:13:56,400
Ejaaz:
on their platform that are querying different models. So for context

228
00:13:56,400 --> 00:13:59,340
Ejaaz:
here, OpenRouter is a website that you can go to

229
00:13:59,340 --> 00:14:03,120
Ejaaz:
and you can type up a prompt, just like you do at ChatGPT, and

230
00:14:03,120 --> 00:14:06,740
Ejaaz:
you can decide which model your

231
00:14:06,740 --> 00:14:09,640
Ejaaz:
prompt goes to, or you could let OpenRouter decide for you,

232
00:14:09,640 --> 00:14:13,920
Ejaaz:
and it kind of, like, divvies up your query. So if you have a coding query, it's

233
00:14:13,920 --> 00:14:18,580
Ejaaz:
probably going to send it to Claude, or now Kimi K2 or Grok 4. But if you have

234
00:14:18,580 --> 00:14:22,860
Ejaaz:
something that's more like to do with creative writing or something that's like

235
00:14:22,860 --> 00:14:27,300
Ejaaz:
a case study, it might send it to OpenAI's o3 model, right? So it kind of, like, decides for you.

236
00:14:27,840 --> 00:14:33,340
Ejaaz:
OpenRouter released this graphic, which basically shows that Kimi K2 surpassed

237
00:14:33,340 --> 00:14:38,520
Ejaaz:
xAI in token market share just a few days after launching, which basically means

238
00:14:38,520 --> 00:14:40,820
Ejaaz:
that xAI spent, you know,

239
00:14:41,200 --> 00:14:43,660
Ejaaz:
hundreds of billions of dollars training up their Grok 4 model,

240
00:14:43,860 --> 00:14:47,120
Ejaaz:
which just kind of beat out the competition just last week.

241
00:14:47,480 --> 00:14:50,640
Ejaaz:
Then Kimi K2 gets released, completely open source,

242
00:14:50,640 --> 00:14:53,720
Ejaaz:
and everyone starts to use that more than

243
00:14:53,720 --> 00:14:56,440
Ejaaz:
Grok 4, which is just an insane thing to say, and

244
00:14:56,440 --> 00:15:00,640
Ejaaz:
just shows how rapidly these AI models compete with each other and surpass each

245
00:15:00,640 --> 00:15:05,960
Ejaaz:
other. Um, I think part of the reason for this, Josh, is it's open source, right?

246
00:15:05,960 --> 00:15:12,000
Ejaaz:
which means that not only are retail users like myself and yourself using it

247
00:15:12,000 --> 00:15:14,060
Ejaaz:
for our daily queries, you know, uh,

248
00:15:14,570 --> 00:15:18,930
Ejaaz:
create this recipe for me or whatever, but researchers and builders all over

249
00:15:18,930 --> 00:15:25,870
Ejaaz:
the world that have so far been challenged or had this obstacle of pots of money

250
00:15:25,870 --> 00:15:30,170
Ejaaz:
basically to start their own AI company now have access to a frontier,

251
00:15:30,650 --> 00:15:34,270
Ejaaz:
world-renowned model and can create whatever application, website,

252
00:15:34,470 --> 00:15:36,570
Ejaaz:
or product that they want to make.

253
00:15:36,790 --> 00:15:40,250
Ejaaz:
So I think that's part of the usage there as well. Do you have any takes on this?

254
00:15:40,690 --> 00:15:44,090
Josh:
Yeah, and it's downstream of cost, right? We always see this when a model is

255
00:15:44,090 --> 00:15:48,230
Josh:
cheaper and mostly equivalent, the money will always flow to the cheaper model.

256
00:15:48,390 --> 00:15:51,330
Josh:
It'll always get more queries. I think it's important to note the different

257
00:15:51,330 --> 00:15:56,210
Josh:
use cases of these models. So they're not directly competing head to head on the same benchmarks.

258
00:15:56,390 --> 00:16:00,270
Josh:
I think what we see is like when we talk about Claude, it's generally known as the coding model.

259
00:16:00,610 --> 00:16:04,470
Josh:
And I don't think, like, OpenAI's o3 is really competing directly with Claude

260
00:16:04,470 --> 00:16:07,890
Josh:
because it's more of a general intelligence versus a coding specific intelligence.

261
00:16:08,770 --> 00:16:13,530
Josh:
K2 is probably closer to a Claude, I would assume, where it's really good at

262
00:16:13,530 --> 00:16:15,410
Josh:
coding because it uses this mixture of experts.

263
00:16:15,630 --> 00:16:19,670
Josh:
And I think that helps it find the tools. It uses this cool new novel thing

264
00:16:19,670 --> 00:16:21,770
Josh:
called like multiple tool use.

265
00:16:21,890 --> 00:16:25,290
Josh:
So each one of these experts can use a tool simultaneously and they could use

266
00:16:25,290 --> 00:16:26,810
Josh:
these tools and work together to get better answers.

267
00:16:27,050 --> 00:16:29,630
Josh:
So in the case of coding, this is a home run.

268
00:16:30,050 --> 00:16:33,350
Josh:
Like it is very cheap cost per token, very high quality outputs.

269
00:16:33,570 --> 00:16:37,950
Ejaaz:
I actually think it can compete with OpenAI's o3, Josh. Check this out.

270
00:16:38,690 --> 00:16:42,790
Ejaaz:
So Rowan, yeah, Rowan Cheung put this out yesterday. And he basically goes,

271
00:16:42,990 --> 00:16:46,110
Ejaaz:
I think we're at the tipping point for AI-generated writing.

272
00:16:46,330 --> 00:16:50,810
Ejaaz:
It's been notoriously bad, but China's Kimi K2, an open-weight model,

273
00:16:51,010 --> 00:16:53,190
Ejaaz:
is now topping creative writing benchmarks.

274
00:16:53,450 --> 00:17:00,090
Ejaaz:
So just to put that into context, that's like having the top most, I don't know,

275
00:17:00,290 --> 00:17:04,990
Ejaaz:
smartest or slightly autistic software engineer, at the top engineering company

276
00:17:04,990 --> 00:17:12,450
Ejaaz:
working on AI models, also being the best poet or creative scriptwriter and directing

277
00:17:12,450 --> 00:17:14,730
Ejaaz:
the next best movie or whatever that might be,

278
00:17:14,810 --> 00:17:17,170
Ejaaz:
or creating a Harry Potter novel series.

279
00:17:17,710 --> 00:17:22,310
Ejaaz:
This model can basically do both. And what it's pointing out here is that compared

280
00:17:22,310 --> 00:17:25,850
Ejaaz:
to o3, it tops it. Look at this. Completely beats it.

281
00:17:27,470 --> 00:17:30,070
Josh:
Okay, so I take that back. Maybe it is just better at everything.

282
00:17:31,110 --> 00:17:32,690
Josh:
Yeah, that's some pretty impressive results.

283
00:17:32,930 --> 00:17:38,010
Ejaaz:
I think like what's worth pointing out here is, and I don't know whether any

284
00:17:38,010 --> 00:17:43,410
Ejaaz:
of the American AI models do this, Josh, but mixture of experts seems to be clearly a win here.

285
00:17:43,690 --> 00:17:47,970
Ejaaz:
The ability to create an incredibly smart model doesn't come without,

286
00:17:47,970 --> 00:17:52,090
Ejaaz:
you know, this large storage load that is needed, right? One trillion parameters.

287
00:17:52,490 --> 00:17:55,990
Ejaaz:
But then combining it with the ability to be like, hey,

288
00:17:56,190 --> 00:17:58,030
Ejaaz:
you don't need to query the entire thing.

289
00:17:58,270 --> 00:18:02,370
Ejaaz:
We've got you. We have a smart router, which basically pulls on the best experts,

290
00:18:02,490 --> 00:18:05,790
Ejaaz:
as you described earlier, for whatever relevant query you have.

291
00:18:05,870 --> 00:18:08,770
Ejaaz:
So if you have a creative writing task or if you have a coding thing,

292
00:18:08,910 --> 00:18:11,750
Ejaaz:
we'll send it to two different departments of this model.

293
00:18:12,110 --> 00:18:15,930
Ejaaz:
That's a really huge win. Do any other American models use this?

294
00:18:16,150 --> 00:18:18,750
Josh:
Well, the first thing that came to my mind when you said that is Grok 4,

295
00:18:19,010 --> 00:18:23,050
Josh:
which doesn't exactly use this, but uses a similar thing, where instead of using

296
00:18:23,050 --> 00:18:25,630
Josh:
a mixture of experts, it uses a mixture of agents.

297
00:18:26,190 --> 00:18:31,910
Josh:
So Grok 4 Heavy uses a bunch of distributed agents that are basically clones of the large model.

298
00:18:32,230 --> 00:18:36,550
Josh:
But that takes up a tremendous amount of compute. And that is the $300 a month plan.

299
00:18:36,710 --> 00:18:41,690
Ejaaz:
That's replicating Grok 4 though, right? So that's like taking the model and copy-pasting it.

300
00:18:41,790 --> 00:18:46,710
Ejaaz:
So let's say Grok 4 was one trillion parameters just for ease of comparison.

301
00:18:47,030 --> 00:18:51,210
Ejaaz:
That's like creating, if there was four agents, that's four trillion parameters,

302
00:18:51,250 --> 00:18:53,930
Ejaaz:
right? So it's still pretty costly and inefficient. Is that what you're saying?

303
00:18:53,930 --> 00:18:57,790
Josh:
No, it's actually the opposite direction of K2.

304
00:18:57,790 --> 00:19:02,970
Josh:
So what they have used is just, and again, this is kind of similar to tracking

305
00:19:02,970 --> 00:19:06,330
Josh:
sentiment between the United States and China, where the United States will throw

306
00:19:06,330 --> 00:19:08,070
Josh:
compute at it, where China will throw, like,

307
00:19:09,090 --> 00:19:12,250
Josh:
kind of clever resource at it. So Grok, yeah,

308
00:19:12,250 --> 00:19:14,950
Josh:
when they use their mixture of agents, it actually just costs a lot more

309
00:19:14,950 --> 00:19:17,670
Josh:
money, whereas K2, when they use their mixture of

310
00:19:17,670 --> 00:19:20,690
Josh:
experts, well, it costs a lot less. Instead of using 4 trillion

311
00:19:20,690 --> 00:19:23,670
Josh:
parameters in this case, it uses just 32 billion, and it

312
00:19:23,670 --> 00:19:26,370
Josh:
kind of copies that 32 billion over and over, and it's really, it's a really

313
00:19:26,370 --> 00:19:29,090
Josh:
elegant solution that seems to be

314
00:19:29,090 --> 00:19:32,090
Josh:
yielding pretty comparable results. So I think as we

315
00:19:32,090 --> 00:19:34,770
Josh:
see these efficiency upgrades, I'm sure they will

316
00:19:34,770 --> 00:19:38,910
Josh:
eventually trickle down into the United States models. And when they do, that

317
00:19:38,910 --> 00:19:43,990
Josh:
is going to be a huge unlock in terms of cost per token, in terms of the smaller

318
00:19:43,990 --> 00:19:47,430
Josh:
distilled models that we're going to be able to run on our own computers.

319
00:19:47,430 --> 00:19:51,550
Josh:
But yeah, I don't know of any who are also using it at this scale. It might be

320
00:19:51,550 --> 00:19:53,810
Josh:
novel just to K2 right now.

321
00:19:53,810 --> 00:19:58,350
Ejaaz:
And I think that this is the method that probably scales the best, Josh.

322
00:19:58,350 --> 00:20:00,110
Josh:
Yeah, it makes sense. Efficiency

323
00:20:00,110 --> 00:20:06,390
Ejaaz:
always wins at the end, right? And to see this kind of innovation come pretty

324
00:20:06,390 --> 00:20:10,730
Ejaaz:
early on in a technology's life cycle is just super impressive to see.

325
00:20:11,210 --> 00:20:16,570
Ejaaz:
Another thing I saw is there's two different versions of this model, I believe.

326
00:20:16,750 --> 00:20:22,910
Ejaaz:
There's something called Kimi K2 Base, which is basically the model for researchers

327
00:20:22,910 --> 00:20:26,690
Ejaaz:
who want full control for fine-tuning and custom solutions, right?

328
00:20:26,890 --> 00:20:32,690
Ejaaz:
So imagine this model as the entire parameter set. So you have access to one

329
00:20:32,690 --> 00:20:35,730
Ejaaz:
trillion parameters, all the weight designs and everything.

330
00:20:36,010 --> 00:20:38,850
Ejaaz:
And if you're a nerd that wants to nerd out, you can

331
00:20:38,850 --> 00:20:41,970
Ejaaz:
go crazy. You know, if you have, like, your own GPU

332
00:20:41,970 --> 00:20:44,950
Ejaaz:
cluster at home, or if you happen to have a convenient

333
00:20:44,950 --> 00:20:48,050
Ejaaz:
warehouse full of servers that you weirdly

334
00:20:48,050 --> 00:20:51,090
Ejaaz:
have access to, you can go crazy with it. If you

335
00:20:51,090 --> 00:20:54,810
Ejaaz:
think about, like, the early gaming days of Counter-Strike, and then you could

336
00:20:54,810 --> 00:21:00,470
Ejaaz:
like, mod it, you can basically mod this model to your heart's desire. And then

337
00:21:00,470 --> 00:21:06,310
Ejaaz:
there's a second version called K2 Instruct, which is for drop-in, general-purpose

338
00:21:06,310 --> 00:21:08,550
Ejaaz:
chat and AI agent experiences.

339
00:21:08,790 --> 00:21:12,070
Ejaaz:
So this is kind of like at the consumer level, if you're experimenting with

340
00:21:12,070 --> 00:21:16,050
Ejaaz:
these things, or if you want to run an experiment at home on a specific use

341
00:21:16,050 --> 00:21:19,630
Ejaaz:
case, you can kind of like take that away and do that for yourself.

342
00:21:19,910 --> 00:21:22,170
Ejaaz:
That's how I understand it, Josh. Do you have any takes on this?

343
00:21:22,470 --> 00:21:25,430
Josh:
That makes sense. And I think that second version that you're describing is

344
00:21:25,430 --> 00:21:27,410
Josh:
what's actually available publicly on their website, right?

345
00:21:27,530 --> 00:21:31,410
Josh:
So if you go to Kimi.com, it has a text box. It looks just like ChatGPT, like you're used to.

346
00:21:31,550 --> 00:21:34,350
Josh:
And that's where you can run that second-tier model, which

347
00:21:34,350 --> 00:21:37,230
Josh:
you described as the drop-in, general-purpose

348
00:21:37,230 --> 00:21:40,210
Josh:
chat. And then, yeah, for the hardcore researchers, there's

349
00:21:40,210 --> 00:21:43,270
Josh:
a GitHub repo, and the GitHub repo has all the weights and all the code, and

350
00:21:43,270 --> 00:21:46,530
Josh:
you can really download it, dive in, use the full thing. I

351
00:21:46,530 --> 00:21:49,470
Josh:
was playing around with the Kimi tool, and it's really cool.

352
00:21:49,470 --> 00:21:52,270
Josh:
It's fast. Oh, I mean, it's lightning fast. If you

353
00:21:52,270 --> 00:21:55,250
Josh:
go from a reasoning model to an inference model like Kimi,

354
00:21:55,250 --> 00:21:57,970
Josh:
you get responses like this. Like, when

355
00:21:57,970 --> 00:22:01,710
Josh:
I'm using Grok 4 or o3, I'm sitting there sometimes for a couple minutes, just

356
00:22:01,710 --> 00:22:05,350
Josh:
waiting for an answer. This, you type it in and it just types back right away,

357
00:22:05,350 --> 00:22:08,870
Josh:
no time waiting. So it's kind of refreshing to see that, but it's also a

358
00:22:08,870 --> 00:22:11,910
Josh:
testament to how impressive it is. I'm getting great answers, and it's just spitting

359
00:22:11,910 --> 00:22:14,650
Josh:
it right out. So what happens when they add the reasoning layer on top? Well, it's

360
00:22:14,650 --> 00:22:15,970
Josh:
probably going to get pretty freaking good.

361
00:22:16,950 --> 00:22:20,770
Ejaaz:
So the trend we're seeing, and we saw this last week with Grok 4,

362
00:22:21,230 --> 00:22:27,830
Ejaaz:
is typically we're expected to wait a while when we send a prompt to a breakthrough

363
00:22:27,830 --> 00:22:33,230
Ejaaz:
model because it's thinking, it's trying to basically replicate what we have in our brains up here.

364
00:22:33,490 --> 00:22:38,490
Ejaaz:
And now it's just getting much quicker and much smarter and much cheaper.

365
00:22:38,490 --> 00:22:43,870
Ejaaz:
So the long story short is these incredibly powerful, I kind of think about

366
00:22:43,870 --> 00:22:48,710
Ejaaz:
it as how we went from massive desktop computers to slick cell phones,

367
00:22:49,010 --> 00:22:51,190
Ejaaz:
Josh, and then we're going to eventually have chips in our brain.

368
00:22:51,650 --> 00:22:55,730
Ejaaz:
AI is just kind of like fast tracking that entire life cycle within like a couple

369
00:22:55,730 --> 00:22:56,970
Ejaaz:
of years, which is just insane.

370
00:22:57,390 --> 00:23:00,730
Josh:
And these efficiency improvements are really exciting because you can see how

371
00:23:00,730 --> 00:23:05,090
Josh:
quickly they're shrinking and allowing eventually for those incredible models

372
00:23:05,090 --> 00:23:06,150
Josh:
to just run on our phones.

373
00:23:06,370 --> 00:23:09,390
Josh:
So there's totally a world a year from now in which, like, a

374
00:23:09,390 --> 00:23:12,770
Josh:
Grok 4, o3, Kimi K2-capable model

375
00:23:12,770 --> 00:23:15,490
Josh:
is small enough that it could just run inside of our

376
00:23:15,490 --> 00:23:18,290
Josh:
phone, and run on a mobile device or run locally on a laptop

377
00:23:18,290 --> 00:23:21,530
Josh:
or when you're offline, and you kind of have this portable intelligence

378
00:23:21,530 --> 00:23:24,250
Josh:
that's available everywhere, anytime, even if

379
00:23:24,250 --> 00:23:27,190
Josh:
you're not connected to the world. And that seems really cool.

380
00:23:27,190 --> 00:23:30,310
Josh:
Like, we were talking a few episodes ago about Apple's local,

381
00:23:30,310 --> 00:23:33,330
Josh:
free AI inference running on an iPhone,

382
00:23:33,330 --> 00:23:36,150
Josh:
but how the base models still kind of suck. Like, they don't really do

383
00:23:36,150 --> 00:23:38,870
Josh:
anything super interesting. They're basically good enough to do what

384
00:23:38,870 --> 00:23:41,890
Josh:
you would expect Siri to do but can't do. And these

385
00:23:41,890 --> 00:23:44,650
Josh:
models, as we get more and more breakthroughs like this that allow you to

386
00:23:44,650 --> 00:23:47,630
Josh:
run much larger parameter counts

387
00:23:47,630 --> 00:23:50,650
Josh:
on a much smaller device, it's going to start really

388
00:23:50,650 --> 00:23:54,210
Josh:
superpowering these mobile devices. And I can't help but think about the OpenAI

389
00:23:54,210 --> 00:23:58,910
Josh:
hardware device. I'm like, wow, that'd be super cool if you had, like, o3

390
00:23:58,910 --> 00:24:02,230
Josh:
running locally in the middle of the jungle somewhere with no service, and you

391
00:24:02,230 --> 00:24:06,990
Josh:
still had access to all of its capabilities. Like, that's probably coming downstream

392
00:24:06,990 --> 00:24:10,270
Josh:
of breakthroughs like this where we get really big efficiency unlocks.

393
00:24:10,880 --> 00:24:14,400
Ejaaz:
I mean, it's not just efficiency, though, right? It's the fact that if you can

394
00:24:14,400 --> 00:24:18,920
Ejaaz:
run it locally on your device, it can have access to all your private data without

395
00:24:18,920 --> 00:24:22,380
Ejaaz:
exposing all of that to the model providers themselves, right?

396
00:24:22,440 --> 00:24:28,840
Ejaaz:
So one of the major concerns of not just AI models, but also with mobile phones is privacy.

397
00:24:29,440 --> 00:24:32,760
Ejaaz:
I don't want to share all my kind of like private health, financial,

398
00:24:32,760 --> 00:24:35,620
Ejaaz:
and social media data, because then you're just going to have everything on

399
00:24:35,620 --> 00:24:36,720
Ejaaz:
me and you're going to use me.

400
00:24:36,820 --> 00:24:39,480
Ejaaz:
You're going to use me as a product, right? And that's kind of like been the

401
00:24:39,480 --> 00:24:41,760
Ejaaz:
motto for the last decade in tech.

402
00:24:42,220 --> 00:24:45,820
Ejaaz:
And so with AI, that's a supercharged version of it. The information gets more

403
00:24:45,820 --> 00:24:47,240
Ejaaz:
personal. It's not just your likes.

404
00:24:47,440 --> 00:24:52,660
Ejaaz:
It's, you know, where Josh shops every day and, you know, who he's dating and

405
00:24:52,660 --> 00:24:53,640
Ejaaz:
all these kinds of things, right?

406
00:24:53,800 --> 00:24:56,840
Ejaaz:
And that becomes quite personal and intrusive very quickly.

407
00:24:57,240 --> 00:25:03,160
Ejaaz:
So the question then becomes, how can we have the magic of an AI model without it being so obtrusive?

408
00:25:03,220 --> 00:25:09,480
Ejaaz:
And that is open source, locally run AI or privately run AI. And Kimi K2 is a

409
00:25:09,480 --> 00:25:13,900
Ejaaz:
frontier model that can technically run on your local device if you set up the right hardware for it.

410
00:25:14,000 --> 00:25:17,380
Ejaaz:
And the way that we're trending, you can basically end up having that on your

411
00:25:17,380 --> 00:25:19,760
Ejaaz:
device, which is just a huge unlock.

412
00:25:20,020 --> 00:25:25,100
Ejaaz:
And if you can imagine how you use OpenAI o3 right now, Josh,

413
00:25:25,240 --> 00:25:26,960
Ejaaz:
right? I know you use it as much as I do.

414
00:25:27,080 --> 00:25:30,740
Ejaaz:
The reason why you and I use it so much isn't just because it's so smart,

415
00:25:30,900 --> 00:25:32,980
Ejaaz:
but it's because it remembers everything about us.

416
00:25:33,080 --> 00:25:36,480
Ejaaz:
But I hate that Sam knows or has access to all that data.

417
00:25:36,480 --> 00:25:41,280
Ejaaz:
I hate that if he chooses to switch on personalized ads, which is currently

418
00:25:41,280 --> 00:25:44,600
Ejaaz:
the model where most of these tech companies make money right now,

419
00:25:44,760 --> 00:25:48,700
Ejaaz:
he can, and there's nothing I can do about it because I don't want to use any

420
00:25:48,700 --> 00:25:49,700
Ejaaz:
other model apart from that.

421
00:25:49,840 --> 00:25:51,040
Ejaaz:
But if there was a locally run

422
00:25:51,040 --> 00:25:56,120
Ejaaz:
model that had access to all the memory and context, I'd use that instead.

423
00:25:56,880 --> 00:26:00,440
Josh:
And this is suspicious. I mean, this is a different conversation in total,

424
00:26:00,600 --> 00:26:04,140
Josh:
but isn't it interesting how other companies haven't really leaned into memory

425
00:26:04,140 --> 00:26:06,600
Josh:
when it's seemingly the most important moat that there is.

426
00:26:07,040 --> 00:26:11,220
Josh:
Like, Grok 4 doesn't have good memory rolled out. Gemini doesn't really have memory.

427
00:26:11,440 --> 00:26:13,760
Josh:
There's no, Claude doesn't have memory the way that OpenAI does.

428
00:26:13,940 --> 00:26:18,880
Josh:
Yet it's the single biggest reason why we both continue to go back to ChatGPT and OpenAI.

429
00:26:19,280 --> 00:26:21,640
Josh:
So that's just been an interesting thing. I mean, Kimi is open source.

430
00:26:21,760 --> 00:26:24,420
Josh:
I wouldn't expect them to lean too much into it. But for these closed source

431
00:26:24,420 --> 00:26:26,980
Josh:
models, that's just, it's another interesting just observation.

432
00:26:27,180 --> 00:26:30,120
Josh:
Like, hey, the most important thing isn't, doesn't seem to be prioritized by

433
00:26:30,120 --> 00:26:30,900
Josh:
other companies just yet.

434
00:26:31,340 --> 00:26:37,660
Ejaaz:
Why do you think that is? So my theory, at least from xAI's or Grok 4's

435
00:26:37,660 --> 00:26:43,860
Ejaaz:
perspective, is Elon's like, okay, I'm not going to be able to build a better chat

436
00:26:43,860 --> 00:26:50,800
Ejaaz:
bot or chat messenger than OpenAI has. There's not too many features I can

437
00:26:51,410 --> 00:26:55,550
Ejaaz:
set Grok 4 apart with that o3 doesn't already do, right?

438
00:26:55,730 --> 00:26:59,670
Ejaaz:
But where I can beat o3 is at the app layer.

439
00:26:59,870 --> 00:27:05,210
Ejaaz:
I can create a better app store than they have because they haven't really created

440
00:27:05,210 --> 00:27:09,290
Ejaaz:
one that is sticky enough for users to continually use.

441
00:27:09,410 --> 00:27:15,150
Ejaaz:
And I can use that data set to then unlock memory and context at that point, right?

442
00:27:15,290 --> 00:27:18,990
Ejaaz:
So I just saw today that they released, they

443
00:27:18,990 --> 00:27:21,870
Ejaaz:
being xAI, released a new feature for Grok 4

444
00:27:21,870 --> 00:27:25,330
Ejaaz:
called, I think it's Companions, Josh.

445
00:27:25,330 --> 00:27:29,150
Ejaaz:
And it's basically these, yeah, these animated,

446
00:27:29,150 --> 00:27:33,830
Ejaaz:
avatar-like characters. So they basically look like they're from an anime

447
00:27:33,830 --> 00:27:37,830
Ejaaz:
show. And you know how you can use voice mode in OpenAI and you can kind of

448
00:27:37,830 --> 00:27:44,130
Ejaaz:
like talk to this realistic, human-sounding AI? You now have a face and a character

449
00:27:44,130 --> 00:27:47,250
Ejaaz:
on Grok 4, and it's really entertaining, Josh.

450
00:27:47,390 --> 00:27:51,090
Ejaaz:
Like I find myself kind of like engaged in this thing because I'm not just typing words.

451
00:27:51,270 --> 00:27:55,190
Ejaaz:
It's not just this binary to and fro with this chat messenger.

452
00:27:55,410 --> 00:28:00,330
Ejaaz:
It's this human, this cute, attractive human that I'm just like now speaking to.

453
00:28:00,610 --> 00:28:03,830
Ejaaz:
And I think that that's the strategy that a lot of these AI companies,

454
00:28:04,010 --> 00:28:08,570
Ejaaz:
if I had to guess, are taking to kind of like seed their user base before they

455
00:28:08,570 --> 00:28:10,590
Ejaaz:
unlock memory. I don't know whether you have a take on that.

456
00:28:10,590 --> 00:28:13,650
Josh:
Yeah, I have a fun little demo. I actually played around with it this morning

457
00:28:13,650 --> 00:28:17,950
Josh:
and I was using it totally unhinged, no filter, very vulgar,

458
00:28:18,110 --> 00:28:21,030
Josh:
but like kind of fun. It's like a fun little party trick.

459
00:28:21,530 --> 00:28:25,150
Josh:
And yeah, I mean, that was a surprise to me this morning when I saw that rolled

460
00:28:25,150 --> 00:28:27,370
Josh:
out. I was like, huh, that doesn't really seem like it makes sense.

461
00:28:27,490 --> 00:28:29,150
Josh:
But I think they're just having fun with it.

462
00:28:29,430 --> 00:28:32,670
Ejaaz:
Can we for a second talk about the team?

463
00:28:33,090 --> 00:28:37,910
Ejaaz:
So we've mentioned just now how they've all come from China and how China's

464
00:28:37,910 --> 00:28:41,710
Ejaaz:
like really advancing open source AI models, and they've completely beat out

465
00:28:41,710 --> 00:28:45,310
Ejaaz:
the competition in America, Meta's Llama being the obvious one.

466
00:28:45,470 --> 00:28:47,570
Ejaaz:
We've got Qwen from Alibaba.

467
00:28:47,850 --> 00:28:52,830
Ejaaz:
We've got DeepSeek R1. Now we have Kimi K2. The team is basically...

468
00:28:53,830 --> 00:29:00,530
Ejaaz:
The AI Avengers of China, Josh. So these three co-founders all have deep AI

469
00:29:00,530 --> 00:29:04,250
Ejaaz:
ML backgrounds that hail from the top American universities,

470
00:29:04,250 --> 00:29:05,550
Ejaaz:
such as Carnegie Mellon.

471
00:29:05,630 --> 00:29:08,830
Ejaaz:
One of them has a PhD from Carnegie Mellon in machine learning,

472
00:29:08,950 --> 00:29:14,370
Ejaaz:
which is basically, for those of you who don't know, is like God-tier degree for AI.

473
00:29:14,570 --> 00:29:19,310
Ejaaz:
That means you're desirable and hireable by every other AI company after you graduate.

474
00:29:19,470 --> 00:29:23,670
Ejaaz:
But it's not just that. They also have credibility and degrees from the top universities in China.

475
00:29:23,830 --> 00:29:28,610
Ejaaz:
Especially this one university called Tsinghua, which seemed to be the top of their field.

476
00:29:28,730 --> 00:29:34,730
Ejaaz:
I looked them up on rankings for AI universities globally, and they often come

477
00:29:34,730 --> 00:29:40,790
Ejaaz:
in number three or four in the top 10 AI universities. So pretty impressive from there.

478
00:29:41,070 --> 00:29:46,090
Ejaaz:
But what I found really interesting, Josh, was one of the co-founders was an

479
00:29:46,090 --> 00:29:51,110
Ejaaz:
expert in training AI models on low-cost optimized hardware.

480
00:29:51,110 --> 00:29:58,690
Ejaaz:
And the reason why I mentioned this is it's no secret that if you want a top

481
00:29:58,690 --> 00:30:02,970
Ejaaz:
frontier AI model, you need to train it on NVIDIA's GPUs.

482
00:30:03,110 --> 00:30:06,010
Ejaaz:
You need to train it on NVIDIA's hardware.

483
00:30:06,650 --> 00:30:10,850
Ejaaz:
NVIDIA's market cap, I think, at the end of last week, surpassed $4 trillion.

484
00:30:11,550 --> 00:30:18,070
Ejaaz:
That's $4 trillion with a T. That is more than the current GDP of the entire British economy, where I hail from.

485
00:30:18,410 --> 00:30:19,730
Josh:
And the largest in the world.

486
00:30:19,730 --> 00:30:20,910
Ejaaz:
And there's never been.

487
00:30:20,910 --> 00:30:21,510
Josh:
A bigger company

488
00:30:21,510 --> 00:30:24,350
Ejaaz:
There's never been a bigger company. It's just

489
00:30:24,350 --> 00:30:27,590
Ejaaz:
insane to wrap your head around. And it's not without

490
00:30:27,590 --> 00:30:30,270
Ejaaz:
reason. They supply, basically, or they have a

491
00:30:30,270 --> 00:30:33,210
Ejaaz:
grasp or a monopoly on the hardware that

492
00:30:33,210 --> 00:30:36,290
Ejaaz:
is needed to train top models. Now Kimi K2

493
00:30:36,290 --> 00:30:40,970
Ejaaz:
comes along, casually drops a one-trillion-parameter model, one of the largest

494
00:30:40,970 --> 00:30:46,730
Ejaaz:
models ever released, and it's trained on hardware that isn't NVIDIA's.

495
00:30:46,730 --> 00:30:50,730
Ejaaz:
And Jensen Huang, I need to find this clip, Josh, but Jensen Huang basically

496
00:30:50,730 --> 00:30:53,510
Ejaaz:
was on stage, I think it was at a private conference maybe yesterday,

497
00:30:53,810 --> 00:31:01,070
Ejaaz:
but he was quoted as saying 50% of the top AI researchers are Chinese and are from China.

498
00:31:01,070 --> 00:31:04,990
Ejaaz:
And what he was implicitly getting at is they're a real threat now.

499
00:31:05,150 --> 00:31:08,030
Ejaaz:
I think for the last decade, we've kind of been like, ah, yeah,

500
00:31:08,210 --> 00:31:12,970
Ejaaz:
China's just going to copy paste everything that comes out of America's tech sector.

501
00:31:13,210 --> 00:31:17,590
Ejaaz:
But when it comes to AI, we've kind of like maintained the same mindset up until

502
00:31:17,590 --> 00:31:19,830
Ejaaz:
now where they're really just competing with us.

503
00:31:19,910 --> 00:31:24,570
Ejaaz:
And if they have the hardware, they have the ability to research new techniques

504
00:31:24,570 --> 00:31:28,410
Ejaaz:
to train these models, like DeepSeek's reinforcement learning and reasoning,

505
00:31:28,690 --> 00:31:33,050
Ejaaz:
and then Kimi K2's kind of like efficient training run, which you showed earlier.

506
00:31:33,270 --> 00:31:38,410
Ejaaz:
They've come to play, Josh. And I think it's worth highlighting that China has

507
00:31:38,410 --> 00:31:44,750
Ejaaz:
a very strong grasp on top AI researchers in the world and models that are coming out of it.

508
00:31:45,330 --> 00:31:48,630
Josh:
Where are their $100 million offers? I haven't seen any of those coming through.

509
00:31:49,310 --> 00:31:55,050
Josh:
None, dude. The most impressive thing is that they do it without the resources that we have.

510
00:31:55,550 --> 00:32:01,050
Josh:
Imagine if they did have access to the clusters of these like H100s that NVIDIA is making.

511
00:32:01,210 --> 00:32:03,590
Josh:
I mean, that would be, would they crush us?

512
00:32:03,830 --> 00:32:08,550
Josh:
And we kind of have this timeline here where we're kind of running up against

513
00:32:08,550 --> 00:32:13,210
Josh:
the edge of energy that we have available to us to train these massive models.

514
00:32:13,470 --> 00:32:17,030
Josh:
Whereas China does not have that constraint. They have significantly more energy to power these.

515
00:32:17,210 --> 00:32:22,610
Josh:
So in the event, the inevitable event that they do get the chips and they are

516
00:32:22,610 --> 00:32:26,370
Josh:
able to train at the scale that we are, I'm not sure we're able to continue

517
00:32:26,370 --> 00:32:29,790
Josh:
our rate of acceleration in terms of hardware manufacturing,

518
00:32:30,370 --> 00:32:32,310
Josh:
large training as fast as they will.

519
00:32:32,510 --> 00:32:36,010
Josh:
And they already have done the hard work on the software efficiency side.

520
00:32:36,150 --> 00:32:40,470
Josh:
They've cranked out every single efficiency because they are doing it on constrained hardware.

521
00:32:40,650 --> 00:32:43,750
Josh:
So it's going to create this really interesting effect where they're coming

522
00:32:43,750 --> 00:32:47,690
Josh:
at it from the, like, ingenuity software approach. We're coming at it from the

523
00:32:47,690 --> 00:32:51,170
Josh:
brute-force, throw-a-lot-of-compute-at-it approach, and we'll see where both

524
00:32:51,170 --> 00:32:54,690
Josh:
sides end up. But it's clear that China is still behind, because they are the

525
00:32:54,690 --> 00:32:58,730
Josh:
ones open sourcing the models. And we know at this point now, if you're open sourcing

526
00:32:58,730 --> 00:33:00,190
Josh:
your model, you're doing it because you're behind.

527
00:33:00,190 --> 00:33:03,050
Ejaaz:
Yeah, yeah. I mean, one thing

528
00:33:03,050 --> 00:33:05,810
Ejaaz:
that did surprise me, Josh, was that they released a one

529
00:33:05,810 --> 00:33:08,750
Ejaaz:
trillion parameter open source model. I didn't

530
00:33:08,750 --> 00:33:11,550
Ejaaz:
expect them to catch up that quickly. Like, one

531
00:33:11,550 --> 00:33:14,730
Ejaaz:
trillion is a lot. Yeah, another thing

532
00:33:14,730 --> 00:33:17,850
Ejaaz:
I was thinking about is China has dominated

533
00:33:17,850 --> 00:33:20,910
Ejaaz:
hardware for so long now, so it wouldn't

534
00:33:20,910 --> 00:33:23,630
Ejaaz:
really surprise me if, like, I don't know, a

535
00:33:23,630 --> 00:33:27,430
Ejaaz:
couple years from now, they're producing better models

536
00:33:27,430 --> 00:33:30,150
Ejaaz:
at specific things, basically because they have better

537
00:33:30,150 --> 00:33:33,210
Ejaaz:
hardware than America, than the West. But

538
00:33:33,210 --> 00:33:36,450
Ejaaz:
where I think the West will continue to dominate

539
00:33:36,450 --> 00:33:39,350
Ejaaz:
is at the application layer. And I don't

540
00:33:39,350 --> 00:33:42,310
Ejaaz:
know, if I was a betting man, I would say that most of the money is eventually going

541
00:33:42,310 --> 00:33:45,470
Ejaaz:
to be made on the application side of things. I think Grok

542
00:33:45,470 --> 00:33:48,270
Ejaaz:
4 is starting to kind of show that

543
00:33:48,270 --> 00:33:52,130
Ejaaz:
with all these different kinds of novel features that they're releasing. I

544
00:33:52,130 --> 00:33:55,030
Ejaaz:
don't know if you've seen some of the games that are being produced from Grok

545
00:33:55,030 --> 00:33:59,410
Ejaaz:
4, Josh, but it is ultimately insane. And I haven't seen any similar examples come

546
00:33:59,410 --> 00:34:03,410
Ejaaz:
out of Asia from any of their AI models, even when they have access to American

547
00:34:03,410 --> 00:34:06,230
Ejaaz:
models. So I still think America dominates at the app layer.

548
00:34:06,730 --> 00:34:11,210
Ejaaz:
But Josh, I just came across this tweet, which you reminded me of earlier.

549
00:34:11,450 --> 00:34:16,190
Ejaaz:
Tell me about OpenAI's strategy to open source a model, because I got this tweet

550
00:34:16,190 --> 00:34:18,970
Ejaaz:
pulled up from Sam Altman, which is kind of hilarious.

551
00:34:19,550 --> 00:34:23,310
Josh:
Yeah. All right. So this week, if you remember from our episode last week,

552
00:34:23,310 --> 00:34:27,070
Josh:
we were excited about talking about OpenAI's new open source model.

553
00:34:27,330 --> 00:34:30,310
Josh:
OpenAI, open source model, all checks out. This was going to be the big week.

554
00:34:30,430 --> 00:34:34,310
Josh:
They released their new flagship open source. Well, conveniently,

555
00:34:35,230 --> 00:34:39,250
Josh:
I think the same day as K2 launched, later in the day, or perhaps the very next morning.

556
00:34:39,730 --> 00:34:43,730
Josh:
Sam Altman posted a tweet. He says, hey, we planned to launch our open-weight model next week.

557
00:34:44,050 --> 00:34:48,150
Josh:
We are delaying it. We need time to run additional safety tests and review high-risk

558
00:34:48,150 --> 00:34:50,650
Josh:
areas. We are not yet sure how long it will take us.

559
00:34:50,910 --> 00:34:53,770
Josh:
While we trust the community will build great things with this model,

560
00:34:54,070 --> 00:34:57,910
Josh:
once weights are out, they can't be pulled back. This is new for us and we want to get it right.

561
00:34:58,110 --> 00:35:00,770
Josh:
Sorry to be the bearer of bad news. We are working super hard.

562
00:35:01,030 --> 00:35:05,110
Josh:
So there's a few points of speculation. The first, obviously,

563
00:35:05,310 --> 00:35:08,950
Josh:
being, did you just get your ass handed to you and now you are going back to

564
00:35:08,950 --> 00:35:11,510
Josh:
reevaluate before you push out a new model?

565
00:35:11,730 --> 00:35:14,670
Josh:
So that's one possible thing where they saw K2. They were like,

566
00:35:14,750 --> 00:35:16,270
Josh:
oh, boy, this is pretty sweet.

567
00:35:16,910 --> 00:35:21,410
Josh:
This is our first open source model. We probably don't want to be lower than them.

568
00:35:21,550 --> 00:35:24,690
Josh:
And there is this second point of speculation, which, Ejaaz, you mentioned to

569
00:35:24,690 --> 00:35:28,790
Josh:
me a little earlier today, where maybe something went wrong with the training run.

570
00:35:28,910 --> 00:35:32,230
Josh:
And it's not quite that they're getting beat up by a Chinese company;

571
00:35:32,230 --> 00:35:37,110
Josh:
it's that they actually made a mistake of their own accord. Can you explain

572
00:35:37,110 --> 00:35:40,670
Josh:
to me specifically what that might be, what the speculation is at least?

573
00:35:40,670 --> 00:35:43,850
Ejaaz:
Yeah, well, I'll keep it short. I think it was a little racist under

574
00:35:43,850 --> 00:35:46,870
Ejaaz:
the hood, and I can't find the tweet, but basically

575
00:35:46,870 --> 00:35:50,510
Ejaaz:
one of these AI researchers slash

576
00:35:50,510 --> 00:35:53,410
Ejaaz:
product builders on X got access to

577
00:35:53,410 --> 00:35:56,370
Ejaaz:
the model, supposedly, according to him, and he tested it

578
00:35:56,370 --> 00:35:59,310
Ejaaz:
out in the background, and he said, yeah, it's

579
00:35:59,310 --> 00:36:02,190
Ejaaz:
not really an intelligence thing, it's just worse than

580
00:36:02,190 --> 00:36:08,450
Ejaaz:
what you'd expect from an alignment and consumer-facing approach. It was

581
00:36:08,450 --> 00:36:12,570
Ejaaz:
ill-mannered, it was saying some pretty wild shit, kind of the stuff that

582
00:36:12,570 --> 00:36:17,770
Ejaaz:
you'd expect coming out of 4chan. And so Sam Altman decided to delay whilst

583
00:36:17,770 --> 00:36:21,350
Ejaaz:
they figured out why it was kind of acting out.

584
00:36:21,350 --> 00:36:24,510
Josh:
Got it, okay. So we'll leave

585
00:36:24,510 --> 00:36:27,370
Josh:
that speculation where it is. There's a funny post

586
00:36:27,370 --> 00:36:30,350
Josh:
that I'll actually share with you, if you want to throw it up, which was actually from Elon,

587
00:36:30,350 --> 00:36:33,810
Josh:
and we'll abbreviate, but it was like, Elon was basically saying

588
00:36:33,810 --> 00:36:37,310
Josh:
it's hard to avoid the libtard slash

589
00:36:37,310 --> 00:36:40,070
Josh:
MechaHitler-like approach, both of them,

590
00:36:40,070 --> 00:36:43,790
Josh:
because they're on such polar opposite ends of the spectrum. And he said he spent

591
00:36:43,790 --> 00:36:47,330
Josh:
several hours trying to solve this problem with the system prompt, but there's

592
00:36:47,330 --> 00:36:50,590
Josh:
too much garbage coming in at the foundation model level. So basically, I mean,

593
00:36:50,590 --> 00:36:53,370
Josh:
what happens with these models is you train them based on all the human knowledge

594
00:36:53,370 --> 00:36:57,230
Josh:
that exists, right? So everything that we've believed, all the ideas that we've

595
00:36:57,230 --> 00:36:59,050
Josh:
shared, it's been fed into these models.

596
00:36:59,090 --> 00:37:03,330
Josh:
And what happens is you can try to adjust how they interpret this data through

597
00:37:03,330 --> 00:37:06,930
Josh:
the system prompt, which is basically an instruction that every single query

598
00:37:06,930 --> 00:37:12,610
Josh:
gets passed through, but at some point it is reliant on this swath of human data that is just

599
00:37:13,340 --> 00:37:16,520
Josh:
too overbearing. And that's kind of what Elon shared.

600
00:37:16,660 --> 00:37:20,180
Josh:
And the difference between OpenAI and Grok is that Grok will just ship the crazy

601
00:37:20,180 --> 00:37:22,460
Josh:
update. And that's what they did. And they caught a lot of backlash from it.

602
00:37:22,660 --> 00:37:26,960
Josh:
But what I find interesting and what I'm sure OpenAI will probably follow is

603
00:37:26,960 --> 00:37:30,340
Josh:
this last paragraph where he says, our V7 foundation model should be much better.

604
00:37:30,460 --> 00:37:34,300
Josh:
And we're being far more selective about training data rather than just training on the entire internet.

605
00:37:34,480 --> 00:37:37,040
Josh:
So what they're planning to do to solve this problem, which is what I assume

606
00:37:37,040 --> 00:37:40,880
Josh:
OpenAI probably ran into in the case that the AI training model kind of went

607
00:37:40,880 --> 00:37:44,800
Josh:
off the rails and it started saying bad things about lots of people is that

608
00:37:44,800 --> 00:37:48,860
Josh:
you kind of have to rebuild the foundation model with new sets of data.

609
00:37:49,020 --> 00:37:52,380
Josh:
And in the case of Grok, I know one of the intentions for V7 is actually to

610
00:37:52,380 --> 00:37:57,500
Josh:
generate its own database of data based on synthetic data from their models.

611
00:37:57,880 --> 00:38:01,480
Josh:
And I'm assuming OpenAI will probably have to do this too if they want to calibrate.

612
00:38:01,640 --> 00:38:05,120
Josh:
A lot of times people call that the temperature, which is, like, the variance

613
00:38:05,120 --> 00:38:07,540
Josh:
of aggression that a model uses.

614
00:38:08,280 --> 00:38:11,200
Josh:
And I don't know, I think we're gonna start to see interesting approaches from

615
00:38:11,200 --> 00:38:15,320
Josh:
that because as they get smarter, you really don't want them to necessarily

616
00:38:15,320 --> 00:38:18,580
Josh:
have these evil traits as the default.

617
00:38:18,720 --> 00:38:23,220
Josh:
And it's very hard to get around that when you train them on the data that they've been trained on so far.

618
00:38:24,160 --> 00:38:29,860
Ejaaz:
It just goes to show how, I guess, cumbersome it is to train these models,

619
00:38:30,060 --> 00:38:31,560
Ejaaz:
Josh. It's such a hard thing.

620
00:38:31,780 --> 00:38:32,560
Josh:
Yeah. Yeah.

621
00:38:32,680 --> 00:38:36,660
Ejaaz:
It's not something that you can just kind of like jump into the code and tweak a few things.

622
00:38:37,000 --> 00:38:40,740
Ejaaz:
Most of the time you don't know what's wrong with the model or where it went

623
00:38:40,740 --> 00:38:43,940
Ejaaz:
wrong. I mean, we've talked about this on a previous episode, but

624
00:38:44,300 --> 00:38:48,580
Ejaaz:
So essentially, if you build out this model, right, you spend hundreds of millions

625
00:38:48,580 --> 00:38:51,660
Ejaaz:
of dollars, and then you feed it a query.

626
00:38:51,800 --> 00:38:54,660
Ejaaz:
So you put something in and then you wait to see what it spits out.

627
00:38:54,920 --> 00:38:58,260
Ejaaz:
You don't really know what it's going to spit out. You can't predict it.

628
00:38:58,360 --> 00:39:01,140
Ejaaz:
It's completely probabilistic. And so if you

629
00:39:01,140 --> 00:39:04,020
Ejaaz:
release a model and it starts being a little racist or,

630
00:39:04,020 --> 00:39:07,300
Ejaaz:
you know, kind of crazy, you

631
00:39:07,300 --> 00:39:10,100
Ejaaz:
have to kind of go back to the drawing board, and you have

632
00:39:10,100 --> 00:39:13,000
Ejaaz:
to analyze many different sectors of this model,

633
00:39:13,000 --> 00:39:16,440
Ejaaz:
like, was it the data that was poisoned, or was it the way that we trained it,

634
00:39:16,440 --> 00:39:21,180
Ejaaz:
or maybe it was a particular model weight that we tweaked too much, or whatever

635
00:39:21,180 --> 00:39:25,340
Ejaaz:
that might be. So I think over time it's going to get a lot easier once we

636
00:39:25,340 --> 00:39:29,060
Ejaaz:
understand how these models actually work, but my God, it must be so expensive

637
00:39:29,060 --> 00:39:31,780
Ejaaz:
to just continually rerun and retrain these models.

638
00:39:32,440 --> 00:39:35,200
Josh:
Yeah, when you think about a coherent cluster of

639
00:39:35,200 --> 00:39:37,860
Josh:
200,000 GPUs, the amount of energy, the amount

640
00:39:37,860 --> 00:39:42,920
Josh:
of resources just to retrain a mistake is huge. So I think, I mean, the more

641
00:39:42,920 --> 00:39:46,820
Josh:
we go into it, the deeper we get, the more it kind of makes sense paying so much

642
00:39:46,820 --> 00:39:50,360
Josh:
money for talent to avoid these mistakes, where if you pay a hundred million

643
00:39:50,360 --> 00:39:54,240
Josh:
dollars for one employee who will give you a strategic advantage to avoid having

644
00:39:54,240 --> 00:39:57,320
Josh:
to do another training run that will cost you more than $100 million,

645
00:39:57,760 --> 00:40:01,360
Josh:
you're already in the profit. So you kind of start to see the

646
00:40:01,360 --> 00:40:03,080
Josh:
scale, the complexity, the difficulties.

647
00:40:03,960 --> 00:40:07,460
Josh:
I do not envy the challenges that some of these engineers have to face.

648
00:40:07,720 --> 00:40:09,540
Josh:
Although I do envy the- I envy the salary.

649
00:40:09,800 --> 00:40:11,160
Ejaaz:
I envy the salary, Josh.

650
00:40:11,340 --> 00:40:14,360
Josh:
I envy the salary and I envy the adventure. Like how cool must that be trying

651
00:40:14,360 --> 00:40:18,520
Josh:
to build super intelligence for the world as a human for the first time in like

652
00:40:18,520 --> 00:40:20,320
Josh:
the history of everything.

653
00:40:20,480 --> 00:40:24,380
Josh:
So it's gotta be pretty fun. This is where we're at now with the open source

654
00:40:24,380 --> 00:40:28,780
Josh:
models, closed source models. K2's pretty epic. I think that's a home run. I think

655
00:40:28,780 --> 00:40:32,220
Josh:
we've crowned a new model today. Do you have any closing thoughts, anything

656
00:40:32,220 --> 00:40:35,800
Josh:
you want to add before we wrap up here? This is pretty amazing.

657
00:40:35,800 --> 00:40:40,280
Ejaaz:
I think I'm most excited for the episode that we're probably going to release

658
00:40:40,280 --> 00:40:44,860
Ejaaz:
a week from now, Josh, when we've seen what people have built with this open source

659
00:40:44,860 --> 00:40:48,940
Ejaaz:
model. That's the best part about this, by the way. Just to remind the listener that

660
00:40:49,550 --> 00:40:52,650
Ejaaz:
anyone can take this model right now. You, if you're listening to this, can take

661
00:40:52,650 --> 00:40:56,990
Ejaaz:
this model right now, run it locally at home, and tweak it to your preference.

662
00:40:56,990 --> 00:41:00,670
Ejaaz:
Now, yes, you kind of need to know how to tweak model

663
00:41:00,670 --> 00:41:03,470
Ejaaz:
weights and stuff, but I think we're going to see some really cool applications

664
00:41:03,470 --> 00:41:07,130
Ejaaz:
get released over the next week, and I'm excited to play around with them personally.

665
00:41:07,130 --> 00:41:09,890
Josh:
Yeah, if you're listening to this and you can

666
00:41:09,890 --> 00:41:12,950
Josh:
run this model, let us know, because that means you have quite a solid

667
00:41:12,950 --> 00:41:15,670
Josh:
rig at your home. Yeah, I'm not sure the average person is

668
00:41:15,670 --> 00:41:18,630
Josh:
going to be able to run this, but the beauty of open weights is that anybody

669
00:41:18,630 --> 00:41:21,670
Josh:
with the capability of running this can do so. They

670
00:41:21,670 --> 00:41:24,370
Josh:
can tweak it how they like, and now they have access to the new

671
00:41:24,370 --> 00:41:27,210
Josh:
best open source model in the world, which, I mean, just a

672
00:41:27,210 --> 00:41:29,950
Josh:
couple months ago would have been the best model in the

673
00:41:29,950 --> 00:41:33,550
Josh:
world. So it's moving really quickly, it's really accessible, and

674
00:41:33,550 --> 00:41:37,790
Josh:
I'm sure as the weeks go by, I mean, hopefully we'll get OpenAI's open

675
00:41:37,790 --> 00:41:41,450
Josh:
source model soon. In the next few weeks we'll be able to cover that. But until

676
00:41:41,450 --> 00:41:46,290
Josh:
then, just lots of stuff going on. This was another great episode, so thank

677
00:41:46,290 --> 00:41:50,670
Josh:
you, everyone, for tuning in again, for rocking with us. We actually planned on making this like 20 minutes,

678
00:41:50,810 --> 00:41:53,430
Josh:
but we just kind of kept tailing off into more interesting things.

679
00:41:53,610 --> 00:41:56,490
Josh:
There's a lot of interesting stuff to talk about. I mean, there's really,

680
00:41:56,650 --> 00:41:57,830
Josh:
you could take this in a lot of places.

681
00:41:58,330 --> 00:42:00,190
Josh:
So hopefully this was interesting.

682
00:42:00,870 --> 00:42:04,850
Josh:
Go check out Kimi K2. It's really, really impressive. It's really fast.

683
00:42:05,030 --> 00:42:07,410
Josh:
It's really cheap. If you're a developer, give it a try.

684
00:42:07,730 --> 00:42:11,150
Josh:
And yeah, that's been another episode. We'll be back again later this week with

685
00:42:11,150 --> 00:42:18,630
Josh:
another topic, and just keep on chugging along as the frontier of AI models continues to head west.

686
00:42:18,930 --> 00:42:23,610
Ejaaz:
So also we'd love to hear from you guys. So if you have any suggestions on things

687
00:42:23,610 --> 00:42:27,170
Ejaaz:
that you want us to talk more about, or maybe there's like some weird model

688
00:42:27,170 --> 00:42:32,410
Ejaaz:
or feature that you just don't understand and maybe we can do a good job of explaining it, just message us.

689
00:42:32,570 --> 00:42:37,250
Ejaaz:
Our DMs are open or respond to any of our tweets and we'll be happy to oblige.

690
00:42:37,650 --> 00:42:39,670
Josh:
Yeah, let us know. If there's anything cool that we're missing,

691
00:42:40,110 --> 00:42:41,730
Josh:
send it our way and we'll cover it. That'd be great.

692
00:42:42,350 --> 00:42:45,130
Josh:
But yeah, we're all going on the journey together. We're learning this as we go.

693
00:42:45,310 --> 00:42:47,790
Josh:
So hopefully today was interesting. And if you did enjoy it,

694
00:42:47,890 --> 00:42:50,350
Josh:
please share with friends, like, comment, subscribe, all the great things.

695
00:42:50,370 --> 00:42:52,270
Josh:
And we will see you on the next episode.

696
00:42:52,610 --> 00:42:53,810
Ejaaz:
Thanks for watching. See you guys. See you.