1
00:00:03,560 --> 00:00:06,720
Josh:
The unthinkable has just happened: OpenAI

2
00:00:06,720 --> 00:00:09,740
Josh:
has released an open source model. OpenAI

3
00:00:09,740 --> 00:00:12,500
Josh:
has been closed AI for as long as I've known them.

4
00:00:12,500 --> 00:00:15,300
Josh:
They have named themselves OpenAI, but they were not

5
00:00:15,300 --> 00:00:18,900
Josh:
open source. They have finally released an open source model and, surprise, surprise,

6
00:00:18,900 --> 00:00:22,900
Josh:
it's actually really great. And I think the downstream implications of an open

7
00:00:22,900 --> 00:00:26,200
Josh:
source model from a company like this that is this good are really, it's a really

8
00:00:26,200 --> 00:00:32,560
Josh:
big deal. I think this really matters a lot. Just yesterday they announced the release of GPT-OSS.

9
00:00:32,940 --> 00:00:36,660
Josh:
There are two models. There's a 120 billion parameter model and there's a 20

10
00:00:36,660 --> 00:00:38,820
Josh:
billion parameter model. We're going to get into benchmarks.

11
00:00:38,820 --> 00:00:40,100
Josh:
We're going to get into how good they are.

12
00:00:40,280 --> 00:00:44,460
Josh:
But the idea is that OpenAI has actually released an open source model.

13
00:00:44,660 --> 00:00:49,680
Josh:
And this can compare to the Chinese models because we recently had DeepSeek and we've had Kimi.

14
00:00:49,820 --> 00:00:54,060
Josh:
And those were very good. But this is the first really solid American-based open source model.

15
00:00:54,240 --> 00:00:57,420
Josh:
So Ejaaz, I know you've been kind of digging in the weeds about how this works.

16
00:00:57,500 --> 00:01:00,720
Josh:
Can you explain to us exactly why this is a big deal, why this happened, what's going on here?

17
00:01:00,720 --> 00:01:03,440
Ejaaz:
Yeah, it's pretty huge. So here

18
00:01:03,440 --> 00:01:06,160
Ejaaz:
are the hot highlights. As you mentioned, there are two

19
00:01:06,160 --> 00:01:09,700
Ejaaz:
models that came out: the 20 billion parameter model, which is actually small

20
00:01:09,700 --> 00:01:14,880
Ejaaz:
enough to run on your mobile phone right now, and they have a 120 billion parameter

21
00:01:14,880 --> 00:01:19,740
Ejaaz:
model, which is big but still small enough to run on a high-performance laptop.

22
00:01:19,740 --> 00:01:25,260
Ejaaz:
So if you guys have a MacBook out there, jump in, go for it. It's fully customizable.

23
00:01:25,680 --> 00:01:26,840
Ejaaz:
So remember, open source means

24
00:01:26,840 --> 00:01:30,640
Ejaaz:
that you can literally have access to the design of the entire model.

25
00:01:30,740 --> 00:01:34,760
Ejaaz:
It's like OpenAI giving away their secret recipe to how their frontier models

26
00:01:34,760 --> 00:01:37,660
Ejaaz:
work. And you can kind of like recreate it at home.

27
00:01:37,780 --> 00:01:41,160
Ejaaz:
This means that you can customize it to any kind of use case that you want,

28
00:01:41,260 --> 00:01:44,720
Ejaaz:
give it access to all your personal hard drives, tools, data,

29
00:01:44,720 --> 00:01:46,280
Ejaaz:
and it can do wonderful stuff.

30
00:01:46,420 --> 00:01:48,540
Ejaaz:
But Josh, here's the amazing part.

31
00:01:48,800 --> 00:01:54,360
Ejaaz:
On paper, these models are as good as the o4-mini models, which is,

32
00:01:54,480 --> 00:01:56,000
Ejaaz:
it's pretty impressive, right?

33
00:01:56,900 --> 00:02:00,280
Ejaaz:
In practice, and I've been playing around with it for the last few hours, they're

34
00:02:00,280 --> 00:02:03,340
Ejaaz:
as good, in my opinion, and actually quicker than

35
00:02:03,340 --> 00:02:06,400
Ejaaz:
o3, which is their frontier model. And I

36
00:02:06,400 --> 00:02:09,620
Ejaaz:
mean this across, like, everything. So

37
00:02:09,620 --> 00:02:14,240
Ejaaz:
reasoning: it spits out answers super quickly and I can see its reasoning.

38
00:02:14,240 --> 00:02:18,980
Ejaaz:
It happens in, like, a couple of seconds, and I'm so used to waiting, like, 30 seconds

39
00:02:18,980 --> 00:02:23,940
Ejaaz:
to a couple of minutes on o3, Josh. So it's pretty impressive and an insane unlock.

40
00:02:23,940 --> 00:02:27,920
Ejaaz:
On coding it's as good, and on creativity as well.

41
00:02:28,120 --> 00:02:32,160
Ejaaz:
So my mind's pretty blown at all of this, right? Josh, what do you think?

42
00:02:32,320 --> 00:02:35,780
Josh:
Yeah, so here's why it's impressive to me: a lot of the time I don't

43
00:02:35,780 --> 00:02:38,460
Josh:
really care to use the outer bounds of what a model is capable of.

44
00:02:38,580 --> 00:02:42,880
Josh:
Like I am not doing deep PhD level research. I'm not solving these math Olympiad questions.

45
00:02:43,080 --> 00:02:45,900
Josh:
I'm just trying to ask it a few normal questions and get some answers.

46
00:02:46,080 --> 00:02:49,040
Josh:
And these models do an excellent job of serving that need.

47
00:02:49,180 --> 00:02:51,640
Josh:
They're not going to go out and solve the world's hardest problems,

48
00:02:51,780 --> 00:02:54,440
Josh:
but neither am I. I don't want to solve those problems.

49
00:02:54,600 --> 00:02:57,980
Josh:
I just kind of want the information that I want, whether it be just a normal

50
00:02:57,980 --> 00:03:01,520
Josh:
Google type search or whether it be asking it some miscellaneous question about

51
00:03:01,520 --> 00:03:03,220
Josh:
some work that I'm doing.

52
00:03:03,900 --> 00:03:07,020
Josh:
It's really good at answering that. So I think initial impressions,

53
00:03:07,180 --> 00:03:10,100
Josh:
because they did allow you to test it publicly through their website,

54
00:03:10,160 --> 00:03:12,540
Josh:
it's just really good at the things that I want.

55
00:03:12,620 --> 00:03:15,920
Josh:
So the fact that I can run one of these models on a local device on my iPhone,

56
00:03:16,280 --> 00:03:19,440
Josh:
well, it feels like we're reaching this place where AI is starting to become

57
00:03:19,440 --> 00:03:23,460
Josh:
really interesting because for so long we've had compute handled fully on the

58
00:03:23,460 --> 00:03:25,600
Josh:
cloud, and now this is the first time where

59
00:03:26,190 --> 00:03:28,630
Josh:
compute can really happen on your computer. It could happen on your laptop.

60
00:03:28,810 --> 00:03:31,430
Josh:
I could download the model and I could actually store the model,

61
00:03:31,830 --> 00:03:35,450
Josh:
the 120 billion parameter model on a 56 gigabyte USB drive.

62
00:03:35,590 --> 00:03:40,290
Josh:
So you can take the collective knowledge of the world and put it on a tiny little USB drive.

63
00:03:40,410 --> 00:03:43,850
Josh:
And granted, it needs a bit of a bigger machine to actually run those parameters,

64
00:03:43,910 --> 00:03:46,330
Josh:
but you can install all the weights. It's 56 gigabytes.

65
00:03:46,570 --> 00:03:51,290
Josh:
It's this incredibly powerful package. And it probably, I don't know if this

66
00:03:51,290 --> 00:03:55,570
Josh:
is true, but it's probably the most condensed knowledge base in the history of humanity.

67
00:03:55,570 --> 00:03:59,030
Josh:
They've really managed to take a tremendous amount of tokens,

68
00:03:59,470 --> 00:04:02,390
Josh:
smush them into this little parameter set, and then publish it for people to

69
00:04:02,390 --> 00:04:04,230
Josh:
use. So for me, I'm really excited.

70
00:04:04,390 --> 00:04:07,650
Josh:
I like having my own mini portable models. I am excited to download,

71
00:04:08,010 --> 00:04:09,410
Josh:
try it out, run it on my MacBook.

72
00:04:09,670 --> 00:04:12,610
Josh:
I'm not sure I could run the 120 billion parameter model, but at least the 20B

73
00:04:12,610 --> 00:04:14,450
Josh:
and give it a shot and see how it works.

74
00:04:14,670 --> 00:04:19,270
Ejaaz:
You need to get the latest MacBook, Josh. I know, I got to upgrade. We can test that out.

75
00:04:19,750 --> 00:04:24,590
Ejaaz:
What I also love about it is it's fully private, right? So you can give it access

76
00:04:24,590 --> 00:04:31,230
Ejaaz:
to your personal hard drive, your Apple Notes, whatever you store on your computer, basically.

77
00:04:31,470 --> 00:04:36,690
Ejaaz:
And you can basically instruct the model to use those different tools.

78
00:04:36,870 --> 00:04:39,850
Ejaaz:
So one review that I keep seeing from a number of people who have been testing

79
00:04:39,850 --> 00:04:44,430
Ejaaz:
it so far is that it's incredibly great and intuitive at tool use.

80
00:04:44,490 --> 00:04:48,350
Ejaaz:
And the reason why this is such a big deal is a lot of the frontier models right

81
00:04:48,350 --> 00:04:52,590
Ejaaz:
now, when they allow you to give access to different tools, they're kind of clunky.

82
00:04:53,030 --> 00:04:56,410
Ejaaz:
The model doesn't actually know when to use a specific tool and when not to.

83
00:04:56,910 --> 00:05:00,730
Ejaaz:
But these models are super intuitive, which is great. The privacy thing is also

84
00:05:00,730 --> 00:05:02,290
Ejaaz:
a big thing because you kind of

85
00:05:02,290 --> 00:05:05,390
Ejaaz:
don't want to be giving all your personal information away to Sam Altman.

86
00:05:05,570 --> 00:05:07,910
Ejaaz:
But you want a highly personalized model.

87
00:05:08,310 --> 00:05:13,210
Ejaaz:
And I think if I was to condense this entire model release in a single sentence,

88
00:05:13,330 --> 00:05:20,070
Ejaaz:
Josh, I think I would say it is the epitome of privacy and personalization in an AI model so far.

89
00:05:20,230 --> 00:05:26,510
Ejaaz:
It is that good. It is swift, it is cheap, and I'm going to replace it completely

90
00:05:26,510 --> 00:05:29,730
Ejaaz:
with all my GPT-4o queries. As you said earlier, like,

91
00:05:30,370 --> 00:05:33,310
Ejaaz:
who needs to use the basic models anymore when you have access to this?

92
00:05:34,250 --> 00:05:38,010
Josh:
Yeah. So it's funny you say that you're going to swap it because I don't think I'm going to swap it.

93
00:05:38,130 --> 00:05:41,550
Josh:
I still am not sure I personally have a use case right now because I love the

94
00:05:41,550 --> 00:05:42,770
Josh:
context. I want the memory.

95
00:05:42,910 --> 00:05:46,650
Josh:
I like having it all server side where it kind of knows everything about me.

96
00:05:46,750 --> 00:05:50,870
Josh:
I guess in the case that I wanted to really make it a more intimate model experience

97
00:05:50,870 --> 00:05:55,410
Josh:
where you want to sync it up with like journal entries or your camera roll or

98
00:05:55,410 --> 00:05:59,250
Josh:
whatever interesting personal things, this would be a really cool use case.

99
00:05:59,250 --> 00:06:03,330
Josh:
I think for the people who are curious why this matters to them,

100
00:06:03,810 --> 00:06:07,410
Josh:
well, we could talk a little briefly about like the second order effects of

101
00:06:07,410 --> 00:06:10,630
Josh:
having open source models this powerful, because what that allows you to do is

102
00:06:10,630 --> 00:06:13,310
Josh:
to serve queries from a local machine.

103
00:06:13,350 --> 00:06:17,270
Josh:
So if you are using an app or let's say you're an app developer and you're building

104
00:06:17,270 --> 00:06:22,150
Josh:
an application and your app is serving millions of requests because it's a GPT wrapper.

105
00:06:22,150 --> 00:06:26,230
Josh:
Well, what you could do now is instead of paying for API calls to the OpenAI server,

106
00:06:26,450 --> 00:06:29,030
Josh:
you can actually just run your own local server, use this model,

107
00:06:29,070 --> 00:06:31,410
Josh:
and then serve all that data for the cost of the electricity.

108
00:06:31,610 --> 00:06:34,210
Josh:
And that's a really big unlock for the amount of compute that's going to be

109
00:06:34,210 --> 00:06:38,410
Josh:
available for not only developers, but for the cost of the users in a lot of these applications.

110
00:06:38,690 --> 00:06:42,630
Josh:
So for the applications that aren't doing this crazy moon math and that are

111
00:06:42,630 --> 00:06:47,250
Josh:
just kind of serving basic queries all day long, this like really significantly drops the cost.

112
00:06:47,490 --> 00:06:51,190
Josh:
It increases the privacy, like you mentioned. And there's a ton of really important

113
00:06:51,190 --> 00:06:54,450
Josh:
upsides to open source models that we just haven't seen up until now.

114
00:06:54,590 --> 00:06:57,470
Josh:
And I'm very excited to see them come forward.

115
00:06:58,580 --> 00:07:01,640
Ejaaz:
Well, Josh, the thing with most of these open source models,

116
00:07:01,720 --> 00:07:05,660
Ejaaz:
we spoke about actually two major Chinese open source models that were released last week.

117
00:07:06,040 --> 00:07:10,220
Ejaaz:
It's not accessible to everyone. Like you and me aren't necessarily going to

118
00:07:10,220 --> 00:07:14,140
Ejaaz:
go to Hugging Face, a completely separate website, download these models,

119
00:07:14,380 --> 00:07:15,640
Ejaaz:
run the command line interface.

120
00:07:16,180 --> 00:07:18,500
Ejaaz:
Most of the listeners on the show don't even know what that means.

121
00:07:18,660 --> 00:07:20,560
Ejaaz:
I don't even know if I know what that means, right?

122
00:07:20,800 --> 00:07:25,000
Ejaaz:
But here you have a lovely website where you could just kind of log

123
00:07:25,000 --> 00:07:28,720
Ejaaz:
on and play around with these open source models. And that's exactly what I've been doing.

124
00:07:28,860 --> 00:07:33,860
Ejaaz:
I actually have a few kind of demo queries that I ran yesterday, Josh.

125
00:07:34,120 --> 00:07:34,580
Josh:
Yeah, walk us through, let's see.

126
00:07:35,040 --> 00:07:41,600
Ejaaz:
Okay, so there's an incredibly complex test, which a lot of these AI models,

127
00:07:41,660 --> 00:07:45,500
Ejaaz:
which cost hundreds of billions of dollars to train, can't quite answer.

128
00:07:45,660 --> 00:07:52,080
Ejaaz:
And that is how many R's, the letter R's are there in the word strawberry? Most say two.

129
00:07:52,180 --> 00:07:52,980
Josh:
The bar's on the floor, huh?

130
00:07:53,080 --> 00:07:58,520
Ejaaz:
Yeah, if we were to go with most models, they say two. They're convinced that there are only two.

131
00:07:58,860 --> 00:08:03,940
Ejaaz:
And I ran that test today, rather yesterday, with these open source models,

132
00:08:03,960 --> 00:08:07,560
Ejaaz:
and it correctly guessed three, Josh. So we're one for one right now.

133
00:08:07,700 --> 00:08:08,060
Josh:
We're on our way.

134
00:08:08,180 --> 00:08:12,160
Ejaaz:
But then I was like, okay, we live in New York City. I love this place.

135
00:08:12,580 --> 00:08:16,120
Ejaaz:
I'm feeling a little poetic today. Can you write me a sonnet?

136
00:08:16,440 --> 00:08:19,460
Ejaaz:
And my goal with this wasn't to test whether it could just write a poem.

137
00:08:19,620 --> 00:08:22,220
Ejaaz:
It was to test how quickly it could figure it out.

138
00:08:22,520 --> 00:08:26,200
Ejaaz:
And as you see, it thought for a couple of seconds on this, so it literally spat

139
00:08:26,200 --> 00:08:30,180
Ejaaz:
this out in two seconds. And it was structured really well. You know, it kind

140
00:08:30,180 --> 00:08:34,280
Ejaaz:
of flowed. Would I be, you know, reciting this out loud to the public? No. But, you

141
00:08:34,280 --> 00:08:35,460
Ejaaz:
know, I was pretty impressed.

142
00:08:35,950 --> 00:08:40,530
Ejaaz:
And then, Josh, I was thinking, you know, what's so unique about open source models?

143
00:08:40,650 --> 00:08:43,950
Ejaaz:
You just went through a really good list of why open source models work.

144
00:08:44,050 --> 00:08:48,730
Ejaaz:
But I was curious as to why these specific open source models were better than

145
00:08:48,730 --> 00:08:51,290
Ejaaz:
other open source models or maybe even other centralized models.

146
00:08:51,510 --> 00:08:54,630
Ejaaz:
So I wrote a query. I decided to ask it. I was like, you know,

147
00:08:54,710 --> 00:08:57,070
Ejaaz:
tell me some things that you could do that the larger centralized models can't.

148
00:08:57,210 --> 00:09:00,250
Ejaaz:
And it spat out a really good list. I'm not going to go through all of them,

149
00:09:00,330 --> 00:09:03,110
Ejaaz:
but, you know, some of the things that we've highlighted so far, you can fine tune it.

150
00:09:03,330 --> 00:09:06,270
Ejaaz:
It's private. I really like this point that it made, Josh,

151
00:09:06,370 --> 00:09:08,810
Ejaaz:
that it just shows that AI is probably getting smarter than us,

152
00:09:08,970 --> 00:09:13,610
Ejaaz:
which is you can custom inject your own data into these models.

153
00:09:13,830 --> 00:09:18,710
Ejaaz:
Now, without kind of digging deeper into this, when you use a centralized model,

154
00:09:18,910 --> 00:09:23,970
Ejaaz:
it's already pre-trained on a bunch of data that companies like Anthropic and

155
00:09:23,970 --> 00:09:25,450
Ejaaz:
Google have already fed it.

156
00:09:25,550 --> 00:09:29,150
Ejaaz:
And so it's kind of formed its own personality, right?

157
00:09:29,270 --> 00:09:32,610
Ejaaz:
So you can't change the model's personality on a centralized model.

158
00:09:32,610 --> 00:09:37,190
Ejaaz:
But with an open model, you have free rein to do whatever you want. And so if

159
00:09:37,190 --> 00:09:41,590
Ejaaz:
you were feeling kind of adventurous, you could use your own data and make

160
00:09:41,590 --> 00:09:44,590
Ejaaz:
it super personal and customizable. So I thought that was a really cool and fun

161
00:09:44,590 --> 00:09:46,450
Ejaaz:
demo. Josh, have you been playing around with this?

162
00:09:46,450 --> 00:09:50,570
Josh:
Yeah, it's smart, it's fun. I wouldn't say it's anything

163
00:09:50,570 --> 00:09:54,810
Josh:
novel. The query results that I get are, you know, on par with everything

164
00:09:54,810 --> 00:09:57,970
Josh:
else. I don't notice the difference, which is good because it means they're performing

165
00:09:57,970 --> 00:10:01,050
Josh:
very well. It's not like I feel like I'm getting degraded performance because

166
00:10:01,050 --> 00:10:02,070
Josh:
I'm using a smaller model.

167
00:10:02,650 --> 00:10:05,830
Josh:
But it's just like it's nothing too different, I would say.

168
00:10:06,290 --> 00:10:09,130
Josh:
The differences, I mean, again, all this boils down to the differences of it

169
00:10:09,130 --> 00:10:10,850
Josh:
being open source versus being

170
00:10:10,850 --> 00:10:12,830
Ejaaz:
run on the server. Well, let me challenge you on that, right? OK,

171
00:10:13,070 --> 00:10:16,390
Ejaaz:
so you're saying it's good but nothing novel.

172
00:10:16,670 --> 00:10:20,450
Ejaaz:
Would you say it's as good as GPT-4o,

173
00:10:20,970 --> 00:10:24,990
Ejaaz:
minus the memory? Let's just put memory aside for a second. Would you use it if

174
00:10:24,990 --> 00:10:26,910
Ejaaz:
it had memory capability?

175
00:10:26,910 --> 00:10:29,770
Josh:
Actually, no, probably not. I still wouldn't,

176
00:10:29,770 --> 00:10:32,450
Josh:
because I love my desktop application too much. I

177
00:10:32,450 --> 00:10:35,550
Josh:
love my mobile app too much, and I like that the conversations are

178
00:10:35,550 --> 00:10:38,310
Josh:
shared in the cloud, so I can use them on my phone, I could

179
00:10:38,310 --> 00:10:41,050
Josh:
start on my laptop and go back and forth. So even in

180
00:10:41,050 --> 00:10:44,370
Josh:
that case, I'm probably still not a user, because of

181
00:10:44,370 --> 00:10:47,090
Josh:
the convenience factor. But there are a

182
00:10:47,090 --> 00:10:50,050
Josh:
lot of people and a lot of industries that would be. And this is actually something probably

183
00:10:50,050 --> 00:10:52,750
Josh:
worth surfacing: the new industries that are now able to

184
00:10:52,750 --> 00:10:55,690
Josh:
benefit from this, because a lot of industries have

185
00:10:55,690 --> 00:10:58,510
Josh:
a tough time using these AI models because

186
00:10:58,510 --> 00:11:01,170
Josh:
of the data privacy concerns. Particularly, I mean, if you think about the

187
00:11:01,170 --> 00:11:04,630
Josh:
healthcare industry, people who are dealing with patients' data, it's

188
00:11:04,630 --> 00:11:07,190
Josh:
very challenging for them to fork it over to OpenAI and just trust that they're

189
00:11:07,190 --> 00:11:11,030
Josh:
going to keep it safe. So what this does is it actually allows companies that

190
00:11:11,030 --> 00:11:13,450
Josh:
are in, like, the healthcare industry, the finance industry, which is dealing with

191
00:11:13,450 --> 00:11:16,650
Josh:
very high-touch personal finance, the legal industry, which is dealing with

192
00:11:16,650 --> 00:11:20,030
Josh:
a lot of legality, government and defense, a lot of these industries that were

193
00:11:20,030 --> 00:11:23,870
Josh:
not previously able to use these popular AI models,

194
00:11:24,010 --> 00:11:27,350
Josh:
well, now they have a pretty good model that they could run locally on their machines.

195
00:11:27,390 --> 00:11:31,030
Josh:
And that doesn't have any possibility of actually leaking out their customer

196
00:11:31,030 --> 00:11:35,490
Josh:
data, leaking out financials or healthcare data or, like, any sort of legal documents.

197
00:11:35,610 --> 00:11:38,630
Josh:
And that feels like a super powerful unlock. So for them,

198
00:11:38,750 --> 00:11:43,250
Josh:
it feels like a no-brainer: obviously, get the 120B model running on a local

199
00:11:43,250 --> 00:11:46,650
Josh:
machine inside of your office, and you can load it up with all this context.

200
00:11:46,930 --> 00:11:50,630
Josh:
And that seems to be who this would be most impacting, right?

201
00:11:51,270 --> 00:11:56,590
Ejaaz:
But still to that point, I wonder how many of these companies can be bothered

202
00:11:56,590 --> 00:12:00,750
Ejaaz:
to do that themselves and run their own internal kind of like infrastructure.

203
00:12:01,310 --> 00:12:06,630
Ejaaz:
I'm thinking about OpenAI, who cracked, I think, $10 billion in annual recurring

204
00:12:06,630 --> 00:12:09,570
Ejaaz:
revenue this week, which is like a major milestone.

205
00:12:09,810 --> 00:12:14,390
Ejaaz:
And a good chunk of that, I think 33% of that is for enterprise customers.

206
00:12:14,850 --> 00:12:18,090
Ejaaz:
And to your point, these enterprise customers don't want to be giving OpenAI

207
00:12:18,090 --> 00:12:21,350
Ejaaz:
their entire data. You know, it can be used to train other AI models.

208
00:12:21,610 --> 00:12:27,790
Ejaaz:
So their fix or solution right now is they use kind of like private cloud instances,

209
00:12:28,390 --> 00:12:32,250
Ejaaz:
that I think are supplied by Microsoft through their Azure cloud service or something like that.

210
00:12:32,670 --> 00:12:34,830
Ejaaz:
And I wonder if they chose that,

211
00:12:35,470 --> 00:12:40,390
Ejaaz:
one, because there weren't any open source models available, or two, because they kind

212
00:12:40,390 --> 00:12:42,650
Ejaaz:
of just want to offload that to Microsoft to deal with.

213
00:12:42,830 --> 00:12:45,910
Ejaaz:
My gut tells me they're going to want to go with the latter,

214
00:12:46,070 --> 00:12:49,870
Ejaaz:
which is like, you know, just give it to some kind of cloud provider to deal with themselves.

215
00:12:50,030 --> 00:12:52,570
Ejaaz:
And they just trust Microsoft because it's a big brand name.

216
00:12:52,810 --> 00:12:55,670
Ejaaz:
But yeah, I don't really know how that'll materialize. I still think,

217
00:12:55,790 --> 00:13:00,430
Ejaaz:
and maybe this is because of my experience in crypto, Josh, that the open source

218
00:13:00,430 --> 00:13:04,150
Ejaaz:
models are still for like people that are at the fringe that are really experimenting

219
00:13:04,150 --> 00:13:06,650
Ejaaz:
with these things, but maybe don't have billions of dollars.

220
00:13:07,070 --> 00:13:10,650
Josh:
Yeah, that could be right. It'll be interesting to see how it plays out on all

221
00:13:10,650 --> 00:13:15,170
Josh:
scales of businesses because, I mean, I think of a lot of indie devs

222
00:13:15,170 --> 00:13:17,170
Josh:
that I follow on Twitter and I see them all the time

223
00:13:17,590 --> 00:13:20,750
Josh:
just running local servers. And if they had this local model that

224
00:13:20,750 --> 00:13:24,530
Josh:
they could run on their machine and it takes the cost per query down from like

225
00:13:24,530 --> 00:13:28,230
Josh:
a penny to zero, that's like a big zero to one change.

226
00:13:28,510 --> 00:13:32,410
Josh:
So Ejaaz, this model's special because there are also a number of breakthroughs

227
00:13:32,410 --> 00:13:34,030
Josh:
that occurred in order to make this possible,

228
00:13:34,030 --> 00:13:37,190
Josh:
in order to condense this knowledge to be so tight. So here's this

229
00:13:37,190 --> 00:13:40,330
Josh:
tweet from the professor talking about the cool tech tweaks in

230
00:13:40,330 --> 00:13:43,370
Josh:
this new model and what OpenAI was able to achieve. Some of

231
00:13:43,370 --> 00:13:46,530
Josh:
these, I believe, are novel; some of these we've seen before. If

232
00:13:46,530 --> 00:13:49,370
Josh:
you look at point two, mixture of experts: we're familiar with mixture of experts.

233
00:13:49,370 --> 00:13:52,110
Josh:
We've seen other companies use that, like Kimi and DeepSeek.

234
00:13:52,110 --> 00:13:54,830
Josh:
Basically, instead of one brain doing everything, the AI

235
00:13:54,830 --> 00:13:57,930
Josh:
has this team of experts that are kind of like mini brains

236
00:13:57,930 --> 00:14:00,710
Josh:
that specialize in different tasks. It picks the right expert for

237
00:14:00,710 --> 00:14:03,450
Josh:
the job, and that makes it faster. So, like, instead of

238
00:14:03,450 --> 00:14:08,210
Josh:
having the entire 120 billion parameter model search for one question, maybe

239
00:14:08,210 --> 00:14:11,470
Josh:
you just take a few billion of those parameters that are really good at solving

240
00:14:11,470 --> 00:14:15,570
Josh:
math problems and use them, and that's what brings compute down. The first

241
00:14:15,570 --> 00:14:18,890
Josh:
point is this thing called sliding window attention. So if you imagine an

242
00:14:18,890 --> 00:14:20,750
Josh:
AI is, like, reading a really long book,

243
00:14:21,380 --> 00:14:24,140
Josh:
it can only focus on a few pages at a time. This trick

244
00:14:24,140 --> 00:14:27,400
Josh:
kind of lets it slide its focus window along the text. So

245
00:14:27,400 --> 00:14:30,240
Josh:
when you think of a context window, generally it's fixed, right, where you can see

246
00:14:30,240 --> 00:14:33,520
Josh:
a fixed set of data. This sliding window

247
00:14:33,520 --> 00:14:36,340
Josh:
attention allows you to kind of move that context back and forth a

248
00:14:36,340 --> 00:14:39,160
Josh:
little bit. So it takes what would have normally been

249
00:14:39,160 --> 00:14:42,020
Josh:
a narrow context window and extends it out a little bit to

250
00:14:42,020 --> 00:14:44,760
Josh:
the side, so you get a little bit more context, which is great for a

251
00:14:44,760 --> 00:14:47,440
Josh:
smaller model. Again, you really want to consider that all of these

252
00:14:47,440 --> 00:14:50,140
Josh:
are optimized for this microscopic scale that

253
00:14:50,140 --> 00:14:52,900
Josh:
can literally run on your phone. And then the third point is this

254
00:14:52,900 --> 00:14:56,240
Josh:
thing called RoPE with YaRN, which sounds like a cat toy, but this

255
00:14:56,240 --> 00:14:59,500
Josh:
is how the AI keeps track of the order of words, so, like, the position

256
00:14:59,500 --> 00:15:02,440
Josh:
of the words in a sentence. So RoPE,

257
00:15:02,440 --> 00:15:05,060
Josh:
you could imagine it like the twisty math way to do

258
00:15:05,060 --> 00:15:07,860
Josh:
it, and YaRN makes it stretch further for really long stuff.

259
00:15:07,860 --> 00:15:10,600
Josh:
So we have the context window that is

260
00:15:10,600 --> 00:15:13,480
Josh:
sliding, we have this RoPE with YaRN that allows you

261
00:15:13,480 --> 00:15:16,980
Josh:
to just kind of, like, stretch the words a little bit further, and

262
00:15:16,980 --> 00:15:19,780
Josh:
then we have attention sinks, which is the last one, which is:

263
00:15:19,780 --> 00:15:22,880
Josh:
there's a problem when AI is dealing with these endless chats, and this

264
00:15:22,880 --> 00:15:25,820
Josh:
lets it kind of sink, or ignore, the boring old

265
00:15:25,820 --> 00:15:28,520
Josh:
info so it can pay attention to the new stuff. So basically, what it

266
00:15:28,520 --> 00:15:32,500
Josh:
is, is if you're having a long chat with it and it determines, hey, this stuff

267
00:15:32,500 --> 00:15:35,220
Josh:
is kind of boring, I don't need to remember it, it'll actually just throw it away

268
00:15:35,220 --> 00:15:39,060
Josh:
and it'll increase that context window a little bit. So again, hyper-optimizing

269
00:15:39,060 --> 00:15:42,840
Josh:
for the small context window that it has. And those are kind of the four key

270
00:15:42,840 --> 00:15:47,120
Josh:
breakthroughs that made this special. Again, I'm not sure any of them are particularly novel,

271
00:15:47,320 --> 00:15:52,040
Josh:
but when combined together, that's what allows you to get these o4-mini results

272
00:15:52,040 --> 00:15:56,880
Josh:
or even o3 results on the larger model on something that can run locally on your laptop.

273
00:15:56,960 --> 00:16:01,260
Josh:
So it's a pretty interesting set of breakthroughs. I think a lot of times OpenAI,

274
00:16:01,400 --> 00:16:04,300
Josh:
we talk about them because of their feature breakthroughs, not really their

275
00:16:04,300 --> 00:16:05,040
Josh:
technical breakthroughs.

276
00:16:05,120 --> 00:16:08,000
Josh:
I think a lot of times the technical breakthroughs are reserved for like the

277
00:16:08,000 --> 00:16:09,940
Josh:
Kimi models or the DeepSeek models

278
00:16:09,940 --> 00:16:12,540
Josh:
where they really kind of break open the barrier of what's possible.

279
00:16:12,760 --> 00:16:16,500
Josh:
But I don't want to discredit OpenAI because these are pretty interesting things

280
00:16:16,500 --> 00:16:19,700
Josh:
that they've managed to combine together into this like one cohesive,

281
00:16:19,700 --> 00:16:21,920
Josh:
tiny little model, and then just gave it away.

282
00:16:22,800 --> 00:16:29,200
Ejaaz:
Yeah. I mean, they actually have a history of front-running open source frontier breakthroughs.

283
00:16:29,280 --> 00:16:34,100
Ejaaz:
If you remember when DeepSeek got deployed, Josh, one of their primary training

284
00:16:34,100 --> 00:16:38,340
Ejaaz:
methods was reinforcement learning, which was pioneered by an OpenAI researcher,

285
00:16:38,560 --> 00:16:40,500
Ejaaz:
who probably now works at Meta.

286
00:16:40,900 --> 00:16:46,060
Ejaaz:
Yeah, and I was looking at the feature that you mentioned, well, not

287
00:16:46,060 --> 00:16:48,840
Ejaaz:
the feature, but the breakthrough, sliding window attention, and you mentioned

288
00:16:48,840 --> 00:16:51,080
Ejaaz:
that it can basically toggle reasoning.

289
00:16:51,500 --> 00:16:54,880
Ejaaz:
And I was pleasantly surprised to just notice that on the actual interface of

290
00:16:54,880 --> 00:16:57,280
Ejaaz:
the models here, Josh, can you see over here?

291
00:16:57,340 --> 00:17:01,040
Ejaaz:
You can toggle between reasoning levels of high, medium and low.

292
00:17:01,060 --> 00:17:04,800
Ejaaz:
So depending on what your prompt or query is, if it is kind of like a low level

293
00:17:04,800 --> 00:17:09,100
Ejaaz:
query where you're like, hey, just record this shopping or grocery list, you know,

294
00:17:09,100 --> 00:17:12,420
Ejaaz:
that's probably a medium or a low query. So it's pretty cool to see

295
00:17:12,420 --> 00:17:15,180
Ejaaz:
that surfaced to the user, to see it actively being used.

296
00:17:15,700 --> 00:17:18,800
Josh:
Yeah, no, super cool. I think I like the fine tuning of it.

297
00:17:19,080 --> 00:17:22,000
Josh:
And again, allowing you to kind of choose your intelligence levels,

298
00:17:22,020 --> 00:17:25,680
Josh:
because I imagine a lot of average people just don't, a lot of average queries

299
00:17:25,680 --> 00:17:27,240
Josh:
just don't need that much compute.

300
00:17:27,580 --> 00:17:30,640
Josh:
So if you can toggle it for the low reasoning level and get your answers,

301
00:17:30,780 --> 00:17:32,780
Josh:
that's amazing. Super fast, super cheap.

302
00:17:32,980 --> 00:17:37,080
Ejaaz:
Did you see that trending tweet earlier this week, Josh, which basically said

303
00:17:37,080 --> 00:17:42,180
Ejaaz:
that the majority of ChatGPT users have never used a different model than GPT-4o?

304
00:17:42,400 --> 00:17:43,960
Josh:
I haven't seen it, but that makes sense.

305
00:17:43,960 --> 00:17:46,880
Ejaaz:
Yeah, I feel like that's the bulk of people. I was chatting to

306
00:17:46,880 --> 00:17:49,540
Ejaaz:
my sister yesterday, and she was kind of

307
00:17:49,540 --> 00:17:52,660
Ejaaz:
using it for some research project at work, and the

308
00:17:52,660 --> 00:17:55,540
Ejaaz:
screenshot she sent me was 4o, and I was like, hey, you know,

309
00:17:55,540 --> 00:17:58,500
Ejaaz:
you could just run this on a model that's like

310
00:17:58,500 --> 00:18:01,340
Ejaaz:
five times better than this, right? It will come

311
00:18:01,340 --> 00:18:04,020
Ejaaz:
up with a much more creative set of ideas. So it just made me think that,

312
00:18:04,020 --> 00:18:06,800
Ejaaz:
like, I don't know how many people care that there are

313
00:18:06,800 --> 00:18:09,560
Ejaaz:
these brand new novel models, and maybe, you know,

314
00:18:09,560 --> 00:18:12,300
Ejaaz:
this kind of basic model is good enough for everyone. I don't know.

315
00:18:12,300 --> 00:18:15,120
Ejaaz:
But moving on, Josh, there

316
00:18:15,120 --> 00:18:18,260
Ejaaz:
was a big question that popped into my head as

317
00:18:18,260 --> 00:18:21,180
Ejaaz:
soon as these models were released, which was: are they as good

318
00:18:21,180 --> 00:18:24,000
Ejaaz:
as the Chinese open source models, right? I wanted

319
00:18:24,000 --> 00:18:26,760
Ejaaz:
to get some opinions from people. And the reason

320
00:18:26,760 --> 00:18:29,780
Ejaaz:
why this matters, just to give the listeners some context,

321
00:18:29,780 --> 00:18:32,940
Ejaaz:
is that China has been the number one

322
00:18:32,940 --> 00:18:35,860
Ejaaz:
nation to put out the best open source

323
00:18:35,860 --> 00:18:38,980
Ejaaz:
models over the last 12 months. It started with DeepSeek,

324
00:18:38,980 --> 00:18:42,140
Ejaaz:
and then Alibaba's Qwen models got involved,

325
00:18:42,140 --> 00:18:44,860
Ejaaz:
and then recently we had Kimi K2, and I think

326
00:18:44,860 --> 00:18:47,900
Ejaaz:
there was another AI lab out of China which came out. So they

327
00:18:47,900 --> 00:18:51,060
Ejaaz:
have, outside of America, the highest density

328
00:18:51,060 --> 00:18:54,040
Ejaaz:
of the top AI researchers. They all come out of this one university,

329
00:18:54,040 --> 00:18:57,260
Ejaaz:
Tsinghua, I believe. They kind of partially work

330
00:18:57,260 --> 00:19:00,180
Ejaaz:
or train in the U.S. as well, so they've got this kind of hybrid AI

331
00:19:00,180 --> 00:19:03,540
Ejaaz:
mentality of how to build these models, and they come up with a lot of these

332
00:19:03,540 --> 00:19:09,760
Ejaaz:
frontier breakthroughs. Kimi K2, for context, had one trillion parameters

333
00:19:09,760 --> 00:19:14,320
Ejaaz:
in their model, right? Comparing this to the 120 billion and 20 billion parameter

334
00:19:14,320 --> 00:19:20,120
Ejaaz:
models from OpenAI, I was curious: does this beat them to the punch? Some people, Josh,

335
00:19:20,580 --> 00:19:23,900
Ejaaz:
don't think so. Okay, this guy, Jason Lee,

336
00:19:23,900 --> 00:19:27,360
Ejaaz:
he asks, is GPT-OSS stronger

337
00:19:27,360 --> 00:19:30,800
Ejaaz:
than Qwen or Kimi or Chinese open models? And then

338
00:19:30,800 --> 00:19:35,320
Ejaaz:
he later quote-tweets that tweet and says, answer: the model is complete

339
00:19:35,320 --> 00:19:40,080
Ejaaz:
junk. It's a hallucination machine, overfit to reasoning benchmarks, and has absolutely

340
00:19:40,080 --> 00:19:45,340
Ejaaz:
zero recall ability. So a few things he's mentioning here: one, it hallucinates

341
00:19:45,340 --> 00:19:47,840
Ejaaz:
a lot, so it kind of makes up jargon terms,

342
00:19:48,280 --> 00:19:51,140
Ejaaz:
ideas, or parameters that didn't really exist before.

343
00:19:51,500 --> 00:19:55,800
Ejaaz:
Number two, he's saying that OpenAI designed this model purely so that it will

344
00:19:55,800 --> 00:20:01,700
Ejaaz:
do well on the exams, which are the benchmarks that rate how these models compare to each other.

345
00:20:01,840 --> 00:20:06,300
Ejaaz:
So they're saying that OpenAI optimized the model to kind of like do really

346
00:20:06,300 --> 00:20:10,660
Ejaaz:
well at those tests, but actually fail at everything else, which is what people want to use it for.

347
00:20:10,940 --> 00:20:13,680
Ejaaz:
And the final point that he makes is that it has zero recall ability,

348
00:20:13,800 --> 00:20:16,880
Ejaaz:
which is something you mentioned earlier, Josh, which says it doesn't have memory

349
00:20:16,880 --> 00:20:20,860
Ejaaz:
or context, so you can have a conversation and then open up another conversation,

350
00:20:20,860 --> 00:20:23,600
Ejaaz:
and it's completely forgotten the context that it had for you from that

351
00:20:23,600 --> 00:20:25,220
Ejaaz:
initial conversation.

352
00:20:25,220 --> 00:20:30,020
Josh:
So not the best. Not to be unfair to OpenAI, but it feels like they delayed

353
00:20:30,020 --> 00:20:34,360
Josh:
this model a good number of times, and they wanted it to look good. And it

354
00:20:34,360 --> 00:20:38,160
Josh:
intuitively makes sense to me that they would be kind of optimizing for benchmarks

355
00:20:38,160 --> 00:20:41,940
Josh:
with this one. But nonetheless, it's still impressive. I'm seeing this big wall

356
00:20:41,940 --> 00:20:44,020
Josh:
of text now. What is this post here?

357
00:20:44,020 --> 00:20:48,540
Ejaaz:
Well, it's this post from one of these accounts I follow, and they have an

358
00:20:48,540 --> 00:20:52,280
Ejaaz:
interesting section here which says, comparison to other open weights.

359
00:20:52,280 --> 00:20:53,020
Josh:
Yeah, what is this?

360
00:20:53,020 --> 00:20:56,400
Ejaaz:
So he goes, while the larger GPT-OSS

361
00:20:56,400 --> 00:20:59,420
Ejaaz:
120 billion parameter model does not come

362
00:20:59,420 --> 00:21:02,280
Ejaaz:
in above DeepSeek R1, so he's saying that DeepSeek R1

363
00:21:02,280 --> 00:21:05,120
Ejaaz:
just beats it out of the park, it is notable that

364
00:21:05,120 --> 00:21:08,380
Ejaaz:
it is significantly smaller in both total and active

365
00:21:08,380 --> 00:21:11,240
Ejaaz:
parameters than both of those models. DeepSeek

366
00:21:11,240 --> 00:21:14,660
Ejaaz:
R1 has 671 billion total parameters and

367
00:21:14,660 --> 00:21:20,260
Ejaaz:
37 billion active parameters and is released natively, right, which makes it 10x

368
00:21:20,260 --> 00:21:24,820
Ejaaz:
larger than the GPT-OSS 120 billion parameter model. But what he's saying is, even

369
00:21:24,820 --> 00:21:29,460
Ejaaz:
though the GPT-OSS model is smaller and doesn't perform as well as DeepSeek, it's

370
00:21:29,460 --> 00:21:31,760
Ejaaz:
still mightily impressive for its size.

371
00:21:32,500 --> 00:21:35,820
Josh:
Okay, that's cool, because that gets back to the point we made earlier in the

372
00:21:35,820 --> 00:21:39,460
Josh:
show that this is probably the most densely condensed

373
00:21:41,070 --> 00:21:43,770
Josh:
however you want to say it, base of

374
00:21:43,770 --> 00:21:46,790
Josh:
knowledge in the world. They've used a lot of efficiency gains

375
00:21:46,790 --> 00:21:49,810
Josh:
to squeeze the most out of it. So in this small model,

376
00:21:49,810 --> 00:21:52,770
Josh:
I guess, if we're optimizing, maybe we

377
00:21:52,770 --> 00:21:56,190
Josh:
can make up a metric here on the show, which is like output per

378
00:21:56,190 --> 00:21:59,250
Josh:
parameter or something like that. Based on the total parameter

379
00:21:59,250 --> 00:22:03,210
Josh:
count of this model, it gives you the best value per

380
00:22:03,210 --> 00:22:06,130
Josh:
token. And that seems to be where this falls

381
00:22:06,130 --> 00:22:08,750
Josh:
in line, where it's not going to blow any other open source model out of the

382
00:22:08,750 --> 00:22:11,510
Josh:
water. But in terms of its size, the fact that we can

383
00:22:11,510 --> 00:22:14,750
Josh:
take a phone and literally run one of these models on a phone and

384
00:22:14,750 --> 00:22:17,530
Josh:
you could go anywhere in the world with no service and have access to these models running

385
00:22:17,530 --> 00:22:20,590
Josh:
on a laptop or whatever mobile device, that's super

386
00:22:20,590 --> 00:22:24,330
Josh:
powerful. And that's not something that is easy to do with the other open source

387
00:22:24,330 --> 00:22:28,650
Josh:
models. So perhaps that's the advantage that OpenAI has: it's just the density

388
00:22:28,650 --> 00:22:32,270
Josh:
of intelligence and the efficiency of these parameters that they've given to

389
00:22:32,270 --> 00:22:37,630
Josh:
us, versus just being this home run open source model that is going for the frontier.

390
00:22:37,970 --> 00:22:40,190
Josh:
It's just a little bit of a different approach.

391
00:22:40,530 --> 00:22:44,130
Ejaaz:
Yeah, we need like a small but mighty ranking on this show, Josh,

392
00:22:44,250 --> 00:22:48,190
Ejaaz:
that we can kind of like run every week when these companies release a new model.

393
00:22:48,410 --> 00:22:52,570
Ejaaz:
No, but it got me thinking, if we zoomed out of that question,

394
00:22:52,810 --> 00:22:56,170
Ejaaz:
right, because we're talking about small models versus large models,

395
00:22:56,530 --> 00:23:00,730
Ejaaz:
parameters and how effectively they're used versus other models that are bigger.

396
00:23:01,670 --> 00:23:06,730
Ejaaz:
What really matters in this, Josh? In my opinion, it's user experience and how

397
00:23:06,730 --> 00:23:09,170
Ejaaz:
useful these models are to my daily life, right?

398
00:23:09,410 --> 00:23:14,070
Ejaaz:
At the end of the day, I kind of don't really care what size that model is unless

399
00:23:14,070 --> 00:23:17,450
Ejaaz:
it's useful for me, right? It could be small, it could be personal, it could be private.

400
00:23:17,850 --> 00:23:22,110
Ejaaz:
It depends on, I guess, the use case at the time. And I have a feeling that

401
00:23:22,110 --> 00:23:29,830
Ejaaz:
the trend of how technology typically goes, you kind of want a really high-performant

402
00:23:29,830 --> 00:23:31,470
Ejaaz:
small model, eventually.

403
00:23:31,670 --> 00:23:35,470
Ejaaz:
Right? I try and think about like us using computers for the first time,

404
00:23:35,670 --> 00:23:37,310
Ejaaz:
you know, back in our dinosaur age.

405
00:23:37,470 --> 00:23:42,110
Ejaaz:
And then, you know, it all being condensed on a tiny metal slab that we now

406
00:23:42,110 --> 00:23:45,150
Ejaaz:
use every day. And we can pretty much work remotely from wherever.

407
00:23:45,330 --> 00:23:47,650
Ejaaz:
And I feel like this is where models are going to go. They're going to become

408
00:23:47,650 --> 00:23:49,590
Ejaaz:
more private. They're going to become more personal.

409
00:23:49,870 --> 00:23:53,870
Ejaaz:
Maybe it'll be a combination of, you know, it running locally on your device

410
00:23:53,870 --> 00:23:57,470
Ejaaz:
versus cloud inference and trusting certain providers.

411
00:23:57,650 --> 00:24:01,630
Ejaaz:
I don't know how it's going to fall out, but I think it's not a zero to

412
00:24:01,630 --> 00:24:03,090
Ejaaz:
one. It's not a black or white situation.

413
00:24:03,270 --> 00:24:06,110
Ejaaz:
I don't think everyone's just going to go with large centralized models that

414
00:24:06,110 --> 00:24:06,970
Ejaaz:
they can inference from the cloud.

415
00:24:07,110 --> 00:24:09,530
Ejaaz:
I think it'll be a mixture of both. And how that materializes,

416
00:24:09,550 --> 00:24:11,910
Ejaaz:
I don't know, but it's an interesting one to ponder.

417
00:24:12,530 --> 00:24:15,450
Josh:
Yeah, I think this is funny. This is going to sound very ironic,

418
00:24:15,450 --> 00:24:19,130
Josh:
but Apple was the one that got this most right.

419
00:24:19,370 --> 00:24:20,690
Ejaaz:
Sorry, who's Apple again?

420
00:24:21,070 --> 00:24:24,070
Josh:
Yeah, right. I mean, it sounds ridiculous to say this. And granted,

421
00:24:24,150 --> 00:24:25,710
Josh:
they did not execute on this at all.

422
00:24:25,870 --> 00:24:28,990
Josh:
But in theory, I think they nailed the approach initially,

423
00:24:28,990 --> 00:24:31,850
Josh:
which was: you run local compute where all of

424
00:24:31,850 --> 00:24:34,710
Josh:
your stuff is. So my iPhone is the device I never

425
00:24:34,710 --> 00:24:37,550
Josh:
leave without. It is everything about me. It is all of my messages, my

426
00:24:37,550 --> 00:24:41,290
Josh:
contacts, all the context you could ever want from me. And then the idea was

427
00:24:41,290 --> 00:24:44,350
Josh:
they would give you a local model that is integrated and embedded into that

428
00:24:44,350 --> 00:24:47,490
Josh:
operating system. And then if there's anything that requires more compute, well,

429
00:24:47,490 --> 00:24:50,250
Josh:
then they'll send the query off into the cloud, but most of it will get done

430
00:24:50,250 --> 00:24:54,470
Josh:
on your local device, because most of it isn't that complicated. And I think, as

431
00:24:54,470 --> 00:24:57,470
Josh:
a user, when I ask myself what I want from AI,

432
00:24:57,630 --> 00:25:00,730
Josh:
Well, I just want it to be my ultimate assistant. I just want it to be there

433
00:25:00,730 --> 00:25:03,610
Josh:
to make my life better. And so much of that is the context.

434
00:25:03,830 --> 00:25:07,770
Josh:
And Apple going with that model would have been incredible.

435
00:25:08,030 --> 00:25:10,170
Josh:
It would have been so great. It would have had the lightweight model that runs

436
00:25:10,170 --> 00:25:13,210
Josh:
locally, it has all the context of your life, and then it offloads to the cloud.

437
00:25:13,350 --> 00:25:17,870
Josh:
I still think this model is probably the correct one for optimizing the user

438
00:25:17,870 --> 00:25:20,510
Josh:
experience. But unfortunately, Apple just has not done that.

439
00:25:20,750 --> 00:25:24,290
Josh:
So it's up for grabs. I mean, again, Sam Altman's been posting a lot this week.

440
00:25:24,290 --> 00:25:27,610
Josh:
We do have to tease what's coming, because this is probably going to be a huge

441
00:25:27,610 --> 00:25:29,990
Josh:
week. There's a high probability we get GPT-5.

442
00:25:30,230 --> 00:25:33,810
Josh:
And then they've also been talking about their hardware device a little bit. And they're saying how

443
00:25:34,400 --> 00:25:37,580
Josh:
it's genuinely going to change the world. And I believe the reason

444
00:25:37,580 --> 00:25:39,740
Josh:
why is because they're taking this Apple approach where they're building the

445
00:25:39,740 --> 00:25:42,800
Josh:
operating system, they're gathering the context, and then they're

446
00:25:42,800 --> 00:25:44,780
Josh:
able to serve it now locally on device.

447
00:25:44,960 --> 00:25:47,120
Josh:
They're able to go to the cloud when they need more compute.

448
00:25:47,280 --> 00:25:51,540
Josh:
And it's going to create this really cool, I think, duality of AI where you

449
00:25:51,540 --> 00:25:55,520
Josh:
have your super private local one, and then you have the big brain one,

450
00:25:55,760 --> 00:25:58,460
Josh:
the big brother that's off in the cloud that does all the hard computing for you.

451
00:25:58,460 --> 00:26:01,780
Ejaaz:
Well, one thing is clear. There are going to be hundreds of models and it's

452
00:26:01,780 --> 00:26:05,540
Ejaaz:
going to benefit the user, you and me, for so many multiple...

453
00:26:05,540 --> 00:26:09,900
Ejaaz:
It's the big companies' problem to figure out how these models work together

454
00:26:09,900 --> 00:26:11,680
Ejaaz:
and which ones get queried. I don't care.

455
00:26:12,000 --> 00:26:14,040
Ejaaz:
Just give me the good stuff and I'm going to be happy.

456
00:26:14,840 --> 00:26:20,360
Ejaaz:
Folks, OpenAI has been cooking. These are the first open source models they've

457
00:26:20,360 --> 00:26:21,720
Ejaaz:
released in six years, Josh.

458
00:26:22,000 --> 00:26:28,940
Ejaaz:
The last one was GPT-2 in 2019, which seems like the stone age, and it was only like six years ago.

459
00:26:29,620 --> 00:26:34,580
Ejaaz:
Thank you so much for listening. We are pumped to be talking about GPT-5,

460
00:26:34,820 --> 00:26:38,220
Ejaaz:
which we hope will be released in maybe 24 hours.

461
00:26:38,340 --> 00:26:40,120
Ejaaz:
Hopefully this week, fingers crossed. I don't know, we might be back on this

462
00:26:40,120 --> 00:26:42,280
Ejaaz:
camera pretty soon. Stay tuned.

463
00:26:42,800 --> 00:26:45,800
Ejaaz:
Please like, subscribe, and watch out for all the updates. We're going to release

464
00:26:45,800 --> 00:26:49,160
Ejaaz:
a bunch of clips as well, if you want to get to the juicy bits.

465
00:26:49,580 --> 00:26:52,960
Ejaaz:
Share this with your friends and give us feedback. If you want to hear about

466
00:26:52,960 --> 00:26:55,840
Ejaaz:
different things, things that we haven't covered yet or things that we've spoken

467
00:26:55,840 --> 00:26:59,680
Ejaaz:
about but you want more clarity on, or guests that you want to join the show, let us know.

468
00:26:59,860 --> 00:27:02,140
Ejaaz:
We're going full force on this and we'll see you on the next one.

469
00:27:02,560 --> 00:27:04,160
Josh:
Sounds good. See you guys soon. Peace.

470
00:27:04,560 --> 00:27:12,340
Music:
Music