1
00:00:03,380 --> 00:00:07,300
Ejaaz:
What if I told you there was a single website you could go to where you can

2
00:00:07,300 --> 00:00:11,100
Ejaaz:
chat to any major AI model from one single interface?

3
00:00:11,480 --> 00:00:16,080
Ejaaz:
It's kind of like ChatGPT, but instead every prompt gets routed to the exact

4
00:00:16,080 --> 00:00:19,780
Ejaaz:
AI model that will do the best job for whatever your prompt might be.

5
00:00:20,060 --> 00:00:25,500
Ejaaz:
Well, on today's episode, we're joined by Alex Atallah, the founder and CEO of OpenRouter AI.

6
00:00:25,760 --> 00:00:31,120
Ejaaz:
It's the fastest-growing AI model marketplace with access to over 400 LLMs,

7
00:00:31,260 --> 00:00:34,880
Ejaaz:
making it the only place that really knows how people use AI models,

8
00:00:34,900 --> 00:00:37,780
Ejaaz:
and more importantly, how they might use them in the future.

9
00:00:37,940 --> 00:00:42,220
Ejaaz:
It's at the intersection of every single prompt that anyone writes and every

10
00:00:42,220 --> 00:00:43,560
Ejaaz:
model that they might ever use.

11
00:00:43,780 --> 00:00:46,040
Ejaaz:
Alex Atallah, welcome to the show. How are you, man?

12
00:00:46,320 --> 00:00:48,540
Alex:
Thanks, guys. Great. Thanks so much for having me on.

13
00:00:48,740 --> 00:00:53,440
Ejaaz:
So it is a Monday. How does the founder of OpenRouter spend his weekend?

14
00:00:53,440 --> 00:00:58,480
Ejaaz:
Presumably, you know, out and about, chilling, relaxing, not at all focused on the company?

15
00:00:58,480 --> 00:01:01,480
Alex:
I usually... I love weekends with no

16
00:01:01,480 --> 00:01:06,100
Alex:
meetings planned, and I just go to a coffee shop and just have tons of hours

17
00:01:06,100 --> 00:01:11,580
Alex:
stacked in a row to do things that require a lot of momentum buildup. So

18
00:01:11,580 --> 00:01:19,500
Alex:
I did that at coffee shops on Saturday and Sunday, and then I watched Blade Runner again.

19
00:01:20,700 --> 00:01:23,880
Ejaaz:
Again? Okay. Well, so

20
00:01:23,880 --> 00:01:27,580
Ejaaz:
when we were preparing for this episode, Alex,

21
00:01:27,580 --> 00:01:35,500
Ejaaz:
I couldn't help but think that you've had a pretty insane decade of startup

22
00:01:35,500 --> 00:01:40,760
Ejaaz:
foundership, right? So OpenRouter is kind of like your second major thing

23
00:01:40,760 --> 00:01:45,440
Ejaaz:
that you've done, but prior to doing that you were the founder and CTO of OpenSea,

24
00:01:45,600 --> 00:01:48,420
Ejaaz:
the biggest NFT marketplace out there.

25
00:01:48,580 --> 00:01:51,880
Ejaaz:
And now you're focused on one of the biggest AI companies out there.

26
00:01:51,960 --> 00:01:56,080
Ejaaz:
So it sounds like you're at kind of like the pivot point of two of the most

27
00:01:56,080 --> 00:01:58,640
Ejaaz:
important technology sectors over the last decade.

28
00:01:59,660 --> 00:02:04,540
Ejaaz:
Can you just give us a bit of background as to how you ended up here?

29
00:02:04,740 --> 00:02:06,440
Ejaaz:
And more importantly, where you started.

30
00:02:06,800 --> 00:02:10,640
Ejaaz:
Walk us through the journey of OpenSea and how you ended up at OpenRouter AI.

31
00:02:10,640 --> 00:02:20,240
Alex:
Yeah, so I co-founded OpenSea with Devin Finzer at the very beginning of 2018, very end of 2017.

32
00:02:20,460 --> 00:02:24,500
Alex:
It was the first NFT marketplace. And...

33
00:02:25,420 --> 00:02:33,740
Alex:
It was not dissimilar to OpenRouter in that there was a really fragmented ecosystem

34
00:02:33,740 --> 00:02:40,720
Alex:
of NFT metadata and media that gets attached to these tokens.

35
00:02:40,840 --> 00:02:47,420
Alex:
And it was the first example of something in crypto that could be non-fungible,

36
00:02:47,540 --> 00:02:50,940
Alex:
meaning it's a single thing that can be traded from person to person.

37
00:02:51,100 --> 00:02:54,840
Alex:
Most things in the world are non-fungible. A chair is non-fungible. A

38
00:02:54,840 --> 00:02:58,140
Alex:
currency is fungible. So

39
00:02:58,140 --> 00:03:02,240
Alex:
back in 2018, no

40
00:03:02,240 --> 00:03:05,560
Alex:
one was really thinking about crypto in terms of non-fungible goods.

41
00:03:05,560 --> 00:03:08,420
Alex:
And the problem with

42
00:03:08,420 --> 00:03:11,240
Alex:
non-fungible goods is that there weren't any real standards set up.

43
00:03:11,240 --> 00:03:15,580
Alex:
There were a lot of heterogeneous

44
00:03:15,580 --> 00:03:18,860
Alex:
implementations for how to get

45
00:03:18,860 --> 00:03:24,940
Alex:
a non-fungible item represented and tradable in a decentralized way. So OpenSea

46
00:03:24,940 --> 00:03:33,700
Alex:
organized this very heterogeneous inventory and put it together in one

47
00:03:33,700 --> 00:03:35,920
Alex:
place. We came up with a metadata standard.

48
00:03:36,040 --> 00:03:43,800
Alex:
We did a lot of work to really make the experience super good for each collection.

49
00:03:45,360 --> 00:03:50,480
Alex:
And you see a lot of similarities with how AI works today,

50
00:03:50,520 --> 00:03:53,080
Alex:
too, where there's also just a very heterogeneous ecosystem,

51
00:03:53,100 --> 00:04:00,520
Alex:
with a lot of different APIs and different features supported by language model providers.

52
00:04:00,820 --> 00:04:05,140
Alex:
And OpenRouter similarly does a lot of work to organize it all.

53
00:04:06,230 --> 00:04:16,070
Alex:
I was at OpenSea until 2022, when I was kind of feeling the itch to do something new.

54
00:04:16,810 --> 00:04:23,490
Alex:
And I left in August, and then ChatGPT came out a few months later.

55
00:04:24,070 --> 00:04:27,070
Alex:
And my biggest question around that

56
00:04:27,070 --> 00:04:30,470
Alex:
time was whether it was going to be a winner-take-all market,

57
00:04:30,470 --> 00:04:33,530
Alex:
because OpenAI was very far ahead of

58
00:04:33,530 --> 00:04:36,350
Alex:
everybody else. And, you know,

59
00:04:36,350 --> 00:04:39,950
Alex:
we had Cohere Command, we had a couple of open-source models,

60
00:04:39,950 --> 00:04:42,650
Alex:
but OpenAI was the only really usable one. I

61
00:04:42,650 --> 00:04:45,770
Alex:
was doing little projects to experiment

62
00:04:45,770 --> 00:04:48,810
Alex:
with the GPT-3 API,

63
00:04:48,810 --> 00:04:52,910
Alex:
and then Llama came out in January. Really

64
00:04:52,910 --> 00:04:55,590
Alex:
exciting: about a tenth the size, won on a

65
00:04:55,590 --> 00:04:59,690
Alex:
couple of benchmarks, but it wasn't really chattable yet.

66
00:04:59,690 --> 00:05:02,790
Alex:
And it wasn't until a few

67
00:05:02,790 --> 00:05:05,510
Alex:
months later that a team at

68
00:05:05,510 --> 00:05:09,090
Alex:
Stanford distilled it into a new

69
00:05:09,090 --> 00:05:12,510
Alex:
model called Alpaca. Distillation means

70
00:05:12,510 --> 00:05:15,350
Alex:
you take the model and you customize it or fine-tune it

71
00:05:15,350 --> 00:05:18,710
Alex:
on a set of synthetic data that they

72
00:05:18,710 --> 00:05:21,970
Alex:
made using ChatGPT as a research project.

73
00:05:21,970 --> 00:05:24,770
Alex:
And it was

74
00:05:24,770 --> 00:05:27,490
Alex:
the first successful major distillation that I'm aware

75
00:05:27,490 --> 00:05:30,190
Alex:
of, and it was an actually usable model. I

76
00:05:30,190 --> 00:05:33,090
Alex:
was on the airplane talking to it. I was like, wow, this

77
00:05:33,090 --> 00:05:36,730
Alex:
is... if it only took $600 to make something like this, then you

78
00:05:36,730 --> 00:05:40,530
Alex:
don't need $10 million to make a model. There might be tens of thousands,

79
00:05:40,530 --> 00:05:44,450
Alex:
hundreds of thousands of models in the future. And suddenly this started to look

80
00:05:44,450 --> 00:05:51,030
Alex:
like a new economic primitive, a new building block that kind

81
00:05:51,030 --> 00:05:52,530
Alex:
of deserved its own place on the internet.

82
00:05:52,910 --> 00:05:56,410
Alex:
And there wasn't one. There wasn't a place where you could discover new language

83
00:05:56,410 --> 00:05:58,810
Alex:
models and see who uses them and why.

84
00:05:59,760 --> 00:06:02,240
Alex:
And that's how OpenRouter got started.

85
00:06:02,880 --> 00:06:05,880
Josh:
That's amazing. So one of the things that we're obsessed with on this channel

86
00:06:05,880 --> 00:06:10,660
Josh:
in particular is exploring frontiers and how to properly see these frontiers

87
00:06:10,660 --> 00:06:12,380
Josh:
and analyze them and understand when they're going to happen.

88
00:06:12,660 --> 00:06:16,660
Josh:
And when I was going through your history, you have this talent consistently over time.

89
00:06:16,940 --> 00:06:21,440
Josh:
And even as far back as early on, I read you were hacking Wi-Fi routers in a hackathon.

90
00:06:21,640 --> 00:06:25,220
Josh:
You were very early to that. You were early to the NFTs. You were early to understanding

91
00:06:25,220 --> 00:06:28,620
Josh:
AI and the impact that it would have. And what I'd love for you to explain is

92
00:06:28,620 --> 00:06:31,960
Josh:
the thought process and the indicators you look for when exploring these new

93
00:06:31,960 --> 00:06:34,780
Josh:
frontiers, because clearly there's some sort of pattern matching going on.

94
00:06:34,920 --> 00:06:38,760
Josh:
Clearly you have some sort of awareness of what will be important and why it

95
00:06:38,760 --> 00:06:41,120
Josh:
will be important, and then inserting yourself into that narrative.

96
00:06:41,300 --> 00:06:45,600
Josh:
So are there patterns? Are there certain things that you look for when searching

97
00:06:45,600 --> 00:06:49,420
Josh:
for these new opportunities and that led you to make these decisions that you have?

98
00:06:49,420 --> 00:06:55,840
Alex:
I think there's a lot to be said for finding enthusiast communities

99
00:06:55,840 --> 00:06:58,700
Alex:
and seeing if you're going to join them.

100
00:06:58,900 --> 00:07:01,340
Alex:
Like, can you be an enthusiast with them?

101
00:07:02,680 --> 00:07:07,700
Alex:
Like, whenever something new comes out that has some kind of ecosystem potential,

102
00:07:08,040 --> 00:07:11,460
Alex:
there are going to be enthusiast communities that pop up.

103
00:07:11,820 --> 00:07:17,040
Alex:
And the Internet has made it self-serve. You could just join the communities.

104
00:07:17,040 --> 00:07:20,580
Alex:
Discord, I think, is an incredible

105
00:07:20,580 --> 00:07:24,460
Alex:
and super underrated platform because

106
00:07:24,460 --> 00:07:27,900
Alex:
the communities feel kind of private.

107
00:07:27,900 --> 00:07:30,840
Alex:
You don't feel like you're, you know,

108
00:07:30,840 --> 00:07:33,680
Alex:
seeing somebody trying to, you

109
00:07:33,680 --> 00:07:36,980
Alex:
know, advertise something for SEO juice. There's

110
00:07:36,980 --> 00:07:40,100
Alex:
no SEO juice in Discord. It's

111
00:07:40,100 --> 00:07:43,020
Alex:
just people talking about what they're passionate about, and it

112
00:07:43,020 --> 00:07:45,820
Alex:
gets really niche. And when

113
00:07:45,820 --> 00:07:49,280
Alex:
you find an interest group in Discord that

114
00:07:49,280 --> 00:07:52,480
Alex:
has to do with some new

115
00:07:52,480 --> 00:07:57,100
Alex:
piece of technology that's just being developed right now and doesn't really

116
00:07:57,100 --> 00:08:02,680
Alex:
work very well at all, you get people who are just trying to figure out what

117
00:08:02,680 --> 00:08:07,960
Alex:
to do with it and how to make it better. And I think that's the first

118
00:08:07,960 --> 00:08:10,980
Alex:
core piece of magic that jumps to mind.

119
00:08:11,720 --> 00:08:17,380
Alex:
There's got to be a willingness to be weird, because if you jump into any

120
00:08:17,380 --> 00:08:20,500
Alex:
of these communities, at face value it's stupid.

121
00:08:21,720 --> 00:08:26,800
Alex:
Like, oh, this is just a game, or it's a really weird game. I mean, I'm

122
00:08:26,800 --> 00:08:32,280
Alex:
not really a collectible-game person, so I'm going to leave right now. And yeah.

123
00:08:34,190 --> 00:08:36,050
Alex:
Not only do you have to be weird, but you have to be creative.

124
00:08:36,470 --> 00:08:43,130
Alex:
Like, okay, these are just cats on the blockchain, and people are just trading cats back and forth.

125
00:08:43,990 --> 00:08:47,530
Alex:
You can't look at the community as simply that.

126
00:08:49,130 --> 00:08:52,290
Alex:
Think about what you could do with it.

127
00:08:52,510 --> 00:08:56,150
Alex:
Like, what does this unlock that wasn't achievable before?

128
00:08:56,150 --> 00:08:59,850
Alex:
And

129
00:08:59,850 --> 00:09:02,670
Alex:
I think there are people who

130
00:09:02,670 --> 00:09:05,650
Alex:
are just good, who will do this, and they'll join the communities

131
00:09:05,650 --> 00:09:08,610
Alex:
and brainstorm live, and you can see everybody

132
00:09:08,610 --> 00:09:11,870
Alex:
brainstorming in real time. But, like,

133
00:09:11,870 --> 00:09:15,330
Alex:
another incredible example of this was the Midjourney Discord.

134
00:09:15,330 --> 00:09:18,310
Alex:
You know, it became the

135
00:09:18,310 --> 00:09:21,430
Alex:
biggest server on Discord by

136
00:09:21,430 --> 00:09:25,110
Alex:
far. And, you know,

137
00:09:25,110 --> 00:09:28,270
Alex:
why did that happen? Well, it started with

138
00:09:28,270 --> 00:09:32,190
Alex:
something weird, silly, maybe

139
00:09:32,190 --> 00:09:34,970
Alex:
not super useful, but you could see all the

140
00:09:34,970 --> 00:09:38,250
Alex:
enthusiasts remixing and

141
00:09:38,250 --> 00:09:41,390
Alex:
brainstorming live how to turn it into something beautiful

142
00:09:41,390 --> 00:09:45,590
Alex:
and how to make it useful,

143
00:09:45,590 --> 00:09:53,310
Alex:
and then, you know, it just exploded. It's the most

144
00:09:53,310 --> 00:09:59,430
Alex:
incredible niche community, I think, that Discord has ever seen, because

145
00:09:59,430 --> 00:10:04,970
Alex:
of how useless it started and how insanely exciting it became.

146
00:10:06,650 --> 00:10:09,590
Alex:
So, I mean, I

147
00:10:09,590 --> 00:10:12,510
Alex:
think I saw Big Sleep. I was playing around with this model

148
00:10:12,510 --> 00:10:16,230
Alex:
called Big Sleep in 2021 that

149
00:10:16,230 --> 00:10:19,190
Alex:
let you generate images that

150
00:10:19,190 --> 00:10:22,870
Alex:
look kind of like DeviantArt. Okay, and

151
00:10:22,870 --> 00:10:25,550
Alex:
they're all

152
00:10:25,550 --> 00:10:29,110
Alex:
animated images, and none of them really made sense, but you could get some

153
00:10:29,110 --> 00:10:32,710
Alex:
really cool stuff, like potentially something you'd want to make your desktop

154
00:10:32,710 --> 00:10:38,110
Alex:
wallpaper. And if you're really deep in some DeviantArt communities, you

155
00:10:38,110 --> 00:10:41,970
Alex:
know, you kind of appreciate it. And so that was like, oh, there's

156
00:10:41,970 --> 00:10:43,290
Alex:
a kernel of something here.

157
00:10:43,850 --> 00:10:49,170
Alex:
And it took another year or two before Midjourney started to

158
00:10:49,170 --> 00:10:51,230
Alex:
pick up, but that was like...

159
00:10:51,230 --> 00:10:56,530
Ejaaz:
Where were you seeing all of this, Alex? Like, were you scouring just random

160
00:10:56,530 --> 00:10:59,650
Ejaaz:
forums or just wherever your nose told you to go?

161
00:10:59,650 --> 00:11:03,550
Alex:
Basically, there's this Twitter account, I'm trying to remember what it's

162
00:11:03,550 --> 00:11:12,930
Alex:
called, that posts AI research papers and kind of tries to show what you can do with them.

163
00:11:13,270 --> 00:11:17,670
Alex:
And I discovered this Twitter account in like 2021.

164
00:11:19,570 --> 00:11:20,770
Alex:
And I.

165
00:11:21,930 --> 00:11:27,450
Alex:
I think it wasn't at all related to crypto, but it was

166
00:11:27,450 --> 00:11:32,610
Alex:
a way... you know, Big Sleep was the first thing I saw that used AI to generate

167
00:11:32,610 --> 00:11:35,270
Alex:
things that could potentially be NFTs.

168
00:11:35,930 --> 00:11:42,570
Alex:
So I started experimenting around how much you could direct it to make

169
00:11:42,570 --> 00:11:49,090
Alex:
an NFT collection that would make any sense. It was very, very difficult, but

170
00:11:49,090 --> 00:11:54,490
Alex:
that was how... that was like the first generative...

171
00:11:54,490 --> 00:11:58,790
Ejaaz:
This was before you were even thinking about starting OpenRouter, right?

172
00:12:01,410 --> 00:12:04,130
Alex:
Yeah, this was when I was

173
00:12:04,130 --> 00:12:07,150
Alex:
full-time at OpenSea. Oh,

174
00:12:07,150 --> 00:12:10,270
Alex:
yeah, I got it. It's akhaliq,

175
00:12:10,270 --> 00:12:13,730
Alex:
this Twitter account. All right,

176
00:12:13,730 --> 00:12:20,710
Alex:
I really recommend it. They basically post papers and explain and explore

177
00:12:20,710 --> 00:12:27,970
Alex:
how each paper gets useful. They post animations. They make

178
00:12:27,970 --> 00:12:32,950
Alex:
AI research kind of fun to engage with, and that was my first experience.

179
00:12:33,450 --> 00:12:39,670
Ejaaz:
Okay, so I mean, that's a massive win for X, or as it was known back

180
00:12:39,670 --> 00:12:41,390
Ejaaz:
then, Twitter as a platform, right?

181
00:12:41,450 --> 00:12:44,330
Ejaaz:
It gave birth to kind of like two of the biggest technologies: crypto,

182
00:12:44,630 --> 00:12:49,590
Ejaaz:
also known as crypto Twitter, and now apparently all the AI research stuff which

183
00:12:49,590 --> 00:12:52,970
Ejaaz:
kind of put you on to the path that led you to OpenRouter.

184
00:12:53,170 --> 00:12:59,130
Ejaaz:
So if I've got this right, Alex, you were full-time at OpenSea with a

185
00:12:59,670 --> 00:13:02,850
Ejaaz:
multi-billion-dollar company, loads of important stuff to do there,

186
00:13:02,990 --> 00:13:07,650
Ejaaz:
but you still found the time to kind of scour this fringe technology because

187
00:13:07,650 --> 00:13:09,030
Ejaaz:
that's what AI was at the time.

188
00:13:09,250 --> 00:13:13,490
Ejaaz:
Prior to kind of GPT-2 or GPT-3, no one really knew about this.

189
00:13:13,610 --> 00:13:17,610
Ejaaz:
And you were playing around with these gen AI models, these generative AI models

190
00:13:17,610 --> 00:13:22,850
Ejaaz:
that would create this magical little substance and maybe it came in the form

191
00:13:22,850 --> 00:13:25,130
Ejaaz:
of a picture or a weird little cat.

192
00:13:25,350 --> 00:13:28,990
Ejaaz:
And you kind of like jumped into these niche forums of enthusiasts,

193
00:13:28,990 --> 00:13:31,910
Ejaaz:
as you say, and kind of explored that further.

194
00:13:32,250 --> 00:13:37,450
Ejaaz:
And it sounds like you kind of like honed that even beyond your journey from OpenSea when you left.

195
00:13:37,590 --> 00:13:43,930
Ejaaz:
I remember actually meeting you in this kind of, like, abyss between you

196
00:13:43,930 --> 00:13:48,030
Ejaaz:
leaving OpenSea and starting OpenRouter where you were kind of brainstorming

197
00:13:48,030 --> 00:13:52,330
Ejaaz:
a bunch of these ideas. And I remember a snippet from our conversation

198
00:13:53,170 --> 00:13:58,670
Ejaaz:
in, like, one of the WeWorks here, where you just kind of like had whiteboarded a bunch of AI stuff.

199
00:13:58,890 --> 00:14:02,630
Ejaaz:
And one of those things was kind of like the whole topic of inference.

200
00:14:02,810 --> 00:14:06,790
Ejaaz:
And if I'm being honest with you, I had no idea what that word even meant back then.

201
00:14:07,290 --> 00:14:12,130
Ejaaz:
I was extremely focused on all the NFT stuff and all the crypto stuff,

202
00:14:12,150 --> 00:14:13,650
Ejaaz:
my background's in all of that.

203
00:14:13,790 --> 00:14:17,870
Ejaaz:
But I just found that fascinating that you always had your nose in some of the

204
00:14:17,870 --> 00:14:21,510
Ejaaz:
early communities. And I think that's a really important lesson there.

205
00:14:21,750 --> 00:14:27,210
Ejaaz:
I want to pick up on something that you actually brought up when you said you

206
00:14:27,210 --> 00:14:31,310
Ejaaz:
discovered kind of like your path to OpenRouter, Alex.

207
00:14:31,570 --> 00:14:35,770
Ejaaz:
And that is, you said you were playing around with these early AI models.

208
00:14:36,050 --> 00:14:39,870
Ejaaz:
So not the GPTs before Claude was even created.

209
00:14:40,050 --> 00:14:43,270
Ejaaz:
You're playing around with these random models that you would find either on

210
00:14:43,270 --> 00:14:49,010
Ejaaz:
forums, on Twitter, or on Reddit, right? And you would experiment with them.

211
00:14:49,310 --> 00:14:53,790
Ejaaz:
And I find it fascinating that back then, even when GPT became a thing,

212
00:14:54,330 --> 00:14:57,070
Ejaaz:
you were convinced that there would be tens of thousands,

213
00:14:57,170 --> 00:14:59,290
Ejaaz:
or did you say hundreds of thousands of AI models?

214
00:14:59,670 --> 00:15:01,510
Ejaaz:
Back then, that wasn't a normal view.

215
00:15:02,250 --> 00:15:05,870
Ejaaz:
Back then, everyone was like, you need hundreds of millions of dollars.

216
00:15:06,310 --> 00:15:10,950
Ejaaz:
Maybe it was tens of millions of dollars back then. And it was going to be a rich man's game.

217
00:15:11,230 --> 00:15:16,910
Alex:
Yeah, it was basically the Alpaca project that kind of put me over the edge

218
00:15:18,870 --> 00:15:24,150
Alex:
on there being many, many, many models instead of just a very small number.

219
00:15:24,290 --> 00:15:28,070
Ejaaz:
And can you explain what the Alpaca project is for the audience? Yeah.

220
00:15:29,030 --> 00:15:38,090
Alex:
So the Alpaca project... after Llama came out, you really could not chat with it

221
00:15:38,090 --> 00:15:40,630
Alex:
very well. It was a text completion model.

222
00:15:41,850 --> 00:15:45,350
Alex:
There were a couple benchmarks where it beat GPT-3.

223
00:15:46,230 --> 00:15:54,010
Alex:
And it was about a tenth the size of what most people thought GPT-3 was sized at.

224
00:15:54,310 --> 00:15:56,350
Alex:
So it was a pretty incredible achievement.

225
00:15:57,170 --> 00:16:00,870
Alex:
But the user experience wasn't really there.

226
00:16:01,450 --> 00:16:07,690
Alex:
And the Alpaca project took ChatGPT and generated a bunch of synthetic outputs.

227
00:16:09,010 --> 00:16:14,390
Alex:
And then they fine-tuned Llama on those synthetic outputs.

228
00:16:14,910 --> 00:16:21,090
Alex:
And this did two things to Llama. It taught it style, and it taught it knowledge.

229
00:16:21,450 --> 00:16:26,350
Alex:
The style is how to chat, which was the big user experience gap.

230
00:16:26,870 --> 00:16:29,210
Alex:
And it made it smarter.

231
00:16:30,350 --> 00:16:34,150
Alex:
Like, fine-tuning transfers both style and knowledge.

232
00:16:34,150 --> 00:16:37,390
Alex:
And the model would, like, respond to things that it had, you know,

233
00:16:37,410 --> 00:16:42,470
Alex:
the content of the synthetic data was reflected in the model's

234
00:16:42,470 --> 00:16:44,290
Alex:
performance on benchmarks after that point.

235
00:16:44,290 --> 00:16:47,250
Alex:
So if you can do

236
00:16:47,250 --> 00:16:50,990
Alex:
that without revealing all

237
00:16:50,990 --> 00:16:54,150
Alex:
the data that goes in, now

238
00:16:54,150 --> 00:16:57,450
Alex:
there's a way you could sell data via API without

239
00:16:57,450 --> 00:17:02,370
Alex:
just dumping all the data out to the world and then never being able

240
00:17:02,370 --> 00:17:07,210
Alex:
to monetize it again. So there's a brand-new business model around

241
00:17:07,210 --> 00:17:15,990
Alex:
data that emerges. And, yeah, the ability to just work towards open intelligence

242
00:17:16,690 --> 00:17:20,590
Alex:
and build new

243
00:17:20,590 --> 00:17:23,790
Alex:
architectures, test them more quickly, and

244
00:17:23,790 --> 00:17:26,770
Alex:
fine-tune them quickly. Basically, you

245
00:17:26,770 --> 00:17:29,710
Alex:
can build on top of the work of giants. I mean,

246
00:17:29,710 --> 00:17:32,590
Alex:
you don't have to start from zero every time. A lot

247
00:17:32,590 --> 00:17:38,110
Alex:
of the biggest developer experience innovations just involve giving

248
00:17:38,110 --> 00:17:43,590
Alex:
developers a higher stair to start walking up so they don't have to start at

249
00:17:43,590 --> 00:17:51,290
Alex:
the bottom of the staircase every single time. And, you know, that was like the big,

250
00:17:52,370 --> 00:17:58,990
Alex:
generous give that Llama had for the community. And it wasn't, you know,

251
00:17:58,990 --> 00:18:01,770
Alex:
that wasn't the only company doing open-source models. Mistral

252
00:18:02,470 --> 00:18:08,210
Alex:
came out with 7B Instruct a few months later. It was an incredible model.

253
00:18:08,490 --> 00:18:13,850
Alex:
Then they came out with the first open-weight mixture of experts a few months later.

254
00:18:15,410 --> 00:18:19,130
Alex:
It felt like actual intelligence, but completely open.

255
00:18:19,390 --> 00:18:24,210
Alex:
And all of these provide higher and higher stairs for other developers to kind

256
00:18:24,210 --> 00:18:29,590
Alex:
of like, basically to crowdsource new ideas from the whole planet.

257
00:18:30,340 --> 00:18:33,200
Alex:
and let these new ideas build on

258
00:18:33,200 --> 00:18:37,100
Alex:
top of really good foundations. And

259
00:18:37,100 --> 00:18:40,280
Alex:
you know, when that whole picture started

260
00:18:40,280 --> 00:18:46,540
Alex:
to fall into place, it felt like, okay, this is going to be a huge inventory

261
00:18:46,540 --> 00:18:51,720
Alex:
situation, kind of like NFT collections were a huge inventory situation. Obviously

262
00:18:51,720 --> 00:18:55,660
Alex:
completely different: really different market dynamics, really different type

263
00:18:55,660 --> 00:18:59,040
Alex:
of goal that buyers have.

264
00:18:59,920 --> 00:19:06,240
Alex:
And so a lot of my early experimentation... like, I made a Chrome extension called Window AI.

265
00:19:06,240 --> 00:19:11,780
Alex:
I did a few other things... was just about learning how the ecosystem works

266
00:19:11,780 --> 00:19:16,180
Alex:
and what makes it different and what people really want,

267
00:19:16,320 --> 00:19:17,440
Alex:
what developers really want.

268
00:19:17,600 --> 00:19:21,100
Josh:
So that leads us to OpenRouter itself, right? So I kind of want you to help

269
00:19:21,100 --> 00:19:24,160
Josh:
explain to the listeners who aren't familiar with OpenRouter what it does.

270
00:19:24,160 --> 00:19:27,560
Josh:
Because I think a lot of people, the way they interact with an AI is they send

271
00:19:27,560 --> 00:19:29,720
Josh:
a prompt to their model of choice.

272
00:19:29,880 --> 00:19:33,500
Josh:
They use ChatGPT or they use the Grok app or they're on Gemini and they kind

273
00:19:33,500 --> 00:19:34,600
Josh:
of live in these siloed worlds.

274
00:19:34,740 --> 00:19:38,140
Josh:
And then the next step up are the people who kind of use it professionally,

275
00:19:38,280 --> 00:19:39,800
Josh:
who are developers. They're interacting with APIs.

276
00:19:40,160 --> 00:19:44,380
Josh:
Maybe they're not interfacing with the actual UI, but they're calling a single model.

277
00:19:44,540 --> 00:19:47,460
Josh:
And OpenRouter kind of exists on top of this, right? Can you walk us through

278
00:19:47,460 --> 00:19:50,340
Josh:
how it works and why so many people love using OpenRouter?

279
00:19:50,340 --> 00:19:56,520
Alex:
OpenRouter is an aggregator and marketplace for large language models.

280
00:19:56,900 --> 00:20:03,600
Alex:
You can kind of think of it as like a Stripe meets Cloudflare for both of them.

281
00:20:04,740 --> 00:20:09,300
Alex:
It's like a single pane of glass. You can orchestrate, discover,

282
00:20:09,340 --> 00:20:13,280
Alex:
and optimize all of your intelligence needs in one place.

283
00:20:14,340 --> 00:20:17,720
Alex:
One billing provider gets you all the models.

284
00:20:17,720 --> 00:20:21,980
Alex:
There's like 470-plus now.

285
00:20:21,980 --> 00:20:24,900
Alex:
All the models sort of implement features,

286
00:20:24,900 --> 00:20:28,300
Alex:
but they do it differently. And also there's

287
00:20:28,300 --> 00:20:32,480
Alex:
a lot of intelligence brownouts, as Andrej Karpathy calls them, where

288
00:20:32,480 --> 00:20:36,660
Alex:
models just go down all the time, even the top models, like

289
00:20:36,660 --> 00:20:39,420
Alex:
Anthropic and Gemini and

290
00:20:39,420 --> 00:20:47,560
Alex:
OpenAI. So what we do is... you know, developers need a lot of choice.

291
00:20:48,340 --> 00:20:49,860
Alex:
CTOs need a lot of reliability.

292
00:20:50,840 --> 00:20:55,620
Alex:
CFOs need predictable costs. CISOs need complex policy controls.

293
00:20:55,940 --> 00:21:03,140
Alex:
All of these are inputs to what we do, which is build a single pane of glass

294
00:21:03,140 --> 00:21:10,520
Alex:
that makes models more reliable, lower costs, gives you more choice, and,

295
00:21:11,380 --> 00:21:17,160
Alex:
helps you choose between all the options for where to source your intelligence.

296
00:21:17,760 --> 00:21:20,780
Josh:
How does it work? Because I would imagine, like,

297
00:21:20,780 --> 00:21:23,600
Josh:
Ejaaz and I, on the show we frequently talk about benchmarks, right, where

298
00:21:23,600 --> 00:21:27,760
Josh:
a certain model is the best at coding, and that infers that maybe you should

299
00:21:27,760 --> 00:21:30,640
Josh:
go to that model to do all of your coding needs because it's the best at it

300
00:21:30,640 --> 00:21:34,060
Josh:
but it would appear as if it's not true if you're routing through a lot of different

301
00:21:34,060 --> 00:21:39,640
Josh:
providers. So how do you decide which provider gets routed to, when, and how

302
00:21:39,640 --> 00:21:41,140
Josh:
to get the best result for what you're asking?

303
00:21:41,140 --> 00:21:44,900
Alex:
So we've taken a different approach so

304
00:21:44,900 --> 00:21:48,060
Alex:
far, which is, instead of focusing on

305
00:21:48,060 --> 00:21:50,980
Alex:
a production router that picks

306
00:21:50,980 --> 00:21:53,940
Alex:
the model for you, we try

307
00:21:53,940 --> 00:21:57,840
Alex:
to help you choose the model. So

308
00:21:57,840 --> 00:22:01,000
Alex:
we create lots of analytics, both on

309
00:22:01,000 --> 00:22:03,720
Alex:
your account and on our

310
00:22:03,720 --> 00:22:06,740
Alex:
rankings page to help you browse and discover the models that

311
00:22:06,740 --> 00:22:10,700
Alex:
the power users are really using successfully on

312
00:22:10,700 --> 00:22:13,380
Alex:
a certain type of workload, because we

313
00:22:13,380 --> 00:22:16,780
Alex:
think developers today primarily want to

314
00:22:16,780 --> 00:22:19,840
Alex:
choose the model themselves. Switching between model

315
00:22:19,840 --> 00:22:22,840
Alex:
families can result in very

316
00:22:22,840 --> 00:22:25,660
Alex:
unpredictable behavior. But once you've

317
00:22:25,660 --> 00:22:28,360
Alex:
chosen your model, we try to

318
00:22:28,360 --> 00:22:31,640
Alex:
help developers not need to think about the provider. There are

319
00:22:31,640 --> 00:22:34,940
Alex:
sometimes dozens of

320
00:22:34,940 --> 00:22:38,220
Alex:
providers for a given model, all kinds

321
00:22:38,220 --> 00:22:46,380
Alex:
of companies, including the hyperscalers like AWS, Google Vertex, and Azure,

322
00:22:46,380 --> 00:22:54,480
Alex:
and scaling startups like Together, Fireworks, DeepInfra, and a long

323
00:22:54,480 --> 00:22:56,340
Alex:
tail of providers that provide,

324
00:22:57,240 --> 00:22:59,240
Alex:
very unique features,

325
00:23:00,080 --> 00:23:02,280
Alex:
very exceptional performance.

326
00:23:03,200 --> 00:23:06,100
Alex:
There's all kinds of differentiators for them.

327
00:23:06,220 --> 00:23:10,280
Alex:
So what we do is we collect them all in one place. And if you want a feature,

328
00:23:10,520 --> 00:23:12,220
Alex:
you just get the providers that support it.

329
00:23:12,460 --> 00:23:18,040
Alex:
If you want performance, you get prioritized to the providers that have high performance.

330
00:23:18,340 --> 00:23:22,660
Alex:
If you really are cost sensitive, you get prioritized to the providers that

331
00:23:22,660 --> 00:23:27,840
Alex:
are really low cost today. And we basically create all these lanes. There are

332
00:23:31,710 --> 00:23:35,330
Alex:
innumerable ways you could get routed, but

333
00:23:35,330 --> 00:23:38,350
Alex:
you're in full control of the overall user

334
00:23:38,350 --> 00:23:41,190
Alex:
experience that you're aiming for. And that's

335
00:23:41,190 --> 00:23:44,290
Alex:
what we found was missing from the

336
00:23:44,290 --> 00:23:47,730
Alex:
whole ecosystem: just a way of doing that.

337
00:23:47,730 --> 00:23:53,470
Alex:
And, you know, we get on average between five and ten percent uptime boosts

338
00:23:53,470 --> 00:24:01,190
Alex:
over going to providers directly, just by load balancing and sending you to

339
00:24:01,190 --> 00:24:06,250
Alex:
the top provider that's up and able to handle your request.
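
The routing lanes described above (keep only the providers that support a required feature, skip providers that are down, then prioritize by cost or performance) can be sketched roughly as follows. This is a minimal illustrative sketch in Python, not OpenRouter's actual implementation: the `Provider` fields, the `pick_provider` function, and every provider name and number are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    # All fields are hypothetical; this is not OpenRouter's real schema.
    name: str
    is_up: bool
    price_per_million_tokens: float  # USD, illustrative
    tokens_per_second: float         # observed throughput, illustrative
    features: set = field(default_factory=set)


def pick_provider(providers, required_features=frozenset(), prefer="price"):
    """Return the best available provider for one request, or None."""
    # 1. Drop providers that are down or lack a required feature.
    candidates = [
        p for p in providers
        if p.is_up and set(required_features) <= p.features
    ]
    if not candidates:
        return None
    # 2. Rank the survivors by the caller's stated priority.
    if prefer == "throughput":
        return max(candidates, key=lambda p: p.tokens_per_second)
    return min(candidates, key=lambda p: p.price_per_million_tokens)


# Made-up providers serving the same model.
providers = [
    Provider("provider-a", True, 0.50, 80.0, {"tools"}),
    Provider("provider-b", True, 0.30, 40.0, set()),
    Provider("provider-c", False, 0.10, 200.0, {"tools"}),
]
cheapest = pick_provider(providers)
fastest = pick_provider(providers, prefer="throughput")
with_tools = pick_provider(providers, required_features={"tools"})
```

In this sketch, a provider that is down is never selected, no matter how cheap or fast it is, which is the load-balancing behavior behind the uptime gain mentioned in the conversation.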

340
00:24:10,070 --> 00:24:13,550
Alex:
We really focus hard on efficiency and performance.

341
00:24:14,110 --> 00:24:19,490
Alex:
We only add about 20 to 25 milliseconds of latency on top of your request.

342
00:24:19,650 --> 00:24:22,810
Alex:
It all gets deployed very close to your servers at the edge.

343
00:24:25,010 --> 00:24:29,930
Alex:
Overall, we just stack providers.

344
00:24:29,930 --> 00:24:37,290
Alex:
We figure out what you can benefit from that everybody else is doing and just

345
00:24:37,290 --> 00:24:43,230
Alex:
give you the power of big data as a developer just accessing your model choice.

346
00:24:44,030 --> 00:24:48,490
Josh:
So it kind of allows you to harness the collective knowledge of everybody, right?

347
00:24:48,510 --> 00:24:51,150
Josh:
You get all of the data, you have all of the queries, you know which yields

348
00:24:51,150 --> 00:24:54,710
Josh:
the best result, and you're able to deliver the best product for them.

349
00:24:54,850 --> 00:24:59,450
Josh:
Now, in terms of actual LLMs, Ejaaz has actually pulled this up just before, which is a leaderboard.

350
00:24:59,610 --> 00:25:03,550
Josh:
And I'm interested in how you guys think about LLMs, which are the best,

351
00:25:03,710 --> 00:25:06,510
Josh:
how to benchmark them, and how you route people through them.

352
00:25:06,810 --> 00:25:10,430
Josh:
Is there a specific... Do you believe that benchmarks are accurate,

353
00:25:10,550 --> 00:25:13,370
Josh:
and do you reflect those in the way that you route traffic through these models?

354
00:25:13,650 --> 00:25:25,270
Alex:
In general, we have taken the stance that we want to be the capitalist benchmark for models.

355
00:25:25,350 --> 00:25:27,530
Alex:
What is actually happening?

356
00:25:27,990 --> 00:25:36,090
Alex:
And part of this is that I really think both the law of large numbers and the

357
00:25:36,090 --> 00:25:42,330
Alex:
enthusiasm of power users are really, really valuable for everybody else.

358
00:25:42,330 --> 00:25:45,950
Alex:
Like when you're routing to

359
00:25:45,950 --> 00:25:50,050
Alex:
like, Claude,

360
00:25:50,050 --> 00:25:52,890
Alex:
let's say you're routing to Claude 4 and you're

361
00:25:52,890 --> 00:25:56,230
Alex:
based in Europe. You

362
00:25:56,230 --> 00:26:00,910
Alex:
know, all of a sudden there might be a huge variance in throughput from

363
00:26:00,910 --> 00:26:05,150
Alex:
one of the providers, and you're only able to detect that if some other

364
00:26:05,150 --> 00:26:09,970
Alex:
users have discovered it before you. And so we route around the provider that's

365
00:26:09,970 --> 00:26:13,670
Alex:
running kind of slow in Europe and send you,

366
00:26:13,910 --> 00:26:15,890
Alex:
if your data policies allow it,

367
00:26:16,190 --> 00:26:18,170
Alex:
to a much faster provider somewhere else.

368
00:26:18,230 --> 00:26:22,050
Alex:
And that allows you to get faster performance. So, like, um...

369
00:26:23,100 --> 00:26:26,420
Alex:
That's, like, on the provider level, how, like, numbers help.

370
00:26:26,600 --> 00:26:30,660
Alex:
On the, like, model selection level, like, what you see on this rankings page

371
00:26:30,660 --> 00:26:36,560
Alex:
here, power users will, like, when we put up a model, like, we put up a new

372
00:26:36,560 --> 00:26:39,280
Alex:
model today from a new model lab called Z.ai,

373
00:26:40,000 --> 00:26:43,080
Alex:
like, the power users instantly discover it.

374
00:26:43,080 --> 00:26:51,020
Alex:
We have this LLM enthusiast community that dives in and really figures out what

375
00:26:51,020 --> 00:26:54,260
Alex:
a model is good for along a bunch of core use cases.

376
00:26:54,440 --> 00:26:59,160
Alex:
The power users figure out which workloads are interesting, and then you can

377
00:26:59,160 --> 00:27:04,440
Alex:
just see in the data what they're doing. And everybody can benefit from it.

378
00:27:04,860 --> 00:27:10,380
Alex:
That's why we open up our data and share it for free on the rankings page here.

379
00:27:11,080 --> 00:27:16,080
Ejaaz:
I'm seeing this one consistent unit across all these rankings,

380
00:27:16,480 --> 00:27:18,540
Ejaaz:
Alex, which is tokens, right?

381
00:27:18,740 --> 00:27:23,360
Ejaaz:
And Josh and I have spoken about this on the show before, but I'm wondering

382
00:27:23,360 --> 00:27:28,660
Ejaaz:
how, like you've chosen this specific unit to measure how good or effective

383
00:27:28,660 --> 00:27:31,420
Ejaaz:
these models are or how consumed or used they are.

384
00:27:32,100 --> 00:27:35,720
Ejaaz:
Can you tell us a bit more as to why you picked this particular unit and what

385
00:27:35,720 --> 00:27:41,440
Ejaaz:
that tells you as, like, the OpenRouter platform as to how a user is using a particular model?

386
00:27:41,740 --> 00:27:44,820
Alex:
Yeah, I think dollars is a good metric too.

387
00:27:45,620 --> 00:27:55,260
Alex:
The reason we chose tokens is primarily because we were seeing prices come down really quickly.

388
00:27:57,440 --> 00:28:02,140
Alex:
OpenRouter has been around since the beginning of 2023.

389
00:28:02,140 --> 00:28:11,780
Alex:
And I didn't want a model to be penalized in the rankings just because the prices

390
00:28:11,780 --> 00:28:16,820
Alex:
are going down really dramatically.

391
00:28:17,890 --> 00:28:26,090
Alex:
There's a paradox called Jevons paradox, which is that when prices decrease like 10x,

392
00:28:26,850 --> 00:28:35,090
Alex:
users' use of some component of infrastructure increases by more than 10x.

393
00:28:35,390 --> 00:28:37,930
Alex:
And so maybe they didn't get 10x at all.

394
00:28:39,010 --> 00:28:42,490
Alex:
But I thought there were some other advantages to using tokens,

395
00:28:42,550 --> 00:28:46,990
Alex:
too. Tokens don't have this penalty and don't rely on Jevons paradox,

396
00:28:47,070 --> 00:28:48,690
Alex:
which can have a lot of lag.

397
00:28:49,590 --> 00:28:53,230
Alex:
They also are a little bit of a proxy for time.

398
00:28:53,610 --> 00:29:01,970
Alex:
If a model is generating a lot of tokens and doing so for a while across a lot of users,

399
00:29:02,090 --> 00:29:06,190
Alex:
it means that a lot of people are reading those tokens and actually doing something with them.

400
00:29:06,510 --> 00:29:11,530
Alex:
And same goes for input. But if I really want to send an enormous number of

401
00:29:11,530 --> 00:29:16,250
Alex:
documents and the model has a really, really, really tiny prompt pricing,

402
00:29:16,250 --> 00:29:18,990
Alex:
I think that's still valuable and something that we want to see.

403
00:29:19,110 --> 00:29:22,930
Alex:
We want to see that this model is processing an enormous number of documents.

404
00:29:23,590 --> 00:29:26,110
Alex:
That's a use case that should show up in the rankings.

405
00:29:27,550 --> 00:29:33,670
Alex:
And so we decided to go with tokens. We might add dollars in the future,

406
00:29:33,670 --> 00:29:41,910
Alex:
but I think tokens, you know, don't have this Jevons paradox lag.

407
00:29:42,750 --> 00:29:48,690
Alex:
And there wasn't anything else. Like nobody was doing any kind of like overall analytics.

408
00:29:48,690 --> 00:29:54,850
Alex:
We didn't see any other company even do it until Google did a few months ago

409
00:29:54,850 --> 00:29:58,590
Alex:
where they started publishing the total amount of tokens processed by Gemini.

410
00:30:00,380 --> 00:30:06,420
Alex:
So we'll see which use cases really need dollars.

411
00:30:06,760 --> 00:30:09,080
Alex:
But tokens have been holding up pretty well.
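
The ranking argument above can be made concrete with a small worked example. The figures below are entirely hypothetical: a model's price drops 10x, and usage has so far grown only 4x because demand lags the price cut. Ranked by dollars, the model appears to shrink; ranked by tokens, it appears to grow.

```python
# Hypothetical figures for one model before and after a 10x price cut.
price_before = 10.0    # USD per million tokens
price_after = 1.0
tokens_before = 100.0  # millions of tokens per week
tokens_after = 400.0   # usage grew 4x, less than 10x so far

# Dollar volume is price times token volume.
spend_before = price_before * tokens_before
spend_after = price_after * tokens_after

# Growth factors under each ranking metric.
token_growth = tokens_after / tokens_before
spend_growth = spend_after / spend_before

print(f"token volume changed by {token_growth:.1f}x")
print(f"dollar volume changed by {spend_growth:.1f}x")
```

Under these assumed numbers, a dollar-based ranking would penalize the model until usage catches up with the price cut, which is the lag the conversation attributes to relying on Jevons paradox.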

412
00:30:09,800 --> 00:30:14,660
Ejaaz:
Yeah, I mean, this dashboard is awesome. And I recommend anyone that's listening

413
00:30:14,660 --> 00:30:19,640
Ejaaz:
to this that can't see our screen to get on OpenRouter's website and check it out.

414
00:30:20,100 --> 00:30:25,180
Ejaaz:
I've been following it for the last two weeks kind of pretty rigorously, Alex.

415
00:30:25,400 --> 00:30:28,140
Ejaaz:
And what I love is you can literally see...

416
00:30:28,580 --> 00:30:31,520
Ejaaz:
So two weeks ago Grok 4 got released right

417
00:30:31,520 --> 00:30:34,220
Ejaaz:
and Josh and I were making a ton of videos on this we were

418
00:30:34,220 --> 00:30:38,020
Ejaaz:
using it with pretty much everything that we could do and

419
00:30:38,020 --> 00:30:42,680
Ejaaz:
then this other model came out of China pretty much a few days after called

420
00:30:42,680 --> 00:30:47,500
Ejaaz:
Kimi K2 and I was like oh yeah whatever this is just some random Chinese model

421
00:30:47,500 --> 00:30:51,360
Ejaaz:
I'm not going to focus on it and then I kept seeing it in my feed and I thought

422
00:30:51,360 --> 00:30:55,560
Ejaaz:
okay maybe I'll give this a go and I kind of like went straight to OpenRouter and just

423
00:30:55,580 --> 00:31:02,760
Ejaaz:
almost gauge the interest from a wider set of AI users. And I saw that it was skyrocketing, right?

424
00:31:03,200 --> 00:31:06,920
Ejaaz:
And then I saw that Qwen dropped their models last week.

425
00:31:07,060 --> 00:31:10,920
Ejaaz:
And again, I came to OpenRouter and it preceded the trend, right?

426
00:31:11,040 --> 00:31:14,600
Ejaaz:
People had already started using it. So I love how you describe OpenRouter

427
00:31:14,600 --> 00:31:17,740
Ejaaz:
as this kind of like prophetic orb,

428
00:31:18,060 --> 00:31:22,300
Ejaaz:
basically, where the enthusiasts and the community itself can kind of like front

429
00:31:22,300 --> 00:31:26,200
Ejaaz:
run very popular trends. And I think that's a very powerful moat.

430
00:31:26,300 --> 00:31:33,160
Ejaaz:
And kind of on this path, Alex, I noticed that a lot of these major model providers

431
00:31:33,160 --> 00:31:34,980
Ejaaz:
see the value in this, right?

432
00:31:35,120 --> 00:31:41,440
Ejaaz:
So if I'm not mistaken, OpenAI kind of like used your platform to kind of secretly

433
00:31:41,440 --> 00:31:45,860
Ejaaz:
launch their frontier model before they officially launched it, right?

434
00:31:46,220 --> 00:31:50,820
Ejaaz:
Can you walk us through, you know, how that comes about and more importantly,

435
00:31:51,100 --> 00:31:53,820
Ejaaz:
why they want to do that and why they chose OpenRouter to do that?

436
00:31:54,200 --> 00:31:58,660
Alex:
OpenAI will sometimes

437
00:31:58,660 --> 00:32:01,680
Alex:
give early access

438
00:32:01,680 --> 00:32:04,760
Alex:
to their models to some of their customers for

439
00:32:04,760 --> 00:32:08,780
Alex:
testing, and we asked them if they

440
00:32:08,780 --> 00:32:14,360
Alex:
wanted to try a stealth model with us, which we had never done before. It involved

441
00:32:14,360 --> 00:32:21,620
Alex:
launching it under another name and seeing how users respond to it without

442
00:32:21,620 --> 00:32:28,760
Alex:
having any bias or inclination for or against the model at the outset.

443
00:32:30,160 --> 00:32:35,820
Alex:
And it would be like a new way of testing it and a new way of...

444
00:32:35,820 --> 00:32:38,380
Alex:
It was like an experiment for both us and them.

445
00:32:38,560 --> 00:32:47,080
Alex:
And they generously decided to take the leap of faith and try it. And we...

446
00:32:48,330 --> 00:32:51,630
Alex:
launched GPT-4.1 with

447
00:32:51,630 --> 00:32:55,530
Alex:
them, and we called it Quasar Alpha, and

448
00:32:55,530 --> 00:32:59,750
Alex:
it was a million

449
00:32:59,750 --> 00:33:02,790
Alex:
token context length model, OpenAI's first very,

450
00:33:02,790 --> 00:33:07,030
Alex:
very long context model, and it was also optimized

451
00:33:07,030 --> 00:33:11,890
Alex:
for coding. And

452
00:33:11,890 --> 00:33:14,810
Alex:
there were a couple of incredible things that happened. First,

453
00:33:14,810 --> 00:33:18,030
Alex:
we have this community of benchmarkers

454
00:33:18,030 --> 00:33:20,870
Alex:
that run open source benchmarks, and we give

455
00:33:20,870 --> 00:33:23,970
Alex:
a lot of them grants to help fund the benchmarks,

456
00:33:23,970 --> 00:33:27,350
Alex:
grants of OpenRouter tokens. They'll just run the

457
00:33:27,350 --> 00:33:31,170
Alex:
suite of tests against all the models, and some of them are very creative. Like,

458
00:33:31,170 --> 00:33:37,030
Alex:
there's one that tests the ability to generate fiction. There's one that

459
00:33:37,030 --> 00:33:44,570
Alex:
tests whether it can make a 3D object in Minecraft, called MCBench.

460
00:33:45,850 --> 00:33:48,970
Alex:
There are a few that test different types of coding proficiency.

461
00:33:50,030 --> 00:33:53,530
Alex:
There's one that just focuses on how good it is at Ruby, because Ruby is,

462
00:33:54,150 --> 00:33:56,250
Alex:
turns out a lot of the models are not great at Ruby.

463
00:33:56,410 --> 00:33:59,450
Alex:
There are a lot of like languages that all the models are pretty bad at.

464
00:34:00,510 --> 00:34:04,590
Alex:
And so we have this, like, long tail of very niche benchmarks.

465
00:34:04,590 --> 00:34:11,170
Alex:
And all the benchmarkers ran, you know, for free their benchmarks on Quasar

466
00:34:11,170 --> 00:34:15,170
Alex:
Alpha and found pretty incredible results for most of them.

467
00:34:16,030 --> 00:34:21,230
Alex:
And so the model got like, you know, OpenAI got this feedback in real time.

468
00:34:21,390 --> 00:34:25,170
Alex:
We kind of like helped them find it.

469
00:34:25,330 --> 00:34:31,910
Alex:
And they made another snapshot, which we launched as Optimus Alpha.

470
00:34:32,610 --> 00:34:35,850
Alex:
And they could compare the feedback that they got from the two snapshots.

471
00:34:36,860 --> 00:34:43,100
Alex:
And then, like two weeks later, they launched GPT-4.1 live for everybody.

472
00:34:43,660 --> 00:34:47,960
Alex:
So it was, like, an experiment for us.

473
00:34:48,060 --> 00:34:54,880
Alex:
And we've done it again since with another model provider

474
00:34:54,880 --> 00:34:57,900
Alex:
that's still working on it.

475
00:34:58,440 --> 00:35:03,360
Alex:
And it's kind of a cool way of crowdsourcing

476
00:35:03,360 --> 00:35:08,460
Alex:
benchmarks that you wouldn't have expected, and also getting unbiased community sentiment.

477
00:35:09,540 --> 00:35:13,740
Josh:
That's great. So now when we see a new model pop up and we want to test GPT-5,

478
00:35:13,920 --> 00:35:15,260
Josh:
we know where to come to to try it early.

479
00:35:16,020 --> 00:35:19,880
Josh:
We'll see because rumor is it's coming soon. So we'll be, we're on your watch list.

480
00:35:20,500 --> 00:35:23,740
Josh:
But I do want to ask you about open source versus closed source because

481
00:35:23,740 --> 00:35:25,920
Josh:
this has been an important thing for us. We talk about this a lot.

482
00:35:26,080 --> 00:35:27,900
Josh:
You have a ton of data on this.

483
00:35:28,100 --> 00:35:30,680
Josh:
I'm looking at the leaderboards there. There are open source models that are

484
00:35:30,680 --> 00:35:31,940
Josh:
doing very well, closed source.

485
00:35:32,200 --> 00:35:36,120
Josh:
What are your takes in general? How do you feel about open source versus closed

486
00:35:36,120 --> 00:35:40,240
Josh:
source models, particularly around how you serve them to users?

487
00:35:40,800 --> 00:35:47,180
Alex:
Both models, both types of models have supply problems, but the supply problems are very different.

488
00:35:47,400 --> 00:35:50,860
Alex:
Typically, what we see with closed source models is that there are very

489
00:35:50,860 --> 00:35:54,060
Alex:
few suppliers, usually just one or two.

490
00:35:54,860 --> 00:35:57,980
Alex:
Like with Grok, for example, there's Grok Direct and there's Azure.

491
00:35:58,800 --> 00:36:04,120
Alex:
With Anthropic, there's Anthropic direct, there's Google Vertex, there's AWS

492
00:36:04,120 --> 00:36:08,680
Alex:
Bedrock. And then we also deploy it in different regions. Like, we have

493
00:36:08,680 --> 00:36:14,080
Alex:
an EU deployment for customers who only want their data to stay in the EU,

494
00:36:15,600 --> 00:36:18,500
Alex:
and we do custom deployments for

495
00:36:18,500 --> 00:36:21,620
Alex:
the closed source models too, to just kind of guarantee good

496
00:36:21,620 --> 00:36:25,580
Alex:
throughput and high rate limits for people.

497
00:36:25,580 --> 00:36:29,660
Alex:
But the

498
00:36:29,660 --> 00:36:39,020
Alex:
tricky part is that the closed source

499
00:36:39,020 --> 00:36:44,000
Alex:
models are doing most of the tokens on OpenRouter. It's dominant. You

500
00:36:44,000 --> 00:36:48,500
Alex:
know, it's probably 70 to 80 percent closed source tokens today.

501
00:36:49,800 --> 00:36:57,560
Alex:
But the open source models have a much more fragmented supply,

502
00:36:58,500 --> 00:37:02,140
Alex:
like a sell-side order book, and

503
00:37:02,140 --> 00:37:05,940
Alex:
the rate limits for each provider are

504
00:37:05,940 --> 00:37:09,460
Alex:
less stable on average. It

505
00:37:09,460 --> 00:37:12,920
Alex:
usually takes a while for the hyperscalers to serve a

506
00:37:12,920 --> 00:37:18,520
Alex:
new open source model. So the load balancing work

507
00:37:18,520 --> 00:37:24,260
Alex:
that we do on open source models tends to be a lot more valuable. The load

508
00:37:24,260 --> 00:37:28,080
Alex:
balancing work that we do for closed source models tends to be very focused

509
00:37:28,080 --> 00:37:30,600
Alex:
on caching and feature awareness,

510
00:37:31,060 --> 00:37:36,420
Alex:
making sure you're getting clean cache hits and only transitioning over to new

511
00:37:36,420 --> 00:37:38,800
Alex:
providers when your cache is expired.

512
00:37:39,960 --> 00:37:45,780
Alex:
For open source models, there's way less caching. Very, very few open source

513
00:37:45,780 --> 00:37:47,120
Alex:
models implement caching.

514
00:37:48,100 --> 00:37:52,540
Alex:
And so switching between providers becomes more common. And

515
00:37:52,540 --> 00:37:55,340
Alex:
we also track a

516
00:37:55,340 --> 00:37:58,020
Alex:
lot of quality differences between the open

517
00:37:58,020 --> 00:38:00,920
Alex:
source providers. Some of them will deploy at lower

518
00:38:00,920 --> 00:38:04,580
Alex:
quantization levels, which is kind of a way of compressing

519
00:38:04,580 --> 00:38:11,380
Alex:
the model. It generally doesn't have an impact on the quality of the output,

520
00:38:11,380 --> 00:38:18,420
Alex:
and yet we still see some odd things from some of the open source providers.

521
00:38:18,760 --> 00:38:25,260
Alex:
And so we run tests internally to detect those outputs. And we're building up

522
00:38:25,260 --> 00:38:26,500
Alex:
a lot more muscle here soon.

523
00:38:27,160 --> 00:38:32,080
Alex:
So that they get pulled out of the routing lane and don't affect anyone.

524
00:38:33,090 --> 00:38:36,850
Josh:
So closed source accounts for 80% or something like that, a very large amount.

525
00:38:36,990 --> 00:38:37,810
Josh:
Do you see that changing?

526
00:38:38,050 --> 00:38:41,570
Josh:
Because that post we just had, it said nine out of the 10 fastest growing LLMs

527
00:38:41,570 --> 00:38:43,410
Josh:
last week, they were open source.

528
00:38:43,670 --> 00:38:47,090
Josh:
And every time it seems like China comes out with another model,

529
00:38:47,210 --> 00:38:54,130
Josh:
it was Kimi K2 a week or two ago, it kind of really pushes the frontier of open source forward.

530
00:38:54,310 --> 00:38:58,430
Josh:
And the rate of acceleration of open source seems to be as fast,

531
00:38:58,530 --> 00:39:02,110
Josh:
if not faster than closed source, where it's just, it's making these improvements very quickly.

532
00:39:02,310 --> 00:39:06,910
Josh:
It has the benefit of being able to compound in speed because it's open source

533
00:39:06,910 --> 00:39:07,930
Josh:
and everyone can contribute.

534
00:39:08,310 --> 00:39:11,630
Josh:
Do you think that starts to change where the percentage of tokens you're issuing

535
00:39:11,630 --> 00:39:14,130
Josh:
are from open source models versus closed source?

536
00:39:14,290 --> 00:39:17,030
Josh:
Or do you continue to see a trend where it's going to be Google,

537
00:39:17,230 --> 00:39:20,170
Josh:
it's going to be OpenAI that are serving a majority of these tokens to users?

538
00:39:20,450 --> 00:39:25,550
Alex:
In the short term, we're likely to see open source models continue to dominate

539
00:39:25,550 --> 00:39:29,230
Alex:
the fastest growing model category on OpenRouter.

540
00:39:29,230 --> 00:39:36,170
Alex:
And the reason for that is that a lot of users who come for a closed source

541
00:39:36,170 --> 00:39:40,770
Alex:
model, but then decide they want to optimize later,

542
00:39:41,070 --> 00:39:49,150
Alex:
either they want to save on costs or try out a new model that's supposed to

543
00:39:49,150 --> 00:39:54,890
Alex:
be a little bit better in some direction that their app cares about or their use case cares about,

544
00:39:55,230 --> 00:39:58,730
Alex:
then they leave the closed source model and go to an open source model.

545
00:39:58,730 --> 00:40:02,730
Alex:
So open source tends to be like a last mile optimization thing,

546
00:40:03,150 --> 00:40:07,350
Alex:
making a big generalization because the reverse can happen too.

547
00:40:08,510 --> 00:40:11,870
Alex:
And so because it's a last mile optimization thing,

548
00:40:12,350 --> 00:40:17,110
Alex:
the jump from this model is not being used at all to this model is really being

549
00:40:17,110 --> 00:40:19,550
Alex:
used by a couple of people who have

550
00:40:19,550 --> 00:40:27,290
Alex:
left Claude 4 and want to try some new coding use case will be bigger.

551
00:40:29,270 --> 00:40:34,090
Alex:
Than the closed-source models, which start at a really high base and don't have

552
00:40:34,090 --> 00:40:35,450
Alex:
growth quite as dramatic.

553
00:40:36,790 --> 00:40:43,590
Alex:
So the other part of your question, though, was whether there's going to be like a flippening of.

554
00:40:43,590 --> 00:40:48,790
Josh:
Closed, or some sort of, like, chipping away at that monopoly of closed source tokens.

555
00:40:49,070 --> 00:40:52,050
Alex:
It's hard to predict these things because, you know,

556
00:40:52,130 --> 00:40:58,550
Alex:
I think the biggest problem today with open source models is that the

557
00:40:58,550 --> 00:41:05,150
Alex:
incentives are not as strong. Like, the model lab and the model provider,

558
00:41:05,150 --> 00:41:08,010
Alex:
you know, there are

559
00:41:08,010 --> 00:41:10,750
Alex:
sort of established incentives for how to

560
00:41:10,750 --> 00:41:19,050
Alex:
grow as a company and attract good, high-quality AI talent, and giving

561
00:41:19,050 --> 00:41:27,650
Alex:
the model weights away impairs those incentives. Now, this

562
00:41:27,650 --> 00:41:29,810
Alex:
is where we might see decentralized providers

563
00:41:30,990 --> 00:41:32,730
Alex:
helping in the future.

564
00:41:33,950 --> 00:41:35,390
Alex:
A way for like,

565
00:41:36,490 --> 00:41:40,050
Alex:
you know, a really good incentive scheme that

566
00:41:40,050 --> 00:41:43,190
Alex:
allows high-quality talent

567
00:41:43,190 --> 00:41:46,870
Alex:
to work on an open source model

568
00:41:46,870 --> 00:41:54,070
Alex:
that remains open weights, at least, could fix this. You know,

569
00:41:54,070 --> 00:41:58,810
Alex:
I try to stay close to the decentralized providers and

570
00:41:58,810 --> 00:42:02,810
Alex:
learn a lot from them. On the provider side, on

571
00:42:02,810 --> 00:42:06,850
Alex:
running inference, I think there are some really cool incentive schemes being worked on.

572
00:42:07,010 --> 00:42:12,150
Alex:
But on actually developing the models themselves, I haven't seen too much, unfortunately.

573
00:42:12,770 --> 00:42:19,830
Alex:
So I think if we see one, a flippening is on the radar. And until we do, I personally doubt it.

574
00:42:20,390 --> 00:42:24,710
Josh:
TBD. Do you have personal takes on how you feel about open source versus closed source?

575
00:42:24,870 --> 00:42:28,810
Josh:
Because this has been a huge topic we've been debating too. It's just the ethical

576
00:42:28,810 --> 00:42:32,570
Josh:
concerns around alignment and closed source models versus open source.

577
00:42:32,570 --> 00:42:35,570
Josh:
When you look at the competitors, China, generally speaking,

578
00:42:35,790 --> 00:42:39,670
Josh:
is associated with open source, whereas the United States is generally associated with closed source.

579
00:42:39,810 --> 00:42:45,310
Josh:
And we saw Llama and Meta release the open source models, but now they're raising

580
00:42:45,310 --> 00:42:49,470
Josh:
a ton of money to pay a lot of employees a lot of money to probably develop a closed source model.

581
00:42:49,570 --> 00:42:52,810
Josh:
So it seems like the trends are kind of split between US and China.

582
00:42:53,010 --> 00:42:55,810
Josh:
And I'm curious if you have any personal takes, even outside of OpenRouter,

583
00:42:55,950 --> 00:43:00,730
Josh:
of which you think serves better for the long term outlook on,

584
00:43:00,850 --> 00:43:05,150
Josh:
I mean, the position of the United States or just the general safety and alignment

585
00:43:05,150 --> 00:43:06,410
Josh:
conversation around AI?

586
00:43:06,710 --> 00:43:12,690
Alex:
I mean, like a very simple fundamental difference between the two is that an

587
00:43:12,690 --> 00:43:17,410
Alex:
innovation in open source models can be copied more quickly than an innovation

588
00:43:17,410 --> 00:43:18,550
Alex:
in closed source models.

589
00:43:20,950 --> 00:43:26,290
Alex:
So in terms of velocity and like how far ahead one is over the other,

590
00:43:26,710 --> 00:43:29,570
Alex:
that is like a massive structural difference.

591
00:43:29,570 --> 00:43:37,210
Alex:
That means that closed source models should be theoretically always ahead until

592
00:43:37,210 --> 00:43:40,890
Alex:
a really interesting incentive scheme develops, like I mentioned before.

593
00:43:41,890 --> 00:43:47,690
Alex:
I think, you know, I don't see evidence that that's

594
00:43:47,690 --> 00:43:51,710
Alex:
going to change in terms of China versus the U.S.

595
00:43:52,430 --> 00:44:01,190
Alex:
I think it's very interesting that China has not had a major closed source model.

596
00:44:01,190 --> 00:44:04,270
Alex:
And I don't really

597
00:44:04,270 --> 00:44:08,050
Alex:
see a great reason why. I'm

598
00:44:08,050 --> 00:44:11,470
Alex:
not aware of any reasons that's not going

599
00:44:11,470 --> 00:44:14,590
Alex:
to be the case in the future. My prediction

600
00:44:14,590 --> 00:44:18,410
Alex:
is that there's going to be a closed source model from China.

601
00:44:18,410 --> 00:44:29,390
Alex:
And, you know, it's possible that DeepSeek and Moonshot

602
00:44:29,390 --> 00:44:36,350
Alex:
and Qwen have built up really sticky talent pools.

603
00:44:36,950 --> 00:44:41,470
Alex:
But generally with talent pools, after enough years have passed,

604
00:44:41,930 --> 00:44:47,850
Alex:
people quit and go and create new companies and build new talent pools.

605
00:44:49,530 --> 00:44:54,670
Alex:
And so we should see some of that. It's not the case that the AI space has NDAs

606
00:44:54,670 --> 00:44:58,470
Alex:
or non-competes that the hedge fund space has.

607
00:45:00,590 --> 00:45:04,870
Alex:
That might happen in the future too. But assuming that the current non-compete

608
00:45:04,870 --> 00:45:11,190
Alex:
culture continues, there should be more companies that pop up in China over time.

609
00:45:11,350 --> 00:45:13,290
Alex:
And I'm betting that some of them will be closed source.

610
00:45:13,530 --> 00:45:18,190
Alex:
And my guess is that the two nations will start to look more similar.

611
00:45:18,630 --> 00:45:24,350
Ejaaz:
Yeah, I guess that's why you have Zuck dishing out 300 mil to a billion dollar

612
00:45:24,350 --> 00:45:27,010
Ejaaz:
salary offers to a bunch of these guys, right?

613
00:45:27,290 --> 00:45:31,710
Ejaaz:
One more question on China versus the US. I kind of agree with you.

614
00:45:31,870 --> 00:45:37,210
Ejaaz:
I didn't really expect China to be the one to lead open source anything,

615
00:45:37,210 --> 00:45:40,870
Ejaaz:
let alone the most important technology of our time.

616
00:45:41,010 --> 00:45:44,450
Ejaaz:
What do you think is their secret sauce for building these models, Alex?

617
00:45:44,730 --> 00:45:48,270
Ejaaz:
And I know this might be out of the forte of

618
00:45:48,270 --> 00:45:52,850
Ejaaz:
OpenRouter specifically, but as someone who has studied this technology for

619
00:45:52,850 --> 00:45:59,590
Ejaaz:
a while now, I'm struggling to figure out, you know, what advantage they had. You

620
00:45:59,590 --> 00:46:03,070
Ejaaz:
know, they're discovering all these new techniques, and maybe the simple answer

621
00:46:03,070 --> 00:46:06,270
Ejaaz:
is, like, constraints, right? They don't have access to all of

622
00:46:06,850 --> 00:46:11,350
Ejaaz:
Nvidia's chips, they don't have access to infinite compute, so then maybe they're

623
00:46:11,350 --> 00:46:14,410
Ejaaz:
forced to kind of like figure out other ways around the same kinds of problems

624
00:46:14,410 --> 00:46:19,930
Ejaaz:
that Western companies are focused on. But it's pretty clear that America, with all its funding,

625
00:46:20,230 --> 00:46:23,570
Ejaaz:
hasn't been able to make these frontier breakthroughs.

626
00:46:23,650 --> 00:46:29,450
Ejaaz:
So I'm curious whether you are aware of or know some kind of technical moat

627
00:46:29,450 --> 00:46:33,910
Ejaaz:
that Chinese AI researchers or these AI teams that are featuring on OpenRouter

628
00:46:33,910 --> 00:46:36,870
Ejaaz:
day in and day out have over the U.S.?

629
00:46:41,080 --> 00:46:46,060
Alex:
Well, I don't know.

630
00:46:46,440 --> 00:46:52,500
Alex:
There are certainly some that they've come up with. Like, DeepSeek had a

631
00:46:52,500 --> 00:46:57,940
Alex:
lot of very cool inference innovations that they published in their paper.

632
00:46:58,100 --> 00:47:04,100
Alex:
But a lot of what they published in the original R1 paper were things that OpenAI

633
00:47:04,100 --> 00:47:07,860
Alex:
had done independently themselves many months before.

634
00:47:07,860 --> 00:47:11,760
Alex:
So, like,

635
00:47:11,760 --> 00:47:14,840
Alex:
on the inference side and on

636
00:47:14,840 --> 00:47:18,000
Alex:
some of the model side, I think, like, DeepSeek, we

637
00:47:18,000 --> 00:47:21,240
Alex:
had talked to their team for years before R1 came

638
00:47:21,240 --> 00:47:24,560
Alex:
out. They had many models before that, and

639
00:47:24,560 --> 00:47:27,560
Alex:
they were always like a pretty sharp optimum like

640
00:47:27,560 --> 00:47:30,260
Alex:
team for doing inference. Like, they

641
00:47:30,260 --> 00:47:33,480
Alex:
came up with like the best user experience for caching prompts

642
00:47:33,480 --> 00:47:41,360
Alex:
long before DeepSeek R1 came out, and they had very good pricing. They

643
00:47:41,360 --> 00:47:47,540
Alex:
were, you know, by far the strongest Chinese team that

644
00:47:47,540 --> 00:47:52,220
Alex:
we were aware of, well before that happened. And so I'm guessing there was some talent

645
00:47:53,960 --> 00:47:56,860
Alex:
accumulation that they were working on in China

646
00:47:56,860 --> 00:48:00,000
Alex:
for people who wanted to stay in China. And yeah, that's

647
00:48:00,000 --> 00:48:02,820
Alex:
a huge advantage. Like, American companies are obviously not

648
00:48:02,820 --> 00:48:06,840
Alex:
doing that. Zuck is very on

649
00:48:06,840 --> 00:48:09,740
Alex:
point that a lot of this is just based on talent.

650
00:48:09,740 --> 00:48:12,540
Alex:
A lot of

651
00:48:12,540 --> 00:48:16,040
Alex:
AI is open and out there and

652
00:48:16,040 --> 00:48:19,060
Alex:
very composable, like a big tree of knowledge.

653
00:48:19,060 --> 00:48:22,100
Alex:
There's a paper that comes out and it cites, like,

654
00:48:22,100 --> 00:48:25,180
Alex:
20 other papers, and you can go and read all

655
00:48:25,180 --> 00:48:28,180
Alex:
of the cited papers, and then you have kind of

656
00:48:28,180 --> 00:48:30,880
Alex:
a basis for understanding the paper. But you really have to

657
00:48:30,880 --> 00:48:33,680
Alex:
go one level deeper and read all the cited papers two levels

658
00:48:33,680 --> 00:48:37,700
Alex:
down to really understand what's going on. And it's

659
00:48:37,700 --> 00:48:40,620
Alex:
just that very few people can do that. And

660
00:48:40,620 --> 00:48:44,060
Alex:
it takes a lot of years of experience to actually

661
00:48:44,060 --> 00:48:46,960
Alex:
apply that knowledge and learn all these

662
00:48:46,960 --> 00:48:50,140
Alex:
things that have not been written in any paper at all.

663
00:48:50,140 --> 00:48:53,800
Alex:
And there's just such

664
00:48:53,800 --> 00:48:56,600
Alex:
a small number of people who can

665
00:48:56,600 --> 00:48:59,540
Alex:
really lead research on all the different dimensions that

666
00:48:59,540 --> 00:49:03,300
Alex:
go into making a model. And

667
00:49:03,300 --> 00:49:06,500
Alex:
the border between China and the U.S.

668
00:49:06,500 --> 00:49:09,420
Alex:
is pretty defined. You have to leave China, move to the U.S.,

669
00:49:09,420 --> 00:49:13,040
Alex:
and really establish yourself here. So

670
00:49:13,040 --> 00:49:16,020
Alex:
I do think there's country arbitrage,

671
00:49:16,020 --> 00:49:22,260
Alex:
there's, you know, the hedge fund background arbitrage,

672
00:49:22,260 --> 00:49:25,960
Alex:
there's hardware arbitrage. Like, there's a ton of hardware that's only available

673
00:49:25,960 --> 00:49:32,060
Alex:
in China but not here, and vice versa. That creates an opportunity, and this

674
00:49:32,060 --> 00:49:33,060
Alex:
will just continue to happen.

675
00:49:33,930 --> 00:49:37,610
Ejaaz:
Yeah, I think this arbitrage is fascinating.

676
00:49:37,910 --> 00:49:44,130
Ejaaz:
I read somewhere that there's probably fewer than 200 or 250 researchers in the

677
00:49:44,130 --> 00:49:49,030
Ejaaz:
world that are worthy of working at some of these frontier AI model labs.

678
00:49:49,170 --> 00:49:54,490
Ejaaz:
And I looked into some of the backgrounds of the team behind Kimi K2,

679
00:49:54,750 --> 00:50:01,430
Ejaaz:
which is this recent open source model out of China, which broke all these crazy rankings.

680
00:50:01,430 --> 00:50:04,990
Ejaaz:
I think it was like a trillion parameter model or something crazy like that.

681
00:50:05,190 --> 00:50:08,410
Ejaaz:
And a lot of them worked at some of the top American tech companies.

682
00:50:08,990 --> 00:50:11,710
Ejaaz:
And they all graduated from this one university in China.

683
00:50:11,870 --> 00:50:16,530
Ejaaz:
I think it's Tsinghua, which apparently is like, you know, the Harvard of AI

684
00:50:16,530 --> 00:50:18,850
Ejaaz:
in China, right? So pretty crazy.

685
00:50:19,850 --> 00:50:25,070
Ejaaz:
But Alex, I wanted to shift the focus of the conversation to a point that you

686
00:50:25,070 --> 00:50:29,110
Ejaaz:
brought up earlier in this episode, which is around data.

687
00:50:29,830 --> 00:50:34,690
Ejaaz:
Okay, so here's the context. Josh and I have spoken about this at length, right?

688
00:50:35,110 --> 00:50:39,790
Ejaaz:
We are obsessed with this feature on OpenAI, which is memory, right?

689
00:50:39,950 --> 00:50:43,950
Ejaaz:
And I know a lot of the other memory, sorry, a lot of the other AI models have memory as well.

690
00:50:44,050 --> 00:50:49,150
Ejaaz:
But the reason why we love it so much is I feel like the model knows me, Alex.

691
00:50:49,390 --> 00:50:54,390
Ejaaz:
I feel like it knows everything about me. It can personally curate any of my prompts.

692
00:50:54,610 --> 00:50:59,090
Ejaaz:
It just gets me. It knows what I want and it just serves it up to me on a platter

693
00:50:59,090 --> 00:51:01,490
Ejaaz:
and off I go, you know, doing my thing.

694
00:51:02,470 --> 00:51:07,410
Ejaaz:
Now, OpenRouter sits on top of, kind of, the query layer, right?

695
00:51:07,490 --> 00:51:12,210
Ejaaz:
So you have all these people writing all these weird and wonderful prompts and

696
00:51:12,210 --> 00:51:16,970
Ejaaz:
kind of routing it through on towards like different AI models.

697
00:51:18,830 --> 00:51:21,850
Ejaaz:
You hold all of that data or maybe you have access to all of that data.

698
00:51:21,930 --> 00:51:25,410
Ejaaz:
And I know you have something called private chat as well, where you don't have access to it.

699
00:51:25,930 --> 00:51:29,470
Ejaaz:
Talk to me about like what OpenRouter and what you guys are thinking about doing

700
00:51:29,470 --> 00:51:31,190
Ejaaz:
with this data, because presumably,

701
00:51:31,630 --> 00:51:36,290
Ejaaz:
or in my opinion, you guys have actually the best moat, arguably better than

702
00:51:36,290 --> 00:51:40,190
Ejaaz:
ChatGPT, because you have all these different types of prompts coming from all

703
00:51:40,190 --> 00:51:42,970
Ejaaz:
these different types of users for all these different types of models.

704
00:51:43,530 --> 00:51:47,610
Ejaaz:
So theoretically, you could spin up some of the most personal AI models for

705
00:51:47,610 --> 00:51:49,550
Ejaaz:
each individual user if you wanted to.

706
00:51:49,730 --> 00:51:52,970
Ejaaz:
Do I have that correct? Or am I, you know, speaking crazy?

707
00:51:54,650 --> 00:51:59,190
Alex:
No, that's true. No, it's something we're thinking about.

708
00:52:00,750 --> 00:52:04,310
Alex:
By default, your prompts are not logged at all.

709
00:52:05,130 --> 00:52:08,850
Alex:
We don't have prompts or completions for new users by default.

710
00:52:09,270 --> 00:52:11,930
Alex:
You have to toggle it on in settings.

711
00:52:15,030 --> 00:52:21,310
Alex:
But a lot of people do toggle it on. And as a result,

712
00:52:21,490 --> 00:52:25,970
Alex:
I think we have by far the largest multi-model prompt data set.

713
00:52:26,910 --> 00:52:32,190
Alex:
But as of today, we've barely done anything with it.

714
00:52:32,290 --> 00:52:36,710
Alex:
We classify a tiny, tiny, tiny subset of it. And that's what you see in the rankings page.

715
00:52:37,330 --> 00:52:43,090
Alex:
But what could be done on a per-account level is really

716
00:52:43,090 --> 00:52:45,510
Alex:
three main things.

717
00:52:45,650 --> 00:52:51,670
Alex:
One, memory right out of the box. You can get this today by combining

718
00:52:51,670 --> 00:52:56,210
Alex:
OpenRouter with a memory-as-a-service. We've got a couple of companies

719
00:52:56,210 --> 00:52:58,030
Alex:
that do this, like Mem0 and Supermemory.

720
00:52:59,270 --> 00:53:03,350
Alex:
And we can partner with one of those companies or do something similar and just

721
00:53:03,350 --> 00:53:04,910
Alex:
provide a lot of distribution.

722
00:53:05,130 --> 00:53:09,930
Alex:
And that basically gets you a ChatGPT-as-a-service where it feels like the

723
00:53:09,930 --> 00:53:14,950
Alex:
model really knows you and the right context gets added to your prompt.

724
00:53:16,710 --> 00:53:23,770
Alex:
The other things that we can do are help you select the right model more intelligently.

725
00:53:25,010 --> 00:53:31,930
Alex:
There are a lot of models where there's a super clear migration decision that needs to be made.

726
00:53:33,490 --> 00:53:36,330
Alex:
And we can just see this very clearly in the data.

727
00:53:36,650 --> 00:53:41,550
Alex:
But right now, you know, if we have a channel or some

728
00:53:41,550 --> 00:53:44,370
Alex:
kind of communication channel open with the customer, we can just tell them

729
00:53:44,370 --> 00:53:47,410
Alex:
like, hey, we know you're using this model a ton.

730
00:53:48,090 --> 00:53:51,650
Alex:
It's been deprecated. This model is significantly better. You

731
00:53:51,650 --> 00:53:54,690
Alex:
should move this kind of workload over to it. Or, like,

732
00:53:54,690 --> 00:53:58,350
Alex:
this workload, you'll get way better pricing if you do this.

733
00:53:58,350 --> 00:54:01,470
Alex:
And that's basically the

734
00:54:01,470 --> 00:54:04,230
Alex:
only sort of guidance and kind of like

735
00:54:04,230 --> 00:54:06,850
Alex:
opinionated routing we've done so far. And it could

736
00:54:06,850 --> 00:54:09,550
Alex:
be a lot more intelligent, a lot more out of the box, a lot more

737
00:54:09,550 --> 00:54:13,050
Alex:
built into the product. And then

738
00:54:13,050 --> 00:54:16,590
Alex:
the last thing

739
00:54:16,590 --> 00:54:19,250
Alex:
we can do, I mean, there's probably tons of

740
00:54:19,250 --> 00:54:22,690
Alex:
things we're not even thinking about, but

741
00:54:22,690 --> 00:54:26,050
Alex:
like getting really

742
00:54:26,050 --> 00:54:30,150
Alex:
really smart about how

743
00:54:30,150 --> 00:54:33,770
Alex:
models and providers are responding to prompts and

744
00:54:33,770 --> 00:54:37,830
Alex:
showing you just the really coolest

745
00:54:37,830 --> 00:54:40,750
Alex:
data. Just, like, telling you

746
00:54:40,750 --> 00:54:44,590
Alex:
what kinds of prompts are

747
00:54:44,590 --> 00:54:48,130
Alex:
going to which models and how those models are replying and

748
00:54:48,130 --> 00:54:51,190
Alex:
just, like, characterizing the reply in all kinds of interesting ways.

749
00:54:51,190 --> 00:54:54,090
Alex:
Like, did the model refuse to answer? What's the refusal rate?

750
00:54:54,090 --> 00:54:57,330
Alex:
Did the

751
00:54:57,330 --> 00:55:00,170
Alex:
model successfully make a tool call, or did it decide to

752
00:55:00,170 --> 00:55:02,990
Alex:
ignore all the tools that you passed in? That's a huge one.

753
00:55:02,990 --> 00:55:06,630
Alex:
Did the model pay

754
00:55:06,630 --> 00:55:12,410
Alex:
attention to its context? Did, you know, some kind of truncation

755
00:55:12,410 --> 00:55:16,830
Alex:
happen before you sent it to the model? So there's all kinds of

756
00:55:16,830 --> 00:55:24,410
Alex:
edge cases that cause developers' apps to just get dumber, and they're all detectable.

757
00:55:25,310 --> 00:55:31,270
Ejaaz:
I'm so happy you said that because I have this kind of like hot take,

758
00:55:31,410 --> 00:55:35,290
Ejaaz:
but maybe not so hot take, which is I actually think all the frontier models

759
00:55:35,290 --> 00:55:40,790
Ejaaz:
right now are good enough to do the craziest stuff ever for each user.

760
00:55:40,950 --> 00:55:44,870
Ejaaz:
But we just haven't been able to unlock it because it just doesn't have the context.

761
00:55:44,870 --> 00:55:48,250
Ejaaz:
Sure, you can attach it to a bunch of different tools and stuff,

762
00:55:48,410 --> 00:55:53,290
Ejaaz:
but if it doesn't know when to use the tool or how to process a certain prompt

763
00:55:53,290 --> 00:55:56,630
Ejaaz:
or if the users themselves don't know how to read

764
00:55:57,530 --> 00:56:01,170
Ejaaz:
the output of the AI model themselves, like you just said, we need some kind

765
00:56:01,170 --> 00:56:02,750
Ejaaz:
of analytics into all of this,

766
00:56:03,290 --> 00:56:06,710
Ejaaz:
then we're just kind of walking around like headless chickens almost.

767
00:56:07,670 --> 00:56:11,270
Ejaaz:
So I'm really happy that you said that. One other thing that I wanted to get

768
00:56:11,270 --> 00:56:15,950
Ejaaz:
your take on on the data side of things is, I just think this whole concept

769
00:56:15,950 --> 00:56:20,990
Ejaaz:
or notion of AI agents is becoming such a big trend, Alex.

770
00:56:21,290 --> 00:56:28,070
Ejaaz:
And I noticed a lot of frontier model labs release new models that kind of spin

771
00:56:28,070 --> 00:56:30,650
Ejaaz:
up several instances of their AI model.

772
00:56:30,770 --> 00:56:33,390
Ejaaz:
And they're tasked with a specific role, right?

773
00:56:33,750 --> 00:56:36,850
Ejaaz:
Okay, you're going to do the research. You're going to do the orchestrating.

774
00:56:37,090 --> 00:56:40,730
Ejaaz:
You're going to look online via a browser, blah, blah, blah,

775
00:56:40,730 --> 00:56:44,690
Ejaaz:
blah, blah. And then they coalesce together at the end of that little search

776
00:56:44,690 --> 00:56:48,130
Ejaaz:
and refine their answer and then present it to someone, right?

777
00:56:48,290 --> 00:56:51,750
Ejaaz:
You know, Grok 4 does this, Claude does this, and a few other models.

778
00:56:52,310 --> 00:56:58,570
Ejaaz:
I feel like with this data that you're describing, OpenRouter could be or could

779
00:56:58,570 --> 00:57:00,050
Ejaaz:
offer that as a feature, right?

780
00:57:00,150 --> 00:57:05,090
Ejaaz:
Which is essentially, you can now have super intuitive, context-rich agents

781
00:57:05,090 --> 00:57:08,490
Ejaaz:
that can do a lot more than just talk to you or answer your prompts.

782
00:57:08,490 --> 00:57:11,110
Ejaaz:
But they could probably do a bunch of other actions for you.

783
00:57:11,630 --> 00:57:17,790
Ejaaz:
Is that a fair take, or is that something that maybe might be out of the realm of OpenRouter?

784
00:57:18,550 --> 00:57:22,430
Alex:
Our strategy is to be the best inference layer for agents.

785
00:57:23,090 --> 00:57:29,370
Alex:
And what I think developers want...

786
00:57:30,490 --> 00:57:33,770
Alex:
Is control over how their agents work.

787
00:57:35,130 --> 00:57:42,310
Alex:
And our developers at least want to use us as a single pane of glass for doing

788
00:57:42,310 --> 00:57:48,270
Alex:
inference, but they want to see and control the way an agent looks.

789
00:57:48,470 --> 00:57:52,110
Alex:
An agent is basically just something

790
00:57:52,110 --> 00:57:57,450
Alex:
that is doing inference in a loop and controlling the direction it goes.

791
00:57:57,450 --> 00:58:01,150
Alex:
So what

792
00:58:01,150 --> 00:58:04,550
Alex:
we want to do is just build incredible docs,

793
00:58:04,550 --> 00:58:08,630
Alex:
really good primitives that make that easy

794
00:58:08,630 --> 00:58:11,310
Alex:
to do. So, you know, I think

795
00:58:11,310 --> 00:58:14,210
Alex:
a lot of our developers are just people building agents, and so

796
00:58:14,210 --> 00:58:17,370
Alex:
what they want is they want the primitives to

797
00:58:17,370 --> 00:58:20,750
Alex:
be solved so that they can just keep creating new

798
00:58:20,750 --> 00:58:24,410
Alex:
versions and new ideas without worrying

799
00:58:24,410 --> 00:58:27,330
Alex:
about, you know, re-implementing tool calling over

800
00:58:27,330 --> 00:58:31,250
Alex:
and over again.

801
00:58:31,250 --> 00:58:34,230
Alex:
And so, at least for this, it's

802
00:58:34,230 --> 00:58:37,410
Alex:
a tough problem given how many models there are. There's a new model or provider every

803
00:58:37,410 --> 00:58:44,070
Alex:
day, and people actually want them and use them. So to standardize this,

804
00:58:44,070 --> 00:58:49,290
Alex:
to make these tools really dependable, that's kind of where

805
00:58:49,290 --> 00:58:54,010
Alex:
we want to focus, so that agent developers don't have to worry about it.

806
00:58:54,270 --> 00:58:57,830
Josh:
As we level up closer and closer to AGI and beyond,

807
00:58:58,030 --> 00:59:00,150
Josh:
I'm curious what OpenRouter's kind of endgame is.

808
00:59:00,270 --> 00:59:03,330
Josh:
If you have one, what is the master plan where you hope to end up?

809
00:59:03,510 --> 00:59:06,150
Josh:
Because the assumption is as these systems get more intelligent,

810
00:59:06,330 --> 00:59:09,350
Josh:
as they're able to kind of make their own decisions and choose their own tool

811
00:59:09,350 --> 00:59:15,030
Josh:
sets, what role does OpenRouter play in continuing to route that data through?

812
00:59:15,150 --> 00:59:18,990
Josh:
Do you have a kind of master plan, a grand vision of where you see this all heading to?

813
00:59:19,170 --> 00:59:24,750
Alex:
You're saying like as agents get better at choosing the tools that they use

814
00:59:24,750 --> 00:59:31,690
Alex:
what becomes our role when the agents are really good at that?

815
00:59:31,690 --> 00:59:34,990
Josh:
Yes, and where do you see OpenRouter fitting into the picture, and what

816
00:59:34,990 --> 00:59:38,470
Josh:
would be the best-case scenario for this future of OpenRouter?

817
00:59:38,470 --> 00:59:41,190
Alex:
Right now OpenRouter is a bring-your-own-tool

818
00:59:42,600 --> 00:59:45,340
Alex:
platform. We don't have a

819
00:59:45,340 --> 00:59:49,440
Alex:
marketplace of MCPs yet.

820
00:59:49,440 --> 00:59:55,120
Alex:
And I do think most of the most-used tools will

821
00:59:55,120 --> 01:00:00,540
Alex:
be ones that developers configure themselves. Agents just work, like, they're given

822
01:00:00,540 --> 01:00:06,740
Alex:
access to it. I think a holy grail for OpenRouter is that

823
01:00:07,760 --> 01:00:12,500
Alex:
the ecosystem is going to, basically, my

824
01:00:12,500 --> 01:00:16,080
Alex:
prediction for how the ecosystem is going to evolve is that

825
01:00:16,080 --> 01:00:19,380
Alex:
all the models are going to be adding state and

826
01:00:19,380 --> 01:00:22,060
Alex:
other kinds of stickiness that just make you want to stick

827
01:00:22,060 --> 01:00:25,180
Alex:
with them. So they're going to add server-side tool calls,

828
01:00:25,180 --> 01:00:31,340
Alex:
they're going to add, you know, web search that is stateful, they're

829
01:00:31,340 --> 01:00:35,720
Alex:
going to add memory. They're going to add all kinds of things that try to prevent

830
01:00:35,720 --> 01:00:42,060
Alex:
developers from leaving and increase lock-in.

831
01:00:42,420 --> 01:00:44,460
Alex:
And OpenRouter is doing the opposite.

832
01:00:45,360 --> 01:00:49,820
Alex:
We want developers to not feel vendor lock-in.

833
01:00:49,920 --> 01:00:53,720
Alex:
We want them to feel like they have choice and they can use the best intelligence,

834
01:00:53,720 --> 01:00:55,840
Alex:
even if they didn't before.

835
01:00:56,660 --> 01:01:00,700
Alex:
It's never too late to switch to a more intelligent model. That would be like,

836
01:01:00,960 --> 01:01:05,300
Alex:
you know, a good always on outcome for us.

837
01:01:05,680 --> 01:01:13,640
Alex:
And so what I think we'll end up doing is partnering with other companies

838
01:01:13,640 --> 01:01:19,820
Alex:
or building the tools ourselves if we have to, so that developers don't feel stuck.

839
01:01:20,780 --> 01:01:23,540
Alex:
That's how I, you know, there's a lot of ways the ecosystem could evolve,

840
01:01:23,640 --> 01:01:25,000
Alex:
but that's how I would put it in a nutshell.

841
01:01:25,880 --> 01:01:29,580
Josh:
Okay, now there's another personal question that I was really curious about,

842
01:01:29,700 --> 01:01:34,200
Josh:
because I was also right there with you in the crypto cycle when NFTs got absolutely

843
01:01:34,200 --> 01:01:36,120
Josh:
huge, was a big user of OpenSea.

844
01:01:36,220 --> 01:01:38,940
Josh:
And it was kind of this trend that went up and then went down.

845
01:01:39,280 --> 01:01:44,560
Josh:
And NFTs kind of fizzled out, it wasn't as hot anymore, and AI kind of took the wind from the sails.

846
01:01:44,860 --> 01:01:48,380
Josh:
And it's a completely separate audience, but a similar thing where now it's

847
01:01:48,380 --> 01:01:49,340
Josh:
the hottest thing in the world.

848
01:01:49,340 --> 01:01:54,180
Josh:
And I'm curious how you see the trend continuing. Is this a cyclical thing that

849
01:01:54,180 --> 01:01:59,420
Josh:
has ups and downs, or is this a one-way trajectory of more tokens every day, more

850
01:01:59,420 --> 01:02:03,400
Josh:
AI every day? Do you see it being a cyclical thing, or is this a one-way

851
01:02:03,400 --> 01:02:06,120
Josh:
trend up and to the right?

852
01:02:06,120 --> 01:02:11,320
Alex:
NFTs kind of follow crypto in an

853
01:02:12,670 --> 01:02:15,650
Alex:
indirect way. When crypto

854
01:02:15,650 --> 01:02:18,430
Alex:
has ups and downs, NFTs generally lag a bit,

855
01:02:18,430 --> 01:02:22,590
Alex:
but they have similar ups and downs.

856
01:02:22,590 --> 01:02:30,730
Alex:
And crypto is an extremely long-term play on building a new financial system,

857
01:02:30,730 --> 01:02:40,590
Alex:
and there are so many reasons that it's not going to happen overnight. And they're, like,

858
01:02:40,810 --> 01:02:43,370
Alex:
very, very entrenched reasons.

859
01:02:43,530 --> 01:02:49,290
Alex:
Whereas AI, there are some overnight business transformations going on.

860
01:02:49,770 --> 01:02:55,550
Alex:
And the reason AI, I think, moves a lot, one of the reasons that AI moves a

861
01:02:55,550 --> 01:02:59,270
Alex:
lot faster is it's just about making computers behave more like humans.

862
01:02:59,890 --> 01:03:05,530
Alex:
So if a company already works with a bunch of humans, then there's,

863
01:03:05,610 --> 01:03:07,510
Alex:
you know, there's some engineering that needs to be done.

864
01:03:07,510 --> 01:03:10,730
Alex:
There's some thinking about how

865
01:03:10,730 --> 01:03:14,350
Alex:
to scale this.

866
01:03:14,350 --> 01:03:17,810
Alex:
But in general, I think that it's not like,

867
01:03:17,810 --> 01:03:21,090
Alex:
after seeing what can be possible, inference

868
01:03:21,090 --> 01:03:24,130
Alex:
will be the fastest growing operating expense for all companies.

869
01:03:24,130 --> 01:03:27,250
Alex:
It'll be like, oh, we can just hire

870
01:03:27,250 --> 01:03:37,490
Alex:
high-performing employees at the click of a button,

871
01:03:37,490 --> 01:03:41,150
Alex:
and they work 24/7, they

872
01:03:41,150 --> 01:03:44,070
Alex:
scale elastically. It's, you know,

873
01:03:44,070 --> 01:03:47,130
Alex:
it's not that hard. It's not, like, a huge mental

874
01:03:47,130 --> 01:03:51,230
Alex:
model shift. It's just a huge upgrade to the way companies work today,

875
01:03:51,230 --> 01:03:55,850
Alex:
in most cases. So it's just completely different from crypto.

876
01:03:55,850 --> 01:04:00,910
Alex:
Than NFTs, I mean. Other than both being new,

877
01:04:00,910 --> 01:04:04,450
Alex:
they're fundamentally very different changes.

878
01:04:04,910 --> 01:04:11,150
Ejaaz:
You're probably one of very few people in the world right now that has crazy

879
01:04:11,150 --> 01:04:13,230
Ejaaz:
insights to every single AI model.

880
01:04:13,430 --> 01:04:17,130
Ejaaz:
Definitely more than the average user, right? Like I have like three or four

881
01:04:17,130 --> 01:04:19,650
Ejaaz:
subscriptions right now and I think I'm a hotshot.

882
01:04:20,070 --> 01:04:25,090
Ejaaz:
You get access to like 400 and what is it? 57 models right now on OpenRouter.

883
01:04:25,410 --> 01:04:28,570
Ejaaz:
So an obvious question that I have for you is

884
01:04:29,930 --> 01:04:33,190
Ejaaz:
I'm not going to say in the next couple of years, because everything moves way

885
01:04:33,190 --> 01:04:34,630
Ejaaz:
too quickly in this sector.

886
01:04:34,870 --> 01:04:41,810
Ejaaz:
But over the next six months, is there anything really obvious to you that should

887
01:04:41,810 --> 01:04:43,950
Ejaaz:
be focused on within the AI sector?

888
01:04:44,110 --> 01:04:47,190
Ejaaz:
Maybe it's like the way that certain models should be designed,

889
01:04:47,190 --> 01:04:52,050
Ejaaz:
or perhaps it's at the application layer that no one's talking about right now.

890
01:04:52,330 --> 01:04:56,610
Ejaaz:
Because going on from our earlier part of the conversation, you just pick these

891
01:04:56,610 --> 01:04:59,310
Ejaaz:
trends out really early, and I'm wondering if you see anything.

892
01:04:59,630 --> 01:05:02,890
Ejaaz:
It doesn't have to be OpenRouter-related. It could just be AI-related.

893
01:05:03,150 --> 01:05:11,410
Alex:
I've seen the models trending towards caring more about how resourceful they

894
01:05:11,410 --> 01:05:14,490
Alex:
are than what knowledge they have in the bank.

895
01:05:14,990 --> 01:05:20,550
Alex:
Not all of, I feel like a lot of the applications, I think the model labs maybe,

896
01:05:21,650 --> 01:05:24,710
Alex:
a lot of them, I don't know how many of them really deeply believe that,

897
01:05:24,910 --> 01:05:31,370
Alex:
but a couple of them, uh, talk about it, and I don't think it's really hit the

898
01:05:31,370 --> 01:05:37,710
Alex:
application space yet, um, because people will ask ChatGPT things, and if

899
01:05:37,710 --> 01:05:39,810
Alex:
the knowledge is wrong, they think the model is stupid,

900
01:05:40,890 --> 01:05:44,490
Alex:
and that's just kind of a bad way of evaluating a model. Um,

901
01:05:44,490 --> 01:05:48,110
Alex:
like, whatever knowledge a person has, whatever

902
01:05:48,110 --> 01:05:51,590
Alex:
a person, like, recalls happening at a certain time, like,

903
01:05:51,590 --> 01:05:55,170
Alex:
does not... it's not a proxy for how smart they are. Um,

904
01:05:55,170 --> 01:05:57,930
Alex:
like, the intelligence and usefulness of a model

905
01:05:57,930 --> 01:06:01,970
Alex:
is going to trend towards how good it is at using tools and

906
01:06:01,970 --> 01:06:05,170
Alex:
uh, and how good it is at, like, paying

907
01:06:05,170 --> 01:06:11,230
Alex:
attention to its context, a long, long, long, long context. And so it's, like,

908
01:06:11,230 --> 01:06:17,450
Alex:
its total memory capacity and accuracy. Um, so I think those two things need

909
01:06:17,450 --> 01:06:22,170
Alex:
to be, like, emphasized more. Um, the...

910
01:06:23,430 --> 01:06:26,750
Alex:
Like, it might be that models pull all

911
01:06:26,750 --> 01:06:30,170
Alex:
of their knowledge from, like, online databases,

912
01:06:30,170 --> 01:06:34,090
Alex:
from, like, real-time, uh, scraped

913
01:06:34,090 --> 01:06:36,750
Alex:
indices of the web, along with a

914
01:06:36,750 --> 01:06:40,590
Alex:
ton of real-time updating data sources, um, and

915
01:06:40,590 --> 01:06:44,570
Alex:
they're always kind of, like, relying on some sort of database

916
01:06:44,570 --> 01:06:49,670
Alex:
for knowledge, but relying on their reasoning process for tool calling. You

917
01:06:49,670 --> 01:06:56,350
Alex:
know, like, we spend probably the plurality of our time every week

918
01:06:56,350 --> 01:06:59,730
Alex:
on tool calling and figuring out how to make it work really well.

919
01:06:59,850 --> 01:07:04,790
Alex:
Humans, the big difference between us and animals is that we're tool users and tool builders.

920
01:07:06,330 --> 01:07:11,190
Alex:
And that's where human acceleration and innovation has happened.

921
01:07:11,990 --> 01:07:19,090
Alex:
So how do we get models creating tools and using tools very,

922
01:07:19,210 --> 01:07:21,710
Alex:
very effectively? There's very little...

923
01:07:22,770 --> 01:07:25,030
Alex:
There are very few benchmarks. There's very little priority.

924
01:07:25,250 --> 01:07:28,970
Alex:
There's tau-bench for measuring how good a model is at tool calling.

925
01:07:29,090 --> 01:07:31,310
Alex:
But there's... and there's, like, maybe a few others.

926
01:07:32,250 --> 01:07:38,670
Alex:
There's SWE-bench for measuring how good a model is at multi-turn programming tasks.

927
01:07:39,610 --> 01:07:42,930
Alex:
It's very, very hard to run, though. It costs like, you know,

928
01:07:43,350 --> 01:07:46,850
Alex:
for Sonnet, it could cost like $1,000 to run it.

929
01:07:48,150 --> 01:07:54,510
Alex:
And the user experience for, kind of, evaluating the real intelligence

930
01:07:54,510 --> 01:07:55,950
Alex:
of these models is not good.

931
01:07:56,130 --> 01:08:00,790
Alex:
And so, as much as we don't have benchmarks listed on OpenRouter

932
01:08:00,790 --> 01:08:01,970
Alex:
today, I love benchmarks.

933
01:08:02,250 --> 01:08:07,030
Alex:
And I think the app ecosystem and the developer ecosystem should spend

934
01:08:07,030 --> 01:08:10,070
Alex:
a lot more time making very cool and interesting ones.

935
01:08:10,430 --> 01:08:17,350
Alex:
Also, we will give credit grants for all the best ones. So I highly encourage it.

936
01:08:17,970 --> 01:08:22,810
Ejaaz:
Well, Alex, thank you for your time today. I think we're coming up on a close

937
01:08:22,810 --> 01:08:25,370
Ejaaz:
now. That was a fascinating conversation, man.

938
01:08:25,590 --> 01:08:31,890
Ejaaz:
And I think your entire journey from just non-AI stuff, so OpenSea all the way

939
01:08:31,890 --> 01:08:37,430
Ejaaz:
to OpenRouter has just been a great indicator of where these technologies are

940
01:08:37,430 --> 01:08:39,870
Ejaaz:
progressing and more importantly, where we're going to end up.

941
01:08:39,870 --> 01:08:45,030
Ejaaz:
I'm incredibly excited to see where OpenRouter goes beyond just prompt routing.

942
01:08:45,410 --> 01:08:49,250
Ejaaz:
I think some of the stuff you spoke about on the data side of things is going

943
01:08:49,250 --> 01:08:53,110
Ejaaz:
to be fascinating and arguably one of your bigger features. So I'm excited for future releases.

944
01:08:53,350 --> 01:08:59,590
Ejaaz:
And as Josh said earlier, if GPT-5 is releasing through your platform first,

945
01:08:59,990 --> 01:09:03,150
Ejaaz:
please give us some credits. We would love to use it.

946
01:09:03,750 --> 01:09:07,850
Ejaaz:
But for the listeners of this show, as you know, we're trying to bring on the

947
01:09:07,850 --> 01:09:13,890
Ejaaz:
most interesting people to chat about AI and frontier tech. We hope you enjoyed this episode.

948
01:09:14,230 --> 01:09:18,670
Ejaaz:
And as always, please like, subscribe, and share it with any of your friends

949
01:09:18,670 --> 01:09:21,450
Ejaaz:
who would find this interesting. And we'll see you on the next one. Thanks, folks.