Exploring the frontiers of Technology and AI
Ejaaz:
Last week, a Chinese company released a free AI model that is as good as Anthropic's
Ejaaz:
best model. It also beats ChatGPT 5.5 at writing and coding,
Ejaaz:
but it comes with a twist.
Ejaaz:
It's a sixth of the price and it's completely open source.
Ejaaz:
You can download it and run it at home. Now, in that same week,
Ejaaz:
the United States government banned Anthropic's most powerful model,
Ejaaz:
Fable 5, after someone revealed that an unrestricted version of it had hacked
Ejaaz:
into the National Security Agency's systems.
Ejaaz:
I think we've reached a point of no return. And not to sound dramatic, but
Ejaaz:
in six months, it is very realistic that we will have open source or open weight
Ejaaz:
models that are accessible to anyone in the world with an internet connection
Ejaaz:
and 5 to 10k to run at home,
Ejaaz:
that they can fine tune to do anything.
Ejaaz:
And these are Mythos-grade models. These are the same models that we're hearing
Ejaaz:
rumors and verified reports that they can exploit some of the most secure
Ejaaz:
systems in the world faster than any other exploiter has been able to do in the past.
Ejaaz:
And I think we're going to look back on 2026 as the moment or the year that
Ejaaz:
everything really changed and the point where humanity itself really needs
Ejaaz:
to focus on safeguards and figuring out how to regulate
Ejaaz:
and release these AI models in the future. So we've reached a convergence of
Ejaaz:
this really interesting trend where the most powerful models in the world are
Ejaaz:
freely available and open source, available for anyone to access.
Ejaaz:
And the government, the United States specifically, has an off switch for their most powerful model.
Josh:
Yeah, it's been a couple of months, it seems, since we've had some news on the
Josh:
frontier of China. And you kind of forget about them every couple of weeks where
Josh:
they just kind of disappear, they quiet down.
Josh:
The new models come out, we see the Fables, we see the Mythos of the world.
Josh:
But then out of nowhere, they strike back and seemingly every single time it
Josh:
comes as a surprise at how powerful these new models have become. So to start
Josh:
with this, we have a new model from our favorite company to pronounce, Zhipu.
Josh:
I feel like I want to name my dog that. It's such a cute name. But Zhipu
Josh:
is doing something not so cute. They're actually releasing a model named GLM
Josh:
5.2, which kind of blew everyone's expectations out of the water. I remember way
Josh:
back, like six months ago, when DeepSeek was doing this. Like, when
Josh:
DeepSeek released a model, everyone was like, "Wait, you did what with what?" And
Josh:
that's what this model feels like. We're getting that moment again, because
Josh:
this is an open-weights model, which is not to be confused with open source, and
Josh:
we'll talk about that in a little bit. But this is an open-weights model that is, if I'm
Josh:
correct about this, within one single point on the SWE-Bench Pro benchmark, which
Josh:
is the benchmark that a lot of people use for coding, of GPT 5.5,
Josh:
the frontier coding model from OpenAI. And that comes as a surprise because
Josh:
of the cost. Well, one, if you run it locally, it's free. But two, if you run it on a server,
Josh:
it's, like you said earlier, just one sixth of the cost. So you're getting an
Josh:
incredible amount of coding capability for something that costs a fraction of
Josh:
what it costs if you were to go to one of these larger language models. And it seems to work
Josh:
almost as well, if I'm right. And this comes as a surprise to most people because
Josh:
every time we start to count China out, we're like, no, surely they can't catch up.
Josh:
They continue to chip away at this frontier.
Ejaaz:
There's a few things that people will jump to immediately. OK,
Ejaaz:
one, that these benchmarks can be easily gamed.
Ejaaz:
We're going to show you a few examples of benchmarks that couldn't be gamed
Ejaaz:
and GLM 5.2 performs really, really well. But the second thing is the cost.
Ejaaz:
Cost has become a really important point of discussion amongst enterprises specifically that are spending
Ejaaz:
hundreds of millions of dollars per year to access Claude and GPT.
Ejaaz:
It's just too much money for them to spend in terms of like the return on investment
Ejaaz:
that they're getting in work that they actually see.
Ejaaz:
So what they're now turning towards is these free open source models,
Ejaaz:
primarily designed and made by Chinese AI labs that can cut costs down drastically.
Ejaaz:
Just last week, we had Microsoft announce that they're replacing their Copilot
Ejaaz:
LLM with not ChatGPT, with not Claude, but with DeepSeek itself.
Ejaaz:
So the point is, this comes at a very important time where cheaper models are
Ejaaz:
getting a lot of attention.
Ejaaz:
So now when we look at GLM 5.2 specifically, it is
Ejaaz:
five to seven times cheaper than GPT 5.5 and Claude Opus 4.8,
Ejaaz:
but performs, as we're seeing on the benchmarks right here, almost as well as
Ejaaz:
each of these models, specifically at the metric that is the most important, which is coding.
Ejaaz:
Now, a lot of skeptics quite rightly were like, I don't know if this is actually
Ejaaz:
true. Like, let me test it against a few other independent benchmarks.
Ejaaz:
It came up pretty high. So if you look at the front end development when it
Ejaaz:
comes to like website design, GLM 5.2 Max is just below Fable 5.
Ejaaz:
We're not even talking about Opus 4.7 or 4.8 anymore, which it absolutely beat.
Ejaaz:
And then when we're looking at like anecdotes or feedback from like distinguished
Ejaaz:
individuals in the Western frontier.
Ejaaz:
So right now we're looking at a tweet from the CEO of Vercel.
Ejaaz:
He goes, I'm genuinely impressed, almost shocked at how good GLM 5.2 is at coding.
Ejaaz:
So this is feedback from real people using this for real use cases.
Ejaaz:
For the last three years, Josh, we've basically been told that the hundreds
Ejaaz:
of billions of dollars that are being spent on AI CapEx are for one single reason
Ejaaz:
only, to gain a moat ahead of any other model provider.
Ejaaz:
So we spend all this money on compute to train a frontier AI model.
Ejaaz:
And that moat, it doesn't matter what other companies do in China,
Ejaaz:
we will have the best model and that's enough for us.
Ejaaz:
This release from Zhipu with GLM 5.2 basically shows us the opposite.
Ejaaz:
For a fraction of the cost, you can create a near frontier model that does like,
Ejaaz:
I don't know, 95% of the work.
Ejaaz:
And so it brings into question the valuation of these companies.
Ejaaz:
Should they be spending this amount of money or can we just do it for a lot
Ejaaz:
cheaper like these Chinese AI labs?
Josh:
Yeah, well, the large AI labs, I'm not sure they have a choice.
Josh:
I mean, it's just that you have to continue to push the frontier forward,
Josh:
whether you like it or not.
Josh:
But I think what we're seeing is a lot of these questions that we were excited
Josh:
to see play out, we're starting to get answers to.
Josh:
Like now it's less China versus America and more open source versus closed source
Josh:
because I mean, the open source models are coming from inside too.
Josh:
We have NVIDIA. They're working on open source models that are incredible,
Josh:
and they're making progress in that front.
Josh:
We have Apple now, who has an actually functional Siri on everyone's hardware
Josh:
device that runs essentially for free.
Josh:
So they're slowly starting to nibble away at this, I guess, the lower bottom
Josh:
of the barrel set of use cases.
Josh:
And then we have China, which is GLM, that's DeepSeek, that's these larger models
Josh:
where they're actually competing on the frontier. So these big frontier private models are facing
Josh:
heat both from the lower end of the stack but also right at the top where these
Josh:
benchmarks sit, and we're going to see how that plays out economically. In
Josh:
the case of Zhipu, at least, it's been playing out pretty well. And
Josh:
we probably should talk about the stock a little bit. Believe it or not, this
Josh:
company is publicly traded. Not here in the United States, but it is publicly
Josh:
traded, at least in China, and it's gone up...
Ejaaz:
What is that?
Josh:
1,500 percent, 15x on the year. That's like a crazy return. And some interesting
Josh:
facts about this return, and it's so funny to see, I guess, how inefficient
Josh:
Chinese markets are. Also note that on the chart you're seeing on screen,
Josh:
they have a lunch break in their stock market. I didn't know this, and it's labeled.
Josh:
Like, I didn't realize that Chinese stock markets had an hour-long lunch break
Josh:
in the middle of the day. So that's cute and that's fun.
Josh:
But the numbers are pretty outrageous. When we talk about expensive
Josh:
companies, we talk about SpaceX, which is trading at, what is it, a very high
Josh:
multiple of earnings.
Josh:
What we have with Zhipu and the company that it's kind of owned by,
Josh:
Knowledge Atlas Technology, is a company currently trading at about a $136 billion market cap.
Josh:
It made $170 million, or $107 million, I should say, in the full year of 2025.
Josh:
That means it trades at about 1,300 times sales, which is just this unbelievably high
Josh:
multiple on this company.
Josh:
And I think it's a testament to the, I guess, the lack of availability to get
Josh:
AI exposure in Chinese markets, but also the confidence and the excitement and
Josh:
enthusiasm they have around companies like this. That was just an interesting thing to see.
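Editor's note:
The multiple quoted above can be checked with one line of arithmetic. This is a minimal sketch in Python that takes the market cap and revenue figures exactly as stated in the conversation; they are not independently verified, and the function name `price_to_sales` is only an illustrative choice.

```python
def price_to_sales(market_cap_usd: float, annual_revenue_usd: float) -> float:
    """Return market cap divided by annual revenue (the price-to-sales multiple)."""
    if annual_revenue_usd <= 0:
        raise ValueError("annual revenue must be positive")
    return market_cap_usd / annual_revenue_usd


# Figures as quoted in the episode; treat them as assumptions.
MARKET_CAP_USD = 136e9     # "about $136 billion market cap"
REVENUE_2025_USD = 107e6   # "$107 million ... in the full year of 2025"

multiple = price_to_sales(MARKET_CAP_USD, REVENUE_2025_USD)
print(f"{multiple:,.0f}x sales")  # about 1,271x, in line with the roughly 1,300x quoted
```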
Ejaaz:
Yeah, I mean, at this valuation, it's about, what is that, like a fifth of Anthropic's
Ejaaz:
valuation right now, which is, I think, around a trillion dollars.
Ejaaz:
So again, like it begs the question, are Chinese AI labs underpriced or are American
Ejaaz:
companies overpriced? And I'm curious to hear, like what listeners of the show actually think.
Ejaaz:
I tend to think that they probably need to meet somewhere in the middle.
Ejaaz:
We were actually saying before we started recording, could you imagine the reaction
Ejaaz:
to this news if Anthropic was a publicly traded company and a new free open source
Ejaaz:
model that was freely accessible to anyone could achieve pretty much 95%
Ejaaz:
of the capability of Opus 4.8?
Ejaaz:
Like, I wonder what that would have done to the stock price in like a fair market
Ejaaz:
value, but crazy to see nonetheless. So if we're looking at a few different
Ejaaz:
metrics that compare cost and performance, just quickly to run you guys through this.
Ejaaz:
For input and output tokens, per million tokens, you're looking at around
Ejaaz:
$1.50 and $4.50 respectively when it comes to cost.
Ejaaz:
Now, comparing that to Opus 4.8, that's around, I believe, $5 and $25.
Ejaaz:
So again, we're achieving roughly 3 to 5x cheaper when compared to a model of
Ejaaz:
similar performance and capability.
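Editor's note:
The "3 to 5x cheaper" claim follows from the per-million-token prices quoted here. The sketch below is a rough check in Python, assuming those prices are accurate as spoken. The 12-to-1 input-to-output mix in the blended example is borrowed from the local-hardware discussion later in the episode and is only an assumption, and all names are illustrative.

```python
# Prices in USD per one million tokens, as quoted in the episode (assumptions).
GLM_INPUT, GLM_OUTPUT = 1.50, 4.50
OPUS_INPUT, OPUS_OUTPUT = 5.00, 25.00


def blended_cost(input_price: float, output_price: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for a workload, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


input_ratio = OPUS_INPUT / GLM_INPUT      # how many times cheaper on input
output_ratio = OPUS_OUTPUT / GLM_OUTPUT   # how many times cheaper on output

# Hypothetical workload: 12 million input tokens for every 1 million output tokens.
glm_cost = blended_cost(GLM_INPUT, GLM_OUTPUT, 12_000_000, 1_000_000)
opus_cost = blended_cost(OPUS_INPUT, OPUS_OUTPUT, 12_000_000, 1_000_000)

print(f"input: {input_ratio:.1f}x, output: {output_ratio:.1f}x")  # input: 3.3x, output: 5.6x
print(f"blended: {opus_cost / glm_cost:.1f}x")                    # blended: 3.8x
```

The blended figure moves toward the input ratio as workloads become more input-heavy, which is why a single "x times cheaper" number depends on the token mix.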
Ejaaz:
Now, I was skeptical of the benchmarks, and I have a new favorite benchmark
Ejaaz:
to compare it against, which is called DeepSWE.
Ejaaz:
DeepSWE is basically a benchmark that gives models no answers.
Ejaaz:
Typically, with a benchmark, you have an answer sheet, and it can kind of cheat
Ejaaz:
and look at it and figure out a way to get to that answer.
Ejaaz:
There's no answer sheet for this
Ejaaz:
one, so it's a very accurate test of how good your model is at coding.
Ejaaz:
For DeepSWE, GLM 5.2 achieved a very modest fifth place.
Ejaaz:
And that is a pretty accurate picture of what agentic coding looks like for
Ejaaz:
this particular model. It is the number one open source model.
Ejaaz:
It absolutely crushed Kimi K2 by 17 percentage points, a very clear lead.
Ejaaz:
And it's great to see how it weighs up. Like, it may not be frontier capability,
Ejaaz:
but if you want a workhorse, if you want an agent that basically works overnight
Ejaaz:
and isn't going to break the bank, GLM 5.2 is probably something that you can look at.
Ejaaz:
Another thing is it's really good at front-end web development.
Ejaaz:
So if you're looking at this screen right now, the website that you're seeing
Ejaaz:
was completely one-shotted in about 10 minutes from this one single model, GLM 5.2.
Ejaaz:
And repeatedly across design benchmarks, with the Arena benchmark being another one that I saw,
Ejaaz:
it performs really highly, in some cases beating Fable 5. So it's a really good
Ejaaz:
front end design model if that is something of interest.
Ejaaz:
And then the final one, because I know a lot of listeners of the show are like,
Ejaaz:
you know, how good are these models at like trading, investing, making money for you?
Ejaaz:
Well, there's this very famous benchmark, which is called the Vending Benchmark,
Ejaaz:
which basically allows an AI model to control a theoretical $10,000 and see
Ejaaz:
if it can make money by stocking a vending machine and then conducting sales,
Ejaaz:
managing inventory against competition.
Ejaaz:
It achieved second place right behind Claude Opus 4.7, which is the current
Ejaaz:
leading model. So it's also pretty good at making money as well.
Josh:
Yeah, and it also has a very clear roadmap to continue to be good and to get
Josh:
even better. There's an interaction actually between Elon Musk and the CEO of
Josh:
Z.ai, which is creating these models.
Josh:
So this guy asked, what's your current timeline for China to reach Fable-class?
Josh:
GLM 5.2 certainly shortened the gap. And then Elon said probably Q1.
Josh:
And then the CEO said, won't take that long. Which means they expect us to get
Josh:
a new Fable-class model that's open weight and open source within the next six months.
Josh:
Which is incredibly compelling because that is going to be served up as open weights.
Josh:
And as you know, with open weights, you can actually run it on your own hardware.
Josh:
But the question is, do you actually want to run this on your hardware?
Josh:
I see on Twitter all the time, people who are spending tens of thousands of
Josh:
dollars to get those Mac studios, they're stacking them up in their offices,
Josh:
they're trying really hard to run these models locally.
Josh:
And I hate to break it to you, but the math ain't really mathing on this so well.
Josh:
So there's a tweet by Mike Schweinbach I thought was great. And it says the
Josh:
minimum to run the model is about $20,000 in hardware and you get about 20 tokens per second out.
Ejaaz:
For $20,000, that's like,
Josh:
That's pretty slow. It's not thinking that fast. And if you have these really
Josh:
long chain of thoughts, these long reasoning traces, it's going to take you
Josh:
a very long time to get an answer that involves deep thinking.
Josh:
So for about $20,000, you can get close to 35 billion tokens.
Josh:
And that's a 12 to one input to output ratio, assuming you have a good token caching setup.
Josh:
So he's saying if you ran the hardware 24-7 with zero downtime,
Josh:
it would take roughly five and a half years just to break even.
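Editor's note:
The break-even reasoning here can be sketched in a few lines. This is a simplified model in Python, not the calculation from the tweet being quoted: it assumes the machine runs around the clock, that throughput is limited only by the 20 output tokens per second, and that every output token comes with 12 input tokens. Under those assumptions it lands at a little over four years rather than the five and a half quoted, which is not surprising since the transcript does not give the original post's full assumptions. The function names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def local_tokens_per_year(output_tokens_per_second: float,
                          input_to_output_ratio: float) -> float:
    """Total tokens (input + output) handled per year at 100% uptime.

    Simplification: only output generation limits throughput, and each
    generated token is counted together with its share of input tokens.
    """
    output_per_year = output_tokens_per_second * SECONDS_PER_YEAR
    return output_per_year * (1 + input_to_output_ratio)


def break_even_years(api_tokens_for_same_money: float,
                     output_tokens_per_second: float,
                     input_to_output_ratio: float) -> float:
    """Years of nonstop local use needed to match what the hardware budget buys via API."""
    yearly = local_tokens_per_year(output_tokens_per_second, input_to_output_ratio)
    return api_tokens_for_same_money / yearly


# Figures as quoted in the episode (assumptions): $20,000 buys about 35 billion
# API tokens at a 12:1 input-to-output mix; local hardware produces 20 tokens/s.
years = break_even_years(35e9, 20, 12)
print(f"{years:.1f} years")  # about 4.3 years under these simplifications
```

Lower real-world uptime, slower prompt processing, or electricity costs would all push the break-even point further out.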
Josh:
And that right there is why open weights models are incredible.
Josh:
You're probably better off getting it served directly from their servers from
Josh:
the cloud instead of running your own.
Josh:
Because not only do you have to deal with the complexity, you have to power
Josh:
it all on, you have to deal with hardware stuff, and you have to worry about
Josh:
getting the actual hardware.
Josh:
Because Lord knows, getting those computers now is not as easy as it used to
Josh:
be. So interesting note on cost,
Josh:
on how available these are and accessible these are on a relative basis.
Ejaaz:
And the Chinese companies themselves are willing to subsidize these costs, just to be clear.
Ejaaz:
Like to play around with Kimi K2.7, which is their frontier model,
Ejaaz:
I've been able to access it and use it since they launched it.
Ejaaz:
And I've been freely using it to kind of like do research and all that kind of
Ejaaz:
stuff. And I've never once been charged for it. So there's a high subsidy coming
Ejaaz:
from like the Chinese side of things as well.
Ejaaz:
The other thing I'll say is these numbers may look big, right?
Ejaaz:
Like who on earth is spending $20,000 to get hardware that you can like run
Ejaaz:
at home to run these models open source?
Ejaaz:
But the idea is six months from now, 12 months from now, these very same models
Ejaaz:
will be distilled enough.
Ejaaz:
So that means it can maintain its intelligence but be small enough to run on your
Ejaaz:
local hardware at home, a custom PC, or maybe even your laptop.
Ejaaz:
The trend that we're undeniably seeing with these open-source models in particular
Ejaaz:
is higher intelligence for lower-cost hardware.
Ejaaz:
And if that trend continues, we will end up seeing this model that we're talking
Ejaaz:
about today being able to run off your handset. So it's something that seems
Ejaaz:
unfeasible right now to access.
Ejaaz:
But further on down the line, open-source, in my opinion, is pretty undeniable.
Ejaaz:
You'll be able to run it at home, and that's pretty good. But moving on.
Ejaaz:
The reason why we wanted to make this episode is there's a convergence of two trends, right?
Ejaaz:
So last week, we had a lot of reporting around Fable 5 being banned by the United States government.
Ejaaz:
The primary reason is the United States government does not think the model
Ejaaz:
is safe. If placed in a malicious actor's hands, it would be able to be used against
Ejaaz:
government systems for hacks, exploits, all that kind of stuff. And it's proven
Ejaaz:
itself on internal testing.
Ejaaz:
And the most recent revelation was a quote from a senator saying that the head of the NSA
Ejaaz:
explained that in a red team exercise, which is like a controlled environment,
Ejaaz:
Claude Mythos 5 was able to breach all of its systems.
Ejaaz:
And typically, it would take months for an individual expert to do that.
Ejaaz:
It did it in hours. And this is just a crazy story and headline to read.
Ejaaz:
They've switched it off. It's not accessible to anyone. If you go on Claude right
Ejaaz:
now, you're unable to access Fable 5.
Ejaaz:
But the point is, these two trends have converged at the same time.
Ejaaz:
And it's important to discuss this because very soon in a few months time,
Ejaaz:
as that Elon tweet showed, we're going to end up with Mythos-grade models
Ejaaz:
that are freely available to anyone, subsidized by China or available to run at home for 10k.
Ejaaz:
And that is pretty scary, I guess.
Josh:
Yeah. Is that the lead now? Are we at six months? Does that feel about right?
Josh:
Like, if they release Mythos-class by the end of this year,
Josh:
then that gives, I guess, OpenAI and Anthropic a six-month head start.
Ejaaz:
And then the head of Zhipu has said it.
Josh:
So, yeah. So it seems like that's about right currently where we have like a
Josh:
six month window between us and the current bleeding edge open source.
Josh:
I could see that kind of getting closer and closer. It feels like they're right on the tail.
Josh:
Of course, understanding what's going on internally would be very helpful to
Josh:
know, because I'm sure GPT 5.5, well, we know we're getting 5.6 pretty soon.
Josh:
I'm sure Anthropic is working on something even more powerful than Mythos.
Josh:
And it feels like we don't really have a choice but to continue progressing
Josh:
as fast as we are. Otherwise, these are going to catch up.
Josh:
And they won't have the guardrails that are put in place currently by the frontier
Josh:
models. Now, what's happening currently is we're seeing this fork
Josh:
in terms of these private models, where only people internally are now able to
Josh:
use them and anyone out in the world is getting, I guess, kind of disabled.
Josh:
They're getting a handicap because they're not actually able to access these frontier models.
Josh:
So we're seeing this weird crossroads where there's a small subset of people
Josh:
that work internally within OpenAI, within Anthropic, that are getting access to these models.
Josh:
The government is limiting their public use, which means the public is getting left behind.
Josh:
And then China is coming up and they're saying, hey, in six months,
Josh:
we're going to be right here at your head.
Josh:
So it's this really interesting dynamic that's at play. And we're going to really
Josh:
have to closely monitor this as these new frontier models continue to be released,
Josh:
because you have to assume, even though the world isn't using Mythos or Fable, they're continuing
Josh:
to iterate and to build better models. They're not just going to stop because of this.
Josh:
Same with OpenAI, same with all the other frontier labs.
Josh:
The question is, are these models
Josh:
going to be held privately for just a small subset of people to use?
Josh:
Or is there going to be this path forward in which the public can use them?
Josh:
I think everyone's hope is that there is a path forward.
Josh:
But currently, we're at this weird standstill where it feels like China's kind
Josh:
of breathing down your neck here.
Ejaaz:
Well, the irony also is if the government is just going to come in and switch
Ejaaz:
off the frontier model, it's going to push companies to use open source models.
Ejaaz:
Imagine you're an enterprise, right? And you're running your entire company
Ejaaz:
on Fable 5 or whatever the frontier model is from an AI lab.
Ejaaz:
And then suddenly you know that the government can just switch the button off
Ejaaz:
and suddenly your company can't do its thing.
Ejaaz:
You're more incentivized to kind of like run an open model at home that's privately
Ejaaz:
inferenced such that it can never be shut down.
Ejaaz:
So if I was an enterprise that has been running Fable 5 and that has now been
Ejaaz:
shut off, I'd be looking over at this GLM 5.2 thing and thinking,
Ejaaz:
well, it's MIT open source.
Ejaaz:
Yeah, maybe it costs 20K to run on hardware, but like I'd rather spend that
Ejaaz:
and save, you know, hundreds of millions down the line versus like going with Fable 5.
Ejaaz:
And yeah, maybe achieving frontier level performance, but then,
Ejaaz:
you know, being shut off potentially by the government, according to their agenda,
Ejaaz:
like that's not something that you potentially want.
Ejaaz:
Now, I want to give a quick counterpoint to the whole Chinese open source AI
Ejaaz:
models are going to take over the world because they're cheaper,
Ejaaz:
they're as good, maybe not as good, but as good, good enough,
Ejaaz:
right? Which is very simple.
Ejaaz:
If you're an American lab that has a frontier AI model that is expensive and
Ejaaz:
you see your neighbors, or if you see your adversaries, China,
Ejaaz:
distilling your model and presenting it as a cheaper model, you just do the same for your own model.
Ejaaz:
And Anthropic has demonstrated that many times, producing Sonnet.
Ejaaz:
Sonnet 4 is basically their cheaper model of Opus 4.8, I believe.
Ejaaz:
And then you see it with ChatGPT, with GPT Flash. These AI labs will produce
Ejaaz:
a cheaper version, and they'll distill it directly from their frontier models.
Ejaaz:
And as these models get good enough to rebuild themselves, it gets easier to do.
Ejaaz:
So I can see a world where they release Fable 6 in the future with a companion
Ejaaz:
model, which is like Sonnet 6. And it's super cheap for anyone that wants 85%
Ejaaz:
of the capability and doesn't care about that extra 15%.
Ejaaz:
So it's competitive with the Chinese models. I don't think America has lost
Ejaaz:
the kind of like cheap model argument, but the open source one,
Ejaaz:
they definitely have. I don't see the American labs open sourcing anytime soon.
Josh:
Yeah, well, we saw Meta pivot very clearly from open source,
Josh:
from being, like, the savior of the open source world, to closed source very quickly.
Josh:
And I mean, that hasn't worked out too well for them or anyone really,
Josh:
which is disappointing.
Josh:
There is a small caveat. Maybe we should cover what open source actually
Josh:
means because it's not truly open source. There are still some secrets.
Josh:
I think a better way to classify this is open weights. And when you go through
Josh:
training, there's, let's say, a trillion parameters. Each one of those parameters
Josh:
gets tuned over and over and over through each training run,
Josh:
which happens trillions of times.
Josh:
And the output of this is the weights. It's just a large file that has
Josh:
all of those parameters finely tuned that the model can run off of.
Josh:
What it doesn't include is the actual source code that it took to make that.
Josh:
It doesn't include the ability to reproduce it. All it shares is the outputs.
Josh:
So while you could take their outputs and you could retune and fine-tune those
Josh:
parameters to give you exactly what you want,
Josh:
it's not giving you the recipe. It's not giving you the secrets on how it was built.
Josh:
So there is still some proprietary knowledge as it relates to this open source
Josh:
model. These Chinese companies are actually preserving the
Josh:
recipe by which they landed on this, the data that they trained on. There are a
Josh:
lot of secrets. The output
Josh:
is what's open source, and that's technically open weight. So when we say open
Josh:
source, I think what we really mean... Whenever you hear "open source model," chances
Josh:
are it's open weights. And that's a pretty big distinction, because that allows
Josh:
them to keep their secret sauce of how they do it. And it's also
Josh:
probably for the better, because I assume,
Josh:
you've got to imagine, they've been distilling some sort of stuff from... I mean, I
Josh:
remember Seedance. That was so obviously stolen material, because it was
Josh:
just able to reproduce all the copyrighted video formats from any public TV show in the world.
Josh:
Where they get their data from leaves a lot to be desired and questioned,
Josh:
but that's kind of the nuance between open source and open weights.
Josh:
And what we're getting right now currently is open weights.
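Editor's note:
The open-weights versus open-source distinction can be made concrete with a toy example. The sketch below, in plain Python, trains a two-parameter model and then "releases" only the tuned parameters as a file. Anyone holding that file can run the model, but the training data and training procedure are not recoverable from it. Everything here is illustrative: real releases ship billions of parameters, usually in binary tensor files rather than JSON, and this is not how any particular lab trains its models.

```python
import json
import os
import tempfile


def train(data, steps=2000, learning_rate=0.05):
    """The private 'recipe': fit y = w*x + b by gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return {"w": w, "b": b}


def predict(weights, x):
    """Inference needs only the weights, not the training code or data."""
    return weights["w"] * x + weights["b"]


# Private side: the data and the training procedure stay with the lab.
training_data = [(x, 2 * x + 1) for x in range(5)]
weights = train(training_data)

# Public side: only the tuned parameters are published as a file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.json")
    with open(path, "w") as f:
        json.dump(weights, f)
    with open(path) as f:
        released = json.load(f)

print(round(predict(released, 10), 3))  # close to 21, since the data follows y = 2x + 1
```

Fine-tuning, in this picture, means starting from the released weights and running further update steps on your own data, which is possible without ever seeing the original recipe.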
Ejaaz:
I don't necessarily believe it's open models versus centralized models.
Ejaaz:
I think it lands somewhere in between. Now, we've been noticing this new type
Ejaaz:
of product that is getting used by a lot of software engineers and AI users.
Ejaaz:
It's probably best demonstrated by this recent product release from Sakana AI.
Ejaaz:
It's a new model called Fugu.
Ejaaz:
And they describe it as a multi-agent orchestration system. Basically how it
Ejaaz:
works is you send their model a prompt as you do with ChatGPT or Claude.
Ejaaz:
And it disperses that prompt across many different models. It could be closed
Ejaaz:
models like Claude and GPT.
Ejaaz:
It could be open models like Zhipu's GLM or Kimi K2.7, as well as their own trained
Ejaaz:
model called Fugu, I believe.
Ejaaz:
And the result of this is like agentic debate. So these models kind of produce their own answers.
Ejaaz:
Then you have another model that kind of judges these answers and produces the
Ejaaz:
best answer from all of this.
Ejaaz:
And the result from these tests is basically, not only do you have a better
Ejaaz:
quality output, but it's also cheaper.
Ejaaz:
So the orchestration module basically picks the best models to do something
Ejaaz:
when it's like cheaper, and then only uses the best models when it really needs
Ejaaz:
to solve a really hard task that the other cheaper models can't do.
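Editor's note:
The two ideas described here, fanning a prompt out to several models with a judge picking the winner, and trying cheap models first before escalating, can be sketched as follows. This is a toy illustration in Python and not Sakana's or OpenRouter's actual implementation: the `Model` class, the stub answer functions, the cost units, and the judge are all hypothetical stand-ins for real API calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Model:
    name: str
    cost_per_call: float           # illustrative cost units, not real prices
    answer: Callable[[str], str]   # stand-in for a real API call


def debate(prompt: str, models: List[Model],
           judge: Callable[[str, str], float]) -> Tuple[str, str]:
    """Fan the prompt out to every model and let the judge pick the best answer."""
    candidates = [(m.name, m.answer(prompt)) for m in models]
    return max(candidates, key=lambda c: judge(prompt, c[1]))


def route(prompt: str, models: List[Model],
          judge: Callable[[str, str], float],
          threshold: float = 0.5) -> Tuple[str, str, float]:
    """Try models from cheapest to priciest; stop at the first acceptable answer."""
    spent = 0.0
    for model in sorted(models, key=lambda m: m.cost_per_call):
        answer = model.answer(prompt)
        spent += model.cost_per_call
        if judge(prompt, answer) >= threshold:
            return model.name, answer, spent
    raise RuntimeError("no model produced an acceptable answer")


# Toy stand-ins: the cheap model gives up on prompts containing "hard".
cheap = Model("cheap-open-model", 1.0,
              lambda p: "" if "hard" in p else f"cheap answer to: {p}")
frontier = Model("frontier-model", 10.0, lambda p: f"frontier answer to: {p}")


def toy_judge(prompt: str, answer: str) -> float:
    """Hypothetical judge: any non-empty answer counts as acceptable."""
    return 1.0 if answer else 0.0


print(route("easy question", [frontier, cheap], toy_judge))
print(route("hard question", [frontier, cheap], toy_judge))
```

In this toy setup the easy prompt is handled by the cheap model alone, while the hard prompt costs a cheap call plus a frontier call. That is the trade-off behind routing: savings depend on how often the cheap model's answer is good enough.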
Ejaaz:
So it saves you a bunch of money, and we see it across other companies like
Ejaaz:
OpenRouter with their new Fusion API. The point being made here is,
Ejaaz:
we are headed towards a world where the ideal AI chatbot uses multiple models,
Ejaaz:
and they may not just be from the same company.
Ejaaz:
So the question I have for the United States government and any government that
Ejaaz:
decides to regulate, whether it's open source models or closed source models,
Ejaaz:
how are you going to regulate every single model in the world,
Ejaaz:
especially when the model labs come from other countries or are in fact open source?
Ejaaz:
You can't regulate open source models. That's the whole idea of it,
Ejaaz:
whether it's open weight or open source.
Ejaaz:
The whole idea is the government can't shut it down if you're running it on hardware at home.
Ejaaz:
So it's just a really interesting nuance. I just don't think that the stance
Ejaaz:
that the United States government has taken so far is necessarily the most productive
Ejaaz:
one. I understand why they're doing it, but we need to figure out a different framework.
Josh:
It's funny because I saw this news this morning about this Sakana Fugu.
Josh:
I think I'm pronouncing that right. I mean, surely I've never heard of this.
Josh:
I don't know if you've ever heard of this. I think a lot of people watching
Josh:
have never heard of this company. They're Japanese. They came out of nowhere.
Josh:
And suddenly they're posting benchmarks that show that it has higher performance than Fable.
Josh:
And maybe that's true. Maybe they use this mixture of agents.
Josh:
But I think it's also notable that a lot of this is benchmarks.
Josh:
And I actually got some time to play around with the new GLM model this weekend.
Josh:
And while I'm sure it's great at coding and technical use, that's not really
Josh:
what I generally use the models for.
Josh:
And as I'm actually using these models, giving them the general vibe test,
Josh:
I'm noticing that I really am strongly biased toward the American closed source models, like
Josh:
GPT and, like, Anthropic's Opus and Claude. And, I mean, Fable, when it was
Josh:
available, was incredible. And
Josh:
although the benchmarks show that it's very competent at coding, a lot of people
Josh:
aren't using it for coding. They're using it for other things. And
Josh:
the general vibe check doesn't get passed with these models yet,
Josh:
at least. So I think that's something worth noting too: these are just
Josh:
benchmarks. I encourage anyone who's listening, go try this out for yourself and see for yourself.
Josh:
Some people may actually get a lot of benefit from using a cheaper model.
Josh:
Some people just like having all the context in one place and they want just
Josh:
a better overall experience.
Josh:
With the routing, I think this is a super interesting precedent that we're seeing:
Josh:
Sakana Fugu and how they are choosing to route their outputs through a series
Josh:
of open source and closed source models in order to generate a better and more
Josh:
powerful outcome. I wonder about the costs. I noticed that as I was looking through the documentation,
Josh:
there was no real cost associated. I have to assume it's
Josh:
not as high, but pretty close, because it is routing through
Josh:
a lot of the private models and some open source models in order to get this,
Josh:
which means it's probably consuming a good bit of tokens. It's not totally going
Josh:
to be this open source, very low price model,
Josh:
but it is interesting to see this trend towards more router based applications
Josh:
where not everyone needs to solve this incredibly difficult challenge.
Josh:
Perhaps you spin off a few sub agents, they use a more lightweight model to
Josh:
get you an answer without needing to consume a lot of those higher cost tokens.
Josh:
So it's cool, innovative, I won't say it's novel, we've seen this before,
Josh:
but it's a new iteration of this that is now showing pretty compelling benchmarks.
Ejaaz:
On the cost side of things, if it's anything like OpenRouter's Fusion API,
Ejaaz:
which uses the same architecture, it achieves roughly like 30 to 50% cheaper
Ejaaz:
versus the frontier models, which isn't that major compared to like some of
Ejaaz:
the Chinese open source models.
Ejaaz:
But it still saves you a bunch of money if you're an enterprise using this at length.
Ejaaz:
I'm trying to think about the major takeaway that I have for myself after we've
Ejaaz:
done this episode, Josh.
Ejaaz:
And I think the main one is I'm inclined to say, and I hope I'm wrong,
Ejaaz:
that future AI model releases, Fable and above, whether they come from GPT 5.6
Ejaaz:
or 6 or other frontier AI labs,
Ejaaz:
they're going to be more controlled in their release because governments are
Ejaaz:
going to start getting more involved.
Ejaaz:
We're going to start seeing nationalization attempts from different nation states
Ejaaz:
in order to figure out how to release these AI models, because if they're out
Ejaaz:
in the wild, they can exploit and cause some real damage.
Ejaaz:
I don't want to think about what could happen in terms of a major event,
Ejaaz:
but I think we're reaching that point where we need to pay careful attention.
Ejaaz:
So that's what we're trying to do on this episode. At least that's what I'm trying to do.
Josh:
Yeah, I think that's right. The speed and acceleration of these models
Josh:
and the cadence at which they're released is up only.
Josh:
If we had a chart that showed you the length of time in between major model
Josh:
releases, it is just getting shorter and shorter and shorter,
Josh:
and that's not changing.
Josh:
So there needs to be a way to reliably push these out.
Josh:
Otherwise, the gap between what exists behind closed doors and what's available
Josh:
to the public is just going to keep growing.
Josh:
And I'm not sure what implications that has, but it sounds like it is noteworthy and something...
Josh:
Something needs to change in a material way because the speed and velocity at
Josh:
which progress is being made is not slowing down.
Josh:
Like, what does this look like a year from now? How quickly are these models able
Josh:
to improve themselves? What do the benchmarks look like? Can we even create
Josh:
benchmarks anymore because they will be so capable?
Josh:
We're right on that cusp because we are approaching this vertical asymptote of the curve.
Josh:
And it's a little weird. It feels like we're on this roller
Josh:
coaster and we're kind of going down, but I guess it's inverted, where we're
Josh:
going up and we're going up really fast and you're not really sure. It's escaping.
Josh:
It's escaping control in a way. Well, I wouldn't say escaping control, but
Josh:
it's definitely getting fast. It's like, okay, if
Josh:
you're driving your car really fast, you've got to be a little more careful once
Josh:
you reach high speed, because things can get a little shaky quickly. So
Josh:
we're at that point, and models are getting very capable very quickly.
Josh:
I can't imagine what OpenAI's mythos class model looks like.
Josh:
I'm sure they're working on one.
Josh:
We talk about, I mean, the hardware. I always think about how these are the Blackwell series models.
Josh:
What happens with the Vera Rubin series models? It's like,
Josh:
we are going to accelerate so fast. And I think it's important to,
Josh:
yeah, work on these safeguards now, while it's still reasonable to catch up,
Josh:
where there's only one model release that you have to focus on.
Josh:
And there's not 10 different ones from all these different companies that are
Josh:
being pushed every single week. So, interesting. That's the update: China is
Josh:
back with their open weights model, not to be confused with open source. And
Josh:
yeah, we still don't have Fable access, so
Josh:
hopefully these things will get sorted. But I think it's noteworthy that
Josh:
China never disappeared. I want to know what DeepSeek is doing next.
Josh:
I think that's my next question: where's DeepSeek at? Where's DeepSeek V5 or V6? They just...
Ejaaz:
Raised a massive round. 50 billion dollars, that's their valuation at least.
Ejaaz:
They're still a fraction of frontier labs, but yeah, they raised, was it
Ejaaz:
nine billion dollars? The founder himself put in three billion dollars.
Ejaaz:
They're doing pretty well, and we haven't seen a model release from them in a while.
Josh:
Yeah, yeah. So it will be fun to see, but that is the update on China, on open source.
Josh:
Thank you guys so much for watching, as always. If you enjoyed this episode, don't
Josh:
forget to share it with a friend who might also like the show, who might care
Josh:
about China or open source models or whatever it may be.
Josh:
If you listen on a podcast player, rating us how you believe we deserve to be
Josh:
rated is always appreciated. We
Josh:
love the five stars, those are always great. Newsletter twice a week; the next one
Josh:
is dropping on Wednesday, a day after you listen to this. And yeah, that's...
Ejaaz:
I have one final request, Josh, something that you and I discussed on our walk last week.
Ejaaz:
We are in the market for sponsors or anyone that can support us, please.
Ejaaz:
Josh and I and producer Luke have been keeping the lights on this entire time
Ejaaz:
and we've reached a point where we're feeling really confident about the numbers
Ejaaz:
and all the support that you guys have given us.
Ejaaz:
And we would love to have a partner that we feel very passionate about join
Ejaaz:
us and support us in our vision of growing this into the leading frontier and
Ejaaz:
AI tech podcast in the world.
Ejaaz:
So if there's anyone out there listening to this that is inspired or wants to
Ejaaz:
support us, let us know, DM us, you know, we're on X, we're everywhere,
Ejaaz:
just reach out and we would love to hear from you.
Josh:
That would be great. All the support is very much appreciated.
Josh:
Keep the lights on around here and keep things going strong.
Josh:
So yeah, thank you as always for the support. If you made it this long,
Josh:
you're a real one and hopefully you enjoyed this episode. So thank you as always
Josh:
and we will see you on the next one.