Exploring the frontiers of Technology and AI
Josh:
A couple of weeks ago, we covered the Claude Mythos release,
Josh:
the model that found decade old security flaws overnight and scared the hell
Josh:
out of basically anyone who is following the AI story.
Josh:
So much so that the federal government is involved. But the part that we didn't
Josh:
get into is the backend that powered this model.
Josh:
Mythos was built on a chip from March, 2024 that Jensen pulled out of his pocket
Josh:
on stage at GTC, which was the Blackwell chip.
Josh:
It had 208 billion transistors. Everyone treated it like the future had arrived.
Josh:
And yet it took two years of fabrication for us to get the first manifestation
Josh:
of that, which is Claude Mythos.
Josh:
24 months from keynote to a working model. It happened with Hopper,
Josh:
it happened again with Blackwell, and it's going to happen again with our future models.
Josh:
But the difference is we have a series of future models that exist today that
Josh:
we can kind of map out to where we're going to be heading based on this trajectory
Josh:
that we've seen with the previous chips.
Josh:
And it's pretty awe-inspiring to see where we are going to go considering there
Josh:
are three generations of chips that have already been announced since Blackwell.
Josh:
We have Vera Rubin, Rubin Ultra, and Feynman.
Josh:
Each one, many multiples more powerful than the last. And when you look at what
Josh:
Blackwell already produced in the very first version, it gets impossible to
Josh:
imagine a world where we don't reach AGI on hardware that's already been designed.
Josh:
Everything that's been announced that is going into production almost certainly
Josh:
is going to produce models indistinguishable from AGI. At least that's what
Josh:
it seems like on surface level?
Ejaaz:
Yeah, so the story here in a single sentence is: AGI-like AI models are already here.
Ejaaz:
We just haven't distributed them because we haven't powered up the GPUs that enable
Ejaaz:
it. So everyone is obsessed with AI models.
Ejaaz:
We talk about our favorite models, how we prompt them, how intelligent they
Ejaaz:
are. But very few people are talking about the fact that
Ejaaz:
The hardware is the thing that powers these things. They train these things.
Ejaaz:
They inference these things.
Ejaaz:
And it's still about 70% of the influence of how intelligent your model is.
Ejaaz:
And the prime example, most recent example of that has been Anthropic's Mythos
Ejaaz:
release, right? You just mentioned it. It's discovered a bunch of different cybersecurity flaws.
Ejaaz:
It is this all-powerful thing, and governments around the world, including the U.S.
Ejaaz:
government and the Federal Reserve, are holding meetings with the top banks to
Ejaaz:
talk about the craziness of this model and how we must prepare.
Ejaaz:
There's a lot of doomer news out there in the future.
Ejaaz:
Little do you know that this was powered by a GPU or this was trained by a GPU
Ejaaz:
that was built 20 months ago. So we're talking about almost two years ago.
Ejaaz:
It's called Blackwell. And I want to give you guys an idea of the timeline of what this looked like.
Ejaaz:
So in March 2024, NVIDIA GTC, which is like their developer conference,
Ejaaz:
Jensen Huang comes on stage and he presents this gargantuan scrap of metal.
Ejaaz:
It looks very pretty, by the way. And he goes, this is Blackwell,
Ejaaz:
GB200, GB300, a brand new GPU.
Ejaaz:
We can train frontier models on it. Everyone gets so excited.
Ejaaz:
Their stock price absolutely ascends, right?
Ejaaz:
The thing is, people couldn't get their hands on this until exactly a year later.
Ejaaz:
So to give you guys an idea of the timeline, he announces it in March 2024.
Ejaaz:
Then by the middle of the year, they discover there's like a bit of a design
Ejaaz:
flaw and they amend that.
Ejaaz:
And then by the end of 2024, early 2025, they start shipping these units of
Ejaaz:
Blackwell GPUs out to the top frontier AI labs.
Ejaaz:
But there's an important nuance here, which is it's just the GPU sitting in a data center.
Ejaaz:
They aren't actually powered up. It's not until 6 to 12 months after that fact
Ejaaz:
that these GPUs were finally powered up,
Ejaaz:
used to train models, which is why we now start to see these new AGI-like models
Ejaaz:
like OpenAI's SPUD and Claude Mythos come to fruition.
Ejaaz:
So the point is, there is a long gap between the frontier GPUs being announced
Ejaaz:
and rolled out to them actually being powered to train the models.
Ejaaz:
We've talked about Elon Musk and xAI a lot on this show before.
Ejaaz:
They actually have the largest arsenal of these Blackwell GPUs.
Ejaaz:
They bought about a million of them.
Ejaaz:
The crazy part about this now is there are not one, not two, but three new NVIDIA
Ejaaz:
GPU models that have been announced at the recent NVIDIA GTC.
Ejaaz:
So there is a major lag between Frontier hardware and the new AI models that are being released.
Ejaaz:
And people don't understand this. And we want to tell you the story.
Josh:
You just remember GPT-4, how long ago that was and how that felt like the huge,
Josh:
most pivotal model that OpenAI ever released.
Josh:
I mean, that was the big one right after ChatGPT came out. That was trained
Josh:
using the Hopper chips. You know, the most recent model.
Ejaaz:
Hopper's a word I haven't heard in a while, Josh.
Josh:
Yeah, well, you know, GPT 5.4, the most recent model that we're using every
Josh:
single day on ChatGPT. That was also trained on Hopper chips.
Josh:
The same chips are training models from GPT-4 to GPT-5.4.
Josh:
And it's a testament to how the efficiency gains of software can actually increase
Josh:
the throughput of hardware.
Josh:
And I think I want to use that as an example because what we just got recently
Josh:
with Mythos through Anthropic, that seems to be the first real implementation
Josh:
of a true Blackwell model. And rumors are that SPUD, the new OpenAI model,
Josh:
is going to kind of be the same in terms of power that is coming as it relates
Josh:
to the first Blackwell model.
Josh:
And even if we don't actually iterate on the hardware, the amount of progress
Josh:
we're going to get from Blackwell models alone seems like it is going to be
Josh:
difficult to imagine it doesn't become some sort of an AGI, right?
Josh:
It's like when you think about the difference of intelligence between GPT-4
Josh:
and GPT-5.4 and how far we've come, that applied to Blackwell at this new scale,
Josh:
seems crazy. But that's not even the crazy part, because
Josh:
we have an entire roadmap of these three generations of
Josh:
chips that are coming that we can very clearly map to
Josh:
the gains that we're going to see. And I think that's when things
Josh:
get particularly disturbing, because on the chart that we're looking at on
Josh:
screen now, we have Blackwell. That's where we are right now. Blackwell is a significant
Josh:
improvement over the previous model. But then we have Vera Rubin, which jumps
Josh:
from 20 petaflops to 50 petaflops. That's a two-and-a-half to five times multiple
Josh:
on the compute. Then we have Rubin Ultra,
Josh:
which is scheduled for the second half of 2027.
Josh:
That is a 14 times multiple.
Josh:
And then we have Feynman in 2028, which is an estimated 30 to 50 times multiple
Josh:
on the current chip stack that we have today, assuming that we get no software progress at all.
Josh:
And what we saw with the Hopper chips is that we got a tremendous amount of
Josh:
progress just from software.
Josh:
So when you combine this 30 to 50 times multiple with maybe another 100 times
Josh:
multiple on software, if we make another breakthrough, we're looking at some
Josh:
pretty insane improvements here that like are really hard to wrap your head around.
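As a rough sketch of the arithmetic Josh is describing here, the figures can be multiplied out directly. The petaflop numbers, the 30 to 50 times hardware range, and the 100 times software figure are the hosts' own estimates, and treating hardware and software gains as independent multipliers is an assumption made only for illustration.

```python
# Illustrative only: every figure below is an estimate quoted in the episode,
# and multiplying them assumes hardware and software gains compound independently.

def combined_multiple(hardware: float, software: float) -> float:
    """Return the total multiple if the two gains compound."""
    return hardware * software

# Blackwell -> Vera Rubin, using the petaflop figures read from the chart.
blackwell_pflops = 20
vera_rubin_pflops = 50
rubin_vs_blackwell = vera_rubin_pflops / blackwell_pflops

# Feynman hardware range and the hypothetical software gain.
hardware_low, hardware_high = 30, 50
software_gain = 100

low = combined_multiple(hardware_low, software_gain)
high = combined_multiple(hardware_high, software_gain)

print(f"Vera Rubin vs. Blackwell: {rubin_vs_blackwell}x")
print(f"Hardware and software combined: {low:,.0f}x to {high:,.0f}x")
```

Under those assumptions the combined range works out to 3,000x to 5,000x, which is only as reliable as the estimates that go into it.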
Ejaaz:
I want to point out that these improvements, these multiples that you just mentioned
Ejaaz:
are just on the speed and power of these hardware modules, right?
Ejaaz:
So it's going to work 3x harder or 14x harder, but it's also going to cost you
Ejaaz:
a lot less to be able to train the same type of intelligence or model.
Ejaaz:
So the intelligence per density, which is a unit that we completely made up,
Ejaaz:
and we don't know if it exists, but it somehow rhymes in my head at least,
Ejaaz:
is improving and it's going to be cheaper with each successive model.
Ejaaz:
But if you want to get a bit of context as to like what that looks like in terms
Ejaaz:
of like the models that you use today and what it's going to look like tomorrow,
Ejaaz:
we have this other table here, which kind of like maps it out.
Ejaaz:
So with Blackwell today, you get about a two to three X more intelligent, crazier model, right?
Ejaaz:
That's what Claude Mythos is supposedly meant to be. It's like a larger size.
Ejaaz:
It's trained on these Blackwells.
Ejaaz:
You're going to see a bunch of similar models come out from OpenAI and xAI over
Ejaaz:
the next couple of months.
Josh:
Just to pause you there, these are already models deemed too dangerous to release for the public.
Ejaaz:
Yes. Just some emergency meetings literally being called by the Federal Reserve chair and top banks.
Ejaaz:
Actually, I read something yesterday that the NSA is using or conferring or
Ejaaz:
re-engaged with Anthropic, as well as the Pentagon and the U.S.
Ejaaz:
Defense Department, after banning and blacklisting Anthropic because it's so powerful.
Josh:
And that's where we are today.
Ejaaz:
That's today. So that's right here. 2026, two to three X, right?
Ejaaz:
Yeah, crazy. Now, you might notice that by next year, we have a larger multiple
Ejaaz:
on the original multiple.
Ejaaz:
By next year, we're going to have a 10 to 15x improvement purely through Vera
Ejaaz:
Rubin GPUs. Now, I must emphasize...
Ejaaz:
This does not include post-training. This doesn't include all the fine,
Ejaaz:
fancy techniques that AI labs themselves will implement to make a smart model.
Ejaaz:
This is just the hardware.
Ejaaz:
It's like buying the hardware and training a model today versus next year,
Ejaaz:
you're gonna get a 10 to 15x more intelligent model, but it gets even scarier.
Ejaaz:
2028, 30 to 50x.
Ejaaz:
2029, 100 to 200x. Now, I haven't seen these multiples in any other industry
Ejaaz:
for any kind of performance or hardware improvement.
Ejaaz:
So I can't wrap my head around this because it looks like just a few small numbers
Ejaaz:
that are getting larger, but these are multiples of its predecessor,
Ejaaz:
which means that we're probably going to get AGI,
Ejaaz:
honestly, by the start of next year.
Ejaaz:
And they're trained on hardware that currently exists and is rolling out.
Ejaaz:
I don't know. I'm just kind of scared reading all of this, to be honest,
Ejaaz:
because what happens if we have universal access to this?
Ejaaz:
There's going to be a load of malicious actors which can use these models for
Ejaaz:
various different things. But also, I don't know what these models are going
Ejaaz:
to be capable of. They're going to be so much smarter than humans themselves.
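To put the table Ejaaz reads out into perspective, here is a minimal sketch that computes the step from each year to the next. The ranges are the ones quoted in the episode; using the midpoint of each range is an assumption made here for simplicity, and the baseline the multiples refer to is not stated in the episode.

```python
# Projected multiples as read out in the episode: year -> (low, high).
# The midpoint comparison below is an illustrative simplification.
projected = {
    2026: (2, 3),
    2027: (10, 15),
    2028: (30, 50),
    2029: (100, 200),
}

years = sorted(projected)
step_ratios = {}
for prev, curr in zip(years, years[1:]):
    prev_mid = sum(projected[prev]) / 2
    curr_mid = sum(projected[curr]) / 2
    # Ratio of this year's midpoint to the previous year's midpoint.
    step_ratios[curr] = curr_mid / prev_mid

for year, ratio in step_ratios.items():
    print(f"{year}: about {ratio:.2f}x the previous year's midpoint")
```

On these numbers, each year's midpoint is roughly three to five times the previous one.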
Josh:
The disturbing thing is that this technology is here. Like this is,
Josh:
it's no longer an engineering problem or a physics problem necessarily.
Josh:
It's just a matter of actually producing the thing and plugging it into an outlet and putting it online.
Josh:
And this is coming. Like there are no novel breakthroughs required to make this a reality.
Josh:
Now, what that looks like on the other side, I don't know, but I think it's
Josh:
safe to assume the velocity of improvement we're going to get is certainly not
Josh:
slowing down. It is starting to more closely resemble a vertical line than anything else.
Josh:
And I think it begs the question, like, at what point do we reach AGI and how do we even define that?
Josh:
Because I'm not sure we've spoken about that much on the show, but Ejaaz,
Josh:
when you say AGI, what do you mean by AGI? What would you be looking for
Josh:
to declare, okay, we have finally reached AGI?
Ejaaz:
Okay, so this is like my own made-up definition, but it's what will make me go, okay, this is AGI.
Ejaaz:
It would be a single AI model, not many, but a single AI model that advances
Ejaaz:
the frontier of three key major industries autonomously. So I'll pick these
Ejaaz:
industries as examples.
Ejaaz:
The financial industry, so it trades better than the average, sorry,
Ejaaz:
than the best hedge fund or investor.
Ejaaz:
It is able to make assessments better than any of the financial analysts,
Ejaaz:
the top experts, et cetera, in that industry.
Ejaaz:
In science, it has discovered a bunch of medical cures for some major diseases
Ejaaz:
such as cancer, Alzheimer's, and stuff like that, that scientists,
Ejaaz:
top scientists at their top level could not figure out. It accelerates their research.
Ejaaz:
And maybe one other industry that I can't think of right now,
Ejaaz:
but it's when these models start doing things that the best of the best humans
Ejaaz:
right now couldn't figure out themselves and couldn't have seen themselves.
Ejaaz:
Do you have a similar definition or?
Josh:
Yeah, I think that sounds right. I think, and again, it's very fuzzy.
Josh:
Everyone kind of has their own custom definition of what they believe AGI is going to be.
Josh:
But for me, it's just AI that's smarter than the smartest human at pretty much
Josh:
any cognitive task that exists.
Josh:
So you can go to this model and it will be better than
Josh:
anyone else you can ask on planet Earth about anything. And
Josh:
the problem with models today is they're very spiky. Like, you can do this
Josh:
for code, probably, and it can code better than every
Josh:
human on Earth. But if you ask it, you know, a generalized question
Josh:
about something that you really know a lot about, there are a
Josh:
lot of times where it's not completely accurate, or it will respond
Josh:
as if it has the intelligence of a three-year-old. It fails the
Josh:
reasoning tests of a lot of simple things. It still feels like
Josh:
it's this very spiky entity. Once it is fully
Josh:
developed, once it is actually better at every cognitive task,
Josh:
and that includes physical things too, like understanding the physics
Josh:
of the real world, world models, that feels like AGI. And
Josh:
then artificial superintelligence, ASI, feels like
Josh:
it is smarter than all humans combined. So it's like if we put all of our brains
Josh:
together, no matter how long we tried, we could never come up with the things that
Josh:
artificial superintelligence will come up with. And, I mean, will we get there
Josh:
using this chip architecture? Possibly. I'm seeing a 50x multiple,
Josh:
not including the software multiples.
Josh:
And with those compounding on top of each other at the rate that we're moving,
Josh:
it seems like the only real constraint is going to be physical.
Josh:
It's going to be actually rolling out these models and powering them on.
Ejaaz:
Well, another crazy thing is, I think a lot of people, including myself,
Ejaaz:
would assume that with every chip upgrade, it's going to be more expensive,
Ejaaz:
and it's going to be bigger.
Ejaaz:
It's going to be clunkier, right? Like the data centers are going to get bigger,
Ejaaz:
it's going to be more expensive.
Ejaaz:
I wish I had a chart to show this, but it's actually the complete inverse.
Ejaaz:
And I'll give you some examples, some numbers to explain that, right?
Ejaaz:
So a reasoning task that costs $1 on Blackwell costs $0.20 on Vera Rubin,
Ejaaz:
which is rolling out as we speak or later this year.
Ejaaz:
And it'll only cost $0.07 on Rubin Ultra, which starts to get released by the start of next year.
Ejaaz:
So the cost is going down pretty massively.
Ejaaz:
Now, for 2028, Jensen announced the Feynman GPU, right?
Ejaaz:
A single rack of that, so we're talking about just a couple of them
Ejaaz:
stacked on top of each other, will process more compute than was required to
Ejaaz:
train GPT-4 that you mentioned earlier, Josh.
Ejaaz:
So the point is, less is more, but somehow more powerful, but also somehow more
Ejaaz:
cheap relative to the intelligence that you're building.
Ejaaz:
And if you assume this intelligence is going to reach this ASI,
Ejaaz:
AGI-like state, it's going to make you money as well.
Ejaaz:
So you end up just having, I guess, I'm afraid to say this, but the best of
Ejaaz:
both worlds. I don't know what humans are going to be doing, but it's great for AI.
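The per-task cost figures Ejaaz quotes can be turned into a quick comparison. This is a minimal sketch using only the three dollar amounts mentioned in the episode; they are the hosts' figures and are not independently verified here.

```python
# Per-task costs as quoted in the episode (US dollars); not independently verified.
cost_per_task = {
    "Blackwell": 1.00,
    "Vera Rubin": 0.20,
    "Rubin Ultra": 0.07,
}

baseline = cost_per_task["Blackwell"]

# Blackwell cost divided by each generation's cost for the same task.
cost_ratio = {chip: baseline / cost for chip, cost in cost_per_task.items()}

for chip, ratio in cost_ratio.items():
    print(f"{chip}: ${cost_per_task[chip]:.2f} per task, "
          f"Blackwell cost is {ratio:.1f}x this")
```

On the quoted numbers, the same task would be about 5 times cheaper on Vera Rubin and about 14 times cheaper on Rubin Ultra than on Blackwell.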
Josh:
Basically, yeah. There's no world in which things don't
Josh:
get better. And it feels like right now we're really just constrained by this
Josh:
compute power. There's this great meme that I saw online.
Josh:
It said Mythos is too powerful for public release,
Josh:
but the reality is that they're just completely out of compute and
Josh:
Anthropic can't actually supply the tokens required to give
Josh:
Mythos to the world. These optimizations, these cost structures...
Josh:
Yeah, there it is, we've got it on screen now. Great meme,
Josh:
great meme. But these cost structures that are
Josh:
going to come with these new models are going to completely destroy that factor,
Josh:
at least for now, until whatever that next generation of model is that is so
Josh:
powerful that it's constraining GPUs. And the interesting thing is that
Josh:
OpenAI has the same exact thing going on. All these models are kind of converging
Josh:
on the same spot, but they all seem to be compute constrained.
Ejaaz:
I think what critics will push back on though, Josh, for everything that we've
Ejaaz:
said so far is, okay, cool.
Ejaaz:
You can buy these new hardware things, but why would you do that if you could
Ejaaz:
just wait a few months or six months and buy the next thing?
Ejaaz:
Jensen's just shipping out these products. He's making a load more money.
Ejaaz:
It doesn't make sense. These things are depreciating assets.
Ejaaz:
By the time you've bought the first one and you've ramped that up with power
Ejaaz:
and training your next model, there's already three other new chip architectures.
Ejaaz:
And he would be right, that critic would be right,
Ejaaz:
except that they're massively, massively wrong. And we have proof for that,
Ejaaz:
right? GPUs have now become this anti-depreciation machine.
Josh:
One of the most amazing things about this phenomenon, and it feels like a narrative
Josh:
violation, is the idea that the GPUs that were released three years ago are
Josh:
actually more valuable today than they were at the time they launched,
Josh:
which is a pretty bizarre idea.
Josh:
We have this artifact on screen that shows a chart.
Josh:
And an H100 from NVIDIA cost $30,000 when it launched in 2023.
Josh:
At its peak, because of the scarcity, because everyone
Josh:
needs these things, it was selling for a four times multiple at $120,000
Josh:
per H100. This is kind
Josh:
of outrageous. It was a little exorbitant. We don't need to
Josh:
be paying that much money. But now that they are old, they're not depreciated,
Josh:
even though there's much better hardware out there. They're still holding their price
Josh:
at $30,000. In fact, you can see a rebound that happens in late 2025 where the
Josh:
cost of these H100 GPUs actually ticks upwards. And I think a lot of the people,
Josh:
Michael Burry most famously, who is the guy behind The Big Short,
Josh:
created an entire short thesis around the idea that the depreciation schedule
Josh:
of these GPUs wasn't aggressive enough and they were actually going to lose
Josh:
their value and therefore the market was going to deflate because the companies
Josh:
weren't marking these down properly.
Josh:
The reality is that not only are they not going down, they're starting to
Josh:
trend back up because the incremental cost for a token is so low with these
Josh:
and everyone's so desperate for compute that they're like, well,
Josh:
might as well spend some extra money,
Josh:
get the H100s and start generating inference tokens with them.
Josh:
It's this pretty amazing phenomenon that's happening.
Ejaaz:
Yeah, so if you're wondering why this is happening, explicitly, it's that AI demand
Ejaaz:
is growing faster than chip supply can expand.
Ejaaz:
We don't have enough fabs or the manufacturing prowess or the energy grid to
Ejaaz:
support creating and generating more GPUs to satiate the demand that we're seeing
Ejaaz:
in AI across all these different industries, right? It's a very pervasive bit of technology.
Ejaaz:
Now, the data that we're showing you on the screen right now isn't siloed to
Ejaaz:
like a few research papers. This is happening in the market right now,
Ejaaz:
and it's incredibly liquid.
Ejaaz:
So a new phenomenon of companies in AI whose stocks have all skyrocketed are
Ejaaz:
these things called neoclouds, right?
Ejaaz:
So these are like, think of it as like AWS. They supply compute to train your
Ejaaz:
AI models by setting up their own data centers, and they kind of like provide
Ejaaz:
it to you in like a cloud or data center specific structure.
Ejaaz:
An example would be CoreWeave. The idea here is that for these data centers or these GPU providers,
Ejaaz:
70% of the GPUs that they're running are the old GPUs that we're showing you on our screen right now.
Ejaaz:
And they're booked out, I'm not exaggerating, 6 to 12 months in advance.
Ejaaz:
In fact, it's done through contracts, and the same providers renew the contracts
Ejaaz:
three months before the contract needs to be renewed just to make sure that
Ejaaz:
they get access to these older GPUs.
Ejaaz:
So the point I'm trying to make, and you mentioned this just now,
Ejaaz:
Josh, is all that matters is can I get AI tokens generated to do the thing that
Ejaaz:
my company needs or answer the prompt that I have?
Ejaaz:
And if the answer is yes, and it's for a reasonable price, I'm down to go for
Ejaaz:
that because the value that you can build and earn on top of that is invaluable,
Ejaaz:
right? They can have a large markup on that.
Ejaaz:
So it makes sense that these assets are kind of like in high demand.
Ejaaz:
And to your earlier point, Michael J. Burry like shorted the entire market saying
Ejaaz:
that these are depreciating assets and he got that completely wrong.
Ejaaz:
And his thesis specifically was based on the idea that they can't train frontier models.
Ejaaz:
And he's actually right.
Ejaaz:
The older GPU models can't train frontier models. But what they are being used for
Ejaaz:
is one thing very specifically, inference, which is if someone has a question,
Ejaaz:
how do I get them the answer? How do I process the prompt?
Ejaaz:
That's what the older GPUs are being used for. And they're really damn good at it.
Ejaaz:
And the reason why it's important and essential for AI labs specifically, who
Ejaaz:
are training models and who you might think would want the expensive models, is that
Ejaaz:
they have a ton of inference.
Ejaaz:
They use inference to even train the new models. So it's this new paradigm where
Ejaaz:
all these old GPU architectures are being re-found or repurposed for this really
Ejaaz:
important thing that is inference.
Ejaaz:
So important context to understand if you're investing in some of these companies, for example.
Josh:
Yeah. And why is it so valuable? Well, it's a testament to the software improvements,
Josh:
right? So we have those software efficiency improvements that we didn't have three years ago.
Josh:
So that same hardware generates a lot more value.
Josh:
And if we scroll down to the value multiplier section of this artifact, it shows
Josh:
that the cost of chatbot inference in 2023 was $3 an hour, and now
Josh:
autonomous agents completing these complex tasks are $30 to $300 per hour.
Josh:
The value that you can charge for these tokens is significantly higher than it was in the past.
Josh:
And the amount of tokens that you're able to generate efficiently at higher
Josh:
quality is much higher as well.
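The value multiplier Josh reads off the chart is simple to work out. A minimal sketch, assuming the hourly dollar figures quoted in the episode are directly comparable, which the episode does not establish:

```python
# Hourly figures as quoted in the episode (US dollars); treating them as
# directly comparable is an assumption made for illustration.
chatbot_2023_per_hour = 3.0
agent_low_per_hour, agent_high_per_hour = 30.0, 300.0

multiple_low = agent_low_per_hour / chatbot_2023_per_hour
multiple_high = agent_high_per_hour / chatbot_2023_per_hour

print(f"Agent work vs. 2023 chatbot inference: "
      f"{multiple_low:.0f}x to {multiple_high:.0f}x per hour")
```

That is a 10 to 100 times increase in what an hour of the same hardware can bill for, if the chart's figures hold.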
Josh:
So there's all these converging forces that are just making the market desperate for compute.
Josh:
Nobody has the compute required that they want. And NVIDIA is trying to put
Josh:
it online as fast as they can, but it's not fast enough.
Josh:
And I assume as we go through this, we're going to continue to see varying bottlenecks
Josh:
and the efficiencies will move to where there are bottlenecks,
Josh:
which creates new bottlenecks. Right now we're seeing some convergence around
Josh:
CPUs, and CPUs seem like they're going to be hitting a
Josh:
shortage somewhat soon, because we're out of GPUs, so let's move to CPUs.
Josh:
And it's this really interesting dynamic. But that is the idea
Josh:
of this NVIDIA episode, or just the chip episode in
Josh:
general: that it is hard to imagine a world in which we don't reach
Josh:
AGI given the currently announced infrastructure. It
Josh:
doesn't require any breakthroughs. It's just, if NVIDIA does
Josh:
what they announced on stage through Jensen Huang, through these next three
Josh:
chips, it is almost impossible to imagine what the world of intelligence is going
Josh:
to look like. And I think what's important to understand is that Mythos is trained
Josh:
on a two-year-old chip and no one's really talking about that. So it blew my
Josh:
mind. Hopefully it blew yours as well, or at least you found it a little bit fascinating.
Josh:
And that is our episode today. Thank you guys so much for watching. We really appreciate it.
Ejaaz:
And I know some of you are probably thinking, oh, there's a bunch of challenges
Ejaaz:
here. And Josh actually just mentioned one of them, which is like, you've got CPUs,
Ejaaz:
we don't have enough energy, we don't have enough memory.
Ejaaz:
And that's like another episode that we can get into.
Ejaaz:
So all of those things, we assume, will be leveled at some point.
Ejaaz:
And we're gonna see all those industries grow versus being constrained.
Ejaaz:
People are throwing trillions of dollars into this industry.
Ejaaz:
So all of those problems should theoretically be fixed.
Ejaaz:
But rest assured, we will be the first show to cover it and give you those thoughts
Ejaaz:
before it happens, by the way.
Ejaaz:
And Intel is a sneaky one to get into. But we'll talk about that another time.
Ejaaz:
Thank you so much for listening. If you are not subscribed to us, please subscribe.
Ejaaz:
It helps us out massively. We are having banger weeks on YouTube,
Ejaaz:
Spotify, Apple, and wherever you listen to us.
Ejaaz:
Please rate us. Leave us a comment. We love hearing your feedback.
Ejaaz:
There are like thousands of newbies that are listening to the show, welcome.
Ejaaz:
And also give us feedback about stuff that we may not be covering that you want
Ejaaz:
to hear more of. We're always open to feedback.
Ejaaz:
But until then, I guess we'll see you on the next one.