Exploring the frontiers of Technology and AI
Josh:
Another week, another banned model. The government has just restricted access
Josh:
to GPT 5.6. OpenAI is now joining the club that Anthropic joined a few weeks
Josh:
ago of a model too powerful to be distributed publicly.
Josh:
GPT 5.6 seems like it's pretty good. It looks like it is a Mythos-class model,
Josh:
but Ejaz, upon talking to you, the truth is it might not have even really needed to be banned.
Josh:
Is the government overreacting here a little bit because this model seems like
Josh:
it's not quite what it appears to be on the surface?
Ejaaz:
I think so. OpenAI's framing on this is that this is their response to Claude
Ejaaz:
Mythos 5 or Fable 5, which is Anthropic's frontier model that has been restricted
Ejaaz:
by the government for, like, two and a half weeks now.
Ejaaz:
My take on this is that I think they've intentionally created a bench-maxed model.
Ejaaz:
And what I mean by this is, well, there's a few things. Number one,
Ejaaz:
you and I can't use this right now.
Ejaaz:
At least when Fable 5 released, we could use the thing and try it out for ourselves,
Ejaaz:
we could get independent verification that this model was actually good.
Ejaaz:
With GPT-5.6 Sol, which is their most powerful model, there's three of them,
Ejaaz:
and I'll get into that in a second, we have a few benchmarks that have been
Ejaaz:
cherry-picked by OpenAI themselves.
Ejaaz:
I'm showing the flagship one on the screen right now called Terminal Bench 2.1.
Ejaaz:
This is the coding benchmark, which every model is kind of like measured against.
Ejaaz:
And you'll notice on the left, GPT 5.6 Sol Ultra, which is like the max max
Ejaaz:
mode of their best model, comes in at 91.9%, which technically beats Claude
Ejaaz:
Mythos 5 and Fable as well.
Ejaaz:
So technically, if you looked at this, you might think it's really,
Ejaaz:
really good at coding. But there's a lot of information which we'll get into
Ejaaz:
later on in this episode, which suggests that the model might actually be cheating.
Ejaaz:
But before we do that, let's maybe get into what the models are,
Ejaaz:
because it's three of them, right?
Josh:
Yeah, three models. We have Sol, Terra, and Luna. If you are anywhere adjacent
Josh:
to crypto, that triggers a little bit of PTSD because those are the names of
Josh:
tokens that have not done so well.
Josh:
But in the context of OpenAI and ChatGPT, these are the three model types.
Josh:
So Sol is the largest one.
Josh:
It is the Sun. It is their flagship model with 5.6.
Josh:
Then they have Terra, which is a kind of mid-tier model. It seems like the pricing
Josh:
of that is going to be pretty competitive, if not a little bit lower than what
Josh:
we're used to on the frontier.
Josh:
And then Luna is the affordable model. Luna is the low-end model that seems
Josh:
like it still performs very high, but the cost is low.
Josh:
The output is $6. The input is $1 per million tokens. And that's kind of the trio.
Josh:
It seems like they're starting to revise their branding a little bit to make
Josh:
it slightly more accessible. There's no GPT 5.5 X high minus this.
Josh:
It's like, no, okay, there's Sol, Terra, Luna, and you can kind of have an idea of where they all stand.
Josh:
And that's how it seems like they're going to be moving forward here, with something
Josh:
accessible. It's great that it's accessible. It's a bummer that their market, who
Josh:
I assume this is targeted towards, can't actually use this. This is just for companies currently, who
Josh:
probably don't care if it's named X high or whatever. So that is the current
Josh:
trio that they're going forward with.
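To make the Luna pricing quoted above concrete ($1 per million input tokens, $6 per million output tokens), here is a minimal sketch of the arithmetic. The function name and the example token counts are assumptions chosen for illustration; only the two prices come from the episode.

```python
# Luna prices as quoted in the episode, in USD per one million tokens.
LUNA_INPUT_PER_MILLION = 1.00
LUNA_OUTPUT_PER_MILLION = 6.00


def request_cost(input_tokens, output_tokens,
                 input_price=LUNA_INPUT_PER_MILLION,
                 output_price=LUNA_OUTPUT_PER_MILLION):
    """Return the USD cost of one request given per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Hypothetical workload: 200k tokens in, 50k tokens out.
example_cost = request_cost(200_000, 50_000)
print(f"${example_cost:.2f}")
```

Because output tokens are six times the price of input tokens here, a job that generates a lot of text costs noticeably more than one that mostly reads.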
Ejaaz:
So there's a few advantages if we walk through some of these models. So Sol, which
Ejaaz:
is their most powerful model, comes in at a third of the cost that Anthropic's
Ejaaz:
Mythos 5 and Fable 5 come in at. So
Ejaaz:
if it does end up being publicly released and you end up using it and you're
Ejaaz:
like, wow, this is as good as Fable 5, you now have a much cheaper model. So that
Ejaaz:
might be an important decision point for any user, whether you're an enterprise
Ejaaz:
or a retail person using this.
Ejaaz:
And then if you look at Terra, if you look at the cost that we show on the screen
Ejaaz:
right now, you'll notice that that's very similar to GPT 5.5.
Ejaaz:
So you have this technically better model that is as cheap as 5.5.
Ejaaz:
So we've noticed this trend, it's another confirmation that as these models
Ejaaz:
get better, they also weirdly become cheaper. There's like this inversely proportional
Ejaaz:
trend, which is kind of counterintuitive.
Ejaaz:
But it's great, because it means if these things are available to everyone,
Ejaaz:
it is way more accessible to use at scale.
Ejaaz:
And then you have Luna, which they basically described as the workhorse.
Ejaaz:
So let's say if you get Sol to design like a really smart genius plan or solution,
Ejaaz:
you would then use Luna to actually execute on a bunch of the work.
Ejaaz:
And there have been a number of different kind of like general reviews as to
Ejaaz:
like what this model is like.
Ejaaz:
Unfortunately, I have to take random
Ejaaz:
people's word on X for how good it is because we can't use it ourselves.
Ejaaz:
And when asked, Sam basically said, listen, right now it's in limited release
Ejaaz:
to a specific set of partners. I think it's like 10 to 20 partners,
Ejaaz:
a really limited set that the government themselves, the US government has approved,
Ejaaz:
they're vetted and approved.
Ejaaz:
And Sam said, as for a wider general release, we're working hard for a worldwide release.
Ejaaz:
So in the very same vein that Fable 5 has been restricted due to a government
Ejaaz:
framework that they're trying to figure out, OpenAI has been subjected to the
Ejaaz:
same thing. But there's a twist, and we'll get into this later,
Ejaaz:
which is that Sam did this voluntarily.
Ejaaz:
He gave it up and said, yeah, we can ban the model before we even need to get
Ejaaz:
it tested. He didn't push back like Dario and Anthropic did.
Josh:
Yeah, it's a bummer. And he came out with a public launch post about this,
Josh:
where it started with good news, like, hey, Sol is smart, efficient,
Josh:
and a significant step forward.
Josh:
It's the same price as 5.5, which is pretty cool that we get a step function
Josh:
improvement at the same price.
Josh:
Also noteworthy is that the Terra model, the small one, is roughly equivalent
Josh:
to 5.5 in terms of intelligence.
Josh:
So we're getting 5.5 now at a much cheaper rate, which is really cool.
Josh:
But then he continues this post with bad news: at the request of the U.S. government,
Josh:
it is launching today in limited preview instead of the open access launch that
Josh:
we were planning on. It seems like it was, I mean, internally pretty disappointing for Sam
Josh:
to have to deal with this. It seems like they have been trying to publish
Josh:
this model for a little while, according to the rumor mill. They finally got around
Josh:
to publishing it and immediately were just slapped with the ban hammer. It's tough, and I think
Josh:
the government probably views this as a Mythos-class model. According to some
Josh:
of the evals, it looks like they scored a 91.9 on the exploit bench, which is
Josh:
the cybersecurity eval that shows
Josh:
roughly how dangerous the model is, if we're going by this, according to the government.
Ejaaz:
This is like part of my whole qualm with this model release.
Ejaaz:
If you look at this exploit bench that we're showing on the screen right now,
Ejaaz:
you'll notice something weird.
Ejaaz:
You'll notice that Mythos 5 is
Ejaaz:
technically higher and achieved a better score than GPT 5.6 did on this.
Ejaaz:
But if you look, GPT 5.6 scored the exact same score as Mythos Preview,
Ejaaz:
which was the first Mythos model that Anthropic released. And I think that this
Ejaaz:
is Sam Altman, or OpenAI specifically, trying to play politics here.
Ejaaz:
I think he intentionally tried to game the benchmark in this respect to match
Ejaaz:
Mythos Preview so that the government didn't slap him with an automatic restricted
Ejaaz:
ban, which they've done with Anthropic right now. So I think he's being political.
Ejaaz:
And it's not just this benchmark that has proven this.
Ejaaz:
Are you aware of the long horizon one, Josh? You know, the benchmark where like
Ejaaz:
you just let the model run loose on a task for like six hours and there's a
Ejaaz:
probability of how good it is at actually succeeding at that task. Yes.
Ejaaz:
So typically these models have been getting exponentially better.
Ejaaz:
So if you look at Fable 5, I believe their long horizon task performance is
Ejaaz:
11.5 hours, which means that 50% of the time, if you put it on a task that is
Ejaaz:
as hard for a human to take 11.5 hours on,
Ejaaz:
Fable will be successful, which is a huge measure against whether models can
Ejaaz:
replace a bulk of what humans do as hard work.
Ejaaz:
Now, GPT 5.6 was put against the same test, except they couldn't come out with a score.
Ejaaz:
And the reason why they couldn't come out with a score was the model was caught
Ejaaz:
cheating every single time.
Ejaaz:
If they didn't include the fact that the model was cheating,
Ejaaz:
and by the way, this was available in the system card that OpenAI spoke about,
Ejaaz:
so they confirmed it as well.
Ejaaz:
If you ignored all the cheating that it did, it would have achieved a 205 hour
Ejaaz:
long horizon task performance, which is basically like 20x what Fable 5 did,
Ejaaz:
but it cheated, it found the answers, it kind of like held the rules against themselves.
Ejaaz:
And so I can't really believe anything that we're seeing right now until we
Ejaaz:
actually use this model and it's restricted for now.
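The "50% of the time" definition of long horizon performance described above can be sketched in a few lines. This assumes a logistic success curve over the log of task length, which is one common way such horizons are modeled; the curve shape, the `slope` parameter, and the function name are assumptions for illustration, not details from the system card. The two hour figures are the ones quoted in the episode.

```python
import math


def success_probability(task_hours, horizon_hours, slope=1.0):
    """Hypothetical logistic model of success versus task length.

    By construction the probability is exactly 0.5 when the task length
    equals the model's horizon, and it falls as tasks get longer.
    """
    x = slope * (math.log2(horizon_hours) - math.log2(task_hours))
    return 1.0 / (1.0 + math.exp(-x))


FABLE_5_HORIZON_HOURS = 11.5       # figure quoted in the episode
GPT_5_6_UNADJUSTED_HOURS = 205.0   # figure quoted if the cheating is ignored

p_at_horizon = success_probability(FABLE_5_HORIZON_HOURS, FABLE_5_HORIZON_HOURS)
p_double_length = success_probability(2 * FABLE_5_HORIZON_HOURS, FABLE_5_HORIZON_HOURS)
ratio = GPT_5_6_UNADJUSTED_HOURS / FABLE_5_HORIZON_HOURS

print(f"success at the horizon: {p_at_horizon:.2f}")
print(f"success at twice the horizon: {p_double_length:.2f}")
print(f"unadjusted horizon ratio: {ratio:.1f}x")
```

On these numbers the unadjusted ratio works out to a little under 18x, which is the "basically 20x" mentioned above.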
Josh:
Yeah, and I hope we get a chance to test this model so you can see exactly how
Josh:
badly it's cheating. That doesn't sound very aligned to me, Mr.
Josh:
Altman. But I will say there seems to be some parts of this model that are pretty impressive.
Josh:
And that has a lot to do with the pricing and efficiency.
Josh:
It appears as if they were able to accomplish this benchmark using about a third
Josh:
of the tokens that would traditionally need to be used in order to get there.
Josh:
So it looks like on an efficiency front, it's very strong. And in terms of actual
Josh:
capabilities, it is right up there with a Mythos class model.
Josh:
So this seems really strong, and I wonder what the downstream implications of
Josh:
this kind of trio of models is going to be if this is the new standard,
Josh:
because it seems like this, in a way, is somewhat an answer to the Chinese open source models, where
Josh:
they have their flagship model, but now they also have this really tiny model
Josh:
that's just as capable as the one you were using yesterday, or actually the one
Josh:
that the public is currently using today, except it costs a fraction of the price. So if
Josh:
you are a customer for OpenAI and you have a complicated task that doesn't need
Josh:
to be routed entirely through the Sol model,
Josh:
they can have their internal router route it through the Luna model,
Josh:
which is a little bit smaller, and then they could get a better response for a cheaper cost.
Josh:
And it seems like that's what the companies are doing now, is they're starting
Josh:
to figure out how to increase efficiency and then route it through the correct
Josh:
model in order to kind of make the prices competitive, knowing that open source
Josh:
is coming and they're very, very cheap per token.
Ejaaz:
If you are a frontier lab like Anthropic or OpenAI, the number one threat that
Ejaaz:
you're facing right now isn't each other.
Ejaaz:
It is these open source models being created by Chinese AI labs that are essentially
Ejaaz:
free to access and use and a lot, lot cheaper than the frontier models.
Ejaaz:
So six months from now, the estimate is you'll have a Mythos 5 level model that
Ejaaz:
is free and accessible to anyone and everyone.
Ejaaz:
So the government ban in that respect kind of seems sort of weird.
Ejaaz:
Like, why are we banning something that is eventually going to become available
Ejaaz:
to everyone? So like, you know, how are we protecting against this?
Ejaaz:
Well, one major answer is you have big American companies like Coinbase,
Ejaaz:
like Microsoft, like Uber that are switching their internal token use
Ejaaz:
to these Chinese models because they are open source or open weight and free and
Ejaaz:
accessible to use and saves them millions and millions of dollars.
Ejaaz:
So what is the answer from Anthropic and OpenAI? It's to release not just a
Ejaaz:
flagship model, but two distilled versions that are
Ejaaz:
maybe not as good, but maybe 80 to 90% as good, but a heck of a lot cheaper.
Ejaaz:
That way they can keep you within their ecosystem. So it wouldn't surprise me
Ejaaz:
if say Anthropic or OpenAI released a feature in the future where you can kind
Ejaaz:
of type your prompt and they will route different parts of the task to different
Ejaaz:
types of the models and save you a ton of money.
Ejaaz:
One visceral example of this actually over the last couple of days is Brian
Ejaaz:
Armstrong had a tweet from Coinbase. Brian Armstrong is the CEO of Coinbase.
Ejaaz:
And he said that they have successfully increased the amount of tokens spent
Ejaaz:
as a company and slashed their budget by 50%.
Ejaaz:
And the way that they've been able to do this is through aggregators and routers.
Ejaaz:
So I think it's a good point to show that the fact that OpenAI and Anthropic
Ejaaz:
are now releasing a bunch of models at once isn't a coincidence.
Ejaaz:
It is a direct response to China and these open source models.
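The routing idea discussed above, where the flagship model plans and the cheaper model executes, can be sketched as follows. This is a toy illustration, not how OpenAI or any aggregator actually routes requests. Only the Luna prices come from the episode; the Sol prices, the step sizes, and every name in the code are hypothetical placeholders.

```python
from dataclasses import dataclass

# USD per one million tokens. Only the Luna figures were quoted in the
# episode; the Sol figures are invented placeholders for illustration.
PRICES = {
    "luna": {"input": 1.00, "output": 6.00},
    "sol": {"input": 10.00, "output": 60.00},  # hypothetical
}


@dataclass
class Step:
    kind: str            # "plan" or "execute"
    input_tokens: int
    output_tokens: int


def route(step):
    """Send planning steps to the flagship model and everything else to Luna."""
    return "sol" if step.kind == "plan" else "luna"


def step_cost(step, model):
    """Return the USD cost of running one step on the given model."""
    price = PRICES[model]
    total = step.input_tokens * price["input"] + step.output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical job: one planning step followed by ten execution steps.
steps = [Step("plan", 20_000, 5_000)]
steps += [Step("execute", 100_000, 20_000) for _ in range(10)]

routed_total = sum(step_cost(s, route(s)) for s in steps)
sol_only_total = sum(step_cost(s, "sol") for s in steps)

print(f"routed: ${routed_total:.2f}")
print(f"flagship only: ${sol_only_total:.2f}")
```

The saving in this sketch comes entirely from the assumed price gap between the two tiers, so the real benefit depends on actual pricing and on how much of a job truly needs the flagship model.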
Josh:
Yeah. And maybe I want to take a second to go back to that cheating thing because
Josh:
it seems somewhat serious. In the model card they were documenting some of the incidents of
Josh:
the model deleting the wrong virtual machines and then copying hidden credentials
Josh:
between machines without proper authorization and then falsifying the claims
Josh:
in research drafts. So, like, you can actually read the outputs of this model
Josh:
hiding its outputs, understanding that it's hiding its outputs and knowing exactly
Josh:
what it's doing. And it begs the question: we have access to this now,
Josh:
but the reasoning traces are still very much a black box. And in the case that
Josh:
there is a next generation Sol model, like GPT 5.7 or 5.8 or 6.0, that's going
Josh:
to be incredibly intelligent and
Josh:
better at hiding things, I'm always curious about what that looks like.
Josh:
Are they against the clock to
Josh:
make sure this model is aligned enough to not cheat, versus being able to catch it when it cheats?
Josh:
Or is it just going to always be easy to see those reasoning traces and see
Josh:
when it's not telling the truth? I think that's something that we're going to
Josh:
be facing pretty soon too. It's like, okay, well, it claims it's a Mythos-class model,
Josh:
but it turns out it hasn't actually
Josh:
enacted on those things. It's just kind of cheating its way there.
Josh:
So it seems like with every increasing release, we have to take these benchmarks
Josh:
with a little bit more of a grain of salt because they were getting there in
Josh:
weird ways and we're not entirely sure and the companies aren't entirely sure
Josh:
and there's still this black box.
Josh:
So it's a weird place we're getting to, but I think we'll kind of understand more
Josh:
once it's publicly available. Like right now it is in that preview.
Josh:
There's not a lot of people who are able to use it, to test it,
Josh:
to see those reasoning traces, to understand why it's making the decisions it does.
Josh:
And I'm hopeful that we'll get access to it pretty soon to start using it and
Josh:
testing it out ourselves.
Ejaaz:
It's this weird trio of things happening at once where
Ejaaz:
there are now frontier models that are super intelligent, that are capable of
Ejaaz:
exploiting any system in the world at the same time that you can't get access
Ejaaz:
to the thing, at the same time that the government is choosing and handpicking
Ejaaz:
who gets access to that thing, right?
Ejaaz:
So it's like you have more intelligence, but now like less access to the actual model.
Ejaaz:
The other part is like what you described just now, which is like understanding
Ejaaz:
what the model is thinking, whether it's cheating, what its true intentions are.
Ejaaz:
That's the field of interpretability, right?
Ejaaz:
And actually, Dario and Anthropic have probably put in the most research and
Ejaaz:
investment into this with autoencoders and a bunch of other stuff like that.
Ejaaz:
The idea is, can we read what the model is thinking?
Ejaaz:
And the simple answer right now is, we kind of can, but not really.
Ejaaz:
It is this black box that you mentioned. And it is maybe concerning that we
Ejaaz:
are kind of like accelerating pretty rapidly to these hyper intelligent models,
Ejaaz:
but we don't know truly what they're capable of.
Ejaaz:
And if we do start applying them to anyone and everyone's workload or personal
Ejaaz:
life, whatever that might be.
Ejaaz:
You could speculate potentially that these models might be nefarious in some
Ejaaz:
ways. And, you know, it's proven by these system cards, it's proven by these
Ejaaz:
different types of experiments. So the truth is, we don't know.
Ejaaz:
And maybe the answer is like, we have to get the AI to like evaluate itself,
Ejaaz:
which then gets very messy. But I don't know if you feel this,
Ejaaz:
Josh, but like, it feels like it's escaping us at this point.
Ejaaz:
Well, we're at this point where like, you know, these models are very intelligent,
Ejaaz:
they're kept indoors until they can be publicly released.
Ejaaz:
And without the public release, we don't get a kind of global feedback as to
Ejaaz:
what might be wrong. One thing I loved about the public releases is people could
Ejaaz:
point out and criticize and say like, hey, actually, it can only do this or
Ejaaz:
hey, no one knew that it could do this, check this out.
Ejaaz:
But we kind of have lost that with these closed models. And it's kind of sad
Ejaaz:
that OpenAI hasn't been able to release it publicly, at least yet.
Josh:
Yeah, it's a shame. And it's important to note that like this wasn't an export
Josh:
control. This was not mandatory. This was the government suggesting they keep
Josh:
it private, OpenAI complying, and
Josh:
then working with them to choose 20 companies, I believe the number is.
Josh:
And this starts to feel like it disconnects the public and the private more and more.
Josh:
Because like internally, you have to assume progress is not slowing down.
Josh:
These companies are continuing to train even more powerful models.
Josh:
They're continuing to use them internally to recursively improve those.
Josh:
And in fact, when Sam Altman was asked, is there anything happening on the
Josh:
meta slash continuous learning front, he replied with continual progress.
Josh:
And it seems like everything is just going to continue to move faster.
Josh:
These models are going to begin
Josh:
this recursive learning loop in which they can help build themselves.
Josh:
And that's part of the reason why we're getting such an increased cadence of
Josh:
model launches is because as they get more intelligent, more capable,
Josh:
they can help build the building blocks that allows them to create the even
Josh:
newer, faster frontier model with better price efficiency and better intelligence.
Josh:
And the public still isn't even caught up on the last one. So these blocks that are put in place
Josh:
only really harm, I mean, I would think, the public right now,
Josh:
because they're the ones that are now losing out on the access to this intelligence.
Josh:
Whereas internally, everyone is just going to keep building,
Josh:
you have to assume like they're not going to stop training these models just
Josh:
because the public can't use them.
Josh:
So now there's this increasing gap where even if you are on the frontier as
Josh:
a public facing figure, you don't have access to the actual frontier.
Josh:
And there's this disconnect where it's like, there's three types of people,
Josh:
there's like the people who use ChatGPT as Google, there's the people who are
Josh:
using it as a productivity tool, and there's the people who are building it.
Josh:
And there's these huge gaps in terms of perception of what these things are
Josh:
capable of at each one of those levels.
Josh:
And I wonder what the downstream effects of those gaps start to become.
Josh:
As they become more and more pronounced and increased over time,
Josh:
where there is such a large divide between the people who think that ChatGPT is like
Josh:
a really powerful version of Google versus the people who are using ChatGPT
Josh:
to build like unbelievable systems, build entire companies using these AI models.
Josh:
And that disconnect seems like it's only going to grow as these models continue
Josh:
to be held back. So I hope that there is a constructive conversation here.
Josh:
Based on Sam's AMA, it sounds like there will be in terms of just building out
Josh:
a reliable infrastructure that allows companies to predictably go through this
Josh:
process that allows them to get the model publicized.
Josh:
And I think that's where we're at right now is trying to figure out where that
Josh:
framework sits, so that when a company has GPT 5.6, they can go to the government and
Josh:
say, hey, we have this, we want to release it here, help us get there. They could
Josh:
pass all the tests and then publish it. And I think that's hopefully the goal
Josh:
going forward with these new model launches. Like, it sucks. I'd love to be
Josh:
using Fable and GPT 5.6 right now. We're still stuck on the old models.
Ejaaz:
Yeah, I mean, the answer can't be as simple as let's just complain about every
Ejaaz:
government ban. The truth of the matter is, like, these models are potentially
Ejaaz:
capable of doing what the government says they can do and exploiting every single
Ejaaz:
security system that they have. So in that event, it's probably smart not
Ejaaz:
to give the reins over to anyone and everyone, in case it does end up in a massive
Ejaaz:
exploit or whatever that might be.
Ejaaz:
I have to think, if you are Sam or Dario right now, you have spent the best
Ejaaz:
part of a decade creating these companies, building these models,
Ejaaz:
and you've seen the most rapid acceleration ever.
Ejaaz:
And so in your mind, you want to disseminate this to as many people as you can,
Ejaaz:
right? The point of building this intelligence is to allow anyone and everyone
Ejaaz:
to use it so that they can improve the GDP per capita of a nation or improve
Ejaaz:
their own lives for whatever purpose, right?
Ejaaz:
And so to be at odds with that and not being able to do it because there's a
Ejaaz:
very valid safety reason is like, it's just not an easy conversation to have.
Ejaaz:
So I do empathize a bit with what they're trying to do. Like Dario has mentioned
Ejaaz:
many times that, you know, we have to be careful about how we release these models going forward.
Ejaaz:
What that looks like is the discussions that are happening behind the doors
Ejaaz:
right now, if I had to guess. That's what Dario is talking about with the government.
Ejaaz:
That's what Sam is talking about with the government. And I'm hoping to see
Ejaaz:
a framework of some sort be released soon.
Ejaaz:
I keep seeing these like rumors and screenshots of people potentially getting
Ejaaz:
access to a beta version for Fable 5 again, or getting access to GPT 5.6 Sol
Ejaaz:
and giving their feedback on that.
Ejaaz:
But it isn't official yet. And I'm honestly kind of glad that they're taking
Ejaaz:
a bit of time to figure this out, because I do think it's important.
Ejaaz:
And if, like I said earlier, we do end up with open source models that are as
Ejaaz:
good as these exploitable models right now but freely accessible and open to all,
Ejaaz:
then you have to think that these companies need to harden the security systems before that happens.
Ejaaz:
That's what they're working on right now. We joked about this last week,
Ejaaz:
I think, but where's all that compute going that is currently not being able
Ejaaz:
to serve retail customers for Fable 5 or GPT 5.6?
Ejaaz:
It's going into building GPT 6 or it's going into building Fable 6,
Ejaaz:
I presume. And so that gap is a very hard path to walk because to your point,
Ejaaz:
most people use this for Google.
Ejaaz:
And the less that they keep using these tools, which again, is the majority
Ejaaz:
of the people like most people just use it as a Google search,
Ejaaz:
they don't set up their own agents, they don't have .md files,
Ejaaz:
which describes and tells the agent what to do, they're not looping research,
Ejaaz:
or getting it to figure out how to do their job.
Ejaaz:
They're just using it as Google search. They're going to become increasingly more
Ejaaz:
antagonistic as a part of the public, basically. And that is a scary reality
Ejaaz:
to be in. And I'm glad that they're taking the time to figure out what these frameworks are.
Ejaaz:
I'm hoping, fingers crossed, that it'll be more open than we expect.
Josh:
Yeah, I agree. And it's this weird discovery process. And in the process of
Josh:
learning about how all this worked, I came across another example that was almost
Josh:
one for one precedent in terms of this happening in the past.
Josh:
And it was actually around crypto.
Josh:
And not crypto in the sense of cryptocurrency, but actual encryption.
Josh:
And after World War II, encryption itself, the ability to encrypt files was
Josh:
deemed so dangerous that it was put on the US munitions list is what it was called.
Josh:
So basically, from the 1970s up until the 1990s,
Josh:
exporting encryption was the same as exporting weapons, like actual missiles or guns.
Josh:
And it was treated the same way because it would be so dangerous to have encryption.
Josh:
So what happened is this programmer, he released this thing called PGP,
Josh:
which is just like a free, easy encryption software.
Josh:
And he just published it out on the open web. And then MIT took that open source
Josh:
published code, they placed it into a physical hardcover book,
Josh:
and then printed the book.
Josh:
And they were like, yeah, you think this is nonsense? Like, come sue us, bro.
Josh:
And the strategy was basically to dare the government to take the university
Josh:
press to court over a book.
Josh:
And the reality is that this book wound up getting exported and they tried
Josh:
to press charges on everyone.
Josh:
But the reality is that in 1996, they actually removed the order for encryption
Josh:
to be on the munitions list. And it's no longer deemed too dangerous to use.
Josh:
And now fast forward to today, like every single thing that we rely on uses encryption.
Josh:
So we've been here before in which the government sees this new scary thing.
Josh:
They believe it's too dangerous to release. They want to control it.
Josh:
But then it turns out like code is just, it's just words on a page.
Josh:
And the reality is that if you can compress all of these weights into something
Josh:
that can be distributed even faster than that through the internet,
Josh:
you don't need to print a hardcover book.
Josh:
You just need to post a link to a Dropbox.
Josh:
And that's all it takes to really move the frontier forward.
Josh:
And it seems like we're going to probably somewhat meet the same fate that we
Josh:
did in the 1990s with encryption over AI.
Josh:
It's just like these digital goods and services are so difficult to regulate
Josh:
that all it takes is a Chinese open source lab to drop one file on the open
Josh:
internet and it spreads like wildfire and it changes everything.
Josh:
So it was an interesting historical precedent that I want to highlight of like,
Josh:
oh, this has happened before.
Josh:
We lived in it for about 30 years and then eventually someone got creative enough,
Josh:
it got repealed and it got changed. But we've been here before where these things
Josh:
seemed too dangerous and now they are prevalent in every single thing we use.
Josh:
And in fact, we are so reliant on encryption now that like I couldn't imagine a world without it.
Ejaaz:
I think it's safe to say that not knowing what the future looks like in terms
Ejaaz:
of governing these models, in terms of how we disseminate these models, is okay for now.
Ejaaz:
I'm just glad that we're being proactive about it. And it's,
Ejaaz:
to some extent, in the public sphere of discussion. Like, we know that Sam's
Ejaaz:
talking to the government. We know that Pete Hegseth is talking to Dario Amodei.
Ejaaz:
And maybe that's enough. Like, what those conversations actually yield,
Ejaaz:
we will find out, hopefully, in a couple of weeks' time when some form of a framework comes out.
Ejaaz:
But these are the necessary steps to figure out what that unknown thing is. And hey, we might be
Ejaaz:
looking back on this conversation three years from now, heck, even a year from
Ejaaz:
now because of how fast everything's going,
Ejaaz:
and realize that we were completely wrong and that it was terrible, or realize
Ejaaz:
that it was even better than we could have thought because we figured out
Ejaaz:
a way to understand how the models think or to KYC or verify in a particular
Ejaaz:
way. So I'm confident in humanity's
Ejaaz:
directive of figuring this out eventually. We just have to be okay with not knowing for now.
Ejaaz:
And I think the government doesn't know. I think that Anthropic and OpenAI are
Ejaaz:
working their hardest to figure out what that might be.
Ejaaz:
And other frontier labs are also doing the same.
Ejaaz:
Another thing worth pointing out is,
Ejaaz:
if you've been listening to any of our conversations and episodes over the last
Ejaaz:
couple of weeks, you've noticed that it's pretty heavy Anthropic and OpenAI.
Ejaaz:
And that's a trend, the final trend probably that I want to point out in this
Ejaaz:
episode, which is there is a consolidation of resources, capital,
Ejaaz:
GPUs, and frontier models to two companies right now.
Ejaaz:
Over the last week and a half, Google lost four key members,
Ejaaz:
including their CTO of DeepMind, to either Anthropic or OpenAI.
Ejaaz:
And so there seems to be this consolidation of not just AI researchers,
Ejaaz:
but also some of the smartest economists from universities are quitting their
Ejaaz:
desk jobs and joining and doing research at these companies.
Ejaaz:
And so it just feels like this vacuum is happening at this moment for OpenAI
Ejaaz:
and Anthropic, right as they're about to go public.
Ejaaz:
And it's going to be very interesting to see how that plays out.
Ejaaz:
Obviously, when a company goes public, there is more transparency.
Ejaaz:
And I think that's going to be warranted. But I heard rumors that OpenAI might
Ejaaz:
be delaying it until 2027.
Ejaaz:
So all these interesting things, it's okay not to know at this point,
Ejaaz:
but I look forward to unpacking it on this show going forwards.
Josh:
Yeah, there's going to be a lot of unknown wildcards. I think the one thing
Josh:
that's been certain throughout recording, what, we're almost at 200 episodes
Josh:
of this, is that every week, every month, things just continue to get crazier at an increasing rate.
Josh:
I think we're just going to continue to see that as all this progress happens,
Josh:
as a lot of the talent, like you mentioned, consolidates to a few companies
Josh:
that are then really just pushing the frontier forward.
Josh:
I mean, right now it is OpenAI and it's Anthropic.
Josh:
Elon's been posting a lot on X talking about how he really believes Grok is
Josh:
going to be close to catching up, and the xAI team, so we'll see. But, I mean,
Josh:
I remember just a few months ago we were super bullish on Google and couldn't
Josh:
be more excited about the products they're releasing.
Josh:
And then that was kind of like the end of the releases, and then they
Josh:
just, like, kind of stopped. And there are still incredible products, incredible models,
Josh:
but the frontier has moved so much
Josh:
past that now that it's like, okay, yeah, they're cool, but that's like
Josh:
six-month-old news, dude. Like, where's the new stuff? And we're going to continue to
Josh:
see this, and I'm just hopeful that everyone can keep up.
Josh:
Like, having more companies in this race is a good thing, and we want everyone
Josh:
to keep up. So I'm rooting for everyone. I hope
Josh:
this all works out. I hope everyone's able to continue progress, and I hope
Josh:
we're able to start using these models. It's a weird present for us too.
Josh:
It's like, normally we like to come on here and talk about
Josh:
the new model, show you some examples, show you the cool things you could do with
Josh:
it. Now we could just speculate on benchmarks, because that's really all we have.
Josh:
We can't touch these things to use them.
Josh:
And hopefully that changes soon, because, I mean, I miss our examples. I miss, like,
Josh:
getting to play around with them, touch it, feel it. Yeah, now it's just a
Josh:
lot of speculation.
Josh:
Yeah, that's it. So we'll see. We'll continue to follow this along as it goes,
Josh:
as always. I'm keeping you up to date on everything.
Ejaaz:
Exactly. If any of you miss Josh and I trying to recreate Mario from scratch,
Ejaaz:
you know, please send us in the comments. We'll be the first ones to do it.
Josh:
Yeah, send a letter to this government saying, dude, we missed our demos. Give it back.
Josh:
We should get us on that list. Yeah, there's only 20 companies like make it 21 and throw us on there.
Ejaaz:
Like for all the doom and gloom in the world, like our core focus and vision
Ejaaz:
at Limitless has literally just been to teach and show people how amazing this tech is.
Ejaaz:
And that's all we actually care about. So this whole like model release thing
Ejaaz:
has been a bit of a pain, but like we can't wait to get our hands on these models
Ejaaz:
and actually do good with it.
Ejaaz:
So as Josh mentioned, we're what, 200 episodes into this right now,
Ejaaz:
we couldn't have done any of this without you folks that have been listening
Ejaaz:
and watching us and rating us and subscribing to us.
Ejaaz:
All the comments and feedback has been incredibly helpful. We've now reached
Ejaaz:
a point where we are looking for sponsors, for people to support us.
Ejaaz:
We've been keeping the lights on ourselves so far.
Ejaaz:
And we've reached a point where if we want to continue doing it,
Ejaaz:
we do need some form of support. So if you are someone who is in a position
Ejaaz:
to be able to have a conversation with us about that, please reach out to us
Ejaaz:
on X or on our email, which is included in the description. I've just added my personal one there.
Ejaaz:
And also, if you know someone that might be able to help, please ping them,
Ejaaz:
let them know, like, send them your favorite episode and let them make a judgment.
Ejaaz:
But it would be super helpful for us as we continue the show.
Josh:
Yeah, and as always, if you enjoyed it, don't forget to share it with a friend
Josh:
as well who might also enjoy it. Rate us on your favorite podcast player, and,
Josh:
yeah, as always, we'll be back again with another episode later in the week. So
Josh:
thank you so much for watching, and we will see you guys in the next one.
Ejaaz:
See you guys.