Limitless: An AI Podcast

Google’s groundbreaking AI model, Gemma 4, lowers the cost of generative AI to around $80, allowing users to run it offline on devices like Raspberry Pi. We explore its advanced features, such as object recognition in video, and discuss how local model operation democratizes access while enhancing privacy.

How does Gemma 4 compare to top models like Claude and ChatGPT? With its multimodal capabilities and effectiveness on low-spec devices, honestly, it keeps up.

------
🌌 LIMITLESS HQ ⬇️

NEWSLETTER: https://limitlessft.substack.com/
FOLLOW ON X: https://x.com/LimitlessFT
SPOTIFY: https://open.spotify.com/show/5oV29YUL8AzzwXkxEXlRMQ
APPLE: https://podcasts.apple.com/us/podcast/limitless-podcast/id1813210890
RSS FEED: https://limitlessft.substack.com/

------
TIMESTAMPS

0:00 Gemma 4
3:12 OpenClaw vs. Gemma 4
5:49 Survival AI
7:21 Jailbreaking
8:03 Smartphone
9:13 Model Specs
14:54 Cost Efficiency
16:33 Open Source AI
19:14 Local AI Models
20:34 Google's Master Plan

------
RESOURCES

Josh: https://x.com/JoshKale

Ejaaz: https://x.com/cryptopunk7213

------
Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures

Creators and Guests

Host

Ejaaz Ahamadeen

Host

Josh Kale

What is Limitless: An AI Podcast?

Exploring the frontiers of Technology and AI

Josh:
How much money are you paying to use your AI model? Maybe it's $20 a month,

Josh:
maybe you're on the pro plan for $200 a month, maybe you're running an OpenClaw

Josh:
instance and you're paying thousands of dollars a month to generate tokens from Frontier Models.

Josh:
Google has just released a solution to your problem, something that can be solved

Josh:
for as little as an $80 one-time fee just to run a Raspberry Pi,

Josh:
because that's what this new model runs on.

Josh:
Gemma 4 is a new model from Google that is a hyper-quantized,

Josh:
very small model meant to run locally on devices like your phone or your laptop

Josh:
or even your new MacBook Neo.

Josh:
It's very lightweight and it's built for working offline, entirely private. And

Josh:
I think the thing that's most noteworthy is how powerful it is.

Josh:
This model, which is small enough to fit on your phone and run entirely for free,

Josh:
is just as good if not better than some of the Frontier models from last year and

Josh:
is even close to performing as well as them this year.

Josh:
Now, to showcase this, we have some really cool examples that Ejaaz has prepared

Josh:
here, so let's get into what this new Gemma 4 model can do.

Ejaaz:
Yeah, I'm super excited about this model. I mean, it's not just one either.

Ejaaz:
There's four of them. And like you said, it ranges from like 4 billion to,

Ejaaz:
I think it's like 50 billion parameter models.

Ejaaz:
Can fit on your phone, can fit on any device. And like you said,

Ejaaz:
eight months ago, this would have been considered frontier intelligence.

Ejaaz:
But I want to get into like what these things can actually do because it's one

Ejaaz:
thing talking about benchmarks.

Ejaaz:
It's another thing talking about what it can do on your phone,

Ejaaz:
on your laptop, what value it can bring to you. This first example is someone

Ejaaz:
leveraging the visual intelligence of these Gemma models.

Ejaaz:
Now, typically, if you're an AI model, you're really good at ingesting words

Ejaaz:
and characters and understanding the world described to you like a book would

Ejaaz:
or like a blog post would.

Ejaaz:
Visual intelligence is a very different frontier that has often been hard to

Ejaaz:
kind of surmount by these new AI models.

Ejaaz:
Gemma does an amazing job. What you're looking at on your screen right now is

Ejaaz:
its ability to identify all the different objects in what is a very crowded room.

Ejaaz:
He raises up a banana and it identifies that.

Ejaaz:
It's also spotting the books that are on his shelf in the back.

Ejaaz:
It's spotting the shelf itself, the lamp, the fact that he's in a room,

Ejaaz:
the curtains around that.

Ejaaz:
And that's really important when it comes to creating apps that can log your

Ejaaz:
visual experience or can track your calories for the food that you're consuming.

Ejaaz:
And you can build a completely different suite of apps based on intelligence like this.

Ejaaz:
This is the first time that we're seeing it appear in an open source open weight

Ejaaz:
model. And Google's been the first to launch that.

Josh:
Yeah. And if it wasn't abundantly clear, this is totally free.

Josh:
You could just go on the website, download this and run it yourself.

Josh:
And I think looking at this vision example, one of the cool things that I'm

Josh:
thinking of is a lot of people have cameras outside their house,

Josh:
outside their apartments.

Josh:
And this has visual intelligence now to not only see things,

Josh:
but alert you of what it's seeing.

Josh:
One of the cool examples that I saw that we don't have teed up here is just

Josh:
someone who had a little Nest camera right outside the front door.

Josh:
And it would send a notification of what was happening.

Josh:
It's like there is a dog walking in front of your house. There is a man walking

Josh:
up with two packages in his hands. It looks like the package is from Amazon.

Josh:
And it has that visual intelligence that would normally cost quite a bit of

Josh:
money in those tokens using something like Claude Opus or ChatGPT.

Josh:
But instead, it does it all for free on this tiny little model,

Josh:
which is super cool. We have another example that was mentioned in the intro about OpenClaw here.

Josh:
And OpenClaw is something that a lot of people spend a lot of money on.

Josh:
If people are real hardcore users, they're spending up to hundreds of dollars

Josh:
a day, some even thousands of dollars a day, in addition to buying some pretty

Josh:
expensive hardware to run it on. A lot of people bought Mac minis. You can't

Josh:
get a Mac mini if you wanted to because they're so backordered. Mac Studios,

Josh:
too: people were paying hundreds if not thousands of dollars to run

Josh:
this software on. And the reality is that

Josh:
whatever device you're watching this on, whatever device you're listening to

Josh:
right now, you can run this model on.

Josh:
You don't need something fancy. You don't need a high-end computer to run it.

Josh:
You can just do this on something local, lightweight, and like I mentioned,

Josh:
as lightweight as an $80 Raspberry Pi because you can use the lightest weight model possible.

Josh:
And although the results aren't the best in the world, they are much better

Josh:
than previously expected from these open source models.
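As a rough sketch of the arithmetic behind the hardware claim, the one-time $80 Raspberry Pi price can be compared with the $20 and $200 monthly plans mentioned at the top of the episode. The function name `break_even_months` is hypothetical, and the sketch assumes electricity costs and capability differences between local and hosted models are ignored.

```python
import math


def break_even_months(hardware_usd: float, subscription_usd_per_month: float) -> int:
    """Months of subscription fees needed to match a one-time hardware cost."""
    return math.ceil(hardware_usd / subscription_usd_per_month)


# One-time Raspberry Pi price quoted in the episode.
PI_COST_USD = 80.0

# Monthly plan prices quoted in the episode.
for plan_usd in (20.0, 200.0):
    months = break_even_months(PI_COST_USD, plan_usd)
    print(f"${plan_usd:.0f}/month plan: hardware pays for itself after {months} month(s)")
```

This only compares sticker prices; whether a small local model is an acceptable substitute for a hosted plan is a separate question the hosts return to later.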

Ejaaz:
Yeah, I love that you can finally run OpenClaw on a device that doesn't cost $1,000.

Ejaaz:
These Mac minis actually on the secondary market have gone up sky high.

Ejaaz:
Like, the retail price is 800 bucks, but because you can't get it from Apple anymore,

Ejaaz:
I've seen it as high as like 2K, and people are still buying these things.

Ejaaz:
Now, the reason why they've been buying these things is because they can't fit

Ejaaz:
Frontier open source models onto their own mobile phone or their own laptop.

Ejaaz:
And now Gemma 4 has made it super easy to do. So that's amazing.

Ejaaz:
I'm still quite confused as to what the OpenClaw users are burning thousands

Ejaaz:
of dollars on, but that's probably a topic for another conversation.

Ejaaz:
The other thing that I like about this is Gemma 4 can run completely offline.

Ejaaz:
Now, this is a common property and characteristic that you

Ejaaz:
can have for every single open source model, but the fact is you

Ejaaz:
have a model here that is close to the frontier intelligence of, say, Claude Opus 4.6

Ejaaz:
and GPT-5.4, and we'll get to those direct comparisons a little later on in this

Ejaaz:
episode. But now you can run it offline, and the great part about this is often

Ejaaz:
you're in areas where you just don't have internet connection or it takes a while to inference.

Ejaaz:
Now you have it on your phone, you can have it completely offline,

Ejaaz:
it gets access to the world's entire database of knowledge.

Ejaaz:
It might not be real-time, fair enough, but you still get access to core knowledge

Ejaaz:
when you're in a bit of a desperate situation or when you just don't want to

Ejaaz:
use the internet, which I thought was really cool.

Josh:
This part is maybe my favorite, where it feels like you truly have access to

Josh:
intelligence at your fingertips, no matter where you are in the world.

Josh:
You can be stranded on an island. You could have no connection.

Josh:
You can be anywhere at any time. And it is completely and totally local and free.

Josh:
And it fits on your phone. And it feels like having Google on your phone.

Josh:
I remember growing up, it's like, you're not going to be able to Google everything.

Josh:
You have to learn these things.

Josh:
And the reality is that you now have a super genius that gets packaged

Josh:
up into something as small as your phone. And that is super cool.

Ejaaz:
Now, naturally, where your mind goes with that property is, huh,

Ejaaz:
if I'm in a desperate situation, can AI save my life?

Ejaaz:
So SkyLevels.io decided to run Gemma 4 locally on his iPhone.

Ejaaz:
And he simulated a scenario of being abandoned in an apocalypse on an island with no help.

Ejaaz:
And he needs to make a fire to keep himself warm.

Ejaaz:
And so he queries Gemma 4 and he asks how to make fire.

Ejaaz:
And the response: "I cannot provide the instructions on how to make the fire."

Ejaaz:
So these models are still kind of censored in some kind of way.

Ejaaz:
It's not completely unfiltered. You can't ask it to help you make a biological

Ejaaz:
weapon or do something illegal, which I don't think is a problem.

Ejaaz:
But for a lot of people who want uncensored versions of these truly open weight

Ejaaz:
models, this isn't exactly that. Still cool nevertheless.

Josh:
I'm going to stop you right there because what you just said is not entirely true.

Josh:
Google doesn't want you to do this, but because it is open source, it is open weight.

Josh:
There is a possibility that you can jailbreak it. Someone took it upon themselves

Josh:
to jailbreak the model to get it to do whatever you would like.

Josh:
And it was just released a few days ago. And it seems as if it works pretty well.

Josh:
It runs on 18 gigabytes of memory, which works for most laptops.

Josh:
And it's totally cracked, totally uncensored. You can ask it whatever questions

Josh:
you would like and it will give you whatever answers

Josh:
in return. And I think it's a testament to the open source community, right?

Josh:
It's like, if you're going to publish these tools, again, they are tools there

Josh:
for the public to use them however they wish.

Josh:
Someone naturally is going to try their best to jailbreak them.

Josh:
Having something like this is actually truly empowering because if you are stranded

Josh:
on the island, you do need to know how to make a fire.

Josh:
This will give you that answer along with some other pretty unhinged answers

Josh:
if you ask, but it will give you the answer.

Josh:
And I think this is an important thing to know is that these models can be jailbroken

Josh:
to be customized when they are open source.

Josh:
And that is, in a way, how

Josh:
you get the most power from them: you just get them in their

Josh:
purest form, without the filters, without the censoring. It's just true raw intelligence

Josh:
delivered to your phone, and I found that pretty interesting. But there's also

Josh:
one final example about the powerful smartphone test and what type of smartphones

Josh:
run this the best, because not all smartphones are created equal and some do

Josh:
this a lot better than others.

Ejaaz:
Yeah. A very first-world AI problem is getting annoyed about waiting for the AI to respond to you.

Ejaaz:
I certainly experienced this when I'm using Claude on a very busy day.

Ejaaz:
This test that you're seeing in front of you takes five different mobile phone

Ejaaz:
models and tests Gemma 4 across all of them.

Ejaaz:
So you've got the Gemma models running independently, offline,

Ejaaz:
locally on each of these devices, and they're given the same queries.

Ejaaz:
And you can see that there are very different response rates and generations from these phones.

Ejaaz:
It looks like Apple is the winner in this race, which doesn't surprise me.

Ejaaz:
They have some of the best silicon manufacturing ever.

Ejaaz:
And then I think Google's Pixel phone is the slowest.

Josh:
The OnePlus, I think, beat Apple by like half a second. Google took the slowest,

Josh:
which is very surprising because you would think that Google running their own

Josh:
models would work well, but it turns out they don't have the vertical integration.

Josh:
They don't have the chipset that Apple does. So you could see,

Josh:
yeah, Google took 16 minutes while OnePlus took two and a half minutes and the

Josh:
iPhone took three minutes to run through this test. So

Josh:
it's fast enough. Like, if you are really desperate enough

Josh:
to need local inference like this, it is going to be fast enough to answer the

Josh:
questions that you need in a timely manner.

Ejaaz:
Okay. So what did Google actually launch with these models? We know that they

Ejaaz:
are four models, but let's get into some of the numbers and statistics.

Ejaaz:
So there's four different sizes. And if I bring up this chart over here,

Ejaaz:
you see, we have a 31 billion parameter model, which is the largest and the

Ejaaz:
smallest being a 2 billion parameter model.

Ejaaz:
But the performance across benchmarks is truly very impressive.

Ejaaz:
But going back to the general takes here, up to 256,000 context window,

Ejaaz:
which isn't as large as the Frontier models, which are hitting a million to

Ejaaz:
two million context windows.

Ejaaz:
So you can't put as much information into a single prompt contextually for an AI to understand.

Ejaaz:
You've got native function calling. It can work offline that we mentioned earlier.

Ejaaz:
It's trained on 140 plus different languages.

Ejaaz:
Now, this is something that sounds kind of insignificant, but Google has done

Ejaaz:
something really well here.

Ejaaz:
They released a translation feature, I believe, last week, which can translate

Ejaaz:
a similar number of languages live in real time as you're talking and listening to someone.

Ejaaz:
It directly translates into whatever listening device that you have.

Ejaaz:
So I think this is super cool to see this run locally on an open source model.

Ejaaz:
And it's commercially permissive. So it has an Apache 2.0 license,

Ejaaz:
which means that you can kind of take it and use it for whatever you want,

Ejaaz:
build any apps on it. And I don't think it becomes a problem unless you get

Ejaaz:
over a certain number of users, if I'm not mistaken.

Josh:
Yeah, the 2 billion and 4 billion, they're the ones that fit on your phone.

Josh:
And if you think of these models like

Josh:
engine sizes, those are kind of like the bicycles, right?

Josh:
They're pretty lightweight, maybe a motorcycle.

Josh:
And then the larger ones, the 26 billion, the 31 billion, those are like the

Josh:
V12 engines. Those are the powerhouses.

Josh:
Those are the two models that run on the 256k token window.

Josh:
The others run on 128k. So you're not going to be having very long conversations

Josh:
with these models that are on your phone, but

Josh:
they have the ability to run and do so multimodally. One of the most interesting

Josh:
things is even these very small models that fit on your phone,

Josh:
they support not only text, but images and audio as well.

Josh:
And having the audio thing is pretty cool because it understands and interprets

Josh:
audio. And that is a pretty powerful thing to have on this tiny little model.
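To make the link between parameter count and device memory concrete, here is a minimal sketch that estimates the memory needed for model weights alone. The four sizes are the ones named in the episode; the bit widths (16, 8, 4) are illustrative assumptions about quantization levels, not figures from the episode, and the helper name `weight_memory_gb` is hypothetical. Real memory use is higher because of activations, the context cache, and runtime overhead.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight


# Model sizes mentioned in the episode, in billions of parameters.
MODEL_SIZES_B = (2, 4, 26, 31)

# Assumed precisions: 16-bit (unquantized), 8-bit and 4-bit (quantized).
BIT_WIDTHS = (16, 8, 4)

for size in MODEL_SIZES_B:
    row = ", ".join(
        f"{bits}-bit: {weight_memory_gb(size, bits):.1f} GB" for bits in BIT_WIDTHS
    )
    print(f"{size}B parameters -> {row}")
```

Under these assumptions, quantizing from 16 bits to 4 bits cuts the weight footprint by a factor of four, which is the basic reason the smaller models are plausible candidates for phones and single-board computers.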

Ejaaz:
I also had the question as to how this model compares to the other top open

Ejaaz:
source models. Now, it's no surprise on the show, we've highlighted them a lot.

Ejaaz:
China has been leading the frontier here. If you look at this chart,

Ejaaz:
Gemma, both the 31 billion and the 4 billion parameter model,

Ejaaz:
does really well when it comes to Elo scores.

Ejaaz:
So if you look on this chart, for the amount of intelligence per square density,

Ejaaz:
which isn't an official stat, but it's one that I've created on the show for

Ejaaz:
the last couple of episodes, Gemma absolutely crushes it.

Ejaaz:
It's on the top left over here, scoring extremely highly, but with a very small parameter count.

Ejaaz:
Now, if you compare it to the other leading open source models like Kimi K2.5

Ejaaz:
Thinking, they're well over the limit of a trillion parameter model.

Ejaaz:
You have Qwen and GLM-5 closely behind that. So although Gemma isn't as smart

Ejaaz:
as them, it's close enough.

Ejaaz:
It looks like it's at about 99% of the intelligence when it comes to Elo scores,

Ejaaz:
but at a fraction of the size, which is why you're able to run it on your phone.

Josh:
Yeah, they're kicking ass. I mean, China still has, in terms of pure intelligence,

Josh:
they're still winning the race.

Josh:
But in terms of intelligence density, intelligence per token, it's really high.

Josh:
And I think one of the cool things that they did with Gemma 4

Josh:
versus Gemma 3 is they gave it the Apache license as well, the Apache 2.0 license.

Josh:
And basically what that means is that previously a lot of these were restricted

Josh:
and they were limited to enterprise adoption.

Josh:
This is total freedom to modify, redistribute, commercialize with no restrictions.

Josh:
You can use it for whatever you want. You can repurpose this in any way you wish.

Josh:
And having it built in with the 140 languages, like you mentioned,

Josh:
and the multimodality, this is kind of like a home run. And when you look at

Josh:
this chart, it also shows the same story.

Josh:
Gemma 4 versus the world, comparing these to all the other Chinese models. This is a heavy hitter.

Ejaaz:
Yeah, yeah. I mean, if we look at some of these benchmarks, software engineering,

Ejaaz:
okay, listen, it's not number one, it's 68%. I believe Opus 4.6's score

Ejaaz:
on this is in the high 80s.

Ejaaz:
So we're not talking about frontier intelligence when it comes to coding models.

Ejaaz:
You're not ditching Claude Code for something like this.

Ejaaz:
But when it comes to generalized intelligence, when you're replacing your Google

Ejaaz:
queries with an LLM, and you don't want to spend 20 bucks per month,

Ejaaz:
or 100 bucks per month on a Claude subscription or GPT-5.4 subscription,

Ejaaz:
you can just use this, and you can run it locally and offline,

Ejaaz:
privately, train it on your own data. It is incredibly cool.

Ejaaz:
I had the same question about comparing it to the frontier models

Ejaaz:
because I wanted to give this a fair shout. There are some potentially exaggerated

Ejaaz:
stats here, Josh, if I had to be honest. Here I'm looking at how it weighs up.

Ejaaz:
Okay, if we look at software engineering, which we just mentioned, we're right:

Ejaaz:
it's almost 12 points lower than Claude Opus 4.6, which is the leading Frontier model. Not great.

Ejaaz:
But on some of these other benchmarks, AIME 2026 as well

Ejaaz:
as GPQA Diamond and MMLU Pro, it is near Frontier.

Ejaaz:
Do you think these things are gamed or do you think this is an accurate take?

Josh:
Yeah, all the benchmarks are gamed. And I think the only real way you could

Josh:
test this is by running it against your own use cases and just evaluating

Josh:
for yourself, because it absolutely is not 90% frontier capable when you converse with it.

Josh:
Like when you talk to Gemma versus Opus 4.6, there is a very stark and clear

Josh:
difference between, I guess, the EQ and the IQ, where one feels much more

Josh:
naturally human and one is very dry.

Josh:
Perhaps on these benchmarks, Gemma is 90% of the way there.

Josh:
But in actual practice, when you're using the model on day-to-day life, it is nowhere close.

Josh:
At least that is my perspective just from trying these things out.

Josh:
And I think we have to take these kind of benchmarks with a grain of salt because

Josh:
they're gamed on very specific things.

Josh:
And if you change the parameters of these benchmarks a little bit,

Josh:
or you change the actual structure of the benchmark,

Josh:
it won't perform well because to some extent, these models are kind of baked

Josh:
in with the expectation that they're going to need to perform well on these

Josh:
benchmarks and therefore are optimized for these specific types of problems

Josh:
versus general real world use cases that someone like us is going to use every

Josh:
day or someone who's using OpenClaw actually wants the tokens generated from.

Ejaaz:
If cost is a determining factor in your decision to use one AI model over the

Ejaaz:
other, Gemma might be quite a convincing bet.

Ejaaz:
It is a fraction of the cost. I know it says it's $0.08 per million tokens. It's actually $0.03.

Ejaaz:
I think we maybe had a bit of an issue generating this particular stat.

Ejaaz:
The point is, it's incredibly cheap versus the Frontier models.

Ejaaz:
With Opus 4.6, you're looking at $10 for a million tokens, blended across input and output.

Ejaaz:
So if you're one of those OpenClaw users that we mentioned earlier that are

Ejaaz:
using this for myriad different use cases and are burning thousands of dollars

Ejaaz:
per day or per week doing your different use cases, this might be a better bet.
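The size of that gap is easier to see with a quick calculation. This is a minimal sketch using the two per-million-token prices quoted above ($0.03 and $10); the monthly token volume is a hypothetical workload chosen only for illustration, and `monthly_cost` is a made-up helper name.

```python
def monthly_cost(tokens_per_month: int, usd_per_million_tokens: float) -> float:
    """Return the monthly bill in USD for a given token volume and unit price."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


# Prices quoted in the episode, in USD per million tokens.
GEMMA_PRICE = 0.03
FRONTIER_PRICE = 10.00

# Hypothetical heavy-agent workload: 500 million tokens per month.
TOKENS_PER_MONTH = 500_000_000

gemma_bill = monthly_cost(TOKENS_PER_MONTH, GEMMA_PRICE)
frontier_bill = monthly_cost(TOKENS_PER_MONTH, FRONTIER_PRICE)

print(f"Gemma 4 (hosted): ${gemma_bill:,.2f} per month")
print(f"Frontier model:   ${frontier_bill:,.2f} per month")
print(f"Price ratio:      about {frontier_bill / gemma_bill:.0f}x")
```

The ratio does not depend on the assumed volume, only on the two quoted prices; the volume just determines whether the absolute difference is a few dollars or a few thousand.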

Ejaaz:
It might be a better trade-off for you to use. And I also want to remind everyone,

Ejaaz:
a very important reminder, which is eight months ago, this model or these models

Ejaaz:
from Google would have been considered Frontier.

Ejaaz:
So it's amazing how much advancement that we've made in eight months.

Ejaaz:
Now, I know in those same eight months, we've also got bigger and better models

Ejaaz:
from the Frontier Intelligence Labs. The question does ring in my mind,

Ejaaz:
which is, will open source ever catch up?

Ejaaz:
If I'm being honest, I thought open source would have died a year ago,

Ejaaz:
but it's still being able to keep up.

Ejaaz:
Now, part of that is because Chinese models or Chinese AI labs have invested

Ejaaz:
so much in keeping up with the US labs.

Ejaaz:
They've also done distillation attacks and all those other kinds of things.

Ejaaz:
But the fact that Google themselves, who haven't done any of those things,

Ejaaz:
have put out an open source model nearly as good as the Frontier models

Ejaaz:
gives me a lot of hope that open source is here to stay.

Josh:
Yeah, I don't see a world in which this slows down, and

Josh:
I really love the trailing progress we get, because at some point

Josh:
we're going to reach the tail end of diminishing returns, in which

Josh:
open source models are just capable enough to do everything the average person

Josh:
wants. What we currently have right now is a problem that we're running up against

Josh:
in terms of frontier AI labs, where the new models just cost too much money. Like,

Josh:
Claude has Capybara, the new model, ready to go. I mean, aside

Josh:
from it being too dangerous, it's just far too expensive.

Josh:
The amount of GPUs that are required to generate tokens from these models is so high.

Josh:
And if you want frontier intelligence, the cost really is kind of creeping upwards instead of down.

Josh:
The tail end of that becomes very commoditized quickly. It's

Josh:
like the very, very highest end, the stuff

Josh:
that's going to be solving new math and new science, costs a

Josh:
tremendous amount of money, but the open source that's maybe six months

Josh:
behind costs zero dollars. So the delta is huge. And if you're not interested

Josh:
in solving these, like, unbelievably complex problems or writing really high quality

Josh:
code, then the amount of problems that these open source models are going to

Josh:
be able to solve a year from now, when they're better than Opus 4.6 is today,

Josh:
that's going to be a really large amount. And it raises the question:

Josh:
who is actually going to want to continue to pay for these frontier models, if

Josh:
they are that expensive, to run things like OpenClaw, when the reality

Josh:
is that these open source models,

Josh:
maybe Gemma 5, maybe Gemma 6, will be able to tackle almost all of the problems

Josh:
that we have? And I don't know, it's an interesting

Josh:
thought experiment. But I think open source is certainly

Josh:
here to stay, particularly as it relates to China and the United States going

Josh:
forward with this AI race, because this is a pretty nice jab at the Chinese open source models.

Ejaaz:
Yeah. And if you've been a listener of this show, you'll know that my view

Ejaaz:
of the future of AI is very much AI agents, specifically personal agents that

Ejaaz:
work for you and are trained on your own personal data.

Ejaaz:
Now, if you're the average person, you probably don't want to give OpenAI and Anthropic

Ejaaz:
access to your personal data so that they can train their own models.

Ejaaz:
That's a breach of trust in many respects.

Ejaaz:
Locally run open source models might be the solution for that.

Ejaaz:
They may not be as smart, but if they're trained on your data,

Ejaaz:
they could unlock a new level of intelligence which centralized models can't do.

Ejaaz:
And so I'm optimistic that Gemma 4 and a bunch of other open source models that

Ejaaz:
have come from either Chinese AI labs or the ones that are going to be released

Ejaaz:
in the future will be able to do that.

Ejaaz:
The other trend that I think is pretty clear is locally run models, right?

Ejaaz:
Models that you can run on your device specifically, that don't necessarily

Ejaaz:
need to be trained on your data, but are local to you.

Ejaaz:
The reason why it's so important is it's cheaper. You can run it privately.

Ejaaz:
And also it gives you the ability to get quicker prompts or quicker queries.

Ejaaz:
It runs seamlessly and you don't have to wait. You don't have to rely on servers going down.

Ejaaz:
You don't have to rely on a centralized data center running your compute.

Ejaaz:
You could just have it all locally on your phone.

Ejaaz:
Those things sound insignificant until you have an app that runs locally on

Ejaaz:
your device, which I think would be super cool to see. And I want to see more

Ejaaz:
of these types of things happening.

Ejaaz:
I think personally, Apple is going to be the frontier company that leads us

Ejaaz:
into this kind of world because they have the biggest distribution.

Ejaaz:
They have like 3 billion active devices. I would love to run the model on my Apple iPhone right now.

Ejaaz:
So I think that's a trend that we're going to see. And I think open models are

Ejaaz:
the only way to unlock it.

Josh:
How cool would that be? We get WWDC coming pretty soon. We're going to be covering that on the show.

Josh:
But that's going to be the Super Bowl for Apple. We're going to see,

Josh:
this is what, two years after they fumbled Apple Intelligence.

Josh:
We're going to see what their new plans are this year.

Josh:
So I'm really excited to see their take on this because like you mentioned, it's really powerful.

Josh:
And I think most people listening to this probably never ran a model locally

Josh:
on their machine, but I feel there's something very empowering to it.

Josh:
If it's not just for generating your own intelligence, it's for the privacy

Josh:
aspect of it, where you know none of the information that you're sharing is

Josh:
getting leaked out to any servers.

Josh:
No one's training on it. It's all yours to own for yourself.

Josh:
And there's something really nice about that. And I think the final thing we're

Josh:
going to talk about on this episode is why on earth Google would give this away?

Josh:
Because it seems like Google's doing really well. They just signed a deal with

Josh:
Anthropic for their TPUs. They have Gemini, which is a powerhouse.

Josh:
They have the best world models, video models. They have amazing image gen.

Josh:
Why are they giving away this sauce? Ejaaz, do you have any idea?

Ejaaz:
I don't have a great idea, but I have some thoughts. And I have one that argues

Ejaaz:
in favor of them doing this and one that argues against it.

Ejaaz:
The one that argues in favor of them is the Android example,

Ejaaz:
which is they open sourced the entire thing.

Ejaaz:
They allowed anyone and everyone to hack away at different apps and launch it

Ejaaz:
on their Play Store, whatever that might be.

Ejaaz:
And they gained a lot of mind share and market share by doing this.

Ejaaz:
Now, is it as well curated and beautiful as iOS and the Apple App Store?

Ejaaz:
Most people will probably argue not, but the point is they have one of the largest

Ejaaz:
distribution moats because of this.

Ejaaz:
I think this might be an example of them getting Google AI, not just a specific

Ejaaz:
model, but Google AI in the hearts and minds of everyone.

Ejaaz:
And if they could tap into the locally run device audience, that could be a big win for them.

Ejaaz:
Now, the argument against that is, dude, you could have been using all this

Ejaaz:
compute to train a better Gemini model and keep up with the Frontier AI Labs.

Ejaaz:
And that's all that matters. Build a better coding model because right now it kind of sucks.

Ejaaz:
And you can then build all of this other open source stuff later.

Ejaaz:
The number one primary race to win is best model and currently you're losing.

Ejaaz:
So I don't know. Do you have the same take?

Josh:
Yeah, that's probably right. I imagine it's for a mixture of reasons.

Josh:
One of them is probably to also feed into their cloud flywheel because we're

Josh:
talking about running these models locally, but how many are actually running these models locally?

Josh:
And for the ones that are, how many are going to quickly run up against ceilings

Josh:
because they want to do more and more and more?

Josh:
And then eventually they'll just migrate over to the more powerful models and

Josh:
use probably the Google Cloud services.

Josh:
And I think there's a lot of reasons to become the infrastructure.

Josh:
The Android example is a great one. Feeding the cloud flywheel is another strong one.

Josh:
And I think this is just a really small side quest for Google in terms of optimizing

Josh:
for that intelligence per bit, whatever we're going to kind of coin that as,

Josh:
but the intelligence density of a model.

Josh:
This has the highest. This is much more than Gemini. And it's a fun practice

Josh:
as they move forward to these new models of intelligence compression per token.

Josh:
And if they could continue to learn and then publish those learnings and then

Josh:
just keep iterating on that front, I think that's a huge win for Google and also everyone.

Josh:
Google's just doing a nice public service, a nice little public good.

Josh:
And the team there is doing really cool things with it. Logan Kilpatrick is

Josh:
one person, for example, who is running the Google AI Studio team.

Josh:
They have been publishing all of these models, making them super easy to use

Josh:
through the Google AI Studio.

Josh:
So if you just go there, you can play with the two larger models and just kind

Josh:
of see how they compare to

Josh:
something like Gemini 3.1 Pro, and then make your own decision

Josh:
to run these things locally, or just go start pinging some APIs, or just use your

Josh:
$20 a month plan that you have with Anthropic or ChatGPT. But I think that is the Gemma 4

Josh:
episode. We got it all covered. It's an amazing model.

Josh:
It's available for free to run locally on whatever

Josh:
machine that you wish, because it is lightweight enough to fit

Josh:
on an iPhone or a Raspberry Pi, and it's cheap enough to

Josh:
run it for free. If you download these things on your devices, you have free inference

Josh:
forever. You can run it 24/7 on whatever tasks you want, and it will cost you

Josh:
only the amount of the electricity to power the machine. And I think that's pretty

Josh:
cool, and I'm glad Google is really stepping up to the plate with probably the leading

Josh:
USA frontier open source model. And that's pretty cool.

Ejaaz:
Yeah. And I'm curious what you, the listeners and watchers of this show,

Ejaaz:
think yourselves. Like, go out and try this thing.

Ejaaz:
If you don't want to download it, you can get access to it through Google AI Studio.

Ejaaz:
Give it a few queries. Like, does it match up to your experience with Claude 4.6 and GPT 5.4?

Ejaaz:
Would you replace your $20 to $100 a month subscription with something like this?

Ejaaz:
Let us know in the comments or DM us on our socials. Our X profiles are linked below as well.

Ejaaz:
And yeah, that's pretty much it. I'm going to be trying out these models. It

Ejaaz:
is definitely the best AI frontier open source

Ejaaz:
model, but I have to say, compared to it, the Chinese models are still

Ejaaz:
kind of leagues ahead right now. I hope we see more adoption of open

Ejaaz:
source models going forwards. And when that eventually happens, if there's

Ejaaz:
a new OpenClaw breakthrough, you will hear it first here on this show. We also

Ejaaz:
did a cool episode covering some of the Chinese open source models that were

Ejaaz:
released last week. Definitely go check that episode out as well.

Ejaaz:
But aside from that, if you aren't subscribed to us, please do.

Ejaaz:
It helps us out a lot. Turn on notifications.

Ejaaz:
Even if you're listening to us on Spotify or Apple Podcasts,

Ejaaz:
give us a rating, give us a review. It helps us out massively.

Ejaaz:
Josh, are there any other parting words that you want to give?

Josh:
Don't forget to share it with your friends and we'll see you guys on the next episode.

Ejaaz:
Yeah, see you guys.
