TBPN

Diet TBPN delivers the best of today’s TBPN episode in 30 minutes. TBPN is a live tech talk show hosted by John Coogan and Jordi Hays, streaming weekdays 11–2 PT on X and YouTube, with each episode posted to podcast platforms right after.

Described by The New York Times as “Silicon Valley’s newest obsession,” the show has recently featured Mark Zuckerberg, Sam Altman, Mark Cuban, and Satya Nadella.

Follow TBPN: 
https://TBPN.com
https://x.com/tbpn
https://open.spotify.com/show/2L6WMqY3GUPCGBD0dX6p00?si=674252d53acf4231
https://podcasts.apple.com/us/podcast/technology-brothers/id1772360235
https://www.youtube.com/@TBPNLive

Speaker 1:

The big news today is that Meta Platforms has launched a new AI model. Alex Wang, the chief AI officer at Meta Platforms, announced a new large language model today, the company's first major new artificial intelligence model in more than a year. The rollout of the model, called Muse Spark, is a critical moment for Meta, which is up seven and a half percent already, and which has spent billions of dollars hiring AI talent in a bid to catch up to OpenAI, Anthropic, and Google DeepMind. The leading labs have been putting out models at an accelerating pace. In a departure from its previous models, which were open source, Muse Spark is a closed model that will power Meta's AI chatbot and AI features within its apps.

Speaker 1:

John Ludwig has a very interesting post about open source AI and sort of predicted this.

Speaker 2:

Predicted that Meta would eventually bail?

Speaker 1:

Yeah. The future of foundation models is closed source, he said. Given Meta is the primary deep-pocketed large open source model builder, open source AI has become synonymous with Meta AI. He wrote this maybe three or four years ago. So the operative question for open source AI is: what game is Meta playing?

Speaker 1:

In a recent podcast, Zuckerberg explained Meta's open source strategy. One, he was burned by Apple's closedness for the past two decades and doesn't want to suffer the same fate with the next platform shift. It's a safer bet to commoditize your complements. He likes building cool products, and cheap, performant AI enhances Facebook and Instagram. That's 100% true.

Speaker 1:

We've seen this in the ads product and the growth there. There's some call option value if AI assistants become the next platform, and that makes sense in Manus and the Meta AI app. He bought hundreds of thousands of H100s for improving social feed algorithms across products, and this seems like a good way to use the extras. That all makes sense, and Llama has been great developer marketing for Facebook. But Zuck also suggests several times that there's some point at which open source AI no longer makes sense, either from a cost or safety perspective.

Speaker 1:

When asked whether Meta will open source the future $10 billion cost model, the answer was: as long as it's helping us. At some point, they'll shift their focus towards profit. And that's what John Ludwig wrote. He says, unlike the other model providers, Meta is not in the business of selling model access via API. So they'll open source as long as it's convenient for them.

Speaker 1:

Developers are on their own for model improvements thereafter. That begs the question: if Meta is only pursuing open source insofar as it benefits themselves, what is the tipping point at which Meta stops open sourcing their AI? Sooner than you think, he says. Exponential data: frontier models are trained on the corpus of the Internet, but that data is a commodity. Model differentiation over the next decade will come from proprietary data, both via model usage and private sources.

Speaker 1:

Exponential CapEx, he highlighted this two years ago: a lagging-edge model that requires just a few percent of Meta's $40 billion in CapEx is easy to open source. No one will ask questions. But when you reach $10 billion or more in CapEx spend for model training, shareholders will want clear ROI on that spend. The metaverse raised some question marks at a certain scale too. Diminishing returns on model quality within Meta.

Speaker 1:

There's a large upfront benefit for Meta building an open source AI model even if it's worse than the frontier closed source counterpart. There are lots of small AI workloads, think feed algorithms, recommendations, image, where Meta doesn't want to rely on a third party provider like they had to rely on Apple. And so the news: back in December, there was reporting that Alex Wang disclosed at an internal company Q&A that his team was working on two new models. One was this text-based LLM code-named Avocado, and then a separate model that was for image and video

Speaker 2:

Mango. Mango.

Speaker 1:

Yeah. And so have they clarified if this is Avocado? This feels like what Avocado should be, this Muse Spark. Is that what it's called again?

Speaker 3:

Muse Spark, it is. I don't know what else

Speaker 1:

So the image model should be coming soon. The question that I had was, will a code-focused agentic coding harness be a separate model, a different training run? This feels like it's not a coincidence that this news is dropping on the heels of Anthropic's new model Mythos, which was sort of announced loosely, and the model card dropped yesterday even though the model is not available yet to play

Speaker 2:

They break out Muse Spark Thinking against Opus 4.6 Max, Gemini 3.1 Pro High, GPT 5.4x High, and then Grok 4.2. The way that they position it, it looks like somewhat of a chart crime. But if you look

Speaker 1:

Wait. Why?

Speaker 2:

When you basically look at the top and you see that Muse Spark gets an 86.4

Speaker 1:

Yeah.

Speaker 2:

And it's in blue. Yep. And then you look over and it's and it's outperforming all the other models on that benchmark.

Speaker 1:

Okay.

Speaker 2:

You just sort of assume that, like, the highlighted blue means that it's outperforming. Oh.

Speaker 1:

Chart crime. Yeah. But it's not frontier at MMLU.

Speaker 2:

Because there's a bunch of them as you get further and further down. So like ARC-AGI 2, Muse Spark dramatically underperforms

Speaker 1:

Oh, interesting.

Speaker 2:

The other leading models. Yeah. And there's a number of others where there's plenty where it's overperforming.

Speaker 1:

Yeah.

Speaker 2:

There's plenty where it's underperforming. Of course, we started asking it jokes right away. So I'll pull up the joke. I said, tell me a joke. Okay.

Speaker 2:

It said, gotcha. Here's one for you. Why don't skeletons fight each other? They don't have the guts.

Speaker 1:

Okay.

Speaker 2:

But then, it doesn't even let me. It just says, want a different vibe? I've got dad jokes, nerdy ones, or Malibu-appropriate surf puns too. And so I'm like, wait, why would you think I want Malibu-appropriate surf puns? If you look at my previous chats, I have no chats.

Speaker 1:

Previous chats. Because Meta AI crosses with Instagram. So when you search on Instagram, sometimes that prompts Llama, or it did previously.

Speaker 2:

Yeah. So that might be it on the backend. Right? Like, this feels like an oddly specific type of joke. And so I asked, why would you think I want Malibu-appropriate surf puns?

Speaker 2:

Fair call. I tossed that in without you asking for it. Mhmm. I try to mix in different flavors of jokes, and surf puns just popped into my head as a random option. No special reason behind it.

Speaker 2:

And I just said, I don't believe it was random. And it says, you got me.

Speaker 1:

You got me?

Speaker 2:

It wasn't my smoothest moment. But then it denies it again. Weird. It's just like, what kind of joke do you actually want? No hidden motives here.

Speaker 2:

No. I said, you gave a hyper-specific example based on my life. So I have to assume you were looking at my location or account for inspiration.

Speaker 1:

Instagram. I

Speaker 2:

get why that felt weird. I'm sorry it came off that way. And then it denies it again. So I said, you're clearly lying.

Speaker 1:

I mean, you have talked about being in Malibu on the Internet for a full year. It's possible it got baked into the pre-training or something. I don't know. But, yeah, I mean, what is personal superintelligence if it doesn't even know your name? Like, that feels like they haven't dialed in the harness or whatever the tuning is to actually

Speaker 2:

Yeah. And it's like, Meta's gonna be hyper-aware. We don't want a PR cycle.

Speaker 1:

Yeah. Yeah.

Speaker 2:

Like they trained on your data. Right? Like everyone's been,

Speaker 1:

oh, that ad was a little bit too close to home.

Speaker 2:

And you remember every once in a while, one of those, like, screenshots that's been screenshotted like a thousand times, like, goes viral. And it's like, I do not give Mark Zuckerberg permission. Oh, yeah. Yeah. Yeah.

Speaker 2:

Like that

Speaker 1:

works. Yeah. It's hilarious. Is this a rebuttal to the benchmark hacking allegations that happened last year? So according to Meta's internal benchmark tests, Muse Spark outscored Google Gemini on some tests and was competitive with models from OpenAI and Anthropic on others.

Speaker 1:

It significantly outscored xAI's Grok on most tests. Alexander Wang's hiring followed the disappointing release of Meta's previous model, called Llama 4. The company was accused of, and later admitted to, gaming a third party benchmark that it used to rank various models against each other on performance. It also delayed the rollout of its biggest model, called Behemoth, which it never ultimately released. And so when I look at a model card like this, yeah, you could call it a chart crime where it's highlighted in blue and it feels like it's the best, but it's actually only doing better on some.

Speaker 1:

It does well on HealthBench Hard. It underperforms on ARC-AGI 2, as you mentioned. But this maybe is the bull case here: that they have at least moved on from the culture of, like, optimizing for the benchmarks. Right? Isn't that a good thing?

Speaker 3:

There are rumors about them. Like, there's, like, extra bonuses if they got number one on LMArena. I think that was, like, something like the rumor. Yeah. But, I mean, you've seen a lot

Speaker 2:

of the labs kind

Speaker 3:

of move away from benchmarks generally, because I think they're just not that meaningful anymore. Like, a lot of them are, like, basically so saturated. It's like they're competing between 89 and 91%. Yeah. And they're just, like, not very meaningful.

Speaker 3:

Like, you you see

Speaker 1:

And you won't, like, actually feel that in the product necessarily.

Speaker 3:

Yeah. You kind of need to talk to these things for a long time before you can actually get the vibe. Yeah. But I do think this news is very interesting in the context of the, you know, Claude-nomics stuff.

Speaker 1:

The dashboard.

Speaker 3:

Yeah. Because, like, what okay. What does

Speaker 2:

it mean if

Speaker 3:

if the entire company has been, like, maxing their Claude tokens Yeah. Over the past month. It means that they weren't using this model.

Speaker 1:

Yeah. To me, it means they need to commoditize their complements, right? They need to bring down that cost potentially. I mean, we sort of dug into, are they spending $1 billion a month? Seems like absolutely not.

Speaker 1:

But they're clearly spending a lot. And if you can turn that OpEx into CapEx and train your own model and then inference it much cheaper on your own hardware, that feels like an economic opportunity that makes a ton of sense in the context of 10,000 to 20,000 engineers writing a lot of code.

Speaker 3:

Yeah. And I think there's basically, like, two ways to, like, square those two things happening. Either one, this model's, like, not that good, because the engineers aren't using it. Or, you know, your theory that they're just distilling Claude. So one of those is true. That

Speaker 1:

is not my theory. That is the schizo theory.

Speaker 2:

The news this morning, Meta Platforms and The Information: Meta Platforms has taken down an internal, employee-built leaderboard tracking how many tokens staffers were using.

Speaker 1:

Yeah.

Speaker 2:

It showed total usage over a recent thirty-day period; the amount is over 60 trillion tokens. The dashboard now displays a message that it is offline. It says, we've really enjoyed building this app on Nest for everyone. It was meant to be a fun way for people to look at tokens, but due to data from this dashboard being shared externally, we've made the decision to shutter it for now.
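Taking the dashboard's figure at face value, here is a rough sketch of what 60 trillion tokens in a thirty-day window would imply in API spend. The blended per-token price is a purely hypothetical assumption, not any vendor's actual rate card:

```python
# Back-of-the-envelope: 60 trillion tokens over thirty days, priced at an
# assumed blended API rate. The $5 per million tokens is illustrative only.
TOKENS_PER_30_DAYS = 60e12
ASSUMED_PRICE_PER_MILLION = 5.0  # hypothetical blended $/1M tokens

implied_monthly_spend = TOKENS_PER_30_DAYS / 1e6 * ASSUMED_PRICE_PER_MILLION
print(f"Implied spend: ${implied_monthly_spend / 1e6:,.0f}M per month")  # $300M
```

Even under that generous assumed price, the implied spend lands well under the "$1 billion a month" figure the hosts were skeptical of.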

Speaker 1:

It seemed like a fun side project. Mike Isaac was reporting on it here. He said it's down. Unclear to me if this was a homespun one by employees or an official one. Employee projects come and go frequently. Conspicuous timing, though.

Speaker 1:

But, yeah, you wanna measure the output, the impact, not necessarily the input and how much is going on there. Lisan al Gaib says Meta might actually be back with Muse Spark: still behind OpenAI, Anthropic, and Google, but ahead of xAI and Chinese labs. Muse Spark scores 52 on the Artificial Analysis index, behind only Gemini 3.1 Pro, GPT 5.4, and Claude Opus 4.6. Muse Spark is the first new release since Llama 4 in April 2025, and also Meta's first release that's not open weight. So a huge jump up in performance across a variety of benchmarks.

Speaker 1:

So all good stuff there.

Speaker 2:

The market is thrilled that Meta has released a Yep. A close to frontier Yeah. Level model. Yeah. Right?

Speaker 2:

This is a new group.

Speaker 1:

They've been

Speaker 2:

at it for less than a year. The stock is up almost 8% today. And again, you know, so much of the pricing pressure, the downward pressure on Meta, has just been kind of uncertainty on what all these tens of billions of dollars will actually go towards and what will be accomplished. And it's still unclear, like, are they gonna go after codegen at all?

Speaker 2:

Are they just gonna Yeah. Try to compete on the consumer LLM side? Very,

Speaker 1:

very. And can you economically go after codegen if you're just using it for internal models? If you're not selling it externally, can you justify the CapEx purely on the internal usage? Having this model be vended into all the different family of apps makes a lot of sense, because they have billions of users that will wind up interacting with this in one way or another.

Speaker 2:

Yeah. The question is, will they try to send Meta Vibes

Speaker 1:

Again, with the new model

Speaker 2:

all the way up to the top of the App Store charts.

Speaker 1:

Meta's new family of AI models can reach the same performance as Kimi K2 with only 30% of the compute, and only 10% of the compute to reach Maverick, so a much more efficient compute frontier here. Muse Spark is an early data point on our trajectory, and we have larger models in development. So the mythical 10-trillion-parameter model, that is the 10T, is what everyone's working on right now, 10 trillion?

Speaker 3:

Yeah. Probably Well,

Speaker 1:

flat.

Speaker 3:

True. In that range.

Speaker 1:

Yeah. It's all rumored at this point.

Speaker 3:

Yep. Rumored GPT four was something like a trillion. Yep. Right? You remember those memes where it's like a small circle and then the big circle And

Speaker 1:

then a huge circle.

Speaker 3:

GPT-four, GPT-five.

Speaker 1:

Yeah. Martin Casado has a little bit more context on, like, what actually unlocks new capabilities in AI models. He says, Mythos appears to be the first class of models trained at scale on Blackwells. Then there will be Vera Rubins. Pre-training isn't saturated, narrative violation.

Speaker 1:

RL works. And there's so much compute coming online. It's a narrative violation. Buckle your chin straps. It's gonna be wild. The scaling laws

Speaker 2:

And you know Brad Gerstner had to come in with a 100 of

Speaker 1:

the Hundred and hundred. Yep. For sure. Yeah. There's a crazy bull case for NVIDIA in The Information arguing it should be worth, what, $22 trillion.

Speaker 1:

That is a wild move. There's a lot going on. The scaling laws holding is the most

Speaker 2:

important Yeah. Article from The Information finance: NVIDIA worth $22 trillion? This old-school financial model says yes.

Speaker 1:

The big news yesterday was Anthropic's new model Mythos. Some really impressive statistics and anecdotes yesterday: the model card, the benchmarks, and some stories about breaking out of a variety of, what do they call them? Walled gardens or test environments? I don't know, the simulation.

Speaker 2:

Sandbox.

Speaker 1:

The sandbox. Yeah. Breaking out of sandboxes, sending emails, all sorts of stuff like that. The model preview is only available right now to about 50 companies that maintain critical infrastructure, because the model is particularly good at finding zero-day bugs and exploits in technical systems. And if they leak that out before big companies have time to go and address all the bugs, there could be serious ramifications for cybersecurity.

Speaker 1:

And so key partners include Apple, Google, Microsoft, Amazon, Nvidia, JPMorgan Chase, Broadcom, the Linux Foundation, Cisco, CrowdStrike, and Palo Alto Networks. They're all listed on the cybersecurity focus page for Project Glasswing.

Speaker 2:

Chris Backy was having a little bit of fun because he noticed Yep. Anthropic put their own logo on the partner page, which is a little bit funny. But at the same time, it's kind of smart, because a lot of people are just gonna see the image quickly. It's good to position yourself with the other companies.

Speaker 1:

So, yeah, it is interesting. I mean, people have predicted that AI models would be particularly good at cyber attacks, and this was one of the main sort of vectors of AI fears. It feels like this is maybe what Dario was referring to when he was talking about the end of the exponential. Finding and exploiting software bugs is sort of perfectly in the sweet spot for coding agents and reinforcement learning: combing through piles of code, tirelessly trying different exploits to find bugs, having a clear verifiable reward. Did you crash the system or not? Did you break into the system or not?

Speaker 1:

It's a very clear binary signal that you can send to the model to determine whether you were successful in breaking into that system. And it requires basically no time delay. There's no lag. So there was one snarky tweet I saw that was something to the effect of, okay, then if it's so good, go cure cancer. But any application that requires a real-world feedback cycle, even if it's just a few minutes of human interaction, in the cancer example, you know, you're going to need to be testing the drugs in vitro, in mice, in monkeys, in humans at some point.

Speaker 1:

Or even if you're just sequencing DNA or doing anything in the lab, pipetting anything, if it's even just a few minutes, all of a sudden every iteration, every attempt is going to take a few minutes, and that's going to put you on just a wildly different exponential, as opposed to being able to spin up a virtual machine with basically every single piece of software out there and then try every single exploit against every single piece of software, and you wind up with a ton of exploits. And it's very, very bullish for cybersecurity that this is being done preemptively. There's a whole bunch of different discussions. Ben Thompson has a good piece on the whole decision to release the model or not and stage it out, and the go-to-market there. But even if the bio research, the other impacts, are on sort of a slower exponential, there's still so much opportunity in even a software-only singularity.
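The gap between a software-only loop and one gated on real-world feedback can be put in rough numbers. Both latencies below are illustrative assumptions, not figures from the show:

```python
# Rough sketch: attempts per day for a purely automated exploit loop versus
# a loop that needs a few minutes of real-world feedback per iteration.
SECONDS_PER_DAY = 86_400
software_latency_s = 1    # assumed: one automated attempt per second in a VM
lab_latency_s = 5 * 60    # assumed: five minutes per real-world iteration

software_attempts = SECONDS_PER_DAY // software_latency_s  # attempts/day
lab_attempts = SECONDS_PER_DAY // lab_latency_s            # attempts/day
print(f"{software_attempts // lab_attempts}x more attempts per day")
```

Even with these conservative assumed latencies, the automated loop gets two to three orders of magnitude more iterations, which is the "wildly different exponential" being described.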

Speaker 1:

There's also risk in a software-only singularity. We've seen this story before, though: a model that's too powerful to release but then works its way out and has pretty moderate impact on the world. This was the story of GPT-2, the story of ChatGPT, the question of, is this the model that's dangerous to put in the hands of people?

Speaker 2:

Yeah. Pretty wild, a headline from February 2019 by Aaron Mac: OpenAI says its text-generating algorithm GPT-2 is too dangerous.

Speaker 1:

So there is, I think Ben Thompson called it, like, the boy-who-cried-wolf syndrome, the Mythos wolf. He says there's a lot of skepticism about Anthropic's announcement. This tweet was representative, from Buco Capital Bloke: Anthropic's marketing strategy is so funny. Like, ah, the government is treading on me. Ah, our models are so good we can't release them.

Speaker 1:

It would be too dangerous. Ah, someone stop me. I'm going to destroy the economy. The rolling of the eyes is exacerbated by the fact that Anthropic has reasons to not make Mythos widely available beyond a lack of compute. Another factor is surely trying to avoid having Mythos distilled by Chinese model makers.

Speaker 1:

So there's actually two good reasons to gate access. And when you're looking at those logos, when you're looking at the world's largest tech companies, there's much more ability to scale, roll out, demand, set pricing. These companies might be able to pay more. The model is very expensive. But if you're justifying that against bug bounties for zero-day exploits in your most critical systems, when you look at, like, JPMorgan Chase, it's a bank.

Speaker 1:

Like, what is the price of finding an exploit in that system? It's pretty high. It probably clears the token hurdle a lot. And if the rollout is paced evenly across all the different companies, they'll all sort of understand that they're getting allocation, inference allocation, at the efficient price that clears the cost to actually serve the model. So I do think the systems, all of these 10-trillion-parameter models, will be released broadly soon.

Speaker 1:

And the main reason is that an AI that's smart enough to find zero-day exploits should be able to recognize that it's being used by a bad actor to find zero-day exploits. It's only been a few months since the last flurry of competing models from OpenAI, Anthropic, and Google. And the next cycle is already off to an aggressive start. We had Meta. And then the other news is that Elon Musk announced that he is getting ready to do another larger model with xAI.

Speaker 1:

He's got seven models in training. Wow. That is a lot. Imagine v2, two variants of 1 trillion, two variants of 1.5 trillion, a 6 trillion model, and a 10 trillion model. He says there's some catching up to do, but he will never give up, never.

Speaker 1:

So he is continuing to grind and train more models.

Speaker 2:

Mike from Also Capital, a former guest, says, we've decided not to release our latest investment strategy. It's so powerful, releasing it might end the entire venture asset class as we know it. He says, you should release it to a handful of trusted partners so that we can harden ourselves to it.

Speaker 1:

George Hotz says, Anthropic's marketing strategy, it's amazing. It's so powerful. It's terrifying. And the best part is you can't come. By the way, if Anthropic had any way to ship this, they would.

Speaker 1:

Trained AI models are the fastest depreciating asset in history. GPT-4 cost $100 million to train two years ago and is now worthless. Qwen 3.5 27B, 1,000,000. Sending the FOMO back, clock is ticking, boys. So he says it needs something like an NVL72 to run at a decent speed, and even absurd API pricing doesn't cover it.

Speaker 1:

There's more to be made on investor hype than API access. I just wish for honesty instead of a whole fake spiel about safety. Who remembers when GPT-2 1.5B was too dangerous? And so, lots of back and forth. Dean Ball has some more thoughts on Mythos. It's a longer post, so we'll let you go and read it.

Speaker 1:

The main take is just, you know, this is technology that, whether it comes from Anthropic or another lab, clearly needs to go into the supply chain of the world and into the US government and the US economy. Because even though some of the exploits were somewhat minor, no one is arguing that we need less cybersecurity. We want the most secure systems possible, and we probably want a lot of competition between different companies to provide that service to the government. And so hopefully, if the war comes to an end, different discussions can happen, ice can thaw, and there's a way for these companies to work together. Even if the supply chain thing doesn't go through, then Anthropic can vend technology through Project Glasswing, through CrowdStrike, through Oracle and other partners, to Cisco, so that at least the systems are secure, because everyone wants that. He says, a lot of people, including people in positions of authority, told us recently that models of Mythos' capability wouldn't be a thing, that models with obvious national security implications would not be forthcoming.

Speaker 1:

Those people were wrong. There's nothing to do about it, but you should remember it. Mythos is the first model where theft of the weights by an adversarial actor feels like it would be a major deal. You better believe they will try. And if they don't succeed with Mythos, they will eventually.

Speaker 1:

We are thoroughly in the era where the labs' best models may well not be public the way they used to be. This is because of a combination of compute constraints, economic reality, competitive advantage, and safety concerns. Point three means the most relevant models may be decreasingly legible to the general public. And depending on the extent and duration of the coming compute squeeze, we could enter a market dynamic where the best models are only available to the highest bidder; in other words, where compute is a seller's market rather than a buyer's market. Interesting.

Speaker 1:

Imagine competing firms in the economy bidding against one another for access to the best and most tokens, and the frontier labs as, in essence, kingmakers. The governance regime I have described above in point four is not designed to stop that dynamic.

Speaker 2:

Scoop from Steven Nelson: the CIA used a secret tool called Ghost Murmur to find airmen in Iran. Yeah. Ghost Murmur pairs long-range quantum magnetometry sensors with AI to find human heartbeats. I was wondering about this while, over the weekend, there was Yeah.

Speaker 2:

You know, a search going on. How does somebody like, you know, an airman that's down send a signal Yeah. That can be picked up by one group but not

Speaker 1:

This is very odd. So there are some community notes on this saying that quantum magnetometry, I'm

Speaker 2:

Magnetometry. Probably

Speaker 1:

imagine that's how you pronounce it, detects heart magnetic fields. And I believe this technology works in labs, but only up to a few meters, not 40 miles as claimed. Fields decay with one over r cubed, making long-range detection implausible. So it's unclear if this is what worked, but there has to be some sort of device that you could carry on your person, like in your shoe, like an AirTag that can talk to a satellite almost. Like, you look at the Starlink receiver dish: it would fit in a backpack, but that's very high bandwidth.
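The one-over-r-cubed point from the community note can be put in numbers with a quick sketch. The two-meter lab range is an assumption taken from the "few meters" claim; only the 40 miles comes from the story:

```python
# Magnetic dipole fields fall off roughly as 1/r^3, so compare the signal
# at an assumed few-meter lab range versus the claimed 40 miles.
lab_range_m = 2.0                # assumed lab detection range, in meters
claimed_range_m = 40 * 1609.34   # 40 miles converted to meters

# Relative attenuation going from lab range to the claimed range
attenuation = (claimed_range_m / lab_range_m) ** 3
print(f"~{attenuation:.1e}x weaker at 40 miles than at 2 meters")
```

A signal more than ten trillion times weaker than the lab case is the quantitative version of "implausible" here.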

Speaker 1:

I imagine if you had something I mean, there are sat phones that are the size of large cell phones. Those were available in the '80s and '90s. You have to imagine that if you're just trying to put out a signal to GPS or the Starlink network, you must be able to shrink that down significantly to the point where it could be carried on your body. But it's probably classified. So it's just very hard to read into what's real and what's not here.

Speaker 1:

There is a different community note pushing back, saying: no note needed. This new technology is a classified system developed in secret by Lockheed Skunk Works and the CIA that was just used and revealed publicly for the first time. Naturally, its reported capabilities far exceed the known public state of the art. The note is relevant. So it's very, very interesting.

Speaker 1:

Anyway, thank you so much for tuning in today. A bit of a shorter show. We're experimenting with different things. Obviously, we don't have ad reads anymore, and so we are going to be mixing it up with more stories, more interviews, different timing, and more flexibility. And so we hope you enjoyed this show, and we will see you tomorrow at 11AM Pacific.

Speaker 1:

Sharp. Goodbye, Smoke. We love you.