TBPN

  • (00:00) - Intro and Overview
  • (02:57) - Ukraine’s drone attack on Russia
  • (03:21) - Soren Monroe-Anderson (Neros)
  • (09:02) - Connor Love (Lightspeed Ventures)
  • (13:45) - Erik Prince (Off Leash)
  • (15:20) - Elon Musk Vs. Donald Trump
  • (21:44) - Delian Asparouhov (Varda)
  • (32:48) - Cluely Update from dropout Roy Lee
  • (46:52) - Kian Sadeghi (Nucleus)
  • (01:00:12) - AI Day on TBPN
  • (01:00:14) - Mark Chen (OpenAI)
  • (01:28:49) - Sholto Douglas (Anthropic)

TBPN.com is made possible by: 
Ramp - https://ramp.com
Figma - https://figma.com
Vanta - https://vanta.com
Linear - https://linear.app
Eight Sleep - https://eightsleep.com/tbpn
Wander - https://wander.com/tbpn
Public - https://public.com
AdQuick - https://adquick.com
Bezel - https://getbezel.com 
Numeral - https://www.numeralhq.com
Polymarket - https://polymarket.com
Attio - https://attio.com

Follow TBPN: 
https://TBPN.com
https://x.com/tbpn
https://open.spotify.com/show/2L6WMqY3GUPCGBD0dX6p00?si=674252d53acf4231
https://podcasts.apple.com/us/podcast/technology-brothers/id1772360235
https://youtube.com/@technologybrotherspod?si=lpk53xTE9WBEcIjV

What is TBPN?

Technology's daily show (formerly the Technology Brothers Podcast). Streaming live on X and YouTube from 11 - 2 PM PST Monday - Friday. Available on X, Apple, Spotify, and YouTube.

Speaker 1:

You're watching TVPN. This week, what were the top stories? What were the most interesting things that we learned?

Speaker 2:

One Different interviews.

Speaker 1:

The the the Ukraine Drone Attack. That was huge. Yep. Operation spiderweb, a whole ton of of drones were smuggled into Russia in shipping containers. They emerged and went out and attacked bombers.

Speaker 1:

We had Soren Monroe Anderson from Niros on the show to to break that down for us. We also talked to Connor Love at Lightspeed who does a lot of in in defense tech investing. And, of course, a couple weeks ago, had Eric Prince on the show, the founder of Blackwater, and he had kind of predicted that the Ukrainian military was perhaps under rated. And and we might be seeing, like, something like this in the future. And so that was interesting to see to see play out.

Speaker 1:

So we will take you through those kind of interviews, recaps some of those. Then, obviously, we had the absolute meltdown between between president Donald Trump and Elon Musk. That unfolded on X and Truth Social, the two

Speaker 2:

Dueling

Speaker 1:

dueling social

Speaker 2:

social platforms.

Speaker 1:

Exactly. And although it's a highly political story, there are big business implications.

Speaker 3:

Totally.

Speaker 1:

Talking about what's going to happen in space between NASA, Boeing, different launch providers.

Speaker 2:

Yeah, the implications for Tesla, SpaceX, Eurolink, even The Boring Company.

Speaker 1:

Right? A lot of stuff

Speaker 2:

that many of Elon's businesses are heavily regulated. Yep. And, the potential impacts are substantial.

Speaker 1:

Yeah. Then we also, had had, some earlier stage founders on the show. Roy from Cluely came on and went pretty viral, put on a show.

Speaker 4:

You're surrounded by journalists. Hold your position.

Speaker 1:

Clearly is a a is a service to help you cheat on everything. Yep. We gotta actually try the app. Someone was asking was was pushing us to try it and see how good the product is. But regardless of the product, he's also a phenomenal marketer and he came on and put on absolute show.

Speaker 2:

And he's printing apparently.

Speaker 1:

Yeah, he's doing great. He

Speaker 2:

can't spend all the money that they're bringing in.

Speaker 1:

And so we'll give you an update on Roy and see where that business is. And then Kian, from Nucleus came on to launch Nucleus Embryo, which he calls the first ever genetic optimization software that helps parents give their children the best possible start in life.

Speaker 2:

Quite a lot of controversy there on the timeline this week. A lot of people hate it. A lot of people love it. And we'll let you kind of decide for yourselves.

Speaker 1:

And then we had a whole bunch of AI experts on the show from Google, OpenAI, Ananthropic, got to the front leading edge of the debates around AI

Speaker 2:

LLMs. Poor John yesterday, you were fighting as long as you could. You didn't want to talk about drama. You got dragged into it. But we did get some great coverage from Mark Chen at OpenAI as well as Sholto over at Anthropic.

Speaker 1:

Yeah. It was a lot of fun. The big news over the weekend was the Ukraine drone attack on Russia. They shipped shipping containers into deep into Russia, at which point, drones flew out of the containers and, hit strategic targets. We're gonna have two guests on the show today, Soren Munro Anderson from Niros to talk about that, and also Connor Love from Lightspeed to talk about, that and also Defense Tech investing generally.

Speaker 1:

Today, we have Soren, from Niros who builds drones and has been to The Ukraine, and so we'll bring him into the studio and, ask him how he's doing. How are you doing?

Speaker 2:

There he is. Welcome. Great.

Speaker 5:

How are you guys?

Speaker 1:

We're good. We have a new sound board. So expect some wild some wild cards. Wild stuff. Could you

Speaker 2:

give us a high level overview of the history of drone warfare in Ukraine? Because I understand it's been progressing super rapidly on both sides and it'd be helpful to understand kind of the different stages.

Speaker 1:

Do they ever have like predator drones? Like the global war on terror type of drone? Or do they jump straight to quadcopter and kind of like leapfrog the technology?

Speaker 5:

So you've had this Russian aggression war in Ukraine since 2014. Obviously the full scale invasion was 2022, but even during that period before the full scale invasion, there was some usage of drones for surveillance and dropping explosives. These are primarily still small drones like what you're seeing now, but this was not a proliferated technology. Then when the full scale invasion happened, within a few months, the Ukrainians started thinking about all these ways that they could use inexpensive drone technology to get an asymmetric advantage, and that is where FPV drones started becoming a really, really big deal. So they pioneered really, this idea of, you know, putting an explosive on a racing drone and using that as a precision strike weapon.

Speaker 5:

There were instances of this happening in other places, but they really scaled it and they've really refined it. And then Russia was was much slower to take it seriously, although now they're they tend to, in some ways, outproduce Ukraine and they have a much, you know, more direct line to China, where most of these components are coming from. But since 2022 and FPV is just starting to get used, now it's reached an unbelievable scale. It's estimated Ukraine is gonna produce four and a half million FPV drones this year, and those are ranging from, you know, ones that are this big to 15 inch propellers, fiber optic controlled drones, many different types and sizes of warheads, different configurations. And I can talk more about the drones that were used in Operation Spiderweb as well because those were really interesting.

Speaker 5:

But what we've seen is just this vast technology landscape, where new clever ideas like fiber optic are, you know, gonna be the hot thing for a few months. And then they sort of just become another tool in the tool belt. And it's just this constant arms race.

Speaker 2:

Yeah. This attack was unique in a bunch of different ways, but is this something that had been to your knowledge or just, you know, more generally known to be something that had been attempted multiple times or, maybe like, I'm curious to know, yeah, kind of the backstory on this type of attack. Because it seems, you know, it's massive difference to be using this technology way behind enemy lines versus using it at the front line.

Speaker 5:

Yeah. So primarily, FPV drones are used on the front line, say, the 30 kilometer band across the zero line. What was so unique here is that it was FPV drones, short range drones, being used 4,000 kilometers inside of Russia. Yeah. It it was this unbelievable application where, you know, we you've seen the Ukraine using long range one way attack drones that are going, you know, 1,500 kilometers to strike targets deep inside of Russia.

Speaker 5:

But here, these were small drones actually driven in on on trucks, basically in the tops of shipping containers. I don't know of any operations that were similar to this beforehand. I think it was not something they wanted to give away, and the drones were actually operating on cellular. They were not operating on local, like the normal low latency local radios you use for FPVs typically. And so I think this is going be something that a

Speaker 1:

lot of people are going

Speaker 5:

look at and see if you have drones that are operating on cellular, you can't really tell them apart from cell phones. That's really hard to defend against, really hard to But now it's going to be part of air air based defense is thinking about drones that are operating on cellular being piloted from basically anywhere in the world.

Speaker 2:

Talk about the Russian response, the immediate response to this incident from the footage that I saw and I think most people saw that that tracked it. It seemed incredibly challenging to respond to it quickly. By the time you could sort of organize a response, a lot of the the core damage had been done. What do you think the question I think that every country is asking themselves now is how do you defend against this type of attack whether you're at war like Ukraine and Russia are, or you're just thinking long term.

Speaker 5:

Yeah. This clearly poses a massive threat to critical infrastructure. I mean, being blatant, The US does not have any defenses in place that would stop this from happening. We already know we there's already news stories about drones that are flying over our air force bases, and we can't do anything about it. And I think the only approach here has to be a a multilayered system where you're looking at all of the different types of electronic warfare and also considering things like satellite communications and cellular communications where you're basically able to turn those off on the flip of the switch, which is a huge, a huge inconvenience and a huge, thing to build into the infrastructure, but clearly that's going to be required.

Speaker 1:

Welcome to the stream, Connor.

Speaker 6:

How are you doing? I'm good. I'm doing alright. Good to be back, guys.

Speaker 2:

Yeah. Oh, he's got a suit this time.

Speaker 1:

Oh, looking great.

Speaker 6:

I wanted to I I won't say I dress up just for you, but, you know, I would have taken the suit off far before this, you know, if I wasn't coming on.

Speaker 1:

Fantastic. Good. Good. Thanks so much for jumping on. Have you been tracking the Ukraine story closely?

Speaker 1:

Any insights there? Anything in the portfolio that's, at all relevant in the defense tech world? Do you expect a response from the US government or guidance or change to any strategies? Really any takes on that?

Speaker 6:

I mean, first shit, what a time to be alive. I mean, I'm sure your Twitter feeds and your group chats were blown up, no pun intended, over the I mean, it's pretty crazy. I mean, let's be honest, like, first, I'm not shocked that the Ukrainians did this. I mean, the execution seemed to be flawless from what we can pull from open source intel. I do think though, I mean, again, it's not a surprise that the Ukrainians have been mastering drone warfare for the last handful of years.

Speaker 6:

And you want to call it that they called it spider web. This was their this was their charging horse. This was their Israeli beeper. And the outcome is pretty impressive, to be honest. I mean, what from the outside looking in, the Russians woke up over the weekend and they thought they were getting their $4 team orders and what did they get?

Speaker 6:

They got a thousand, you know, FPV drones, you know, blowing them to smithereens. So it's pretty impressive. I mean, my takeaways from this are really twofold.

Speaker 1:

Please.

Speaker 6:

The first is like, there's never been a clear signal of where warfare is going. And to be clear, I view this from both the entrepreneurs in my portfolio, but also from my perspective, I mean, the world is about cheap, attributable, a lot of times autonomous systems. And that's playing out warfare that's playing out in other areas of life. And then the second thing is, candidly, it's like it's really hard to defend yourself at the pace at which things are changing. And again, like I know we do some things here in The United States and are trying to be on the front end of a of this innovation.

Speaker 6:

But when this happens, I think this almost just resets everyone again and says, all right, how do we respond to it? And I think it's to your point, it's not a direct US response. It's more of, hey, what do we need to buy? What do we need to develop for our own fight in some way, shape or form?

Speaker 2:

Yeah, what do you think, obviously, you're a venture capitalist, not a geopolitical strategist, but what's the right Russian response to this? Is it, hey, we suddenly need to be wary of having cell coverage anywhere near strategic military assets? I mean, seems like Ukraine and Ukraine Ukraine Ukraine's perfect world, they could run this style of attack a bunch and copy and paste and hit other targets. But it feels like something that was dependent on cellular technology that that's something that the Russians can revoke, you know, fairly fairly quickly. Sure, it'll be inconvenient, but I'm curious if you have a take.

Speaker 6:

Yeah, I mean, to be honest, when I think about how do you defend against this? I think there is, I wouldn't call this the easy answer of just turning off the cellular network. I actually think the only way to do it kind of practically is in layers or in a multitude of different ways because, yeah, the reality is if you looked at how the Ukrainians carried out this attack, they did so on the local, you know, Russian cell network, which again, I don't think any Russian kind of defense unit, any of these basis was ever thinking that they would have to turn off their own cell network. And then there's just the practicality of how you do it. I mean, I I think there was what, four or five different attacks that hit all at the same time.

Speaker 6:

What do do? You turn off the network for tens of thousands, hundreds of thousands of people. And oh, by the way, this is like a dirty little secret that nobody talks about. You know, yes, you have your military systems that are protected and all that, but a lot of coordination is happening through WhatsApp. A lot of coordination on, so all of a sudden you turn off the cell networks, you're actually inhibiting your own defense, your own response, first responder, you know, getting your own people out of there.

Speaker 6:

So I think it's a bit more complex than that. And then the last thing I'd say is just like, even if you do this in layers, you need to be resilient in a way, but you're not going to stop everything. I mean, was just brilliant masterclass of, again, if maybe there was a plan, we didn't know this, but maybe there's a plan for 100 bases and we only hit five of them. So if you think about just the broad geographic coverage you have to have, I think to be a % certain on anything, it's just you it's impossible. You can't do it.

Speaker 7:

The most capable military in in Europe right now is Ukrainian military. The lessons learned that they have are very significant. The drone tech is far and away the best. Their ability to to fight against and to even conduct electronic warfare and even closer support in this environment is, is leaps and bounds ahead of even what the US military is. So that's the military to learn from.

Speaker 7:

Ukraine does have a corruption problem. I hope sincerely that Trump is able to get a ceasefire in place and to stop this killing because it's it's absolutely pointless. It's just Slavs killing Slavs at this point, and it's it's nobody's gonna advance. Mhmm. And you have a blend of old and new.

Speaker 7:

I mean, if you look at the pictures of the front there now, it's almost indistinguishable from the Battle of the Somme. Artillery duels, static lines, bunkers, all the rest. Now the problem is somebody can fly an FPV into your bunker on the other side, but between tens of millions of landmines, which make armored breakthrough very difficult, it slows down any attack so that the FPVs and artillery can get to it, you're not gonna see any kind of blitzkrieg, Hans Kuderian maneuver warfare there until some significantly different weapon systems come along. So look, Europe needs to get serious about it. They're far from it at this point.

Speaker 2:

Elon Post four minutes ago, the Trump tariffs will cause a recession in the second half of this year. Wow. Somebody else was saying, can I finally say that Trump's tariffs are super stupid?

Speaker 1:

Somebody else is

Speaker 2:

posting, Mads posting is saying it's Xi Jinping. He says bro, you seeing this? And it's Putin on the other end, he's just looking at it. Hold up, got a line and it's, we'll start pulling some of these up.

Speaker 1:

Ridiculous. What else is going on here? This is the present versus Elon.

Speaker 2:

Nival says Elon's stance is principled. Trump's stance is practical. Tech needs Republicans for the present. Republicans need tech for the future. Drop the tax cuts, cut some pork, get the bill through.

Speaker 1:

This is so crazy.

Speaker 2:

Antonio Garcia says remember there's F you money and then there's F the world money. Will Stansell says imagine being ICE agents suiting up for your biggest mission of all time right now. People are saying that Trump's gonna deport Elon.

Speaker 1:

Elon back to South Africa.

Speaker 2:

Well, DePuze says time to drop the really big bomb growing Daniels in the Epstein file.

Speaker 1:

That is really terrible. No. We

Speaker 2:

had a question from a friend of the show. He said the real question is if Tesla is down 14%, how could SpaceX and OpenAI be trading I mean, I'm just saying like at a high level.

Speaker 1:

Yeah yeah yeah.

Speaker 2:

You know, China is the big beneficiary here of Sarah Guo says if anyone has some bad news to bury, might I recommend right now?

Speaker 1:

Yes, yes, yes. You, what's the canonical bad startup news like, oh yeah, you missed earnings or something. Drop it now.

Speaker 2:

Inverse Kramer says Bill Ackman is currently writing the longest post in the history of this app.

Speaker 1:

And we have a video from Trump here if we want. I can throw it in the tab and we can share it on the stream and react to

Speaker 2:

it live. Lex Friedman says to Elon, that escalated quickly triple your security. Be safe out there brother. Your work SpaceX, Tesla, XAI and Neuralink is important for the world.

Speaker 1:

We need to get Elon on the show today. If somebody's listening and can make that happen, I would love to hear

Speaker 8:

from you.

Speaker 2:

Max Meyer says so I got this wrong. I didn't say it never happened but I thought it wouldn't. I'm floored at the way this has happened. He didn't think they would have a big breakup. Many people didn't think they would have a big breakup.

Speaker 2:

Even just earlier this week, it seemed like they might just have a somewhat peaceful exit. Trump just posted a little bit ago. I don't mind Elon turning against me, but he should have done so months ago. This is one of the greatest bills ever presented to Congress. It's a record cut in expenses, $1,600,000,000,000 and the biggest tax cut ever given if this bill doesn't pass, there will be a 68% tax increase and things far worse than that.

Speaker 2:

I didn't create this mess. I'm just here to fix it. Anyways, lots going on.

Speaker 1:

Let's go to this this Trump video. Wanna see what he has. I've seen and I'm sure you've seen regarding Elon Musk and your big beautiful bill. What's your reaction to that? Do you think it in any way hurts passage in the senate which is, course, what is your seeking?

Speaker 4:

Well, look, you know, I've always liked Elon, and it's always very surprised. You saw the words he had for me, the words of and he hasn't said anything about me that's bad. I'd rather have him criticize me than the bill because the bill is incredible. Look, Elon and I had a great relationship. I don't know if we're well anymore.

Speaker 4:

I was surprised because you were here. Everybody in this room practically was here as we had a wonderful send off. He said wonderful things about me. You couldn't have nicer, said the best thing. He's worn the hat.

Speaker 4:

Trump was right about everything. And I am right about the great, big, beautiful bill. But I'm very disappointed because Elon knew the inner workings of this bill better than almost anybody sitting here, better than you people. He knew everything about it. He had no problem with it.

Speaker 4:

All of a sudden, he had a problem, and he only developed the problem when he found out that we're gonna have to cut the EV mandate because that's billions and billions of dollars, and it really is unfair. We wanna have cars of all types. Electric, we wanna have electric, but we wanna have a gasoline, combustion. We wanna have different. We wanna have hybrids.

Speaker 4:

We wanna have all we wanna be able to sell everything. He hasn't said bad about me personally, but I'm sure that'll be next. But I'm I'm very disappointed in Elon. I've helped Elon a lot.

Speaker 2:

That is mister president. Did he

Speaker 6:

I just wanna clarify. Did he raise any of these concerns with you privately before he raised them publicly? And this is the guy you put in charge of cutting spending. Should people not take him seriously about spending now? Are you saying this is all sour grapes?

Speaker 4:

No. He worked hard, and he did a good job. And I'll be honest, I think he misses the place. I think he got out there, and all of a sudden, he wasn't in this beautiful Oval office and he was and he's got nice offices too. But there's something about this when

Speaker 7:

I was telling the chance Folks.

Speaker 4:

This is where it is. People come in

Speaker 2:

Breaking news, Dalian. That's Gruhab. He's joining us in the temple for some live reactions. Come on in.

Speaker 1:

Guests. I can't even spell surprise guest. I'm so excited about this. Yeah.

Speaker 2:

In other news, eleven Labs dropped a new product.

Speaker 1:

In other news, $2,000,000 seed round. Stop it. Stop it. We love Eleven Labs.

Speaker 2:

No. They'll they'll they'll

Speaker 1:

Keep grinding. But just launch again tomorrow. You're going to have to launch again. Start shooting a new vibe reel. Start shooting a new writing a new blog post because no one's good.

Speaker 1:

Lulu says yes. Delay the launch on TVPN.

Speaker 2:

So basically right now I can just pull up and just refresh. I'm gonna just be refreshing True Social.

Speaker 1:

So okay, Jordy's on True Social. I'll be on X. Give us your reaction, Delihan, what's going on?

Speaker 3:

Mean, at some point I was like, I'm just sort of scrolling X and I like tuned into you guys like an hour ago and I like, like, they're talking about some AI thing. Was like, at some point they're gonna switch to like, we have to do. Then I was watching it and I was like, okay,

Speaker 2:

like, gotta John resisted. I

Speaker 1:

fought it for like a for like a half an hour, but we couldn't do it. But yeah, give us your quick reaction.

Speaker 3:

I mean, I'll always, you know, sort of give it from the, you know, sort of space angle. You know, it's amazing that, you know, how much the world has shifted since, you know, Friday of last week, whereas, you know, sort of presumed that Jared Isaacman was going be the, you know, sort of NASA admin, to today, it was released, the Senate reconciliation package re added budget back into NASA, largely for the SLS program, which was basically the program that, you know, sort of Jared and Elon were, you know, sort of largely advocating to, you know, sort of completely shut down. So, you know, it has already shown, like, you know, the sort of counter reaction, you know, is already showing up, you know, in policy.

Speaker 1:

Sorry, SLS program, is that space shuttle or no?

Speaker 3:

Sorry, that's the SLS launch rocket. It's based off of old space shuttle hardware, but it is basically the internal, you know, sort of NASA run competitor effectively to like a Starship heavy launch rocket. Yeah. Because it was, you know, sort of generally behind budget, behind schedule, and there are so many commercial heavy lift rockets coming online, the default was canceled. That is largely sort of a Boeing based program.

Speaker 3:

And if you look at, you know, three months ago, you know, when they were announcing the F-forty seven program, know, Elon walks into the Secretary of the Air Force's office. Obviously, he'd been sort of ranting against, you know, manned fighter jets and believing that that shouldn't be what, you know, be what the department is prioritizing. Thirty minutes after that meeting was when they announced the F-forty seven program. And so now you're seeing basically like the equivalent in space where, you know, you know, that was obviously awarded to Boeing. Boeing was the, is the largest prime behind, you know, SLS.

Speaker 3:

You know, Boeing basically, you know, is going to be the biggest winner of, you know, NASA refunding USSLS and Jared Eisenman not being NASA administrator.

Speaker 2:

So tying this back to the timeline, Trump posted less than thirty minutes ago, in light of the president's statement about cancellation of my government contract, SpaceX will begin decommissioning its Dragon spacecraft immediately. Break that down.

Speaker 3:

I mean, that just means that we no longer have a vehicle that can go to the International Space Station. We no longer have a vehicle that can bring astronauts up and down. You know, also don't have a vehicle that can deorbit the International Space Station safely, right? The Dragon was expected to be able to do that. So what that means is, you know, if you guys remember all the memes about stranded, you know, from last year around Boeing Starliner, it now means that the space station, you know, itself is basically, you know, sort of stranded.

Speaker 3:

And that's like, you know, one of the government contracts, obviously, that, you know, SpaceX is involved in. Elon, I've heard generally, like, just wants to shift all things to Starship anyways. And so in some ways, was probably kind of looking for an excuse to, you know, sort of shut down Dragon and refocus energies. There's also a part of where it's like, look, he is like kind of independent in the space world and that, you know, Starlink's total top line revenue is going to be passing the NASA budget in the next year or And so in terms of like size of, you know, state actor that can influence space, you know, his own company is basically about to become, you know, as large of an actor as like the entire United States. So I don't think there's going be like a de escalation here.

Speaker 3:

Like, you know, my estimation is like on both sides, it's going to continue to escalate. You know, if we thought that we lived in dynamic times, you know, when Trump got into office, it's going be even more dynamic when he was

Speaker 1:

like, The dynamism will continue until morale improves.

Speaker 3:

Elon, the center, AOC, the progressive populist and Trump, the sort of conservative populist and man, it's remarkable I

Speaker 2:

mean, I just have so many questions, right? How does this impact Golden Dome?

Speaker 1:

What's Boeing stock doing?

Speaker 2:

Will Golden Dome even be a viable project without SpaceX?

Speaker 3:

I think there's just going to be more resistance probably to working with, you know, sort of up starts because they would be ones that would probably be more likely to collaborate, you know, sort of with, you know, SpaceX. So,

Speaker 1:

yeah, feels like Boeing would be a logical beneficiary of this turmoil and yet they're down today. They haven't really popped.

Speaker 3:

Oh, really? Yeah. I mean, I'm not obviously, you know, one to give, like, you know, public talk.

Speaker 1:

Yeah. I I know. I'm I'm just kinda working through it myself and it's it's surprising. It just feels like

Speaker 3:

like Tesla to drop and Boeing to pop Yeah.

Speaker 1:

Yeah. That would be the expectation, but there there must be something there. Because there there it feels like this is purely interpersonal between Elon and Trump and not it's not like, oh, Boeing was secretly behind the scenes the whole time lobbying even more effectively. It doesn't oh, you got the well, where's the tinfoil hat? Maybe we need a tinfoil hat.

Speaker 1:

But yeah, when you're in Boeing world, it's like, hey, we're only down 1%. Let's The cool of the century.

Speaker 2:

My question is, has has there ever been a crash out of this magnitude ever?

Speaker 1:

In history.

Speaker 2:

Well In Internet history.

Speaker 1:

When when Elon and Trump became

Speaker 9:

friends

Speaker 2:

Honestly, world scale.

Speaker 3:

I actually probably world history equivalent. I feel like there was something in, like Yeah. We maybe didn't have to that. In United States where Yeah. You know, was a

Speaker 9:

crashing out out between

Speaker 2:

used to mean calling up The New York Times and just ranting. Yeah. Now you can just live post, like, all your reactions, and it's just all real time. This is like crash outs are actually intensified. You actually wanna be long crash outs Yes.

Speaker 1:

Definitely. Over the next

Speaker 6:

20 four of So,

Speaker 3:

you know, you gotta be on both ex and and truth social to like stay on top of things.

Speaker 1:

Yeah. Yeah. I actually did like a deep research report a while back on like, has the richest man in America ever been close with the US President going back to like, you know, was Rockefeller particularly close? Yep. And and because the the the narrative was like, oh, this is like so unprecedented.

Speaker 1:

And in fact, it is unprecedented.

Speaker 3:

Oh, really? Yeah.

Speaker 1:

Yeah. I would

Speaker 3:

have guessed that like Rockefeller was close

Speaker 1:

to too. Me too. That's what I was going for. It was like, no. I imagine this is always it is always close.

Speaker 1:

But no. I I I think because the president has become more powerful globally, your your your your your your point about, you know, mayor of America, Dictator of the world, like, it becomes increasingly valuable for the richest man to have a close alliance. And so it's become more I I don't know exactly how accurate that research was. It's totally possible that, like, behind the scenes, Rockefeller was really close to the president at the time, and we just didn't write about it in the history books. But there certainly aren't very many anecdotes about the richest man in America going on Yeah.

Speaker 2:

So Pavel Pavel had a Preston had

Speaker 3:

a great for AP US history 2015, you know, a is the seed to get damaged?

Speaker 2:

Yeah. So this is this is

Speaker 3:

where, you know, Elon Musk called the president at the time a potential pedophile. Was it a, about Epstein Island, b, about a cave in The Philippines, c

Speaker 1:

What a mess.

Speaker 2:

No. So Pavel had a good post. He was quoting the the big bomb, from Elon. He said, hypothetical question about The USA's power structure. Is the man with the most access to capital more or less powerful than the political head honcho?

Speaker 2:

Purely hypothetical. It's a good question to ask.

Speaker 3:

I mean, I think both like archetypes have grown both in absolute power, but also in relative power to the rest of globe, basically since the Gilded Era, right? If you think about the President of The United States in 1925, I'd say pretty darn powerful, but like there was clear, like, you know, it was a, you know, sort of multipolar, you know, sort of world. Argentina was pretty darn rich at the time. Obviously, Europe was still, you know, sort of recovering from World War I, but UK was generally, you know, sort of doing well. Like, was not, you know, was clear that there was a, you know, sort of huge, you know, outweighed effect.

Speaker 3:

And then if you look at probably the, you know, sort of biggest, you know, industries at the time, I don't think you could claim that even like Standard Royal at its peak, I'd have to go look at the exact numbers, but that, like, it had the size of budgets relative to, you know, the US government in terms of, you know, sort of budgets, right? Versus I feel like now for the first time, you both have, you know, sort of US President extremely, extremely powerful. Yeah. And then you have, like, you know, sort of mag seven effectively, like the size of, you know, sort of, you know, huge states

Speaker 8:

or something

Speaker 3:

like they're, you know, fucking like their own state governments.

Speaker 1:

And then also just more bureaucracy, more red tape. So, like, I I when I think about the nineteen twenties, like, rubber bands, it's like, it is the it is the you can just do things era. And so you wanna build a railroad like, yeah, you might need to get like one rubber stamp, but it's not gonna be ten years and tons of lobbying and all this different stuff. You can kind of just go, you can just go wild.

Speaker 2:

You know it's bad when Kanye is saying bros, please know we love you both so much. The of reason is Kanye West.

Speaker 3:

Yes. Thank you. You need bring them together and, you know, form a peace treaty.

Speaker 2:

Nikita Beer just added his pronouns back to his bio.

Speaker 1:

Let's go.

Speaker 3:

Let's go. He's got a rubber band. Elon's got a rubber band all the way back to, you know, sort of extreme wokeism straight back to, you know, sort of super climate change. Wow. Somebody's

Speaker 2:

sharing, resharing the picture of the cyber truck blown up in front of the Trump Tower, I guess. And it's just like this.

Speaker 1:

This is in real life.

Speaker 3:

It was foretold. Yeah. It was a question of like when and what magnitude, not if.

Speaker 2:

Always bad if Vladimir Putin is operating to negotiate between president Trump and Elon, I think, I think a lot of the world is waiting for Roy Lee's take clearly the clearly army. Want that people have been asking him to get involved with geopolitics. Wow.

Speaker 3:

Love the, Sheil Mohad put up a, you know, sort of meme about, Narenda, the, like, prime minister of India. You know, he basically, copied and pasted the Trump Truth Social post about negotiating peace between India and Pakistan when it wasn't like actually fully negotiated.

Speaker 1:

So it

Speaker 3:

was like, you know, posting about, you know, negotiating a ceasefire between, Elon and Trump.

Speaker 2:

Funny thing is, like true social, you can just read all of Trump's posts without creating an account. You didn't It truly shows that like I would think that you would have to make an account to read them all but they just that it's not gated at all. It's his goal. Could be the biggest but you know, they clearly, I don't think they care about monetization. Bitcoin is actually, falling alongside falling.

Speaker 3:

Wow. Bitcoin falling, Boeing falling, Tesla falling. Who's the biggest winner of

Speaker 1:

the day? I think it's China.

Speaker 3:

China. Yeah. China. China. Sean Maguire.

Speaker 3:

They're quite

Speaker 1:

really sold off. It's it's down 3% today at a hundred and 1 k. So still up, but, you

Speaker 8:

know Yeah.

Speaker 1:

Yeah. Rough.

Speaker 3:

Winnie the Pooh just dipping his, you know, hands in that pot of honey, just snacking away, watching from the sidelines.

Speaker 1:

Yeah. Let's see. Chinese stocks. US stocks. Chinese I I can't find it.

Speaker 3:

Okay. That's probably my commentary on the day, boys.

Speaker 1:

Anyway, this is great. It was fantastic. Thanks for jumping on. Thanks for hopping on so quickly. Next up, we have Roy from Cluely coming back for an update.

Speaker 1:

He's hired 50 interns, I think, or something close to it.

Speaker 2:

He said they're bringing every intern on.

Speaker 1:

They're bringing every intern on.

Speaker 2:

We got every intern coming in.

Speaker 1:

Well, welcome to the studio, Roy. How are

Speaker 6:

you doing? Oh, woah.

Speaker 2:

Let's go. There they are.

Speaker 1:

I think we're overpowering you. Can can can you hear us?

Speaker 8:

Yeah. Yeah. We can hear you.

Speaker 2:

Yeah. Make sure we're zoomed out all the way so we can see everybody that we got a small army

Speaker 1:

here. This is incredible. How big is the team? Kick us off. How many you got at this point?

Speaker 10:

The team is, 11 full time plus the interns.

Speaker 1:

How many interns you got so far?

Speaker 10:

Interns. Bro, we're closing in on 50, brother.

Speaker 1:

Let's go. That's amazing. Congratulations. Are they all doing? How do you manage everything?

Speaker 1:

Is it just is it is purely social media? Is that what you want them to focus on? Growth?

Speaker 10:

Yeah. Yeah. Growth marketing. Like, the only goal of the company is get 1,000,000,000 eyeballs onto Cluely. So they have unrestricted creative freedom and permission to do anything and everything.

Speaker 10:

Just just make the company go viral. Every single person you see behind you has over a hundred thousand followers on some social media platform.

Speaker 11:

Wow. Thousand plus.

Speaker 1:

That's remarkable.

Speaker 11:

Me too now. Me too now.

Speaker 1:

Yeah. There we go. There we yeah. You probably popped. What's working?

Speaker 1:

What platforms have actually been, been driving the most growth? I mean, sure you've run a lot of tests. What have you learned that's, that you can share?

Speaker 8:

Bro. Ben, take it away, bro.

Speaker 1:

Let's hear

Speaker 12:

UGC has been really good. We just hit 10,000,000 views today eight days. Wow.

Speaker 2:

There we go.

Speaker 12:

Hoping to get a hundred million views in the next month.

Speaker 1:

What what what platforms specifically are the most fertile ground for targeting your specific customer? Because you can imagine that there's a lot of folks who are AI curious on X, but then there's much broader, more viral audience, more general audience on platforms like TikTok, YouTube, Instagram, what's working, and what is, is the next next platform that you're gonna be focused on.

Speaker 12:

Yeah. Well, we're trying to go viral on every platform regardless. But the main thing right now is Instagram Reels.

Speaker 1:

Mhmm. Oh, Instagram Reels. Interesting. And what is the main value prop that you're hitting people with? Is it still the cheat on test thing or have you evolved at all?

Speaker 1:

What? Still?

Speaker 12:

Like, interviews. Yeah.

Speaker 1:

Interviews. Okay. And, has there been this was controversial when you launched it. Is it still controversial in the comments? Are you getting flamed?

Speaker 1:

Has anyone big dunked on you and has that driven virality? Is that actually a net positive?

Speaker 10:

Instagram is not like Twitter, like, you could post the craziest shit on Instagram and they will still not think it's controversial.

Speaker 1:

Really?

Speaker 10:

So, how to make it controversial, like, we have to engage and bait some other way. Like, is cheating tool is controversial on Twitter but on Instagram, you could you could have, like, a white guy say the n word 10 times and it's still not controversial enough, you know? Like like, you need crazy sh t on Instagram and that's what we crack. Every single person here has, like, very great viral scents and

Speaker 1:

if

Speaker 10:

you watch the reels that do go viral, you see there's, like, ways that we've engaged and beta the videos, and this is what we'll keep doing to, probably a billion views a month is what probably

Speaker 2:

How how long does it take to figure out if an intern is cracked? Is it, like, an hour, two hours? How much time do you need?

Speaker 10:

For me, personally? Me personally, probably like ten minutes. But for anybody watching, probably would take like one one or two weeks.

Speaker 2:

There we go. There we go. How do you guys think about how do you guys think about product marketing? Obviously, you're just going viral everywhere, getting all this attention. How do you make sure that it that it he's shaking his head off.

Speaker 2:

Doesn't think about it's not about the product. It's about the attention. Attention

Speaker 10:

is all can't anything go viral. Yeah.

Speaker 2:

Yeah. But but but how do you

Speaker 13:

side of the street. You know? You, make some UGC videos, make some Twitter posts. You know? You can sell anything.

Speaker 13:

You know? In 2025, product doesn't matter. You know? I could jack off off the side of a building, sell some videos of it for $20 each, make $2,000,000,000,000. It's crazy.

Speaker 2:

2,000,000,000,000. That that's intense. How do you guys think about burn? Is it on your mind is it on your mind at all?

Speaker 3:

I don't know

Speaker 1:

if you saw

Speaker 10:

the last tweet, but as of literally, like, two days ago, we're still we're still cash flow positive. We're still fucking profitable.

Speaker 1:

We're still profitable.

Speaker 2:

Let's give it up for your profit. Let's hear it. Let's hear it. It sounds So

Speaker 1:

you're charging for the products and people are paying. Are they at all satisfied or do they feel like they got Oh,

Speaker 9:

they're all

Speaker 10:

satisfied, bro. Like the product works. You're either using this as a consumer and it's working because, like like, you're passing your interviews and or if it doesn't work, you're not gonna complain to me because I'm a go write to your employer and tell them, yo. Guess who's complaining about using the product? Like, I'll get you blacklisted if you complain really

Speaker 2:

how are you thinking about how do you how are you guys thinking about product evolutions? What do you wanna add to the product? Obviously, you wanna help people cheat on everything. Where where are you gonna help people cheat next?

Speaker 10:

We don't care about like, the product is going to be led by the virality of the content. Mhmm.

Speaker 11:

We have

Speaker 10:

video ideas right now that we're gonna try to push for different use cases. We're gonna see which ones go consistently the most viral. If you can make something go go more viral, then, like, you can just build the technology after you have all the attention. So we'll figure out the exact use cases and exact niches we're gonna quintuple down on once once these guys get to work.

Speaker 1:

What what, what formats on Instagram Reels are, like, the most modern in terms of, consistently viral? Like you mentioned like man on the street interviews, what do do for a living? That's always been fertile ground. What about, I see a lot of those like mobile game ads that look like, you know, you're fighting down some sort of bridge and then you go into the game, it's actually just match three. What what what are the different formats that you like to pull from?

Speaker 10:

Every week, there's two new ones. And at the end point, there's probably 10 to 20 viral trends that is happening. And these cycles so quick, you need to keep your finger on the pulse. These things will, like,

Speaker 9:

expire immediately.

Speaker 1:

You

Speaker 10:

need be on the ball and, like, if I if I told you right now, by the time people watch this on YouTube, like, would have all been expired.

Speaker 1:

Well, we're live, so so give us give us the latest and greatest. Like, what's going viral today?

Speaker 12:

Well, right now we got 10 views using a Snapchat format.

Speaker 1:

Which has

Speaker 12:

been viral for, like, the last three years, to be honest.

Speaker 1:

Okay. And

Speaker 12:

and I I think that, like, we just have to get people who continuously scroll TikTok, like, six hours a day.

Speaker 1:

Yeah. But what's the what's the actual format that you use? Like, describe the video. What is the hook? Like, break it down for me like you're explaining the, like, the art behind the viral format.

Speaker 10:

There's a caption. It starts with a face. Usually, a a handsome dude or a pretty girl. They're saying, damn. This interview is starting with the interviewer starting with the hard questions.

Speaker 8:

I should

Speaker 9:

have been

Speaker 10:

a CS major, not a business major. That's engagement made because people are saying, like, bro, like, CS is way harder than business. Then it turns around the interviewer asks, like, like, hey, how are you doing? Why should we hire you? And then this guy uses Cluey to generate a response but he can't f cking read the response so he reads it hella autistically, like, oh, I revel in detail and and then that's that is, like, another conversation point, like, people are cooking on the guy because he, he can't read properly.

Speaker 10:

The guy is like a doing a really dumb interview using Kuwait.

Speaker 1:

That's great. That's great.

Speaker 2:

How are you guys using AI generated content internally? I know a lot of these the videos that you guys are creating are just typical social media vertical video. Do you have an intern that's just generating, basically copy and pasting, making

Speaker 1:

a bunch of other Do have a vo three or any tools relevant? Anything clicking?

Speaker 10:

Not not yet. I think there's still like a 10% left before they cross the uncanny valley.

Speaker 1:

Mhmm.

Speaker 10:

And the biggest thing is that people need to think your video is real. Mhmm.

Speaker 9:

That, like, like, that is

Speaker 10:

a difference between 100 k views and 10,000,000 views if people think it is real.

Speaker 2:

Yeah. AI CEOs bearish on AI.

Speaker 1:

About yeah. What?

Speaker 10:

Google needs, like, 10 more Chinese researchers to, like, figure it out. And once once they push out the latest update, then then then b o three will be there. But right now, we need real people.

Speaker 1:

Yeah. Yeah. Yeah. Well, I I mean, what about just using AI as, like, stock footage replacement? Not not as the lead in for the video, not the entire video, but just, like, sprinkled in to illustrate a point, you know, an establishing shot of, a building, a helicopter pulling into a building.

Speaker 1:

Like, that that historically has been kind of something that you would reach to, you know, Adobe Stock video for. VO three feels like it's there, but are you not drawing on that at all yet?

Speaker 10:

If there's a viral format when we need it, maybe we'll use it. But right now, like, it's it's really brain dead to go viral on Instagram. Yeah. Formats are not hard. You don't need a helicopter.

Speaker 10:

You need a guy, a camera, a really shitty camera. You need a computer.

Speaker 4:

That's

Speaker 1:

it. I mean, what about, like, those those kind of, like, AI mashups, like Harry Potter Balenciaga or the, the kangaroo with the plane ticket getting on the plane? Like like, AI content can go viral when it's really, when it's, like, inspired almost by a human. It's not it's not entirely AI generated, but it's using the tools effectively to create something that's, like, still catchy. Do you think you'll be using any of that anytime soon?

Speaker 10:

Pro probably very soon. We're scaling up. Like, what you see right now is probably about less than 1% of what the size we will be by the end of this year. Like, we are profitable. We're not trying to be profitable.

Speaker 10:

We just keep making so much money we can't help it. So we're really scaling this shit up to I'm not even trolling you. 1,000 creators are gonna be shipping out content. We're doing a complete Internet takeover.

Speaker 1:

Okay. So so so why in house? Why why like like, why do they even have to be employees? Couldn't you turn this into like a multilevel marketing scheme or something? A pyramid scheme?

Speaker 10:

Like Actually, what we're gonna do. That's

Speaker 1:

Oh, that's what you're gonna do.

Speaker 2:

Okay. MLM.

Speaker 6:

MLM. MLM.

Speaker 2:

Hey. How are guys worried that you could be infiltrated by journalists? I'm sure they're circling the house right now.

Speaker 4:

The hit

Speaker 1:

pieces are gonna come. You know, we're we're doing a softball interview right now. I

Speaker 2:

mean, the

Speaker 1:

person is brave

Speaker 2:

enough to try to do a hit piece on the Cluelly Army is Oh, it's gonna be it's gonna be

Speaker 10:

I bet they're dying too. Look, more more eyeballs is better. No company that ever died from a founder being too controversial. You got Deal fucking infiltrating with genuine spies they're still doing fine, bro. You got all the workers 17 guys.

Speaker 10:

They're they're they're still kicking like no company ever dies from being too controversial. You die because you don't make enough fucking money.

Speaker 11:

Yeah. Yeah. Yeah.

Speaker 1:

Yeah. Speaking of making money, what what's the pricing model right now? Are you doing anything on price discrimination? Is there a super high tier if you get a whale? What does it clearly whale look like?

Speaker 1:

Can I spend $2,000 a month on this service?

Speaker 2:

Yeah. You should add a tipping feature too.

Speaker 1:

Yeah. People

Speaker 2:

should people should be able to tip you guys if if they have a good experience and get the job.

Speaker 1:

Like, really financialization, pay as you go, high interest rate loans. Just really push it. Make it sports gambling in there, maybe. Just throw it all in.

Speaker 10:

Yeah. I mean, it's $20 a month for Okay. Super, a hundred dollars a year, and and our top line revenue is really being driven up by Enterprise. Mhmm. Enterprise, you're

Speaker 1:

have to

Speaker 10:

talk to the sales team to get a custom quote but, you know, like, there's a lot Wait,

Speaker 2:

are you serious? What are the is that more on the sales side? What who who are the Enterprise?

Speaker 1:

So you sell the SDRs?

Speaker 10:

You guys laugh because you think I can't sell enterprise because

Speaker 2:

I'm No. I I I don't believe

Speaker 1:

it. I trust.

Speaker 10:

Like, these 20 these Fortune 500 CEOs, like, are, like, 35 year old dudes who sit there, school reporter laughing at my post.

Speaker 1:

Yeah, yeah, yeah, yeah, no, no, seems it seems legit, it makes sense.

Speaker 2:

No, I I believe it.

Speaker 1:

But but, I mean, you're going even higher tier? Like, what the $2,000 a month Cluely Vision?

Speaker 10:

For for a consumer, there's a lot more we can do with more compute but right now, we're, like be honest, I didn't expect to grow this fast, the edge team is quite small, I'm, like, spending a lot of time trying

Speaker 11:

to hire -Sure.

Speaker 10:

More more competent engineers. Have a lot of backlog tasks that we need to fill out, especially for this last contract that we signed, so we're full time focusing on the one big guy that we got right now. And after that, then we'll try and scale this up. But right now we're focused on the one one big client that we signed.

Speaker 2:

Yeah. Talk about your compensation strategy. The people want to know. You said you can raise infinite capital and you're so confident. I believe you.

Speaker 2:

But but I'm curious to to get some more insight there.

Speaker 10:

Bro, I feel like it's so retarded to be

Speaker 9:

a company. Sorry. Am I allowed to say that? No.

Speaker 2:

You're not allowed.

Speaker 1:

No. This is a family friendly show.

Speaker 10:

It's very stupid to be a company.

Speaker 1:

Like, try

Speaker 10:

to race to the bottom to see how little you can pay your employees. Bro, if I'm making hella money, we're all making hella money, like, like, it's I'm trying to pay them more to see if, man, like, maybe tomorrow we'll start being, like, cash flow negative but make up a pussy, bro, like, I I would like to pay these guys what they're worth and the output is f cking insane. We did 10,000,000 UGC views in, what, like, eight days, like Mhmm. Like, you don't see this sort of traction in any company and you don't see killers like this in any company unless you're paying these motherfuckers, like, they're worth, bro, like, I don't know what happened to, what they're

Speaker 2:

Maxed out contracts. Maxed out contracts.

Speaker 1:

Exactly. Exactly. Yeah. What about devices? I mean, seemed like this would be a natural fit for some sort of AI wearable or other platform.

Speaker 1:

Is there an app coming, or are you interested in what's happening with Johnny Ive and OpenAI? What's what's your take on the device world?

Speaker 10:

We're very interested in the hardware space. We've got, like, a million things cooking on hardware. We got people in the garage right now working on shit you don't you don't even know about, bro. Like like like, we're we're bringing manufacturing back to America, and it all starts at the Cluelly Garage.

Speaker 2:

Cluelly Garage. I love to see it. Nobody they doubted, but you guys are re industrializing America. You guys really are

Speaker 9:

the hard tech

Speaker 10:

down there. They're working on brain chips down there.

Speaker 2:

Brain chips. Brain chips.

Speaker 1:

That's the future.

Speaker 2:

There we go.

Speaker 1:

There we

Speaker 2:

go. The new Neuralink. Yeah. I know, there's a world in the future where you guys actually just roll up Neuralink and OpenAI. For For sure.

Speaker 2:

Fluly umbrella.

Speaker 1:

Yep. Definitely. It's

Speaker 10:

possible. Excited to offer acquisitions for, for both of those companies. It's in the roadmap.

Speaker 2:

It's on the road. Alright. This has been a lot of fun. I'm excited for you guys. It is and I have no doubt that you'll go from 10,000,000 views a week.

Speaker 2:

10,000,000 views a week to a hundred. And I'm excited to see you guys hit that billion view mark very soon. So keep it up. We are all very entertained and, rooting for you.

Speaker 10:

Shake it. Shake

Speaker 1:

Shake it. I love the energy. Thanks, man. We appreciate you joining.

Speaker 2:

Better guys.

Speaker 11:

We'll talk

Speaker 2:

to Keep having fun. Bye.

Speaker 1:

Next up, have Kian from Nucleus coming on with a big announcement. Something like ten years in the making, close to it, maybe seven years. We'll bring Keon in. Let's play some soundboard. How are doing?

Speaker 1:

Welcome to the show.

Speaker 9:

That's a

Speaker 1:

great intro.

Speaker 12:

Their tweets are flying. Oh my god.

Speaker 9:

You guys

Speaker 10:

seen this?

Speaker 2:

Yes. You seeing this? Seeing this?

Speaker 6:

Break it down

Speaker 1:

for us. Explain explain what's happening.

Speaker 2:

There's nothing like a launch day.

Speaker 8:

I'm trying

Speaker 9:

to figure out, guys, is this is it Gattaca or is it Theranos? Because people can't they can't make up their mind.

Speaker 1:

Oh, yeah.

Speaker 10:

You know? I'm like, we're gonna find

Speaker 9:

figure it out. We're trying to figure out what's going on. Let's give some context to the audience. Nucleus has launched Nucleus Embryo, the world's first genetic optimization software. Mhmm.

Speaker 9:

Basically, parents can give their children the best start in life. They can pick their embryo based off of physical characteristics like eye color, IQ, they can go to disease risk like cancers or heart disease. Basically, really believe parents can get all the information that exists about their embryos, and they can pick however they want. For me personally, you know, it's been ten years in the making. The journalists actually covered it today in The Wall Street Journal, was a journalist that covered my gene editing in a warehouse in Brooklyn Ten Years ago.

Speaker 1:

Yes. Let's see. Wow. Overnight

Speaker 9:

success. You know, it's a long time in genetics.

Speaker 1:

Yeah. So so so break down the state of the art because, like, embryo screening exists. I think most parents in America, at least if they the means do some sort of screening while the embryo is growing? Is this purely for IVF? Is this just going a layer deeper?

Speaker 1:

And then is I wanna talk about the regulatory and FDA component as well.

Speaker 9:

Yeah. Let's talk about it. So basically, if you go to IVF clinic today, you're a couple. The vast, vast, vast majority of clinics the first thing I should understand is that the IVF process is principally controlled today by clinicians or doctors. Mhmm.

Speaker 9:

Honestly, couples don't have as much liberty in our perspective as they should. It's their baby. It's their embryos. They should have the right to those that information, and they should be to pick off any vertical. However, today in the clinic, what generally happens is people test embryos for very rare and severe genetic conditions.

Speaker 9:

For example, like a chromosomal abnormality, like Down syndrome, for example. Yes. Or even a condition like cystic fibrosis or T SACS or PKU. Right? These are conditions that are very rare, that maybe someone might have a carrier for cystic fibrosis, again, it's it's pretty rare.

Speaker 9:

Then there are conditions that we've all heard of heard about things like breast cancer, things like coronary artery disease, the things that actually kill the vast majority of people today. Right? Chronic conditions kill the vast majority people today. Those conditions are just not tested for in the clinic, even though we have very good science actually that can make those predictions. How do we know this as DNA company?

Speaker 9:

Well, that's what we do. We build models that predict disease and the way you test those models in adults. So we go from adults to embryos is actually because we can basically well validate these models to show that they work in both the embryonic context and in the adult context. And so what we're really doing is we're going from, Okay, instead of just looking for really severe, like Down syndrome cystic fibrosis, why not do breast cancer? Why not do heart disease?

Speaker 9:

Why not do colorectal cancer? Why not do schizophrenia? Why do Parkinson's? But then why stop there? And this is really the important thing.

Speaker 9:

Because ultimately, if you think about diseases and traits, the extreme version of any trait is actually a disease. Height is a good example of this. One extreme end is John, for example. He's like Markman syndrome almost. Then the other end is me dwarfism.

Speaker 9:

Right? It's like the on both ends. Okay? So, you know, so, you know, IQ is another example of this. One end is, like, you know, autism.

Speaker 9:

The other end, it can actually be some sort of, you know, a cognitive basically challenge that people have. And so when you think about it, we start realizing that people have drawn a line in the sand saying, you can get, you know, rare diseases. You can get common diseases. But then they really say you can't can't get any traits like height. Mhmm.

Speaker 9:

Even though the best predictor we have today actually in the world, the best polygenic predictor is for height. So as a company, we've kind of completely reimagined this and said, wait a second, what's going on here? You should have access to the entire stack. Rare diseases, we do, cystic fibrosis, common diseases like breast cancer, and also traits all the way up to something like IQ.

Speaker 1:

Yeah. So I mean, that test, are you just giving people the data? Because I imagine that once you get into particular recommendations, that's more what I would expect a licensed doctor to need to do.

Speaker 2:

Yeah. My sense is that they you can allow people to get the data from their doctor and then feed it into Nucleus. Is that correct?

Speaker 9:

So that is correct. Actually we have a couple, there was like 10 announcements today. Know how we do it. We like to 10 announcements in one day. We are actually very, very excited to announce a huge partnership with Genomic Prediction.

Speaker 9:

Genomic Prediction is actually the oldest embryo testing company that exists. They've done genome wide tests in embryos for almost a decade at this point, and I think they've done over 120,000 couples for PGTA, which is a specific kind of test. And so we're actually partnering with them. So we make it very easy for genomic prediction customers to request their files and actually port it over to Nucleus. But really, this isn't just for genomic prediction customers.

Speaker 9:

Anyone who's undergoing IVF can go to their clinic and say, I want my embryo's data. You can take that data. You can upload it to Nucleus, and then all of sudden, you know, the application layer of makes this technology universally, basically, universally accessible.

Speaker 1:

Now how much of how much of the benefit is is actual algorithmic analysis bringing in other data points to contextualize the data versus just better UI and better hydration of existing text? Because we we had we had a friend on the show who was talking about getting some medical results from a doctor. The doctor's office was closed. It took two days to until the doctor was gonna be able to interpret the results. He was able to just take a photo, upload it to ChatGPT and say, hey, is this is this a you know, is this really, really bad?

Speaker 1:

Should I be panicking? Yeah. Because it seems somewhat out of the range. And ChatGPT was able to say, hey, you still gotta talk to the doctor, but this isn't this isn't the craziest thing I've ever seen. This is way out of distribution.

Speaker 1:

And so that's almost like a pure UI layer, but extremely valuable. I know it might not be, like, the right narrative for some people that it's, like, not as innovative, but I think that, like, all that matters at the end of the

Speaker 3:

day Both.

Speaker 1:

Is giving people benefits.

Speaker 9:

Right? Both.

Speaker 1:

It's always both.

Speaker 9:

You you you you have fundamentally technology, just for technology sake,

Speaker 1:

is

Speaker 9:

not siliconized.

Speaker 8:

It's about. Right?

Speaker 9:

Siliconized by making something that people want. Okay? And people can actually use. Exactly. Think about the the nucleus innovation, it's it's it's two pronged.

Speaker 9:

Okay? One is in the informatics. Right? You know, I've been doing this for five years. Sure.

Speaker 9:

I I almost I would argue to myself that I spent too much time, you know, developing the science. Right? Sure. Science in a in a nutshell isn't actually very useful. You need to expand and access to it.

Speaker 9:

So Yeah. On that point, we do multiple different kinds of analyses

Speaker 1:

Mhmm.

Speaker 9:

That make it such that we can actually provide the most comprehensive analyses that exist today. But moreover, and this is really the, I think, a key point to your point, John, is people understand them. People can see them. I mean, you can pull up the platform. I'm not sure if you guys have shown already, but it's very easy to sort, compare for your embryo.

Speaker 9:

You can actually name your embryos and stack rank your embryos. You can understand what the score means. We lead with overall risk, or we tell you, for example, instead of saying you're the 90 nine percent top for genetic risk for condition, which, you know, what does it actually mean? We say, hey. You have a, you know, five percent chance for the like of, let's say schizophrenia or some other condition.

Speaker 9:

In other words, by leaving overall risk, people have much greater intuitive understanding of the results. We're communicating to them. We have genetic counselors on hand. So this really is a what are we showing here? We showing something?

Speaker 9:

I'd be showing the

Speaker 1:

Yeah. Yeah. We pulled up here.

Speaker 9:

Think you're showing actually

Speaker 1:

Pull up your website. That's another thing.

Speaker 9:

That's a fun one. That's an Easter egg.

Speaker 1:

That's an Easter egg.

Speaker 9:

The that's the kind of approach that we're taking here. And I think consumers are responding to it. Right? People want to have access to their data. The clinician, the doctor shouldn't decide what embryo to implant.

Speaker 9:

You should.

Speaker 1:

Okay, so talk to me about what requires FDA approval. Obviously, new medical devices. Like if you were developing a machine to take in an embryo and sequence the DNA, I would expect that the FDA would want an approval for that medical device. But if you are taking data and just showing it to a customer in a different UI, that feels like probably a very light FDA process. And there's probably a continuum in the middle where once you're making a recommendation, they have rules around that, right?

Speaker 9:

We as a company do not tell you which embryo to implant. Sure. Basically parents, the couple has complete agency to decide how they want to use the information to implant their embryo. Moreover, let's be clear, height, right? I mean, is can a height analysis be a medical device?

Speaker 9:

You know, it doesn't even make sense. Right? IQ, height, these traits, for example, we all you know, traits are something that I don't think actually belongs in even the kind of infrastructure thing about medical care. Right? These are things that go beyond medical care.

Speaker 9:

These are things like, you know, that that people just kind of intuitively know and that there are DNA tests done every single day for you to see for these analyses because they're not disease analyses. Right? Mhmm. So we do both diseases and traits to be clear. My point is many of these innovations that you have to wonder, like, you know, should the should the government say if someone can or cannot pick their embryo based off height?

Speaker 9:

That doesn't seem right to me. I think it should be in the complete liberty of of the individual to decide that.

Speaker 1:

Yeah. But, I mean, we're we're a democratic country. And so if if, you know, a a a huge swath of population says that the FDA should review that type of, test or that type of analysis

Speaker 9:

Iced analysis? It

Speaker 1:

could happen. I mean, the the FDA reviews all sorts of different stuff. And so I I guess the question shifts to, like, do you expect a change from FDA on the way these analysis tools are regulated?

Speaker 9:

I think right now the most important thing is just putting these high quality rigorous scientific results in people's hands and helping them basically have healthier children, helping them give their child the best start in life. Yeah. You know, I I think that generally speaking that, you know, people should have more liberty, more chores in in medicine. I think the broader longevity trend actually touches on that point as well. Yeah.

Speaker 9:

So that's what we're excited to do at Nucleus.

Speaker 1:

Yeah. I mean, the the fact that you're partnering with a a company on on actual on the actual, like, medical device side, like, they are doing the sequencing of the embryos. Like, that really takes it out of the Theranos question entirely in my mind. I think you feel like you should be beating the drum there a little bit more. It's like like, didn't say we created some new device, but I know.

Speaker 1:

We have to find this.

Speaker 10:

We ship. That's the difference.

Speaker 1:

We

Speaker 9:

ship. It's live and shop, baby. Go look at it.

Speaker 10:

Go use it.

Speaker 9:

That's that's the evidence. It's the why is there a footage? Okay.

Speaker 2:

I love the visual of John and his wife selecting between embryos, and it's like six ten or seven two. Tough choice. Well, if we go with the six ten, he has,

Speaker 1:

you know You could potentially fly commercial once in his life.

Speaker 9:

We actually could actually play this game right now.

Speaker 1:

Yeah.

Speaker 9:

Okay? Here. We're gonna play a game right now. I'm gonna put in the chat. Okay.

Speaker 9:

Pickyourembryo.com. Okay? Everyone listening to this. Pickyourembryo.com.

Speaker 2:

I'm gonna

Speaker 9:

go to it. Oh my god. Here we go. Little Easter egg here. Okay.

Speaker 9:

Let's see what's more important to you, John. Intelligence or muscle strength? Come on. What do think?

Speaker 1:

Muscle strength. Let's go. We're the the future of bodybuilding. Let's

Speaker 9:

go. Lighter.

Speaker 2:

John would John would take a he would he would happily have a five two son if he had, you know, top point o 1% bodybuilding. Exactly.

Speaker 9:

Yeah. Okay. So your lifespan or height? Come on. Lifespan.

Speaker 1:

Lifespan. Let's go.

Speaker 9:

Let's go. Let's go.

Speaker 1:

Maybe low depression. You gotta be golden retriever mode. You gotta be You

Speaker 9:

need you need low depression.

Speaker 1:

You need low depression. Let's go low o o o c d. I don't mind bouncing around a bunch. Okay.

Speaker 9:

What's Risk taking anxiety.

Speaker 1:

Let's go high risk taking.

Speaker 13:

There we go.

Speaker 9:

Okay. Wait. Analyzing

Speaker 1:

Is this some generative AI stuff going on? This is great. Nadia. I got Nadia too. Enduring athlete.

Speaker 1:

Let's go. Physically strong, cautious, built to last. Yeah. This is great. Is this driving a lot of, a lot of attention, a lot of downloads is going viral yet?

Speaker 1:

This seems like something that's designed to be shareable.

Speaker 9:

Dropped it right now. Technology Bros, we

Speaker 1:

got you

Speaker 6:

the exclusive.

Speaker 1:

Let's go. There we go.

Speaker 9:

We can pick your embryo. People say, what's it like? Maybe not doing IVF yet. No problem.

Speaker 2:

Only nine percent of people choose Nadia.

Speaker 1:

Okay. Well, we're contrarian. We like that

Speaker 11:

here.

Speaker 1:

Yeah, that's funny.

Speaker 9:

It's great.

Speaker 1:

Oh, well, congratulations on the news. Congratulations on the launch.

Speaker 2:

Yeah, the pace is wild. Last thing, what's going on with, have you seen these just blood billboards?

Speaker 1:

Oh, yeah.

Speaker 2:

They're all over LA.

Speaker 1:

So so there's there's someone who's running a campaign right now. Justice for Elizabeth Holmes claiming that Theranos was not the scam people think it was. And there's a there's a documentary coming out, and there's billboards all over LA for just blood. Like, it's just blood. It's not that big of a deal.

Speaker 9:

And John, to be clear, there's an exclusive on technology buzz next week about from this person. Right? They they're gonna tell their story next week just to make sure you you invite them already.

Speaker 2:

Hope We're we're we're we're we're

Speaker 1:

we are toying with the idea that someone reached out to kind of connect us. We're we're thinking about doing it, but we're not we're not a % sure that it'd be appropriate for the

Speaker 2:

Based on the website, I don't know if it's appropriate.

Speaker 1:

Yeah. It doesn't look like it was designed with Figma. So I don't know. It's a little bit. A little bit.

Speaker 1:

But they they claim that, that Elizabeth Holmes has been proven innocent. And so it's a bold claim. We like we like to see people making bold claims.

Speaker 2:

By what, jury is my question.

Speaker 1:

Yeah. The jury of someone who knows HTML.

Speaker 2:

Kian, Always a great time. Energy is fantastic. Electric. Electric.

Speaker 1:

Thank you for coming on, firing us up. Congratulations on the launch. We will talk to you soon. Talk soon.

Speaker 9:

Twitter for sure. Okay?

Speaker 1:

So we'll see you

Speaker 11:

there. Bye.

Speaker 10:

Bye. Bye.

Speaker 1:

We have someone from OpenAI here. We're gonna stick to technology and business, but welcome to the show, Mark Chen. Good to see you.

Speaker 8:

Good to see you guys. Thanks for

Speaker 1:

having day, but I'm excited to talk about deep research. I am excited to talk about AI products. Would you mind introducing yourself and kind of explaining what you do? Because OpenAI is such a large company now and there's so many different organizations. I'd love to know, how you interact with the product and the research side and anything else you can give to contextualize this conversation.

Speaker 8:

Yeah, absolutely. So first off, you know, thanks for having me on. You know, I'm Mark. I am the chief research officer at OpenAI. So in practice, what that means is I work with our chief scientist, Jakob, and we set the vision for the research org.

Speaker 8:

We set the pace. We hold the research org accountable for execution. And ultimately, we really just want to deliver these capabilities to everyone.

Speaker 1:

That's amazing. In terms of research, I feel like a lot of the what happens in the research side is actually gated by compute. Is that a different team? Because what if the researchers ask for a $500,000,000,000 data center, that feels like maybe a bigger a bigger task?

Speaker 8:

So, yeah, it it is useful for us to factor the problem of, research and also kind of building up the capacity to do that research.

Speaker 11:

So we

Speaker 8:

have a different team. Greg leads that, which really thinks holistically about, you know, data center bring up and how to get the most compute for us. And of course, when it comes to allocating that compute for research, you know, Jakob and myself do that.

Speaker 1:

That's great. And so what what can you share that's top of mind right now on the research side? There's been this discussion of pre training scaling wall potentially the importance of reinforcement learning, reasoning, there's so many different areas to go into. What's actually driving the most conversations internally right now?

Speaker 8:

Yeah, absolutely. So I think really it's a really exciting time to do research. I would say versus two or three years ago, I think people were trying to build this very big scaling machine. And really, the reasoning paradigm changed a lot of that. Right?

Speaker 8:

Like, reasoning is really taking off and it really opens this new playing ground, right? It's like there are a lot of kind of known unknowns and also unknown unknowns that we're all trying to figure out. It kind of feels like GPT-two era, right? Where there's so many different hyperparameters you're trying to figure out. And then I think also, like you mentioned, pre training, that's not to be forgotten either.

Speaker 8:

Today, we're in a very different regime of pre training than we used to be, right? Today, we can't treat data as this infinite resource. And I think a lot of academic studies, you know, they've always kind of treated, you know, you have some kind of finite compute but infinite data. I don't think there's much study of, you know, like, you know, finite data and infinite compute. And I think, you know, that also leads to a very rich playground for research.

Speaker 1:

Do we need kind of a revision to the bitter lesson? Is that a a refutation of the bitter lesson? Or No. No. Or do we just kind of re re rethink what the definition of of scaling laws looks like?

Speaker 8:

No, I don't think of anything as a refutation of the bitter. Really, like our company is grounded in we want simple ideas at scale. I think RL is an embodiment of that. I think pre training is an embodiment of that. And really at every single scale, we face some kind of difficulty of this form.

Speaker 8:

It's just like, you got to find some innovation that gets you past the next bottleneck. And this doesn't feel fundamentally very different from that.

Speaker 1:

What's most important right now on actual compute side? We heard from NVIDIA earnings that that we didn't get a ton of guidance on the shift from training to inference usage of NVIDIA GPUs, but it feels like it must be coming. It feels like this inference wave is is happening. Are those even the right buckets to be thinking about tracking metrics in terms of the the the story of artificial intelligence? Because yeah, I mean, it's like if if the reasoning tokens are inference tokens and and but they're what lead to higher intelligent, more intelligent models, like it's almost back in the training bucket again.

Speaker 1:

What bucket should we be thinking about? Or are we how firmly are we in the applied AI era versus the research era?

Speaker 8:

Well, I think research is here to stay. And it's for all the reasons I mentioned above. Right. It's such a rich time to be doing research. But I do think, you know, inference is going to be increasingly important as well.

Speaker 8:

Right. It's such a core part of RL that you're doing rollouts. And I think, you know, we see 2025 as this year of agents, right? We think of it as a year where models are going to do a lot more autonomous work. You can let them kind of be unsupervised for much longer periods of time.

Speaker 8:

And that is just going to put big demands on inference, right? When you think about kind of our overall vision, right? We lay it out as a series of steps and levels on the way to AGI, right? And I think the pinnacle, really, that last level is organizational AI, right? Like you can imagine a bunch of AIs all interacting.

Speaker 8:

And yeah, I think that's just going to put huge demands on inference.

Speaker 1:

On organizational question, I remember reading AI twenty twenty seven and one of the things that they proposed was that the AIs would actually like literally be talking to each other in Slack. Does that seem like does that seem like the way you imagine agents playing out, like using the same tools as humans instead

Speaker 13:

of One

Speaker 2:

agent says, I'm gonna go talk with Teams, talk with Slack. I'm gonna do a little negotiating.

Speaker 1:

But maybe it just happens super super fast 20 fourseven or is there like a new machine language that emerges?

Speaker 8:

Yeah, think one thing that's really helped us so far in AI development is to come in with some priors for how humans do things. And that's actually if you bake those priors in, they typically are great starting points. So I could imagine, like, maybe you start with something that's slack like and give it enough flexibility that it can kind of develop beyond that and really figure out the way that's most effective for it to communicate. One important thing, though, is we want interpretability, too, right? I think it's very helpful for us today that what the agents do is easy for us to read and interpret.

Speaker 9:

And I don't

Speaker 8:

think you want that to go away as well. So I think there's a lot of benefits just even from a pure debug the whole system perspective. So just let the models speak in a way that is familiar with us. And you can also imagine, like, we might want to plug in to the system too, right? So whatever interfaces we're familiar with, we would ideally like our model to be familiar with as well.

Speaker 8:

I think it's also pretty compatible with we hit a big milestone. We got, I think, 3,000,000 paying business users for Let's go. Yeah. There we go. Let's go.

Speaker 8:

And, Three

Speaker 1:

Gong hits for 3,000,000.

Speaker 2:

The Gong will keep ringing for a while.

Speaker 1:

Had to do I was hoping you would drop a number.

Speaker 8:

Yeah. Anyway

Speaker 1:

Congratulations. That's that's actually huge. That's amazing. Yeah.

Speaker 8:

Yeah. Yeah. But I think one big part of that is, you know, we have we have connectors now. Right?

Speaker 11:

Yeah. We're

Speaker 8:

connecting into, you know, like, drives. And Mhmm. I think, yeah, you can imagine, you know, like, integrations, things like that. I think we just want the models to be familiar with the ways we communicate and and get information.

Speaker 1:

Yeah. Can you talk about benchmarking? It feels like we're potentially

Speaker 4:

Yeah.

Speaker 2:

Do you think about benchmarks at at all?

Speaker 8:

Oh, yeah, a lot. I mean, but I think it's a difficult time for benchmarks, right? I think we used to be in this world where you have these human written benchmarks for other humans, right? And I think we all have these norms for like, what are good benchmarks, right? Like, we've all taken the SAT.

Speaker 8:

We all have like a good conception of what it means to get whatever score on that. But I think the problem is the models are already at the point where for even the hardest human written benchmarks for other humans, it's really near saturated or saturated, right? I think one clear example here is the AIMEE, like probably the hardest autogradeable human math eval, at least in US. And yeah, the models are consistently getting like 90 plus percent on these. And so what that means is I think there's kind of two different things that people are doing, right?

Speaker 8:

They're developing kind of model based benchmarks, right? They're not kind of things that we would give to an ordinary human. Things like Humanities Last Exam, things like, you know, Epic AI that are really, really the the frontier of what people can do. And I think the hard thing is it's not grounded in intuition, right? Like, you don't have a lot of people who have taken these exams.

Speaker 8:

So, it makes it harder to kind of calibrate on whether this is a good exam or not. One of the exciting things that's on the flip side of that is I really do think we're at the era where models are going to start innovating, right? Because I think once you've passed the last kind of like the hardest human rating exams, that's kind of at the edge of innovation. And I think you already see that with the models, right? Like they're helping to write parts of papers.

Speaker 8:

And I think the other kind of way that people have shifted is, there's these ultra frontier evals, but there are also people kind of just indexing on real world impact, right? You look at your revenue, kind of the value you deliver to users. And I think that's ultimately what we care about.

Speaker 1:

Can you bring that back to interpretability research? Like, with these super, super hard, math evals, for example, are we doing the right research to understand if the thought process mirrors not just one shotting the answer, oh, you memorized it or you magically got it correct, but you actually took the correct path kind of like, you know, you're graded for your work, not just the answer if you're in grade school. Yeah. And and, you know, Dario said that, interpreting interpret interpretability research will actually contribute to capabilities and even give a decisive lead. Do you agree with that?

Speaker 1:

What's your reaction to that concept of interpretability research being very important?

Speaker 8:

Yeah. Mean, we care a lot about it here at OpenAI as well. So one thing that we care a lot about is interpreting how the model reasons. Right? Because I think we've had a very kind of specific and strong view on this in that we don't want to apply optimization pressure to how the model thinks so that it can be faithful in the way it thinks and to expose that to us without any kind of incentives to cater to what the user wants.

Speaker 8:

Right. I think it's actually very important to have that unfiltered view because oftentimes, like if the model isn't sure, you don't want to hide that fact, right, just for it to kind of please the user. And sometimes it really isn't sure. Right. And so we've really done a lot of work to try to promote this norm of chain of thought, faithfulness and interpretability.

Speaker 8:

And I think it gives you a lot of sense into what the model is thinking and, you know, what are the pitfalls that it can go off into if it's not reasoning correctly.

Speaker 2:

That's such an important point, because if you have somebody on your team and they come to you and they say, Hey, you know, I think this is the right answer, but we should probably verify it. It's like, it's still valuable.

Speaker 1:

Totally.

Speaker 2:

Puts you on the right path. If somebody comes to you a hundred percent confidence, this is the truth. Well, trust is just destroyed.

Speaker 1:

Yeah, totally.

Speaker 8:

Do you guys feel like safety felt a lot more theoretical a couple of years back? Right? But today, the things that people are talking about a couple of years, like scalable oversight, really having the model be able to tell you and convince you that the work it did was right. It feels so much more relevant right now.

Speaker 1:

100%.

Speaker 8:

The capabilities are so strong.

Speaker 1:

Yeah. I mean, just personally, I've I've completely flipped from being like, the safety research is not that valuable because I'm not that worried about getting paperclip. It just seems like a very low likelihood that that's kind of like the bad ending like immediately in this fume and all this crazy the Grey Goose scenarios were just so abstract and sci fi. I just felt like economics will will will fall into place and there will be, like a like a cold like a nuclear ending, which is like we didn't build nuclear plants and we just stopped everything because we seem humans seem to be good at that. But now that we're actually seeing

Speaker 9:

things go. Yeah.

Speaker 8:

Yeah, it's crazy how fast it's been, right? Like, I'm

Speaker 1:

Oh, yeah.

Speaker 8:

I think my personal story is, it's like, you know, what got me into AI was AlphaGo, right? Like, just watching it get to that level of capability at Go. And you were kind of like, it was such an optimistic and also kind

Speaker 3:

of a little bit of

Speaker 8:

a sobering message, right, when you saw Lisadol get beat. And I just remember, you know, like we saw the coding models, you know, when we first launched, like, I think very OG Codecs, you know, with GitHub Copilot, it was maybe like under 1,000 Elo on Codeforces. And I still remember the meeting where I walked into where the team showed my score and they're like, hey, with models better than you? And it's like, come full circle and it's like, wow, I put decades of my life into this and, you know, the capabilities are there. So like, if, you know, I'm kind of at the top of my field in this thing and it's better than me, like, what can it do?

Speaker 1:

Yeah. Yeah. That's amazing. I have so many more questions. On AlphaGo, are there lessons from scaling, how scaling played out there that you can that we can abstract into the rest of AI research?

Speaker 1:

What I mean is, as I remember it, the AlphaGo training run was not a hundred k h two hundreds. Mhmm. But what would happen if we actually did an AlphaGo style training run? I mean, would be an economic money pit. Right?

Speaker 1:

Like they've had no economic value to do. But let's just say some benevolent trillionaire decides I'm gonna spend a billion dollars on a training run to beat AlphaGo and go even bigger. Is is Go at at some point solved? Would we see kind of diminishing scaling curves? Could we throw extra RL?

Speaker 1:

Could we port back everything that we're doing in just general AGI research and just continue fighting it out in the world of Go? Or does that end and does that teach us anything?

Speaker 8:

Yeah. Yeah. Honestly, I feel like if you really are curious about these mysteries, join our team. I want to say. So, yeah, I mean,

Speaker 1:

really, like, kind

Speaker 8:

of the central problem of today is RL scaling.

Speaker 1:

Right?

Speaker 8:

When you look at AlphaGo, it's a narrow domain. Right? And I think in some sense that limits the amount of compute you can pump into it. But even kind of small toy domains, they can teach you a lot about how you scale RL. Like, what are the axes where it's most productive to pump scale in?

Speaker 8:

I think a lot of scaling research just looks like that, whether it's on RL or pre training. So you identify a lot of different variables under which you can scale and where you get the best marginal impact for pumping scale there. I think that's a very open question for our all right now. And I think what you mentioned as well is just like, you know, going from narrow to broad, right? Does that give you a lever to pump a lot more scale in as well?

Speaker 8:

I think when you look at our reasoning models today, they're a lot more broad based than, you know, just being able to kind of an expert system on Go. So, yeah, I I really do think that, there are so many levers to scale.

Speaker 1:

What about Move 37? That was such an iconic moment in that AlphaGo, Lisa Doll match. They placed Move 37. It's very unconventional. Everyone thinks it's a blunder.

Speaker 1:

It turns out not to be. It turns out to be critical. It turns to be it turns out to be innovation. Do you think we are we're certainly post Turing test in language models. We're probably post Turing test in image generation, but it feels like we're pre Move 37 in text generation in the sense that there hasn't been, like a fully AI generated book that everyone is just, oh, it's the new Harry Potter.

Speaker 1:

Everyone has to read it. It's amazing and it's fully aged and it's fully generated. Or, or this image, the images, they do go viral, but they go viral because they're AI. Move 37 in the context of Go did not go viral because it was AI. It felt like it was actual innovation.

Speaker 1:

So, is that the right frame? Does that make any sense?

Speaker 8:

Yeah. I think it's not the wrong frame. So I think some quick thoughts on

Speaker 10:

I

Speaker 8:

think kind of when you have something that's very measurable, like win or lose, right? Something like Go. It's like very easy for us to kind of judge, right? Like, did the model do something right here? And I think the more fuzzy you get, it is just harder, right?

Speaker 8:

Like, when it comes to, is this the next Harry Potter? Like, it's not a universally loved book. Think

Speaker 1:

fairly

Speaker 8:

universal, but there's some haters. And yeah, I think it is just kind of hard when it comes to these human subjective things where it's really hard to put down in words, like what makes you like Harry Potter, right? And so I think those are always going to lag a little bit. But, you know, I think, you know, we're developing more and more techniques to attack kind of these more open ended domains. And I don't know, I wouldn't say that we're not at an innovative stage today.

Speaker 8:

So I think my biggest touch with this was when we had the models compete on the IY last year. So IY, it's like international, basically Olympics for computer science. Basically, the top four kids from each country go and compete. And these are really, really tough problems. Basically selected so that they require some innovative insight to solve.

Speaker 8:

Right. And we did see the model come up with solutions even to some very ad hoc problems. And so I think there was a lot of surprise for me there. I was completely off base about which problems the model would be able to solve the most. I think like I kind of categorized there's six problems.

Speaker 8:

Some of them as more kind of like, oh, this is standard, a little bit more standard. This is a little bit more out of the box. I was like, it's not gonna be able to solve this, more out of the box one, but it it did. And I think, I think it really does speak to kind of, these models have the capacity to do so, especially trained with RL.

Speaker 1:

Now now now put that in context of what's going on with Arc AGI. Obviously, OpenAI has made incredible progress there, but it just when I do the problems, it seems easy. And when I look at the IOI sample problems, I think this would be a twenty year process for me to figure out how to achieve that. And I can do the

Speaker 6:

Yeah.

Speaker 1:

Arc AGI on my phone. Is this the spiky intelligence concept? Is this something that a small tweak in in algorithmic design just one shots AGI or Arc AGI, or or is there something else going on there that we should be aware of? Yeah.

Speaker 8:

I mean, I think, part of this is the beauty of Arc AGI as well. Right? Like Yeah. I think I'm not sure if there's another kind of like human intuitive, simpler benchmark,

Speaker 1:

which

Speaker 8:

is for the models. I think really that's one of the things they really optimize for on that benchmark. I do think when it comes to models, though, like there's just a little bit of a perception gap as well. Like, you know, models aren't used to this kind of native, you know, like just screen type input. I think there's a lot we can bridge there.

Speaker 8:

Actually, even O4 Mini, it's a state of the art multimodal model in many ways, including visual reasoning. And I think you're starting to kind of build up the capacity for the models to take images, manipulate and reason about them, generate new images, write code on images. And I think it's just been kind of under focused. But I think when I talk to researchers in the field, they all see this as a part of intelligence, too. And we're going to continue to focus there.

Speaker 1:

Yeah. Is is is RKGI kind of in the if we're dropping a buzzword on it, is, like, program synthesis? Is there a world where, I I know that I I know the tokens, like, the the images, we see them as as renderings of squares and different colors, But the when they're fed into the LLM, they're typically, just a stream of of numbers effectively. Is there a world where actually adding a screenshot is what's important? Like visual reasoning.

Speaker 8:

Yeah. So I think that could be important. It's just like kind of whenever it comes to textual representation of grids, models today just don't really do that well. And I think it's just kind of because humans don't really ever write down textual representations of grids. Have a chessboard, like no one really kind of just like types it out in a grid.

Speaker 8:

And so the models are kind of like undertrained a little bit on what that looks like and what that So I think with more reasoning, we'll just bridge the gap. Think with better visual perception, we'll just bridge that

Speaker 1:

Yeah.

Speaker 3:

How are

Speaker 2:

you thinking about the role of non lab researchers in the ecosystem today? I'm sure you try to recruit some of the best ones, but the ones that don't join your team.

Speaker 1:

Tell us about the one that got away.

Speaker 2:

Yeah, one that got away.

Speaker 8:

Yeah, I I think it's still actually a fairly good time for specific domains to be doing research. And I think the style is just very different. And you do feel the pull of non lab researchers into labs because I think they feel like a lot of the burning problems in the field are at scale. Right. And that's kind of one of the unfortunate things to write.

Speaker 8:

Like when you look at reasoning, you just don't see that happen at small scale, right? There's like a certain scale at which it starts becoming signal bearing and that requires you to have resources. Right. But I do think a lot of the really good work that I've seen, there's experimental architectures. I think a lot of good work is happening in the academic world there.

Speaker 8:

Like a lot of study in optimization, a lot of study in kind of like GANs. There's certain fields where you see a lot of fruitful research that that happens in academia.

Speaker 1:

Yeah, that makes a lot of sense.

Speaker 2:

How about consumer agents? How are you thinking about them? You talked earlier about sort of b to b adoption, and that's all very exciting. But how much do you and the research org think about breakout consumer agent products?

Speaker 8:

Yeah, that's a fantastic question. I think we think about it a lot. I think that's the short answer. We really do think like this year, we're trying to focus on how we can move to the agentic world. Right.

Speaker 8:

And when I think about consumer agents, I think like ChatGPT proved that people got it, right? It's like people get conversational agents, conversational kind of models. But when it comes to consumer agents, we have a couple of theses that we've tried out in the world. I think one is deep research, right? I think this is something that can do five to thirty minutes of work autonomously, come back to you and really like kind of synthesizes information, right?

Speaker 8:

It goes out there, gathers, collects and kind of, you know, compresses the information in a form that's useful.

Speaker 2:

A little bit of pushback there. Like I can see that as a consumer product when someone like Aiden is like, I want new towels. And he uses deep research to like figure out like what is the best towel across every dimension. But when I think of deep research, yes, it has applications with students, but it's often, it might just be very, consumers being like, we keep using flight report on this country and where to travel and

Speaker 1:

things like We keep using this flight example, but I don't, I haven't actually tried to book a flight with Deep Research. It's totally possible that it could go and pull all the different flight routes and calculate all the different delays and all the different parameters of if I fly to this airport and park or I can use valet here or something like that. Yeah.

Speaker 2:

Yeah. I guess like when I think of agents, it's deep research is like curating information in which you can take action on. But it's like at what point is action a part of that sort of loop, right? Where you can not only curate a list of flights that you want, but then, you know, actually go out and have agency.

Speaker 8:

Yeah. I think one of our explorations in that space is operator, right? It's where you kind of just feed in raw pixels from your laptop into or from some virtual machine into the model and it produces either a click or some keyboard actions. Right. And so there it's taking action.

Speaker 8:

And I think the trouble is, you don't ever want to mess up when you're taking action. I think the cost of that is super high. Only have to get it wrong once to lose trust in a user. And so we want to make sure that that feels super robust, before we get to the point where we're like, hey, look, here's a tool. I

Speaker 1:

That's so different than deep research because, like, you can wind up on some news article and read one sentence that gets a fact wrong or the commas in the wrong place and the numbers off. But that's just the expectation for just text and analysis. And if you delegated that, yeah, you're gonna expect a few errors here and there. Oh, that's actually a different company name or that's the that's an old data point. There's new data.

Speaker 1:

But very different if I book a flight and you book the wrong flight and I can wind up in Chicago instead of New York.

Speaker 8:

Exactly. And I think the reason why we care so much about reasoning is because I think that's the path that we get reliable agents through. Sure. We've talked about, like, reasoning helping safety, but reasoning is also helping reliability. It's like you imagine, like, makes a model so good at a math problem?

Speaker 8:

It's like it's banging its head against it. It's trying a different approach, and then it's like adapting based on what it failed at last time. And I think that's the same kind of behavior you want your agents to have. It's like tries things like adapts and keeps going until it's successful.

Speaker 2:

And that's the humans do this every day. You're booking a flight, you keep hitting an error. It's not which or which form you miss, right? And you're just sort of banging your head against the computer and eventually it says, okay, you're booked, right? I think that's a great call out.

Speaker 1:

Yeah. I mean, there's so many more questions we can go into, but I'm interested in the scaling of RL and kind of the balancing act between pre training RL and inference, just the amount of energy that goes into getting a result when you distribute it over the entire user base. How is that changing? And I guess is is are we post like really big, really big runs? Is this going to be something that's like continually happening online?

Speaker 1:

Or it feels like we're moving away from the era of like, oh, some big development, some big run happened and now we're grouping the fruits of it versus a more iterative process?

Speaker 8:

I don't see why it has to be so, right? I think, like, if you find the right levers, you can really pump a lot of compute into RL as well as pretraining. I think it is a delicate balance, though, between all of these different parts of the machine. And when I look at my role with Jakob, it's just kind of like figure out where how this balance should be allocated, where the promising kind of nuggets are arising from and resourcing those. Yeah, it's kind of a in some sense, feel like part of my job as a portfolio manager.

Speaker 1:

Yeah. That's a lot of fun. Well, thank you so much for joining. This is a fantastic We'd love to have you back in that deeper.

Speaker 2:

Great hanging, Mark.

Speaker 1:

We'll talk to

Speaker 2:

you soon. Peace. Have a good one.

Speaker 1:

Next up, have Shalto Douglas from Anthropic coming on

Speaker 2:

this show.

Speaker 1:

I'm just

Speaker 2:

getting a lot of messages saying why no one cares about AI. Talk about the drama on the timeline. But we do care about AI. We care a lot about AI. But it is a mess out there.

Speaker 1:

Wow. Yeah, the end of the Trump Elon era. I don't know what maybe we have to get some people on to talk it tomorrow or something. Anyway, we have Charlto from Anthropic in the studio. How are you doing?

Speaker 2:

What's going on?

Speaker 11:

Good to see you guys.

Speaker 1:

Hopefully, you're staying out of the chaos on the open house. Don't don't open your time now. Don't open. Sweet child. Everything.

Speaker 1:

Stay focused on the application.

Speaker 2:

Stay focused on the mission.

Speaker 1:

Stay focused on the next training run.

Speaker 2:

We really humanity really cannot afford for any researchers to open x.

Speaker 1:

What a hilarious day. Anyway, I mean,

Speaker 11:

I'm back by twenty four hours,

Speaker 1:

guys. Yeah. How are you doing? What what is new in your world? What what are you focused on mostly day to day and maybe maybe it's just a way of an intro?

Speaker 11:

Yeah. So at the moment focused really hard on scaling RL. And that is the theme of what's happening this year. And we're still seeing these huge gains where you go, you know, 10x compute increase in RL. We're still getting like very distinct linear gains basis for that.

Speaker 11:

And because RL wasn't really scaled anywhere close to how much pre training was scaled at end of last year. Yeah, we have like a basically a gamut of like riches over the course of this year.

Speaker 1:

Yeah. So where are we in that in that RL scaling story? Because I remember the the some of the rough numbers around like GPT two, GPT three, we were getting up into like, it cost a hundred million dollars. It's gonna cost a billion dollars. Like, it just rough order of magnitude, not even from anthropic, just generally, like, what is a big RL run cost or how many are we talking 10 ks H two hundreds or a hundred K?

Speaker 1:

Like, we going to throw the same resources at it? And if so, how soon?

Speaker 11:

Yeah. So I think in Diario's essay at the beginning of the year, he said that a lot of runs were only like a million dollars back in like December. I think you have like DeepSeek v3 and this kind of stuff like R1, which means that with us like at least two ooms just to get to the scale of GPT-four and GPT-four was two years ago. RL is also perhaps a bit more naively paralyzable and scalable than pre training. Pre training, you need everything in one big data center ideally, or you need like some clever tricks.

Speaker 11:

RL, could like in theory, like what the prime interlag folks are doing, scale it all over the world out of And so you're held back like maybe like you're held back far less than

Speaker 1:

you are. Sure. So everyone and their mother has a billion dollars now. There are there are, you know, hundreds of thousands of GPUs getting pumped all over the place. I I I feel like we're not GPU poor as a as a as a society.

Speaker 1:

Maybe some companies need to justify it in different ways, but it sounds like there's some sort of, like reward hacking problem that we're working through in terms of scaling our own. What are all of the problems that we're working through to actually go deploy the capital cannon at this problem?

Speaker 11:

Yes. So I mean, think about what you're asking the model to do in RL is you're asking it to achieve some goal at at any cost, basically. Yeah. And this comes with a whole host of like behaviors, which you may not intend. In software engineering, this is really easy.

Speaker 11:

I'd like to, it might try and hack unit tests or whatever. In much more longer horizon, real world tasks, you might ask it to say, go make money on the internet. And it might come up with all kinds of fun and interesting ways to do that unless you find ways to guide it into following the principles that you want it to obey basically, or to align it with your idea of what sort of best for humanity. And so it's actually it's a pretty intensive process.

Speaker 1:

Yeah. It's a lot

Speaker 11:

of work to find down and hunt down all the ways these models are hacking through the rewards and and and and patch all of that.

Speaker 1:

Yeah. How how are we going to see scaling in the number of rewards that we're RL ing against, if that makes sense? Would imagine that a certain point we unless we come up with like kind of like the Genesis prompt of go forth and be fruitful or something and multiply. You could imagine training runs on just knocking down one problem after another. And is that kind of the path that we're going down?

Speaker 11:

I very much think so. There's this idea in which like, you know, the sort of world becomes an RL environment machine in some because there's just so much leverage to making these models better and better at all the things we care about. And so I think we're going to be training on just everything in the world.

Speaker 1:

Got it. And then does that lead to more model fragmentation models that are good at programming versus writing versus poetry versus image generation or does this all feed back into one model? Does the idea of the consumer needing to pick a model disappear? Are we in a temporary period for that paradigm?

Speaker 11:

I think the main reason that we've seen that so far is because people are trying to make the best of the capital, like we are all still GPU poor in many ways. Okay. And people are focusing those GPUs on the sort of like spectrum of rewards that I think is most important. And I'm a bit of a big model guy. I really do think that similar to how we saw with large pre trained models before with small fine tuned models made it made like had gains over the sort of GP to GPT two era, but then we're obsoleted by GPT-four being generally good at everything.

Speaker 11:

I think to be honest, you're going to see this generalization and learning across all kinds of things. That means you benefit from having large single models rather than specialization or area tuned models.

Speaker 1:

Can you talk a little bit about the transition from or many any differences between RLHF and just other RL paradigms?

Speaker 11:

Yes. So RLHF, you're trying to maximize a pretty deep like lossy signal, Things like air wise, like, what do humans prefer? And I don't know if you've ever tried to do this, judge two language model responses

Speaker 1:

I get prompted for that all the time. Right. And I'm always like, I don't wanna read both of those. I'll just click the one on the left.

Speaker 11:

Exactly. Exactly. And, you click one of the random sometimes.

Speaker 1:

Yeah. Or or I click like the one that just looks bigger or I'll read the first two sentences. But, yeah, I'm not giving straight I'm not I'm not being I'm not doing my job as a as a human reinforcer.

Speaker 11:

Exactly. Human preferences are easy to hack.

Speaker 1:

Yeah. Totally.

Speaker 11:

Environments in the world are much truer if you can find them. So it's something like, did you get your math question right? Is a very real and a true reward.

Speaker 1:

Does the code compile?

Speaker 2:

Right?

Speaker 11:

Does the code compile. Exactly. Did you make a scientific discovery? We got very little rewards right now, but pretty quickly over the next year or two, you're to start to see much more meaningful and long horizon rewards.

Speaker 1:

You're going to see models bribing the Nobel committee to win the Nobel prize. A relevant reward the It's a real nightmare scenario. What about like, there's so many different problems that we run into that feel like the it's it's just really, really hard to divi design any type of eval. The the my kind of benchmark that I use whenever a new model drops is just tell me a joke. Yeah.

Speaker 1:

They're always bad. Or or or even even the latest v o three video that went viral was somebody said, like, stand up comedy joke. And it was kind of a funny joke, but it was literally the top result for joke Reddit on Google, and then it clearly just took that joke and then instantiated in a video that looked amazing. But it wasn't original in any way. And so, we were joking about, like, the RLHF loop for that is like you have an endless cycle of comedians running AI generated materials and then and then, you know, speak, microphones in all the comedy clubs to feedback what's getting laughs.

Speaker 1:

But I mean, that would work pretty well, actually. Yeah. If any comedians wanna hook us up with

Speaker 11:

an RL loop, I mean

Speaker 1:

Yeah. Yeah. But but, I mean, for for some of those less, like, as you go down the curve, it feels like each one gets harder and harder, to actually tighten the loop. We see this with, like, longevity research where it's like, okay. It takes a hundred years to know if you extended a human life.

Speaker 1:

Like, the yes. You could create a feedback loop around that, but every change is gonna be hundreds of years. And so even if you're on the cycle, it's irrelevant for us in the context that we talk about AI. So talk to me about like, are you running into those problems or or or will there be like another approach that kind of works around those?

Speaker 11:

So there are a lot of situations where you can get around this by just running much faster than real time. Like, let's say the process of building like a giant like building Twitter, right? It's something that would take human months, but if you got fast enough and good enough AIs, you could do that in several hours. Like realize heaps of AI agents, they're all building things to spec. And so you can get a faster reward signal in that way.

Speaker 11:

In domains that are less well specified like humor. I agree. It's really, really hard. And this is like why I think in some respects, like creativity is like at the top end of the spectrum, like true creativity is much, much harder to replicate than the sort of like analytical scientific style reasoning. Yeah.

Speaker 11:

And that will just take more time. Know what, the models actually are pretty good at making jokes about being an AI. This feels weirdly fresh. Like everything else is kind of a weird copy of something like it's like, it just it feels like it's derivative. Basically, it's trying to infer what humor is, and it doesn't really understand it.

Speaker 11:

But jokes about being an AI are quite funny.

Speaker 1:

Yeah, I think this also might be, I don't know if it was directly reward hacking, but I noticed that one of the new models dropped and a bunch of people were posting these like 4chan like be me memes and and and they were it seemed like they were kind of hacking the humor by being hyper specific about an individual that they could find information on online. And so you're laughing at the fact that it's like, oh, wow. That is like something that I posted about it. It's making a reference, but it's not really that funny to me. It's other than it's just like, wow, they really did its research.

Speaker 1:

Like, it really knows Tyler Cowen intimately, which is cool. But I didn't find it hilarious. Yeah. Yeah. Yeah.

Speaker 1:

Very interesting. Let let's talk about some sort of deep research product prod projects and products. We were talking to Will Brown and he was saying like AGI is here with some of the bigger models, but the but the time that AGI can feel consistent, it diverges. And so you could be working with someone who's, you know, hundred IQ, but they will consistent for years as an employee or they'll they'll keep living their life. Whereas a lot of these super smart models are working really well and then after a few a few minutes of work, the agents kind of diverge and kind of go into odd paradigms.

Speaker 1:

It feels very not human. It feels like a like just a they're hyper intelligent one way and then extremely stupid in others. What's going on there? What is the what is the path to extending that? Is that more like having more better planning and better better like dividing up the task or or will this just kind of naturally happen through the RL and scale?

Speaker 11:

Yeah, so there's that jaggedness, right? Which is what you're seeing is how we call it. And I think that is largely a consequence of the fact that maybe something like deep research is probably being our role to be really good at producing a report. Yeah, but it's never been our held on the like, act of producing valuable information for a company over a week or a month or like making sure the stock price goes up in like a quarter or something like this, right? Like it, it doesn't have any conception of how that feeds into the broader story at play, it can kind of infer because it's got a bit of world knowledge from the, you know, the base model and this kind of stuff.

Speaker 11:

There's never actually been trained to do that in the same way humans have. So to extend that, you need to put them in much longer running, much like long horizon things. And so deep research needs to become, know, like deep operator company for a week kind of thing.

Speaker 1:

Sure. Is that the right path? Like, it feels like the road might be there's a like the longest running LLM query used to be just like a few seconds, maybe a few And I remember when when some of the reasoning models came out, people were almost trying to like stunt on it by saying like, asked it a hard question and thought for five minutes. Now deep research is doing twenty minutes pretty much every time. Is the path two hours, two days?

Speaker 1:

Or are we going to see more efficiency gains such that we just get the 20 model, the twenty minute results in two minutes and then two seconds?

Speaker 11:

Yes, this is somewhere where like inference in many respects and prioritization becomes really important. So both like how fast is your inference is that literally affects the speed at which you can think and the speed at which you can do these experiments. Also how easily you can paralyze becomes really important. Like can you dispatch a team of sub agents to go and do deep research and like compile like sub reports for you so that you can do everything in parallel? These kinds of like it's, it's both like there's an infrastructure question here that feeds up from the hardware and the chips and this kind of stuff to like designing better chips for better inference in this and all this.

Speaker 11:

And, and an RL question of like, you know, how well can you paralyze and and and all this. So I think we just need to compress the timelines, compress the time compress the time frames, basically.

Speaker 1:

Yeah. So if I'm if I'm like an extremely big model and I'm running an agentic process, like, how how much my hankering for, like, a middle sized model on a chip or, like, baked down into silicon that just runs super fast? Because it feels like that's probably coming. We saw that with the Bitcoin project progression from CPU to GPU to FPGA to ASIC. Do you do you think we're we're at a good enough point where we can even be discussing that?

Speaker 1:

Because I every time I see like the latest mid journey, I'm like, this is good enough. I just want it in two seconds instead of twenty. But then a new model comes out. I'm like, I'm glad I didn't get stuck on that. Right?

Speaker 1:

But but yeah. Like, like, how far away from far away are we from, okay, it's actually good enough to bake down into silicon?

Speaker 11:

Well, there's a question here of baking it down to silicon versus designing a chip, which is like very suit of the architecture that you care Right? And baking on the silicon, unsure. Like, I think that's a bet you could take, but it's a risky one because the pace of progress is just so fast nowadays. And I really only expect it to accelerate. But designing things that make a lot of sense for the sort of transformers or architectures of the future should should make a lot

Speaker 6:

of sense.

Speaker 1:

That's a big gap though. Transformers or architectures of the future. We diverge, there's lot of companies that are banking on the transformer sticking around. What is your view on transformer architecture sticking around for the next couple of years? I mean, look, they stuck around for

Speaker 11:

five years, so they might stick around for a little while. But there's there's different you think about architectures in terms of this balance of memory bandwidth and flops. Right? Mhmm. One of the big differences we've seen here is Gemini recently had actually a diffusion model that they released

Speaker 1:

That was fantastic.

Speaker 11:

The other day. Right? So diffusion is inherently extremely FLOPS intensive process, whereas normal language model decoding is extremely memory bandwidth intensive. You're designing two very different chips depending on which bet you think makes sense. And if you think you can make something that does flops like four times faster than diffusion and like four times cheaper than you always could, diffusion makes more sense.

Speaker 11:

So there's like, there's this dance basically between the chip providers and the architecture, both trying to build for each other, but also like build for the next paradigm. Yeah. It's risky.

Speaker 1:

Do you I I I don't know how much you've how much you've played with image generation, but do you have any idea of what's going on with images in ChatGPT? It feels like there's some diffusion in there. There's some tokenization, maybe some transformer stuff in there. It It almost feels like the text is so good that there's, like, an extra layer on top almost and that it's it's almost like reinventing Photoshop. And and I guess the the the broader question is, like, it feels like an ensemble of models.

Speaker 1:

Maybe the discussion around around just agents and text based LLM interactions shouldn't necessarily be, transformer versus diffusion, but maybe how will these play together? Is that a reasonable path to go down?

Speaker 11:

Well, I think pretty clearly, there's some kind of rich information channel. Even if there are multiple models there, it's conditioning somehow on the other model. Because we've seen before, let's say when know, models use mid journey to produce images, it's never quite perfect. It can't perfectly replicate what went in as an input. It can't perfectly, like, adjust things.

Speaker 11:

So there's a link somehow, whether that's the same model producing tokens plus diffusion. I don't know. Like, yeah, can't comment on what OpenAI is doing there.

Speaker 1:

Yeah. Yeah. Yeah. Are are there any other kind of, like, wild card long shot research efforts that are maybe happening even out even in academia where I mean, this was the big thing with what was his name? Gary?

Speaker 1:

He was talking about I forget what it's called. Symbol manipulation was a big one. Yeah. And and I feel like, you know, you can never count anyone out because it might come from behind, it'd be relevant in some ways. But but are there any other research areas that you think are, like, purely in the theory domain right now that are worth looking into or tracking that, you know, low op low low probability, but high upside if they work?

Speaker 11:

It's a

Speaker 1:

tough one.

Speaker 11:

This is a tough one. But will say it's not a symbolic thing.

Speaker 1:

Please.

Speaker 11:

It's crazy how similar transformers are to systems that manipulate symbols.

Speaker 1:

Sure.

Speaker 11:

What they're doing is they're taking a symbol and they're like converting it into a vector and then they're manipulating and moving stuff like information around across them.

Speaker 1:

Sure.

Speaker 11:

Like, this this whole, like, debate that all transformers gonna represent symbols, and they cannot do this. I think it's it's yeah. It's not

Speaker 1:

So Gary Marcus underrated or overrated? I guess.

Speaker 11:

Overrated.

Speaker 1:

Yeah. Yeah. But but I mean, if if you if you twist the, if if you twist it so much, you wind up with saying like, well, really, the the transformer fits within that paradigm. And so maybe it's, you know, it's, you know, it like the rhetoric around it being a different path was maybe false the whole time. Yeah.

Speaker 1:

Something like that. But but I but as as I remember that debate, it was really the the idea of compute scaling versus almost like feature engineering scaling. And and will the progress scale with human hours or GPUs essentially? And that has a very different economic equation. And it and it feels like there's been there's been some rumblings about maybe with a data wall, we'll shift back to being human labor bound.

Speaker 1:

But do you think that there's any chance that that's relevant in the future, or is it just algorithmic progress married with bigger and bigger data centers in the future?

Speaker 11:

So I'm pretty bitter lesson built the sense that I do think removing as many of our biases and our clever ideas from the models is really important, just freeing them up to learn. Now, obviously there's clever structure that we put into these models such that they're able to learn in this extremely general way. But I am more convinced that we will be compute bound than we will be like human researcher out, human researcher bound on this kind of thing. Like we're not gonna be feature engineering and this

Speaker 1:

kind of stuff.

Speaker 11:

We're gonna be trying to devise incredibly flexible learning systems.

Speaker 1:

Yeah, that makes sense. On the scaling topic, part of I I I I haven't part of my, like, worry is that the the oomphs get so big that they turn into these mega projects that are that at a certain point, you're bound by the laws of physics because you have to move the sand into silicon chips and you have to dig up the silicon and it's There's certain sand. Yeah. There's only so much sand and like the the math gets really really crazy just for the amount of energy required to to move everything around to make the the big thing. Where are you on how much scale we need to reach AGI?

Speaker 1:

How whether or not we will see, like the laws of physics start acting as a drag on progress because it certainly feels exponential. We're feeling the exponentials, but a lot of these turn into sigmoids. Right?

Speaker 11:

So I think we've got what, like two or three more oooms before before

Speaker 1:

it

Speaker 11:

gets really hard. Leopold has this nice table at the end of his situational awareness where I think like 2028 or something is when under really aggressive timelines that you get to 20% of US energy production. Yeah, it's pretty hard to go exponentially beyond 20% of US energy production. Now, I think that's enough. Every indication I'm seeing says that's enough.

Speaker 11:

Now, there might be some complex, you know, data engineering, we're one engineering, this kind of stuff that goes into lots of, there's still a lot of algorithmic progress left to go. But I think that with those extra rooms, we get to basically a model that is capable of assisting us in doing research and software engineering.

Speaker 1:

Yeah, which is the beginning of the self reinforcement backward. Yeah. Interesting. Is that just a coincidence? Like, this feels like one of those things this feels like one of those things where, like, the moon is the exact same size as the sun in the sky.

Speaker 1:

It's like, oh, it just happens that AGI happens within this time. Like, hey. Woah. Did you have you unpacked that anymore? Because it feels convenient.

Speaker 1:

Not not to, you know, I I know less that. There's a there's a lot of

Speaker 11:

weird conveniences or, like, weird it's a good sci fi story,

Speaker 1:

let's say. Totally.

Speaker 11:

You know, we've got a, you know, Taiwan in between China and The US, and it produces the most valuable material in the world that's locked between the two

Speaker 2:

Incredible plot. Yeah.

Speaker 1:

Yeah. Bad for the people that don't think of that don't believe in simulation theory. It really feels like it's a scripted program. Fascinating. Talk to me more about getting to an ML engineer in AI and and kind of that reinforcement.

Speaker 1:

I imagine that you're using AI codegen tools today and and and Thropic is broadly and everyone is. But but what are you looking for and what are the what's the shape of the the spiky intelligence? Where do they fall flat? And what are you looking to kind of knock down in the interim before you get something that's just like go?

Speaker 11:

Yeah. So I mean, we definitely use them. The other night I like I was a bit tired. I asked to do something just sat watching it in front of me working for half an hour.

Speaker 1:

It was great.

Speaker 11:

Nice. It was truly weird experience, particularly when you look back a year ago and we're still copy pasting stuff between a chat window and and, you know, a code file.

Speaker 1:

Yeah.

Speaker 11:

What I like meters evals for this kind of stuff. So they have a bunch of evals where they measure like the ability to write a kernel, the ability to run a small experiment and improve a loss. And they have these nice progress curves versus humans. And I think this is maybe the most accurate reflection of like what will take for it to really help us at doing progress. And there's a mix here, like where they're not so great at the moment is like large scale distributed systems engineering, right?

Speaker 11:

Like debugging stuff across heaps and heaps of accelerators and like the way the feedback loops are slow. And you actually like if your feedback loop is like an hour, then it's you spending the time on doing something. Yep. Feedback is fifteen minutes. And

Speaker 1:

for context there, the hour long feedback loop is just because you have to actually compile and run the code across everything that takes that long. Exactly like

Speaker 11:

spin up all your machines or you need to like you need to like run it for a while to

Speaker 1:

see if something's

Speaker 11:

going to happen. Like at that point in time, you're still cheaper than the chips. Sure. So you're you're you're sort of it's better that you do it. But for things like your kernel engineering or for like, actually even just understanding these systems, incredibly helpful.

Speaker 11:

Like one thing I regularly do at the moment is in parts of the code base in like languages that I'm unfamiliar with or some stuff like this, I'll just ask it to rewrite the entire file with comments on every line. Game changing. It's like

Speaker 1:

Comments on every line.

Speaker 11:

Yeah. Or just come through like thousands of files and explain how everything interacts to me, draw diagrams, this kind of stuff. It's really, yeah.

Speaker 1:

Yeah. How important is a bigger context window in that example you gave that feels like something that's important and yet it I I I just naively like Google's the one that has the million token context window. I imagine that all the other Frontier Labs could catch up, but it seems like it hasn't been as much of a priority as maybe like, the PR around it sounds like. Is that important? Should we be go should we be driving that up to, a trillion token window?

Speaker 1:

Yeah. Is that is that just gonna happen naturally?

Speaker 11:

There's a nice plot in the Gemini 1.5 paper Mhmm.

Speaker 1:

Where they show the, like,

Speaker 11:

loss over tokens as a function of context length. And they show that the loss goes down quite steeply, actually, as you put more and more and more like a code base in the context, you get better and better and better predicting the rest.

Speaker 1:

Yeah, that makes sense.

Speaker 11:

On the context length, it's cost. You know, the way transformers work is that there's, you have like this memory that is proportional, the KBCache is proportional to how much context you've got. And so you can only fit so many of those into like your various chips and this kind

Speaker 2:

of stuff.

Speaker 11:

And so longer context actually just costs more because you're taking up more of the chip and you're sort of like, you could have otherwise been doing other requests basically.

Speaker 1:

So bringing it back to the custom silicon, is that a unique advantage of the TPU? Is is that something that Google has has thought about and then wound up to put themselves in this advantage position? Or is it a durable advantage even?

Speaker 11:

Yeah. So TPUs are good in many respects, partially because you can connect hundreds or thousands of them really easily across really great networking. Whereas only recently has that been true for GPUs.

Speaker 1:

Like NVLink?

Speaker 11:

Yeah, with NVLink and like the NVL 72 stuff. So it used to be like eight GPUs in a pod and then like you connect them over worse interconnect. And now you can do 72. And then it breaks down. With Google TipViews, you can do like 12,000 over really high bandwidth interconnect in one pod.

Speaker 11:

And so that is helpful for things like just general scaling in many respects. I think it's doable across any chip platform, but it is an example of like somewhere that being fully vertically integrated a in a benefit.

Speaker 1:

Yeah, that makes sense. Talk to me about Arc AGI. Why is it so hard? It seems so easy.

Speaker 11:

It does seem easy, doesn't it?

Speaker 1:

Well, it certainly seems like more more evaluatable than tell me a funny joke. Right?

Speaker 11:

Yeah. And I mean, I think if you are old on arc AGI, then it would you'd probably get superhuman at it pretty fast. But I think we're all trying not to RL on it so that it functions as like an interesting held out.

Speaker 1:

Sure. Okay. Wait, is that just an informal agreement between all the labs basically?

Speaker 11:

We're trying to have a sense of honor between us.

Speaker 1:

That's good.

Speaker 2:

Sense of honor.

Speaker 1:

That's amazing.

Speaker 2:

How many people on earth do you think are getting the full potential out of the publicly available bottles? Because we're now at a point where we have, you know, billion plus people are using AI almost daily. And yet I have to admit, my sense would be it's maybe like 10,000, 20 thousand people on the entire planet are getting that sort of full potential. But I'm curious what your assessment would be.

Speaker 11:

Yeah, I completely agree. I mean, I think that even I don't get the full potential out of these models often. And I think as we shift from you're asking questions and it's giving you sensible answers to you're asking it to go do things for you that might take hours at a time, and you can really like paralyze and spin that we're going to hit like yet another inflection point where even less people are like really effectively using these things because this basically doesn't require you to like, it's like a like StarCraft or DOTA. Like it's going be like your APM of like managing all these agents and that's totally process.

Speaker 2:

Yeah. I think Starcraft is such a good example. You think you're just absolutely crushing it and then you realize like there's an entire area of the map you're just getting destroyed. It's such a good it's such a good comp.

Speaker 1:

That's great. Anything else Jordy?

Speaker 2:

I think that's it on my side. I mean, I would like this to be an evolving conversation.

Speaker 1:

Yeah, this is fantastic. We'd love have Absolutely.

Speaker 11:

It's really fun.

Speaker 1:

Yeah. We'll talk to you soon.

Speaker 2:

Cheers, Shelton. Have a good one.