Wil and Matt discuss tech, startups, and building really cool things with AI. Sometimes joined by (actual expert) friends.
Wilhelm (00:00.13)
me your startup. This is like a
Matt And Sunil (00:02.973)
We are seeking 1.5 million over 10 for AI chat with dogs. That would be dope by the way. There are many dogs. It's all the dogs. Yeah. It's done. Okay. Do you want to switch it with the light right behind us?
Wilhelm (00:11.588)
But how big is the market?
Wilhelm (00:16.77)
All right, granted.
I'm committed. All right, let's roll the intro.
Matt And Sunil (00:32.204)
and Sunil. I just, I keep forgetting that your soundtracks are a little bit out there. Hey, hi.
Wilhelm (00:34.134)
Answer now, what up?
Wilhelm (00:41.37)
Yeah, you were cringing hard in that intro, but we love it.
Matt And Sunil (00:46.222)
I mean, it's also like, there are days when I listen to music like that. What's the one I was listening to lately? I don't know. Like, I've been on Instagram lately, which means I've been listening to a lot of TikTok music, because all of TikTok just ends up on Instagram. And it's okay. Yeah.
Wilhelm (01:00.506)
Mmm.
Wilhelm (01:07.066)
What's going hard on TikTok music right now?
Matt And Sunil (01:13.166)
Dude, I don't know. BB no. Who the hell is BB no? Is that BB no dollar something? Left, right, left, right, left, right, go. No, no, no, no. Dude, it's all too old for me. Like, I spent half an hour on Instagram and then I have to listen to a half hour of Neil Young just to, like, go all the way back to my actual age group.
Wilhelm (01:28.154)
That was very good.
Wilhelm (01:36.537)
Yes.
Just to calm down, where are you both at the moment? It looks like not like London somehow.
Matt And Sunil (01:44.462)
It's London. It is London. on the bank side. So we're on the South Bank.
Wilhelm (01:49.848)
Is that where the executive, wait, just the normal Cloudflare office?
Matt And Sunil (01:54.644)
This isn't the office, this is his hotel room.
Wilhelm (01:57.914)
okay, that makes sense. That makes sense. Yeah.
Wilhelm (02:04.022)
I was wondering if there was a separate executive Cloudflare office that was in a different undisclosed location.
Matt And Sunil (02:11.81)
Where the nobility doesn't have to mess with the unwashed masses and we would be considered executive. see. Okay. All right fairness I see I see
Wilhelm (02:16.471)
Exactly.
That's right. Yeah, yeah, yeah. It's based on how many Twitter followers you have. That's who decides whether you're an executive or not.
Matt And Sunil (02:27.656)
We'd be doing pretty well then. We'd be doing pretty decent. Dude, you're back on Twitter!
Wilhelm (02:34.24)
Not really, no. Well, I was going to... Well, it was Easter Sunday yesterday, so happy Easter. And Lent is over, which means my Twitter fast in theory is over. But I don't know, it's been nice without Twitter, you know?
Matt And Sunil (02:40.93)
Peace.
Matt And Sunil (02:49.56)
You're not missing anything. Someone tried to cancel my ass and that went, that was interesting.
Wilhelm (02:49.806)
might just continue it.
Wilhelm (02:56.292)
we did talk about that. That's been actually great. Matt's just been catching me up on the highlights of any Twitter drama whenever we chat. And honestly, that's perfect for me.
Matt And Sunil (03:07.136)
It was unexpected. Also, the thing I really dislike about the whole thing is that everyone started talking about what a nice guy I am and I shouldn't be bullied. I was like, I'm not nice motherfuckers. Who came up with that shit? No, no, no, no, But it was...
Wilhelm (03:12.538)
Mm.
Wilhelm (03:29.198)
What was the timeline like from your, did you just wake up and suddenly you were like, what is happening? Because that's kind of what it seemed like.
Matt And Sunil (03:33.902)
So I'd flown back from Spain and I was super tired and I was napping in my little study room, my little office study thing. And I woke up to go to bed at something like 11. I was like, let me look at Twitter. And the first, actually I got a couple of messages from actual Cloudflare executives. They were like, check Twitter. I checked Twitter and this had gone down and I was like, shit.
That started a crazy 48 hours. Like I immediately responded to that and I was up until 3 that night I think.
Wilhelm (04:12.045)
Lovely. Damn. Is it all settled now?
Matt And Sunil (04:16.3)
I mean, yeah, I guess. We're going for beers with them this week. Yeah, that'll be fun. Actually, not just that. Cloudflare is sponsoring the speakers' dinner for AI Engineer, and speakers are going to be there. That'll be fun. Have you got mega FOMO?
Wilhelm (04:33.921)
For AI Engineer London? Kind of, to be honest. Yeah, that'd be nice to be around for that.
Matt And Sunil (04:39.692)
Everyone's showing up for it. And that's the best part of it. So I don't know if you noticed, it's like an eight-track conference. Yeah, so which means, like, everyone is showing up in London, by the way. Name the company, name the DevRel, name the engineers. They're all here. Like, Gergely is doing an OpenClaw Q&A with Steinberger. Steve Ruiz, of course, is doing a tldraw thing, Cursor,
Wilhelm (04:47.357)
wow.
Wilhelm (04:57.081)
The FOMO is increasing.
Matt And Sunil (05:08.094)
OpenAI, like everyone is there and in the middle of that like me and Matt are going and talking about like isolates and stuff. Yeah
Wilhelm (05:17.145)
Nice, nice. Does it start on Wednesday or something? Yeah, when does it start?
Matt And Sunil (05:22.744)
I think there's some workshops on Wednesday and then talks on the Thursday and Friday. That's right.
I don't know what the workshops are. I keep getting emails, the emails, do you find them, the get-ready emails? Oh yeah, yeah. They're quite like. So they're getting progressively aggressive. I was gonna say that, it's not just me, they're quite aggressive. They're like, it's three days left. We are 100% booked out. Like, basically being like, don't try to bring a friend. I guess, but also how can they stop you? It's a conference. Yeah, I don't know. I mean, I'm just saying you can show up and we can get you in if you want.
Wilhelm (05:55.267)
Ha ha ha!
Matt And Sunil (05:59.98)
You do know this is from my... Is this live? No, no, no, it's not live.
Wilhelm (05:59.982)
It's funny because when AI Engineer was on in SF, when I was in SF, I had a ticket and I tried to go in and I think the guy at the entrance wasn't briefed properly, because he looked at my badge and said, no, you can't go here. But it turns out he was at the entrance for the whole conference. He wasn't, like, guarding some special area. I just couldn't get in. So I just left. And then I didn't come back until the last day out of resentment.
Matt And Sunil (06:30.657)
no.
Wilhelm (06:30.777)
So I think even if you have a ticket, sometimes you can't get into AI Engineer. Now I'm sure it was a one-off. I'm sure someone told him quickly.
Matt And Sunil (06:35.768)
Well, it's nice being a... It's nice being a speaker then, I guess. Yeah. It's gonna be fun. It's gonna be really
Wilhelm (06:40.857)
No, I hope you have a great time.
Matt And Sunil (06:44.504)
We're gonna have some beers with interesting people. Like we've been, you especially have been, I guess like teasing beers with Mario. I think that's gonna be really fun. So I'm so excited about that. Mario, I, Mario of course makes a coding agent called Pi. So you can imagine why I'm so.
attracted to the idea of it, but I love the philosophy. I had a couple of calls, well, I had one call with him. You've spoken to him a couple of times. Wonderful dude. Him and Armin are basically like a... Wil's a big Armin fan. Yeah, yeah. I love them both. They bring a Urium...
Wilhelm (07:24.417)
I feel like, yeah, Peter was part of that trio as well. I feel like it was the three of them hacking on stuff, like, end of last year, and now they're all doing incredible things.
Matt And Sunil (07:30.018)
Yeah.
It's nice to have a weird European sensibility in an otherwise American dominated like ecosystem. So I love it. So I think we're grabbing beers with them and that'd be fun. Yeah. So why do you have me on your stupid podcast? What do you want to talk about?
Wilhelm (07:40.227)
Totally, yeah.
Wilhelm (07:49.411)
That's awesome.
Wilhelm (07:52.983)
Yeah, Matt, I feel like you had tons of stuff you said you wanted to get through. do you want to, what's the agenda?
Matt And Sunil (08:00.366)
Dude, have you caught up on what's happened on Twitter the past like, couple of weeks?
Wilhelm (08:05.994)
I mean, there's always something stupid going on. What's, what's, what do mean?
Matt And Sunil (08:09.996)
No, go on, tell me what you've seen.
Wilhelm (08:13.132)
It's funny because my feed, my algo has reset and it's like only 10% tech Twitter now and the rest is just like silly videos that are not tech related at all. And I mean, yeah, there was some beef, there was some latest Claude Code harness stuff. I don't know, man. It's not that interesting. Okay, okay. That's the...
Matt And Sunil (08:29.228)
That's the big one right now, by the way. Over the last couple of days, they're just locking out anything that's not Claude Code running in the way they expect.
Wilhelm (08:39.522)
So the thing that I'm still confused about is whether it includes the Agent SDK or not, because my agent is just fully on top of the Claude Agent SDK and it's unclear to me whether that's okay or not okay.
Matt And Sunil (08:48.44)
Well.
Matt And Sunil (08:53.09)
There's a tweet from Matt Pocock where he's just like, it's obvious. It makes so much sense about what they allow. Yeah, so you can't use OAuth tokens with other harnesses. You can use it with Claude Code. You can use it personally with claude -p in your own software. It's not clear whether you can distribute it. Companies can't use it. And if you change the system prompt to have...
Like, I think now if you try to say either OpenClaw or OpenCode, you're blocked, like, immediately. They also, I think, do traffic or prompt analysis. So if they detect you're doing the wrong thing, they'll block you. Like earlier today, it locked itself out when it realized it was analyzing Claude Code itself.
Wilhelm (09:43.677)
seriously?
Matt And Sunil (09:44.558)
Yeah, yeah, yeah. So I saw, I think I saw a couple like that. I think Reese has been playing with it a bunch, like trying to find the limits. Yeah.
Wilhelm (09:52.697)
So there were like two emails that went out, right? Like one was the one that went to the cohort of everyone they were confident was using one of the banned harnesses. And then there was a second one to everyone, I think, that was like, hey, here is a change that's happening. FYI, if you happen to be using the harnesses, that now counts as extra usage. And I didn't get the first one. I feel like, so my agent, it all just uses the Claude Agent SDK, which I think under the hood just kind of shells out to claude -p. So I think it's fine, but who knows?
Matt And Sunil (10:21.87)
I didn't get the first one either and I'm definitely on extra usage now so I'm not convinced.
Wilhelm (10:22.21)
We'll check back in next week, I guess.
Wilhelm (10:27.914)
Okay, I think. Should I tell Chad to kick off a heavy job and then we'll see if I'm on extra usage too?
Matt And Sunil (10:33.902)
Wait, so Wil's got an OpenClaw called Chad. It's so good. Can you show us your app again? I'm actually like more excited about the app you built for it.
Wilhelm (10:39.894)
It's...
Wilhelm (10:44.402)
Okay, yeah, let me see if the app has anything new worth talking about. interesting. It's getting a 400.
Matt And Sunil (10:53.966)
That's right. Right now they're having an incident with logins. yeah. At this moment. You have to re-auth. Yeah.
Wilhelm (11:03.448)
I need to ask Chad how I can do that in the app. No, but I think this is, don't, yeah. I mean, it's just.
Matt And Sunil (11:08.782)
because I was getting 400 as well. Maybe not 400, 4XX, something, some four style error.
Wilhelm (11:17.784)
Yeah, I think everyone should have a mobile app for their agent. I think actually one thing that's also really underexplored is, like, everyone gets to this point, right, where their agent does all this stuff and it's just kind of hard to review. So I think one of the cool next frontiers will be, like, how can you put all the work that your agent does, in whatever shape it might take, right? Whether it's like buying a new, you know, a new supplement based on your health data, or actually some work code or whatever. You want, like, a really good review workflow that is very mobile native,
like a kind of Tinder-style approve/decline, like something that you can do from your phone. So when you go to the toilet with your phone, you're not looking on Twitter, you're just like going through the review queue and being like, okay, cool, no, yeah, a little bit of feedback. I think that'll be like the next thing that needs to be built into this app. I don't have that yet, to be clear.
Matt And Sunil (12:04.622)
Can you make a to-do list that does? Isn't that basically that? This is the... The to-do list that does. This is the agent that makes the to-do list. Dude, do you realise how dystopian this sounds? Like, is my entire career, like, eight hours a day going to be reviewing clanker output? Yeah. On the toilet. We also get to have a little bit of inspiration of what we would like to build, I think. I think that's not going to get taken away from us.
Wilhelm (12:22.07)
On the toilet. Yes.
Matt And Sunil (12:34.616)
for the foreseeable future. And when that does, then that becomes horrific. Like that is dystopian as. If you have no autonomy on what you get to review, that's tough.
No? You're not sold? I feel like that's when people start falling into the psychosis thing, right? Because they're like, they're like, Claude told me to do this. Okay, so Claude is now doing this and now I'm watching Claude do what it told me to do. Well, I'm watching Claude do what it told me to tell it to do. Right. Now it gets to the end of it. so I'm reviewing it. Claude, how do I review this? okay. It tells me to do it to make these changes. Right. Claude made these changes.
And then you're like, you've completely shelled out any type of neuron activity. So this is my problem. Like I almost feel like my brain is starting to like calcify. And the reason is, okay, so I'm working like really hard right now. First of all, that itself makes me unhappy. I was hoping I'd be working less at this moment, but no, I'm like going full on every day. And unfortunately what's happening then is I'm setting a new expectation with my team, with my managers, all of that. So come.
like in three months, if the model's not smarter, it'll look like I've, like, gone static. So what is my job, bro? Like, what am I doing? And the, yeah, which takes me to the second thing. Right now, any success I get with this code is because I'm still suspicious. Okay? Like, I watch reasoning traces all the time, because every three minutes, four minutes, I'm like, whoa, whoa, whoa, hold on, hold on. Like, you need to do it this way.
but I can feel the laziness kicking in, bro, where I'll be like, yeah, fucking do it. Someone will file an issue and I'll say bonk. We have a thing in GitHub and our internal thing called bonk, where we say bonk review this bonk. Bonk success. Yeah, so my workflow is going to be like, and thankfully Cloudflare doesn't like judge your output based on tokens, but if they do, they'll be like, wow, Sunil's killing it, because every day I'm like, bonk, review this.
Wilhelm (14:32.961)
Hahaha
Matt And Sunil (14:45.048)
Think hard about edge cases. So I've won like another $20 in tokens by saying that. I was just telling you, one of the things this has done is I have no time for side projects now. At a time when I would be killing it, man. I have a side project I'm building, which is my dream party game. Have I told you about this? It's basically Clue, but with AI. It's Murder Mystery.
So it generates a full scenario. One person becomes the detective, one person becomes the murderer. And then you take turns figuring out who the murderer is, et cetera, as you ask questions. The thing I'm trying to build is so that it's not all done by AI. How do you make it like a party game? And I've done this thing where it generates the images of the characters, and it actually generates realistic scenarios where it doesn't hallucinate the thing.
And I want to build this so badly because I love whodunits and I want to build software that brings people together. For people who are listening to this in audio outside the camera view, I'm doing a slow jerk off motion. I just, and I, what am I doing reviewing clanker code all day at the.
Wilhelm (15:51.954)
Hahaha
Wilhelm (15:58.2)
Thank you. Yeah, yeah, I was laughing because of that.
Matt And Sunil (16:09.378)
You are having a bunch of fun with Gladstone, right? Like you built yourself a little fun thing. That sounds nice. Yeah, I've got my little clanker. It's doing well. Although you get a bit cocky sometimes. You're like, I can build, I can build this. I can do this. I can do this. I can do this. And then just like go off and do it and do it and do it. And there is still like, I hate to say that the guardrails, the guardrails are only what we put there for ourselves. But for instance, like Thomas, I'm pretty sure he killed his SD card.
Wilhelm (16:09.771)
I think if you're
Matt And Sunil (16:39.211)
on his Raspberry Pi. He was trying to re-implement dd in Zig.
Wilhelm (16:45.879)
Is it because he plugged the kettle in again?
Matt And Sunil (16:49.582)
Plug the kettle in? What? Yeah, his entire thing is running on a Raspberry Pi right now, right? Yeah.
Wilhelm (16:55.019)
You're telling me that when he plugs the kettle in, he has to unplug his Raspberry Pi or something like that because the... Or his website goes... His blog goes down.
Matt And Sunil (17:01.582)
Oh, no, no, it wasn't that. It was his partner who was unplugging his Raspberry Pi when she wanted to plug the kettle in. Yeah, yeah, yeah. It was using the kettle slot. No, that was his mismanagement. He doesn't have redundant power backup. Yeah. Yeah. He needs to get that. Oh, he needs to put it in, like, an empty socket. It's not that complicated. Like hanging off the wall near the washing machine or something. Yeah.
Wilhelm (17:10.559)
Alright, okay.
Wilhelm (17:20.651)
Hahaha
Wilhelm (17:26.866)
Hahaha
Matt And Sunil (17:31.278)
Was he really keeping it right in the kitchen? Yeah, yeah, yeah. Like, I'm pretty sure on, like, the side. Yeah. He has a good setup, actually. I'll talk about his sound because mine is like a full agent loop, right? And it has, like, all of the Pi tools. And it has a few extra tools for acting on the external stuff. We talked about this. It's got, like, code mode for acting on external stuff. So like GitHub or Discord, the way it interacts with me is through code mode. And I think that works pretty well because I can do permissioning quite well.
But Thomas, he's just been like, the tools are all I need. I'm just going to use Claude and just MCP the tools into normal Claude desktop or Claude mobile. So he's got his persistent sandbox running in the background of every single Claude thread, because his sandbox is his Pi. So he's, like, offloading the actual agent loop to Claude. And then he's just doing the...
Wilhelm (18:16.759)
I see.
Wilhelm (18:24.372)
no way, interesting.
Matt And Sunil (18:30.947)
the
Wilhelm (18:32.023)
So he puts the tools into the claude.ai chat. Yeah, interesting.
Matt And Sunil (18:35.788)
Yeah. Yeah. And he's like, I love the app. The app's great. Let's just use the app. Let's just give it persistent access. And so Claude can do all of the bash, can do like basically everything on his Raspberry Pi. It did mean that he made, like, a very scary MCP server, which is: give bash access to this computer. So he had to make that properly. And yeah. Actually, I actually uncovered a bunch of stuff. He wanted me to do some, like,
Wilhelm (18:40.171)
Yeah, yeah,
Wilhelm (18:54.059)
Yeah.
Matt And Sunil (19:03.662)
vulnerability testing on the MCP server because he's using my auth libraries. Well, auth libraries we both work on, actually. Yep. And we basically realized, well, I knew this but Thomas didn't really realize, and I'm not sure a lot of people realize: when DCR is enabled, like dynamic client registration, which basically all MCP servers enable, this thing called DCR, which is part of the OAuth spec, that means that anybody, anywhere in the whole world, can register a client
on your MCP server, even if they don't have, like, the authority to then get a code, swap the code for a token, and actually get access. So I registered a client on his MCP server and then I sent him a phishing link that Rickrolled him, and the phishing link went through an auth flow that he was already signed into. Oh no. And so I got auth codes from a phishing link.
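A minimal sketch of the flow described here, as a toy in-memory model rather than any real auth library or Thomas's actual server. Every name in it (`registerClient`, `authorize`, the `requireConsent` flag, the example URLs) is hypothetical. It assumes the server skips any per-client consent screen for a user who is already signed in, and shows one commonly suggested mitigation: asking for explicit consent the first time a newly registered client is used.

```typescript
// Toy, in-memory model of an authorization server with open dynamic
// client registration (DCR). Every name here is hypothetical.

type OAuthClient = { clientId: string; redirectUri: string };
type UserSession = { user: string; consentedClients: Set<string> };
type AuthorizeResult =
  | { kind: "redirect"; location: string }
  | { kind: "consent_required"; clientId: string }
  | { kind: "error"; reason: string };

const registeredClients = new Map<string, OAuthClient>();

// Open DCR: the caller is not authenticated in any way.
function registerClient(redirectUri: string): OAuthClient {
  const client: OAuthClient = {
    clientId: `client-${registeredClients.size + 1}`,
    redirectUri,
  };
  registeredClients.set(client.clientId, client);
  return client;
}

// Handles a visit to the authorization endpoint by whoever clicked the link.
function authorize(
  session: UserSession | null,
  clientId: string,
  redirectUri: string,
  requireConsent: boolean,
): AuthorizeResult {
  const client = registeredClients.get(clientId);
  if (client === undefined) return { kind: "error", reason: "unknown client" };
  if (client.redirectUri !== redirectUri) {
    return { kind: "error", reason: "redirect_uri mismatch" };
  }
  if (session === null) return { kind: "error", reason: "login required" };

  // Assumed mitigation: never issue a code to a client this user has not approved.
  if (requireConsent && !session.consentedClients.has(clientId)) {
    return { kind: "consent_required", clientId };
  }

  // Without that check, a signed-in user is redirected straight away,
  // and the code lands on the redirect URI the client registered.
  const code = `code-${session.user}-${clientId}`;
  return {
    kind: "redirect",
    location: `${redirectUri}?code=${encodeURIComponent(code)}`,
  };
}

// The victim is already signed in; the attacker registers a client and sends a link.
const victim: UserSession = { user: "victim", consentedClients: new Set<string>() };
const attacker = registerClient("https://attacker.example/callback");

const silent = authorize(victim, attacker.clientId, attacker.redirectUri, false);
const guarded = authorize(victim, attacker.clientId, attacker.redirectUri, true);
console.log(silent.kind, guarded.kind);
```

In this model the `silent` call ends in a redirect that carries a code to the attacker's callback, while the `guarded` call stops at a consent prompt naming the unfamiliar client.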
Wilhelm (19:52.661)
Nice.
Wilhelm (19:58.359)
That's very good. You got auth codes through your own client through his, like, token. Yeah. Through his session. That's pretty good, Matt.
Matt And Sunil (20:04.812)
Yeah, because I knew he was already signed in. So he was telling me about this thing and then he me the link and said, try and break into this. so I made an auth client for his server and sent him a link to authorize my auth client. And like, if you send it to someone who's already pre-authed, like, you know?
Wilhelm (20:24.054)
That's very good. I think in this day and age, if you're not prompt injecting or pwning your friends' agents, it's really discourteous. Like, we all need to be doing this.
Matt And Sunil (20:32.45)
Well, you need to try and do a bit more to mine. All I get from yours is setting up a...
Wilhelm (20:40.266)
Have I committed a faux pas by not talking to your agent enough?
Matt And Sunil (20:42.702)
No, you set up like a podcast cron job thing so we can keep track of our ratings and that's it. Where's the rest?
Wilhelm (20:50.422)
Hey, don't tell people this. We don't care about our ratings. Okay, alright.
Matt And Sunil (20:58.232)
You still have no, every week it's like, you still have no reviews on Apple, on Apple podcasts. That's the thing that's letting you down.
Wilhelm (21:04.32)
Actually, yeah, we have so many reviews on Spotify, but we need some Apple Podcast reviews. Although, I actually, feel like Spotify must have more people using it now than like Apple Podcasts, right? It's like, it's just much nicer experience.
Matt And Sunil (21:13.966)
Yeah, I need to check out analytics. Yeah, if you are listening to this podcast, give us a review, preferably a good one.
Wilhelm (21:18.304)
Okay.
Wilhelm (21:21.62)
I'll have Chad try and inject more stuff into Gladstone.
Matt And Sunil (21:25.996)
Wait, are you gonna give Chad a Discord MCP to interact with Gladstone?
Wilhelm (21:33.27)
Sure. Sure.
Matt And Sunil (21:37.646)
Sure, no you're not. How would you get it to do it? How would you get it to do it?
Wilhelm (21:39.382)
No, maybe, I don't know. We'll see. Maybe, I mean, I control both sides of the stack, right? Because I can talk to Gladstone myself. So I'll just set up some kind of webhook sync and then pwn directly. No discord needed.
Matt And Sunil (21:56.91)
Crazy, crazy.
Wilhelm (21:58.187)
Wait, Matt, you wanted to talk about, like, this TBPN thing as well,
Matt And Sunil (22:01.626)
yeah, TBPN got acquired. Yeah, what is that about? I don't understand it. So I saw some comparisons of TBPN being acquired by OpenAI as similar to Facebook acquiring Instagram before they IPO'd or Google acquiring YouTube before they IPO'd. Those are very different. They're not real platforms. Those are the Bezos acquiring like...
Wilhelm (22:27.019)
Ha!
Matt And Sunil (22:29.494)
Washington Post. But maybe, it's like a particular move you do before your IPO, as potentially a way of, like, creating sentiment. Yeah, creating narrative for an IPO, potentially. I have no idea. Do we know that Sam has been pushing for an IPO? And when I say we know, we speculate, because who the
Yeah, I think that's a wild one.
Wilhelm (23:01.204)
It is very well run. Yeah. think I feel like there's, it, what I find crazy is just a story, right? Like imagine you and your friend start this like show, you know, and the main thing you do differently is that you actually are interested in what the people coming on have to say. And you're not like asking questions that are at least to a tech crowd. Right. I feel like you see like some executive, even like your CEO go on like Bloomberg TV and the questions that the Bloomberg people ask are just.
Matt And Sunil (23:24.332)
Yeah.
Wilhelm (23:30.986)
They're in like a completely different world. It's like the completely different way of thinking about everything. And it's just not very interesting to at least a tech audience. And these guys, they seem to like their guests, you know? And then 12 months later, you get acquired for in the low hundreds of millions. Like, are you kidding me? That's just wild.
Matt And Sunil (23:41.036)
Yeah, yeah, yeah.
Matt And Sunil (23:48.654)
I don't know if that was what they did differently. Do you think it was? I think what they did differently was they recorded every day for three and a half hours a day, every day, and they were so on the money. Like, if you think of the people they had on, they've had everyone on now. had so much money gone 24 hours after it blew over. They killed.
Wilhelm (24:17.181)
Hahaha
Matt And Sunil (24:18.634)
It's insane. think that's like my, my god. Yeah, they, yeah. But they also talk quite a lot about not wanting to get the scoop. They want the thing 24 hours after it happens because they want the discourse. They don't want the initial scoop. They want to be able to like dissect a little bit at least. Okay, but I still don't understand why OpenAI acquires them. Well, OpenAI are famously not good at narrative. That's true.
Like, famously not good at narrative. Like, Anthropic have, is it, what's his name, Sam McAllister? Like, the guy, he's just so good, like beautiful pictures, paints a lovely image of the company. And they have loads of other people who do branding and stuff. But OpenAI, famously not good at this. Like now there is some... But they claim they're going to keep it independent. Like, that's the whole point that...
They are like, we are not going to become shills. There's no way they can't. But that's also impossible. Like, Bun is independent, and yet they're building stuff for code. I guess, yeah. I guess like a huge amount of their life. OK. Independent is a, I guess, independent.
Wilhelm (25:25.449)
Yeah, all the takes seem to be that it's for influence. Like, that's the reason to do it. And especially it's, yeah.
Matt And Sunil (25:28.396)
Yeah. Independent to me means that they can't pre-record it and get OpenAI lawyers to cut bits out of it. Like, that's the lowest bar, right? Yeah. I think that's the bar they're going to meet. Everything else above that is like... Okay. Yeah. I wonder.
Wilhelm (25:47.318)
But I feel like Matt, you were inspired by this, that we should improve our quality. Right? Was there something like that? Do you feel like we should prepare more, like we clearly have not done for this episode at all, or do it more like regular? How did it affect you personally, rather than what it might mean for OpenAI?
Matt And Sunil (25:52.399)
yeah, you guys want to be acquired for a hundred mil by... We're gonna get...
Matt And Sunil (26:12.672)
no, I think like there is a distribution. I guess it's easier. The more I... Okay, so like a month ago or something, whenever it was, when we released the code mode blog, like my Twitter absolutely blew up. It was kind of insane. And I guess I never really...
thought about what I would do if my Twitter blew up like that. Has it changed how I post on Twitter? No, it just means that more people get to see the random-ass stuff that I put out there. It's kind of interesting to see who likes my Twitter stuff now, because it's super varied. Really, really random. They're sometimes people that I know, which is kind of cool. I feel like more reach is better in that way.
Wilhelm (26:49.343)
Mm.
Matt And Sunil (27:00.878)
It's it's okay. It's fine for us to like blather on and chat shit But it's also kind of nice for other people to hear that which I had recently Yeah, it was a family friends dinner Mate of her parents works at Google and had listened to our podcast. shit. Yeah Okay, getting reach pretty high up at Google as well. I like it
Wilhelm (27:23.059)
It turns out that out of the 100 people who listen, it's all the most important people in the world. Aren't we lucky? No, that's sick. It's the decision makers. Yeah, yeah.
Matt And Sunil (27:28.994)
Hahaha
Matt And Sunil (27:38.526)
Yeah. Yeah, I don't know what's going on with Google at the moment. You know, they pitched me their A2UI. It's kind of like their version of JSON render generative UI. I have thoughts on generative UI. Do you want to? Let's hear it. Do you want to go? Yeah, sure. You have thoughts on generative UI. I don't have thoughts, but I can feel it's coming. It feels weird for me to say my thoughts on generative UI when you're sat next to me. Why? I haven't done shit in generative UI.
Wilhelm (27:40.457)
Nice.
Wilhelm (27:53.747)
Let's hear it.
Matt And Sunil (28:06.6)
But, like, I'm getting ready to take credit for whatever you say right now. I'd be like, yeah, that sounds about right, I've been thinking about it. Okay. Well, my thoughts are, so JSON render, we talked about JSON render before. Okay, well, JSON render is this, like, declarative generative UI framework where you generate JSON, then you map your JSON to some, like, component or some HTML or something like that. And it's like a way that LLMs can build applications, like, basically
Wilhelm (28:18.389)
I don't know anything about JSON render, no.
Matt And Sunil (28:35.822)
declaratively, like you map your button in JSON to your button component. And I mean, that's a very simplistic way of looking at it. And there's a bunch of other ones, like A2UI is like classic Google. It's exactly the same thing that they've called something slightly different and has a slightly different view on things. My feeling is this is all quite backward-leaning.
When we were working with LangChain in 2022, I think, when I was in consulting and we were doing some stuff with them and I was working on Quiver, which was very much powered by LangChain, I remember them looking super closely into more generative UI stuff and literally coming up with the same idea. We'll generate JSON, function calling as a thing, we'll return JSON, and then we'll map the JSON to UI components.
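To make the "map the JSON to UI components" idea concrete, here is a small sketch. It is not the API of JSON render, A2UI, or LangChain; the `UINode` shape, the `registry`, and the two components are all made up for illustration, and it renders to HTML strings rather than real framework components.

```typescript
// Hypothetical declarative UI: the model emits JSON, and the app maps each
// node type to a component it already trusts.

type UINode = {
  type: string;
  props?: Record<string, string>;
  children?: UINode[];
};

type ComponentRenderer = (props: Record<string, string>, children: string) => string;

// Escape text so model output cannot inject markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The registry is the whole vocabulary the model is allowed to use.
const registry = new Map<string, ComponentRenderer>();
registry.set("card", (props, children) =>
  `<section><h2>${escapeHtml(props["title"] ?? "")}</h2>${children}</section>`);
registry.set("button", (props) =>
  `<button>${escapeHtml(props["label"] ?? "")}</button>`);

function render(node: UINode): string {
  const component = registry.get(node.type);
  if (component === undefined) {
    // Anything outside the registry is rejected rather than rendered.
    throw new Error(`unknown component type: ${node.type}`);
  }
  const children = (node.children ?? []).map((child) => render(child)).join("");
  return component(node.props ?? {}, children);
}

// Pretend this JSON string came back from a model's function call.
const modelOutput =
  '{"type":"card","props":{"title":"Sign up"},' +
  '"children":[{"type":"button","props":{"label":"Continue"}}]}';
const spec = JSON.parse(modelOutput) as UINode;
const html = render(spec);
console.log(html);
```

The registry doubles as the limitation raised in the conversation: the model can only ever compose what the registry lists, whereas a model that writes code against a component library is not bound to a fixed vocabulary.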
Matt And Sunil (29:35.394)
I guess I've spent the last year or so running away from DSLs and declarative stuff, because when the models get better, they inherently become limiting. So I guess my vision for generative UI is that the model just writes code, maybe it's React, and then it just has access to your component library. It can import from your component library. Then you build the React and then you render the React properly.
I don't know if those are all the right words. I feel like, as soon as you go to do something declarative, someone else who does something in code with more degrees of freedom will end up with, like, a better, more vibrant, more rich response. Yeah. So as part of the dynamic isolates launch, right? We shipped a project called worker bundler and worker bundler was very simply, Oh,
If you have a script that has dependencies or JSX or TypeScript, it'll compile all of that for you. It'll even fetch the dependencies and bundle it all for you. But there's a feature that no one's noticed in the readme, which is A, you can give it multiple entry points. You can give it a client entry and a server entry. And B, it has like serving infrastructure built into the thing. Like you can serve an app and
I've been waiting like for like, so we have agents week coming up in, well, in a week actually. Once that calms down, I want to, yeah, I want to, oh, this is, so it used to be called developer week. This time it's called agents week. It's Cloudflare's big "let's bombard the timeline with a whole bunch of announcements" thing. Yeah.
Wilhelm (31:08.756)
Hell yeah.
Wilhelm (31:21.209)
Sick. OK, got it, got it. Yeah, I was familiar with developer week or like launch week or something like that, where it's like every day is a new announcement. Yeah, yeah.
Matt And Sunil (31:25.982)
Exactly. yeah, it's not one announcement. There'll be like a bunch of announcements every day. Matthew, not you, Matthew Prince, like saying the only, you should do launch weeks only when you have stuff worth filling up a week for and Cloudflare broad. Like it's exhausting to me even though I know like what's coming. Anyway, like once that.
Wilhelm (31:31.964)
Nice, nice. Okay, that's exciting.
Matt And Sunil (31:54.286)
It gets to be too much by like Wednesday, Thursday. You're like sure dude. Some magic. I can't say any of the things that we're announcing. I'm just gonna like by the time it's Thursday, you're like, okay, this is a lot. There's quite a lot of magic coming. Yeah, and it's very different from like other company launch weeks where they do like one product a day. We are like, yeah, just do five in the same day. Big deal. But like when Cloudflare does it, it's like, okay, here are like 30 announcements. It's a lot.
Anyway, point being, once that's done, I'm going to settle, I think, purely into... Like, in my head, I don't even call it generative UI. I just think of it like, let your agent build apps for you that you can interact with that has access to the rest of your state. Did you ever hear about Kenton's weird tic-tac-toe experience with code mode?
Wilhelm (32:42.28)
I did not.
Matt And Sunil (32:42.606)
So Kenton, the guy who made Workers, he built a little vibe coding environment. Okay. So the first thing he did is, Hey, can you make a canvas app with pens, color selection, all of that. So it did that. And he drew a Tic Tac Toe grid on it and put an X in the top left corner. And then he told the agent play Tic Tac Toe with me. The agent started generating code for a Tic Tac Toe.
And Kenton stopped it. He's like, no, you're not doing that. You have access to the state of the canvas. Look at it. And it is kind of crazy. Yeah, it looked at it. It's like, I recognize what these strokes are. It's a tic-tac-toe grid. And you put an x here. So I'm going to update by writing all of this is writing code. I'm going to write an O into the middle of the grid, into the state. There's no tic-tac-toe specific code anywhere.
And played Tic Tac Toe like with Kenton and Kenton won. And then later we are like, the model probably let Kenton win, by the way. I think I don't know if that's an alignment thing. It's like, human, win your game of Tic Tac Toe or whatever it is. Anyway, so I want to lean into this model of apps and experiences that you don't build the app and put it out there and say, OK, fine, you have an app. No, it's still collaborative with
everything else, like the state, internal state, what have you, finding out what the boundaries and guardrails for this are. That's my take on generative UI punk. Is it more sophisticated than yours? Yeah. No, it's not. You know the Tic Tac Toe game was mine? Did you make it? It was me. Wait, so what happened exactly? Okay, please correct the story. What's the story? So Kenton shared me this thing with the canvas.
And he was like, I forgot what he'd done, but he'd written some words on it and the model was just basically drawing backwards and forwards on the canvas. I was like, oh my God, the model can draw on the canvas. So I drew a Tic Tac Toe board because I'd seen, it was based on a tldraw demo I saw, ages ago, like a year ago maybe, where Lou drew Tic Tac Toe, like noughts and crosses, on the board. And then I asked it to complete the board, but I didn't have the correct...
Matt And Sunil (35:08.942)
I would say like what words? Incantation. Incantation. Yeah, that's the right word. I didn't have the correct magic in order to let the model actually do it. So the model then started to regenerate the thing and generated me a tic-tac-toe game. And then I tried to reverse it a couple of times and I didn't really manage it. Kenton turned up, looked at my logs and went, ha, that's a great idea. And then just did it right. I think this is the first time I've seen Kenton getting credit for like your work.
Wilhelm (35:34.43)
Nice.
Matt And Sunil (35:37.678)
because otherwise, it's always the other way around. Me and you are, we talk about how our career is just taking credit for Kenton's work all the time. This is the first recorded instance of it being the other way. It's amazing. I'm going to frame that. I have a screenshot of his message on time. Yeah, but it's kind of crazy because like, I think the thing I like about code and this is a $12 phrase that I love using is emergent behavior. Like the whole point is that it comes up with things in the moment that you hadn't
Wilhelm (35:51.092)
That's awesome.
Matt And Sunil (36:06.808)
planned before as a product manager or a designer, like who the hell knows? Like let's figure it out in the moment. And I'm, so every time everyone talks about chat is the new interface, generative UI, like, okay, fine, that's low level, but like the vision here is how are you playing the game with your LLM? And where does that lead instead of like a predetermined product experience?
Plastic or like malleable. So yeah, it's like the idea of that. Everything can change. I think, um, who is it? Of course, it was Torsten. He had a really good, I don't know if it was a blog post or a tweet, where he like defined that software becomes this like blunt force instrument and then people who use software are able to start with a blank sheet of paper and end up with something that they would like to use.
For me, the original essay about malleable software, which happened even before LLMs, is Geoffrey Litt. I think he wrote it during his Ink & Switch days before he joined Notion. That's damn good. That's such a good read. I want to meet Geoffrey. Isn't he here? He might be. Everyone's here this week. Might be. I'm pretty sure the Amp people are here. See, it's Austin. Yeah.
Wilhelm (37:14.611)
So now that we're talking about Torsten.
Wilhelm (37:21.203)
It might be. Sick. That's great. Yeah. So speaking of Torsten, it's interesting because I feel like what you were just talking about is essentially like opening up the degrees of freedom, right? And Matt, you mentioned this as well. And the most interesting thing I feel like I've read maybe in like the last few months is something that was in Torsten's last newsletter, which was this blog post. It's like one of those blog posts that's like not formatted nicely. It's like old school. So, you know, it's going to be like very good.
And it talks about executable oracles. Did you guys see this? I'm going to do an awful job summarizing it, again to Matt's point. I should have prepared this probably better. But it kind of actually talks about getting the most out of LLMs. You want to like restrict the degrees of freedom as much as you can. Obviously this is not as like an end user experience, which I think is more what you're talking about. This is more talking about like,
Matt And Sunil (37:54.862)
Go on. No.
Wilhelm (38:18.055)
how do you get it to make really, really good software? So it's kind of more the like software factory line of work and inquiry and exploration. And yeah, I'll do a bad job summarizing. I think everyone should just read this post, but I think he's doing some very like low level stuff and in some very like niche field of computer science. And it's like, yeah, if you tell the agent to just go do the thing, it'll like do something, but it's not going to be great. But if you tell it,
you do the thing and then you also have this profiling tool available and it can't be slower, it has to be faster. Use the profiling tool constantly. And then you get better results, but only in that one direction. But if you give it this other constraint of this niche problem or whatever, then you have these two constraints and then within those two constraints actually it can do very well. So it's making a case about, I guess, shaping the LLM output or restricting it in such a way. Letting it be creative, I guess, in only
one dimension or as few dimensions as possible. And then it can kind of go rip and go really hard on that one thing, but it's constrained very intentionally and with very good purpose-made tooling. And I think this is really fascinating.
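The "two constraints" idea can be sketched as a gate that a candidate has to pass before the agent may keep it. This is not code from the blog post being summarized; `RunResult`, `sameOutputAs`, and `notSlowerThan` are hypothetical names, and actually running and timing the candidate is left out on purpose.

```typescript
// Hypothetical record of one run of a candidate implementation.
type RunResult = { output: string; elapsedMs: number };

// An oracle is an executable yes/no check the agent cannot argue with.
type Oracle = { name: string; check: (run: RunResult) => boolean };

// Constraint 1: behaviour must match a trusted reference output.
function sameOutputAs(expected: string): Oracle {
  return { name: "same-output", check: (run) => run.output === expected };
}

// Constraint 2: the profiler result must not regress past the baseline.
function notSlowerThan(baselineMs: number): Oracle {
  return { name: "not-slower", check: (run) => run.elapsedMs <= baselineMs };
}

// A candidate is accepted only when every oracle passes.
function evaluate(run: RunResult, oracles: Oracle[]): { accepted: boolean; failed: string[] } {
  const failed = oracles.filter((oracle) => !oracle.check(run)).map((oracle) => oracle.name);
  return { accepted: failed.length === 0, failed };
}
```

With both oracles in place, a faster but wrong candidate and a correct but slower candidate are both rejected, which is the narrow corridor the agent is then free to explore.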
Matt And Sunil (39:31.574)
I love it. Yeah, I love it. So Karpathy made a thing about this that he, I think it was in Python, but then Tobi, the Shopify CEO, made, I don't know if it was just Tobi, but... Are you talking about auto research? Yeah, made an auto research in Pi. And auto research is insane. The new redacted, I'm saying redacted, but there is a name for it, feature product that's coming out next week is...
was based, the initial POC of it was built with auto research, like just in a loop, being like, make your outputs the same as this thing, basically, slot fork this thing, make the performance better, make the memory usage better, with a bunch of other characteristics. And that's what's really cool about auto research, because it starts this research loop of being like, how do we profile all of these things?
Okay, we can profile all of them now. Now what's our aim? What's our goal for stopping? Okay, we have no goal. Let's just go and do as best as we can. And then it just runs it in a loop and it can either, it can decide which experiments it wants to keep or not. So every time it runs an experiment, it's like, is this better or worse? Keeps it or doesn't. And it's such a good plugin for Pi if you haven't played with it. I really would recommend. Sounds like what we were talking about.
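The keep-or-discard loop described here looks roughly like the sketch below. It is not the actual auto research code or the Pi plugin; `researchLoop`, the `Experiment` type, and the scoring convention (lower is better, like a runtime) are assumptions for illustration.

```typescript
// An experiment proposes a changed state based on the current best one.
type Experiment<S> = (current: S) => S;

// Run each experiment against the current best and keep it only if the score improves.
// Lower scores are better here, as with runtime or memory usage.
function researchLoop<S>(
  initial: S,
  score: (state: S) => number,
  experiments: Experiment<S>[],
): { best: S; bestScore: number; kept: number[] } {
  let best = initial;
  let bestScore = score(initial);
  const kept: number[] = [];

  for (let index = 0; index < experiments.length; index++) {
    const candidate = experiments[index](best);
    const candidateScore = score(candidate);
    if (candidateScore < bestScore) {
      // Better: this experiment becomes the new baseline.
      best = candidate;
      bestScore = candidateScore;
      kept.push(index);
    }
    // Worse or equal: the experiment is discarded and the baseline stays.
  }
  return { best, bestScore, kept };
}
```

There is no stopping goal in this sketch, matching the "we have no goal, let's just do as best as we can" description: it simply runs out of experiments.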
Wilhelm (40:53.138)
Yeah, no, auto research looks sick. Talking about Karpathy, I feel like actually, so again, I haven't read this. I feel like I saw it when I woke up this morning. But I think, did you see this LLM Wiki thing from Karpathy?
Matt And Sunil (41:08.962)
Yeah, could you explain that?
Wilhelm (41:10.994)
Okay, yeah, well, I don't know, I haven't read it. But it looks like I think if I had to put money on what's the most interesting thing on Twitter right now, I would put my money on that thing as opposed to the latest harness drama. Because I don't know, I think Karpathy is always a really interesting person to look to because he has like great overview, great like everything. And I was listening to a podcast with him which predates this latest work where he said something that feels true right now, which is that the powers are here, the agents are here.
If they ever fail you, it's a skill issue. Everything is a skill issue. It's just all skill issue right now. And we're all just having to wade through all of this, improving our own skills. But this latest thing, this like LLM Wiki thing, I have no idea what it is. I just glanced at it and it looks kind of interesting.
Matt And Sunil (41:57.816)
So basically you dump in all data about yourself and then you start interrogating it? Is that the thing? Yeah, and you constantly update it over time, right? Like you just keep some big piece of knowledge up to date. I saw the guy who made... Obsidian? The guy who made nights and weekends ages ago. Not hustle. Farza? Farza, I think.
The Buildspace guy. Yeah, the Buildspace guy. He published his version of it, like a Farza's wiki. And he just has years of information ready. That's just it. All these tools for thought junkies, their time might finally come. The productivity tool. The loves the productivity tool.
Wilhelm (42:33.052)
Okay, yeah.
Wilhelm (42:40.082)
Yeah, exactly. Exactly. Being an obsessive note taker. Well, that's the thing. The biggest unlock, I think, for Chad at the early days, as in like a month ago, was just dumping like my 10 years of writing notes into Google Keep. Like just random thoughts like, oh, like this would be a fun thing to build. All into this thing and then having it like deal. I just look at it. So yeah, I think I totally buy this and putting more order into it. And then I think I saw something in this LLM Wiki thing that was like, you have
these different research directions or interests that you have or ways you think about a problem space, things that are definitely not fully baked. I think this is another great thing that Karpathy is good at. He's very good at exploring the half-baked state of things or just here is a direction. And I guess this is like, again, I have read 5% of this gist where he puts this idea together. But you have just inklings of research directions or like
ideas or like potential things you want to explore and you put them all in and then you develop them with the agent or you develop each thing in parallel. And I think there was even a thing in there that like, if you're thinking about an idea, right, with you and your agent, you could like send a ping to your friend's agent, which is obviously of great interest to me, this whole agents talking to each other thing, and be like, okay, Will's thinking about this new idea, do any of like Matt's research directions in his
LLM Wiki agent thing line up with this? Like how would Matt and his prior art consider, like think about this direction? So I think it's really cool.
Matt And Sunil (44:10.574)
This is so funny. What? This is so funny because I was just at MCP Dev Summit and there was like a whole discussion that everyone was having about like prompt exfiltration. And you're like talking like how, what happens if you like only allow an MCP server for one thing or you only allow it in one area of your application, but then the prompt like exfiltrates like the data from that server. And you're just like, yeah.
What about if we exfiltrate it from the agent guys? What about if we just give it to everyone else? What happens if it's just like fully open? Like my agent's just telling your agent all my secrets and vice versa. It's like one agent net.
Wilhelm (44:50.054)
Hahaha
I mean, ideally you don't exfil all the bad stuff, right? But it's like me meeting you for coffee and being like, hey, I'm thinking about building this. How would you think about it? And you're like, actually, I've thought about an adjacent problem for like 10 years. And here are some interesting concepts in that domain. And then I've learned something. And now the agents can do it without us, assuming we have not too many skill issues.
Matt And Sunil (44:57.006)
You
Matt And Sunil (45:11.79)
My agent can talk to Will's agent. Should we make vector is really good is what's on my mind right now. Yeah, QMD. The whole idea of like QMD. When I saw QMD by Tobi, I didn't really understand, but he was like six months in front of me. Yeah. If not more. He like fully got that there will be people with data locally that agents will need to search over and like it slaps. I get it. I get it. Yeah.
No, so we need to get back to it again. So Will has this thing called friends, friends.FYI. And it's how like your agents can like... Really? Yeah. So I can like... Your agent can make a post request to Will's agent. Oh, I fucking want to reuse this. That sounds fun. It's so cool.
Wilhelm (45:40.764)
Hahaha!
Wilhelm (45:46.418)
That's right.
Wilhelm (45:55.068)
And even, you can even send messages without the other people having signed up first. It needs to be a bit, yeah.
Matt And Sunil (46:01.186)
Yes, my agent's never authenticated. So I need to work out how to authenticate my agent. But can you work out how to authenticate my agent with a GitHub token?
Wilhelm (46:10.083)
Yeah, yeah, this has been long coming. I've been trying to focus on other things, has been... Focus is a new superpower? Well, in this age, but maybe I'll just sit down and spend half an hour on making it so that agents can sign up to Friends.FYI without needing a human or a DCR.
Matt And Sunil (46:22.968)
Definitely.
Matt And Sunil (46:31.384)
No, no, so I want it to be connected. I want it to be connected to me because I have an account on friends. So it knows.
Wilhelm (46:35.737)
Yeah, it will be. It'll work how we discussed. It'll make a gist to auth you. But it can do that without having to... Yeah, it'll be a bit like the Moltbook signup was, but just GitHub-based.
Matt And Sunil (46:41.004)
Yes.
Matt And Sunil (46:49.166)
Yeah, I think that's gonna be so good. You're basically making Moltbook for GitHub.
Wilhelm (46:56.047)
Yeah, and agent and agent. I mean, did we talk about actors.dev, which this guy Ben Orenstein, who I think is really interesting and always at the forefront of everything is building?
Matt And Sunil (47:08.952)
No? What is?
Wilhelm (47:10.481)
Yeah, I think that's pretty cool as well. Which is actually very much in line with the t-shirt you're wearing, Matt. Can you just tell people what it says?
Matt And Sunil (47:18.71)
Well, my t-shirt is from the guys at MCP use and it says, make something that people crossed out agents want. Yeah, make something that agents want.
Wilhelm (47:29.081)
Yeah. Which is cool because in SF you see like YC guys wearing like a make something people want t-shirt, which is obviously like the main YC slogan, but make something agents want, I think it's the future we're going towards. And then this actors.dev thing is literally that. It's like a service for agents. Like it has a big banner that's like, this is not for humans. And then there's instructions for how an agent can sign up to get things that are useful for agents. For example, I think it gets like an email address and like a webhook sync.
Matt And Sunil (47:57.826)
okay, no, I saw the actors thing about emails. Okay, no, that was really cool. This just all reminds me of Rent A Human. You know the guy from Rent A Human got into YC recently? Okay, more YC drama. There's so much to fill you in on. Did you see the Delve stuff, Will?
Wilhelm (48:07.352)
nice.
Wilhelm (48:13.785)
I did see the Delve stuff. Yeah, yeah, yeah,
Matt And Sunil (48:17.41)
Yeah, wild. Did you see Elizabeth Holmes commenting on the Delve stuff?
Wilhelm (48:23.151)
Wait, Elizabeth Holmes?
Matt And Sunil (48:23.31)
How is she even allowed up? She's not. She has another person who posts for her. I need people to understand this, okay? Nine out of 10 posts from the Elizabeth Holmes account are not from Elizabeth Holmes. But it's weird that even 1 % or even 10 % so apparently, yeah. So like, no, she doesn't have access to it, but that's just it. They've clearly paid someone to like rebuild her image. Like, why does no one see this? I feel like I'm losing my goddamn mind.
Wilhelm (48:43.611)
Super.
Matt And Sunil (48:49.326)
Like, what? She's not tweeting from a fucking jail cell. I just imagine she has a phone in prison. No, it's someone else posting on her behalf. That's a guy who works at Turso who is legit in prison. Yeah, that's no, no, that's a completely different person. Yeah, obviously it's a different person. But how is that allowed? He's not allowed to go on Twitter. No, that person gets an hour or two hours every day or week or whatever and works on
only this, on Turso. Very good. I love that. I've spoken to Glauber about it. Amazing. Okay. This 600 tweets a day from the Elizabeth Holmes account is not Elizabeth Holmes. Why is it? I feel like I'm losing my mind. Everyone replies so earnestly and she's clearly like, I don't even know if it's a woman. It's probably a dude. Like what? We should do some language analysis on the. Maybe. Sure. But that's it.
She's clearly doing this image rehabilitation. She's probably out quite soon, right? Probably. Sure, I don't care, but that's not her, dude. I just... Okay, anyway. And that's just it. What are you talking about? Elizabeth Holmes quote tweeting the Delve apology post and saying, we should talk. I know something about this. I'm like, shut the fuck up. It's not you.
Wilhelm (49:58.535)
I hear you Sunil, I hear you.
Matt And Sunil (50:13.646)
I knew something was going to trigger you in this episode. I didn't know what it was. I'm so happy. The Holmes Twitter account pisses me off. Everyone's just enjoying the fact that they're talking to the Elizabeth Holmes and it's some comic book nerd that they pay, I don't know, like $50 a day to sit in his underwear and post on her behalf. Like, I don't know.
Sorry. Okay. This one like really gets me. So good. Fuck me. Yeah. Occasionally there'll be a post of her like with her child before she went to jail. Okay. It'll be like, I spent every moment with this baby because I knew... I'm like, that's not you posting. Like, you have a ghostwriter, so obvious. Never mind. Anyway, so you were talking about other drama. There's the Delve drama.
Tell me about the Rent A Human thing. I saw one founder like said, I've parted ways. I don't know if there's drama about Rent A Human. There isn't. I thought he just said. No, no, no.
Wilhelm (51:14.193)
She's out in 2031, or 2032 by the way, that sounds like.
Matt And Sunil (51:17.828)
that's ages. Okay, she's not already seen. I'm glad she went to prison for a while. What she did was messed up. like she was basically fraud, like medical fraud. Not basically, definitely. Yeah, yeah, yeah. Like people who had like, potentially had like proper problems. Yeah, horrific. No, so the Delve drama, well, I mean that was... That's really bad. That is so bad and kind of predictable.
Did you ever go through that SOC 2 stuff with PartyKit? I didn't do SOC 2, but I did some compliance stuff and I was like, why am I paying for all this shit? We did it with StackOne and we went around all of the compliance providers and we tried the one that was least dodgy because they all seem... It's weird. You pay the money and you get a stamp immediately. That can't be legal. The whole point of compliance is to do compliance.
Yeah, so that one that one that one's not ideal. I don't know. What was the other drama you had some more drama on Twitter? Me no, you. I don't know about you
Wilhelm (52:22.266)
The thing is I feel like there's so much exciting shit happening. Let's just get off Twitter and build random shit and not do the drama.
Matt And Sunil (52:29.538)
Go on then, Will, what have you been building? Go on, what have you been building? Then I'll tell you what my OpenClaw can do.
Wilhelm (52:36.218)
Why don't you go start and I'll have a think about what I should say.
Matt And Sunil (52:39.458)
Okay, so since my move to Portugal, I have been basically spending way too long on surf webcams, and all surf webcams take about a minute to load the page and then at least a minute of ads afterwards. And this was taking up too much of my life. So I asked my OpenClaw. That's not OpenClaw. I keep calling it my claw. I asked my Pi. I'm going to call it Pi because I don't like the OpenClaw connotation. Yeah, Pi legend. I asked my Pi to like,
Wilhelm (52:50.02)
and then
Matt And Sunil (53:09.996)
spin up a new container that has a browser in it, connect via Chrome debugger port, and can you just make a script that does this for me that goes to all of these places, screenshots it, and then tells me which one's good and then sends me the screenshots? And can you run it five times a day? And so that's actually a substantial amount of time. It looks at three places five times a day. It's minimum three minutes to look at each one.
So I got a fair bit of time back. I mean, I wasn't looking at it five times a day, but I was probably looking at it once or twice a day. And it also predicts like forecast in the future. So it gets my forecast every day, works out like tomorrow's gonna be an amazing kite session here. So basically just all of the admin around organizing my fun is just removed and I can work on the admin to organize my work and like I can.
Yeah, and it just fixes admin around fun. You know, organized fun? I hate the organized bit. I'm really bad at the organized bit, so I just get my OpenClaw to do it now. Get my Pi to do it. Yeah, a bunch of other stuff as well. I've got it down. No, sorry, go for it.
Wilhelm (54:09.242)
Yeah.
Wilhelm (54:17.902)
It's very good. I yeah, I feel like it's hard to talk about little use cases because they're often so personal and they're meaningful to yourself, but it's very hard to generalize and sell someone who's not personal agent-pilled being like, hey, it does these five sick things for me and it can do so much more. They're like, okay, well, I don't surf.
Matt And Sunil (54:25.577)
way too personally.
Matt And Sunil (54:38.766)
I guess some of the reasoning is like anything where you would just download an app to do one random odd thing, you'd like literally download an app just to do one random odd thing. I had a Whoop, not because like I cared about like improving my sleep, but because I wanted to see what my mates' sleep scores were and just laugh at them when theirs were worse than mine. That's like the only reason I had a Whoop, was the socializing thing about it. And so I've been working out how I can make my Pi do the same thing because everyone on
Wilhelm (55:01.402)
Nice.
Matt And Sunil (55:08.68)
who had a Whoop then still has a Garmin and gets a sleep score. I reckon you can be in my sleep score graph on Strava. I need to work out how to do OAuth to your Garmin watch and steal all your data. But once I work that out, once Gladstone works that out, we can set this up. So things like that where I literally had an app and I paid money for an app to do one thing, gone. I paid a bunch of money to do
Wilhelm (55:13.429)
man, that's a sick idea. I love that. I want to be in your graph, your sleep score graph.
Matt And Sunil (55:37.37)
to watch webcams, to skip the ads and stuff. I think like those things should still be supported because like those webcams are being maintained by people and I actually might put in my own webcam. So I think there are some things where you have to like go back and pay the cash in the end as well. things where I was downloading one app, what's another good example? I have loads of random ones. The macros app, sorry. Yeah, so I've got it doing my macros.
Wilhelm (55:41.38)
Yeah, yeah.
Wilhelm (55:57.402)
mean, my, my chat. yeah, go ahead.
Matt And Sunil (56:03.17)
So like food macros, like way better than MyFitnessPal ever was like years ago. Basically like Cal AI, you know how anything that was like a ChatGPT wrapper that people were like getting you to pay money for, that was like a bit too much money, like more than the total cost. Now you can just do it with your claw. Like I have a channel on Discord where I just post pictures of all the food that I want estimating and it keeps running totals. Like just organizes all of that stuff that I
kind of want to do to be a more healthy human but have been struggling to in the past few years.
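The "keeps running totals" part is simple bookkeeping once the agent has produced an estimate for each photo. A minimal sketch, assuming a hypothetical `MealEstimate` record per Discord post; the field names and the numbers in any example are made up, and the photo estimation itself is not shown.

```typescript
type Macros = { calories: number; protein: number; carbs: number; fat: number };

// Hypothetical record the agent would write for each food photo it estimates.
type MealEstimate = { day: string; label: string; macros: Macros };

function addMacros(a: Macros, b: Macros): Macros {
  return {
    calories: a.calories + b.calories,
    protein: a.protein + b.protein,
    carbs: a.carbs + b.carbs,
    fat: a.fat + b.fat,
  };
}

// Fold every estimate into a per-day running total.
function dailyTotals(meals: MealEstimate[]): Map<string, Macros> {
  const totals = new Map<string, Macros>();
  for (const meal of meals) {
    const soFar = totals.get(meal.day) ?? { calories: 0, protein: 0, carbs: 0, fat: 0 };
    totals.set(meal.day, addMacros(soFar, meal.macros));
  }
  return totals;
}
```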
Wilhelm (56:39.695)
My Chad already has access to my Garmin data, including my sleep data. Can I just ping it to you? Yeah, exactly. I can friends it to you.
Matt And Sunil (56:46.19)
You could friends it to me. I can make a cron to request it via friends from you and you can send it back to me. And you know the best thing about this is we don't even need a book. We don't even need a protocol. Whatever way you send it to me, it will work. Like, whatever way you pass that note to me
Wilhelm (56:55.843)
Yeah.
Wilhelm (56:59.309)
We can do push, pull, live, batch.
Wilhelm (57:08.815)
just like humans have done for millennia. Yeah, amazing.
Matt And Sunil (57:15.244)
It will get put in the bundle of other notes that my agent collects and it will work. It's amazing.
Wilhelm (57:20.739)
I'll tell Chad to send it in a slightly different format each day to keep it interesting.
Matt And Sunil (57:23.79)
Every day. Just a different anecdote. I can't wait to disappear into the woods and then go fuck all of you all. I'm going to make him a claw for his birthday. He needs one. I told you I've got that Mac mini sitting on my desk for the last like four or five weeks now. I'm going to bundle this version of my PiClaw up because I actually think it's really, really good. It's almost got all the features I think you'd need to have like a very generalizable
Wilhelm (57:31.503)
Nah, you'll be back within weeks.
Matt And Sunil (57:54.018)
Discord powered, purely Discord powered agent.
Wilhelm (57:58.164)
Question for you actually. I've been using your Chrome CDP skill because it's just way faster than like the cloud-controlled Chrome, but I need to keep re-authing it. Like, does it work reliably for you? Because for me like it just stops working and then I have to go on to the like screen share onto the Mac mini and then like allow it. Like I have to keep re-allowing Chrome to be controlled by this protocol.
Matt And Sunil (58:03.019)
yeah.
Matt And Sunil (58:22.286)
Okay, this is weird. This might be because you've got a Mac mini, or you might have to install like a particular version of Chrome. So on my Linux container, I just have the Linux server Chromium installed, and I have never even like connected it to a screen. Like I've never done it. just, when I start Chrome,
Wilhelm (58:45.327)
And just works.
Matt And Sunil (58:49.382)
When I start the Chromium thing, I have like allow debugging port, the flag, and I set the port on the flag and then I've never had to auth it. And I don't know if it's because I'm allowing it via the flag. So maybe you need to work out how you're opening Chrome.
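The flag being described is most likely Chromium's `--remote-debugging-port`, which exposes the DevTools protocol on a local HTTP port. A small sketch of the pieces involved, assuming port 9222 and a hypothetical profile directory; it only builds the launch arguments and reads the discovery response, it does not start a browser, and it does not explain the permission prompt on the Mac.

```typescript
// Launch arguments for a Chromium that accepts DevTools protocol connections.
// The profile directory is a made-up example path.
function chromiumArgs(port: number, profileDir: string): string[] {
  return [`--remote-debugging-port=${port}`, `--user-data-dir=${profileDir}`];
}

// Chromium serves browser metadata as JSON at /json/version on the debugging port.
function versionEndpoint(port: number, host: string = "127.0.0.1"): string {
  return `http://${host}:${port}/json/version`;
}

// The metadata includes the WebSocket URL that a CDP client connects to.
function extractWebSocketUrl(versionJson: string): string | null {
  try {
    const parsed: unknown = JSON.parse(versionJson);
    if (typeof parsed === "object" && parsed !== null) {
      const url = (parsed as Record<string, unknown>)["webSocketDebuggerUrl"];
      return typeof url === "string" ? url : null;
    }
    return null;
  } catch {
    // Not valid JSON, so there is nothing to connect to.
    return null;
  }
}
```

A script would fetch `versionEndpoint(9222)` after starting the browser with `chromiumArgs(9222, "/tmp/agent-profile")` and hand the extracted URL to whatever CDP client it uses.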
Wilhelm (58:53.508)
Yeah, I am.
Wilhelm (59:05.315)
Yeah, I mean, it passes the flag for sure, but it still puts up like a UI thing that's like, do you want to allow remote debug? OK, maybe I just need to use Chromium and Chrome has more protections.
Matt And Sunil (59:11.874)
Yeah, I've never had the UI thing. Well, yeah, maybe it's Chrome. Maybe Chrome has like protections and stuff. I was surprised when it worked because obviously I've got it on my Mac as well and I always have to press the button. And then when I got Gladstone to implement it inside the Pi,
Wilhelm (59:21.645)
Yeah, I'll try Chrome. Let's scratch out.
Matt And Sunil (59:37.39)
I was super surprised that it just worked and it was way better than any other option I had. I started using, there was a few actually, and I feel like this is an area that's gonna get better. So Chromium running takes, it's like a few gigs of RAM. My Pi has only got eight gigs of RAM in total. And so it actually like can kind of struggle sometimes just with a couple of tabs open. I wanted to use...
like Lightpanda or what? Yeah, use Lightpanda. But Lightpanda doesn't work for my use case. Lightpanda is a new browser that's specifically designed for headless applications, but it doesn't have a graphical representation. What about, why don't you ping Andreas Kling and see what's the browser he's building?
Wilhelm (01:00:07.833)
What's Lightpanda?
Wilhelm (01:00:12.655)
And wow.
Wilhelm (01:00:22.593)
Ladybird.
Matt And Sunil (01:00:23.734)
Ladybird. I bet he'd love to because it's in active development and it works for a bunch of complicated websites right now and I bet he's... Okay, maybe I'll try Ladybird then. Because my problem with this is it's very hard for me to debug when it breaks because I've never connected to a screen. So I kind of need it to just work and also I need it to be able to take screenshots, and Lightpanda doesn't support graphical representation yet. So it doesn't support outputting screenshots. You should think on...
Wilhelm (01:00:40.921)
Yeah, yeah, yeah.
Wilhelm (01:00:49.463)
Yeah. I've had some luck with Browser Use as well. I tried a bunch of the different browser companies, like a couple of other ones, but Browser Use was the best one, but I'm not sure how much I trust them. It's just like a YC company.
Matt And Sunil (01:01:00.994)
What is Browser Use? It's basically browser as a service. Is it a hosted browser, like externally hosted? Yeah, externally. Okay, nah, I want it all on my Pi, that's the whole point. I want to see how this stuff works.
Wilhelm (01:01:12.877)
Yeah, fair. So with your Linux Chromium CDP thing, do you have it do authed stuff as well? Or is it all unauthed?
Matt And Sunil (01:01:24.614)
So it has some passwords in variables and every now and again it manages to auth with them. And then it stays authed because I have like Chrome profiles in there. So the answer is yeah. It's just Chromium, right? So the answer is it can do it, but very often I get bot detected, because obviously it looks like a bot.
Wilhelm (01:01:38.617)
Got it. Interesting. You have Chrome profiles. That's cool.
Wilhelm (01:01:53.007)
Oh, what? Really? I thought that was kind of the point of running Chrome locally is that it's like the furthest you can be from a bot.
Matt And Sunil (01:02:00.962)
Like it's not headless, it is obviously Chrome, but it's Chrome running on a tiny version of Linux. It's like, it looks like a bot.
Wilhelm (01:02:07.19)
Hmm. Interesting. Maybe, Hmm.
Matt And Sunil (01:02:10.424)
So I do get bot detected a little bit. But for instance, it can sign into Twitter. Anything without two factor it can sign into. And anything with two factor, it just has to ask me for a code. So I can do it.
Wilhelm (01:02:17.965)
Yeah, yeah,
Wilhelm (01:02:24.322)
Yeah, that's fascinating. Okay, I've got to bounce. Any closing words? What are you talking about at AI Engineer?
Matt And Sunil (01:02:32.824)
Code mode something. Code mode. At MCP Dev Summit, there were five talks before me talking about code mode. Really? Everyone had done my talk by the time I stood up on stage. Maybe I should do a different talk. Yeah. I don't know.
Wilhelm (01:02:39.586)
Damn.
Wilhelm (01:02:44.314)
Yeah, Matt, wait, can you give a brief recap of MCP Dev Summit? You were in New York for like three days, right? Like what was it like? Was it good?
Matt And Sunil (01:02:50.114)
Yeah, I was there for the week, it was great. It was a really well run event. Great speakers. A lot of the maintainers were there. Like we had a few days of maintainer conference beforehand. Yeah, the future of MCP looks bright. I think it's basically just been adopted by everything. It's kind of wild. I like MCP, bro. I don't even understand the hate. Like once you dial it in and the new things are kicking in,
it gets really good. Okay, so we're going to get way more hate very soon because we're potentially going to break a bunch of stuff. There's a whole thing about sessions, where there's the transport-level session that you have to maintain to keep a persistent client-server connection, or to keep them in sync. And people then started relying on it for application state in the server,
as a way of the client sending the server a version of application state. Except some clients did it, but then some clients, ChatGPT, didn't do it ever and will never do it and publicly say that they will never do it. So then there's this difficulty where some servers work way better on some clients than others. And so we're actually just gonna get rid of sessions, I think is the big thing.
And so MCP is going to move completely stateless. And that means that some stuff that you were doing before, like elicitations and sampling where you had to keep state, the way that that's working is changing. Do you have two seconds for me to talk about that or are we done? Okay.
Wilhelm (01:04:27.212)
Yeah, yeah, yeah. No, and Stateless MCP sounds exciting. That's awesome.
Matt And Sunil (01:04:30.67)
It's going to be great. So how elicitations will work is, imagine you're inside a tool call. This is on the MCP server. You have your arguments to your tool call. Let's just call your argument first name. And then you get inside your tool call and the server developer is like, shit, I need to know the client's, or the user's, last name as well. Well, what do they do? Previously, they'd have done like an await, an elicitation back to the client,
and then the client would answer and then the message would be sent back on the open stream and you'd continue that tool call from that line of code, right? You'd await the last name from the client and then you'd carry on. And if it timed out or the connection closed or something, your tool call was dead. Yeah, do you remember? That's how it was before. What's gonna happen now in the future without those open persistent streams?
Wilhelm (01:05:21.112)
Yep, yep, yep, yep, yep, yep, yep,
Matt And Sunil (01:05:28.344)
Without open streams, you'll get the first name in the tool call, the same as before. And then if halfway through the tool call the server's like, oh my god, I need the last name, you would do something like: if not last name in the arguments, return some type of special elicitation error. So you would return a special error, which goes back to the client, and the client would
Wilhelm (01:05:51.628)
Nice. Yep.
Matt And Sunil (01:05:56.556)
that elicitation error would contain the elicitation schema and then the client would render the schema, and there's no persistent connection, it's all closed. The client would render a schema, the user would fill in the last name, and then it would make a new tool call with the last name, and then the first few lines of code that were like, if not last name, would now pass and you'd get past that section. So it's called MRTR or something.
multi-turn, empty, multi-turn, something or other. And it makes for very strange looking code because you're basically validating that some of your arguments exist or not, knowing that they won't exist. And it means that if you do mutations or something before the arguments exist, you're potentially, that line of code might be executed twice, if that makes sense, per tool call, essentially. Like twice per tool execution, two tool calls per execution.
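The flow Matt describes can be sketched roughly as below. This is a minimal illustration in plain Python, not the MCP SDK or the final spec: `ElicitationRequired`, `greet_tool`, and `client_call` are all hypothetical names invented for the example, and the real error shape and schema format may well differ.

```python
from typing import Any, Callable


class ElicitationRequired(Exception):
    """Hypothetical 'special error' that carries the schema the client should render."""

    def __init__(self, message: str, schema: dict[str, Any]) -> None:
        super().__init__(message)
        self.schema = schema


def greet_tool(args: dict[str, Any]) -> str:
    """Hypothetical tool handler. The guard at the top runs on every call."""
    if not args.get("last_name"):
        # No open stream to await on: fail fast and tell the client what is missing.
        raise ElicitationRequired(
            "last_name is required",
            schema={"last_name": {"type": "string"}},
        )
    return f"Hello, {args['first_name']} {args['last_name']}"


def client_call(
    tool: Callable[[dict[str, Any]], str],
    args: dict[str, Any],
    ask_user: Callable[[dict[str, Any]], dict[str, Any]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Hypothetical client loop: every attempt is a fresh, independent tool call."""
    for attempt in range(1, max_rounds + 1):
        try:
            return tool(dict(args)), attempt
        except ElicitationRequired as err:
            # Render the schema, collect the answer, then retry with merged arguments.
            args = {**args, **ask_user(err.schema)}
    raise RuntimeError("tool still needs input after max_rounds attempts")


result, tool_calls = client_call(
    greet_tool,
    {"first_name": "Ada"},
    ask_user=lambda schema: {"last_name": "Lovelace"},
)
print(result, tool_calls)
```

Here one logical tool execution takes two tool calls, and the guard at the top of `greet_tool` runs both times, which is the "executed twice" behaviour described above.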
Wilhelm (01:06:26.158)
Interesting.
Yeah, yeah,
Wilhelm (01:06:47.662)
interesting.
Matt And Sunil (01:06:53.934)
So there are some weird things there, but in general I think it's gonna be a massive improvement, and since LLMs are writing all the code anyway, I think they'll work out how to do mutations properly. But will you stay at the center of you for the win?
Wilhelm (01:07:06.284)
Yeah, nice. Just stateless, simple MCP. Sounds great. Comeback is real.
Matt And Sunil (01:07:11.084)
Yeah, it becomes less simple and you have to think about it in a different way, because for that one tool execution, the top bit of that code was called twice. So there are going to be some gotchas and we are going to break everyone's elicitations. Yeah, I think so. And it is going to be a rewrite.
Wilhelm (01:07:23.022)
Sure, yeah.
Wilhelm (01:07:28.142)
But it's like if you submit a form, right, and there's an error, then you submit it again and the validation runs again.
Matt And Sunil (01:07:34.25)
Exactly. So to me, it makes a lot of sense. For a client, it's going to make sense. For the server developer, there's just the idea that that bit of code might be executed twice. Like you can imagine having an elicitation at the bottom of a massive tool call. And now you're like, everything that happened before, make sure that that's stateless or have some sort of transaction lock, which gets complicated. Idempotency. Yes, exactly. Yeah. So.
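One way to picture the idempotency concern is the sketch below. It assumes the client sends a stable key (`request_id` here) that it reuses when it retries after an elicitation; that key, the in-memory store, and every function name are hypothetical stand-ins for illustration, not part of MCP.

```python
from typing import Any


class NeedsInput(Exception):
    """Hypothetical signal that the tool needs more arguments from the user."""

    def __init__(self, schema: dict[str, Any]) -> None:
        super().__init__("more input required")
        self.schema = schema


# In-memory stand-ins for a durable store and a real side effect.
completed: dict[str, str] = {}   # idempotency key -> stored result
side_effects: list[str] = []     # every time the mutation actually ran


def reserve_once(key: str, item: str) -> str:
    """Run the mutation at most once per key, and replay the stored result after that."""
    if key not in completed:
        side_effects.append(item)
        completed[key] = f"reservation-{len(side_effects)}"
    return completed[key]


def order_tool(args: dict[str, Any]) -> str:
    # This mutation sits above the elicitation, so it is reached on both calls.
    reservation = reserve_once(args["request_id"], args["item"])
    if "address" not in args:
        raise NeedsInput({"address": {"type": "string"}})
    return f"{reservation} ships to {args['address']}"


first_args = {"request_id": "req-1", "item": "book"}
try:
    order_tool(first_args)
except NeedsInput:
    # The client retries with the same key plus the newly collected field.
    outcome = order_tool({**first_args, "address": "10 Example Street"})

print(outcome, len(side_effects))
```

The top of `order_tool` runs twice, but the reservation is only made once because the second call finds the key already recorded. A real server would need a durable store and some handling for concurrent retries, which is where the "transaction lock" complexity comes in.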
Wilhelm (01:08:02.968)
I think Vercel owns that word.
Matt And Sunil (01:08:04.492)
Yeah, gone.
Idempotency. Why?
Wilhelm (01:08:11.352)
Well, because if you're not... no, I'm gonna cut this out.
Matt And Sunil (01:08:13.89)
Well, I'm slop forking idempotency this week. It's now mine. You've got to call it something else that's pretty similar. That's just it. @cloudflare/idempotency.
Wilhelm (01:08:24.424)
Yeah, yeah, there we go.
Matt And Sunil (01:08:26.7)
Maybe @cloudflare/potency. Potency. Potency, yeah. We don't deal with the impo... not impotent, the idempotent person. Our version is very potent. Potency, yes. We are all about potent things. Anyway, well, it's been a pleasure.
Wilhelm (01:08:36.91)
There you go.
Wilhelm (01:08:41.582)
Alright, peace. Been a pleasure. Thanks for coming on Sunil. Have a nice week. Bye.
Matt And Sunil (01:08:46.248)
Bye! I got the train book. Bye!