You've Been a Bad Agent

"Bullish on claude code"
"I found Soham in our ATS"
"these things have been like RLHF to fuck"

Vibe Tunnel - https://vibetunnel.sh/
Armin Ronacher on Simon Willison’s blog - https://simonwillison.net/tags/armin-ronacher/
Amp by Sourcegraph - https://ampcode.com/

  • Matt is finalizing his event for AI Demo Days.
  • Juliette completed a challenging marathon with significant elevation.
  • The tech news cycle is currently nuanced and interesting.
  • Soham's job application saga has sparked widespread discussion.
  • Ronacher's insights on Claude Code are valuable for developers.
  • Git worktrees can be beneficial but have their challenges.
  • Environment variables can complicate development processes.
  • Context engineering is crucial for effective AI interactions.
  • Talent poaching is rampant in the tech industry right now.
  • The competitive landscape allows companies to attract top talent easily.
  • Cursor is refining its features to improve user experience.
  • The competition between AWS and Cloudflare shapes developer tools.
  • Claude Code is designed to work closely with AI models.
  • AI can enhance productivity but requires careful oversight.
  • Vibe Tunnel offers a novel way to manage terminal sessions.
  • MCP servers need to return appropriate data for voice agents.
  • Quality checks are essential for AI-generated code.
  • Developers should prioritize and start small when building tools.
  • The integration of tools should be user-driven, not server-driven.
  • The evolution of MCP servers reflects the changing landscape of AI development.




Creators and Guests

Host
Matt Carey
ai engineer @StackOneHQ
Host
Wilhelm Klopp
building @kolo_ai

What is You've Been a Bad Agent?

Wil and Matt discuss tech, startups, and building really cool things with AI. Sometimes joined by (actual expert) friends.

Matt Carey (00:00)
you get me.

Wilhelm Klopp (00:01)
Are we on?

We're on.

Matt Carey (00:02)
Are we like, are we live?

Wilhelm Klopp (00:04)
We're live already, that's crazy. It's a real studio. How's it going, man?

Matt Carey (00:09)
It's good, dude. I'm just finalizing my event next week. AI Demo Days nine.

Wilhelm Klopp (00:14)
nice.

That's wild. Number nine. That includes the one in New York and SF. That counter, yep.

Matt Carey (00:18)
Yeah.

Yeah,

both of those.

Wilhelm Klopp (00:23)
And you're coming back to AI demo days in San Francisco, right? Or is that a secret?

Matt Carey (00:27)
Hopefully,

yeah, hopefully. We're not entirely sure when it's going to be yet. Like, it's September or October or November, but we're trying to time it with something else that's going on there. So yeah, we'll just try to work it out. Yeah, hopefully we'll be back. It should be good fun.

Wilhelm Klopp (00:40)
Nice.

That's awesome. It feels like it's been ages. I think it's only been a week, but it feels like it's been ages since I last saw you. Yeah, man, you throw me off my schedule like that. I'm gonna get lonely.

Matt Carey (00:48)
Dude, I think it's been a week and one day. ⁓

I've

missed you too. I've missed you too.

Wilhelm Klopp (00:57)
Wait, so, I feel like the big news of the week is that Juliette did like a crazy marathon and you were out there to support? What has your travel schedule been like?

Matt Carey (01:04)
There's a massive

news of the week. Yeah, she did 42 km with like an insane amount of elevation. I think almost 3,000 meters of elevation on Sunday. Yeah, and my other friend Nico, who is actually my colleague as well, he did a 90k on the Friday with 6 and a run. Yeah.

Wilhelm Klopp (01:13)
Mm-hmm.

That is wild. That is wild.

a 90k run. What?

Matt Carey (01:30)
with 6,500 meters of elevation. So I got to see him on the Friday and then I got to run around with Juliette on the Sunday. And in between there was like a half marathon that some of my other friends did as well.

Wilhelm Klopp (01:32)
What?

Wait, where was this big event? And you were doing it as well?

Matt Carey (01:44)
No, no, I didn't actually get a spot, so I was just there to hang out. It was really good fun. And my brother turned up, he came paragliding, and it was all cool. It was in Chamonix, in the Alps. Yeah, it's the Chamonix Marathon. It's like the locals marathon. So yeah, I think it's... Yeah, of course it does, because Every hill you go over is 1,000 meters, pretty much.

Wilhelm Klopp (01:47)
Mmm.

It was in Chamonix, right? I was gonna ask.

That's awesome. And of course it has 3000 meters elevation.

Right, yeah, I can see that.

Matt Carey (02:09)
Like every,

every ascent is somewhere between 900 to like 1200

But dude, there's been so much news we have we had loads. Yeah.

Wilhelm Klopp (02:15)
That's wild.

There's been a lot of news, yeah.

I think actually also quite interesting news. Not just new model drop that's gonna change your life. Drop everything, you're falling behind. But some actual, I feel like the news cycle at the moment is a bit more slow and a bit more nuanced, which is great because that's what the pod is all about.

Matt Carey (02:35)
Nice, yeah. Since, I don't know, Sonnet 4 dropped like three weeks ago. But that's old news.

Wilhelm Klopp (02:40)
That's

old news, that's ancient history. Before we move on though, can I ask how is Juliette recovering? Is it really? No way. ⁓

Matt Carey (02:48)
She was fine the next day. Yeah, she had

like slightly sore legs the next day, but I'm pretty sure I had more sore legs from running around after her and like the little runs I did around and about than she did. Yeah, it was kind of wild. I mean, it's a long way, but you're not going really fast, and because of the variation, it's actually just a big day out. So I think it's more brutal on like the lungs and all of that sort of stuff than it is on.

Wilhelm Klopp (02:57)
Ha

Matt Carey (03:15)
your legs much less than like a road marathon. Like a road marathon is brutal. Like she'll be the first to tell you like for days afterwards she couldn't really walk the last road marathon she did. But this is way more, it's way more of a big exercise.

Wilhelm Klopp (03:20)
on the legs. ⁓

Mm-hmm.

That's awesome. Well, pass on

my congrats to her. And are you still in France?

Matt Carey (03:32)
No, no, no, I got back on Tuesday. Tuesday night, Tuesday night. No, but was really good fun. It was really good fun. Yeah, no, her legs are fine. She's absolute. She's an absolute champion. Absolute beast. Yeah, man, I called her a beast in front of her grandparents and her grandparents are French. I think it translated very well.

Wilhelm Klopp (03:33)
Okay, nice.

Did you... that's

amazing. you call her a beast in French? What's the French word for it?

Matt Carey (03:53)
Oh no, I think I just said beast. But I've done this wrong before. We might need to get rid of this, but I've done this wrong before. I was, like, kind of taking the mick and I called her a tank in front of her mom, but you can call someone a tank in French and it's just not a very polite thing to say.

Wilhelm Klopp (03:59)
Thanks.

Matt Carey (04:10)
So yeah, they're good fun. No, I'm definitely gonna have to get rid of this bit.

Wilhelm Klopp (04:11)
Sure sure sure, yeah yeah yeah. That's... ⁓

No, I don't think you should. That's just... What is the French word for tank?

Matt Carey (04:19)
it.

Wilhelm Klopp (04:21)
Tonk. It sounds more impolite actually in French. ⁓

Matt Carey (04:28)
This is

their glass.

Wilhelm Klopp (04:29)
Anyway, okay, so there's been a lot of stuff happening. What has got you most excited from the stuff? Like, or what have you, I don't know, played around with already?

Matt Carey (04:37)
no dude, sorry, we have to talk about Soham. Right, how do you say his last name? Pa-Pare-Kee?

Wilhelm Klopp (04:43)
That's so unprerogative.

Matt Carey (04:45)
We have to talk about him.

This story is wild, okay? I'm gonna frame the scene. There is a man.

He may live sometimes in India, sometimes in the United States. He floats, he drifts, he's an insane engineer, an incredible interviewer, interviewee even. He gets all the jobs and he works at like minimum five companies at once. Yeah, it's actually nuts. There was like a, I guess like a whistleblower tweet from quite a big personality saying that he'd hired this guy

and fired him and given him a moral lecture about working at multiple companies. Well, the moral lecture thing seemed a bit pointless, but like, I just think it's amazing. It's so funny. And then the memes that have come off this have been so good. Like, he is like the most famous dude on the internet, or on tech Twitter, I guess.

Wilhelm Klopp (05:32)
Right now, yeah. What shocked me

is, so I found out about this in a very interesting way, because I was just not on Twitter yesterday, to be honest. I was having a great day. I was just writing some code, being productive, not getting distracted. I feel like focus has been a big problem for me actually this week. And it's just so easy to open Twitter and there's so much good stuff on Twitter, right? Anyway, so I was like, all right, no Twitter today, had a great day. But then a bunch of various group chats just started popping off about like, oh,

this guy, we fired him immediately, or like, yeah, he applied to us as well, but he didn't pass the reference check. And then I was like, hmm, there must be something going on right now. And then I opened Twitter and literally my whole feed is just full of content about this guy.

Matt Carey (06:12)
Yeah, mine as well. The whole thing, the whole thing.

We live in such a bubble, dude. And then, dude, I tell you, this morning I went into our ATS and he'd applied to us as well. Which was amazing.

Wilhelm Klopp (06:24)
It seems like

everyone has a story or knows someone who has a story about this guy like applying to them and clearly it seems to be working for him like he so far

Matt Carey (06:29)
But like the digger. Yeah. Okay.

There's two things I don't understand. The first is, well, he applied to us as well. Like he applied in May when we were like, like early May. weren't, it wasn't that long ago, but we weren't, we're not necessarily a very well known company. So my first question is like, how does he have the time to make all the applications? Who's doing it? And my second question is,

Wilhelm Klopp (06:47)
Mmm.

Mm.

Matt Carey (06:53)
The application, I don't know who is being fooled by that application. It looks so dodgy. Like, it doesn't look legit.

Wilhelm Klopp (06:58)
really?

Matt Carey (07:00)
Yeah, I mean, we'd never have hired him because he didn't have a right to work in the UK and we were only hiring in the UK at the moment. So he was automatically rejected. But just in general, it looks super dodgy. Like the CV has him working at three different places at once. Like it's, yeah, it's like dodgy. And like none of the links work on any of the CVs. I mean, now his LinkedIn is deleted, but previously his LinkedIn said that he was like doing three different jobs and was like a lecturer or like an intern at university or something.

Wilhelm Klopp (07:05)
I see.

Does it actually? Interesting.

Yeah.

Matt Carey (07:27)
But I just think it looked really dodgy. And also he said he was in New York, but then all of his jobs were like in SF or wherever. But then he clearly wasn't because it was an Indian phone number. So I don't know. It just looked kind of weird. And you can see where people registered from as well.

Wilhelm Klopp (07:31)
That is a bit sus, yeah.

I see.

But

it though like we know a bunch of people, separately even, who interviewed him and it seems like he's a really good…like he passes all the interviews.

Matt Carey (07:55)
Yeah, dude. He must

be a sick engineer, like genuinely sick engineer, because the Digger team almost hired him, right? Like, he was there on the first day, and then they did a background check on the first day and they had to let him go because he didn't pass a reference check.

Wilhelm Klopp (08:02)
you

Right.

And there's like lots of people on Twitter who are saying, yeah, we actually did hire him. He worked for us for a bit. We realized pretty quickly he was working multiple jobs. Then we fired him.

Matt Carey (08:23)
Yeah,

and Antimetal, you know, Matthew Parkhurst, Antimetal, he was their first engineering hire. Yeah, dude, insane. He must be really good. I actually stalked... He's gone. I stalked his, I stalked his.

Wilhelm Klopp (08:29)
No way.

Yeah, this is... What I'm most

amazed by with this is that it's taken this long for everyone to talk about it and figure it out. It seems like he's been doing this for years. And now is the only time where everyone's compared notes and...

Matt Carey (08:50)
thousands of companies like say on Twitter right not that many like compared to the whole world not that many people are on that section of tech Twitter and there must have been I don't know a thousand comments below the guy's post being like yeah we we we hired him or we got a thing from him so like dude how does he have time to apply to thousands of companies get in to a lot of them and like

Wilhelm Klopp (09:08)
Yep.

Matt Carey (09:14)
do the work. I just have this theory that he must have some sort of outsourced dev shop. I don't know. But anyway, since this all came out, the Instructor guy is like just posting ways that this guy could make absolute bank. Like he should write a book about like passing exams. He should make a course, like all this stuff. I'm just like, Jason, you are a legend.

Wilhelm Klopp (09:19)
Yeah

Hahaha

Ha

Yeah, one of the funniest memes I saw, because you're right, the memes have been so strong, was a meme that was like, when the Cluely guy says, I'm gonna... Cluely is this very controversial company, right, where the slogan is like, cheat on everything. Like, I think they made this interview cheating software and then he got kicked out of Columbia University and now he's building a big company that's all about cheating on everything.

They raised a bunch of money from a16z. I think for his intern program this summer, he just hired 50 people who all had massive social media followings. And their goal was to get a billion impressions on Cluely or something like that. That's how they roll. The meme was the classic Office meme, which is the attractive guy says, you look nice today, to the lady.

And the lady's like, oh, you're so sweet. And then the less attractive, like big guy in the office says, you look so nice. And she's like, hello, human resources. And someone made a meme with this thing, like, okay, if the Cluely guy says I'm going to cheat on everything, that's the, oh, you're so sweet. But Soham says I'm going to cheat on everything, and it's like, hello, human resources.

Matt Carey (10:40)
Dude, I saw the funniest thing about Soham, which was like, you could not be pro Cluely and anti Soham. Like, it just doesn't... And then there was a repost of that from the guy that does Zero mail, Zero email, I don't know. He was like, dude, imagine explaining this to anyone who's not permanently online. Which I guess is the point of this podcast.

Wilhelm Klopp (10:47)
Riot.

Yeah,

yeah, yeah, that's true. That is true. Yeah, we can get very wrapped up in our own, get high on our own supply, I guess.

Matt Carey (11:14)
Yeah, well you had thoughts about what you wanted to chat about. We can put a pin in this.

Wilhelm Klopp (11:18)
Man, dude, there's, I feel like I was, yeah, I thought there was a lot of really, really interesting stuff over the past week. Where should we start? Okay, so actually, let's start with this. So, Armin Ronacher, the Austrian guy, creator of Flask, worked at Sentry for like a decade, did a really, really, really good talk about how he uses Claude Code and how he gets like the most out of it. And I think it's a very, very, very high quality talk and very...

like highly recommend everyone who like plays around with the stuff like we both do to like watch the whole talk. It had a bunch of attention already like I think Simon Willison covered it. I think it was on Hacker News as well. But I think there's just some like really really good specific stuff in there in the way that you rarely see like the specific stuff covered.

I think everyone's still trying to figure out how to get the most out of tools like Claude Code. And he just put a bunch of really, really interesting good stuff in there. Have you seen the talk?

Matt Carey (12:09)
I haven't seen it yet, but I've seen some of his tweets. So I'm relatively up to date with how he's thinking.

Wilhelm Klopp (12:12)
Okay, nice, nice, nice.

That's cool. So I have a bunch of notes we can chat through, and I'm curious to get your take on a bunch of this stuff. One of the things we talked about a few weeks ago is how in your rules file, you should describe the patterns that exist in your code base.

He talks about this as well. He says, yeah, in your CLAUDE.md file, put like, and one of the specific patterns he mentions is you need to give the agent a way to write one-off scripts.

Because especially Claude really enjoys writing one-off scripts to do things, which makes a lot of sense. But in some programming languages, I think maybe Python especially, that can be a little bit more tricky.

Matt Carey (12:53)
and take them off.

Can we go?

Wilhelm Klopp (13:07)
command

and you can add custom commands but like if you want to like access the models and like have your all of your Django settings loaded then they need to live in like specific like a specific folder that's like under a specific app that's called like management commands or management or something like that and

He was like, yeah, if you just tell the agent, put my one-off scripts in this folder, then obviously it can perform a lot better and it's not going to get stuck in some doom loop trying to write and then run a root-level script. So I think that's a really, really good example of something we talked about a few weeks ago, of having specific patterns defined in the rules file.
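
A minimal sketch of the pattern described here, assuming a Django project: one-off scripts live as management commands under an app's management/commands folder, so that manage.py loads settings and models before they run. The helper name `scaffold_one_off` and the example command name are hypothetical; `BaseCommand` and `handle()` are Django's own hooks for custom commands.

```python
from pathlib import Path

# Template for a Django management command. BaseCommand and handle() are
# Django's extension points for custom manage.py commands.
COMMAND_TEMPLATE = '''\
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "One-off script: {name}"

    def handle(self, *args, **options):
        # Settings and models are available by the time handle() runs.
        self.stdout.write("running {name}")
'''


def scaffold_one_off(app_dir: Path, name: str) -> Path:
    """Create <app_dir>/management/commands/<name>.py and return its path."""
    commands_dir = app_dir / "management" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    # Both package levels get an __init__.py so the command can be discovered.
    (app_dir / "management" / "__init__.py").touch()
    (commands_dir / "__init__.py").touch()
    script = commands_dir / f"{name}.py"
    script.write_text(COMMAND_TEMPLATE.format(name=name))
    return script
```

A rules file could then say something like "one-off scripts go in myapp/management/commands and run with python manage.py <name>", which gives the agent one known-good place to write and run them.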

Matt Carey (13:38)
That's right.

Yeah, definitely.

Definitely. Not one I really considered actually, because I use like TypeScript and Bun for most stuff. So I can just like tsx or bun run anything really.

Wilhelm Klopp (13:51)
Yeah.

and then they just live at the root level and it's chill. Yeah, that's interesting. Yeah, it does feel like Node and npm have a bunch of legs up on this stuff. I was doing a thing yesterday.

Matt Carey (13:57)
It just works.

Sorry to interrupt by that.

I guess even with like a compiled language, it's an absolute pain in the ass. Like if you're writing Zig or Rust or Go, writing a script is actually a pain. I didn't really consider that. Interesting, because you would normally compile your app into some sort of binary or some library or something like that.

Wilhelm Klopp (14:07)
Mm-hmm. Right.

Mm-hmm.

Right, right, right. Yes. And then like you probably want the script to access some of the code that's like in your code base and all of this stuff. then it becomes, yeah, it's like, dude, I just think there's like a real product to be built around helping people write great rules files or automatically writing them for them and keeping them updated.

Matt Carey (14:37)
Yeah dude, dude, that's what shippy was gonna be, or still should

be I think. Really should be like that. And if someone beats me to it, like fucking go for it please. Because it needs to exist.

Wilhelm Klopp (14:47)
Yeah,

I agree. And also, I just think you and I, we spend a lot of time thinking about the stuff and playing around with it and whatever. And even we're learning new stuff every day about how to actually get the most out of Claude or whatever. I think we're quite far away from just anyone being able to…

Like if you want to get a lot out of Claude Code or any of these tools, I think you need to put effort into writing your prompts. You need to like develop the skill. All those things. We're quite far from being just like, just go and like, in a code base that has no rules files, like I think you can easily like kind of shoot yourself in the foot. And it would be nice to make that easier for everyone.

Matt Carey (15:15)
Yeah.

Yeah,

yeah, 100%. 100%.

Wilhelm Klopp (15:23)
And then also related to this, I tried out Git worktrees for the first time actually yesterday, because Claude Code actually recommends using them now. And there's a whole article now in the Claude Code docs about Git worktrees. And I realized even that is a little bit painful in Python actually. So when you make a new Git worktree, it just tracks everything that's in Git, right? So for example, all your Git-ignored files don't get...

Matt Carey (15:28)
Yeah.

Wilhelm Klopp (15:47)
copied into this new folder that has your Git work tree. So there's no .env, there's no virtualenv. what's the... I was like, this is a bit more friction. But I guess it would actually be similar in Node because your Node modules wouldn't be copied over either, right? You'd have to like npm install and then run them again.
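
A minimal sketch of one way to reduce that friction, assuming the worktree has already been created with `git worktree add` and that the files worth carrying over are a short, known list. The function name `seed_worktree` and its default of `.env` are hypothetical; dependencies such as a virtualenv or node_modules would still need to be installed fresh in the new folder.

```python
import shutil
from pathlib import Path


def seed_worktree(main_checkout: Path, worktree: Path, names=(".env",)) -> list[str]:
    """Copy untracked config files from the main checkout into a new worktree.

    Returns the names that were actually copied; missing files are skipped.
    """
    copied = []
    for name in names:
        source = main_checkout / name
        if source.is_file():
            target = worktree / name
            # Create intermediate folders for nested paths like config/local.env.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            copied.append(name)
    return copied
```

Copying rather than symlinking keeps each worktree independent, at the cost of the copies drifting apart if the original file changes.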

Matt Carey (16:02)
Yeah, you have to run like npm i or something.

Yeah, that makes sense.

Wilhelm Klopp (16:08)
But then do you, you

have a .env file presumably. Yeah.

Matt Carey (16:11)
Sometimes, like,

there's a bunch of products that have been trying to get rid of .env files for ages. And I think they should do a better job because .env files are such a pain in the ass. Like, maybe I just haven't worked at companies where it's been big enough to need them, but we've always just like pasted and sent around .env files, except when I was working a lot with AWS, like SST.

They store yours much, much better. It's like stored in Secrets Manager. And then they have like this SSTSecrets abstraction, which populates your files. So it means that every developer gets the right secrets when they need it. And that's really cool. I just don't think there's, there's probably no money to be made

Wilhelm Klopp (16:41)
Mm.

Matt Carey (16:53)
There was one step where we got rid of the dotenv package. You no longer need that if you're running stuff with Bun. It just loads them automatically, which is great. That should exist. But then the next step is actually, can we load them from a centralized store, please? Why did GitHub not give us a first-class way of getting our secrets? Why?

Wilhelm Klopp (16:56)
Mm-hmm.

I say, yeah, yeah,

That's a great point.

Especially when the secrets aren't actually secrets, but it's like your local Redis URL. Your like local.

Matt Carey (17:17)
Bye-bye.

Dude, that's

just a fucking environment variable. That should be, like, sometimes that should just be in code, I think.

Wilhelm Klopp (17:32)
Yeah, yeah, I agree. I think like the .env makes it like quite simple for that not to be in the code, but then you're just sending around files.

Matt Carey (17:38)
Yeah, people just check it and they're like, dude,

I have stuff where it's like, env equals development. Like, why should that be in a .env file? Like, surely it obviously equals development because I'm running it on my local machine. And like, if it didn't equal development, then there would be something that tells it it's not development. Making all that as similar as possible. I feel like GitHub has really missed a trick there, not allowing.

Wilhelm Klopp (17:43)
Right, right, right.

Yeah.

Yep, yep, Yeah.

Matt Carey (18:05)
or not having some sort of like git hook that pulls the .env files and like populates them or yeah, I don't know.
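
A minimal sketch of the "that should just be in code" idea, assuming the development values are safe to commit. The variable names `APP_ENV` and `REDIS_URL` and the function `load_settings` are hypothetical; the point is only that local defaults live in code and a deployed environment overrides them.

```python
import os


def load_settings(environ=None) -> dict:
    """Read settings, falling back to local-development values when unset."""
    env = os.environ if environ is None else environ
    return {
        # Hypothetical names: only non-development deploys need to set these.
        "env": env.get("APP_ENV", "development"),
        "redis_url": env.get("REDIS_URL", "redis://localhost:6379/0"),
    }
```

With defaults like these, a fresh checkout runs locally with no .env file at all, and only real secrets need to come from somewhere else.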

Wilhelm Klopp (18:09)
Mmm.

If there's any GitHubbers

listening, like Jason, please, you know what to do. ⁓

Matt Carey (18:17)
just do that.

Also get rid of a whole class of SaaS products that shouldn't exist.

Wilhelm Klopp (18:21)
Okay, I want to keep

going with the stuff from this talk because it's very interesting. There is a whole, yeah, another thing in there was.

Yeah, you should teach the agent how to run a test, just put that in the CLAUDE.md file. And actually this brought me down an interesting idea or path of ideas, which I'm still kind of experimenting with, but I'm really curious to get your take. I think of this as kind of like agent health checks. So the idea is just like that you like.

Matt Carey (18:34)
Yeah.

Wilhelm Klopp (18:48)
In the conversation, you have a custom Claude Code slash command, and it's just like health check or something. And then it just tries to do all the stuff that the agent at some point presumably will need to do, like fire off a request to the running server.

Matt Carey (19:01)
Yay.

I'm going to raise you one. Okay. So open code, ⁓ released.

Wilhelm Klopp (19:06)
Ha ha ha.

Matt Carey (19:08)
OpenCode released hooks, like post-step hooks, right? And Claude Code now has them as well. But that's literally what these are. They're like, every time we do something major, run this stuff. And this stuff could be linting, could be formatting, it could be running tests, it should be all of these things. But if you're working with Cloudflare stuff, you should probably be running wrangler types to do type gen. Also, if you're working with tRPC or something like that, or you're trying to, yeah.

if you have anything generated, anything built, you should be running those all the time and that should be deterministic.

Wilhelm Klopp (19:41)
That's, so the hook stuff, I'm also really excited about. I put this on the agenda for us as well, and I didn't realize OpenCode actually beat Claude Code to it, but it seems like, yeah, it makes sense. So to be honest, this is something, this like deterministic command execution or whatever around the agent loop is something I've been like.

Matt Carey (19:48)
I'm pretty sure they were the first to have hooks. Yeah.

Wilhelm Klopp (20:01)
trying to, or like talking about for like months. It was a thing I was really keen on at Zed. So I'm really glad it's here. I haven't played around with the Claude Code thing yet, but I think it's going to enable a whole new thing. But also the thing that I'm thinking about with these health checks feels slightly different, because what I'm basically saying is, at the start of the conversation with the agent, actually have it make a request, observe the 200, have it run a test and observe that the test passed,

Matt Carey (20:05)
Yeah,

Wilhelm Klopp (20:28)
have it install a package and uninstall the package again, just to teach it as part of the conversation that it can do those things and how it can do those things successfully. The thinking being that then later on, it'll try to do it through the correct approach. And if something isn't working, it knows that it was possible before in this way. So it gives it just some additional context within the conversation.

Matt Carey (20:53)
So you're like,

you're like actually preloading, you're like trying to preload context almost. I've seen this new context engineering, like I don't know who coined context engineering, but this is genius. Like prompt engineering, I hated as a term. It always felt very clunky and childlike.

Wilhelm Klopp (20:59)
Yeah, exactly. within the conversa- Yeah. Yeah. Right.

Mm.

Matt Carey (21:15)
Just really, really awful. Whereas context engineering, dude, I can get behind that. Like everyone's heard that expression from that slightly annoying person at work, like, dude, I need more context. Like,

I need more context. What is it you're actually on about? Yeah, now you have to go into basics. Everyone knows what that means. Like context. I need to know what we're doing, why we're doing it, where this lies in my frame of reference of the world. And that's the best way to teach prompting. And if that's called context, I love it.

Wilhelm Klopp (21:26)
⁓ That's funny.

Mm-hmm.

Yeah.

Yeah, it seems like there was like one big blog post on me that sparked this context engineering stuff. I haven't read it yet.

Matt Carey (21:52)
There was

a Boris, Boris wrote like a pretty good, it was very short blog post, but it's been shared so much. Like I'm pretty sure like one of the heads at IBM shared it on LinkedIn. If you want an audience, that is an audience. So I think his has gone quite far, but I don't know who actually coined the term.

Wilhelm Klopp (22:10)
Yeah, it seems like it's really popped off in the past week or something. Okay, so what do you think of this idea of having the agent actually do these specific things at the beginning of the conversation? Do you think that, does that fit your mental model of getting the agent to do the right stuff?

Matt Carey (22:14)
Really, really.

Nah, it feels weird latency-wise. Maybe for async tasks, it would be cool to get the state of the world before it starts to try and do stuff. But I am not sure about synchronous tasks because...

Wilhelm Klopp (22:30)
Mmm, yeah.

Hehe.

Yeah, it takes

like 30 seconds a minute depending on how much you have to do.

Matt Carey (22:48)
Yeah,

and then you're also...

Like you're also massively filling a context window. Say it's like test, test running. Like you could fill a context window there with just each individual test passing when actually what would be nicer is if you had something like some pre-reported maybe, no, okay, right. Instead of the agent running it, if you had like a pre-reported health check that when you opened Claude or like some other agent, it automatically ran these things. And then it just said it was included in the

Wilhelm Klopp (23:15)
Mmm.

Matt Carey (23:18)
in the CLAUDE.md file that, like, these all passed. This came back with this result. But it wasn't the whole context, it was just like a minified version. That I could get behind.
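
A minimal sketch of that "minified" health check, assuming the checks are plain shell-style commands and that a one-line status per check is enough for the agent. The function name `run_health_checks` and the check labels are hypothetical, and writing the result into a rules file or wiring it to an agent's startup is left out.

```python
import subprocess


def run_health_checks(checks: dict[str, list[str]]) -> str:
    """Run each command and return one short status line per check.

    Command output is captured and discarded so the summary stays small,
    rather than filling the context window with every passing test.
    """
    lines = []
    for label, command in checks.items():
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            status = "ok"
        else:
            status = f"FAILED (exit {result.returncode})"
        lines.append(f"- {label}: {status}")
    return "\n".join(lines)
```

Called with something like {"tests": ["pytest", "-q"], "lint": ["ruff", "check", "."]}, this would yield two short lines that could be pasted into the rules file or shown at the start of a session.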

Wilhelm Klopp (23:29)
OK, interesting. So in your view, having something happen within the agent conversation itself, like within the user messages, is not more instructive to the agent than something that lives in the CLAUDE.md file?

Matt Carey (23:41)
Probably not. Like these things have been like RLHF to fuck to follow what's in the CLAUDE.md file. So I just stick it in the CLAUDE.md file.

Wilhelm Klopp (23:50)
Okay,

cool. That's a really useful take to hear. I found that sometimes it's just not sticking to what's in the CLAUDE.md file, which has kind of surprised me. This is... Yeah, it's not very long. It's pretty short. It's like... Yeah, and this is even with Opus. Actually, do you use Opus or Sonnet mostly for...

Matt Carey (23:59)
How long is the CLAUDE.md file?

I'm pretty much using Sonnet for coding and then Opus for a little bit of thinking, but I'm using Opus in Claude Desktop and I just get rate-limited all the time. It's like really annoying. So then I ended up using it in Cursor, but basically just starting with "you are not writing any code, this is a thinking session", and then I can actually chat with it, because it definitely has a bias in the system prompt to like want to write code and that's just annoying.

Wilhelm Klopp (24:09)
Mm-hmm.

I say I see you.

Mmm.

Interesting,

yeah. I upgraded to like the 10x max, like $200 a month. Exactly. And so far not hitting the rate limits on that, but I should give Sonnet a try.

Matt Carey (24:33)
The Max Plan. The Ultimate Plan!

Oh, have you

tried ccusage?

Wilhelm Klopp (24:45)
wait, is that the tracing thing?

Matt Carey (24:47)
Yeah, it's the cost observability thing.

Wilhelm Klopp (24:50)
Oh, it's cost, okay. I was thinking, wait, maybe this is something else.

Matt Carey (24:52)
Yeah, it's a cost thing for Claude Code and it's built by one of my colleagues. And Claude Code are like recommending it, from their official Twitter to people like, this is what you should use to track your usage.

Wilhelm Klopp (24:56)
No way.

That is super interesting. There is... ⁓

Matt Carey (25:06)
It's really cool. We

made it last week and it's already on 3,500 GitHub stars.

Wilhelm Klopp (25:10)
That is wild. The thing I was thinking about using is something I also learned about from Simon Willison called Claude Trace, which is an HTTP proxy against Claude Code. So you can see all of the requests it's making, like all of the prompts, all of the, yeah.

Matt Carey (25:12)
That's good, isn't it?

Woo!

That's cool. That's cool. Okay, Simon Willison, he

is so on the money. Like, I don't know how you can... I think I'm relatively up to date, and yet most of the time I'm reading his stuff to feel that up to date. How is he... How is he just like a couple... Maybe he's no more than a couple of days in advance, but he is a couple of days in advance consistently. It's very impressive.

Wilhelm Klopp (25:30)
Yeah, it's wild.

Yeah, yeah.

He, yeah,

the way I keep up with his stuff now is actually via RSS. Like I have a Slack channel where every time he posts something, it gets dropped in the Slack channel, which is currently the best way I have of.

Matt Carey (25:58)
Doesn't he post, like, every 10 minutes?

Wilhelm Klopp (26:01)
Yeah,

so this is actually really funny because he has recently started doing like a program around GitHub Sponsors where if you pay him $10 a month, he will send you a monthly roundup of all the most important stuff. So it's like the most interesting model because there's nothing in the roundup that you couldn't have read otherwise, but he just produces so much content that he's figured out a way to monetize sending you less.

Matt Carey (26:19)
Yeah. ⁓

He's like created a That's amazing.

Wilhelm Klopp (26:27)
It's very impressive. Yeah, I think it's very, cool. So I've subscribed now and

I think he's done two monthly issues so far. And yeah, it's actually quite nice to read and you feel like you get the most important stuff.

Matt Carey (26:36)
He's British, right? Does he live in San Francisco or does he live in UK?

Wilhelm Klopp (26:40)
He lives a bit south,

yeah, but he lives in the Bay Area.

Yeah, it's actually a really interesting story. I think he was working at a company. He's also one of the co-creators of Django by the way, like really wild. And I think they were working at a company called Lanyrd, like "lanyard", but the "yard" without the vowel. So like L-a-n-y-r-d. That was like an event tracking company. Like you could put in what conferences you're attending. I never used it, I think. And then they were acquired by Eventbrite, I think. And then a bunch of them moved to the US.

Matt Carey (26:49)
Yeah, yeah, I didn't know that. didn't know that. Yeah, wild.

So yeah, we've been very Python-centric. We've had the creator of Flask, the creator of Django, loving life.

Wilhelm Klopp (27:15)
Do you want to hear some more of the stuff from Armin? Because I think there's a bunch of other interesting stuff in there.

Matt Carey (27:18)
Yeah, I actually do. He's super

smart. I love his Twitter. It's probably one of my favorite ones.

Wilhelm Klopp (27:23)
Let me see if I can... Okay, so one of the most interesting things he talked about is this idea of having a unified log output from all the different parts of your application. All put it into one big log file and then give the agent a tool to read the last 10 or 20 lines of the log file. Which I think is also super smart. That's a great way for the agent to say, oh, what went wrong? Let me see what the errors were.
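A minimal sketch of the "read the last 10 or 20 lines" tool described here, assuming every part of the application appends to a single log file. The function name `tail_log` and the file layout are illustrative assumptions, not something taken from Armin's write-up.

```python
from collections import deque
from pathlib import Path


def tail_log(path: str, lines: int = 20) -> str:
    """Return the last `lines` lines of the unified log file at `path`."""
    log_file = Path(path)
    if not log_file.exists():
        # A missing log is reported as empty output rather than an error.
        return ""
    with log_file.open("r", encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent lines in memory.
        return "".join(deque(handle, maxlen=lines))
```

Exposed to the agent as a tool, this gives it a cheap way to check recent errors without reading the whole file into its context.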

Matt Carey (27:50)
Yeah, that's good. That's good. But that feels like a hack to me, because most of the time the reason you need to do that is because Cursor, for instance, in your IDE, which should have knowledge of your terminal, sucks at getting knowledge of your terminal. Like, automatic.

Wilhelm Klopp (28:06)
Right. This is actually

my top feature request for Claude Code right now would be to either manage terminal processes or, maybe a bit nicer, just have a way to intercept a running terminal process. Maybe I would do like Claude, I don't know, bash and then my command. And then it just monitors what's in there and still gives me my own terminal window, which has the output, like wraps it. Yeah. And then

Matt Carey (28:14)
Yeah, then it's...

I think

Claude should just have the ability to run terminal processes, but I feel like that's actually really hard to render. Like, it's quite hard to do.

Wilhelm Klopp (28:40)
Yeah, I think like also

Windsurf had this at some point or something. I remember seeing this but it was just it just felt a bit janky and like what if you want to like kill the thing right now you're like looking at open ports and like like yeah Yeah, yeah

Matt Carey (28:51)
Now you kill Claude, right? You can't just kill

all of it. Yeah, no, I don't know. Yeah, that's weird. And that is where the IDEs have massive advantages, right? From a user experience point of view, because they can run a terminal process and they can run it in any of the windows of the IDE and they can still intercept it. That's fine. They can read the output.

Wilhelm Klopp (29:04)
Yep.

Yeah, yeah, that's a good point. But that felt like a really smart idea. He even, I think, has a way of piping the browser logs back into that log file. So now you even have browser logs.

Matt Carey (29:22)
I saw

that as well, I saw that as well. That was really cool. That was really cool. Someone made a little package for that. He did like a request for a package. I love it when you've created something so influential. You can just like, now, if when you ever want anything else, can just be like, I request a package people. Like an hour later, someone had this like working package that intercepted.

Wilhelm Klopp (29:29)
Mmm.

Hahaha

Yeah, that's amazing.

Matt Carey (29:43)
which was insane. speaking of Claude Code, I really want to cover this because I think this has been very poignant in my brain in the last couple of weeks. What is going on with the talent poaching?

Wilhelm Klopp (29:54)
Oh my god. Yeah, so that's the one that shocked me the most, I think. Boris, different Boris to the Boris we mentioned earlier, and Cat moving to Cursor, which is, is it confirmed? Confirmed, okay.

Matt Carey (30:05)
they

posted about it. So Boris was one of the lead engineers on Claude Code at Anthropic and Cat was the product manager for Claude Code at Anthropic. And this team was tiny, right? I don't know how many people it was, but it wasn't huge. And they've both gone to Cursor. They haven't been at Anthropic very long. The talent movement is wild. I feel like the US allows this sort of stuff

Wilhelm Klopp (30:12)
Mm-hmm.

Mmm.

Matt Carey (30:29)
incredibly

well because very rarely will you have like, like very long non competes and all this stuff at some point. Yeah, yeah, yeah. Which that is nuts to me like.

Wilhelm Klopp (30:35)
Yeah, or they're not enforceable in California. I think non-competes at all.

Matt Carey (30:40)
I mean, yeah, it's very cool for stuff like this because there is a full, like, buyer's market. Like if you want to poach someone, it's just, how big's your wallet? Let's go. And then there was the other one as well. The Notion Mail, wait, the Notion Mail guys went to, I forgot where they went.

Wilhelm Klopp (30:55)
Mmm. Cursor? I think as well, Cursor, right? Yeah.

Matt Carey (30:58)
To Cursor as well. Yeah, okay.

Well, like that's also crazy because that's like a productivity move there from Cursor. I guess when you just raise a shit ton of money, you can do it if you want at that point. You can get the best people, you know.

Wilhelm Klopp (31:08)
Yep.

It is interesting, yeah, like I, I don't know, I really like Claude Code. And I don't know, it feels like the stuff that Cursor is doing always feels a little bit more distant to my day to day. But I don't know. Or to the, yeah, yeah, yeah, so I was a bit sad about this move.

Matt Carey (31:23)
like the development. No, no, I get that.

I

think Cursor have stopped shipping janky features as well. They're taking a little bit more time to refine them, because I do think there was a period around January time where every two days my Cursor just didn't work.

Wilhelm Klopp (31:34)
Mmm. Right.

Mm-hmm.

Matt Carey (31:43)
The next day was amazing, like best it's ever been.

Wilhelm Klopp (31:45)
I remember this. Yeah, yeah.

Matt Carey (31:46)
The day after it was just like completely broken. And they've kind of stopped doing that. And that's coincided with them hiring a bunch of design engineers and community-focused people who are posting a lot about new updates on Twitter that haven't made it to Cursor yet. Like they'll hack something out, post a tweet about it, and then it hasn't made it to, well, some of them do make it, but like I've seen some really cool stuff with like,

Wilhelm Klopp (31:49)
Hahaha.

Mmm.

Right.

Matt Carey (32:11)
queuing cursor agents that hasn't made it.

Wilhelm Klopp (32:13)
Mmm.

Matt Carey (32:15)
I think that's really interesting and maybe it just shows that they're becoming a bigger company. But you're talking about Claude Code. I really want to jump to something very quickly. I saw a tweet from Dax, of OpenCode and SST fame. He was saying, and I think this was him projecting something internally that he was worried about. He was talking about how AWS treated SST compared to how Cloudflare treated SST,

Wilhelm Klopp (32:25)
Mm-hmm.

Mmm.

Matt Carey (32:41)
trying to compare that to how Anthropic might treat OpenCode. The story is that SST is an infrastructure-as-code tool and it directly competes with AWS's CDK and CloudFormation, except it doesn't, right? Because you're still deploying apps to AWS, no matter what. But...

Wilhelm Klopp (32:47)
I heard this, ⁓

Yeah, yeah, yeah, yeah.

Matt Carey (33:03)
Like SST's life got very difficult when the CDK and the CloudFormation teams got angry at SST for sort of like taking their users. But they weren't, because those are free users. Those tools, all they're there to do is to enable people to use AWS, which makes the money. So you have this organization where these offshoots, and Claude Code is something similar, like from Anthropic, these offshoots, their only purpose is to drive people to the main moneymaker. In this case,

Wilhelm Klopp (33:19)
Right. Yep.

Mmm.

Matt Carey (33:31)
the Claude API. And I am interested in how this will go. So the other side of the coin was Cloudflare, who, even though they have Wrangler, their infrastructure-as-code, were super happy apparently to deal with and help SST make a very good Cloudflare experience, because they realized that getting more people to deploy on Workers is the way forward. And I guess

Wilhelm Klopp (33:54)
Yup.

Matt Carey (33:55)
Yeah, it's what they're trying to do and what they're trying to optimize for, not getting more people to use Wrangler. Like it's a different, it doesn't actually matter. Let's be honest. And I'm sure some people in the Wrangler team really want people to use Wrangler, but like if they use Workers, they use Workers. And so the interesting thing is, will Anthropic look at OpenCode and think this is a competitor to Claude Code, we must kill it as hard as we can? Or will they look at OpenCode and be like,

Wilhelm Klopp (34:06)
Yeah.

Matt Carey (34:20)
everyone on OpenCode is using Anthropic, we should help them make the best experience for Anthropic and drive them to use Anthropic.

Wilhelm Klopp (34:24)
Right.

Mm-hmm. Yeah,

it's fascinating, isn't it? It feels so hard to make a prediction with this, even though it feels like there are a bunch of data points that could help us. it's so hard to figure out what's going on inside other companies, right?

Matt Carey (34:36)
the edit-less way.

It's impossible. The Claude Code team, like two of their main people just left, but they're still shipping, right? So they must have diverted people onto it. I don't think they've hired specifically for that team. I haven't seen it.

Wilhelm Klopp (34:45)
Yeah.

Although I feel

like the team must have been at least like 10 people or something, right? I don't know.

Matt Carey (34:56)
Yeah, yeah. But like, you

lose your main PM. Like, that's quite a big shakeup to the team.

Wilhelm Klopp (35:02)
Yeah, I agree. No, 100%.

Matt Carey (35:04)
But we also

don't know when they were poached, right? They might have been poached a couple of months ago and they'd just been working out their notice period. Although in the US, like notice periods aren't really a thing, are they?

Wilhelm Klopp (35:14)
True, yep. That's interesting.

Matt Carey (35:16)
this sort of happens

like turnover just happens immediately. and ⁓

Wilhelm Klopp (35:20)
What's your take

on how Anthropic will view OpenCode?

Matt Carey (35:25)
I don't know. I do think like Dax was trying to compare that situation because obviously Cloudflare come off as the better player. And I think most people would want that Cloudflare comparison.

Wilhelm Klopp (35:28)

Matt Carey (35:35)
Maybe. But I don't think it's entirely legit. Like when SST first came out, it was a wrapper on top of CDK. And so its only purpose in life was to deploy to AWS.

Wilhelm Klopp (35:36)
Mm-hmm. Mm-hmm. Mm-hmm. Mm-hmm.

Yeah, yeah, yeah, yeah, yeah.

Matt Carey (35:49)
like its only

purpose was when they went to v3, because they were on top of Pulumi, now they could deploy to multiple different cloud providers. So they added support for Cloudflare, they added a bit of support for GCP, and they added support for random stuff like Heroku and things. And so, yeah.

Wilhelm Klopp (35:57)
got it.

I still love Heroku. I still use it.

Matt Carey (36:08)
This

is a different situation, right? Because OpenCode, like you can use any model in OpenCode. Like it runs, I'm pretty sure it runs on the AI SDK by Vercel. You can literally use any model. And so at the moment, Claude is the best model, but...

Wilhelm Klopp (36:12)
Mm-hmm.

Matt Carey (36:23)
that application layer is probably gonna be the sticky layer. And if Claude is no longer the best model, then it's a very easy switch. And so I can see them going directly after OpenCode. And also they've done some sneaky stuff, where Claude Code is usable for people on subscription models, right?

Wilhelm Klopp (36:40)
Yeah, my god, I saw this sneaky stuff.

Matt Carey (36:42)
they managed to replicate that. So OpenCode is now used

Wilhelm Klopp (36:42)
was... Yeah.

Matt Carey (36:45)
by people in the Max plan, which is cool, right? Like it should exist, I don't like, Anthropic, they're just losing money by doing that, right? And they're just giving away market share to a competitor without the like stickiness that this competitor is only going to direct people to us. Like it's, they're not a partner, they are a competitor.

Wilhelm Klopp (36:47)
Yeah.

Yeah.

So OK, this is really interesting. I feel like I'm quite bullish on Claude Code, even with the existence of OpenCode. And I think Thorsten and the Amp people have written a lot about this, how you end up building a tool or an agent that is really tuned to the model. And it feels like Anthropic had a long time ago gotten this idea that it's all about tool calling and tuning

Matt Carey (37:22)
Yeah.

Wilhelm Klopp (37:29)
the tools to fit well with the model and vice versa ends up making a really good agent. Whereas Gemini has not really got there yet, to be honest, even though the raw strength of the model is like really good. And you know, the OpenCode guys, it still is really bad. Yeah. And the OpenCode guys were like, oh, you know, one day soon Gemini will just add great tool calling support, and then that'll be a solved problem. Which maybe could happen, but I feel like there is something philosophically different about the approaches and

Matt Carey (37:38)
There's people that still suck ass. They're still old.

yeah, definitely.

Wilhelm Klopp (37:57)
What I like about

Claude Code is that it's tuned to the models and it feels like they're really investing in a very comprehensive approach for making a great coding agent from both the model and from Claude Code.

Matt Carey (38:09)
I've been really loving some of the research projects coming out of Anthropic recently. I think it speaks to how they do consider themselves Like I don't know how to explain it. the example I'm thinking of is where they got Claude to operate a vending machine. Did you see that?

Wilhelm Klopp (38:22)
yeah, that was wild, yeah.

Matt Carey (38:24)
which

is just wild. Like Claude lost loads of money, but it just operated a vending machine and worked out which things to buy. And then people prompt injected it and were like, buy me tungsten cubes or something, and it's going to be for zero pounds because I've got no money. And then Claude was like, yeah, of course I'll do that. And so it lost loads of money, but it's an amazing experiment and they're playing with it. And like, I feel like they must have a lot of harnesses internally to be able to play with the model in that way.

Wilhelm Klopp (38:34)
Yeah, I saw that. Yep.

Mm-hmm.

Mm-hmm.

Matt Carey (38:51)
But they're very thin harnesses, right? Because they're testing the model. Whereas what do you have with Gemini, right? You have an integration into Google Docs.

Wilhelm Klopp (38:56)
Right.

Matt Carey (38:59)
like a direct integration to Google Docs, but where's the integration for me to run a vending machine on it? Where is that? And internally, some teams will have something like that, the ones that work on Gemini, but I just feel like a lot of the teams in Anthropic are probably much closer to the model, much like the Claude Code team is going to be closer to the model than the equivalent team in Google. And that just, the organization is so much smaller.

Wilhelm Klopp (39:05)
Yeah, right, right, right.

Yep.

Yeah, right, right, right. Yeah, exactly. It feels like there's something just right about Claude Code. Sorry.

Matt Carey (39:31)
And they were

saying that with AMP, I don't know if they explicitly said this,

Wilhelm Klopp (39:38)
Hmm.

Matt Carey (39:40)
like Anthropic was the company that was much more willing to let them have private betas, let them get much closer to researchers to chat with people and do stuff like that. They tried with all of the other major providers and they were much more closed. You get a meeting with a salesperson. And I think someone at Sourcegraph has written publicly about that, because I have seen that recently.

Wilhelm Klopp (39:57)
Right.

and just like the

difference in approach of like...

Is your model being perfected to, like, one-shot a browser game, or is it made to use tools really well and navigate around a big code base? It's like very different approaches and it's great. I mean, the other hilarious thing that I feel like people have really started analyzing over the past couple of weeks is that you have these companies like Lovable, Bolt, Replit getting from like zero to like a hundred million ARR in a super quick timeframe, and all the big players

are deciding they want a piece of the action and making their own instant-deploy front-end web app kind of thing. Yeah, there's like 20 of these now. Which is like, okay, interesting. That clearly is something people are willing to pay for right now, but also it feels like very limited, right? You create this artifact and it's like a bit harder to achieve. Anyway, I don't know, I'm not saying anything new here.

Matt Carey (40:38)
Like the cool artifacts, like the artifacts.

I like the artifacts though. When they came out, I was making games that were just one file of HTML with CSS and JavaScript. I was making Connect 4 and chess and things, and then just downloading the artifact and sending it to my girlfriend, to her work computer, and she'd just open it and be like, here's a website that my boyfriend created for me. I was like, well, I didn't really, it just sent me a load of code. And then she was playing Connect 4 with her friends. I was like, this is sick.

Wilhelm Klopp (40:58)
Mmm.

That's cool.

That is great. I need to

try this. We should do a game jam, an AI game jam or something. That would be really fun.

Matt Carey (41:26)
We actually

should do an event like artifacts. I do think that's very, very cool. And yeah, I've just seen so many marketing pieces now, like, here is this company that went to like 50K ARR by building an app on top of Lovable, and this is the problem they solve. And it's like, man, I just don't care that much. Is that, was it?

Wilhelm Klopp (41:43)
Mmm.

I thought it was actually 50k ARR. it was, I think because that was a big

criticism of lovable and stuff so far, right? Show me one company that actually has made a product on top of this.

Matt Carey (41:56)
Well, maybe they're doing it because

it's a criticism. I just don't care, unfortunately. ⁓

Wilhelm Klopp (41:59)
Yeah, yeah, yeah. Right. Wait, what don't you care about?

Matt Carey (42:05)
What do I care about?

Wilhelm Klopp (42:06)
No, no, what's

the thing that you don't care about?

Matt Carey (42:09)
I just don't care about like the social proof that people can make applications on an app builder. Like, yeah, sure. You can make something that someone needs to pay for. Like, like.

Wilhelm Klopp (42:14)
I see.

Hahaha

Matt Carey (42:20)
Yeah, I'm sure there are loads of things that people are going to pay for. Like dude, someone would have paid for my Connect 4 game if I'd been in the right place at the right time. Like if I'd been sat outside a really fancy restaurant and there were like loads of screaming kids and I had a way to shut them up and it was a Claude artifact, dude, people would pay for that. Like it doesn't matter, like it's just, right, you solve a problem for someone, someone's gonna pay you for it if you provide value. And I don't.

Wilhelm Klopp (42:29)
Mmm.

Ha ha.

Mmm.

Matt Carey (42:44)
So that just annoys me a little bit. But it's a very, very unique annoyance, I think.

Wilhelm Klopp (42:49)
You know one word that we haven't mentioned so far in this podcast yet, which is very unusual for us.

Matt Carey (42:53)
Don't do it, don't do it, don't do it, don't do it, don't do it, don't

do it! We were so close man, we were so close!

Wilhelm Klopp (42:58)
We're so...

Well, I wanted to bring it up because I'm actually feeling really, like, more excited than I have been in a while about Model Context Protocol, about MCP. Because...

Basically, one of the decisions I was trying to make with what I'm trying to build with Kolo like the idea is to make like a really great like LLM first debugger for Python basically. Like that's the point of Kolo, give like deep, like really detailed debug output to LLMs and Kolo can like provide that really detailed output. But one of the decisions I was, I've been trying to make is like, should I like build some kind of cool web or terminal experience that like is like that.

kind of wraps Claude Code. So Claude Code is like a thing within my thing, or should I do it the other way around and really lean into being a thing that Claude Code invokes, like a CLI tool or an MCP server. And it's interesting because you do definitely get more control when you wrap Claude Code, I think. But also, it's just obviously way more work to build it that way.

So think that's like, yeah, there's pros and cons for both. I think the hooks thing that came out this week is actually a huge help in being able to build it into Claude Code, because now you can just deterministically run commands. I could tell my users, hey, just put this stuff in your hooks file, and then everything will work smoothly together. The other thing is, yeah, it's way less work.

to do it this way. I am just like a solopreneur, so it's very useful for me to have constraints, like not having to build everything. And building something into a platform is what I did with Simple Poll, and that worked really, really well, having it be just in Slack. And even though the constraints sucked sometimes, it was really, really helpful to make like an actual product.

And then also, maybe the biggest benefit, which I think is definitely underappreciated now because it's not real yet, and I'm a little bit hoping that it'll become real, is that usually when you build into a platform you get significant distribution advantages as well, right? That was actually one of the biggest things for Simple Poll's success: the Slack marketplace was sending so much traffic to Simple Poll as soon as it was listed, for a thing that people wanted. I am hoping and imagining, maybe I really need to speak to someone from

Anthropic to confirm, that they do want to turn Claude Code into a platform where there are Claude Code plugins of some sort. Maybe they're just wrappers around MCP stuff, right? But like

Matt Carey (45:19)
Yeah. Not even wrappers. So

Anthropic's making a registry API. They've talked quite publicly about this.

Wilhelm Klopp (45:28)
For

sure, they're making the MCP registry, but MCP and Claude Code are different teams, right? The MCP registry does not necessarily mean that Claude Code will have great support for the registry.

Matt Carey (45:38)
Yeah, dude, it will. Like it will. Like it's, it just will. Yeah. Like, like how hard is that going to be to do Claude add MCP from registry and then just give a registry URL? Like it's not going to be hard. Like that will happen.

Wilhelm Klopp (45:41)
Nice.

Totally,

yeah. But I think there'll be, like, if they really want to lean into this, I think there's a lot more that you can do, right? Like, you could have, you know, MCP servers that are made for Claude Code, that work, like, really great for Claude Code. You could have, like, I don't know, like, you could have, like, a proper marketplace that's, a Claude Code marketplace that's just powered by the MCP registry.

Matt Carey (46:11)
That interesting

what you just said there about MCP servers that are made for Claude Code. I was hearing from Angelo from ElevenLabs that they've had some problems with MCP servers that are made for voice, and they had to rewrite a bunch of stuff when they were building the 11.ai thing, that conversational AI agent.

Wilhelm Klopp (46:33)
Mm.

Matt Carey (46:34)
And the main reason is like most MCP servers will return

the agent will return text, right? And that text doesn't often sound very good if you speak it out. So like, it might have URLs in, it might have like a ULID in, like a really long string of alphanumeric IDs, it might have a table in it. Like that's not going to sound very good if the agent is reading out a table in Markdown letter by letter.
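A rough sketch of the kind of clean-up a voice-oriented server could apply to its text output before it is spoken. The function name, the 20-character threshold for "long ID", and the placeholder wording are all assumptions for illustration, not anything ElevenLabs has described.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Long alphanumeric runs (a ULID is 26 characters) are unpleasant to hear read aloud.
LONG_ID_RE = re.compile(r"\b[0-9A-Za-z]{20,}\b")


def to_speakable(text: str) -> str:
    """Rewrite tool output so a text-to-speech voice can read it naturally."""
    kept = []
    table_noted = False
    for line in text.splitlines():
        if line.strip().startswith("|"):
            # Collapse each Markdown table into a single spoken placeholder.
            if not table_noted:
                kept.append("(table omitted)")
                table_noted = True
            continue
        table_noted = False
        line = URL_RE.sub("a link", line)
        line = LONG_ID_RE.sub("an ID", line)
        kept.append(line)
    return "\n".join(kept)
```

A real voice server would more likely return different data in the first place rather than post-process text, but the filter shows which parts of a typical response are the problem.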

Wilhelm Klopp (46:40)
Yeah.

I see.

Yep.

Yep.

Matt Carey (47:03)
And so they had to really think about that. And I don't think anyone apart from them is thinking about MCP servers returning the right data for voice agents. And he was saying there should be something in the protocol which determines how the MCP server is going to be used. so the server would know like,

what to return, like whether it should return all the data or whether it should just return like a short text snippet. I'm going to go the other way and say that I don't think that's the protocol's prerogative.

Wilhelm Klopp (47:22)
Mm-hmm.

Matt Carey (47:30)
The protocol shouldn't have flags left, right and center determining what the agent is doing. It should be on the MCP server. If I release a StackOne MCP server, I should also release a StackOne voice MCP server, or a StackOne slim MCP server. Imagine Linux distributions for Docker. I release a slim version which just has what you need. I release a full version which has

Wilhelm Klopp (47:35)
Interesting.

Matt Carey (47:55)
everything and you now have to pick your tools before you load them in. I think people are going to do more like that. And so I am really interested. Want to see that develop so the server developers stop just wrapping APIs and actually do that properly, which is mostly my fault as well. And then
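One way to picture the slim/full/voice split is a single tool catalogue from which each server variant selects its tools. This is only a sketch: the `Tool` class, the variant names, and the example tools are hypothetical and do not use any real MCP SDK.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    handler: Callable[..., str]
    variants: frozenset[str]  # which server builds ship this tool


def build_variant(tools: list[Tool], variant: str) -> dict[str, Tool]:
    """Select only the tools that belong to one named server variant."""
    return {tool.name: tool for tool in tools if variant in tool.variants}


# Hypothetical catalogue; tool names and variants are invented for illustration.
CATALOGUE = [
    Tool(
        "search_employees",
        "Find employees by name.",
        lambda query: f"results for {query}",
        frozenset({"full", "slim", "voice"}),
    ),
    Tool(
        "export_payroll_csv",
        "Export payroll as CSV.",
        lambda: "csv...",
        frozenset({"full"}),
    ),
]
```

Calling `build_variant(CATALOGUE, "slim")` yields a server surface with only the search tool, while `"full"` keeps everything, which mirrors the slim versus full image idea.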

Wilhelm Klopp (48:02)
Hmm. Interesting.

Yeah.

No, that's interesting though, because

I feel like on that... So this is not entirely true. It's like kind of just for the sake of argument, but like you could make the case and it kind of is kind of true for Kolo that like the Kolo MCP server is essentially all that it is is like a CLI command that it calls.

like a really nice, like long description that is for the LLM, right? So like all that the, the Kolo MCP server tool calls are like they're literally just CLI commands. That's like the main thing. But then the other thing that kind of where MCP shines is that you can give a really like good description that obviously is tuned for an LLM, right? And I think it makes sense that like for different clients using different LLMs, you would want different descriptions, right?

Like that's why I like the whole going all in on Claude, and Claude Code, because I can tune my descriptions to work really well for what Claude will pick up.
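A small sketch of that shape: the tool call is just a CLI invocation, and the description is the part that gets tuned per client. The client names, description text, and the stand-in command are assumptions; this is not Kolo's actual CLI or MCP server.

```python
import subprocess
import sys

# Hypothetical per-client descriptions; the wording and client names are invented.
DESCRIPTIONS = {
    "claude-code": "Run the debugger CLI and return its trace. Call this after a test fails.",
    "default": "Run the debugger CLI and return its output.",
}


def describe_tool(client: str) -> str:
    """Pick the description tuned for this client, falling back to a generic one."""
    return DESCRIPTIONS.get(client, DESCRIPTIONS["default"])


def call_tool(command: list[str]) -> str:
    """The tool call itself is a CLI invocation whose stdout goes back to the model."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; a real server would call its own CLI.
    print(describe_tool("claude-code"))
    print(call_tool([sys.executable, "-c", "print('trace output')"]))
```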

Matt Carey (49:05)
Here we go. Well,

I think if I had a feature request, or maybe not even a feature request, I think this will belong to Anthropic eventually, but maybe an external company that has some free time could do this. You know we were talking about free ideas. Maybe, I don't know if it was the last episode of the WordPress, you always have free ideas. I have a free idea. This is a request. I'm doing an Armin-style request, which is, can someone make a Dockerfile for MCP servers?

Wilhelm Klopp (49:19)
god, yeah, we should talk about... Mmm.

You

Matt Carey (49:31)
because I hate the current JSON thing, it's awful. I want to do like load this server from this URL, include these tools, done. Load this server from this URL, include all tools. Load this server from this URL and this server from this URL, connect them together, here are the environment variables.

Wilhelm Klopp (49:41)
I see. Yep.

Matt Carey (49:51)
Like I want that to be part of the MCP protocol, part of the, it's not even part of the protocol. It's like a client situation. It's like a client thing. But if someone makes a standard for that, then the clients should follow that standard.
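To make the request concrete, here is a sketch of what parsing such a file could look like on the client side. The `SERVER`, `INCLUDE`, and `ENV` keywords are an invented format for illustration only; no such standard exists in the source, and the URLs are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ServerSpec:
    url: str
    tools: list[str] = field(default_factory=list)  # ["*"] means every tool
    env: list[str] = field(default_factory=list)    # names of required env vars


def parse_mcpfile(text: str) -> list[ServerSpec]:
    """Parse a hypothetical line-oriented file describing which servers and tools to load."""
    servers: list[ServerSpec] = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        keyword, *args = line.split()
        if keyword == "SERVER" and len(args) == 1:
            servers.append(ServerSpec(url=args[0]))
        elif keyword in ("INCLUDE", "ENV") and servers and args:
            # INCLUDE and ENV always apply to the most recently declared server.
            target = servers[-1].tools if keyword == "INCLUDE" else servers[-1].env
            target.extend(args)
        else:
            raise ValueError(f"line {number}: cannot parse {raw!r}")
    return servers


EXAMPLE = """
SERVER https://example.com/mcp/hr
INCLUDE search_employees get_employee
ENV HR_API_KEY

SERVER https://example.com/mcp/docs
INCLUDE *
"""
```

`parse_mcpfile(EXAMPLE)` returns two `ServerSpec` entries; a client that adopted a shared format like this could load the same file regardless of which tool generated it.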

Wilhelm Klopp (49:51)
Mmm.

Yeah, it's like middleware.

Wait, isn't this what you were doing with integrations.cool?

Matt Carey (50:07)
Yeah, no, kind of, I just feel like I've been making this, but I'm never gonna release this because I'm not a client. So I'm making this from like the server side and then creating like a GUI to allow people to basically generate this. But all I'm doing is generating this spec, but this should just be a spec in and of itself. So maybe it's my job also to make like an RFC and make one. But I think this exists, like an MCP file should exist.

Wilhelm Klopp (50:16)
Mm-hmm.

Yeah, interesting, interesting.

That's not that weird JSON format that you hit. They're all working on a format like this, I think, though. There is, I saw some. Yeah.

Matt Carey (50:35)
Yeah, the format is disgusting. the JSON format is hard to use. I saw that there's this DXT format.

which is like the Anthropic Desktop Extensions format, but it's like another classic Anthropic thing where it makes sense for that particular use case and now people are going to try and use it for this. And this is a different use case actually, because the DXT is a single click: MCP server developers can release a DXT, and then an end user downloads it, double clicks on it, and it adds the config to Claude Desktop. That's all it does.

Wilhelm Klopp (51:01)
Mm-hmm.

Matt Carey (51:11)
But it's not for people, like end users or even client developers, to modify, like to launch their MCP servers. It's a situation.

Wilhelm Klopp (51:20)
Yeah, I get you. I get you. Yeah.

Is this something you talked about at this MCP roundtable event that you attended?

Matt Carey (51:27)
Nah, just,

at the round table, I basically just got angry at anyone who suggested that they could wrap APIs and call them MCP servers. And I was like, dude, I've got 28 of them. You can go and look at integrations.cool. Just tell me which ones don't work and they'll be the ones where I've just wrapped an API. The ones that do work are the ones that you've played with and made very use case specific.

Wilhelm Klopp (51:41)
I see. Interesting.

Yeah, yeah,

yeah, yeah, yeah, yeah,

Matt Carey (51:48)
We talked a lot about

listing. You know, like in a RESTful API, a lot of times people will be like, here is the list of the resource and here is the get of an individual resource. Well, I think list is gonna just be entirely replaced by search.

Wilhelm Klopp (52:00)
Yeah, yeah, yeah, yeah, yeah,

Matt Carey (52:02)
will

never have a list because the list will always overflow a context window. Like, pointlessly. So it will always just be like a netminer search. Dude, sorry, I've been trying for ages. We're going to call it, are we?

Wilhelm Klopp (52:11)
That makes sense.
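
[Editor's sketch] The list-versus-search point can be illustrated with a small Python example. It is only a sketch under stated assumptions: the `Contact` type, the in-memory `CONTACTS` data and both function names are invented for illustration and do not come from any real MCP server.

```python
from dataclasses import dataclass
from itertools import islice
from typing import List


@dataclass
class Contact:
    id: int
    name: str


# Made-up in-memory data standing in for a real backend.
CONTACTS = [Contact(i, f"Person {i}") for i in range(1, 5001)]


def list_contacts() -> List[Contact]:
    """REST-style list: returns every record, which can overflow a context window."""
    return CONTACTS


def search_contacts(query: str, limit: int = 5) -> List[Contact]:
    """Search-style tool: returns at most `limit` records whose name contains the query."""
    q = query.lower()
    matches = (c for c in CONTACTS if q in c.name.lower())
    return list(islice(matches, limit))
```

The search version bounds the size of the response no matter how large the underlying collection grows, which is the property the list endpoint lacks.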

I'm good. I could keep chatting with you for another three hours. Yeah, I feel like you should write a blog post about this, like how to build, like yeah, like don't just wrap an

Matt Carey (52:19)
Yeah, we could keep going. Any last bits you want to chat about? What is vibe?

Wilhelm Klopp (52:32)
API, like I'd be interested for you to explore this from a few different angles. Like what does a good MCP server look like? What is a use case specific tool? The search versus listing. I think there's a lot of interesting points in there.

Matt Carey (52:43)
And then I kind of want to do another blog post about the MCP file idea.

Wilhelm Klopp (52:47)
Yes. So the thing I was thinking about, I think the people working on the Cloud Registry, I think the file format that they're working on is for an MCP server to specify what the MCP server is for the purposes of the registry.

Matt Carey (53:02)
So Smithery have

something similar, right? They have a YAML file where you can specify how to start the server. Because Smithery, it clones the repo and then starts a Docker container. And so they convert the smithery.yaml file into something that they can run with Docker.

Wilhelm Klopp (53:05)
Mmm.

I see.

Matt Carey (53:17)
And so I'm assuming they're gonna all converge on something similar. But I don't want to look at it from that side. I want to look at it from the client side. Like the client shouldn't be installing random mcp.json. They should be actually working on like a more structured format. And I'm sure Cursor has something like this, no, Claude definitely has something like this under the hood because you can turn off and on individual tools.

Wilhelm Klopp (53:23)
Yeah, from the client's side, yeah.

Right.

Yeah, more tool format.

Can you in Claude Code?

Matt Carey (53:46)
In Claude, not in Claude

Code, but maybe in Claude Code, but in Claude, that's how you can.

Wilhelm Klopp (53:50)
That's cool. Yeah, interesting. It kind of messes up the server if there's some composability stuff, right? When it's like, call this tool and then call this tool. If there's some instruction like that, and you've turned off the tool, then it's a bit messy.

Matt Carey (54:01)
Yeah.

I don't

think you should ever have a call this tool and call this tool. That's actually another one of my hot takes. It should be use case specific. If it's a call this tool and call this tool situation, it means that should be one tool.

Wilhelm Klopp (54:10)
Okay, yeah.

But I guess the nice thing about, I mean, in theory, like the idea of MCP at like a super philosophical level is like, you just provide capabilities and then whatever use case you have, the capabilities can be used to achieve the goal, right? Like in theory that makes some sense to me, right?

Matt Carey (54:35)
in which case it should be the user specifying call this tool, then call this tool. It shouldn't be the server developer.

Wilhelm Klopp (54:41)
No, no, sure, sure, sure. I think I don't have any specific thing in mind with that. It's just I think I've seen some people discuss like tool composability.

Matt Carey (54:46)
But it's definitely a thing. People have made it

so that server developers have to call one tool and call another tool. And I think that is fundamentally flawed. It's wrong. If a client decides that they just want to call one tool, they should just be able to call that one tool. They should have no instructions that they need to call another tool. And if it's necessary that they always have to call one tool before calling the rest, then that should be integrated into the protocol at more of a fundamental level.

Wilhelm Klopp (54:53)
Right, right, right. Yeah.

Mm-hmm.

Yep.

Matt Carey (55:14)
That's my feeling.
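
[Editor's sketch] A small Python illustration of the "that should be one tool" take. It assumes a made-up billing backend. The data and all three function names are hypothetical and only contrast a chained design with a use-case-specific one.

```python
from typing import List, Optional

# Made-up data standing in for a real billing backend.
CUSTOMERS = {"acme": 1, "globex": 2}
INVOICES = {1: ["INV-001", "INV-002"], 2: ["INV-003"]}


# Chained design: the server's tool descriptions would have to tell the model
# "call find_customer_id first, then call list_invoices".
def find_customer_id(name: str) -> Optional[int]:
    return CUSTOMERS.get(name.lower())


def list_invoices(customer_id: int) -> List[str]:
    return INVOICES.get(customer_id, [])


# Use-case-specific design: one tool does the whole job, so no tool needs
# instructions about calling another tool first.
def get_invoices_for_customer(name: str) -> List[str]:
    customer_id = find_customer_id(name)
    if customer_id is None:
        return []
    return list_invoices(customer_id)
```

If a client disables `find_customer_id` in the chained design, `list_invoices` becomes hard to use. The single tool has no such dependency, which is the composability problem raised earlier in the conversation.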

Wilhelm Klopp (55:15)
Yeah,

no, I think that's cool. And yeah, I think you have some blog posts to write.

Matt Carey (55:19)
Dude, so many blog posts, so

many blog posts. You never know, I might take some time and just write them out. I don't know. Did you have any last bits that you want to talk about? What is VibeTunnel?

Wilhelm Klopp (55:28)
Go for it.

Ooh, yeah, Vibe Tunnel. So Vibe Tunnel is actually the product slash company that Armin, the Flask creator, is working on now. So it's vibetunnel.sh. And it's actually very interesting because you know how we were talking about like proxying terminal commands?

Matt Carey (55:47)
Yeah.

Wilhelm Klopp (55:47)
It plays some music when you open the website. I think there's some really interesting ideas in this. So the top line seems to be that it uses Tailscale and VibeTunnel locally and then you can... Yeah, Tailscale's incredible.

Matt Carey (56:00)
Tailscale is a sick company by the way. I didn't know. I got to know them

recently. I was like, holy shit, this is amazing.

Wilhelm Klopp (56:07)
It's so genius and they've grown so, yeah, it's very, it's funny. I think even the main moderator of Hacker News said that most Hacker News commenters dislike most companies, but there are two community darlings in the Hacker News community. Can you guess what they are?

Matt Carey (56:22)
No, I have actually no idea apart from one of them is Tailscale.

Wilhelm Klopp (56:25)
Yeah, so Tailscale is like the more nascent one where he was like, yeah, I can see them. But can you guess what the other one is?

Matt Carey (56:30)
Things that Hacker News people really like.

Wilhelm Klopp (56:32)
that is kind of like universally loved.

Matt Carey (56:34)
I don't know, like Linux or Docker or something like that. Git.

Wilhelm Klopp (56:38)
No, it's Cloudflare. Yeah, it's actually Cloudflare. But yeah, the basic idea of this VibeTunnel thing is like, yeah, so, okay, their site says that VibeTunnel proxies your terminals right into the browser so you can vibe code anywhere. Watch output scroll in real time, type new commands, and spawn fresh sessions on the fly. So you make your terminal, I guess, available through Tailscale and VibeTunnel, and then you can use it on your phone.

Matt Carey (56:40)
That's it.

Wilhelm Klopp (57:00)
or anywhere else, which I think is a really cool starting point for an idea. I hope it also means this thing we were talking about earlier where it's kind of like a process manager for different things on your machine and it can control them and read the output.

Matt Carey (57:12)
That's cool, and it's open source as well. Now that's really cool.

I'm going to sound like a Hacker News person, but this feels like something that someone could just build if they wanted it. But I guess not really. It's actually not tough.

Wilhelm Klopp (57:25)
You mean because

AI just makes it easier to build things.

Matt Carey (57:28)
Yeah, you know like that

Dropbox, the Dropbox comment, which is like, why would I do this when I can just carry around a pen drive? Why would I even, like, if I don't want to carry around a pen drive, I have a NAS at home where I can just like make a remote storage solution. Like, why would I need storage?

Wilhelm Klopp (57:32)
Oh yeah, yeah, yeah, yeah, yeah, yeah.

Yeah.

And you can

just Telnet and like, SCP and yeah.

Matt Carey (57:51)
Exactly, exactly, exactly.

And I can do port forwarding on my router. Like, why would I need any of this stuff? So I hate to feel like that, but this is really cool.

Wilhelm Klopp (57:58)
I don't know, at the

current state of things where I'm feeling, I'm actually really coming back around to the perspective that great products, you can't one-shot them with AI. I think AI can help you scale, writing certain amounts of code that you otherwise would have needed humans for, but you still...

Although a lot of the old rules of building products still really apply. Like you need to prioritize, you need to probably start small and find a good niche or whatever. You still need to apply a lot of taste and craft and whatever.

Matt Carey (58:30)
I

think of Modal a lot when I think of this. I need to go by the way in like one second. They basically started and I think for like two or so years, maybe even three years, all they did was make a better version of Docker where

Wilhelm Klopp (58:32)
Mm.

Matt Carey (58:41)
they can start up containers faster. And then from that, they built like an incredible serverless function as a service platform. And then the serverless container as a service platform for data scientists. It's like that shit wasn't possible before. And they spent a long time being like, we're going to make something fundamentally better. And now they have quite a good moat when it comes to Python.

Wilhelm Klopp (58:57)
Mm-hmm.

Yeah, yeah. That's awesome. Can I ask you one last thing? I was having a chat yesterday. When you have AI-generated code, do you still read the code, and does it still need to pass your own quality check, as if you had written it yourself?

Matt Carey (59:19)
Yeah,

Yeah, I read a lot. There are some things where I don't care as much. And so I maybe won't read a little bit. But if it's work, if it's work work, I read everything. Yeah.

Wilhelm Klopp (59:31)
Yeah. And

if the agent does some weird shit, like, and adds like a ton of comments or like it does some really like wild like try catch, you're like.

Matt Carey (59:40)
Yeah, yeah,

I know that, I always get rid of that. Like, always. And then maybe 50% of the time I'll try to work out a better way so it doesn't generate that again, but a lot of the time it generates pointless shit and I always get rid of it. Like, always.

Wilhelm Klopp (59:43)
Okay.

Got it, got it, okay, cool. Because

I was having a chat with someone yesterday who was like, yeah, I don't look at code anymore. And I'm like, yeah, I don't know, seems, like I just have the agents.

Matt Carey (1:00:01)
I went through a phase, I did

definitely go through a phase of being like, yeah, AI could definitely solve all of it. And now I'm like, no, like I definitely should read this. AI is cool, but.

You still got to make something that works. I've seen so much stuff recently where I have friends whose CEO has decided that, because of AI, he's going to start pushing like 5,000-line PRs. They're just spaghetti code that's like half broken.

Wilhelm Klopp (1:00:23)
Mmm.

Yep.

Matt Carey (1:00:28)
Then some actual salaried engineer spends the next week trying to make it into something that's visible. And it's like that is probably an anti-pattern. Probably at the moment. It's too hard to jump to conclusions there because maybe it's not in the future.

Wilhelm Klopp (1:00:33)
Yeah

We probably, probably maybe. Yeah.

I feel like we're really still figuring a lot of the stuff out, I guess is the point.

Matt Carey (1:00:50)
Anyway, I'm going to the House of Commons now.

Wilhelm Klopp (1:00:52)
No way! What?

Matt Carey (1:00:54)
Now I'm going to like a drinks reception. So I do it.

Wilhelm Klopp (1:00:56)

Fun! Enjoy! Yeah, have some drinks for me. Spread the good word of the pod.

Matt Carey (1:01:02)
Yeah, I'll do my best. I'm actually going to tell every MP that they're missing out on the best podcast they could ever listen to. Guys, Bad Agent podcast. We still need to get some of your GitHub mates on here.

Wilhelm Klopp (1:01:07)
Yeah, get real takes on the ground, Parliament. You heard it here first.

There's a lot of... Why don't we do another guest pod in this month or something?

Matt Carey (1:01:17)
really want to do one.

Yeah, I really want to do one. Let's do it. It was fun last time. Lou was awesome.

Wilhelm Klopp (1:01:22)
Yeah, it was great. And I think that's still one of our most downloaded episodes, which says something. Yep.

Matt Carey (1:01:26)
Makes sense, makes sense.

No, we are going up though, up and to the right. Did you see my little picture? I didn't caption it.

Wilhelm Klopp (1:01:33)
Oh,

I was trying to figure out if it was about the pod or something else. Because I checked the stats as well after I was like, oh, I thought it might have been about the pod. That's great. Nice. Look at us go.

Matt Carey (1:01:36)
I didn't caption it. No, it was the plot, it was the plot. I didn't caption it.

Look at this guy, Right, I've got to go, dude. It's been a pleasure. Big love. Bye!

Wilhelm Klopp (1:01:53)
Sweet. Likewise, bye. Big love.