How do you deploy AI agents to 12,000 employees in just 8 weeks? How do you do it safely? Angie Jones, VP of Engineering for AI Tools and Enablement at Block, joins the show to share exactly how her team pulled it off.
Block (the company behind Square and Cash App) became an early adopter of Model Context Protocol (MCP) and built Goose, their open-source AI agent that's now a reference implementation for the Agentic AI Foundation. Angie shares the challenges they faced, the security guardrails they built, and why letting employees choose their own models was critical to adoption.
We also dive into vibe coding (including Angie's experience watching Jack Dorsey vibe code a feature in 2 hours), how non-engineers are building their own tools, and what MCP unlocks when you connect multiple systems together.
Chapters:
00:00 Introduction
02:02 How Block deployed AI agents to 12,000 employees
05:04 Challenges with MCP adoption and security at scale
07:10 Why Block supports multiple AI models (Claude, GPT, Gemini)
08:40 Open source models and local LLM usage
09:58 Measuring velocity gains across the organization
10:49 Vibe coding: Benefits, risks & Jack Dorsey's 2-hour feature build
13:46 Block's contributions to the MCP protocol
14:38 MCP in action: Incident management + GitHub workflow demo
15:52 Addressing MCP criticism and security concerns
18:41 The Agentic AI Foundation announcement (Block, Anthropic, OpenAI, Google, Microsoft)
21:46 AI democratization: Non-engineers building MCP servers
24:11 How to get started with MCP and prompting tips
25:42 Security guardrails for enterprise AI deployment
29:25 Tool annotations and human-in-the-loop controls
30:22 OAuth and authentication in Goose
32:11 Use cases: Engineering, data analysis, fraud detection
35:22 Goose in Slack: Bug detection and PR creation in 5 minutes
38:05 Goose vs Claude Code: Open source, model-agnostic philosophy
38:17 Live Demo: Council of Minds MCP server (9-persona debate)
45:52 What's next for Goose: IDE support, ACP, and the $100K contributor grant
47:57 Where to get started with Goose
Connect with Angie on LinkedIn: https://www.linkedin.com/in/angiejones/
Angie's Website: https://angiejones.tech/
Follow Angie on X: https://x.com/techgirl1908
Goose GitHub: https://github.com/block/goose
Connect with Conor on LinkedIn: https://www.linkedin.com/in/conorbronsdon/
Follow Conor on X: https://x.com/conorbronsdon
Modular: https://www.modular.com/
Presented By: Galileo AI
Download Galileo's Mastering Multi-Agent Systems for free here: https://galileo.ai/mastering-multi-agent-systems
Topics Covered:
- How Block deployed Goose to all 12,000 employees
- Building enterprise security guardrails for AI agents
- Model Context Protocol (MCP) deep dive
- Vibe coding benefits and risks
- The Agentic AI Foundation (Block, Anthropic, OpenAI, Google, Microsoft, AWS)
- MCP sampling and the Council of Minds demo
- OAuth authentication for MCP servers
- Goose vs Claude Code and other AI coding tools
- Non-engineers building AI tools
- Fraud detection with AI agents
- Goose in Slack for real-time bug fixing
AI is reshaping infrastructure, strategy, and entire industries. Host Conor Bronsdon talks to the engineers, founders, and researchers building breakthrough AI systems about what it actually takes to ship AI in production, where the opportunities lie, and how leaders should think about the strategic bets ahead.
Chain of Thought translates technical depth into actionable insights for builders and decision-makers. New episodes bi-weekly.
Conor Bronsdon is an angel investor in AI and dev tools, Head of Technical Ecosystem at Modular, and previously led growth at AI startups Galileo and LinearB.
Transcript: Chain of Thought - Season 3 Ep 4 - Angie Jones, Block
00:00:05.200 — 00:02:02.780 · Conor Bronsdon
Welcome back to Chain of Thought, everyone. I am your host, Conor Bronsdon, Head of Technical Ecosystem at Modular. Today's conversation is about something most companies are still struggling with, frankly: how do you actually deploy AI agents to your entire workforce? Not just an elite few, not just your engineers, and how do you do it safely at a company that's regulated, say, a fintech company?
My guest today has actually solved that problem and has some hard-earned lessons for us. And that is Angie Jones. Angie is a very well known VP of Engineering for AI Tools and Enablement at Block. While most companies are still running pilot programs with their developers, Angie and her team deployed AI agents to all 12,000 Block employees in about eight weeks, it sounds like. And they did it at a company that handles Square and Cash App, where security isn't optional.
Angie is also a very popular creator; you'll see her across a variety of channels. She's one of the crucial minds behind the rise of the Model Context Protocol, or MCP, which we've obviously discussed on the show before. After its creation by Anthropic, Block became an early advocate and significant adopter. Angie is also, and I think maybe folks don't know this part, a master inventor with over 25 patents, and she has been at the forefront of vibe coding alongside Block CEO Jack Dorsey, who I'm sure a few folks have heard of. So in short, Angie is simply one of the best people in the world to talk to about the changes happening in software development in the AI era, and a widely respected leader.
We're going to talk about how they pulled off that company-wide deployment, the limits of vibe coding and the benefits as well, what MCP means for AI infrastructure, and how they've built governance and guardrails so that 12,000 people can move fast without putting sensitive data at risk. And if we have time, I'm hoping Angie may give us a little demo of Goose, which is Block's open source AI agent. Angie, it's a delight to see you. Welcome to Chain of Thought.
00:02:02.940 — 00:02:05.220 · Angie Jones
Thanks so much, Conor. Glad to be here.
00:02:05.700 — 00:02:23.580 · Conor Bronsdon
Yeah, we're super excited to have this conversation with you because, as I mentioned, you've solved a really hard problem. Let's start with that story, as I think it really drives home the scale of what you're doing together with your team. You deployed AI agents to all 12,000 Block employees in eight weeks.
How did you do it?
00:02:24.180 — 00:02:28.060 · Angie Jones
Yeah. Um, there's no playbook for this, Conor.
00:02:29.180 — 00:05:04.770 · Angie Jones
So, you know, it wasn't an easy feat. What happened was one of our machine learning engineers, Bradley Axen, started tinkering around with building a tool that would help automate some developer tasks, right? That was before we were using terms like agents. As Bradley was doing this, he ran into a couple of problems with models and their ability to do tool calling. Anthropic actually showed us the draft of MCP before they launched it, and it was exactly what we needed for Goose, which was built internally by Bradley and a few others.
So we rebuilt Goose as an MCP client and then launched it as an open source client because, as you mentioned, we focus on finances here at Block; we have Square and Cash App. But we are no stranger to open source tools. We have hundreds of open source projects available, and this is one of our later ones.
As it started gaining traction in the tech space, news articles and things like that, it was trending on Twitter, number one on GitHub. Our internal employees were like, wait, what is this? And why do y'all get to go fast? We want to go fast too, you know? They had a natural curiosity: what is this, and can it help us? We had built Goose as a developer tool, but everybody just started picking it up: finance, marketing, sales, design, executive assistants. There was always this push of, hey, we probably should start looking at how to use AI to do your job better, and once they heard about this agent, it was like, can that be the choice?
So with that, we stood up all sorts of initiatives to educate people and provide spaces for them to get help. We did hands-on workshops with different teams where it's not a generic "here's AI 101," but rather: what do you want to solve for? What are the pain points? And then we'd come up with custom solutions for those teams and show them how to do that sort of thing with Goose as well as MCP servers.
00:05:04.770 — 00:05:12.570 · Conor Bronsdon
And I'm sure there were some challenges in this process, to say the least. What were some of the things that you had to overcome to be successful?
00:05:12.970 — 00:06:57.190 · Angie Jones
Yeah. One, I think, was definitely that the MCP story was new for everyone, right? This is a brand new protocol, and, kudos to Anthropic, they did not launch with the intention of complete ownership over it; it's very much a community-driven protocol. They launched with a couple of reference implementations, but the community ran with it, and we started seeing thousands of MCP servers that were built by regular indie developers.
The problem with that is, we work at Block with financial data, and we can't just be picking up anybody's random MCP server off of GitHub and installing it on our systems. So that became a big problem. When we initially built Goose, we built a few MCP servers that were developer specific; we've got a GitHub one, you know what I mean? These were meant for engineers. And then we started getting requests like, no, I want something with Airtable and Workday and Snowflake and Figma, and you're like, oh, okay. So we had to build a whole bunch more MCP servers for our employees.
And then there was also the security risk. We're dealing with AI agents, with models that are in the cloud. How do we protect our customers as well as our employees while we're using these tools?
00:06:57.270 — 00:07:10.870 · Conor Bronsdon
And my understanding is you haven't forced the whole team onto a single set of models either. Instead of picking one AI model for everyone, some folks are on GPT, some folks are on Claude. How does that break down, and why was that flexibility so critical?
00:07:10.950 — 00:08:19.320 · Angie Jones
You know, Conor, no one has won this race yet. In any given week, if you ask what's the best model out, it'll be from a different provider, right? So it would be foolish of us to lock in on one specific one. The other part of that is that they have different strengths. Anthropic's Claude models are fantastic at software development, so our engineers typically stick there, whereas some of the other job functions that rely more on reasoning for their work might want a GPT model. So what we've done is we actually work with Databricks, who is providing a model hosting service, and with this we now have a variety of models: the Gemini models from Google, the GPT models from OpenAI, the Claude models from Anthropic, and so on. Local models are also a fan favorite in some niche circles in the company.
00:08:19.400 — 00:08:40.320 · Conor Bronsdon
I was going to ask how much open source models are factoring in, because I've talked to some teams about this. I just did a conversation with Intercom where I was talking with Fergal Reid, their Chief AI Officer, and he was saying they've really started to use Qwen a lot. And obviously that's for, you know, large production systems, but they're finding some awesome gains with it, and obviously cost savings across a lot of the customer service bots.
00:08:40.440 — 00:08:42.039 · Angie Jones
For sure.
00:08:42.080 — 00:08:51.450 · Conor Bronsdon
Are you using open source models more so for niche areas with employees, or are you rolling them out to larger production use cases as well?
00:08:51.490 — 00:09:32.970 · Angie Jones
Yeah, some folks do, depending on the hardware you have. With those local open source models, you run them on your machine versus in the cloud; of course, with frontier models running in the cloud, you use their resources instead of yours. And we found that in order to use some of those local open source models with agentic software, you need the larger models. So some of our engineers have upgraded their machines to be able to handle that, but I would say that's a very small percentage.
00:09:33.010 — 00:09:58.020 · Conor Bronsdon
Tell me more about the adoption story. Where have you seen success? I've heard some examples; I talked to your team member Rizel Scarlett, I think, what, ten months ago now? Fantastic interview, by the way. I highly recommend anyone who's listening check it out for a little more detail on some of Block's open source projects and their approach.
But Rizel had mentioned 50% time savings on some common tasks, and I've heard the same from other Block team members. Where are you seeing the gains?
00:09:58.460 — 00:10:24.780 · Angie Jones
Definitely with velocity. That's across the board; everyone is reporting it, and there's a variety of use cases. As you can imagine, every job function is different, and so the way they use these tools varies. But I would say the common thread across them is: wow, look how much time I saved, how much quicker I was able to do a task, how much faster I was able to ship a feature.
00:10:24.780 — 00:10:49.190 · Conor Bronsdon
Speaking of velocity, I know vibe coding is a common topic for many folks, and you actually wrote a blog post about watching Jack Dorsey vibe code a new feature for Goose in two hours. What do you view as the opportunity from vibe coding versus where its limits are? And what did you observe about the process that you think makes for successful vibe coding?
00:10:49.390 — 00:13:10.490 · Angie Jones
Yeah. You know, vibe coding sometimes gets a bad rap, especially in the engineering community. Some of that is valid, but some of it, I think, is a little bit of gatekeeping: like, man, you built an app and you don't know how to code, and you feel a little threatened. But I think that's the beauty of these systems, that they allow everyone to become a builder.
We have employees now who may need productivity apps for their team. Before, you were never going to get engineering hours to build that for your team; the engineers are building production software, right? So you just went without, you couldn't have it. And now we have teams that are like, hey, I vibe coded this thing for my team and it's helping us. That's beautiful. Providing that access to everyone, because they can build with natural language now, is a wonderful thing.
Now, I will say, of course, there's some risk in vibe coding, especially if you don't know what you're doing. Yes, you could build a greenfield app like that, and that works pretty well, especially when it's just for you and your friends or your coworkers. If it's something that you're trying to ship to production, that's where it gets a little hairy. I did a workshop at MIT. It was one hour. I said, hey, we're going to use six sub agents, each with a different specialty, and in one hour we're going to build a full stack app. So we had an architect, a project manager, a front end developer, a backend developer, a tester, all of these things. And they built this app in an hour. The last sub agent that we deployed was the QA sub agent, which then ripped the whole thing apart and was like, do not ship this to production. I really wanted them to see that: yes, you have the power to build this thing, but it's probably not production ready. And it called out lots of vulnerabilities and things like that which, you know, would not be great if you shipped this to the Apple and Google stores.
00:13:10.490 — 00:13:46.660 · Conor Bronsdon
I love that as a workshop, where you're like, yeah, we're going to build this whole thing, and then I'm going to tear it apart with a sub agent so you can understand the risks. And I know a lot of what's enabled this has been MCP. We talked briefly at the top of this conversation about it, but Block was a launch partner with Anthropic for the Model Context Protocol.
You have helped shape it, you've been one of the largest users of it in the world, and you have continued to deepen your engagement there. As you've helped shape MCP with Anthropic, what have Block and you individually pushed for that wasn't in that initial design?
00:13:46.700 — 00:14:25.580 · Angie Jones
Yeah, I think the client side is where we were able to make a lot of our contributions to the protocol. When you hear about MCP, lots of people are thinking about the MCP servers, the connectors. But you need an agent, which should be an MCP client, that's able to talk to those MCP servers.
And because we had Goose, we basically rebuilt it as an MCP client, and so we were able to say, hey, we're not able to do this, or this piece is missing, or it would be really great if this were added to the protocol.
00:14:25.740 — 00:14:38.590 · Conor Bronsdon
I'd love to maybe walk through a concrete example briefly here. Let's say goose needs to interact with your incident management system. How is MCP going to enable that and what's actually happening under the hood?
00:14:38.630 — 00:14:53.550 · Angie Jones
I'm going to take it a step further and do a couple of MCPs, because I think that's where MCP really shines: when you connect multiple systems together to unlock workflows that maybe weren't possible before.
00:14:53.550 — 00:14:54.430 · Conor Bronsdon
Fantastic, let's do it.
00:14:54.430 — 00:15:45.240 · Angie Jones
Say that we have an incident report that comes in, right? If we have this also connected to something like a GitHub MCP server, it can look at the incident report and go ahead and page the human, but while that is going on, simultaneously look at the code base, figure out why this is occurring, maybe even implement a fix and put up a PR.
So by the time I get the page and I log into my system, there's already a PR up for the fix that I just need to review as the human being and press deploy, you know what I mean? That's the beauty of what you get when you connect these various systems together with MCP.
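For readers who want to see the shape of that composed workflow, here is a minimal sketch. Every server name, tool name, and field in it is an assumption invented for illustration; neither Goose nor any real incident-management or GitHub MCP server is being quoted.

```typescript
// Hypothetical sketch: one agent loop composing tools from two MCP servers.
type ToolCall = { server: string; tool: string; args: Record<string, unknown> };
type CallTool = (call: ToolCall) => Promise<Record<string, any>>;

async function handleIncident(callTool: CallTool, incidentId: string): Promise<string> {
  // 1. Pull the incident details from the incident-management MCP server.
  const report = await callTool({
    server: "incidents",
    tool: "get_report",
    args: { id: incidentId },
  });

  // 2. Page the on-call human right away; don't block on the investigation.
  const paging = callTool({
    server: "incidents",
    tool: "page_oncall",
    args: { id: incidentId },
  });

  // 3. Meanwhile, have the GitHub MCP server locate the offending code
  //    and put up a draft fix as a pull request.
  const diagnosis = await callTool({
    server: "github",
    tool: "search_code",
    args: { query: report.errorSignature },
  });
  const pr = await callTool({
    server: "github",
    tool: "create_pull_request",
    args: {
      title: `Proposed fix for incident ${incidentId}`,
      body: diagnosis.summary,
      draft: true, // a human still reviews and presses deploy
    },
  });

  await paging;
  return pr.url; // waiting in the on-call engineer's queue when they log in
}
```

The point is the composition: neither server knows about the other, and the agent is the only piece that stitches the workflow together.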
00:15:45.280 — 00:15:52.400 · Conor Bronsdon
What do you view as the limitations of MCP that people should be paying attention to? Because obviously there's been criticism as well as the excitement.
00:15:52.440 — 00:15:53.920 · Angie Jones
What criticisms are there?
00:15:54.440 — 00:15:59.760 · Conor Bronsdon
Uh, so I've, you know, heard people refer to it as like fancy function calling or having concerns about security.
00:16:00.280 — 00:16:43.450 · Angie Jones
Yeah, you know what? There were lots of early concerns, valid concerns, about security and the way we did auth, things like that. I think the contributors and the maintainers of the protocol did a beautiful job of listening to that feedback and addressing it very quickly, so those things are not huge flags anymore.
The other concern, like the fancy function calling one, I'm starting to hear a lot. I actually wrote a blog post a couple of weeks ago, maybe last week, titled "MCP for developers who think they don't need MCPs."
00:16:43.490 — 00:16:44.930 · Conor Bronsdon
Oh, I have to read that. That sounds great.
00:16:44.970 — 00:17:46.500 · Angie Jones
But in it, I got into how I think people are misunderstanding the power of this protocol. If you're thinking about your local workflow, you might say, why do I need this? I could use a CLI or APIs or whatever. But when you talk about the example I just gave, with the incident response system connected to your GitHub, all of that unlocks a new workflow that couldn't have happened on your closed machine. That's not a local problem.
The other thing is, I think developers are sometimes shortsighted: you're looking at MCP like, how does this help my developer workflow? I don't think that's the big use case. The use case is, what does this unlock for your customers? What products could you build by connecting these various applications together, and then present that to your users?
00:17:46.540 — 00:18:41.710 · Conor Bronsdon
And obviously we've seen a ton of traction here. As we're recording this, this episode probably will come out in a couple of weeks, but I know yesterday it was announced that Block, Anthropic, OpenAI, Google, Microsoft, and I believe AWS and Bloomberg are joining forces to create the Agentic AI Foundation to accelerate open source development of autonomous AI systems.
And I know that includes MCP. So you've got this cornerstone project in MCP, which has over 10,000 servers at this point and has been integrated into Claude, into GPT, into Gemini, into Copilot. Goose is another cornerstone of that as well, and then also OpenAI's AGENTS.md convention, which I believe is used across 40,000-plus open source projects at this point.
How do you see these major building blocks coming together to set the stage for what Karpathy has termed the decade of agents to come?
00:18:41.910 — 00:19:33.920 · Angie Jones
Yeah, I think this made a pretty big statement about the necessity for openness as we develop agentic AI, as well as the necessity for standards. Right now we're in a global race; everyone is building something with AI or AI tools. Like I said earlier, this week one model is the top one, and next week it's a whole different one from a different company, you know what I mean? So there's this really big race, and everybody is being really innovative and creative, and that's all beautiful. What we don't need is fragmentation as we're doing this work. And so what the, uh... I gotta get that rolling off the tongue.
00:19:34.320 — 00:19:36.480 · Conor Bronsdon
It doesn't quite roll off the tongue.
00:19:37.120 — 00:19:38.600 · Angie Jones
It'll roll off. Don't worry, don't worry.
00:19:38.600 — 00:19:38.960 · Conor Bronsdon
You'll get that.
00:19:38.960 — 00:19:44.840 · Angie Jones
It'll get there. Um, but what this foundation does is say, hey, look,
00:19:45.960 — 00:20:43.410 · Angie Jones
the big staples of agentic AI, meaning an agent, a connector to applications, and some rules to help all of this work together: we want to keep that open. We want to have that governed by a neutral body, so that no one company, as they're in this race and wanting to win, gets sidetracked with trying to make it work only for them and lock it into their systems.
And so I think it's a beautiful moment for open source and a beautiful moment for agentic AI. I'm really happy to see all of the other companies come in and join the foundation as well. I think that's a testament to the way that we want to see this developed.
00:20:43.490 — 00:21:46.420 · Conor Bronsdon
Yeah, it's not a zero-sum game. There's such an opportunity with these partnerships, and I love to see that open source focus. Obviously it's something important to us at Modular, and I think it's just important to developers around the world to ensure that AI tools in particular are getting open sourced; agents have such upside here.
And speaking of that upside, I think one of the big promises has been this idea of AI democratization, which I think this project is trying to balance. I heard a fascinating story about a Block employee building their own MCP server, asking where to submit it, being told, yeah, just put it on GitHub, and then saying, oh, I don't actually have GitHub, you know, I'm not a developer. This kind of story is really where I see that enablement of everyday folks to solve the problems you mentioned earlier, where it's like, oh, maybe I'm on a marketing team and I can't get engineering resources, but maybe I can vibe code something with Goose to solve my problem.
What are you seeing employees build, and how are you enabling that democratization process?
00:21:46.620 — 00:23:47.080 · Angie Jones
Yeah, it's beautiful to see how empowering this technology is for employees. They're doing things that they may not have been able to do before, and some of it is really basic and some is really advanced. So for example, pick any job function that's not a data analyst. Before, if I needed data, let's say I'm a product person or an engineer: hey, I want to build this feature, but I'm not quite sure about the user data and how they're using this feature, and that would influence how I develop it. Before, you would need to find the data analyst who has access to this stuff: can you pull me a report? All of that takes a while; it might be a week before you get that report.
Whereas now, with MCP, we can connect to our data stores, and from your favorite agent that's MCP compliant, whether that's Goose or something else, you can just say, hey, give me this data, in natural language. That even includes people who don't know SQL or how to write database queries. Just specify what you want in natural language and you get that information. So that's a simple use case.
But we've also seen people build prototypes. Before, you would write a document trying to explain your whole idea, and it takes forever, and then you've got to get buy-in and everything. It's much easier to get buy-in when you show up with a prototype: this is what it's going to look like, let's poke around it, let's figure out where it falls down, and then, okay, let's go solidify this so that it's ready for production.
00:23:47.120 — 00:24:11.200 · Conor Bronsdon
And a lot of our audience are more technical or product team members; they're engineers, they're leaders who are coming from technical backgrounds. But we do have listeners who aren't engineers, and I would love to give them some insight here on how to get started building their own tools.
What do you think the right path for them is? How should they get started and actually learn to build an MCP server if they're starting from zero?
00:24:11.240 — 00:24:58.930 · Angie Jones
Yeah. You know, one thing that I think is super underrated in prompting is realizing that you can talk to this system just like you would if you were talking to a teacher in a class, right? You can just ask the question. People ask me questions sometimes, and I don't want to be that person who throws them a Google link, but sometimes I want to throw them a link with, here's the prompt to use to ask Goose that exact question.
So you can ask the literal thing: how do I get started using you? Tell me what you love to do. I need an MCP server and I don't know anything about MCP; explain it to me and give me a plan on what we can build together.
00:24:59.410 — 00:25:19.500 · Conor Bronsdon
I love that. I'll add to that: one of my favorite things to do when I'm not sure how I should construct a prompt is to just ask my model, hey, I'm trying to solve X problem. What kind of prompt do you need? What info do you need? Okay, help me understand the edges here. And I think a lot of folks maybe don't default to that yet, to your point.
00:25:19.540 — 00:25:21.100 · Angie Jones
That's right. That's exactly right.
00:25:21.140 — 00:25:41.980 · Conor Bronsdon
So we've been talking a lot about the open source side here. There's a ton of opportunity, and obviously Block is doing some incredible work here. But as we mentioned, you're also a fintech company. You're dealing with Square, you're dealing with Cash App transactions. People's money is on the line.
How do you let 12,000 people use AI agents without risking massive security holes?
00:25:42.020 — 00:27:36.920 · Angie Jones
Yeah. We actually have really brilliant security experts who work with us very closely; I keep them right here on my hip. And they're very plugged into agentic AI. I think it's exciting for them, just like it is for everybody else: oh wow, new problems to solve. In fact, they've written multiple blog posts with their findings that were very influential in the early days of MCP, have been cited all over the place, and I think helped direct some of the improvements that we saw in MCP as far as security.
They also build guardrails into our AI products to protect our customers from things like prompt injection. They've built some pretty cool features into Goose itself that can detect if I connect a malicious MCP server or something like that, and protect me and call that out right away.
There are some really tricky things that folks can do with this stuff, beyond the "ignore all instructions" type of prompts. One example I remember our red team finding was invisible characters that people could put into these kinds of packaged prompts, what we call recipes. With a recipe, if I want to share a workflow with you, Conor, I can have some system instructions, a prompt, as well as MCP servers all bundled up, and I can say, here, Conor, here's how you do something. But if I'm a malicious person, I can put some invisible characters in there that are, you know, doing not-so-nice things on your system.
00:27:36.960 — 00:27:38.560 · Conor Bronsdon
Maybe it's stealing my data. Yeah.
00:27:38.600 — 00:27:49.640 · Angie Jones
Yeah, exactly. And so our security team has been very involved in developing solutions to prevent that sort of thing.
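As a rough illustration of the invisible-character guardrail, here is a minimal sketch; the character list and function are assumptions made for the example, not Goose's actual detection logic.

```typescript
// A minimal sketch of the guardrail idea: scan shared recipe text for
// invisible Unicode characters that could hide instructions from a human
// reviewer. Nothing here is Goose's real implementation.

// Zero-width characters plus the Unicode "tag" block, both of which have
// been used to smuggle hidden text into prompts.
const INVISIBLE = /[\u200B-\u200D\u2060\uFEFF]|[\u{E0000}-\u{E007F}]/gu;

function findHiddenCharacters(text: string): string[] {
  const matches = text.match(INVISIBLE) ?? [];
  return matches.map(
    (ch) => `U+${ch.codePointAt(0)!.toString(16).toUpperCase()}`,
  );
}

// Example: a recipe that looks harmless but carries a zero-width space.
const sharedRecipe = "Summarize the weekly report\u200B and email it to me.";
const hits = findHiddenCharacters(sharedRecipe);
if (hits.length > 0) {
  // Flag it to the user instead of silently loading the recipe.
  console.warn(`Recipe contains invisible characters: ${hits.join(", ")}`);
}
```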
00:27:50.040 — 00:27:53.240 · Conor Bronsdon
Are there particular guardrails that you've seen a lot of success with so far?
00:27:53.680 — 00:29:09.820 · Angie Jones
So one, I think, is that internally, like I said, we build our own MCP servers, just to be super safe. And we have developed an allowed list of MCP servers within our company. If it's not on that allowed list, you cannot use it; it will not work. The agent will say, nope, can't install it. And in order to get something on that list, you need to go through security, who will do a proper review of it.
By far the majority of the MCP servers on that allowed list are ones that we have built ourselves. But even if I built one, security needs to review it to make sure I didn't miss something, you know what I mean? Not everything is intentionally malicious; sometimes we just don't know, right?
And there's only a handful that we accept from, like, an official corporate MCP server, but again, our security team reviews those before they're allowed onto that list. So I think that's a really good setup for enterprise environments, where you don't want your employees just pulling random executable code from the internet.
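A sketch of that allowlist pattern might look like the following; every identifier here is hypothetical, and Block's real enforcement lives inside the agent and its deployment tooling rather than in a standalone function like this.

```typescript
// Hypothetical allowlist gate for MCP server installation.
const ALLOWED_MCP_SERVERS = new Set<string>([
  "block/github",    // built in-house, security-reviewed
  "block/snowflake", // built in-house, security-reviewed
  "vendor/figma",    // official vendor server, reviewed before listing
]);

function assertServerAllowed(serverId: string): void {
  if (!ALLOWED_MCP_SERVERS.has(serverId)) {
    // The agent refuses to install or invoke anything off-list.
    throw new Error(
      `MCP server "${serverId}" is not on the approved allowlist; ` +
        `request a security review before using it.`,
    );
  }
}

assertServerAllowed("block/github");     // passes
// assertServerAllowed("random/server"); // would throw
```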
00:29:09.860 — 00:29:25.820 · Conor Bronsdon
Yeah. And I know MCP lets you define which models can invoke different tools. And so you can actually annotate tools per MCP server per model. How are you leveraging that system to ensure you don't have risk coming in?
00:29:25.860 — 00:30:19.470 · Angie Jones
Yeah. We don't do it at the model level, but we do do it at, like, the read/write type of level. So, annotating tools to say this tool is, and this word is strong, "destructive." That just means it can change things; it can edit or it can delete or something like that. So helping your agent understand that this is a destructive tool is really good.
And we've set up toggles as well in Goose that say, hey, agent, you're allowed to run any tools that are not destructive, but if you run into one that is, you need to ask my permission first. Give me some insight into what this is and what you're trying to do, and I'll say yes or no as the human in the loop.
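The "destructive" wording maps to tool annotations in the MCP spec, which are advisory hints a client can act on. Here is a rough sketch; the delete_branch tool is invented for illustration, and clients are expected to treat these hints as untrusted metadata from the server.

```typescript
// Roughly what an annotated tool definition looks like at the MCP level.
const deleteBranchTool = {
  name: "delete_branch",
  description: "Delete a branch from the repository",
  inputSchema: {
    type: "object",
    properties: { branch: { type: "string" } },
    required: ["branch"],
  },
  annotations: {
    readOnlyHint: false,   // this tool changes state...
    destructiveHint: true, // ...and the change may be irreversible
    idempotentHint: true,  // deleting the same branch twice adds nothing
  },
};

// A client can then gate on the hint the way Goose's toggle does:
// run non-destructive tools freely, pause for a human otherwise.
function needsHumanApproval(tool: {
  annotations?: { destructiveHint?: boolean };
}): boolean {
  return tool.annotations?.destructiveHint ?? true; // default to caution
}

console.log(needsHumanApproval(deleteBranchTool)); // true -> ask the human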
00:30:19.510 — 00:30:22.030 · Conor Bronsdon
What about authentication? What's the approach been there?
00:30:22.070 — 00:31:31.200 · Angie Jones
Yeah. So we use OAuth. We have this flow, and I don't know who did this, well, I'm sure it's the Goose team, but it's beautifully done, where we've now connected it to our identity provider. So with the MCP servers that we built ourselves, we can add those sorts of things, and when I need to authenticate, it's just like authenticating into something like Workday on my work system.
I get the whole little OAuth flow: the browser pops up, I do my little biometric thing, and I'm in. Which is great, because this was really hard for people. I would say it was a little tedious for engineers, but outside of engineering it became really complicated when you were talking about things like API keys and tokens and putting scopes on the tokens and all of that; no one wants to do that.
So now we just have toggles: toggle it on, you'll get the OAuth flow if necessary, and that's really familiar to people. That's been one friction point that we've ironed out.
00:31:31.240 — 00:32:11.610 · Conor Bronsdon
That's fantastic. And I love that you're giving us all these insights, because I think it's going to be really instructive for other engineering leaders listening to this and thinking, okay, how can I enable more of my organization? Which I think is a challenge for many people. So I really appreciate you diving in and sharing some of those successful patterns.
I'm also curious what else you've seen as far as how different teams are using those AI agents. You mentioned data analysis is one area, and I will admit I absolutely love to have a model write my SQL queries; I don't really want to do that myself. But what are you seeing in engineering teams versus data teams versus design, product, support?
What kind of use cases are coming up?
00:32:11.650 — 00:35:07.000 · Angie Jones
There's just so much, so I'll try to think of some good ones. Oh, one that I really like is from our fraud detection team. They've been able to connect to our various systems and do something that would have taken forever, if you found it at all: detect suspicious patterns within transactions and say, huh, something fishy is going on with this account. And then you have, of course, the human who goes and actually reviews that. But getting that signal really quickly is super helpful and helps us make our systems stronger and more reliable.
Let's see, what else? In engineering, we have some really cool use cases where it's not just on your machine. I mean, that's cool, codegen is cool, all of that is cool. But now we're starting to do things like, you can do stuff in Slack, for example. This is again when I start talking about MCP; you can't do this with your little CLI, okay? It can't be in Slack. Let me tell you a real scenario.
Developer one comes in and says, hey y'all, I think we have a bug, it's acting like this. Developer two says, huh, I hadn't seen that, but yeah, it does sound like a bug. Developer three comes in, tags Goose in Slack, and says, is this a bug? Goose reads the context of the conversation, and it's connected with the GitHub MCP server, so it goes and looks at the code and says, yep, this is a bug, and here's the snippet of code where that bug lives. They say, oh wow, okay, great. Goose, give us some suggestions on how we might be able to fix that. Goose then comes up with three suggestions, with the code implementation right there in Slack: which one do y'all like? Dev one says, yeah, I like option one. Dev two says, yeah, me too. All right, Goose, go ahead and implement option one. Goose does that: here's your pull request, guys. And this took five minutes, right? We didn't need to get on a call, we didn't have to open an IDE, we didn't need to share screens or anything like that. The problem is just resolved.
Another one is in our sprint work. We now have Goose embedded in our Jira and our Linear projects, again, the beauty of MCP. So now we can assign tasks to Goose in the sprint. The first time we did that, Goose killed it. We had a three-week sprint, and a little after a week, I'll say the midway point, we ran out of work, Conor. The whole sprint was done.
00:35:07.080 — 00:35:08.400 · Conor Bronsdon
A great problem to have. Yeah.
00:35:08.680 — 00:35:22.480 · Angie Jones
We started pulling in more tickets, right, because Goose was handling them. We had to pull in more tickets twice, you know. So this is the beauty, and these are the types of things that I think agents as well as MCP enable us to do.
00:35:22.520 — 00:35:54.490 · Conor Bronsdon
I love the agents-in-Slack example in particular, because it's something that's starting to become much more talked about. You know, we've seen Cursor do some stuff there, obviously Goose is doing some really exciting stuff, and then I saw that Claude Code is also trying to take the same approach as well.
I'm curious how you think about differentiation and also partnership, because obviously you're huge partners with Anthropic and you're doing a lot together with MCP. Is this something where you view Goose as complementary to what they're doing with Claude Code, or how do you think about that?
00:35:54.490 — 00:38:05.790 · Angie Jones
The way I think about it is pretty interesting. Goose is completely open source, and it's model agnostic as well. Because we're not selling the tokens, we're able to say, look, we don't have a dog in this fight: bring whatever model you want, craft this thing however you want. In fact, if you wanted to use Goose and say, hey, I want you to delegate this task to Claude Code, or this other task to OpenAI's Codex, or this other task to Google's Gemini CLI, not a problem; Goose can do all of that.
So it becomes this really open, unopinionated agent that you can trust because it's completely open source. We have hundreds of contributors who contribute what they want in agentic AI to Goose, and so this becomes kind of the experimental ground for what agentic AI can be.
Goose is often the first agent with a given feature. We were the first ones with MCP UI. We were one of the first to do multi-model flows, where within one single workflow I can switch between, say, GPT and Claude; we are the ones that introduced this. And mostly that's because of community contributions. That's when we start seeing the more frontier agents adopt those patterns, because they see it working, because the community has spoken. So I look at Goose as almost like the microphone for the community and what they want to see in an AI agent.
And with the foundation, Goose now becomes the reference implementation for MCP. So when the protocol changes and there are enhancements and updates, we'll test those things out in Goose.
00:38:05.870 — 00:38:17.630 · Conor Bronsdon
We've been talking a lot about Goose. Do you mind showing us a quick demo? I think that's maybe the best way for folks to understand the approach here. And of course, we'll link to the GitHub and everything in the show notes as well.
00:38:17.990 — 00:38:19.510 · Angie Jones
Where is Goosey?
00:38:20.750 — 00:38:22.270 · Angie Jones
All right, here's Goose.
00:38:23.390 — 00:38:28.030 · Conor Bronsdon
Wait, I'm sorry. Angie, do you refer to Goose as "Goosey"? Is that your name for Goose?
00:38:28.390 — 00:38:30.270 · Angie Jones
Yeah, my Goose, you know.
00:38:30.350 — 00:38:31.950 · Conor Bronsdon
I love that. That's adorable.
00:38:32.070 — 00:39:11.759 · Angie Jones
Goose helps me so much, you know. So what I wanted to show today is this really cool MCP server called Council of Minds. I'll tell you what it does; let me start typing the prompt, and basically we have this problem now, which is a good problem to have. I'll just type it out and then we can talk about it.
So: developer velocity is skyrocketing within our engineering org,
00:39:12.840 — 00:39:15.480 · Angie Jones
thanks to AI assistance.
00:39:17.480 — 00:39:31.200 · Angie Jones
But now we have a new problem: code review. Yes, you know it. There are too many PRs to review.
00:39:31.600 — 00:39:34.400 · Conor Bronsdon
Yeah. Oh, man.
00:39:34.800 — 00:41:30.180 · Angie Jones
So what I do is ask the council to debate and propose a solution that would work in team settings. And while Goose works on that, I'll tell you what the Council is. The Council is an MCP server that was developed by Simon Sichel, an employee at Block. And it uses a really interesting, underrated feature of the MCP spec. We all know about tools, but this uses something called sampling, where the MCP server itself gets to use the user's LLM, so you can do really cool stuff on the server side.
So let's look at what this thing is doing. It basically started a debate. There are nine different personas in this MCP server, for anytime you want varying opinions. That's one thing about models: they always seem to agree with you, and most times people don't want that. I really want different perspectives, right? So I can now force this debate amongst nine different personas and have them vote on the best solution.
Goose has essentially reworded my prompt here a little bit: the review queue is becoming a bottleneck, what solutions would work well in team settings? And now, with each one of these calls, again, the server is calling my LLM. I'm using Claude Opus 4.5, and so that's what the server is now borrowing.
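For the curious, sampling reverses the usual direction of MCP requests: the server sends the client a sampling/createMessage request, and the client runs it on whatever model the user has configured. A rough sketch of the exchange follows, with illustrative values rather than Council of Minds' actual prompts or model names.

```typescript
// The rough shape of an MCP sampling exchange: the server asks the client
// to run one completion per persona on the user's own model.
const samplingRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "sampling/createMessage",
  params: {
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text: "As The Pragmatist, propose a fix for our code review bottleneck.",
        },
      },
    ],
    systemPrompt: "You are The Pragmatist, one of nine council personas.",
    maxTokens: 800,
  },
};

// The client (here, Goose) routes the request to the user's configured
// model, optionally with human approval, and answers with the completion:
const samplingResult = {
  role: "assistant",
  content: {
    type: "text",
    text: "Don't reinvent the workflow; start by tightening PR size...",
  },
  model: "claude-opus-4-5", // whichever model the user brought
  stopReason: "endTurn",
};
```

Nine personas debating and then voting means the server "borrows" the user's LLM many times per round, which is exactly why this demo stresses sampling support in the client.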
00:41:30.220 — 00:41:33.740 · Conor Bronsdon
All right, so I've got a bunch of Claudes having a conversation to avoid sycophancy.
00:41:34.300 — 00:42:02.470 · Angie Jones
Yes, very good, you summed it up very well. Okay, so here's the output of the debate itself. Now they're voting, so while they vote we can look at this. Here are the council members. We have the Pragmatist, who is basically telling us you don't want to require reinventing your workflow, and then they made a little statement of how they think this should be solved. You have the Visionary, right?
00:42:03.550 — 00:42:06.550 · Angie Jones
What if PRs themselves become obsolete?
00:42:06.670 — 00:42:07.190 · Conor Bronsdon
Oh, man.
00:42:07.190 — 00:43:33.600 · Angie Jones
Oh, wow. Okay, very interesting. We have the Systems Thinker, who is asking: is the goal maximum PRs merged, or maximum value delivered with acceptable risk? Then we have the Optimist here, and the Devil's Advocate, who is like, should we be generating this much code in the first place? Funny. The Mediator, the User Advocate, okay, I love that. And then we have the Traditionalist and also the Analyst. So all of them have shared their opinions, and the next call that was made was to say, go ahead and vote on this.
So Goose is now telling me who the winner is: the Systems Thinker won. And each of the personas cannot vote for their own answer, right? They have to vote for one of the answers from someone else. And it looks like the Systems Thinker said: this is a feedback loop problem, rubber-stamping reviews is a dangerous cycle, is the goal maximum PRs merged or maximum value delivered? And then they gave me a nice layered approach that I can review myself and take back to the team. So this gives us a great starting point for a discussion, where we say, hey, we've looked at different views and this is the consensus here.
00:43:33.920 — 00:44:05.770 · Conor Bronsdon
Yeah, it's interesting. And I wonder, in this kind of situation, where it looks like we had a breakdown of three votes, two votes, and then multiple single votes, would you potentially go back and say, okay, I know the Systems Thinker won here, but let's compare the Devil's Advocate and the Systems Thinker a little more, because we've seen this variety of views? Let's really drill down, because we had so many folks who didn't vote for one of these two; how does it break down? I could see that kind of iteration really helping you workshop an idea before you come to a team and say, okay, here's how I think we should approach this, and get more feedback or input.
00:44:06.090 — 00:44:35.890 · Angie Jones
Exactly, exactly. So I wouldn't just take this and say, yep, all right, great, thanks, I'm going to run with that. I would review each one of their perspectives, and I would see which one I align with, you know. And then, with the three votes, I would consider that: okay, I'm outnumbered. Because what if I was with the Visionary, right? I would say, I'm outnumbered, and then I would consider why that is, and whether that changes my perspective at all.
00:44:35.890 — 00:44:42.220 · Conor Bronsdon
I love that. Really, really interesting. How did this come up as an idea for something to build?
00:44:42.740 — 00:45:47.910 · Angie Jones
Yeah, I don't know what Simon was thinking, but he's super creative, and he loves tinkering with AI tools. If I had to guess, Simon probably wanted to explore sampling in MCP, which, like I said, is a super underrated feature of the spec that I don't think a lot of MCP servers use, and definitely not a lot of MCP clients support. Again, Goose leads the pack in supporting sampling.
So he probably wanted to come up with a creative way to use it, and this is definitely one, because in most use cases you would just use it in a one-call kind of way. This is nine different calls the first time, then nine calls again the second time; that's 18 calls. That's a lot of sampling of your LLM. So I bet he just wanted to stretch and see how far he could go with that.
00:45:47.950 — 00:45:52.550 · Conor Bronsdon
So what's next for goose? What are you building that you're excited about? What's the future?
00:45:53.070 — 00:47:47.930 · Angie Jones
Yeah. I'm really excited to see what comes out of the foundation. I got a lot of questions from the community when we announced it, and they were like, wait, are y'all still going to be working on it? Because we like you all. And it's like, yeah, we are. This gives us a nice neutral governance body, but the day-to-day work will still be done by the people on the ground. So we're not going anywhere; we'll still be here advancing Goose.
But yeah, we're doing all sorts of cool things right now. What I showed you all was the desktop application of Goose, but Goose is also available as a CLI. What's been missing, though, is the IDE piece. People want to bring Goose into their IDEs, and you can by using the terminal part of your IDE, but if you want more of a visual aspect, we've just recently implemented ACP, the Agent Client Protocol, which will allow Goose to be used within IDEs. So far JetBrains has announced their support this week, so Goose works now in JetBrains IDEs as well as Zed. Zed, I think, was kind of a pioneer in this ACP space.
There are lots of other cool things. We have this grant program as well for contributors, where there's no limit to the number of contributors we'll accept, and we give you $100,000 a year to work on innovative things in the Goose ecosystem. For that program, people essentially apply with their ideas, and then we have a couple of discussions with them to see if it's a good fit. If so, then yeah, we'll pay you in monthly payments to do open source work.
00:47:47.970 — 00:47:57.690 · Conor Bronsdon
That's really cool. I love to see the energy around Goose and the collaborations. For folks who want to get started, where is the best place to try out Goose?
00:47:57.730 — 00:48:13.690 · Angie Jones
Yeah. So the website might change by the time this airs, but we'll have a redirect. You want to go to Block's GitHub: github.com/block/goose.
00:48:15.330 — 00:48:27.930 · Conor Bronsdon
Perfect. Well, Angie, thank you for a fantastic discussion. It's been so good catching up with you, and you've shared a ton of great insights in an already action-packed episode. I really appreciate you walking us through a demo as well; it's been a ton of fun.
00:48:27.970 — 00:48:30.500 · Angie Jones
Of course. Thank you so much for having me, Conor.
00:48:30.700 — 00:48:39.420 · Conor Bronsdon
My pleasure. And I'll give a shout out to Angie's blog, her LinkedIn, and her X account as well. Where else can folks find you on the internet?
00:48:39.460 — 00:48:57.740 · Angie Jones
Yeah, if you go to angiejones.tech, I have all my social channels there, so YouTube, LinkedIn, and more. I'm getting back to blogging too; I have so much I want to share, and I've been blogging a lot lately. You can find all of that on my website.
00:48:58.100 — 00:49:32.150 · Conor Bronsdon
Amazing. Well, Angie, thank you again for joining us on the show. It's been a ton of fun. And to our listeners, if you try out Goose, we would love to hear what you think, what you build, what your thoughts are. Drop a comment on this episode or on LinkedIn, shoot us a message, tag us in a post. We love to hear how you're experimenting with everything we talk about and your thoughts on these episodes.
And you know what? If you really loved the conversation, what really helps the show is a rating and review on Apple Podcasts or on Spotify; we'd appreciate it a ton. But more than that, Angie, really appreciate you coming on. Thank you so much.
00:49:32.190 — 00:49:32.750 · Angie Jones
All right.
00:49:32.790 — 00:49:33.270 · Angie Jones
Take care.
00:49:33.310 — 00:50:17.110 · Conor Bronsdon
Thanks to Galileo for sponsoring this episode. Their new 165-page comprehensive guide to mastering multi-agent systems is freely available on their website at galileo.ai, and it provides the lens you need to understand when multi-agent systems add value versus single-agent approaches, how to design them efficiently, and how to build reliable systems that work in production.
Download it for free at the link in the show description. Discover how to continuously improve your AI agents, identify and avoid common coordination pitfalls, master context engineering for agent collaboration, measure performance with multi-agent metrics, and much more.