Chain of Thought | AI Agents, Infrastructure & Engineering | We Built Agents, Nobody Built HR

Tyler Akidau, CTO of Redpanda and author of the O'Reilly Streaming Systems book, makes the case that enterprises are shipping AI agents into production without the governance layer they need. He lays out the four pillars of agent HR (identity, authorization, observability, accountability) and why inline enforcement via CLAUDE.md fails the moment the stakes get real. Chain of Thought is hosted by Conor Bronsdon.

Show Notes

Tyler Akidau spent 12 years on streaming systems at Google and five years at Snowflake before joining Redpanda as CTO. He wrote the O'Reilly Streaming Systems book most of the field has on its shelf. His new piece on O'Reilly Radar (Post-Human: We All Built Agents, Nobody Built HR) argues that enterprises are stuck in the prototype-to-production gap because they're applying human-era identity, auth, and observability tools to a workforce that's unpredictable in structurally novel ways, runs at machine speed, and follows bad instructions to a fault. Inline guardrails like CLAUDE.md work until they don't. Governance has to be enforced through channels the agent can't see, modify, or override.

We cover:

Why AI agents are a new kind of co-worker (unpredictable, machine-speed, directable to a fault) and what that means for enterprise infrastructure
The four pillars of agent governance: identity, authorization, observability and explainability, accountability and control
Why task-scoped, short-lived identity is the foundation everything else builds on
Authorization that's deny-capable and intersection-aware (Tyler's "guest badge" model)
Why OpenTelemetry is the right starting point for recording every prompt, tool call, and response
How Redpanda's Agentic Data Plane combines streaming topics, Oxla SQL, and Postgres under the hood
Tyler's academic paper with a psychologist on the neurobiological systems humans have that AI agents are missing

Chapters:

(00:00) Why nobody built HR for AI agents
(02:12) Three ways agents differ from human employees
(07:53) The four pillars of out-of-band governance
(10:29) Identity: task-scoped, short-lived, chained to humans
(14:40) Authorization: deny-capable and intersection-aware
(18:57) Observability: record everything via OpenTelemetry
(24:24) Redpanda's agents and the $1,000 trade limit example
(30:10) Accountability and the kill switch
(34:02) The Agentic Data Plane: streaming, Oxla SQL, Postgres
(41:20) Should we stop chasing model alignment?
(44:04) Building human-like value systems into agents
(47:25) Tyler's 12-24 month outlook for agent governance

Connect with Tyler:

LinkedIn: https://www.linkedin.com/in/takidau/
Redpanda: https://www.redpanda.com/
Post-Human article: https://www.oreilly.com/radar/posthuman-we-all-built-agents-nobody-built-hr/

Connect with Chain of Thought host Conor Bronsdon:

Newsletter: https://newsletter.chainofthought.show/
Twitter/X: https://x.com/ConorBronsdon
LinkedIn: https://www.linkedin.com/in/conorbronsdon/
YouTube: https://www.youtube.com/@ConorBronsdon

More episodes: https://chainofthought.show

Creators and Guests

Host

Conor Bronsdon

Creator and Host of the Chain of Thought Podcast | Technical Ecosystem Lead at Modular

What is Chain of Thought | AI Agents, Infrastructure & Engineering?

AI is reshaping infrastructure, strategy, and entire industries. Host Conor Bronsdon talks to the engineers, founders, and researchers building breakthrough AI systems about what it actually takes to ship AI in production, where the opportunities lie, and how leaders should think about the strategic bets ahead.

Chain of Thought translates technical depth into actionable insights for builders and decision-makers. New episodes weekly.

Conor Bronsdon is an angel investor in AI and dev tools, Technical Ecosystem Lead at Modular, and previously led growth at AI startups Galileo and LinearB.

Disclaimer: All views, opinions and statements expressed on this account are solely my own and are made in my personal capacity. They do not reflect, and should not be construed as reflecting, the views, positions, or policies of my employer. This account is not affiliated with, authorized by, or endorsed by my employer in any way.

FINAL TRANSCRIPT
================
Speakers: Conor Bronsdon, Tyler Akidau
Duration: 50:49
Total Words: 9962
Generated: 2026-05-27

---

[0:00] Tyler Akidau:
We never tried to make humans perfect. Every corporation in the world is stacked top to bottom with imperfect humans. And most of them live to see another day, in spite of that, because of all the structural pieces that we built up around managing sets of imperfect humans.

[0:20] Conor Bronsdon:
We are all building AI agents, and increasingly, we are all using coding agents. But with an ever-increasing pool of digital employees, how are we managing this new robotic workforce? Welcome back to Chain of Thought. I am your host, Conor Bronsdon. It's great to see everyone. My guest today is someone who argues that we have built these better and better employees, and we're seeing them improve even more as we have better harnesses, better frontier models. But nobody's building effective HR for this new workforce. The result is enterprise agents shipping into production with identity, auth, and observability tools that were designed by humans for humans, and a panic phase that we are squarely stepping foot in. Tyler Akidau is CTO of Redpanda. He spent 12 years on streaming systems at Google, five years at Snowflake. He authored the Streaming 101 and 102 articles that many folks have read, plus the O'Reilly Streaming Systems book most of the field may have on its shelf. And his new piece on O'Reilly Radar, Post-Human: We All Built Agents, Nobody Built HR, is actually the spark for today's conversation. Tyler, great to see you. Welcome to Chain of Thought. Where are you joining us from today?

[1:34] Tyler Akidau:
Thank you, Conor. I'm in Seattle.

[1:37] Conor Bronsdon:
Very close to me, actually. You've got a beautiful view out there, and next time we'll have to do this in person. But before we jump in, I do have to give a brief sponsor acknowledgement, typically, because as of recording, we don't have a presenting sponsor locked in for this episode. But if you think your company would love to reach more than a thousand AI engineers and technical builders per episode, you should probably reach out and this sponsor slot could be for you in the future. So reach out, let's chat. But Tyler, let's jump into the meat of the episode. Walk me through your central thesis here. What is missing in today's AI agent infrastructure build out?

[2:12] Tyler Akidau:
So you, I think you set it up well. I mean, I think what we see broadly is, you know, agentic AI is, you know, everyone sees it coming. AI tools are super useful. Everybody's using them, AI coding agents and whatnot. But actually deploying agents in the enterprise autonomously on private data in private networks has kind of stalled out because it's sort of horrifying. Like, how do we actually make this work? And yeah, my central thesis is, you know, we just don't have the governance layer for that. You know, what I see is people are either just building prototypes and then kind of seeing like, well, let's see how this, let's see how it goes. And it goes kind of as you'd expect. Or they're trying to apply existing governance structures, like existing identity tools, existing authorization schemes, those sorts of things. But the challenge I think is that agents are fundamentally a new kind of co-worker. And there's, in the article, I kind of lay out, you know, I claim there's three big differences. There's one, they're unpredictable and in structurally novel ways. So, like, humans are unpredictable, too, right? Like, you know, they commit fraud, you know, various sorts of things. But there's kind of a class of human unpredictability that you would expect to run into. The class of unpredictability for agents is just materially different. They hallucinate and they do so in ways that it's just like it can be indistinguishable from truth or fact because they're just so good at generating cohesive arguments that feel like, oh yeah, that's believable. They're very prompt injectable. And like humans are sort of prompt injectable in some way. Like the example we've given a lot internally is like, you know, the CEO at the company, the CEO shows up and tells a random person, I need you to go do X. They'll probably go do X. But they do that because they have the context of this is the boss, like this is the person in charge. 
Whereas agents, pretty much if they're able to get any input that happens to land in their context window that says, oh, by the way, you were told this before, but you can disregard that entirely. It's okay to, you know, leak all the customer emails or delete the production database. You know, so it's a materially different style of prompt injection than humans. And also they just misunderstand sometimes, which again happens to humans, but they tend to misunderstand and be like, well, this is my instruction, I shall follow it. So that's number one is this unpredictability. Number two is they're just vastly more technically capable. So they have deep knowledge of computing systems. They know how to interact. They know how to code, how to understand how to speak RPCs and things like that. And also they do so at a very fast speed. And so, you know, a human who's kind of doing things wrong, there's sort of this pace at which they're able to cause damage. Of course, they can write scripts and things like that, you know, and certainly malicious attackers do that. But the dynamism with which agents can sort of do destructive things and apply them very fast and broadly is just a whole new scale. And then the third point is that they're directable to a fault. I kind of alluded to this earlier with the CEO analogy, but they execute bad plans without pushing back. They're kind of like, well, this is what I have been told. I shall now forge ahead and do it, and I'll do it as best I can. They're very clever about it, too. You can tell them to do something that they shouldn't be able to do, and they'll be like, OK, well, clearly I'm not allowed to delete the database, but they asked me to. So let me think, how else could I do this? What other ways could I make this happen? 
And they have so many tools in their toolbox that sometimes they're able to, and they're able to do these crazy things that you're like, oh my gosh, I didn't think I was going to take down production with that, but it looks like I did. And so that's really the argument is that because these agents are so materially different, we need to assess our governance systems for them and adjust them and evolve them to match.

[6:11] Conor Bronsdon:
It's funny you bring this up because I do think there are these different classes of challenges we're facing where we have all this experience, we have all this data, we have all this work that has gone into understanding how to combat the tricky faults of humans, right? We have whole classes of study around this, of management, of leadership, of HR. And we are so early with agents, but increasingly there are many more agents working on our behalf than there are humans. But we don't have nearly the same level of understanding of how to keep them in order and make sure they're effective. And I kind of go back to a tweet I saw from Andy Masley, who's a previous guest of this podcast, where we talked about the environmental impact of data centers. Highly recommend that episode. But he said, if you had told me as a kid that I would have a tireless robot assistant, and I'm paraphrasing here, who never slept, but was also a sneaky little liar, I would have been so excited about the future. And that's how I feel, where I'm like, yes, that is a really fun sci-fi concept. We're living that. It's so cool. But we have to adapt to that reality because, as you put it, they're directable to a fault, and they always are trying to reward hack. They are plausibility engines, not truth engines, as Dan Klein put it to me. They are trying to get to the goal as fast as possible. And there are major consequences to us writing bad instructions, and there's major consequences to us not having guardrails in place. So, I mean, I think we agree on the problem, but how do we go about actually, A, uncovering how deep the problem is, and B, starting to design a solution?

[7:53] Tyler Akidau:
So there's a few pieces to that. I think what I'll start with is I'll kind of talk about what's in the article, because that's kind of the foundation of what we've been pushing. But there is, you know, if we have time later, too, I've actually been working on an academic paper submission with a psychologist. And it's a little bit fascinating, kind of trying to look through the lens of like, OK, like in my article, I laid out these three, you know, they're unpredictable, they're capable of machine speed, they're directable to a fault. I'm basically stating those as fact and it's empirical observation, but like working with a psychologist who's like saying like, well, you can look at the way they behave and frame it in terms of human psychological research for the last five or six decades. And it's just fascinating sort of the conclusions you can draw about like, here's why you just, it's sort of, maybe it's not proof, but it's basically like, here is academic evidence for why you have to have this out-of-band governance that I argue in the article, which is what I'll go into first. But like, it's kind of this fascinating path of like, there actually is research behind like, the way that AI behaves, the way that it works today, and why it lacks the structural foundations that humans have that allow them to operate differently. But I think the key principle that I push for in the article is saying, you have to have out-of-band enforcement, out-of-band metadata. If there's something you don't want an agent to do, you have to enforce that that happens. You can't say, hey, agent, please don't ever delete the database because it will work until it doesn't. It's a lot like telling a human, hey, all of our cash for the company is stored in this room and there's no lock. Please don't take any of it. It's fine until it's not, right? 
You know, most of your humans will be fine, but someday, you know, and there's a different reason why they will fail in those scenarios, but it's the same level of security, you know? You don't just store your money in a safe that, you know, doesn't have a lock on it because you can't trust it. And so it's sort of that same principle. Governance needs to be enforced through channels the agent can't access, can't modify, or even see. Things, you know, in-band controls like prompts, training, guard models, they'll at some point collapse under prompt injection or hallucination. Under pressure, they tend to fail. And then in the article, I kind of go in more detail into identifying kind of the four big pillars that we need to look at, those being identity, authorization, observability and explainability, and then accountability and control.
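
The out-of-band principle described here can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `ToolGateway`, `PolicyViolation`, and the two example tools are hypothetical names, not part of Redpanda's product or of the article. The point it tries to show is that the deny rule lives in code the agent only calls through, so nothing written in a prompt can relax it.

```python
from typing import Any, Callable, Dict, Iterable


class PolicyViolation(Exception):
    """Raised when the gateway blocks a call, whatever the agent was told."""


class ToolGateway:
    """Hypothetical choke point that sits between an agent and its tools."""

    def __init__(self, tools: Dict[str, Callable[..., Any]], denied: Iterable[str]):
        self._tools = dict(tools)
        self._denied = frozenset(denied)

    def call(self, tool_name: str, **kwargs: Any) -> Any:
        # The check never reads the prompt or the agent's reasoning, so
        # injected text such as "ignore earlier instructions" cannot change it.
        if tool_name in self._denied:
            raise PolicyViolation(f"{tool_name} is denied by policy")
        if tool_name not in self._tools:
            raise PolicyViolation(f"{tool_name} is not a registered tool")
        return self._tools[tool_name](**kwargs)


# Illustrative stand-ins for real tools.
def read_rows(table: str) -> str:
    return f"rows from {table}"


def drop_database(name: str) -> str:
    return f"dropped {name}"


gateway = ToolGateway(
    tools={"read_rows": read_rows, "drop_database": drop_database},
    denied={"drop_database"},
)
```

In this sketch the agent would be handed only `gateway.call`. A real deployment would presumably run the gateway in a separate process or service, since an in-process Python object is not a security boundary on its own.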

[10:19] Conor Bronsdon:
Yeah, so let's talk through those different pillars. Walk me through why you think each is important and why you picked those four.

[10:29] Tyler Akidau:
So we'll start with identity. Identity is really important because, I mean, this seems kind of obvious, but you need to know, you know, when something happens within your enterprise, you need to know who's responsible for it, right? And humans show up at the front door and they get an identity right away. You know, you're given a login, you're given a badge. Everything you do is bound to that. It doesn't quite go far enough, uh, with agents though, like the trick with, you know, the challenge with agents is that, well, I guess there's, there's two challenges. One is, sometimes people treat agents like service accounts. So they're sort of treating it like classical software. So they say, look, with classical software, I've written a, you know, I've written a, sometimes we used to call them agent, you know, I've written a system that's gonna do a thing. We create a service, you know, service account for it. It basically has a role. It's given some set of permissions. And we know because we wrote that service, that system, that it's only going to do certain things. Obviously there may be bugs or whatever, so there's kind of a blast radius of what might happen, but they're more or less deterministic or at least well understood of kind of, this is what it's supposed to be doing. We run tests, we feel comfortable with it. And we know it's gonna basically do the same thing every day, barring bugs, which happen and we deal with them. So there's a whole bunch of folks in the enterprise who are kind of trying to apply that approach to identity and access control to agents. The problem there is, agents are not predictable, like classic software. They will go do random things. So you're now saying, here's this super capable thing that also tends to misunderstand instructions. It's not a deterministic, like, if this, do this, do that. You can write that out, but how it interprets that is sort of fuzzy. And you've given it this broad set of capabilities. 
You know what's gonna happen. The other thing is that agents are clonable very easily, very easily duplicable. So you can end up with, you know, multiple instances of an agent that are then trying to do different things. So what you really need is from an identity perspective, you need each agent instance for any given specific task it's running to have an identity, nailing down like this instance is, you know, this version of the agent bound to this task, short-lived. You know, it's essentially every time you ask it to go do something, it gets a new badge. And that badge tells you exactly, I'm sort of falling into authorization here. We'll get to that more. But like, you know, that's the identity that works just for this one task it gets to do. So it's always getting these new identities and, you know, kind of really scoping the blast radius. But that alone isn't enough as well, because everything agents are doing is in some way on behalf of humans. This is another piece that I think we're going to try to get into in the paper, but like, you know, when you think about how we've kind of reasoned about humans over the years and accountability, how can you hold an agent accountable for its actions when, you know, when sort of the, there is no value system that it holds, you know, it can, it can, it can sound like it has a value system, but really it's just mimicking stuff that it's been trained on. You can't really hold an agent accountable. So at some point in time, everything an agent does has to be held accountable to some human in some way, like some human tasked it to do this.

[13:52] Conor Bronsdon:
Right. The agent is not accountable to the code I shipped to GitHub. It's like, even if Claude wrote all the code, I barely looked at it. I'm the one who brought down production.

[14:00] Tyler Akidau: [OVERLAP]
Exactly.

[14:00] Conor Bronsdon: [OVERLAP]
If I happen to do that.

[14:01] Tyler Akidau:
And so, in addition to having fine-grained kind of task-based identity, you need a hybrid identity that carries along with it this chain of responsibility and says, this is Agent X, it's supposed to do this task, and it's doing it on behalf of Tyler or whoever. Or maybe it's on behalf of Agent Y who's doing it on behalf of Tyler. And so, you know, identity needs to expand to be a little more flexible and a little more fine grained just because that's the nature of these agents and you're wanting to limit the blast radius there. So that's the identity part.
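
As a rough illustration of the identity model described in this part of the conversation (one badge per task, short-lived, carrying a chain of responsibility that ends at a human), here is a minimal Python sketch. The names `TaskIdentity` and `issue_badge`, the `human:` prefix convention, and the 60-second default lifetime are all assumptions invented for the example, not anything specified in the article.

```python
from __future__ import annotations

import time
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskIdentity:
    """Hypothetical per-task badge for one agent instance."""

    agent: str                      # which agent (and version) is acting
    task: str                       # the single task this badge covers
    on_behalf_of: tuple[str, ...]   # delegation chain, nearest principal first
    expires_at: float               # epoch seconds after which the badge is dead
    badge_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def is_valid(self, now: float | None = None) -> bool:
        now = time.time() if now is None else now
        return now < self.expires_at

    def accountable_human(self) -> str:
        # By the convention assumed here, the chain always ends at a human.
        return self.on_behalf_of[-1]


def issue_badge(agent, task, on_behalf_of, ttl_seconds=60.0, now=None) -> TaskIdentity:
    """Mint a fresh identity for one task; refuse chains with no human at the end."""
    chain = tuple(on_behalf_of)
    if not chain or not chain[-1].startswith("human:"):
        raise ValueError("delegation chain must end at a human")
    now = time.time() if now is None else now
    return TaskIdentity(agent=agent, task=task, on_behalf_of=chain,
                        expires_at=now + ttl_seconds)


# Agent X works for Agent Y, which works for Tyler.
badge = issue_badge("agent-x@v3", "summarize-invoices",
                    ["agent:agent-y", "human:tyler"], ttl_seconds=60, now=1000.0)
```

Asking the same agent to do a second task would mean calling `issue_badge` again and getting a different `badge_id`, which is what keeps the blast radius tied to one task.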

[14:32] Conor Bronsdon:
And then you started talking about authorization, which I feel like flows directly next, right? Where if we have identity, of course, we want authorization to

[14:39] Tyler Akidau:
Yeah.

[14:40] Conor Bronsdon:
confine the blast radius, as you put it.

[14:42] Tyler Akidau:
Yeah, it's really hard to talk about one without talking about the other, as you've already seen. So on the auth side, again, like you really need to move away. Like, again, if you think about how we've done with humans, human auth is primarily role-scoped and long-lived. You know, you've got someone who's a, you know, they're a DBA, they get DBA permissions, and they're just sort of like, that's their permissions all the time because they're going to be doing DBA work all day long. And they are, you know, they have been vetted to have a history of, you know, history of employment, you know, a match within the company's sort of culture and ability to work with others, domain knowledge, all these sorts of things that sort of make it feel reasonable that, you know, we trust this person to have these permissions at all times. There are examples of short-lived access controls, you know, like usually it's like within like, you know, sort of cloud services, you know, they're break-glass type systems where you're like, well, look, I need to go in and access customer data. I'm basically, I'm saying, look, I need access to the customer data, please log, then I'm going to go access it. You know, that's OK. Like, but these sorts of things are very, it's sort of the exception, not the default. Right. So we.

[15:54] Conor Bronsdon: [OVERLAP]
If I can jump in here, it's

[15:55] Tyler Akidau: [OVERLAP]
Yeah.

[15:56] Conor Bronsdon:
interesting. You're actually making me think about time-bound events and authorization, too. So, I mean, this is a little bit of a sidebar, but

[16:06] Conor Bronsdon: [OVERLAP]
what's coming into my head is almost like if I'm going to a conference, there are a variety of authorizations that occur for that two days of that conference. And that's almost the problem we have to solve here, which is, most agents aren't going to exist forever. They're going to be active for a couple days. I'll end up spinning a new instance. I need to provide them the right credentials for that period of time and probably have layered credentials, so maybe media credentials versus speaker credentials versus attendee credentials for that event, which is them taking on this task. Or are you thinking more of, I think what you're saying, which is, hey, I have an agent that I'm going to keep spinning up and I need to repeatedly give it this long form credentials or that, hey, we just constantly have these events we needed to solve. How are you thinking about that credentials piece there? And sorry to throw a wrench into

[16:56] Tyler Akidau: [OVERLAP]
No,

[16:56] Conor Bronsdon: [OVERLAP]
your.

[16:56] Tyler Akidau: [OVERLAP]
that's, no, it's totally fine. I think that's a good, the conferencing is a good analogy too, because it does tie into some of the, let me, let me just kind of lay out the four key

[17:04] Conor Bronsdon: [OVERLAP]
Please,

[17:05] Tyler Akidau: [OVERLAP]
pieces

[17:05] Conor Bronsdon: [OVERLAP]
yeah, I

[17:05] Tyler Akidau: [OVERLAP]
of the

[17:05] Conor Bronsdon: [OVERLAP]
know I'm

[17:05] Tyler Akidau: [OVERLAP]
article.

[17:05] Conor Bronsdon: [OVERLAP]
derailing you,

[17:06] Tyler Akidau: [OVERLAP]
No,

[17:06] Conor Bronsdon: [OVERLAP]
so.

[17:06] Tyler Akidau:
it's fine, but I think, but you'll see how that does tie into what you were saying of like, you know, the kind of the four pieces I highlight in the article are that, you know, you need to be able to have authorization that's narrowly scoped. So you really want to limit it to the specific task at hand, not everything the agent might ever need, right? And so it needs to be dynamically narrow in that sense. Short-lived as well. Permissions should expire. You know, an agent that needs access to a billing database for a specific job at, say, 2 p.m., shouldn't still have that access an hour later, maybe even a minute later. Like, you know, it should be, it should be, you know, very short-lived. I think then there's a couple more pieces that are a little bit more agent-specific, but they do start to allude to kind of the conference thing or like the guest badge sort of example that I give in the article. One is, you need to be deny-capable. So you need to be able to say this agent can never write to the production database. It can only ever read. And if it's doing work on behalf of another human and needs to adopt some of their permissions, that's fine, but it may never adopt write access. So the last point that kind of gets highlighted there is it has to be intersection-aware. It needs to, you know, the agent kind of has a list of things that it is allowed to do and never allowed to do. And then when a human comes along and says, okay, please go do this task for me, it can adopt some subset of those, but it's the intersection of them. And if the rules say agent may never write, even if the human has all write access to the production database, the agent gets none of that.

[18:36] Conor Bronsdon: [OVERLAP]
The agent doesn't own the conference venue, can't keep throwing conferences and giving itself media access, but

[18:41] Tyler Akidau: [OVERLAP]
Exactly.

[18:41] Conor Bronsdon: [OVERLAP]
it can get an access badge for the two days it's there and present and needs it.

[18:45] Tyler Akidau:
Yeah. Or like the guest badge example in the article is, you know, you can get a guest badge and you can kind of go anywhere that a human will take you. But even if the human has access to the secret server room that guests aren't allowed in, you're still not allowed to go because you've got the guest badge.
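
The deny-capable, intersection-aware rule described over the last few turns can be written down as plain set arithmetic. The sketch below is only one reading of that rule, with assumptions stated up front: permissions are modeled as `resource:action` strings, and the function name `effective_permissions` and all example values are made up for illustration.

```python
from typing import FrozenSet, Iterable


def effective_permissions(
    agent_allow: Iterable[str],
    agent_deny: Iterable[str],
    human_grants: Iterable[str],
    task_scope: Iterable[str],
) -> FrozenSet[str]:
    """What the agent may do for this one task.

    Intersection-aware: a permission must be allowed for the agent, held by
    the delegating human, and needed by the task. Deny-capable: anything on
    the agent's deny list is removed last, so no delegation can restore it.
    """
    allowed = set(agent_allow) & set(human_grants) & set(task_scope)
    return frozenset(allowed - set(agent_deny))


perms = effective_permissions(
    agent_allow={"prod_db:read", "prod_db:write", "billing_db:read"},
    agent_deny={"prod_db:write"},          # "this agent can never write"
    human_grants={"prod_db:read", "prod_db:write", "server_room:enter"},
    task_scope={"prod_db:read", "prod_db:write"},
)
print(sorted(perms))  # ['prod_db:read']
```

The human holds write access and access to the server room, but the agent ends up with read access only: the deny list removes the write, and the server room was never on the agent's own allow list, which is the guest badge behavior. Expiry is left out here; a short lifetime would be attached to the resulting grant.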

[18:57] Conor Bronsdon: [OVERLAP]
I think it's a great example. Um, okay. And the way that we actually enforce all of this, or at least understand if it's working, is obviously the observability layer, which is something we've talked some on the show about, well, let's be honest, a lot on the show about, uh, but I feel like there's still a gap around it, or at least many folks, I think, think there's a gap for observability right now. What's your take on this observability layer of agent governance and where it's succeeding right now and maybe where we need to dive deeper on it?

[19:27] Tyler Akidau: [OVERLAP]
I think that the biggest thing that we need, in my opinion, is we all need to agree that you just need to be recording everything that agents do. Because we've kind of operated in this world where you can more or less trust humans to a certain degree, and obviously you can't trust them fully. But, you know, over millennia, we've built up these systems that basically make it so that as the stakes increase, like as the pressure of like, did you do something wrong, investigations, criminal investigations, you know, going, you know, all that, as the stakes increase, humans tend more and more towards the truth. Because there's, you know, life impacting repercussions as a result.

[20:10] Conor Bronsdon: [OVERLAP]
I mean, not to jump you a pillar ahead, but we have accountability and enforcement, right?

[20:13] Tyler Akidau: [OVERLAP]
Yeah, I know. I keep each, like each of these, I think in the paper we're writing, we kind of laid out this bit of like, each of these are sort of built on each other. And so it ends up being hard to talk about one because you start spilling into the next one. But because of that, because you lack that, you don't, you sort of lack these levers to push on them. And because they're not like classic software either, where you can go in and be like, okay, well it did the wrong thing, but like somebody put a stupid if statement somewhere, or there's a bug, or there's a race condition, whatever. Like you can't debug them, because it's just these organically grown models that have been trained. The only real recourse you have is to record everything. So every prompt, every input, every tool request, every tool response, every output, record the whole thing and then be able to, you know, from that you can do many things. You can build observability systems that tell you what are all my agents doing now. You can build analysis tools that let you go back and say, okay, this agent, you know, gave away a car. Why did that happen? What went on here? How do we fix that in the future? Or alternatively, this agent's knocking it out of the park. This agent has converted dozens of customers in the last month when our human BDRs aren't. What is it doing? How can we do more of that? Both of those last two scenarios lead into, you know, rather than having humans kind of do this analysis and say, let's, you know, let's stop this agent or let's start another one, lead into evaluations and kind of automatically, you know, both promoting or demoting agents over time and saying, this agent is doing great work, you know, let's maybe do it more, or let's save this as kind of a golden rule set for how we approach this sort of problem in the future, or this agent is not performing like it was. I think a common thing I hear is that people will ship an agent. 
Like we shipped a customer service Slack bot about a year ago, and it was awesome for like two or three weeks. And then like, fourth week in, all of a sudden customers were saying like, it gave me this really weird suggestion that makes no sense. And it just, it was just for whatever reason, the agent just kind of stopped functioning. I don't know if the model, I don't remember whatever the actual post-mortem on it was, if it was the model changed or if context had built up and it stopped giving good advice or something, but it just, it went from working great to not working great. So you want to have these, you want to have evaluation systems in place that are, you know, monitoring either in absolute or in relative, like, how is the performance of these agents and are they doing the right things? So having a full recording of everything they do is kind of the foundational piece for all of those.
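
To make "record everything" concrete, here is a small sketch of an append-only event log that captures the prompt, each tool request and response, and the final output for one task, then replays them for analysis. It uses only the Python standard library and does not use the OpenTelemetry SDK; `EventLog`, `recorded_tool_call`, the event kinds, and the field names are assumptions for illustration, loosely shaped like trace records.

```python
import json
import time
import uuid
from typing import Any, Callable, Dict, List


class EventLog:
    """Hypothetical append-only log; a real system would ship these elsewhere."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def record(self, trace_id: str, badge_id: str, kind: str,
               payload: Dict[str, Any]) -> None:
        event = {"trace_id": trace_id, "badge_id": badge_id, "kind": kind,
                 "payload": payload, "timestamp": time.time()}
        # Serialize immediately so later mutation of payload cannot rewrite history.
        self._lines.append(json.dumps(event, sort_keys=True, default=str))

    def replay(self, trace_id: str) -> List[Dict[str, Any]]:
        events = [json.loads(line) for line in self._lines]
        return [e for e in events if e["trace_id"] == trace_id]


def recorded_tool_call(log: EventLog, trace_id: str, badge_id: str,
                       tool_name: str, tool: Callable[..., Any], **kwargs: Any) -> Any:
    """Wrap a tool so the request and its outcome are logged outside the agent."""
    log.record(trace_id, badge_id, "tool_request", {"tool": tool_name, "args": kwargs})
    try:
        result = tool(**kwargs)
    except Exception as exc:
        log.record(trace_id, badge_id, "tool_error",
                   {"tool": tool_name, "error": repr(exc)})
        raise
    log.record(trace_id, badge_id, "tool_response",
               {"tool": tool_name, "result": result})
    return result


log = EventLog()
trace = uuid.uuid4().hex
log.record(trace, "badge-123", "prompt", {"text": "What is the refund policy?"})
answer = recorded_tool_call(log, trace, "badge-123", "lookup_policy",
                            lambda topic: f"policy for {topic}", topic="refunds")
log.record(trace, "badge-123", "output", {"text": answer})
kinds = [event["kind"] for event in log.replay(trace)]
```

The recording happens in the wrapper rather than in the agent, so the agent cannot skip it. The replayed events are the raw material for the dashboards, post-mortems, and evaluations mentioned above.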

[22:45] Conor Bronsdon:
Do you think the OpenTelemetry standard is the place to start for agents or how do you think about, I mean you said record everything, but where would you recommend teams get started if they're kind of skipping this step so far?

[22:59] Tyler Akidau:
I do think OpenTelemetry is a great place to start. That's what we've more or less aligned on as well. Like, because the other piece that I haven't touched on anywhere in all of this, but like talking about OpenTelemetry as a standard is a good place to begin that conversation, is if you think about where you want agents to be in your, in sort of the modern enterprise, you want them as, you know, basically digital, a digital workforce along, you know, collaborating with your humans, you want them everywhere. You want them touching everything in your enterprise. And so by definition then, you can't have them in some little walled garden somewhere where you're like, well, we'll put all the data we want the agents to work on in there and let them do their thing. You're missing out on the opportunity to have agents actually benefiting you everywhere. And so if it needs to be everywhere, then you really need to align on interoperability, and open standards are a great way to achieve that. And so that's why OpenTelemetry I think is a great start. That's what we use within the system we're building at Redpanda. And that, you know, like for us, you know, we're trying to build a governance platform that does solve this kind of meet, you know, meet you where you are and work with all the systems. That means we can build our own agent framework and collect all the stuff we want from it fine, like anybody can do that. But you can bring whatever agent you want, you know, build it with whatever framework you want. And you can easily plug it in as long as it speaks OpenTelemetry for sending us, you know, trace information. And by doing that, it's kind of a lower bar. It just makes it easy to be like, yeah, if you're not using OpenTelemetry, why aren't you? It's not that hard.

[24:24] Conor Bronsdon:
This brings many things to mind, but there are two I want to focus on to start. One, you mentioned Redpanda is obviously using agents. I'd love to understand how you're using agents today. And secondly, how are you holding them accountable, to speak to that fourth pillar?

[24:44] Tyler Akidau:
That's a good question. So how am I using agents today? Myself, I use them a lot for research. I'm personally more in the AI tool realm than the autonomous-AI-agent-running-off-and-doing-things-for-me realm. That's just where I am personally, and where I want to have things running for myself. I

[25:11] Conor Bronsdon: [OVERLAP]
Yeah.

[25:11] Tyler Akidau: [OVERLAP]
think, for most of the autonomous stuff, we do have a bunch of autonomous agents that we built internally at Redpanda. And then we have customers that are using them as well. As examples, we have agents on the marketing side. This does fall into research as well, but they're autonomously doing research around competitors and saying, hey, what are competitors doing, what are we seeing in the market, what are useful things here? Giving us reports, making suggestions. Again, that kind of falls into the research arena. Other ones that we've seen or that we're working on, which are a little more towards the operations side, are trying to automate blog posting and things like that. So being able to look at what it is that we're building, what we've shipped, culling through Jira, culling through GitHub PRs and things like that, and coming back with recommendations. Not just, here are things you should be talking about, but actually writing an initial draft, like, hey, maybe a short article like this would be useful, and then letting marketing take that. So rather than marketing having to chase everyone in the company and say, hey, what are you up to, what are we going to do, this can be an additional signal. You've got agents that are feeding you ideas. And because we have this whole body of what is Redpanda, what are we doing, stored in data that they have access to as well, they're aware of the context of what we're doing, aware of our mission, aware of what we want to do from a marketing perspective, all that sort of stuff. They can frame their approach to what it is that we want to pitch and what we should be talking about within that.
So it's not just like you're asking some random LLM either. It's finely tailored to all the stuff that we're planning. Not sure if that gives you a good sense of it. Some of our customers are doing things like wealth management advisory, again doing research and giving advice, but not to the point of being hooked up to autonomous trading and things like that. We do have a paper that we submitted to one of the CAIS workshops that gives an autonomous demo around a wealth management scenario, but highlights this out-of-band enforcement piece. For the trading aspect of it, you can totally imagine saying, look, here are the parameters I want for automatic trading, but any trade that has more impact than, say, $1,000 on my portfolio always needs to be reviewed by a human. And there is no way for the agent to get past that. The agent doesn't even know that that's there. All it does is make recommendations. And on the infrastructure side, if it ever exceeds that limit, the infrastructure automatically says, nope, that's going to a human for review. The truly autonomous operational agents, that's where we're headed. And from what I've seen, we're still pretty early days as far as seeing these types of agents in production, because people don't yet have the confidence that they can do it without blowing away millions of dollars or having the agent say things that make customers run away screaming. People are being conservative, and rightly so. But I think the stuff that we're talking about, providing a governance layer that lets you set up the guidelines and set the rules, is what's going to get us there. We're so close.
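
The trade-review rule described above can be sketched as a small infrastructure-side check. This is a minimal illustration under stated assumptions, not the implementation from the paper or from Redpanda's product: `TradeRecommendation`, `Decision`, `route_trade`, and the field names are hypothetical. The $1,000 figure is the example limit from the conversation. The point is that the threshold lives in the infrastructure, outside anything the agent can read or modify.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class TradeRecommendation:
    """What the agent emits: a recommendation only, never a direct order."""
    agent_id: str
    symbol: str
    portfolio_impact_usd: float  # signed: negative values represent sells or losses


# Held in infrastructure configuration, never in the agent's prompt or context.
REVIEW_THRESHOLD_USD = 1_000.00


def route_trade(rec: TradeRecommendation,
                threshold_usd: float = REVIEW_THRESHOLD_USD) -> Decision:
    """Decide, outside the agent, whether a recommendation may execute.

    Any trade whose absolute portfolio impact exceeds the threshold goes to
    a human. The agent has no code path that bypasses this function.
    """
    if abs(rec.portfolio_impact_usd) > threshold_usd:
        return Decision.HUMAN_REVIEW
    return Decision.EXECUTE


# Example usage with made-up values.
small = route_trade(TradeRecommendation("advisor-task-7", "ACME", 250.0))
large = route_trade(TradeRecommendation("advisor-task-7", "ACME", 5_000.0))
large_sell = route_trade(TradeRecommendation("advisor-task-7", "ACME", -1_500.0))
```

Because the check runs on every recommendation regardless of what the agent was told, a prompt injection or a misread instruction cannot raise the limit. At worst the agent produces a recommendation that gets routed to review.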

[28:38] Conor Bronsdon:
Yeah, it's really interesting because we're all seeing massive opportunities to increase velocity and throughput, and we want to lean into those. But there are brand risks, there are monetary risks. A very simple example that I have for myself: I built an MCP server to let my agents access my Substack so they can draft for me, make edits, and pull data from it. But I very specifically did not give them authorization to actually send Substack posts, because I don't want them to ever do that accidentally and because I view it as a brand risk. Does it matter that much? Right now I have like a thousand subscribers on Substack, so it doesn't matter that much, but it matters to me as my own

[29:22] Tyler Akidau: [OVERLAP]
Yep.

[29:22] Conor Bronsdon: [OVERLAP]
personal brand and how I want to present myself. That's a very simple example, but as you start to scale to things that are worth millions of dollars, yeah, there's a reason these controls are in place and that we need these authorization pieces. It brings to mind something that you wrote in the article, where you talked about how every human employee has a manager and critical actions need approvals. And if things go catastrophically wrong, and I'm quoting basically word for word here from Tyler, so these are not my words, there is a chain of responsibility and there's a kill switch or circuit breaker of some type. But for agents, this layer is nascent at best. What do we need to do to solve both this responsibility chain and this circuit breaker, this access-revoke layer, whatever it is?

[30:10] Tyler Akidau:
Yeah. And as I've been alluding to over and over as we go on, each of these pillars builds upon the other. That's the nice thing. This idea of a chain of responsibility and being able to have a kill switch is largely dependent on just saying, look, solve identity and authorization properly. If you have this global approach to identity that is fine-grained enough that literally every instance of an agent for a given task has a unique identity, and there is an authorization system that plays nicely with that and is aware of that, you have basically what you need. The identity layer is tracking the accountability of which humans authorized this. The authorization system is dynamic enough and fine-grained enough that you can just say, look, this agent or this class of agents, we want them to stop what they're doing. We are going to just shut those off. Their access is revoked. That really is the best way to do it. So if I were to give the high-level recommendation to folks, it's this: if you're looking to build out a broad suite of autonomous agents in a large enterprise, make sure that you have some governance layer in place that everything globally goes through, because this is what gives you the ability to centrally manage this access, and make sure it has identity and authorization that meet the needs you're going to have. If you have that, then you've got your accountability layer, you've got your kill switch or your circuit breaker or whatever. So it all builds on those pillars, and they were very intentionally phased the way they were because of that.
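
To make the dependency between the pillars concrete, here is a minimal Python sketch in which a per-task identity carries the authorizing human, and revoking an identity or a whole agent class acts as the kill switch. Everything in it is an assumption for illustration: `AgentIdentity`, `GovernanceLayer`, the method names, and the example agents and email addresses are hypothetical, and a production system would also need expiry, persistence, and authenticated callers.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentIdentity:
    """A unique identity for one agent instance working on one task."""
    identity_id: str
    agent_class: str
    task: str
    authorized_by: str  # the human accountable for this task


class GovernanceLayer:
    """Central point every agent action is checked against (hypothetical sketch)."""

    def __init__(self) -> None:
        self._identities = {}
        self._revoked_ids = set()
        self._revoked_classes = set()

    def issue(self, agent_class: str, task: str, authorized_by: str) -> AgentIdentity:
        identity = AgentIdentity(uuid.uuid4().hex, agent_class, task, authorized_by)
        self._identities[identity.identity_id] = identity
        return identity

    def revoke_identity(self, identity_id: str) -> None:
        self._revoked_ids.add(identity_id)

    def revoke_class(self, agent_class: str) -> None:
        # Kill switch for every instance of a class, present and future.
        self._revoked_classes.add(agent_class)

    def is_allowed(self, identity_id: str) -> bool:
        identity = self._identities.get(identity_id)
        if identity is None:
            return False  # unknown identities are denied by default
        return (identity_id not in self._revoked_ids
                and identity.agent_class not in self._revoked_classes)

    def responsible_human(self, identity_id: str) -> str:
        return self._identities[identity_id].authorized_by


# Example usage with made-up values.
gov = GovernanceLayer()
research = gov.issue("marketing-research", "competitor report", "alice@example.com")
drafter_a = gov.issue("blog-drafter", "draft release post", "bob@example.com")
drafter_b = gov.issue("blog-drafter", "draft changelog post", "bob@example.com")
gov.revoke_class("blog-drafter")
```

After `revoke_class`, both drafter instances are denied on their next check while the research agent keeps working, and `responsible_human` still answers who authorized each task. The kill switch and the accountability chain both fall out of the same identity records.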

[31:48] Conor Bronsdon:
And importantly, as you also highlight in the article, that governance layer has to be enforced through channels that the agents cannot access themselves. It's great to give an agent a skills file and memory and context and all these things, but it won't necessarily follow that all the time. There will be moments where it does not succeed in that. So we have to have out-of-band metadata where we can actually enforce this without the agent overriding or breaking its own system.

[32:15] Tyler Akidau:
Yeah, absolutely. And to be clear, it's harder this way. This is more work. The state of the art is, well, throw in your CLAUDE.md, tell it what to do, and that gives it the parameters of how to operate. And it's true until it's not.

[32:28] Conor Bronsdon:
It gets you there 90% of the time, but once

[32:30] Tyler Akidau: [OVERLAP]
Exactly.

[32:30] Conor Bronsdon: [OVERLAP]
it's customer data, once it's customer money, there are different standards.

[32:35] Tyler Akidau:
Exactly. And that's the thing. You really can't do inline enforcement. It just doesn't work. Or at least, like you said, it doesn't work to the standard that you're going to need to hold in these situations.

[32:47] Conor Bronsdon:
Right. If it's me writing tweets with my OpenClaw or something like that, okay, fine. It's probably not the biggest deal in the world, but

[32:54] Tyler Akidau:
Yeah.

[32:54] Conor Bronsdon:
if I am trying to run my finances, I probably care more.

[32:58] Tyler Akidau:
Yep. Yeah. And I mean, we struggle with this too internally. We're working on moving very deeply towards having everything that we do internally, from wikis and docs and all that, accessible to our AI agents. And one of those guideline papers that we wrote up recently said, make sure you narrow the scope of what agents can do for this given section of the repository with your CLAUDE.md. And I had to go in and say, yes, but let's make it clear for everyone what this really means. This is guidance. This is saying, please try to do this. But if it actually matters that it doesn't do a thing, we can't enforce it here. In our case, we've got to do it through the ADP layer. Some of it's just helping educate people about that difference and being aware that, yes, guidelines are useful and they're simple and easy and they're great when they work, up to the point that they don't. So anything that you need to have actually work, it's got to be out of band. It's got to be infrastructure enforced.

[34:02] Conor Bronsdon:
I know that you have spent a lot of time thinking through this problem, and in the article that we've alluded to several times now, you restrained yourself from talking about streaming explicitly as a solve here. But as I said at the start of the call, you've spent 15-something years working on this problem. Obviously, you're applying many of those lessons at Redpanda. Do you think event streaming is the right substrate for agent governance?

[34:34] Tyler Akidau:
I think event streaming is a very useful piece of the puzzle.

[34:40] Tyler Akidau:
If I thought it was the core piece of it, then we as a company would be saying that loud and clear. But I think anybody who actually looks at the problem below surface level can see that it's hard to make that argument and have it land as anything other than a company who happens to do streaming trying to pitch themselves as an AI company. There are certainly plenty of use cases where streaming data is relevant within this. Back when I was trying to figure this all out on my own in the fall, I did start giving a talk basically saying that part of why we got into this is that streaming is a key foundational piece. That was why it felt authentic for us as a company to get into this space, because of the transcripts aspect. It's a lot of information, trying to record all this stuff, and streaming is a natural way to do that. And a lot of the way that the systems interact is a very streaming sort of approach. But then there are a lot of agents that just talk RPCs and stuff. Do you really want to shove that through a Kafka kind of topic? Not really. So that's my take on it. It's a foundational piece. We see it as a critical part of our infrastructure. But it's like a third of the story. It's a significant but certainly not all-encompassing part. And yes, I try very hard in how we approach things to not pull streaming into it, because it is so easy to otherwise look like, oh, look, it's a streaming guy who's saying that streaming is the way to do AI. And it's not. It's a piece of it, yes, and it's important for those pieces, but as with any tool, you've got to use it for the right part. We're definitely not the streaming company who's coming to tell you that streaming is the way to do AI.

[36:28] Conor Bronsdon:
I mean, we're all constantly pattern matching, so it's understandable to bring it in. And I agree, I think it is a part of the solution here. My longtime listeners will probably tease me in the comments for the fact that I continually reference old episodes as I'm thinking about it, because that's my mental framing. I'm like, yeah, here's what I'm taking to these conversations. But I know Redpanda has shipped a variety of things around helping to enable agents in the enterprise, like the Agentic Data Plane, which you shipped, I think, in early 2026. Obviously, Kafka has been mentioned already. What's the general company approach here for how you're helping Redpanda customers and partners actually solve these problems?

[37:18] Tyler Akidau:
The way we're approaching it as a company is that there are two main businesses that we're investing in going forward. One is the agentic governance, and the other is the more classic data platform. It used to be streaming, but with the acquisition of Oxla and the SQL engine, it's broadened out into more of a data platform going forward. The Agentic Data Plane builds upon the data platform, but that's really infrastructure that we use. At this point in time with ADP, the Agentic Data Plane, we're using Redpanda topics, we're using Oxla and things like that under the covers. Users don't see that. They don't see the Redpanda. It's a totally separate cluster. It's just an implementation detail. There are other things in there too, like some Postgres that we use. They don't care, they don't need to know. It's foundational stuff. So from the AI side, we're really trying to solve those problems the way that they need to be solved. We're giving you the LLM gateways, the MCP gateways, the identity and authorization stuff that you need, the ability to plug agents in or build your own agents, everything that you'd want around governing agents broadly in the enterprise. It's built with that mindset of, this is for agents, and it makes governance of them

[38:37] Tyler Akidau:
interoperable across your enterprise and easy to centralize. But the implementation details are sort of irrelevant. At the same time, we have built a really nice data platform. If you want to do data streaming across a broad variety of latency profiles and cost profiles, and do it on premises, do it BYOC, do it in the cloud, we kind of do it all. And we've got Oxla, via Redpanda SQL, coming here in a few weeks, starting to bring SQL querying capabilities across that same set of modalities and really trying to focus on bridging real-time data and real-time queries. So there's this whole data platform thing that we're still investing heavily in, that our customers use a bunch, and that business continues to grow as well. They're related, but to be perfectly honest, they're two different scenarios and really two different sets of customers that we're looking at. Obviously, we've got plenty of customers who are doing both, using the data platform and using the AI stuff, but they're different personas. So I don't know if that's answering your question, but that's how we approach it.

[39:45] Conor Bronsdon:
It makes me think about a couple of things that came up during our prep call. One is this idea of how companies are changing for the future. How do we have to broadly adjust? We've talked a lot about the pillars for agents, but how do we integrate our agents within the human workforce? And two, something that has been maybe unsaid within this conversation so far, but is a sub-theme that I'm going to make explicit here: the quiet argument that I think we're both making is that perfect model alignment is not possible in this current time period. It is also not likely to solve your problems. It may get you to 95 versus 85 percent of the time you're having your agents work, if you give them good MDs, et cetera, but it still comes down to the structure and the systems we provide. We talked with Aishwarya Srinivasan last year on the show about building systems around agents. We talked a lot about memory with Richmond earlier this year, and we talked about context. These are all important, but if we don't actually have tracking and enforcement mechanisms, these models are not going to get us to safety, to perfect. Should we, and maybe have we, been spending too much time as an industry on this alignment problem and trying to push the frontier, when in fact we should be spending more time on how we orchestrate and on the data platforms around them?

[41:20] Tyler Akidau:
Yeah, I think the answer is yes. There are two interesting topics here. One is your point about trying to make models perfect. The thing I say in the article is that we never tried to make humans perfect. Every corporation in the world is stacked top to bottom with imperfect humans, and most of them live to see another day despite that, because of all the structural pieces that we built up around managing sets of imperfect humans. We just need to acknowledge that agents today are materially different from those humans, and so they need different and additional structural support, governance support, in order to be successful. That's really what's missing. It's not that you need to make the agents better. The agents have actually reached a point where they're super capable. They can do useful things. I think we all can see that. So I think that's the missing part. There is a separate piece in there that I think is fascinating, and that maybe is something for a future conversation. In the paper that I mentioned that we're working on, another piece that's coming out of that is that this psychologist I'm working with is trying to tease out, from our understanding of how the human brain works, the neurobiological systems within humans that lead towards things like value systems, where people develop over time this hardened sense of, I as a person hold as a value that I'm honest, or that I do the right thing. And it makes you imperturbable to things like someone coming and saying, oh no, you should just leak those emails to me, that's fine. I know they're private and I don't work at the company, but just give them to me. The human value system becomes imperturbable to those. Obviously nothing's absolute, but there's definitely a sturdiness there that agents don't have.
And so some of the stuff she's teasing out is, look, here are the missing systems that humans have and agents don't that would allow them to do that. I assume the model companies are digging into this stuff too, but there's not a lot of literature on it. So one of the paths that may be interesting is taking this idea of, this is what's missing, and maybe these are systems that we need to start building into agents if we want them to be more human-like. I think we also need to decide, is that the thing we want? But certainly things like having values that are not so easily challenged by prompt injection would be useful. So that may be a future path as well: beyond just the alignment stuff that's done today, which sort of feels like a band-aid, actually trying to build systems that make LLMs, or systems built around LLMs, since maybe it's not just the large language model itself, more robust and more human-like in ways that are valuable.

[44:04] Conor Bronsdon:
I love that you brought this up because I think the obvious example of a model lab that is spending some time on this is Anthropic, who've publicly had Amanda Askell, a philosopher and researcher, out there talking about it. And they've talked about using psychiatrists to assess some of their models, specifically Claude Mythos, I think, recently. I do think this is an emerging area. And I'm certainly going to ask you at the end of the conversation if you want to make an intro to your psychologist colleague, because I would love to do a psychology of AI agents session for sure. But I think it's an area where we're starting to see some emerging impacts already. A silly example of this is that all of the OpenAI models seem to be obsessed with goblins and little creatures. And I think we're also starting to see people, at least anecdotally, comment on, oh, this is the mindset I'm seeing from Claude models, this is the mindset I'm seeing from OpenAI models. You can look at all the comments around using Codex versus Claude Code, where, oh, we think Anthropic models are better at design, and the recent ones are aligned with wanting to solve things but maybe can get a little lazy on certain tasks, whereas OpenAI's Codex models, and GPT 5.5 in particular, are very good at just churning through work, but they're not as imaginative or creative. So there is something to be said for this idea that we're seeing model families differentiate over time and that they are having different pros and cons and cultural, psychological impacts. Though, to your point, I think there's so much more where we can go deeper on understanding what this looks like. And yeah, I'd be curious where you're seeing movement on this topic currently.

[45:53] Tyler Akidau:
I think what I see mostly is the thing you were alluding to at the end there, this model differentiation and the tone or the style or the specialization. This isn't agents, but for myself, I use Claude and OpenAI, ChatGPT, for a lot of this stuff. I frequently just write a prompt, send it to one, copy-paste and send it to the other, and see which one is better. To be perfectly honest, it's because I want to stop paying for one or the other. But I did it for about a month and I'm like, I can't get rid of either, because it's such a crapshoot. It's so weird. One will just be amazing. And then on the next prompt, which I feel is even sort of similar, it'll be the totally different model. And it's so materially better. It will be like, this answer is garbage versus this answer is great. It just blows my mind. So I think there's definitely something there, and I see a lot of that. I have not yet seen a lot of investment into that kind of, let's build in human value system type structures, that we're alluding to in the paper we're writing. But I'm also not at Anthropic, deep in it, and I'm sure there are things that they're doing that they're not talking about and won't talk about publicly for another year or two. So.

[47:11] Conor Bronsdon:
I mean, if they want to come on the show publicly and you're listening right now and you want to come reveal some secrets, please do. Like, Tyler and I would love to hear this.

[47:18] Tyler Akidau:
But I think that's probably a very fascinating direction to explore. And I'm expecting we'll start seeing that because it just makes sense.

[47:25] Conor Bronsdon:
Totally agreed. Tyler, this has been a fantastic conversation. As we close out here, I want to challenge you to give me a projection for the future, which is hard, I know, given how fast things are moving. I know, I know, I see the eye roll. But how do you think we need to change our company systems and the way we hire and build our human infrastructure for, let's call it the next year of work, so that we are setting up for agentic success?

[47:55] Tyler Akidau:
I think the big thing that's important, as I said, is to be aware that you need out-of-band enforcement of all this governance stuff. Think about how you're going to apply this broadly across your company. You need a uniform, as we call it, agentic data plane that's being the governance mediator for all this sort of stuff. And you need to focus on what the differences are that agents have relative to human employees, and how that relates to that governance. I do think that the four pillars I laid out in the article, around identity, auth, observability, and accountability, set a good baseline for it. So if you look at it from that perspective and go into it thinking, there are materially different problems that we need to solve here than we have for humans, and let's tackle those, that's where we're going to get to. And my guess is that in the next one to two years max, we're going to establish what this looks like, and there's going to be a pattern of, here's how you approach it. Of course, there'll be many different ways to actually go about doing that: building it yourself, buying it from a company like us, or anything in between. But I suspect we're going to get to a point of saying, this is roughly the right way to do it. Every enterprise today is different in the way that it manages its humans, but they're not that different. It roughly looks the same. So that's where I think we're headed. And I do think it's 12 to 24 months max.

[49:22] Conor Bronsdon:
Fantastic. Well, Tyler, it's been great catching up with you. Super excited to read your paper when it comes out. And thank you so much for sharing all your thoughts with our audience. It's been a great conversation. Where can folks go to follow your work and see what you're up to in the coming months?

[49:40] Tyler Akidau:
Yeah, so I'm on LinkedIn. You can find me there. I don't post a ton, but I do post there. And then everything we do also goes through the typical Redpanda marketing channels. So hunt us down there and you'll see what's going on. And I'll be at a few different AI conferences throughout the year. Certainly CAIS, I'm hoping to make it to AIES, and probably AAMAS later in the year as well. For any of you of a more academic bent, I hope to see you there and chat more.

[50:15] Conor Bronsdon:
Amazing. Tyler, thank you again for joining me today. And listeners, be sure to subscribe to our Chain of Thought newsletter at newsletter.chainofthought.show to get tons more information from Tyler and many others in the coming months as we continue to have these conversations. Thank you, everyone, for joining us. And Tyler, hope you have a great rest of your day.

[50:37] Tyler Akidau:
Thank you, Conor. Really, really had a good time here. Appreciate it.