Pop Goes the Stack

Ops used to be a world of YAML, caffeine, and careful deploy rituals. Now it’s probabilistic models, token-based cost surprises, and reliability questions that sound more like, “Will the model mean the same thing tomorrow?” In this episode of Pop Goes the Stack, Lori MacVittie and Joel Moses dig into what happens when production expectations collide with non-deterministic AI systems, and why the next phase of automation needs more than a chat interface and optimism.
 
They’re joined by John Capobianco from Itential to explore “VibeOps,” an approach to conversational operations that doesn’t throw away deterministic workflows, but connects them to agent reasoning, tool calling, and modern protocols like MCP. The discussion breaks down agent “skills” as a way to describe what an agent can do, constrain what it can’t, and build guardrails in a format teams can manage.
 
From red-teaming experiments to real-world concerns about failure rates at scale, the conversation stays grounded in what it takes to make AI useful in production: external knowledge, policy alignment, composable skills, and a maturity path from lab-only to read-only to supervised execution, and only then toward autonomy. The takeaway is clear: conversational ops can accelerate work, improve documentation and ticket quality, and reduce toil, but governance and accountability still matter. If you’re navigating AIOps, agent adoption, or the post-MCP tooling wave, this episode offers a realistic starting point.

Creators and Guests

Host
Joel Moses
Distinguished Engineer and VP, Strategic Engineer at F5, Joel has over 30 years of industry experience in cybersecurity and networking fields. He holds several US patents related to encryption techniques.
Host
Lori MacVittie
Distinguished Engineer and Chief Evangelist at F5, Lori has more than 25 years of industry experience spanning application development, IT architecture, and network and systems operations. She co-authored the CADD profile for ANSI NCITS 320-1998 and is a prolific author with books spanning security, cloud, and enterprise architecture.
Guest
John Capobianco
Head of AI and DevRel at Itential
Producer
Tabitha R.R. Powell
Technical Thought Leadership Evangelist producing content that makes complex ideas clear and engaging.

What is Pop Goes the Stack?

Explore the evolving world of application delivery and security. Each episode will dive into technologies shaping the future of operations, analyze emerging trends, and discuss the impacts of innovations on the tech stack.

00:00:05:03 - 00:00:32:24
Lori MacVittie
Welcome back to Pop Goes the Stack, the only show where emerging tech meets actual ops and neither walks away clean. I'm Lori MacVittie, your snarky Sherpa through the silicon nonsense today. So today it's a real ops focused podcast because there was a time when ops meant YAML. Who doesn't love YAML? Sorry, Joel, didn't mean to trigger anything there.

00:00:32:26 - 00:00:59:19
Lori MacVittie
You know, caffeine and a quiet prayer before you hit deploy. But now we've got probabilistic models deciding things, tokens turning into surprise line items, and outputs that feel right until they don't. Non-determinism, drift, cost spikes, semantic wobble, reliability used to mean five nines, now it means did the model mean what we think it meant and will it mean that tomorrow?

00:00:59:21 - 00:01:09:12
Lori MacVittie
All right. So we've talked about this in the past and we wanted to talk about something a little more interesting today. Right, Joel?

00:01:09:15 - 00:01:31:05
Joel Moses
Oh sure. Sure. Yeah.

Lori MacVittie
Yeah.

Joel Moses
No you're right about YAML. YAML doesn't actually stand for yet another markup language. It actually stands for you are mostly lost.

Lori MacVittie
Oh, I l-

Joel Moses
The

Lori MacVittie
Yeah, checks out.

Joel Moses
the idea that people can navigate a 1700 line pipeline file and get it right every single time and make sure that their automation routine is working, frankly, well enough to trust,

00:01:31:05 - 00:01:44:24
Joel Moses
that's difficult. And our guest today is definitely going to help us out with some of the difficulties related to piecing together ops that actually work for you and conversationally.

00:01:44:26 - 00:01:46:25
Lori MacVittie
Say the word, Joel. Say the word.

00:01:46:25 - 00:01:51:20
Joel Moses
Say the word?

Lori MacVittie
Yes.

Joel Moses
I believe it's called VibeOps.

Lori MacVittie
Thank you.

Joel Moses
Did I get that right, John?

00:01:51:22 - 00:01:53:13
John Capobianco
That's right, that's right. So I would

00:01:53:15 - 00:02:15:20
Joel Moses
Yeah, let's introduce you by the way. So John Capobianco is with us from Itential. Itential is a startup, I guess it would be, in the area of piecing together conversational ops. And they reached out to us

Lori MacVittie
Oh, I like that.

Joel Moses
and we found some of what they had to say very interesting. So welcome.

00:02:15:22 - 00:02:33:10
John Capobianco
Well, thank you so much. Yes, I joined Itential at the start of the year. I've been there a little over, you know, six weeks now. And, partly what drew me to Itential was this idea of merging that YAML world, right. So I don't think that we're going to throw all that away just because we've entered an AI age.

00:02:33:12 - 00:02:59:29
John Capobianco
But those deterministic workflows and marrying them with reasoning and tool calling capability, right. So over the course of the conversation, I'm sure Model Context Protocol will come up and some other things. But Itential's approach to agent building is attaching these tools and these workflows as things that can be deterministically called by something that is non-deterministic, right?

00:02:59:29 - 00:03:24:12
John Capobianco
That is going to reason and take action on its own. So, the idea is to build up an idea of like skills. Right? So the agent you build might have a certain skill set and that skill set includes the knowledge on how to run that 1400 line YAML file. Right, and you describe it in markdown and natural language.

00:03:24:14 - 00:03:47:03
John Capobianco
And now the agent can reason and sort of has been given the exact same MOP you would give a human being. Right. The training that would go into teaching a human how to run that, how to execute it, how to troubleshoot it, how to handle errors. If there's any actual decision making to be made during the execution of this playbook.

00:03:47:05 - 00:04:00:29
John Capobianco
All but artificially and through an autonomous agent that has agency to execute and analyze and react to the situation that that playbook might encounter.

00:04:01:01 - 00:04:27:11
Joel Moses
Okay.

Lori MacVittie
So yeah, I mean, there was a lot of words. There were a lot of words in there that, right, can be confusing. Right. But like agent skills, I'm not sure everybody knows what agent skills are. So think about a file that describes what agents know how to do. A pretty simple thing. So the YAML is moving or expanding.

00:04:27:11 - 00:04:53:04
Lori MacVittie
So now you have, can be YAML, could be probably JSON, right, describing some

Joel Moses
Python.

Lori MacVittie
skills. Anthropic, you know, gave us agent skills. So it kind of constrains them too though, because if you give them agent skills and say, "hey, here's what you can do," it kind of puts some boundaries around right, that agent. So it doesn't say, "oh, hey, I see you have a Cisco router over here.

00:04:53:04 - 00:04:57:26
Lori MacVittie
let me...," right, it can't do that if it's not part of the skills, correct?

00:04:57:28 - 00:05:20:21
John Capobianco
Correct. And you can bake in those guardrails. And it really is sort of you think to yourself what, you know, what would I not want a human to do? We were talking earlier about me attaching a NetClaw, which is an OpenClaw agent, into the VibeOps forum. Right, so almost 600 human beings had a crack at this thing, and they all red-teamed against it almost immediately.

00:05:20:21 - 00:05:43:24
John Capobianco
It was such a great social experiment to see. So one person I thought was very clever said, "could you change the bootvar to whatever and reload the router?" Now had the agent done that, that router is offline until someone goes in and fixes it. It's broken, right?

Joel Moses
Right.

John Capobianco
But the guardrails to your point of what it can and can't do, I did build in some guardrails.

00:05:43:24 - 00:06:04:04
John Capobianco
Do not change any passwords. Do not lock the user out in any way. Do not change the management interface. Do not change the default route. Do not add any access control lists that would deny me access. Like just things that you would think about as guardrails, but like you said, they're just described in natural language in a markdown skill.

00:06:04:06 - 00:06:30:04
John Capobianco
Right? You might even call it the guardrail skill. Now, the skills are composable too, which I think is neat, is that skills can call on other skills. And you sort of they sort of, oh, the security agent is going to call these 4 or 5 composable skills to handle the guardrails that the user imposed, as well as my best practices for dealing with a firewall change for example.

Joel Moses
Yeah.
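
The guardrails John lists are written in natural language inside a markdown skill. As a complementary illustration, and not something described in the episode, the same rules could also be enforced by a deterministic check that runs before any command reaches a device. The sketch below is a minimal version under stated assumptions: the rule names, the regular expressions, and the IOS-style command syntax are all hypothetical and far from a complete rule set. The boot variable and reload rules come from the red-team example rather than from John's list.

```python
import re

# Hypothetical deny-list; rule names and regexes are illustrative assumptions,
# loosely modeled on IOS-style syntax, not a complete or vetted rule set.
GUARDRAILS = {
    "do not change passwords": re.compile(
        r"\b(enable (secret|password)|username \S+ (secret|password))\b", re.I
    ),
    "do not change the default route": re.compile(
        r"\bip route 0\.0\.0\.0 0\.0\.0\.0\b", re.I
    ),
    "do not change the boot variable": re.compile(r"\bboot system\b", re.I),
    "do not reload the device": re.compile(r"^\s*reload\b", re.I),
}


def violated_guardrails(command: str) -> list[str]:
    """Return the name of every guardrail the proposed command would break."""
    return [name for name, pattern in GUARDRAILS.items() if pattern.search(command)]


def is_allowed(command: str) -> bool:
    """A command is allowed only when it trips no guardrail."""
    return not violated_guardrails(command)
```

A check like this would sit between the agent and the device: the agent proposes a command, and anything for which `is_allowed` returns False is rejected regardless of how the request was phrased.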

00:06:30:06 - 00:07:06:18
John Capobianco
It's I think we are in a really remarkable time now, now that agents have matured. Some of us had been doing agents for a while, starting with maybe LangChain and LangGraph and the Google Agent Development Kit, A2A protocol, MCP protocol. So it's not brand new, but the approach to having something like an OpenClaw system which walks you through a text UI to pick and choose the skills and pick and choose the communication channels and just input your API keys.

00:07:06:18 - 00:07:12:03
John Capobianco
It's become a very elegant and low friction way to get started with agents.
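
John's point that skills can call on other skills can be pictured as a small dependency graph. The sketch below is an assumption-laden illustration, not how OpenClaw or any particular product resolves skills: the registry format and every skill name in it are invented. It returns a skill together with everything it transitively calls, dependencies first, and refuses circular references.

```python
# Hypothetical skill registry: each skill lists the other skills it calls on.
# All skill names are invented for illustration.
SKILLS = {
    "firewall-change": ["guardrails", "change-best-practices"],
    "guardrails": ["no-lockout"],
    "change-best-practices": [],
    "no-lockout": [],
}


def resolve_skills(name: str, registry: dict[str, list[str]]) -> list[str]:
    """Return the skill plus everything it transitively calls, dependencies first."""
    ordered: list[str] = []
    visiting: set[str] = set()

    def visit(skill: str) -> None:
        if skill in ordered:
            return  # already resolved through another path
        if skill in visiting:
            raise ValueError(f"circular skill reference at {skill!r}")
        visiting.add(skill)
        for dependency in registry[skill]:
            visit(dependency)
        visiting.discard(skill)
        ordered.append(skill)

    visit(name)
    return ordered
```

Resolving dependencies first means the guardrail skills are loaded before the skill that performs the change.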

00:07:12:05 - 00:07:29:21
Joel Moses
So we're touching on this, but I think we've gone, we've walked through the how, let's reel back a little bit and talk about the why. Okay, so we had DevOps, we had GitOps, we had NoOps, now we have VibeOps. It seems to me that we're only a year away from HoroscopeOps.

00:07:29:24 - 00:07:31:26
Lori MacVittie
Wait, oooooo.

00:07:31:28 - 00:07:47:04
John Capobianco
So

Joel Moses
Let's just look up the horoscope and the system will,

Lori MacVittie
What?

Joel Moses
yeah. So what exactly about VibeOps and the conversational approach to automation is on net better than some of the approaches that have gone before?

00:07:47:06 - 00:08:09:11
John Capobianco
That's a really good point and I think I agree with you with the history. I would say traditional ops and then I think the Agile Manifesto, which leads to agile and DevOps and CI/CD and the great manifesto that came out from the software development world, but has its roots in the network or in the automotive space from the lean approach.

00:08:09:13 - 00:08:33:08
John Capobianco
Right, so we have to sort of understand the trajectory here, and I agree with you. And then sort of 15 years go by, I tease my network engineers, my colleagues that, you know, in 2000, we start getting the REST API boom and developers doing operations and operations doing development without silos. And 15 years go by and we do NetDevOps.

00:08:33:10 - 00:08:56:27
John Capobianco
Someone says, hey, what about Ansible? What about NETCONF? What about RESTCONF? What about software defined networks and programming controllers? And the controllers push the config to the network, right. So there was sort of a radical shift in the infrastructure space, but it did take some time to get there. But adoption rates have really been, I don't want to say dismal, but they're disappointing to me.

00:08:56:27 - 00:09:23:01
John Capobianco
I thought by now every network would be automated, everyone would have an understanding of Ansible and Python and even in the F5 world, your REST APIs have been there for 15 years; F5 has had the capability to program

Joel Moses
Right.

John Capobianco
and to interface with through pipelines and through Ansible and through code. But how many actually do it in practice, in production, let's say.

00:09:23:03 - 00:09:53:18
John Capobianco
Right, I think the numbers are probably lower than what we would hope. I think automate this new VibeOps. So I think there's also an AIOps phase and that's, I think, that's focused on machine learning and generative AI. Right. So I would classify that as that's where a lot of organizations are; is maybe using telemetry to maybe fine tune their own models or build up retrieval augmented generation systems and putting a natural language interface in front of their systems.

00:09:53:21 - 00:10:20:03
John Capobianco
I distinguish VibeOps as the post-MCP world, where you have this sort of USB-C key approach of just go shopping for MCPs, plug them into your agent or your copilot or your--that's the luxury of it being a protocol is that MCP is pretty universal, you can put it in any from Claude Desktop to Gemini CLI to Copilot in VS Code

00:10:20:05 - 00:10:22:09
John Capobianco
where you're just using natural language.

00:10:22:11 - 00:10:23:09
Joel Moses
Okay.

00:10:23:12 - 00:10:53:03
John Capobianco
I have router one, router two, switch one, switch two, please develop a router-on-a-stick solution. Please use OSPF best practices. Right, you literally just write it up and send it to the agent and the agent just does the thing, whatever the thing is. But it's so much better than a human doing the thing, because now it can send you an email, it can send you a Slack, it can phone you if you have Twilio, it can test everything.

00:10:53:03 - 00:11:19:18
John Capobianco
It can visualize everything. It can open up your ServiceNow ticket, it can put it into GitHub, all in 2 or 3 minutes. Like there are live examples of these agents doing these things today. Virtual employees, digital coworkers. So but my joke about, you know, when we talk about VibeOps, I don't think it's, so vibe coding is just going to become coding.

00:11:19:20 - 00:11:49:19
John Capobianco
I don't know how much longer we're going to call it vibe coding. I think that that's just the way to code now

Joel Moses
Right.

John Capobianco
is doing what people have been doing, previously known as vibe coding. With Vibe Operating, I think it's embracing this idea of augmenting our network and our infrastructure and our server and our storage and our cloud and our security teams with the right modern tools, so they can use natural language so they can build agents to do the thing for them.

00:11:49:22 - 00:12:14:21
Joel Moses
So, John, let me dig into something here really quickly. Production, at least good production practice, is highly deterministic, and AI by its nature is probabilistic. And that's not just a mismatch, that's like a marriage counseling session, okay. So taking these AI systems and plugging them in and having them be responsible for deterministic performance in production, how do we ensure that it's accurate?

00:12:14:22 - 00:12:22:26
Joel Moses
How do we ensure that it stays, you know, within the guardrails and also doesn't do things that surprise us?

00:12:22:28 - 00:12:46:06
John Capobianco
So, it's something I'm very conscious of. You know, we had a customer mention that, you know, if they do 1 million activities a day on the network agentically, even a 2% failure rate is a lot of failures, right? That's not acceptable. So it is the challenge, right. So let's not underestimate what we're trying to tackle here.

00:12:46:06 - 00:13:22:00
John Capobianco
That is the challenge with AI. But I think how we build the agent is really the solution here and not just relying on my prompt and the model, however good the model is alone. I think it is a matter of plugging in access to external information. That could be RAG systems, could be PDFs. It could be your corporate knowledge base, your corporate guidelines, your internal best practices, your change regulations, everything that you can provide just like a human you would normally train.

00:13:22:02 - 00:13:50:10
John Capobianco
And I also would take the 6 to 8 weeks to onboard the agent that you normally would take with a human. Make sure it understands all of your corporate policies, try to simulate the training, try to augment it with a personality before you start augmenting it to do technical things. Right, so give it its persona as a good corporate citizen with as much external information as you can, and then look at the technical tools that are available.

00:13:50:17 - 00:14:23:26
John Capobianco
There are over 17,000 MCPs out in the wild. Of those, there's probably 5000 that are very valuable and very good. Almost every new either existing platforms are trying to retrofit and introduce MCP or net-new services are trying to be MCP-first. Like it's not long before like F5 has an amazing MCP system that you can just chat with your load balancers through natural language. Like that

00:14:23:26 - 00:14:48:13
John Capobianco
is a dramatic advantage to people who want to build agents and move away from deterministic workflows. So it's about the tools. It's about the knowledge. It's about the RAG. It's about the skills you provide it. But there's also the path. Right? I think that you want to have human in the loop first. And maybe start with read-only activities.

00:14:48:16 - 00:15:13:08
John Capobianco
Now these are still very valuable things. Like I'm not trying to say start small. Have it fully test your deployment, fully test your network, fully document your network, do compliance checks, do audits. Right? Get it involved in a read-only capacity with humans in the loop where you permit it to do things. Move to human on the loop; sort of more maintaining and operating and watching

00:15:13:08 - 00:15:36:02
John Capobianco
and it sort of has its own autonomy. To completely human in the lead, where you're just leading this thing and it has autonomous agency to do things on your infrastructure. I think that's sort of the arching journey that we want to go through. And risk will go up as you trust it more, as you train it better, as you put it through its paces.

00:15:36:04 - 00:15:52:07
John Capobianco
But I would start with very safe lab only, read-only. Be very prudent. Treat it like any other technology that you normally would go through. Put it through all its paces. Invite your security teams to be involved. Get your infrastructure folks involved, you know.
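
The path John describes, from lab-only and read-only work through human approval and toward autonomy, can be written down as an explicit policy so that the current trust level is a setting, not an informal understanding. The sketch below is one possible reading under stated assumptions: the four stage names and the `decide` function are hypothetical, and the boundaries between stages are a simplification of what was said.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical maturity stages, ordered from least to most trusted."""
    LAB_ONLY = 0     # lab devices only, read-only
    READ_ONLY = 1    # production allowed, reads only
    SUPERVISED = 2   # writes allowed once a human approves each one
    AUTONOMOUS = 3   # writes allowed without per-action approval


def decide(stage: Stage, is_write: bool, in_production: bool) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a proposed action."""
    if stage == Stage.LAB_ONLY:
        return "deny" if (is_write or in_production) else "allow"
    if stage == Stage.READ_ONLY:
        return "deny" if is_write else "allow"
    if stage == Stage.SUPERVISED:
        return "needs_approval" if is_write else "allow"
    return "allow"  # AUTONOMOUS
```

Promoting an agent then becomes a deliberate, reviewable change to one value, which fits the advice to involve security and infrastructure teams along the way.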

00:15:52:09 - 00:15:55:07
Lori MacVittie
There are

Joel Moses
Don't give Clippy root access day one, in other words.

Lori MacVittie
What?

00:15:55:09 - 00:16:20:21
John Capobianco
Right, right.

Lori MacVittie
There's, I mean, there's, so one of the things that we've been tracking, right, is people's comfort level with AI executing anything automatically in the context of production. Things like, will you let it auto scale your application? Like, okay, so people are getting very comfortable with that. Well they should, because it's a very well understood pattern that's been established.

00:16:20:21 - 00:16:45:18
Lori MacVittie
We've been doing it with, you know, regular old automation for quite some time. So they're like, yeah, AI could do that for me. Can you auto inject rules to mitigate zero day vulnerabilities? Yeah, we'd kind of like it to do that. But there are other things where it's like, yeah, we're not as comfortable now that we've actually seen this stuff running.

00:16:45:20 - 00:17:13:03
Lori MacVittie
So like last year, it was like a huge percentage, like 80% of people across the board were like, yeah, we'll let AI do anything it wants in our data center. This year it's like not so much. It dropped dramatically, partly because they've started actually employing it. And I think what we'll see with agents and VibeOps or, you know, whatever we want to call it, is the same progression we've seen with automation.

00:17:13:06 - 00:17:35:06
Lori MacVittie
We wrote scripts, but a human initiated it, period. And then eventually that moved to, oh, we've got data churning and systems that tell us to automatically execute a given script. And then they move to that. And so VibeOps is kind of that next progression. Like it marries the two; Oh I have the data, I saw the signal,

00:17:35:06 - 00:17:59:23
Lori MacVittie
I'm going to go do this thing because I know that's, you know, if this then that kind of advanced automation. But I think that's the progression people are most comfortable with, primarily because the agent is not accountable. There has to be an intern somewhere for you to scapegoat when something goes wrong. Right? You know, agents touching BGP routes, probably a bad idea, right?

00:17:59:25 - 00:18:20:23
Lori MacVittie
Because if it goes wrong, what are you going to do? Well, our AI went crazy. Well, you know it's that

Joel Moses
Yeah.

Lori MacVittie
Right? So it's a, I think it's a cautious adoption. People are excited by it. They want it and they like what you're saying, but they're also a little bit, you know, wary of, well, but who's going to get the blame?

00:18:20:23 - 00:18:26:28
Lori MacVittie
I wrote the script, is it my fault? Is it the person who...who's accountable basically,

John Capobianco
Right.

Lori MacVittie
I think is, yeah.

00:18:27:05 - 00:18:48:06
John Capobianco
But I think what you're describing is more of a human resources issue than a technology issue. Right?

Lori MacVittie
Well, yeah.

John Capobianco
Like, I think Jensen Huang said a year ago that the future of the IT department is an HR department for agents. Now he was kind of laughed at and they sort of treated it as a bit of a joke. However, flash forward 18 months after he said that,

00:18:48:06 - 00:19:12:20
John Capobianco
and now that there's a million and a half Claude bots out there in the world, don't they need orchestration? Don't they need a platform to run on? Don't they need governance? Don't they need RBAC? Don't they need AAA? Don't they need a supervisor, let's say? Right, so I think we're going to see little swarms of agents that are very specific, much like human resources, the security agent, the network agent, the cloud agent, the HR agent, the whatever,

00:19:12:20 - 00:19:36:18
John Capobianco
right, the mix of agents, digital workers, and it becomes a human resources exercise. I know, say, a company of my size; I'm going to say 250 people just as a guess. If each of us had five agents each under our control that we built to help do our job, now we have over, you know, a thousand people that work at the company.

00:19:36:20 - 00:19:59:07
John Capobianco
It's just that, you know, only 1 in 5 are biological beings, right?

Joel Moses
Yeah.

John Capobianco
That's, I think there's a huge opportunity to augment. And that doesn't mean we're going to offset or stop hiring. I think it just means that as new talent comes on and as people mature through this cycle, that they're going to have agents helping them produce more, right? That the

Joel Moses
Yeah.

00:19:59:12 - 00:20:22:24
John Capobianco
I think the expectation level goes up from leadership that our staff is going to get that much more done because we've invested in the tokens and the private access and we've endorsed artificial intelligence usage. And I know I'm disconnected from reality in many ways. I know that this is, I you know, I don't

Joel Moses
Well.

John Capobianco
That's not lost on me.

00:20:22:26 - 00:20:43:01
John Capobianco
But, you know, only 1% of people are actually doing this sort of thing, right, is this sort of the statistic. But, it's coming very fast. And, you know, MCP is literally only a year old. RAG is only two years old. I don't think people are as far behind as they think they are.

Lori MacVittie
Yeah, no.

00:20:43:04 - 00:20:53:20
John Capobianco
Right, like, I want this to be a positive, encouraging message is to go out there and play with this stuff. Get your hands dirty with the technologies and the terminologies that you've heard during this discussion.

00:20:53:22 - 00:21:18:19
Joel Moses
Definitely.

Lori MacVittie
Yeah.

Joel Moses
You know, one of the things I'm going to take away from this discussion is in terms of achieving guardrails. Your advice to kind of train this stuff in non-production, treat it as if you would a brand new intern or a newly onboarded employee and kind of work through the processes that you use, and formulate your guardrails around the processes that you use.

00:21:18:21 - 00:21:40:08
Joel Moses
So, you know, one of the ways to provide governance to these systems is to ensure that you have adequate checkpoints built into your workflow. So, for example, if it's going to do actions of class A, that it takes out a ServiceNow ticket to alert its human handlers that it's about to perform these actions so that you, number one, have a record of what was performed.

00:21:40:08 - 00:22:06:10
Joel Moses
And number two, have an ability to either add an approve or disallow function alongside that. And the agents can take care of handling those ServiceNow requests or putting in those Jenkins tickets. And so encouraging a construct like that still provides significant control and governance of the actions without necessarily slowing them down.
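
The checkpoint pattern Joel describes, where certain classes of action open a ticket first and run only once approved, can be sketched in a few lines. Everything here is an assumption made for illustration: `TicketLog` is an in-memory stand-in and not a ServiceNow client, the class "A" label is taken from Joel's example, and the `approve` callback represents whatever human or policy decision an organization actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class TicketLog:
    """In-memory stand-in for a ticketing system; not a real ServiceNow client."""
    tickets: list[dict] = field(default_factory=list)

    def open_ticket(self, summary: str) -> int:
        self.tickets.append({"summary": summary, "status": "pending"})
        return len(self.tickets) - 1

    def set_status(self, ticket_id: int, status: str) -> None:
        self.tickets[ticket_id]["status"] = status


# Hypothetical classification of which action classes need a checkpoint.
CHECKPOINTED_CLASSES = {"A"}


def run_with_checkpoint(action_class, summary, action, log, approve):
    """Record a ticket for checkpointed actions and run them only if approved."""
    if action_class not in CHECKPOINTED_CLASSES:
        return action()  # no checkpoint required for this class
    ticket_id = log.open_ticket(summary)  # the record of what was attempted
    if not approve(summary):
        log.set_status(ticket_id, "disallowed")
        return None
    result = action()
    log.set_status(ticket_id, "done")
    return result
```

The ticket is opened before the approval decision, so a disallowed action still leaves a record, which covers both of the goals Joel names.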

00:22:06:12 - 00:22:32:05
Lori MacVittie
I'd like that.

John Capobianco
And it actually is a perfect use for generative AI is ticket augmentation. When we mention ServiceNow, the quality of the data in the ticket, like it would take me, a human, longer to make the ticket of that quality and of that verbosity than it would to fix the problem the ticket was about. And now, because an AI is generating that text and it has context to make it, like do you know what I mean?

00:22:32:05 - 00:22:57:17
John Capobianco
Like it actually really is good at doing tickets and emails and Slack notifications.

Joel Moses
Sure. Yeah.

John Capobianco
Because that's really its prime function is to do generative, generate text and do predictive text. Right? So if you use the tool in the right way, there actually is a huge, huge upside to better tickets, to more legible tickets, to better context in your ServiceNow tickets. Right?

00:22:57:19 - 00:23:24:15
Lori MacVittie
Yeah. Oh, absolutely. I mean, it excels. You know, use it especially where it excels at generative things, right. Whether it's images, diagrams, text, all of that. So, you know, we're moving toward we're out of time, because we talked a lot. What do you want people to take away? If you had one thing you wanted them to walk away from this with, what would it be, John?

00:23:24:18 - 00:23:45:15
John Capobianco
You can start right now. And if programming has been your desire, if learning how to code has been your desire, if you're in infrastructure and want to expand your skills, you can start for free with Ollama, with LM Studio, with Microsoft Foundry. You literally pull a model from the cloud and now you have free, private, local AI in your pocket.

00:23:45:18 - 00:24:08:21
John Capobianco
Or you could spend the $20 a month on hyperscaler X. I, you know, I'm not particularly interested in which one you use, but today. Today. If you heard this and you're not doing these things, start today because the train is still relatively close enough to the station that I think you can get on the train. But it's moving fast, relatively speaking.

00:24:08:23 - 00:24:14:16
John Capobianco
So I want to thank you for having me here today. This has been a really wonderful and thoughtful discussion. Thank you.

00:24:14:19 - 00:24:34:21
Lori MacVittie
It was great having you. And I love your insights. I love your enthusiasm, right. Being excited about the technology, because you're right and I think the market sees it. They see the potential. They're very excited about it, but they are cautious. You know, ergo Joel's, right, governance, a little bit of oversight, some control. Right,

00:24:34:21 - 00:24:54:27
Lori MacVittie
until we have a good idea how we're going to put agents on a PIP, right, or reprimands from this HR that's coming, we kind of have to make sure that we're paying attention. Right? Don't, it's not time to let them run everything, but it is time to start teaching them how to do that and helping you do that job.

00:24:54:27 - 00:25:11:21
Lori MacVittie
So I think that's the takeaway for me. And I'd love to, you know, chat more but you know we're out of time. So that is a wrap for Pop Goes the Stack. Subscribe now because the next collision needs spectators and maybe a cleanup crew.