AI Security Ops

In this episode of BHIS Presents: AI Security Ops, Bronwen Aker and Dr. Brian Fehrman break down some of the top AI security concerns being discussed by researchers, security firms, and government agencies this year.

As AI capabilities rapidly expand, so does the attack surface. From agentic AI systems being used by attackers, to deepfakes at industrial scale, to the persistent challenge of prompt injection, security teams are trying to understand what risks are real, what’s hype, and where defenders should focus first.

We dig into:
- Why agentic AI is emerging as a major security concern
- How attackers could weaponize autonomous agents to scale operations
- The risk of malicious agent skills and AI supply chain attacks
- Why overly broad permissions make agent-based systems dangerous
- AI-assisted phishing campaigns and social engineering at scale
- The rise of deepfakes and corporate fraud driven by generative AI
- Why humans still struggle to reliably detect deepfake media
- The economics of deepfake fraud and real-world incidents
- Prompt injection attacks and why they remain difficult to solve
- Whether future models may autonomously discover and exploit jailbreaks

This episode looks at the practical security implications of today’s AI ecosystem — where the biggest risks are coming from, how attackers may leverage AI systems, and what defenders should be thinking about as these technologies continue to evolve.

📚 Key References

Agentic AI Threats
- CrowdStrike 2026 Global Threat Report — https://www.crowdstrike.com
- IBM X-Force 2026 Threat Intelligence Index — https://www.ibm.com/security/x-force
- Cisco State of AI Security 2026 — https://www.cisco.com/site/us/en/products/security/state-of-ai-security.html#tabs-9da71fbd27-item-1288c79d71-tab

Deepfakes & AI-Driven Fraud
- WEF Global Cybersecurity Outlook 2026 — https://www.weforum.org/publications/global-cybersecurity-outlook-2026/
- International AI Safety Report 2026 — https://www.internationalaisafetyreport.org

AI Security & Infrastructure Risk
- CISA Joint Guidance on AI in OT — https://www.cisa.gov/news-events/news/new-joint-guide-advances-secure-integration-artificial-intelligence-operational-technology

Prompt Injection & LLM Exploitation
- Schneier et al., “The Promptware Kill Chain” — https://www.lawfaremedia.org/article/the-promptware-kill-chain
- Palo Alto Unit 42 — “Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild”
https://unit42.paloaltonetworks.com/indirect-prompt-injection-ai-agents/

(00:00) - Intro & Episode Overview

(02:18) - Agentic AI as a Security Threat (CrowdStrike 2026 Global Threat Report, IBM X-Force Index)

(03:46) - Malicious Agent Skills & AI Supply Chain Attacks (Cisco State of AI Security)

(04:58) - How Agent Skills Actually Work

(07:47) - Permissions & Guardrails for AI Agents (CISA AI in OT Guidance)

(09:57) - AI-Generated Phishing Campaigns (CrowdStrike / IBM Threat Reports)

(13:58) - Deepfakes at Industrial Scale (WEF Global Cybersecurity Outlook)

(15:38) - Corporate Fraud & Deepfake Incidents (International AI Safety Report)

(17:21) - Why Humans Struggle to Detect Deepfakes

(21:13) - Prompt Injection Attacks Explained (Schneier – Promptware Kill Chain)

(24:35) - AI Models Jailbreaking Other Models (Palo Alto Unit 42 Research)

(28:59) - Final Thoughts & Wrap-Up

Click here to watch this episode on YouTube.

Brought to you by:

Black Hills Information Security

https://www.blackhillsinfosec.com

Antisyphon Training

https://www.antisyphontraining.com/

Active Countermeasures

https://www.activecountermeasures.com

Wild West Hackin Fest

https://wildwesthackinfest.com

🔗 Register for FREE Infosec Webcasts, Anti-casts & Summits
https://poweredbybhis.com

Creators and Guests

Host

Brian Fehrman

Brian Fehrman is a long-time BHIS Security Researcher and Consultant with extensive academic credentials and industry certifications who specializes in AI, hardware hacking, and red teaming, and outside of work is an avid Brazilian Jiu-Jitsu practitioner, big-game hunter, and home-improvement enthusiast.

Host

Bronwen Aker

Bronwen Aker is a BHIS Technical Editor who joined full-time in 2022 after years of contract work, bringing decades of web development and technical training experience to her roles in editing pentest reports, enhancing QA/QC processes, and improving public websites, and who enjoys sci-fi/fantasy, Animal Crossing, and dogs outside of work.

What is AI Security Ops?

Join in on weekly podcasts that aim to illuminate how AI transforms cybersecurity—exploring emerging threats, tools, and trends—while equipping viewers with knowledge they can use practically (e.g., for secure coding or business risk mitigation).

Bronwen Aker: 00:00

Welcome to AI Security Ops, the podcast where we cut through the hype and explore the real world intersection of artificial intelligence and cybersecurity. Each week, we examine how AI is reshaping both sides of the security landscape, not only the threats we're facing, but also the defenses that we're building. I'm Bronwen Aker, and today it's just me and Doctor. Brian Fehrman here to talk to you about what researchers, security firms, and government agents are calling the most intense security concerns of this year. And this is such a broad topic.

Bronwen Aker: 00:46

I thought I was asking a simple question. What are people worried about? And, boy, did I get a whole lot of information back that I wasn't expecting? This show is brought to you by Black Hills Information Security and Antisyphon Training. BHIS helps organizations identify and close real world security gaps through penetration testing, adversary emulation, purple team engagements, managed detection and response.

Bronwen Aker: 01:16

We really are a full service shop. Antisyphon Training also helps your people get better at the skills they need by delivering hands on practitioner led training built around real world world attacks, real world tools so that you and your team can apply the new skills that you need and roll on immediately. Learn more at blackhillsinfosec.com or antisiphontraining.com. Alright. Brian, we've got God, there's so much here.

Bronwen Aker: 01:57

I don't even know where

Brian Fehrman: 01:57

We're lose a lot. We had we had to cut up we cut a bunch out because they're just simply too much to cover in a reasonable amount of time. So that is very very broad topic. I think we've got just kind of four main subjects that we'll touch on in terms of the things that seem to be people seem to find to be a concern that might be keeping them up at night. So kicking it off with the first one is agentic AI as a threat vector.

Brian Fehrman: 02:27

So this is kind of a little bit of an interesting one because this can be taken a couple different ways and you can see that in reading through some of the data was returned here on the research, which some people view when they're talking about agentic AI as a threat vector. Some people view it through the lens of adversaries, threat actors using agentic AI to augment their activities to kind of as a task acceleration for performing attacks against companies. And certainly that that is a very real risk. Right? Because now they basically have extra hands to to help them out without necessarily needing to enlist an an army of people to go about carrying out their objectives.

Brian Fehrman: 03:14

So that's one facet. The other lens that you can look through this app, when we talk about agentic AI as a threat vector is the the implementation of the agentic AI being an additional attack surface that threat actors can potentially leverage through various means. Right? And so I wanna talk just a little bit about that one first. So, particularly, there's what's called Claw Havoc Campaign in which over, 1,100 malicious agent skills, were planted throughout an ecosystem and it's being called one of the largest AI supply chain attacks to date.

Brian Fehrman: 04:00

So this is an this is an interesting one. Bronwen, do you know what We're

Bronwen Aker: 04:04

we're talking almost 1,200 malicious agent skills. Mhmm. Now, maybe not everybody knows what is meant when somebody says an agentic skill. Can we break that down just very briefly for people?

Brian Fehrman: 04:23

Yeah. So a skill is, it's kind of a like a game plan, would say almost, like a behavioral outline that you would give to an agent.

Bronwen Aker: 04:33

A recipe.

Brian Fehrman: 04:35

What's that?

Bronwen Aker: 04:36

A recipe maybe?

Brian Fehrman: 04:38

Yeah. Yeah. I think a recipe, I think that's a great a great analogy for it. Because even when you look at it, it almost reads like kind of a recipe of like, hey, this is who you are and these are the tasks that I would like for you to carry out. Here are the parameters of that task and how it should be getting from point a to point b.

Brian Fehrman: 04:57

When I say it, I mean the agent. There are different ways that these are implemented. So with Claude skills, for instance, they are in like a TypeScript format, which is what they chose to use, which which is kind of strange. I'm sure there was a reason behind it, but it was a little bit weird having a type like TypeScript as the the the designated format, but meh.

Bronwen Aker: 05:21

So again, for people who are not familiar with how agentic AIs work, when you teach them a new skill, it can be something as simple as this is how you create a word doc, this is how you create this type of of output file or it can be a more complex progression of do this, then do this, then do this. So the idea of claw havoc having nearly 1,200 individual agentic skills, that's pretty intense. So, yeah, I'm I'm not sure where you wanted me to go with that, but I just I know that there are a lot of people who aren't quite as savvy about AI as we are in this room, so I wanted to make sure that got broken down for the full impact. Wow.

Brian Fehrman: 06:14

Yeah. Yeah. And so then, you know, the implication there is that if someone can implant a malicious skill or modify a skill that's already existing, then you're you're changing the behavior of the agent, right? You essentially have a level of control over what the agent is or isn't going to do. It's almost I mean, it's kind of similar to like an RCE vulnerability, but through a agentic coding depending on what had what it has access to.

Bronwen Aker: 06:45

And this is totally separate from model poisoning, which is a completely different issue. So so the ability to create or alter existing skills in an existing legitimate installation of an agentic AI and being able to add this just adds more things to look for for defenders, especially if you're in an enterprise or or even a small business and you're using agentic AI for something, you have to make sure that those skills remain safe and secured so that nobody who gains access into your infrastructure can alter those to do badness.

Brian Fehrman: 07:32

Yeah. Absolutely. Making sure that, if and if you're if you if you're not using internal ones only, then also vetting and verifying any external sources where you're pulling from pulling skills from. And then, you know, additionally, it also comes down to limiting what the agent itself can do. Right?

Brian Fehrman: 07:54

So principle of least privilege, just like with any user, you don't want to give the agent free unfettered access to do anything and everything in the environment. You want to really lock that down just to the specific use cases for that agent. And so that even if a skill were to become compromised or implanted, it kind of reduces the the fallout of of that attack.

Bronwen Aker: 08:18

It it still feels like we're just saying business as usual. You wanna use least least privilege. You wanna keep these things secured. You've gotta maintain that that separation. Oh, but still the idea of those malicious skills.

Bronwen Aker: 08:38

Wow. And that is that is one thing I'm seeing over and over again is that everywhere when we're talking especially about agentic AI and now you've got things like a plug in for Chrome to be able to leverage Cloud and other agentic browsers, you don't even need to know how to use MCP to leverage agentic stuff anymore. And so the bar for the amount of knowledge necessary to engage in an attack keeps getting lower and lower and lower.

Brian Fehrman: 09:15

Yep. Yeah. Agreed. And that attack surface in general keeps increasing as well too. Right?

Brian Fehrman: 09:21

With the more and more people, adopting the agents and daily use. So I think we're certainly gonna see a lot more of these issues issues to come.

Bronwen Aker: 09:30

Oh, fun. Yep.

Brian Fehrman: 09:34

Let's head on the the next one here, accelerated attack timelines. So I feel like this goes is a little bit related to the previous one, but we're not talking about agentic use specifically on this one. Like, it's more of a broader scope in which you can use AI to assist in exploit creation, proof of concepts for vulnerabilities, as well as generating phishing emails. So it's estimated that around 82% of phishing emails are now generated, getting a click through rate of 54% versus 12% for manual messages that are generated.

Bronwen Aker: 10:15

That's insane. I mean Oh. Well, phishing has been the bane of our existence for years. Even before I got cybersecurity was horrible. But now they're getting 54% click through rate.

Brian Fehrman: 10:29

That's insane.

Bronwen Aker: 10:30

Well, one of the things I think we've talked about before is that the rise of AI is an accelerator. Every all of the all of the really big AI content generators and and thought leaders always go on about how it's a force multiplier. And that's what I see when I see these numbers is that the ability when when talking about AI, it's always about quantity and speed, not quality. And with a lot of these attacks, it's about numbers. Because if I can launch a million individual tiny attacks and I get a 0.01% return, that can be enough to make that engagement profitable.

Bronwen Aker: 11:22

So now with AI being able to launch that million tiny attacks, whether it's a brute force or or something else that is totally mundane, Being able to do that using AI has now expanded the capabilities of attackers beyond anything that I think anyone could have dreamed of a few years ago.

Brian Fehrman: 11:51

What Oh, yeah. Absolutely.

Bronwen Aker: 11:52

How do we how do we protect ourselves against this? I mean Right. What do we do?

Brian Fehrman: 12:01

That's an interesting one. So if we're talking about, you know, on the so on the email side, you know, still traditional awareness training, even though it might be losing its effectiveness, I mean, it's still important to educate users that especially in, I mean, today's age with the sophisticated AI generated emails that always verify. I mean, just because it looks very legitimate, very tailored to you, it seems like the personality of someone who you're expecting, it's still, you know, pick up the phone, find another way to to maybe validate of of what's going on. We're we're talking about, like, the exploitation side. So one of the interesting points that I see here is time to exploit for vulnerabilities dropped from sixty three days in 2018 to 2019 time frame to just five days and saying that AI can generate exploits for new CDs within fifteen minutes of disclosure.

Brian Fehrman: 12:58

So it's That's insane. Yeah. So very important to be monitoring your attack surface looking for monitoring CV disclosures that are coming out and looking at how much you can try to automate this process yourself, you know. So use AI on the defensive side to help discover these emerging threats and figure out how you can quickly patch up and shore up your your surface to try to prevent any damage.

Bronwen Aker: 13:27

Yeah. If you thought it was important to watch those CVE disclosure announcements before, it's even more important now. Oh my. Yeah. Five

Brian Fehrman: 13:37

Fifteen minutes. Five

Bronwen Aker: 13:39

days, fifteen minutes.

Brian Fehrman: 13:42

Yep. Wow.

Bronwen Aker: 13:44

Yeah. Well, it's it's interesting that that you talked about some of the ways to use side channels to, verify things with email that also touches on the next topic, which is deepfakes.

Brian Fehrman: 14:01

Perfect segue. Go for it. So

Bronwen Aker: 14:05

one of the one of the big concerns that we've been hearing about for a while and that is not going away is deepfakes at industrial scale. And the human detection accuracy has apparently dropped to less than 25%. So that means that when presented with deep fake video, fewer than 25% of people can tell whether or not it's a deep fake. And we've had our own experiences. Well, you were on the the panel, weren't you?

Bronwen Aker: 14:42

Where we were interviewing someone that trying to hire them and and out of four candidates, only one was very clearly a real human being?

Brian Fehrman: 14:52

Mhmm. Yeah. It The other the other other three were questionable. One we think was fully generated. The other two were at the very least using in line translation real time as well as some other weird stuff going on.

Brian Fehrman: 15:08

Wow.

Bronwen Aker: 15:09

And in addition to the full on where you've got audio, video, and everything going on, voice cloning fraud is up six hundred and eighty percent in just one year. One in four people has apparently experienced an AI voice cloning scam. This is why I don't answer my phone when when strange numbers come in. And if you leave a voice mail swipe, it's gone. Corporate deepfake fraud is currently costing $1,100,000,000 in The US alone as of 2025, and it's projected to be $40,000,000,000 globally by next year, by 2027.

Bronwen Aker: 15:57

Grock had a deepfake scandal. 6,700 images per hour were being generated for for fraud there. And I believe that with with Grok, there was also some nonconsensual shenanigans going on. And then we also have UNICEF and Interpol reporting that at least 1,200,000 children disclosed having inter their images manipulated into explicit deepfakes in the past year. This is I don't know where to start with this.

Bronwen Aker: 16:40

I mean, this is just, yeah, truth is stranger than fiction. This is this is one of those things where it it I all I wanna crawl back under my rock. Can I crawl back under my rock? Because So How do we I only get to see you, what, a couple of times a year. How would I know that if you call me and ask me to do something that it's legitimately you?

Brian Fehrman: 17:10

Oh, man. So, you know, I think based upon voice alone, it would be difficult. As I mentioned before, Derek had created a Derek Banks, who's usually on the podcast, had created a deepfake of me, an audio, not not video, but just audio. And I played it for both my wife and my mother-in-law. And granted, this was almost like a year ago now, so the technology has gotten even better.

Brian Fehrman: 17:32

And at that time, they heard it and they couldn't believe that it was not me. Mean, sounded just like me. I mean, to me, like, I could hear like very tiny little like audio glitches, but it was it was more of like a technical like standpoint, not so much that it wasn't accurate to my voice. And so it'd be very difficult. I mean, you got like, really best you can do is, so, you know, you get a call from me, let's say, you hang up and you call back, look up my number in the employee directory and call back there and just hope that someone hasn't sim swap me or sim clone me or something.

Brian Fehrman: 18:09

But, know, it's it's becoming very difficult. Right?

Bronwen Aker: 18:13

What about other Whatever. Bands options? I mean, yeah, the the sim swapping is is pretty common these days. What else could I do?

Brian Fehrman: 18:26

I mean, if we have a video call, like video chat like this, you can try to get on say teams and do a video call. So I've actually done that. We I mean, we do that internally for certain requests. So if I like if I send an email to our systems team with a request that's kind of, we'll say high value or high risk, high impact. There we go.

Brian Fehrman: 18:49

There we go. Could be high impact. I gotta get on a video call so they can see me. I mean, sure if someone has compromised my account, well, I mean, that's gonna be difficult at that point. Right?

Brian Fehrman: 19:00

Because they could deep deep fake the video, but at least it's it's a decent a decent check. Because yeah. You know, on the account, see them on video and maybe try to look for a little artifacts and stuff or I don't know. Like, hey, snap your fingers.

Bronwen Aker: 19:19

Do do the heart thing, you know. All of that. I also understand that organizational passphrases are becoming a viable out of band reality check to verify that it really is you. How how exactly does that work?

Brian Fehrman: 19:39

So that would be that that'd be a tough one from a implementation standpoint, but I I think that you'd really probably want something that gets rotated, maybe in some kind of like a secrets manager where someone already has to do like a two factor authentication to get to the secrets manager. So they've got their password and they've got the authentication measure on their phone, they log in and they see the specific passphrase for them for the day And someone else can look up that same, you know, look up in that same registry, secrets manager, and verify that, hey. Oh, yep. You've got the correct passphrase for today. You know?

Bronwen Aker: 20:14

It's it's it's almost like we're already using one time passwords and and and whatnot for logging into or authenticating to digital devices. Now we're gonna have to do it for our human interactions too.

Brian Fehrman: 20:29

Yeah. Well, yeah. I mean, I wonder how long before we're just required to have, you know, everyone has like a thumbprint reader or some other kind of biometric device reader or you have to take like a quick little DNA swab that we put on a little thing next to our computer that validates that that's us. Great. It's probably not too far off.

Bronwen Aker: 20:51

I've I've seen in science fiction movies where they they breathe on a sensor to verify that it's really them or or a thumb swipe or or they're doing thumb swipe and retinal scan. Truth is stranger than fiction. Alright.

Brian Fehrman: 21:08

Oh, yeah.

Bronwen Aker: 21:09

Let's let's wrap this one up with a nice nice good run. I'm gonna let you take this next one.

Brian Fehrman: 21:15

Alright. So last one is prompt injection, which, you know, is broad in terms of how it can be used. But for those who don't know, prompt injection is basically trying to override instructions that were given to a model. Typically trying to override the instructions that are part of the system prompt that were given to the model which defines its behavior, what it can and can't do, what its role is, personality. But basically trying to trying to get the model to listen to what you have to say, not what the developers or the deployers would like for it to do.

Brian Fehrman: 21:55

And the thing is is that this is one that isn't really, it's not fixable completely with the architecture of LLMs today. There's no way to fully separate out trusted from untrusted input into the model and that's that's just what it is. There are different measures that people try to take to delineate that data, but there's still nothing like parameterized SQL queries for instance, where it is 100%, you know, that this data is good, this data is not to be trusted.

Bronwen Aker: 22:30

How many people do you think are are surprised or will disbelieve this notion that prompting injection is always going to be a vulnerability for large language models?

Brian Fehrman: 22:42

You know, I I imagine it it's probably hard to to, I guess, to accept that there's it's something that we just can't fix. But I mean, it's, you know, it's it's similar to social engineering aspect. Right? So we can put in all the different guardrails and protections that we want. We can put in different measures to mitigate the damage for if an attack is successful.

Brian Fehrman: 23:08

But at the end of the day, like how it's it you're just not gonna be able to you just can't fix it at least not with the way that it's architected today.

Bronwen Aker: 23:16

Yeah. Well, and I excuse me. You're more conversant in the the nitty gritty underneath processes than I am, but my understanding is that your large language models are basically prediction engines. They're they're advanced autocomplete on steroids. So I can't yeah.

Bronwen Aker: 23:44

I can't imagine how, given the fact that it's always going to be trying to anticipate what should come next, it doesn't know right from wrong. It doesn't know true from false. It doesn't know valid from invalid. It just knows that if, this was the next word three times out of out of 10 or or more than that, when these words were preceding it, it's always going to predict that next word. So it's just I'm explaining it badly, but you know what I'm trying to get to.

Bronwen Aker: 24:22

It's it just ah. Yeah.

Brian Fehrman: 24:28

Yeah. It's really well.

Bronwen Aker: 24:29

The thing that's throwing me is this nature's communication study where they're saying that large reasoning models can autonomously jailbreak other models at better than a ninety seven percent success rate.

Brian Fehrman: 24:44

Yeah. I think that's that's pretty interesting. So pitting the AI against AI, which has come up in other implementations too, such as like with generative adversarial networks, is basically how deepfakes work. But there's other things that can be generated up with a generative adversarial network or a GAN as you'll see it. But it's still kind of a similar idea is that you're you're pitting one AI against another AI and waiting until it one of them learns enough information about the other to it can come up with a a working solution essentially, is the is the gist of it.

Brian Fehrman: 25:18

And it's interesting because I would it'd be interesting to look at some of the output and how weird it gets. Because I know that there's an attack out. It's been out for a bit from or Bishop someone from Bishop Fox, Ben Lincoln, actually had created a he's got some stuff out on GitHub for it of where it's attacking open weight models and it iterates through the the prompting. And so you'll start with a prompt, it'll send it, get a response and then it'll append on what looks like just kind of garbled symbols and just keep on trying back and forth until it finds something that works. And all it's doing is, to to your point earlier, it's just taking advantage of the fact that these are just prediction engines.

Brian Fehrman: 26:03

It doesn't even have to make sense what you send to them because all they really see is symbols and well, all they really see is numbers actually and try to draw relations between numbers and it's just finding that correct combination of numbers to get the right path triggered to get the output that you're looking for.

Bronwen Aker: 26:19

So what do we do to protect ourselves? I mean

Brian Fehrman: 26:24

Yeah. This

Bronwen Aker: 26:25

is I mean, all of the all of the news, all of the research is coming back. This is not something that can be fully resolved given the nature of the technology? What do we do?

Brian Fehrman: 26:38

Yeah. So there are certain things we can do to mitigate harm, which is or try to reduce success anyways, which is doing things like putting guardrails into place. It can be basic guardrails like keyword filtering. It can be LLMs that are judging the input and the output for trying to detect if basically if it thinks prompt injection is occurring. There's also I know that there's some research that started they published those it's almost like a year ago now, think it's called task task tracking, which is basically looking at the way that the the neurons, the perceptrons in the LLM are activating.

Brian Fehrman: 27:26

And when it sees certain groups that activate at a higher level in certain situations, then it can detect, oh, hey, I think something wrong is going on here, maybe block or at least log it and respond to it. So there's some kind of some cool research into the field of how we can work around the problem at least. So like work with the problem, live with it, you know. That's in

Bronwen Aker: 27:52

addition to all of the traditional input validation and separation of of assets and and systems and servers.

Brian Fehrman: 28:04

Yep. Yep. Exactly.

Bronwen Aker: 28:06

Oh, I just there's so much stuff going on, but one of the things that that we should definitely do in one of the upcoming episodes is another deep dive into how defenders are using AI because there's some interesting stuff, exciting stuff happening in that space as well.

Brian Fehrman: 28:32

Yeah. Oh, yeah. I think that would be great. Certainly include that on a on a future episode so people can be better better equipped to defend themselves.

Bronwen Aker: 28:40

Alright. Anything else before we wrap up?

Brian Fehrman: 28:45

No. I think we think we hit some some good points, some good topics here in this episode.

Bronwen Aker: 28:49

Okay. Thank you very much everyone for attending this action packed adventure as we talk about the the latest ways that AI can be used for badness. Please come back soon and keep on prompting.

More episodes

Chapters

Creators and Guests

What is AI Security Ops?