Human-Centered Security

When we collaborate with people, we build trust over time. In many ways, this relationship building is similar to how we work with tools that leverage AI. 

As usable security and privacy researcher Neele Roch found, “on the one hand, when you ask the [security] experts directly, they are very rational and they explain that AI is a tool. AI is based on algorithms and it's mathematical. And while that is true, when you ask them about how they're building trust or how they're granting autonomy and how that changes over time, they have this really strong anthropomorphization of AI. They describe the trust building relationship as if it were, for example, a new employee.” 

Neele is a doctoral student at the Professorship for Security, Privacy and Society at ETH Zurich. Neele (and co-authors Hannah Sievers, Lorin Schöni, and Verena Zimmermann) recently published a paper, “Navigating Autonomy: Unveiling Security Experts’ Perspectives on Augmented Intelligence in Cybersecurity,” presented at the 2024 Symposium on Usable Privacy and Security. 

In this episode, we talk to Neele about:
  • How security experts’ risk–benefit assessments drive the level of AI autonomy they’re comfortable with.
  • How experts initially view AI: the tension between AI-as-tool vs. AI-as-“teammate.”
  • The importance of recalibrating trust after AI errors—and how good system design can help users recover from errors without losing their trust in it.
  • Ensuring AI-driven cybersecurity tools provide just the right amount of transparency and control.
  • Why enabling security practitioners to identify, correct, and learn from AI errors is critical for sustained engagement.

Roch, Neele, Hannah Sievers, Lorin Schöni, and Verena Zimmermann. "Navigating Autonomy: Unveiling Security Experts' Perspectives on Augmented Intelligence in Cybersecurity." In Twentieth Symposium on Usable Privacy and Security (SOUPS 2024), pp. 41-60. 2024.

What is Human-Centered Security?

Cybersecurity is complex. Its user experience doesn’t have to be. Heidi Trost interviews information security experts about how we can make it easier for people—and their organizations—to stay secure.

From Tools to Teammates: (Dis)Trust in AI for Cybersecurity with Neele Roch

Heidi: Hello everyone and welcome to Human-Centered Security. I am your host, Heidi Trost, and I'm joined by my co-host, John Robertson. John, you want to say hello?
Today, we are joined by Neele Roch. And she is going to talk about a paper that she recently published. It was presented at the Symposium on Usable Privacy and Security in 2024, so just a few months ago.
It's called "Navigating Autonomy: Unveiling Security Experts' Perspectives on Augmented Intelligence in Cybersecurity." To give you just a very brief background on Neele, she is pursuing her PhD at the Professorship for Security, Privacy and Society at ETH Zurich. And one of the reasons we really wanted to talk to her is that [00:01:00] she has some really interesting insights around trust, around how AI fits into a trust model or trust framework, and around how cybersecurity professionals think about AI tools and integrating them into their workflows, which is obviously something that John and I have been focusing on the past few months.
So we're really excited to talk to her, and really pleased to spread the word about her paper and her great research. So welcome to the show, and thank you so much for sharing your insights with us.

Neele: Yeah, hi, thanks for having me. That was a really nice introduction.

Heidi: So for folks who haven't read it, just briefly describe what the research is.

Neele: So the research is very broadly concerned with the collaboration of experts and AI in the domain of cybersecurity. It's mainly [00:02:00] motivated by something a lot of people in cybersecurity have heard a lot about: the workforce gap. We have too few professionals, too few skilled workers, for the amount of work that needs to be done in cybersecurity. And with novel technologies such as AI, we have the human expert and the AI with potentially very complementary capabilities.
And the idea is to investigate and evaluate whether we can strengthen the cyber defenses by kind of augmenting the human expert intelligence with some of the capabilities that AI has. And in this initial paper, it was more of a foundational work where I looked at whether this is even possible or what the expert perspectives are.
What are the requirements? What are the needs of the experts that are potentially working with AI in this context?

Heidi: Yeah, I liked how it was very [00:03:00] foundational, very, like... What do you think about these tools? How do you see them fitting into your workflows? What are your workflows? So I thought, I thought that was great.
One of the things that you brought up in the paper, I think when you were doing your literature review, was that this kind of human-AI collaboration, the teaming up of human and AI, has been studied in medical and military contexts. And I was curious if you noticed anything in particular that stood out to you, or made you think, that's so interesting, maybe that could apply to cybersecurity.

Neele: I think the similarity with the military field, for example, is the urgency, and with the medical field it's sometimes also these [00:04:00] discretionary decisions where there's not a clear right or wrong, but more of a difficult decision that the expert needs to make based on what might be best for a specific case.
And we have a lot of different applications, in the military as well as in cybersecurity; different contexts might have different risk factors, different risk levels. So it's really interesting to look at the related literature and what they have found in those contexts, and see whether it holds true for the context of cybersecurity. Because this literature on expert or human-AI collaboration (it's not even necessarily always experts) is very task dependent. The research is really fragmented. So it needs to be better understood in the specific context that we're looking at.
And this is also why we're specifically looking at cybersecurity: to see whether the requirements and conditions are the same [00:05:00] as we found in other areas.
My specific research is looking at experts. That is a very special target group because they are very knowledgeable in one specific domain, they have high self-confidence in their abilities, and they're able to verify and assess whether what an AI, for example, gives them is what they would have expected.

Heidi: Can you break the research down for us and just give us a general sense of what your research questions were and how you went about answering them?

Neele: Yeah, absolutely. So the research question, briefly, is: how do experts want to use AI in cybersecurity?
So what are the requirements? Understanding the status quo and what we are looking at [00:06:00] now: what are the tasks, what is the status quo, and how could AI be introduced into that, or how is it already? What are the needs and requirements of the experts for these specific tasks?
And also trying to understand what mode of collaboration experts want to have in cybersecurity. A big part of my research is concerned with AI autonomy: how autonomously should an AI act for specific use cases, for specific tasks? So trying to understand, for every context and every task, how do experts want to collaborate with AI in the first place? What is the ideal scenario, let's say? For that, I spoke to 27 cybersecurity experts in semi-structured interviews, so I had an interview guideline, but it gave some [00:07:00] freedom to investigate some topics further. We also used canvases and asked them to write down the tasks that they're doing and rate, on a canvas, how much they would like to introduce AI into each task and how feasible they believe it is to introduce AI into it at this point in time.

Heidi: Yeah. And, and what surprised you the most about the research results?

Neele: So what really surprised me the most is, of course, with introducing AI and making it somewhat autonomous, making it act freely in some instances, there's a lot of trust required from the experts. On the one hand, when you ask the experts directly, they are very, very rational and they explain that AI is a tool, AI is [00:08:00] based on algorithms and mathematical optimizations. And while that is true, when you ask them about how they're building trust, or how they're granting autonomy and how that changes over time, they have this really strong anthropomorphization of AI, where they describe the trust-building relationship as if it were, for example, a new subordinate employee they were introducing to new tasks: in the beginning, they would give them probably less critical tasks, observe their outcomes, and then grant them more critical tasks.

So on the one hand, you have this really strict kind of view of this as a tool, but on the other hand, when they actually interact with it and they describe how they would interact with it, they kind of see [00:09:00] it more in a humanized form.

John: It's kind of an interesting phenomenon. I wonder if it's just related to the output that a lot of these systems rely on, which is that assistant design, like you're talking to something, rather than the more mathematical, typical sort of put-something-in, get-something-out in a very strict or narrow way. It's more of a conversational output.

Neele: Yeah, definitely. And that's also one of the findings about how experts want to interact with AI: it's definitely through natural language. They don't want to interact with mathematical outputs; that makes the problem more complex for them. [00:10:00] So natural language, whether written or spoken, is definitely how experts prefer to interact with such tools.

Heidi: Yeah, I thought this was one of the most interesting parts of the article and of the research: it centers around trust and anthropomorphism (I can't even say the word either), giving human-like qualities to something that is not human. And I think it's really interesting for the future of this AI-human collaboration.
I think it's really funny when people say, oh, AI is so stupid, it's like a three-year-old interacting with you. And I'm like, yeah, maybe right now. But as you mentioned in your research, if you're building trust with it, if you're giving it more and more and [00:11:00] more, and it's aligning with your expectations and doing pretty well, obviously it's just going to get better.
That trust is only going to increase, and all of a sudden that three-year-old you were interacting with is a fully fledged adult who can do a lot of the same tasks that a human can. So I think it is really funny how much of a psychological and human side there is to it.
Compared to, like, oh, we're so logical and rational about the tools that we adopt.

Neele: Yeah, that's also a shift in the research, from seeing these as tools to a string of research that considers AI as a teammate and considers this collaboration a team collaboration, not necessarily a tool. That's also partly why I use the term expert-AI [00:12:00] collaboration. And you could argue that for collaboration, you need both parties to have an understanding of the goal.
And we're not really at the point where AI has a goal understanding. That's a unique quality that the human has. We don't know how it will develop over the next few years and whether it will be able to kind of give that as well as the human could.

Heidi: I had a couple of follow-up questions about trust. I mean, we were just talking about building trust and how it relates to a gradual increase in autonomy... autonomy, autonomy, it's just too early. Yeah. And you talk about different automation levels in your research, so you have five different levels.
Help me form a mental model of how trust relates to those different levels that you describe in the research.

Neele: [00:13:00] So in my research I have used five levels of autonomy, and it starts with a decision support system. On this level, the AI simply provides information to someone. So let's say we have the use case where we want to find out whether an email is phishing or a genuine email.
Then for this case, the AI would simply summarize things that it found to be suspicious. And this really doesn't require a lot of trust, besides trusting the information it gives you, which you could verify if you want to, but it doesn't make a final decision. The AI doesn't make any decisions in this case, and it doesn't recommend anything.
So the trust required here is definitely lower than when we're looking at a higher level, a fully autonomous AI system, where the AI would simply delete an [00:14:00] email, for example, that it classifies as malicious. And we don't necessarily want to delete an email which comes from, let's say, an important person in your company: a CEO, human resources.
So that requires a lot more trust in the capabilities of the AI. The trust is different for a decision support system, where the human still really needs to make the decision and consider their own information, up to where an AI just makes a decision based on what information it has collected and found significant or not.
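
To make the levels Neele describes concrete, here is a minimal sketch (not from the paper; the level names, thresholds, and functions are illustrative assumptions) of how the same phishing verdict might be handled at increasing levels of AI autonomy, from decision support up to fully autonomous deletion:

```python
# Illustrative sketch only -- not Roch et al.'s framework. It encodes the idea that
# the action the AI is allowed to take (and the trust required) grows with autonomy.
# All names here (AutonomyLevel, EmailVerdict, handle_email) are hypothetical.
from dataclasses import dataclass, field
from enum import IntEnum

class AutonomyLevel(IntEnum):
    DECISION_SUPPORT = 1   # AI only summarizes suspicious signals; human decides
    RECOMMENDATION = 2     # AI suggests an action; human decides
    HUMAN_APPROVAL = 3     # AI proposes an action and waits for approval
    HUMAN_VETO = 4         # AI acts unless the human intervenes in time
    FULL_AUTONOMY = 5      # AI acts entirely on its own (e.g., deletes the email)

@dataclass
class EmailVerdict:
    sender: str
    suspicious_signals: list[str] = field(default_factory=list)
    phishing_score: float = 0.0    # 0.0 = benign, 1.0 = almost certainly phishing

def handle_email(verdict: EmailVerdict, level: AutonomyLevel) -> str:
    """Return what happens to one email at a given autonomy level."""
    summary = (f"{verdict.sender}: score={verdict.phishing_score:.2f}, "
               f"signals={verdict.suspicious_signals}")
    if level == AutonomyLevel.DECISION_SUPPORT:
        return f"INFO ONLY -> {summary}"              # human makes the call
    if level in (AutonomyLevel.RECOMMENDATION, AutonomyLevel.HUMAN_APPROVAL):
        action = "quarantine" if verdict.phishing_score >= 0.8 else "release"
        return f"PROPOSED '{action}', awaiting human decision -> {summary}"
    # HUMAN_VETO and FULL_AUTONOMY: the AI acts; only the veto window differs
    if verdict.phishing_score >= 0.8:
        return f"DELETED by AI -> {summary}"          # requires the most trust
    return f"DELIVERED by AI -> {summary}"

if __name__ == "__main__":
    v = EmailVerdict("ceo@example.com",
                     ["lookalike domain", "urgent wire request"], 0.86)
    for lvl in AutonomyLevel:
        print(lvl.name, "->", handle_email(v, lvl))
```

The point of the sketch is simply that the required trust scales with the level: at level 1 the human can verify everything before acting, while at level 5 the AI deletes the CEO's email on its own.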

Heidi: I want to back up for just a second because I thought your AI autonomy decision framework was really interesting. You have a really nice diagram in your paper, but it might help for you to just describe what is in that diagram and what makes up that framework.

Neele: Yeah, so when we [00:15:00] spoke to experts we also wanted to understand, for different kinds of tasks, their mental decision model: what autonomy are they willing to grant an AI in that specific context? And naturally, a lot of cybersecurity experts are familiar with risk-benefit assessments, and this is essentially what they applied to the question of what kind of autonomy AI should have.
So for each level of autonomy, they weigh up the risks and benefits of either using a human or using an AI. This entails the error-proneness of the human and of the AI, the potential impact, and the reversibility if something goes wrong. These risk-benefit analyses are influenced on the one hand by the task, [00:16:00] which entails things such as capability fit.
So if we need to be able to kind of assess something against a greater goal, the human is definitely better at that. The human is able to understand the context and the ultimate kind of procedure we would need to reach that specific goal. The AI is definitely not, but it's very fast at computing things.
Or things such as urgency. Is a task really urgent? Do we need to make sure that this task is handled 24 hours a day and handled very fast? Then an AI might be better at this task specifically. But this kind of assessment is also moderated by the expert's trust in themselves and in the AI. And this trust in the AI is also based on whether the human can oversee the AI's work: this classic factory [00:17:00] overseer, I can make sure that whatever the AI, or the worker, does is correct. And this is anchored by the expert's experience with AI in general, but also with that specific AI model.
And if they don't have that, experts still have a predisposition toward a certain trust in a system. These experiences, and making up their minds about how much they trust, can really be facilitated by AI transparency: for example, methods of explainable AI, making the AI's decisions understandable for the human, making the backend (how data is stored, for privacy reasons) understandable for the human. That can really influence their risk-benefit assessment.
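
For readers who think in code, here is a toy sketch of the kind of risk-benefit weighing Neele describes: error-proneness of the AI versus the human, impact and reversibility, moderated by trust built from experience and transparency. It is not the paper's model; the factors, weights, and names below are assumptions made up purely for illustration.

```python
# Toy illustration only -- not the decision framework from Roch et al.'s paper.
# It turns the factors discussed above (error-proneness, impact, reversibility,
# urgency, moderated by trust from experience and transparency) into a rough
# score mapped onto five autonomy levels. Every weight and name is invented.
from dataclasses import dataclass

@dataclass
class TaskAssessment:
    ai_error_rate: float      # 0..1, how often the AI gets this task wrong
    human_error_rate: float   # 0..1, how often the human gets it wrong
    impact: float             # 0..1, damage if the task is done wrong
    reversibility: float      # 0..1, how easily a wrong outcome can be undone
    urgency: float            # 0..1, benefit of fast, always-on handling

@dataclass
class TrustProfile:
    experience: float         # 0..1, prior experience with this specific AI model
    transparency: float       # 0..1, how explainable and overseeable the AI is

def recommended_autonomy(task: TaskAssessment, trust: TrustProfile) -> int:
    """Return a level from 1 (decision support) to 5 (full autonomy)."""
    # Benefit of delegating: the AI is comparatively less error-prone, and the
    # task rewards fast, always-on handling.
    benefit = max(0.0, task.human_error_rate - task.ai_error_rate) + task.urgency
    # Risk of delegating: high impact that cannot easily be reversed.
    risk = task.impact * (1.0 - task.reversibility)
    # Trust (from experience and transparency) moderates how much autonomy
    # the expert is willing to grant for a given level of risk.
    trust_factor = 0.5 * trust.experience + 0.5 * trust.transparency
    score = benefit * (0.5 + trust_factor) - risk
    if score < 0.0:
        return 1   # keep the AI as a decision support system only
    return min(5, 2 + int(score / 0.4))

if __name__ == "__main__":
    alert_triage = TaskAssessment(ai_error_rate=0.05, human_error_rate=0.15,
                                  impact=0.4, reversibility=0.9, urgency=0.8)
    cautious_expert = TrustProfile(experience=0.6, transparency=0.7)
    print("Suggested autonomy level:", recommended_autonomy(alert_triage, cautious_expert))
```

In the sketch, raising transparency or experience pushes the recommended level up, while a high-impact, hard-to-reverse task pushes it back toward decision support, which mirrors the weighing the experts described.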

Heidi: Yeah, I love that. And for UX people listening, [00:18:00] these are UX issues, right? Being able to explain why a decision has been made, what factors were involved in that decision. UX is involved in potentially reversing the decision, all that sort of thing. So, yeah. Love that.

Neele: Yeah, this transparency can really help, and through design you can probably also do that. The human mental model might be different from the model that the AI has, and this transparency can help bring those two together as well. This is based on a paper (maybe I'll mention the author: it's Fersing et al., from 2022), and they have really found that these explainability mechanisms, this transparency, help bring the human mental model and the AI model back together.
And these [00:19:00] explanations can help establish trust between AI and experts. So they're really important and it's important to consider how we do this.

Heidi: Yeah, that's funny. I feel like I have to explain what a mental model is to folks sometimes. So I'm just gonna throw it out there, and John, if you feel like I do a bad job, let me know.
A mental model is just how you believe something works. It might not be right, right? We have wrong mental models all the time, but it's your interpretation of how something works. Where you run into problems is with mental models that are misaligned with reality: one, people misunderstand how the system works.
So they might not trust it. They might use it in ways it wasn't intended for, right? They might expect outputs that [00:20:00] it just can't generate. And that collision, that misalignment, causes a lot of user experience problems, and it can also decrease trust. Folks who don't know a lot about technology might view computers as something they don't trust: "I don't know how this works, so I don't trust it." So just as an aside, that's what mental models are: how we think things work, how we think a system works, and how that can sometimes be problematic for using that system.
How'd I do? What do you think, John? You can give me a grade if you want.

John: I think it's good.

Heidi: B minus.

Neele: I think that was really good, because it gives me something to build on. This is something that researchers found in, I [00:21:00] think, 2010, in a paper on trust in automation: overtrust leads to misuse of a system. So when we design systems to facilitate or foster trust where there really isn't any foundation for it, then users will misuse that system; it will not be used in the right way. And on the other hand, if we design the system with an under-representation of that, we foster disuse. So that's exactly what's going to happen, and I think exactly what you described just now.

Heidi: Yeah, you described it much more succinctly than I did. So going back to trust, one of the things we talked about in a previous conversation was recalibrating trust. Can you describe what you [00:22:00] mean by that? It seems like a really interesting concept.
So I would love for you to explain: what do you mean by recalibrating trust? And what does it mean for the user experience?

Neele: So, I was definitely not the first person to use this. This is something from a paper by Lee and See, and they formulated this idea of calibrating trust.
This describes the appropriateness of human trust in the AI's or automation's capabilities. As we've already talked about, over- or under-reliance can be really unproductive, and therefore enabling humans to have an appropriate understanding of the AI's capabilities can help them form an [00:23:00] appropriate level of trust.
So they don't go into overtrusting, where they misuse the system, or undertrusting, where they completely disuse it or don't use the system at all. And there is this idea that, even if we don't have a specific experience (touching a little bit on my framework, where I've said that experience plays an important role in assessing the benefits and risks), the experts come into the interaction, or the context of an AI system, with some idea of trust. So there is initial trust before even interacting with the system for the first time, and this frames every subsequent interaction with the tool. If there's already a low or a high expectation or trust level, [00:24:00] this small difference in disposition that users might have, because they're individuals with different views and perspectives, may lead to some people engaging with the AI a lot, so they are more able to trust the system and use it, and some falling back to manual labor or manual control. And coming back to the experts: experts are self-confident, and experts are especially prone to falling back to manual control, because they have this higher level of domain knowledge, because they have this feeling that they know it better than the AI. They're initially a little more distrustful than a lay user would be, because the lay user doesn't have as much expertise.

Heidi: Yeah. And one of the things that you mentioned at the beginning of your paper is that AI lacks context, and [00:25:00] context is so important when it comes to cybersecurity. If you want to make strategic security decisions, you have to have context. So that really resonated with me, because it seems like a major shortcoming of AI, and it makes a lot of sense that cybersecurity professionals would be skeptical of AI because it lacks that context.

Neele: Yeah. Context is really important, especially because that's one of the things that the human really needs to bring in. Like we discussed in the beginning, yes, AI is advancing and it can do way more than it could 10 years ago, definitely.
And it's still developing. But this kind of context understanding, understanding what the ultimate goal is independently of the one task that we're executing right now, that's really what AI cannot do yet. And that's really where the human [00:26:00] needs to come in and make these kinds of adjustments and decisions.
And that's why it's really important to have both. These are tasks that require both. You can't just put in AI and have this perfectly automated, and you can't just use the human, because obviously we don't have enough expertise all over the world; that's why we're facing this workforce gap.
But maybe if we can put both together, they can both use what they're really good at and kind of make up for that. Yeah.

Heidi: So for folks listening who are building cybersecurity tools, cybersecurity software: obviously everyone wants to integrate AI. It's a buzzword; it's something that folks are really excited about.
So either they have already, or they're thinking about, you know, leveraging AI in the software that they're building for security professionals. What advice do you have for them? [00:27:00]

Neele: So I would really say consider your target group, because there are huge differences between the needs of lay users and the needs of experts.
Experts, like I've reiterated many times by now, have this knowledge that lay users can't rely on; lay users have to rely on the AI's predictions in those cases. And experts are really able to verify, even if it's discretionary, even if it's intuition; they are able to draw from a really broad knowledge base.
They're able to draw from a very specialized knowledge base. So really consider your target group. What knowledge do they have? How are they approaching AI? Are they more prone to being distrustful, or are they more prone to overtrusting? And then design the product to facilitate that. So [00:28:00] are the users able to correctly calibrate their trust? Can they see how the AI works? Can they validate that?
Also, what's really important: even if the AI produces faulty results, this is not necessarily detrimental. It can be, but it's also been seen in research that if there are chronic failures, humans are able to accommodate that after some time.
They're able to understand the faults and accommodate for them, but they need to be enabled to do that, and they need to understand the AI enough, or be able to make enough of a judgment. So does the AI system facilitate the recovery of trust, as well as just initially producing it?

Heidi: Oh, wow.
Those are really good recommendations. Okay, let me see if I got these right. So the first [00:29:00] recommendation was around calibrating and recalibrating trust. If your target audience is people who are not cybersecurity experts, they may come in overtrusting the technology, and that can lead to problems.
So system designers need to account for that: the fact that folks might overtrust it, and the system wasn't designed to be overtrusted, right? It has limitations, and they need to be aware of those limitations. On the other hand, there are cybersecurity experts, who are immediately going to be distrustful of the technology.
You need to account for that and help earn their trust through the technology. So is that your... yeah, that's fantastic. I love that. And then the second piece of advice, um, now I'm forgetting because I can only focus on one thing at once. Uh, can you just, [00:30:00] yeah, definitely.

Neele: So, um, when, when faults occur in the AI system.

Heidi: Oh, recovery. That's right.

Neele: Yeah. Okay. Exactly. So the decline is not instantaneous; it happens over time. But humans are also able to accommodate if this repeatedly happens and make up for it. So facilitate the progress, or the process, of this trust recalibration, of adjusting to the system as well.

Heidi: Yeah. Are you familiar with Jakob Nielsen's 10 usability heuristics? One is error prevention. And then, oh, help users recognize, diagnose, and recover from errors.
These were developed, I don't know, 30 years ago or so. It was a long time ago. But it is funny how [00:31:00] a couple of those heuristics align with what you're talking about with a cutting-edge technology. It's really interesting how they stand the test of time.
But I think what you say is really interesting: there are limitations, which obviously we know there are with AI, and giving users the ability to recover from some of those limitations is really critical. So I love that that was another one of your recommendations. That really resonates with me.

Neele: Yeah. We can't prevent it from making errors.

Heidi: Right. Yeah.

Neele: But we, we can kind of help users manage that.

John: I was gonna jump in here and talk a little bit about actually getting people to continue to use it. When you're designing [00:32:00] something, as you said, it's not all at once, it's iterative. You want to design it in a way that encourages people to come back, even if it's not perfect, or even if there are a few errors.
And I think that's an interesting piece for designers especially: to encourage users to keep using it as the models continue to get better. I think there's a big problem with a couple of big errors, especially when you're talking about experts who have a lot of their own loopholes anyway, just not coming back.
They're like, okay, this doesn't work, and they leave. And your model isn't getting better because your expert users aren't coming back.
And of course they're not coming back, which obviously hurts you as a designer or [00:33:00] a company building these kinds of solutions.

Neele: Definitely. Yeah. I've also heard that in my interviews: we had one participant who had interacted with AI in a work setting, and it acted faulty, and his conclusion was to never use AI again because he could do it better.
So that's really difficult to recover from, definitely. I'm certainly no UX designer, but some faults are more human-understandable than others. And if I can understand as a human what's gone wrong, then it's a lot easier to engage with it again.

Heidi: Yeah, being really honest and cognizant of where the errors are most likely to occur with your product. By anticipating that, you can design for it, right? Design for helping [00:34:00] the user recover from it. But you have to know what those problems are, which I think just underlines the need for UX people to understand AI and machine learning and how they integrate into their products.
You can't explain it to users if you don't understand it yourself. Definitely, yes. So what are you going to be working on next? What sort of things are interesting to you?

Neele: So I really like the idea of gradual autonomy, and I find facilitating this kind of trust building really interesting.
So I'm going to be looking at the varying levels of autonomy, going from very low, a decision support system without a lot of responsibility, to higher levels where humans are almost uninvolved. And one finding that I found really interesting is this transparency and [00:35:00] explainability.
So my future research will be concerned with designing a system that has gradual levels of autonomy and is able to provide transparency, and then evaluating how that influences the experts' trust and their willingness to use and adopt such a technology.

Heidi: I'm really looking forward to watching your work progress and to reading the next research paper. Thanks. John, do you have any follow up questions?

John: No, I think that's a good, a good stopping point.

Heidi: If, if folks want to follow you or get in touch with you, what's the best way for them to do that?

Neele: I guess through my LinkedIn.
My name is very complicated, but maybe I can provide that to you to post with the episode? That's definitely the [00:36:00] easiest way to engage. And otherwise, feel free to swing by the professorship's website. You can get an overview of everything we do, all around human-centered security. So it might be really interesting to a lot of people who are interested in that.

Heidi: Yeah, that's awesome. Well, thank you so much for joining us today. Your research paper was very, very interesting. Really, really excited about the stuff that you're working on.
So thank you again for taking the time to share your insights.

Neele: Yeah, definitely. Thank you for the invitation and for the interest.