Explore the evolving world of application delivery and security. Each episode will dive into technologies shaping the future of operations, analyze emerging trends, and discuss the impacts of innovations on the tech stack.
00:00:05:05 - 00:00:29:11
Lori MacVittie
Welcome back to Pop Goes the Stack, the podcast where emerging tech is put under a microscope; sometimes a flamethrower, it depends. I'm Lori MacVittie and I have my safety goggles on today. We're going to talk about a very interesting topic today. That's why Joel is here. Well, Joel's always here, but it's going to be interesting, so Joel's glad to be here, right?
00:00:29:18 - 00:00:31:27
Joel Moses
Absolutely. Always good to be here, Lori.
00:00:32:00 - 00:01:04:18
Lori MacVittie
Alright. Awesome. Awesome. Well, that's good because today we're going to talk about apps leaking data. And that's interesting because AI touches lots of things today: search, analytics, browsers, APIs, logs, whatever duct taped workflow somebody built at 2 a.m., right. So when people say, "it's just a chat," they're missing the point. That chat might actually pass through, like, half the stack and somebody else's, you know, in a matter of moments.
00:01:04:20 - 00:01:30:26
Lori MacVittie
So recently, folks started finding pieces of AI conversations showing up where they should not be. There were prompts in search tooling, chat fragments in analytics; private suddenly became indexed, which none of us want. None of us want. So, you know, it wasn't because somebody hacked the system. It was because AI is so woven into everything.
00:01:30:28 - 00:02:00:26
Lori MacVittie
Integration. Integration is a four letter word. Anyone who's ever been tasked with doing integration of anything knows that and now it's apparently one of the ways that AI is leaking our chat history. My special recipes all over the internet. Sad. So that's the new reality. So, today we're going to talk about that; privacy, compliance, why "it's just a chat" is the most dangerous sentence in AI. To do that,
00:02:00:27 - 00:02:09:21
Lori MacVittie
we brought Scott Hendrickson on who knows something about SEO and stuff like that, right?
00:02:09:24 - 00:02:38:29
Scott Hendrickson
A little bit. I come from a background of data science, but working in marketing and SEO optimization for many years before joining F5. So I've seen a number of ways of doing these integrations and sharing data for marketing purposes come and go. And, you know, this is the latest iteration of some of the challenges we're facing with, you know, both compliance and with good intentions for what we do with users' data.
00:02:39:01 - 00:02:56:22
Joel Moses
So let's start by exploring why SEO tools in particular are something that if you're developing something that has a chat bot function in it, you might need to take a second look at your SEO. How are SEO tools typically integrated with an application?
00:02:56:24 - 00:03:29:12
Scott Hendrickson
So SEO tools, you know, typically are integrated with the application primarily by sharing some of the information that the user inputs to make matches with ads on the internet. Right? So that happens through a number of technical mechanisms that track the user from site to site or within a site and then provide either contextual information about what the user's looking at or direct interaction with that site from the user in the form of keyword matches
00:03:29:12 - 00:03:50:22
Scott Hendrickson
and, in the case of mistaking a chat box for not being integrated through SEO, an entire chat log might go into that context for how an ad would be matched to a user's behavior. So you know, sometimes we have a great experience with that. We're looking for a new desk and desk ads pop up when we're, you know, when we're searching the internet for a new desk.
00:03:50:22 - 00:04:02:15
Scott Hendrickson
And sometimes we find that experience overly intrusive when personal information gets shared and indexed through an SEO integration.
00:04:02:17 - 00:04:21:25
Joel Moses
Now, these SEO integrations are typically delivered by something like a tag manager. Meaning it's not really a piece of the application, it's something you link in to the application that gives it the ability to look at the transactions that are going on at the browser side. And so a lot of people forget SEO tools are there. They're not part of the application chain.
00:04:21:25 - 00:04:35:15
Joel Moses
They're not part of that path. They're usually hosted externally. And so people forget sometimes about the SEO tool. What happens if you forget about the SEO tool and you add new pieces to your application? Like a chat bot?
00:04:35:16 - 00:04:56:21
Scott Hendrickson
Right. And so that kind of raises the issue of who's responsible for what. And in the case of, you know, in the case of a user who wants to take care of their own privacy, it's not always clear from the context they're interacting with an application or a website, what's happening in terms of the sharing of data and what integrations exist there.
00:04:56:22 - 00:05:22:29
Scott Hendrickson
I know with, you know, various regulations in Europe and California, some of that disclosure has become required. We have to tell people what some of those integrations look like and who's processing the data. Nonetheless, from a user perspective, it's quite complicated to dig through those lists and understand who's doing what. So the question you're raising sort of goes a second layer
00:05:22:29 - 00:05:43:27
Scott Hendrickson
is that also the intention of the developer might not be fully represented by the next integration that comes along. So, you know, as a developer I was asked to integrate service XYZ, so I did that to the best of my ability, communicating the risks and benefits and sharing as much information as I could. And then later someone comes along and adds a chat interface;
00:05:43:27 - 00:05:56:29
Scott Hendrickson
we're getting a new kind of information from the user that wasn't contemplated in the original integration. And there's no one there to say, "Hey, wait a minute. We built this with these assumptions in mind, not the assumption of what you might type into a chat window."
00:05:57:21 - 00:06:23:24
Lori MacVittie
Well, my understanding is, let's say I've got a chat open in GPT, and I've been talking with it about a number of things, some of which might be very personal, and I don't start a new conversation. I just keep going and I'm like, "oh, hey, I want to buy a TV," and it gives me a link to, you know, TVstore.com
00:06:23:24 - 00:06:56:13
Lori MacVittie
and I click on it. The problem here is like we as users, I know, but a lot of users don't realize that when they click on that, all of that conversation before is context and it gets carried along. Because that's how chat bots work. So TVstore.com just got my query and that entire chat history, and that's really, so now it's got all this information that has maybe, you know, private things, maybe, you know, who knows. Who knows what I was talking about,
00:06:56:15 - 00:07:25:24
Lori MacVittie
none of your business; I'm not telling you. But it shouldn't be at TVstore.com. And now that's in the SEO system, because it tracks "Oh, this came from ChatGPT, and here was all this information" and then it's in the logs. And suddenly this private information is being stored in places it was never meant to be. So I don't, I'm not sure we can educate users to say, "hey, never," you know, do we have to do the phishing style campaigns,
00:07:25:24 - 00:07:30:04
Lori MacVittie
"Hey, never click on a link." I mean, that's the point is to click on a link and make it easy. What's, you know, what do we do?
00:07:30:05 - 00:07:35:29
Lori MacVittie
No answer. Look at that, we just, we're done.
Joel Moses
No, I
Lori MacVittie
We don't know.
00:07:36:00 - 00:07:55:00
Joel Moses
I, Lori you make a good point. I think that one thing that we're facing right now is there's a transition underway that most people are used to using web applications in a transactional manner. Meaning, I go to this page, I look at this product, I may add it to a cart, I may delete it from the cart,
00:07:55:00 - 00:08:17:07
Joel Moses
I may look at a couple of other products, I may abandon the cart altogether. And that's what SEO tools are interested in. They're interested in the flow of transactions from a particular user and understanding if there are any patterns that may help make the application rise higher in the search rankings or fix user acceptance problems with the application itself by looking at the transactions.
00:08:17:10 - 00:08:44:05
Joel Moses
Now the trouble is, when you inject a chat bot interface into these systems that are transactional, then you're adding a layer of conversational interplay with the user. And conversations and transactions are similar, but a lot of times, because of the patterns that we're used to as humans in conversational interplay, we overshare in conversations. We do things towards those systems that might be patently unsafe.
00:08:44:07 - 00:09:20:18
Joel Moses
And so these things, because the system hasn't been configured in the SEO integration to tell the difference between a transactional element and a conversational element, it records both of them. And that means that those conversations become accessible to the people who also have access to the SEO transactional data, which means marketing teams, advertising organizations, search engine placement people. Things that you conveyed conversationally may be accessed by a wider array of people than you thought possible.
00:09:20:21 - 00:09:31:12
Joel Moses
I think that's the real underlying problem here. People forget about an SEO integration and then when it comes to putting a new functionality in your website that is conversational, it over records.
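Show-notes sketch: the over-recording Joel describes comes from a collector that treats every event alike. A minimal illustration in Python of separating the two kinds before anything is forwarded might look like the following. The event names, the `Event` type, and `filter_for_seo` are hypothetical and do not correspond to any real tag manager's API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Hypothetical event kinds; a real tag manager defines its own schema.
TRANSACTIONAL_KINDS = {"page_view", "add_to_cart", "remove_from_cart", "checkout"}
CONVERSATIONAL_KINDS = {"chat_message"}


@dataclass
class Event:
    kind: str
    payload: Dict[str, Any] = field(default_factory=dict)


def filter_for_seo(event: Event) -> Optional[Event]:
    """Return what may be forwarded to the SEO/analytics sink, or None."""
    if event.kind in TRANSACTIONAL_KINDS:
        # Transactional events are what the integration was built for.
        return event
    if event.kind in CONVERSATIONAL_KINDS:
        # Record only that a chat interaction happened, never the text.
        return Event(kind="chat_interaction", payload={})
    # Unknown kinds are dropped by default rather than recorded.
    return None
```

The design choice here is deny-by-default: a new kind of input added later is not collected until someone decides it should be.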
00:09:31:15 - 00:10:00:16
Scott Hendrickson
And I think, you know, we're talking in the abstract here. And coming from the data science side, a lot of this SEO information that you're referring to, you know, came through my teams as, you know, aggregated information or black box processing of recommendation engines or, you know, a number of data science models applied broadly to the set of data.
00:10:00:16 - 00:10:28:08
Scott Hendrickson
So I wasn't literally sitting there reading through everyone's entries.
Joel Moses
That's good.
Scott Hendrickson
However, we don't assume that's not possible, right?
Joel Moses
Right.
Scott Hendrickson
So in the case, you know, in the worst case, we're talking about the actual match of your context to the keywords in Google AdWords, for example, is explicitly shown to a human being. Like, you can see exactly what was typed into the box word for word.
00:10:28:08 - 00:10:48:29
Scott Hendrickson
It's not protected in any way. And that's broadly available to any human who sets up Google AdWords as an advertiser. So, you know, you could think, "well, you know, who cares in the abstract what I was talking about." Well, you're not wrong in terms of maybe the data science community. We don't care in the abstract. We're actually not reading all those things.
00:10:49:01 - 00:11:06:28
Scott Hendrickson
We're just trying to make the matches that are most effective. However, there is no protection against someone reading that who is in that line up of how advertising works. And so it really is a personal disclosure of that exact information in many cases. It's not so abstract in that world.
00:11:07:00 - 00:11:36:20
Joel Moses
Yeah. I mean in the end analysis, I think people need to remember that when you go online you are interacting with a website, and that's what you know. You know what you're giving to it, you know what you're receiving from it, but you're also wearing one of those clear backpacks, like, you know, you're used to going into Packers games with. And the clear backpack is accessible and it's visible to a lot of other people who you don't see that are behind your back.
00:11:36:22 - 00:11:57:02
Joel Moses
And it could include things like where you came from, the device you're using, tracking IDs, cookies that refused to die, and now with these chat bot conversations being added to SEO tools, your AI conversations may be visible there as well. And so it's incumbent on everybody to remember the cost of oversharing and the fact that you're wearing that clear backpack on your back every time you go on the internet.
00:11:57:08 - 00:12:24:15
Joel Moses
Now, what can you do about that? Well, you can read the privacy statements on all the websites and make sure that what you are interacting with on the site has clearly defined guidelines about how they protect information. That's number one. Number two is you should look at possible regulations to make sure that this information isn't shared with advertisers.
00:12:24:18 - 00:12:31:08
Joel Moses
And I know that that's true in several states. And I know that that is increasingly true overseas. But, you know, the cost of oversharing is really what we're talking about here.
00:12:31:20 - 00:12:53:05
Lori MacVittie
Right? And that's, there's a lot of user things like I try not to overshare, right? You know, I try to do these things; I do at least glance through privacy policies. You want to have a good idea of, you know, hey, we're selling all your data. Well, that's a no. But for the problem here that I see, is that right
00:12:53:05 - 00:13:15:28
Lori MacVittie
now, it was SEO that kind of made this, you know, everybody aware of the problem. But as AI, and we start seeing agents, and they start, you know, using tools and people building web apps, go, "hey, I want, you know, my product catalog and my ordering system to be tools," because distribution chains always adopt the new technology,
00:13:15:28 - 00:13:40:00
Lori MacVittie
right. So now you've got potentially all sorts of stuff in your clear backpack going through multiple chains, tools; things are touching and seeing information that, for the most part, users are not aware of. If we went out and asked generally, just asked like five people on the street, go ahead Joel, you go downstairs in the,
00:13:40:01 - 00:14:14:09
Lori MacVittie
just go ask like five people like, you know, "how much of your conversation is shared when you click on a link?" Most of them are not going to know.
Joel Moses
I agree.
Lori MacVittie
They don't understand the mechanism. So it's not just incumbent on companies providing the chat bots and building these things, and people to stop over sharing, but the people who might be receiving these kinds of links. Like if you are a target for an agent or a chat bot or a search, you need to be aware that AI may be oversharing on behalf of users who have no idea.
00:14:14:12 - 00:14:31:27
Lori MacVittie
Okay, I'm not going to explain to grandma. She's not going to get it. Okay? It doesn't matter how many times I tell
Joel Moses
Yeah.
Lori MacVittie
it's not going to happen. So we also need to be aware we might be receiving information we did not ask for, and maybe be a little bit more careful about that.
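Show-notes sketch: for a site on the receiving end of those clicks, one cautious option is to discard anything it did not ask for before the request reaches logs or analytics. The Python below assumes the unwanted context arrives in the query string or the referrer; the allowlist and function names are hypothetical, and the URL handling uses only the standard `urllib.parse` module.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: the only query parameters this site expects.
ALLOWED_PARAMS = {"sku", "page"}


def sanitize_url(url: str) -> str:
    """Drop query parameters that are not explicitly expected, and the fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in ALLOWED_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def sanitize_referrer(referrer: str) -> str:
    """Keep only the origin of the referring page, not its path or query."""
    parts = urlsplit(referrer)
    return urlunsplit((parts.scheme, parts.netloc, "", "", ""))
```

Logging `sanitize_url(request_url)` and `sanitize_referrer(referrer)` instead of the raw values keeps the "this came from a chat bot" signal while leaving the conversation itself out of the record.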
00:14:31:29 - 00:14:51:02
Scott Hendrickson
Yeah. And I think as users we have some new cues to learn from the context. So when we, you know, when we're in the office around the water cooler telling stories, we pay attention to who's there and who's going to retell which story and how that word's going to travel. Right? So we're careful about that based on social cues.
00:14:51:04 - 00:15:12:22
Scott Hendrickson
We've kind of learned how to do this with search engines and email, to some degree, but the context has changed. Now there's an opportunity to have what we assume to be an anonymous but fairly personal conversational style interaction around any kind of question we might have. It does lure all of us in to a different kind of conversation.
00:15:12:25 - 00:15:34:00
Scott Hendrickson
Currently there aren't useful social cues at all. So that information can go many places and from the context of what you're doing, you may not get any indication of who's going to see this and how they're going to see it, in what context, or in what detail. And so it is a change, you know, both from the creator of app side--which I live in every day.
00:15:34:02 - 00:15:50:06
Scott Hendrickson
How do we provide the right context to our users, you know, to be responsible for helping them make the best choices possible? And then as a user, sort of how do I get those contextual cues when they're not very obvious anymore?
00:15:50:08 - 00:16:14:17
Joel Moses
I think it's also worthwhile to look at other integrations. I mean, we talked about SEO today, but honestly, SEO is just one of a class of tools that also has access to this information. There's also advertising or advertising governance systems, tag managers for various general purposes. There are embedded security tools that may also have access to this same information, because they plug in at roughly the same side as an SEO tool.
00:16:14:24 - 00:16:43:21
Joel Moses
There's anti-bot tools; all sorts of things that have access to the same information. And so I think it's incumbent upon the people who do these integrations and operate those integrations to ask themselves three questions. Number one, what should be collected to perform my job? Number two, what must never be shared? And number three, and most importantly, just because we can see something, should we?
00:16:43:24 - 00:17:04:06
Joel Moses
And then design the tool to bypass the information that you should never see. There's going to, again, these SEO tools are super easy to drop in. These advertising plugins are super easy to drop in. And we think of them as just these little plugins that perform these functions for us. But what we fail to realize is these are integrations.
00:17:04:06 - 00:17:24:03
Joel Moses
These are things you integrate with an application, and every integration is a place for oversharing, security problems, potential challenges for loss of information. And so you need to talk about those as full integrations to the application and provide governance for them.
00:17:24:05 - 00:17:57:12
Lori MacVittie
Yeah, I think, and I like that you're pounding on the integrations, because at some point we have to recognize that API calls are the way that integration happens today. It's not a toolkit and a team of people who are, you know, building and tightly coupling things. Integration means I'm going to make an API call today. And I'm going to pass all sorts of information it needs to it, whether in the payload or in the headers or, you know, on the actual like, you know, URI.
00:17:57:15 - 00:18:24:28
Lori MacVittie
So we need to recognize that an integration is necessary in order to build workflows and make things seamless. So it's a good thing, but we have to be more careful about all of that integration and what we're sharing. Part of this, I lay the blame on AI. The fact that it's, you know, statefully stateless. We'll talk about that some other time.
00:18:24:28 - 00:18:48:19
Lori MacVittie
But that really is a problem. It's inherent to the system the way that it actually maintains state as context and then carries it along everywhere. So it's always going to be there and how do we deal with that during integration? So we have to pay more attention to that and understand that mechanism, because it's different than the way that we used to do things.
00:18:48:21 - 00:19:03:13
Lori MacVittie
Scott, you got any, you know, what advice would you give to both users but also to enterprises dealing with those who receive, I don't know, clicks? What should they worry about?
00:19:03:16 - 00:19:24:00
Scott Hendrickson
Yeah, I think, you know, I was thinking about the personal experience first, because I think, you know,
Joel Moses
Only natural.
Scott Hendrickson
like each of you, I'm aware enough of what's happening behind the scenes to be very wary when I'm, you know, sharing information, right. And so, you know, I'm one of those people who uses an ad blocker. I'm one of those people who does clear the cache occasionally;
00:19:24:02 - 00:19:43:18
Scott Hendrickson
pretty often actually. And I can really see a difference in my experience in applications when I do that, so I know it has an effect on those integrations. You know, that's the part you can feel, right? But I also, probably like both of you, don't actually read the list of companies who are processing the data from site XYZ.
00:19:43:21 - 00:20:10:15
Scott Hendrickson
I have a couple of times and realized I don't have time for this. And so that raises another question of scale. And I think, you know, AI contributes to this too, right? Of all the amazing things that have happened with AI, you know, recently, I've been digging into multiple agent use and orchestration and, you know, amazed at what we can accomplish after only 2 or 3 years of having worked with these things.
00:20:10:17 - 00:20:38:20
Scott Hendrickson
It's really fun and wonderful. And at the same time, you know, the opportunity for scaling sort of charismatic leading or misleading information has never been greater. And so as a user, I'm just thinking of how do you practically, you know, read the signs and make great decisions in real time when you're trying to get a thousand other things done at the same time?
00:20:38:22 - 00:21:07:03
Scott Hendrickson
You know, you do want to buy that new set of golf clubs, but you don't necessarily want to share your whole chat with the company who's selling those golf clubs. And so how do we do that? And I think, you know, as a company, as a provider of applications, I really think it's important to provide as much efficient contextual information as possible upfront so that you can tell you're typing in the wrong box something that you don't want to share.
00:21:07:03 - 00:21:30:05
Scott Hendrickson
I think, you know, just those kinds of cues are like social cues, they're very efficient, they're very fast, and they keep people making better decisions based on the cues. That can't be the only answer, right? We also need to have the regulations keep up with the capabilities of the system. We're just in a different world in terms of how much you can scale a personalized, charismatic conversation to your own wishes.
00:21:30:08 - 00:21:50:27
Scott Hendrickson
We've never been able to do that at the scale and efficiency that we can do it today. And so that raises a new challenge for how you regulate and how you do things. Right? I think, you know, it's been a wave of these things so fast with social media with, you know, mass surveillance. You know, who cares if I'm on a camera on Main Street?
00:21:50:27 - 00:22:12:28
Scott Hendrickson
Well, it's no big deal if it's one camera, but when it becomes a coherent set of cameras that can tell a story about your life, it starts to feel much more invasive. Right? So it's the coherence of that that starts to be really invasive, not necessarily the camera. So we're in a very similar situation now where we're starting to share a different kind of information about the story of our life in a very natural way.
00:22:13:00 - 00:22:17:06
Scott Hendrickson
And it's a little hard to tell where that's going to end up and who's going to see the integrated story.
Joel Moses
Yeah.
00:22:17:08 - 00:22:25:03
Joel Moses
Lori, I guess
Lori MacVittie
I'm going to live off grid, you want to come with me? That's
Joel Moses
Yeah.
Lori MacVittie
That's what I got out of that.
00:22:25:06 - 00:22:49:14
Joel Moses
Lori, I guess my takeaway is that, like you said, regulation usually is what makes things change. And, over time, SEO has been subject to lots of different pieces of regulation about privacy and I think that we're probably going to have to redo that for the era of AI. SEO tools have tried to prevent oversharing because of regulations related to privacy.
00:22:49:21 - 00:23:24:11
Joel Moses
And they've added tools like not collecting everything in every text box and masking things that it thinks might be a password and, you know, to try to prevent people who shouldn't see that information from seeing it. I think that that's probably going to have to happen for AI in earnest. But until that time happens, anybody who uses SEO tools should probably go take a peek and see if, again, if you are collecting things that you shouldn't be collecting and sharing. It only takes a quick peek into the logs of the SEO to make sure that that's not occurring.
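Show-notes sketch: the "quick peek into the logs" could start as a small script that flags records looking more like conversation or personal data than keywords. The Python below is an illustration only. The patterns, the `MAX_WORDS` threshold, and the assumption that log records are plain strings are all hypothetical, and a real review would need broader, domain-specific checks.

```python
import re
from typing import Dict, List

# Illustrative patterns only; not a complete list of sensitive data.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
MAX_WORDS = 12  # assumed threshold: longer free text looks conversational


def flag_record(text: str) -> List[str]:
    """Return the reasons a captured value deserves a human look."""
    reasons = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    if len(text.split()) > MAX_WORDS:
        reasons.append("long_free_text")
    return reasons


def scan(records: List[str]) -> Dict[int, List[str]]:
    """Map the index of each suspicious record to its reasons."""
    flagged = {}
    for index, text in enumerate(records):
        reasons = flag_record(text)
        if reasons:
            flagged[index] = reasons
    return flagged
```

Anything `scan` returns is a prompt for a person to check the integration's collection settings, not proof of a leak.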
00:23:24:13 - 00:23:45:24
Scott Hendrickson
Yeah. And I think
Lori MacVittie
Good takeaway.
Scott Hendrickson
I think that's important in the sense of like, there are real life HIPAA regulations you are subject to, for example, if someone puts health care information into your SEO stream accidentally. And so there is some liability here already, even with regulation not having kept up with AI,
Joel Moses
Yeah.
Scott Hendrickson
at least in my opinion.
00:23:45:27 - 00:24:01:02
Scott Hendrickson
And so I think being aware that there is some responsibility for the information users share with you already is a really important thing, and you should review your data with that in mind already. I mean, that should be part of the process of running your business today.
00:24:01:05 - 00:24:15:27
Lori MacVittie
Absolutely. I'd love to continue this, but we're out of time. So that is a wrap for Pop Goes the Stack. Smash subscribe before we set something else ablaze. Safety goggles are now off. Until next time.