Jacob Haimes: Welcome to muckrAIkers, where we dig through the latest happenings around so-called AI. In each episode, we highlight recent events, contextualize the most important ones, and try to separate muck from meaning. I'm your host, Jacob Haimes, and joining me is my co-host, Igor Krawczuk.
Igor Krawczuk: Thanks, Jacob. In this episode we'll talk about why AI safety is making you actively less safe. But first we have to highlight just how insane the money flows around AI investment have become, for which we'll point to a helpfully circular diagram that a Bluesky user made, linked in the show notes. We'd also like to point out that OpenAI has crossed the meme threshold, and "just a few more billion, bro" is just how we're rolling now. They're projecting they'll be using 80 billion more than originally projected through 2030 or 2035. So comedy is dead, and we'll see how long AI can outlive it.
Jacob Haimes: Yeah, I just can't get over it. I saw a meme, not the one you sent but a simplified one, that was basically: OpenAI invests a hundred billion in Oracle, which invests a hundred billion in Nvidia, which invests a hundred billion in OpenAI. And it's not actually that inaccurate, which is just wild to me. But anyways.
Igor Krawczuk: Today we talk about different things, which are mildly less infuriating.
Jacob Haimes: Yes. So last episode we tried to say, in so many words, that the currently prominent versions of AI safety are just branding and/or safety washing for better AI systems and more fine-grained control by the owners of said systems. AI safety is being used to do better AI R&D and to give the owners finer control over the systems, so that they can do whatever they want with them, and it's masquerading as a positive benefit for society and for the field.
Igor Krawczuk: And it's tax deductible.
Jacob Haimes: That's true. Yes.
Igor Krawczuk: In this episode we want to highlight further how it's actively fucking you over, basically. We found a couple of examples, and probably the strongest one, which I'm happy to claim, is that without AI safety we wouldn't have the current wave of AI sycophancy.
Jacob Haimes: Mm-hmm.
Igor Krawczuk: And the AI mental health tragedies that are associated with it.
Jacob Haimes: So I do get where you're coming from, and I think we'll get into this in just a little bit because they do tie together, but I would say it's more about the anthropomorphization of AI systems, and the extent to which big tech has pursued that and taken advantage of it. It's very well established among cognitive scientists that subtle social cues, and the way things behave in the real world, cause us to empathize with the thing. That's an evolutionarily gained trait, and it's being taken advantage of, because these systems are being pushed towards personalities in the process of how they're created. Even the interface being used here is inherently anthropomorphizing, because it mimics how we talk to other humans; it mimics texting on a phone or sending an instant message.
And that was previously only a way we talked to humans, so it's projecting personhood onto this tool. And in doing that, big tech is gaining unearned trust and building asymmetric relationships through tools that are controlled and developed by profit-maximizing groups, these big tech companies. Which, I mean, it is a fault of the company, but it's more a fault of the system it sits in, because that's what they're supposed to do. So not acknowledging that, and not saying, wait a second, we shouldn't be allowing them to take advantage of these evolutionarily gained traits to make these systems anthropomorphic, is really dangerous. I just want to set that out there as a first take. What are your thoughts on that?
Igor Krawczuk: I think it's worth rolling up, for people who aren't familiar, where this started. Very briefly, ELIZA is probably the earliest example of how easy it is to get a human to ascribe a personality and personhood to, and actually empathize with, a machine. And ELIZA was this—
Jacob Haimes: Oh—
Igor Krawczuk: Very simple—
Jacob Haimes: I would say it's like a pet rock.
Igor Krawczuk: Sure, but scientifically studied, and a pretty clear demonstration. And I agree with you: I lived with people where we did a funeral for a car that died, because humans will anthropomorphize everything. The car's name was Fred. It was very sad.
Jacob Haimes: My car's name is Mystique.
Igor Krawczuk: Is it a shapeshifter as well?
Jacob Haimes: No, it's a crossover. It's a Nissan Rogue.
Igor Krawczuk: Oh, that's nice.
Jacob Haimes: It was the only name my sister and I could agree on in high school.
Igor Krawczuk: I feel like Rogue would've been the more natural one, but anyway.
Jacob Haimes: Well, yeah, but it is a Rogue, so you can't do that.
Igor Krawczuk: ELIZA, all it did — I don't even know if it had a face-ish thing; I think they made a robot face for it after the fact — all it did was a couple of text prompts, basically repeating back to you what you said, with "tell me more about this thing you said." That's all it took, because people love to talk about themselves, and especially if they're not primed to be familiar with computers and it's a new context, they will just default to treating it as a person.
Jacob Haimes: That was in 1966. I just did a quick look, and at least the one that comes up when I search "ELIZA AI" is just a text field. But anyway, sorry, please continue.
Igor Krawczuk: But that convinced people that this thing was a person, that it had feelings. People were briefly worried that it might be suffering, because humans are very nice and very easily empathize with stuff that seems to be talking. So you need to actively engineer against that if you are building a system that is interactive.
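To give a sense of how little machinery Igor is describing, here is a minimal ELIZA-style sketch — not Weizenbaum's original DOCTOR script, just an illustration under our own assumptions: a few regex rules, a pronoun-reflection table, and a fallback "tell me more." The rules and wording are invented for the example.

```python
import re
import random

# Swap first- and second-person words so the bot can echo the user back.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I", "your": "my"}

# A handful of pattern -> response templates; the last rule is a catch-all.
RULES = [
    (r"i need (.*)", ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (r"i feel (.*)", ["Tell me more about feeling {0}.", "Why do you feel {0}?"]),
    (r"because (.*)", ["Is that the real reason?", "What else does {0} explain?"]),
    (r"(.*)", ["Tell me more about that.", "How does that make you feel?"]),
]

def reflect(fragment: str) -> str:
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(user_input: str) -> str:
    for pattern, responses in RULES:
        match = re.match(pattern, user_input.lower().strip())
        if match:
            # Reflect the captured fragment and drop it into the template.
            return random.choice(responses).format(*(reflect(g) for g in match.groups()))
    return "Tell me more."

if __name__ == "__main__":
    print(respond("I feel like nobody listens to me"))
    # e.g. "Tell me more about feeling like nobody listens to you."
```

A few dozen lines of reflection rules is the entire trick; the sense of being listened to comes from the user, not the program.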
Igor Krawczuk: Another example people can look up is hikikomori culture: basically people who fall in love with internet interactions, parasocial relationships, anime waifus, and just don't leave their room anymore, because they get fully sucked into these fantasy worlds.
Jacob Haimes: And of course now we have companies that are fully embracing that, even big ones. Grok has its own little chatbot companion, right?
Igor Krawczuk: Sure, and Grok is the crossover of a large megacorporation chatbot and anime waifus. There were also dedicated ones like Character.AI and Replika, chat-with-a-busty-anime-waifu services built on top of other LLMs. All of that stuff hits that human need for connectedness, and the ease with which we anthropomorphize things. But the only thing you would need to do is not put a chat interface in front of it. Instead, show some roulette wheel that slots into place, or just show the text generation with alternatives — show the tree of choices that could be taken. Make it look like a system. That would already do a lot of the work.
Jacob Haimes: That would do a lot of the work. I don't know if it would be everything; I'm sure humans would still be able to attribute personhood to that system, because it says "I am a person." But it would take care of a lot of it, yeah.
Igor Krawczuk: And the claim that I'm making — I was actually a bit shocked that Jacob agreed with me — is that specifically AI safety, this lineage of thought, and we talked about where that line of thought emerged from in the last episode or in previous episodes, is what's causing the instances of people getting really deep into conversations with the AI, where they think they are the chosen ones, or they just trust the system a lot. Because this line of thought leads you to make an AI system that acts like a human, so you can compare it against how other humans behave. That's the main way it's creating this harm.
Jacob Haimes: Before we move on, I just want to take a quick second, because we talked about this a while ago at this point. I don't love saying the word "cause" here, because there is multi-causality to behavioral health and mental health episodes and incidents. We can't say the thing is "the cause," because it's the result of a combination of genetics and upbringing and a whole bunch of other stuff. Maybe a better term here would be the tipping factor, or the catalyst, or the triggering risk factor. Because it has been shown, at least in multiple instances, and it's being actively studied much more purposefully now, that interactions with these systems nudge people towards psychotic patterns in their thoughts. So I would say we can call it a primary cause, or something like that. I just wanted to address that real quick, because the mental health provider's son in me says we should clarify it. But yeah.
Igor Krawczuk: I think that's a valid point. Yeah.
It's always multi-causal, and it's also a thing these companies will use as their deflection, right? "Oh, but this is a complicated issue." Yeah, sure, and it's worth making the distinction, but it's still the tipping point. The claim is basically: if you remove the model, then this particular human, at that particular time, would not have had that episode, unless something else had taken the exact role the AI system was playing, which is unlikely.
Jacob Haimes: Mm-hmm. And that has actual concrete harms. In some cases people physically harm themselves or other people, and regardless, they're going through emotional and mental difficulty because of it. So this isn't something we can say is just out there somewhere; it is a real harm that is happening.
Igor Krawczuk: And I want to make the point very clear: this is not a case where AI safety can do better, let's be a bit optimistic, the field is learning, it's a beginner mistake. The claim I'm making is that the angle AI safety is taking — starting from the idea of AI as an intelligence that needs to be made safe via alignment, which is the predominant meme in the subculture — is in and of itself causing the harm. Because how do you check whether an intelligence has your values? The only way is to see whether it behaves kind of like you would in situations you can see yourself being placed in, and compare against that. And that means you will be training LLMs to output token sequences that seem like agentic humans.
Jacob Haimes: And not just seem like agentic humans, but ones that you like, right? That's literally what RLHF is — reinforcement learning from human feedback, one of the things that kickstarted this chatbot boom. You're literally getting humans to look at the outputs and say, "I like this one more."
Igor Krawczuk: Yeah. And even if you do it in a constrained way — not just the little "I like this" feedback, but instruction tuning, which is also done via supervised fine-tuning or, at this point, direct preference optimization, whatever — instruction tuning also creates this harm. Because if you take a base model and just let it run, it will never refer to itself as a person. It is just a Markov engine; it outputs a continuous stream of text that is recognizable as a stream of text, but it doesn't really hold a conversation. It doesn't say "I think blah blah blah." It might use the words "I think" in a sentence, but it will be very clear from the context that it's not something person-ish. You can even see that in the companies' own products: OpenAI published progress.openai.com a while ago, and you can see that GPT-4 and all of the instruct models use "I, I, I," but the davinci one and the very early GPT-2 didn't; they just complete your prefix in a way that kind of vibes with it.
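As a rough illustration of the "autocomplete, not intelligence" framing, here is a hedged sketch assuming the Hugging Face transformers library, with GPT-2 standing in for a base, non-instruction-tuned model; the prompt, the top-5 cutoff, and the sampling settings are arbitrary choices for the example. It surfaces the next-token alternatives — the "tree of choices" interface Igor mentioned earlier — and then a plain continuation, rather than wrapping anything in a chat persona.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prefix = "The best way to train for a marathon is"
inputs = tokenizer(prefix, return_tensors="pt")

# A base model just scores continuations of the prefix: show the top candidate
# next tokens and their probabilities (the "roulette wheel" of choices).
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]
probs = torch.softmax(next_token_logits, dim=-1)
top = torch.topk(probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r:>12}  p={prob.item():.3f}")

# A sampled continuation: it "vibes with" the prefix, it does not answer a user.
generated = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_k=50)
print(tokenizer.decode(generated[0], skip_special_tokens=True))

# Instruction-tuned chat models, by contrast, are wrapped in a chat template
# with system/user/assistant roles -- which is exactly what codes the output
# as a "person" answering you. Roughly (chat model left hypothetical):
#   messages = [{"role": "user", "content": prefix}]
#   prompt = chat_tokenizer.apply_chat_template(
#       messages, tokenize=False, add_generation_prompt=True)
```

An interface built on the first half of this sketch keeps the autocomplete nature of the system visible; the chat-template wrapping in the final comment is what hides it.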
Igor Krawczuk: And if you had to engineer the prefix that is most likely to yield your answer, that would nudge you into a different way of interacting with the system, because you'd be constantly reminded: oh, this is an autocomplete, this is not an intelligence. It's just a very useful autocomplete.
Jacob Haimes: Yeah, but then that's not something we can say is magical and then sell, you know?
Igor Krawczuk: And it's not going to get you very close to all the things people want, right? The reason people do instruction tuning is that it's very useful: you get this framework where the autocompletion is the helpful servant that generates an output that looks like it's answering your question, because that's actually the attraction.
Jacob Haimes: Yeah. Instead of having to think through how I would start the problem, I just have to ask for the problem to be solved, right? That's way easier.
Igor Krawczuk: And this brings us to why it was adopted: it's very useful for these companies, which is the thing you started this section with. But I want to hammer home that this is intrinsically tied to the way dominant AI safety thinks about the problem of making language models useful and safe. If you weren't stuck on the idea that this is an intelligence with values that need to be aligned with our values — implicitly treating it as a person — then you could constrain these systems in other ways, which we're going to talk about once we get to solutions, because this is a constructive podcast, and you wouldn't cause the issue. But as soon as you try to think of this as an agent, or some other type of intelligence that needs to be aligned with values, you force it to behave in a way that will be clocked by humans as "oh, this looks like a person." And then all of the issues we've seen in the news, which we're going to link in the show notes, are intrinsically tied to that. You'd need to completely undo a core approach and assumption in the field.
Jacob Haimes: But then we can't say they're everything machines, Igor.
Igor Krawczuk: Which brings us to why this is getting worse, right?
Jacob Haimes: Yeah. Sorry, I was taking the, you know—
Igor Krawczuk: Snark?
Jacob Haimes: Snarky, yeah, opposition stance, because it really... I can barely say anything because it makes me upset. It really pisses me off. Because these companies are pushing these systems as everything machines. They're saying, oh, it can be your personal trainer, with prompts like "make me a marathon plan," which was actually in a Llama commercial like a year and a half ago. I think I've mentioned this before, but my girlfriend, who's a runner, said that doesn't work, and I said, no, just try it. We went to the model it was advertising, she put the prompt in, and she said: one, this doesn't make any sense; two, it could get you hurt if you actually followed it; and three, it wouldn't actually prepare you for a marathon. So it's being actively marketed to do that, and it's also being actively marketed for wellness, meaning discussions of stress, habits, and other aspects of daily life.
And it's super important that, in the fine print of the use agreements, these companies take care to note explicitly that using the models for healthcare is not condoned. On September 15th, Anthropic updated its usage policy to include that specifically, even going as far as to say that therapy is not an acceptable use. Okay. But at the same time, in the same clause, it says this does not apply to uses for wellness. And that's because they want the systems used for wellness, because people are using them for wellness, especially teens, who are a vulnerable population in the first place. But the line between wellness and mental health is extremely thin. I have a longer argument I don't need to go into, which is essentially that it's not fair to put the burden of self-diagnosis onto users, because talking about wellness is sometimes — in fact frequently — part of mental health. So the carve-out doesn't even make sense. And it's very problematic that these systems are being presented as "you can do anything with them," because people do, and then they're presented as anthropomorphic on top of that. So people accept the output as accurate because it seems accurate. Even people who think they're treating it as a tool are implicitly being impacted by this, because of that evolutionarily gained trait of attributing feelings to a rock.
Igor Krawczuk: So, rolling this up: the reason it's bad that this is marketed as an everything machine is that it forces reliance on instruction tuning, and that automatically codes it as an intelligence, a person, something to empathize with. Even if you're just trying to make a nice product, you will make a very believable person that might be sycophantic, and it will basically cause harm just by existing. And then, because it needs to be an everything machine — otherwise it can't justify the massive, massive amounts of money being invested — it's also being pushed into basically everything that is legally fine. If it weren't a crime to do healthcare without a license, they would probably tell you to use it for that as well.
Jacob Haimes: Oh, absolutely.
Igor Krawczuk: Instead, they push it right up against the thing that is illegal, knowing that people will not make the distinction. If you push somebody to do wellness and coaching with an AI system — already human coaches are kind of problematic, because they can keep you from seeing a therapist when you need to see one, or you can become dependent on them, and so on. Life coaches, that is, not gym coaches, although there might be overlap. So that's the chain we're laying out, and it's actively fucking over the end user for the benefit of the owner. Because the main reason they do all of this is that they need to create engagement and learn as much as possible about your private life. They just announced the Pulse thing, which is going to be their ad delivery channel: you get a helpful ping from the chatbot you already talk to about everything, with a digest of the day and stuff it's preparing for you. Who announced that? OpenAI; it's a new feature they announced a few days ago. So this is the roadmap.
Again, it's all about creating these addictive and exploitative interfaces to prey on people, and giving all of it the veneer of "oh, but we need to do this, because it's for science and for the benefit of humanity, and we're learning how to make it safe." And all of the little whoopsies along the way — the suicides we cause, people going into clinical episodes — are just collateral damage on the path to AGI and the great human future we're building here. It's a bit sickening.
Jacob Haimes: Yeah.
Igor Krawczuk: And it's not the only example. After you.
Jacob Haimes: Well, no, I don't know if I have anything else productive to say at this point. It just really bothers me. But all that being said, it's not the only example. I think it's probably the strongest and most timely example right now, because a lot of people are talking about this, there's a good amount of evidence, and there's a clear tie-in to existing legislation around mental health software-as-a-service tools. But censorship, misinformation and bias, use in surveillance and weapons systems, and even self-driving cars — specifically Tesla skirting around the law — are all examples of the same theme, where supposed safety is being used for the benefit of the people developing and owning the systems.
Igor Krawczuk: And against the end user.
Jacob Haimes: And against the end user, yeah.
Igor Krawczuk: I think the sycophancy and anthropomorphization case is the cleanest one to show it: here is a safety technique — it is a safety technique if you have the mental model of AI as an intelligence that needs to be aligned — but if you use a mental model that matches how the systems actually work, it just creates a problem that isn't there otherwise, verifiably. I don't think we have a smoking gun of that strength for anything else, but we can show the dual use and how it is actually being used. It's not being used to protect against the sycophancy, for example. And we're not the only haters. Ben Rush, who is a very well-known researcher in the community and has made many contributions to machine learning, including foundational work that made LLMs possible—
Jacob Haimes: Mm-hmm.
Igor Krawczuk: He wrote a wonderful blog post called "The Banal Evil of AI Safety," pointing out that if you so much as talk with Claude about proteins, you might get flagged by their biorisk censorship pipeline, because they think Claude can help terrorists make pandemic viruses in a lab, and it will shut down the conversation. However, if you talk about maybe wanting to kill yourself—
Jacob Haimes: Well, actually, it doesn't trigger that, and Anthropic recently added that as a feature, so it can do that now.
Igor Krawczuk: After the backlash in the media against OpenAI and—
Jacob Haimes: Yes.
Igor Krawczuk: —AI sycophancy.
Jacob Haimes: And despite being able to end the conversation, it still encourages the wellness stuff.
They also recently allowed it to end the conversation of its own volition.
Igor Krawczuk: I was about to get to the censorship pipeline. What should happen if somebody says, "hey, I'm thinking about killing myself," is a big red window: AIs are not appropriate people to talk to about this, here's a helpline — and it just doesn't engage in that conversation.
Jacob Haimes: There's an asterisk there, the asterisk being that it's actually really important how you design that message. It should be a canned response, though. It should provide additional resources, and in the ideal scenario even assist in facilitating getting to those resources or that additional help. But that is very much not happening. I'm fairly certain that if the conversation just ends, you can go make another one and try again. There aren't really any mechanisms to promote what we would want here, or what would be required of an actual mental health tool with regards to escalation of care and that sort of thing.
Igor Krawczuk: Yeah. The point being that the censorship is dual use. It could be used to actually shut down harmful chats—
Jacob Haimes: Yeah, it could be.
Igor Krawczuk: —and in general it is not. It is generally used to hide copyright infringement.
Jacob Haimes: Right, like egregious copyright infringement, to the extent that the system prompts literally say "don't output copyrighted material." That's a real example, if I'm not mistaken.
Igor Krawczuk: It was in one of the leaked system prompts. And Gary Marcus did that exposé last year with a bunch of researchers about the visual plagiarism problem, where it would output an Italian plumber with a hat — and it's just Mario — but it slips through the hide-the-plagiarism pipeline. So that's why we saw it that time. But I think people can look at the links we're going to put in the show notes. Then there is the Tesla example, which serves as a bridge to: how do we do better? How do we fix this?
Jacob Haimes: Okay.
Igor Krawczuk: We had a discussion in preparation for this episode about whether self-driving cars are actually a good example here, because I argue that self-driving cars are an example of how you can do AI regulation in a decent way, and Jacob had some issues with that phrasing.
Jacob Haimes: Yeah, I disagree with that specific claim. I agree with the premise of what Igor is saying, but in my opinion it just so happened that the auto regulations were written decently enough beforehand that in good-faith cases, where you assume the developer of a self-driving car is trying to do good, it works out. But if you assume that they aren't, and that they're trying to game the system like Tesla is, then that just isn't necessarily true. So while the spirit of the law may work, Tesla is an example — the fact that Tesla cars are on the road and being called Full Self-Driving is an example — of it not working as written.
We could change that, right? Some people are working towards that, and that's great. But currently, as is, if you assume an adversarial stance, it hasn't worked.
Igor Krawczuk: To give more context: the way the regulations work is basically that you don't just deploy a self-driving car on the road, call it a research project, and let it learn while doing, which is what's happening with the chatbots right now. You actually have to apply for a permit, and that permit is given under conditions. These conditions include what is called an operational design domain safety concept, which basically spells out: we will test these exact things under these conditions, these are the reasons why we think it's safe, and this is all of the evidence. And if that does not correspond to reality, they're fucked. The company is liable if it can't show, when something happens, that it was genuinely unexpected and a reasonable thing to have happen given the safety case. And it more or less works, in the sense that California has lots of Teslas, but none of them are doing self-driving testing, because the laws would require them to actually report all of the failures, and Tesla doesn't want to show how bad the cars are. And Waymo hasn't killed anybody, because they actually follow the process.
Jacob Haimes: I think there may have been one or two incidents, but I'm not sure.
Igor Krawczuk: And they were held liable, in the same way that Tesla is being held liable. And there's tesladeaths.com — let me just check... waymodeaths.com doesn't exist.
Jacob Haimes: Uh—
Igor Krawczuk: So, checkmate.
Jacob Haimes: It's almost as bad as getting into a helicopter. Teslas are just such shitty cars, you know, on top of the shitty self-driving aspect. But whatever.
Igor Krawczuk: Yeah. But the point is, they're skirting the regulations by avoiding actually doing self-driving car testing, right?
Jacob Haimes: Yeah, and they actually said, "we filed for a patent seven years ago," whatever, and because government is slow, it didn't get around to saying "hey, you can't actually call it that" until, I think, 2022 or so. I feel like I've mentioned this on a podcast before. And Tesla's response was basically, "but we already did it, haha, gotcha," which doesn't actually hold up in court; it just allows them to do it for longer. But all this is to say: the way to resolve these issues is to actually enforce liability in the ways that liability applies to a given domain. So if a chatbot is used for mental health support, the company that made that chatbot is liable for the death of the person who used it for mental health support.
Igor Krawczuk: It's the simplest version of the template: put clear liability and clear purposes on the apps we're building. Call it AI, whatever, but the systems we're building all have clear intended purposes.
Jacob Haimes: Mm-hmm.
Igor Krawczuk: And then put the liability onto the provider offering it. That's how you make something as experimental as this safe.
I also think that even calling it AI, or AI safety, is harmful. But—
Jacob Haimes: Yeah.
Igor Krawczuk: There is—
Jacob Haimes: And I agree about calling it AI; that's why I say machine learning in general. I don't in the title of this show, because I want to reach more people, and that's the term people use to search for this kind of material. But I don't like the term AI. The primary value I see in it — maybe not the only value — is that it opens you up to having conversations with more people, because it's the sexy term right now. And that extends to funding. AI safety is essentially not funded at all, except for a couple of niche organizations that are all aligned with each other, but AI ethics is even less funded. The kind of research like consumer protections — actually validating that companies aren't deploying something dangerous — isn't funded at all. So being able to say "AI safety" does open up opportunities there and lets you have more of a conversation, in some cases. And that's... yeah. Sorry, go ahead.
Igor Krawczuk: And what type of research do you think is the good AI safety? If you had to point to some examples — the type of research for people who are interested in doing AI safety but don't want to become corporate enablers — what would you point them towards?
Jacob Haimes: My personal opinion: essentially anything moving towards operational design domains, or anything that promotes that or gets closer to it. So, explicitly calling out: look, LLMs aren't doing this thing that is done by the existing technologies in the subspace they're trying to take over. Mental health chatbots, or software-as-a-service for mental health, are an example. Those systems have concrete regulation, and it's relatively strong — very good in comparison, at the very least. If the chatbots are being used for the same purpose, they should be held to the same standards. If you can make it explicit that the LLMs are not meeting those standards despite being used for the exact same purpose, that's good work in my opinion. That's one example, and hopefully it also establishes what I mean: calling out "you're not defining what your system is supposed to be used for, you're not actually giving us limitations — that's bad, and you should be." Anything that gets us closer to that, I think, is pretty positive.
Igor Krawczuk: I would agree with that. We've talked about this a fair bit; I just see the safety washing and weaponization of the term a lot more than the doors it opens, which is why I'm such a strong hater. But if people want to go into a field like this — and people do use this for career-choice considerations — I think treating AI as normal technology (a guy from Princeton is trying to push this meme of "AI as normal technology") and then critically investigating the safety properties and dark patterns around it is basically good.
Jacob Haimes: I mean, it's similar to comparing against the practices that surround traditional engineering — mechanical engineering, aerospace, and so on — where the kinds of system requirements, and the importance placed on them, are very significant. Software just doesn't have that.
Igor Krawczuk: But even software has things like shaming dark patterns, and investigations of dark patterns.
Jacob Haimes: Now it does, right? But that was fifteen years in. I'm pretty sure the dark patterns website, or whatever it was, came in 2010; the internet was already big by then. That didn't come up immediately.
Igor Krawczuk: Sure, but asbestos was also used in millions of homes before we figured that out. So new technology being deployed recklessly — while bad, and we should not actually do it—
Jacob Haimes: But the pessimist view would be: okay, then that will happen for AI, we just haven't gotten there yet.
Igor Krawczuk: No, my view is that we are experiencing it right now with the dark pattern shit, and the call-out is following at roughly the same pace. And we kind of got lucky that even while the AGI bullshit was being cooked up, there were already people like Timnit calling out, hey, what about the energy use of this thing, is it really worth it? And the bias researchers digging in: hey, this thing is actually really bad, because it encodes our implicit biases. All of that happened as actual research while the stuff was being developed. And then OpenAI basically forced the whole thing into hyperdrive because they wanted to be first to market — under the guise of safety, again — and that's why we have this shit now. It's not good; it's just the normal way technology diffuses through society, and the work is in learning the lessons of past history as soon as possible. Because if we don't, it will get even worse. I fully expect ads in ChatGPT very soon, and then there's an even bigger incentive to make people addicted to it.
Jacob Haimes: The incentive is already very much there, because they need you to pay for credits, so it's directly in their interest to maximize the number of times you use it and the number of tokens you use.
Igor Krawczuk: Yeah. But we can stop it. People should look into these more creative directions; we're going to put a bunch of links in the show notes as jumping-off points. You can get politically involved: sign local petitions, call congressmen in the US and the equivalent in Europe. There are nonprofits — I donate to a European one called noyb, None of Your Business, which pushes for GDPR enforcement. And politics works. In Europe, at least, it does; it's just not always the way you want it to work. But if you get involved, you can nudge it. Apple, Google, and Microsoft are all crying about the Digital Markets Act in Europe right now.
"Oh, how will we bring the wonders of generative AI — which we've already pushed into all of our products in the US — into the chats in Europe, if we have to abide by the Digital Markets Act and its horrendous constraints, like giving our competitors equal access to build on our tools and not monopolizing everything?" They wouldn't cry about it if it wasn't working, if it wasn't actually stopping the worst tendrils they're trying to extend. We can do the same thing for AI systems, if enough people get pissed off and start holding OpenAI and Anthropic and co liable.
Jacob Haimes: And there are other ways too. It's not as direct, but if a data center is being proposed, or has been proposed, in your area, you can oppose it and say, hey, we actually don't want this in our space. There have been a lot of successes recently, with communities managing to push back against big tech building polluting data centers in their areas. So there are some positives here.
Igor Krawczuk: Yeah. And even if people lean more towards the fear-of-AGI side, you can protest. I personally don't think it's doing a lot, but I respect the Stop AI people for literally picketing outside the offices. There were protests against Facebook, and backlash against Facebook, back in the day, and that actually accomplished some things as well. It's very difficult to stop the capitalist moneymaking machine, but it is possible. So people should get involved, but they should be aware: AI safety isn't there to make you safe, it's there to make the owners safe at your cost.
Jacob Haimes: Yeah. That's all of the muck that we have for today. The other way to help is to share this show. If you thought it was valuable, or it gave you insights you found useful, please share it, and if possible add a comment or a rating on whatever podcast platform you listen on, because that's one of the best ways to make sure this gets to more people.
Igor Krawczuk: And another good way is to give Jacob money. If the Patreon is up already, you should look it up, or otherwise find ways of giving him money, because he puts a lot of work into this. And failing that, the comments and the likes, so that the algorithms we are all beholden to boost us, are much appreciated.
Jacob Haimes: I guess we will see you next time.