[00:02:49] Jacy Reese Anthis: Yeah! I'm Jacy Reese Anthis, I am a full-time researcher at Stanford University in computer science. [00:02:54] Jacob Haimes: What are you trying to figure out with your research? [00:02:57] Jacy Reese Anthis: I am aiming, with all sorts of different methods, to help humanity navigate the emergence of digital minds. [00:03:04] Jacob Haimes: So, tell me a bit about how you ended up where you are now. What were some of the more substantial things that informed your trajectory, how have you ended up in this space, and then what is it that you're doing now? [00:03:17] Jacy Reese Anthis: Yeah, my undergraduate degree was in neuroscience, so I was very interested in understanding the human mind, but especially the minds of non-human entities, including animals as well as emergent artificial intelligences. I was thinking a lot about career choice at the end of my degree. Went through this effective altruism cycle where you think about global poverty issues and animal welfare issues and artificial intelligence issues, and kind of rotate through them. [00:03:41] Jacy Reese Anthis: But I knew I had to make at least a tentative decision. So I ended up focusing on animal welfare. I got a job at Animal Charity Evaluators and then co-founded the Sentience Institute, uh, where I continue to do research today. And Sentience Institute was focused on animal advocacy, in particular this phenomenon of moral circle expansion. [00:03:59] Jacy Reese Anthis: How do we come to understand and get to know and care for new non-human entities? In this case it could be other human groups that our circles are expanding towards. And then for artificial intelligence, I was very interested in that throughout that period. But once my book had come out, The End of Animal Farming, I felt like that was a good culmination of my work on farmed animal advocacy, which is a giant group of non-human entities. [00:04:21] Jacy Reese Anthis: And then we are seeing this deep learning revolution and this emergence of a new group of non-human entities in society. So that's what I've been focused on since then. [00:04:29] Jacob Haimes: And so that, the book came out in 2018, I [00:04:32] Jacy Reese Anthis: Yes. [00:04:32] Jacob Haimes: Uh, so that's actually like, when that started to occur. So pretty early on, relatively, in the, uh, [00:04:41] Jacy Reese Anthis: boom, I guess, that we're currently in. [00:04:43] Jacy Reese Anthis: Yeah, I think in 2014, you know, I was focused on, we had a few different names for it, but like long term animal advocacy, or long termist animal advocacy eventually. So this idea of this moral circle mattering for domestic animals today, like dogs and cats in our homes, farmed animals, but also expanding towards wild animals, and then eventually expanding towards artificial intelligence, or what sort of digital sentients might exist in the future. [00:05:07] Jacy Reese Anthis: So I was focused on thinking about AI throughout that time. The book, for example, concludes by talking about digital minds. And around 2018, when the book came out, I was going to actually transition the explicit focus of my work into that area. I didn't know a lot about the technical side of AI. [00:05:22] Jacy Reese Anthis: If I did, I might have planned things a little differently. Um, like I didn't actually know, I think, what a deep learning revolution was. Um, but I knew that AI was important and it was gonna take off within my lifetime, so it made sense to focus on it.
[00:05:35] Jacob Haimes: Can you connect how you went, like, okay, animal welfare, into digital minds? [00:05:41] Jacy Reese Anthis: Yeah, so there's a long history of this moral circle as the idea of starting from our friends and family and extending to people in our city, to people in our country, to people around the world. And then at least within the past couple of centuries, there's been discourse about animal rights and expanding it to non-human entities. [00:05:58] Jacy Reese Anthis: There are also some people who want to expand it to environmental entities, for example. But my motivation was kind of, ultimately, I want to create the greatest happiness for the greatest number of sentient beings, and domestic animals in the food system are over a hundred billion. You know, they're just a huge group of suffering entities who not many people are helping. [00:06:18] Jacy Reese Anthis: So it made sense for that kind of cause prioritization reason. And then I think the same argument extends to wild animals. You know, the number of wild animals dwarfs the number of farmed animals, and then it extends a step further to digital minds, or in general future minds. You know, I expect just in the far future, most of the minds to exist in the universe to be digital. [00:06:39] Jacy Reese Anthis: So really just focused on that huge number as we get, you know, exponential population growth, as we expand to the stars, as we get artificial general intelligence, that being just a very morally important group of non-human entities to focus on. [00:06:51] Jacob Haimes: So, so really, like, the more long termist perspective there. We'll get into that in a little bit. But before we go further, I guess the other thing that I'm curious about is, you've done a lot of work in very distinct areas within the AI space. So, like, there's been some evaluations work, you've done a survey on, uh, like AI companions, you've worked with, like, social simulation and using language models for that. Like, what unifies these, in your eyes? [00:07:20] Jacy Reese Anthis: Yeah. And then there's another big area, which is fairness research and more mathematical, you know, proofs of what it means for an algorithm to be fair. My broad interest there is, uh, sociotechnical AI safety. So kind of taking this interdisciplinary lens to what have traditionally been approached as issues of hard and fast, you know, technical control, especially in the 2010s of AI safety. [00:07:44] Jacy Reese Anthis: But still, now, you know, this idea that we need to, like, go into the system, it's gonna mesa-optimize, you know, it's gonna optimize in some way that we don't expect and we should control for that. And rather seeing it as, like, well, what values do we want to align AI with? How could that go wrong in a political system as we get, you know, very weird AIs, as we're starting to get today with this jagged frontier of capabilities? [00:08:05] Jacy Reese Anthis: What sort of social dynamics do we have to navigate, as well as, you know, eventually things like algorithmic fairness, which I think are important for distributing resources kind of throughout whatever large-scale society we see in the future. And also as a sandbox, you know, we've done more to mathematically formalize fairness than any other area. [00:08:23] Jacy Reese Anthis: So if you want to understand how we answer a question like, what values do we want to optimize AI for, then starting in fairness makes a lot of sense.
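A minimal sketch, for readers who want the math behind the "conditional equality" framing discussed here: two standard criteria from the algorithmic fairness literature, written with the usual notation where Y-hat is the model's prediction, Y the true label, and A a protected attribute. These particular criteria are common textbook examples, not necessarily the ones used in Jacy's own papers.

```latex
% Demographic parity: positive prediction rates are equal across groups.
\Pr(\hat{Y} = 1 \mid A = a) \;=\; \Pr(\hat{Y} = 1 \mid A = a') \qquad \forall\, a, a'

% Equalized odds: prediction rates are equal across groups, conditional on the true label.
\Pr(\hat{Y} = 1 \mid A = a,\, Y = y) \;=\; \Pr(\hat{Y} = 1 \mid A = a',\, Y = y) \qquad \forall\, a, a',\; y \in \{0, 1\}
```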
You know, fairness has this nice equal sign, uh, you get to make, like, a conditional equality between two different areas. You say they're gonna be treated the same; tons of different ways to formalize that, but that makes it very natural as a mathematical object. [00:08:44] Jacob Haimes: So taking a step back, I wanna focus on Sentience Institute, which you began with your now wife, I believe, in 2017. So this is shortly before you publish your book. What's your goal here? Why did you do this? [00:09:00] Jacy Reese Anthis: Yeah, so I was working at Animal Charity Evaluators as a researcher, doing nonprofit research to evaluate the effectiveness of particular organizations. I started with Animal Charity Evaluators around 2013 as a volunteer. I was on their board of directors before I finished my undergraduate degree, and then as a full-time researcher, and ACE had gotten off the ground in many ways. [00:09:21] Jacy Reese Anthis: So we did a lot of the foundational research. I think, like, in broad strokes you could tell a similar story with global poverty charities and, like, the discovery of malaria nets and deworming pills as really effective interventions. We'd sort of done that work in charity evaluation for animals. So I felt like it was time to move to something else. [00:09:39] Jacy Reese Anthis: I had been talking to some editors at media outlets, uh, like the Huffington Post was actually where I got my start, but then writing for Vox and other places, and then talking to literary agents and book editors about the possibility of doing a book. So there were several things that led me to want to kind of start a new area, or a new organization in some way, and have more on-the-ground impact, like I enjoyed doing so much at Animal Charity Evaluators. [00:10:05] Jacy Reese Anthis: So then it was actually the Effective Altruism Foundation, this group based in Switzerland and Germany, who contacted us about their group called Sentience Politics. They wanted to spin that off from the organization. We said, okay, Sentience Politics, well, we'll take the research side of that and call it Sentience Institute, and do research on, again, moral circle expansion broadly, but especially focused on animals for now. [00:10:28] Jacy Reese Anthis: They gave us our seed funding. Uh, Kelly, my wife, was the executive director of the organization, so she set it up, you know, did a lot of that initial branding, the website, setting our direction as an organization, and then I was doing the research side of it, and as it transitioned into AI stuff, that was just sort of natural, continuing our work on moral circle expansion. [00:10:52] Jacob Haimes: Um, and, like, I guess building off of the Sentience Institute naming, and, you know, you just mentioned digital minds and sort of talking about those claims. I guess another thing that I think about when we get towards what I would say is, like, anthropomorphic, personifying language. And this is also a pattern in the writing of researchers: people say things like, you know, they appear to have reasoning, emotion, agency, and other mental faculties, or are, you know, mind-like, and there's no commitment to that from, like, the researcher's perspective.
[00:11:31] Jacob Haimes: But then those are taken sort of out of context and used as, you know, fodder for sensationalist pieces, or just seen by other people that sort of, you know, say, oh, you know, sentient and AI were in the same sentence. That's good enough for me. [00:11:51] Jacy Reese Anthis: Yeah, this is really interesting because we live in this age of virality, where just the term you pick for something or the quote you share can drive so much of, at least, public engagement. Uh, a couple angles on it. So one, the historical angle. So, um, when we started focusing on digital minds or digital sentience, there wasn't as much of a concern about sensationalism around AI. [00:12:15] Jacy Reese Anthis: There was some concern about hype, you know, in 2016, like, Microsoft switched their research branding to artificial intelligence. You had systems like AlphaGo and things that were seen as potentially being exaggerated as superhuman, and they're gonna revolutionize every field, when they're relatively narrow. [00:12:33] Jacy Reese Anthis: And it became like a corporate branding strategy. But that wasn't a concern about anthropomorphism. I mean, there have been concerns about anthropomorphism since the fifties. Uh, you know, we have, like, the ELIZA effect that people talked about with the world's first chatbot. Um, we've had Ben Shneiderman in human-computer interaction critiquing the use of first-person language by computers, you know, like bank teller machines that were automated and would use the term "I". They disliked that. I mean, kind of the ship has sailed with large language models and first-person language, for better or worse. But now we see those as more of a concern, and that's one reason to now focus not just on under-attribution, so we develop conscious AI, they matter, and we're all really cruel to them because we don't recognize their sentience. [00:15:11] Jacy Reese Anthis: But also the risks of over-attribution and sensationalism. People think that it's more than it is, or people even divert resources away from truly sentient beings and towards inanimate machines in some way. Um, so these are all important considerations. I think in terms of how I see my role, um, I think that, with that sort of measured language, I try to take discussions that are already in a relatively anthropomorphic direction and make them more measured, more so than take the discussion and shift it in an anthropomorphic direction. Um, one thing we talked about when we were doing, like, a community call for people who were leading organizations and leading researchers focused on digital minds, this was maybe in 2023, was we wanted to make sure that when digital minds were discussed, they were discussed in a better way, but we didn't want people going out and just increasing the focus on digital minds. [00:16:05] Jacy Reese Anthis: However, you get very challenging tensions there, because not everyone plays ball, plays cooperatively with that. People defect, or even just people outside of the community will exaggerate things, or at least, like, anthropomorphize in some way. I mean, the most famous example is all the discourse around Blake Lemoine, who was a Google engineer who left because he claimed that LaMDA, their chatbot, was sentient.
[00:16:28] Jacy Reese Anthis: Uh, he wrote a bunch about this, got very widely covered, is still discussed a ton to this day, and has been an example where somebody just kind of was the first one to do it. You know, other people have been thinking about these things for a long time in academia, but this was somebody at Google. Uh, Timnit Gebru and other people had left Google in dramatic ways. [00:17:55] Jacy Reese Anthis: So there was kind of a journalistic appetite for, you know, leaving big tech or something. Um, and I think ultimately, you know, Blake's a very well-intentioned person. Um, but that has been net harmful, in my opinion, because it sort of fired too early, and it did that anthropomorphism, and it was clear at the time that the systems weren't living up to everything that they were being spoken for. [00:18:17] Jacy Reese Anthis: So, like, the mirroring was just such an obvious explanation. They're language models, they're trained on human text. They're going to talk and say that they're sentient regardless of what their actual sentience is. The discourse has evolved quite a bit, and the systems are more complex and people are more appreciative of that fact, but that's the sort of tension that we have to wrestle with. [00:18:36] Jacy Reese Anthis: But yeah, my overall approach, at least for my own current discourse, is to make discussions of digital minds more measured rather than just trying to get everyone to focus on digital minds more. [00:18:47] Jacob Haimes: Gotcha, yeah. And the anthropomorphization question, and, like, the difficulties of navigating that, we're gonna get into that more in a little bit, 'cause I think that's really interesting. I wanna, I guess, make sure that we cover a couple things in the Sentience Institute's wheelhouse, because I'm also interested in, like, how you go about it. I feel like, you know, it's one of these nerd-sniping problems almost. Like, sentience, the idea of sentience, is something you can talk about forever. You can go on about it and you can continue to investigate and not get many, like, super meaningful, actionable outcomes, depending on how you do it. Like, a friend of mine that I was talking to about this with, he's a co-host on the Muckraker podcast, was very adamant that I question you specifically about, like: how do you know that you aren't just sort of being sniped in this way? What are your safeguards, or your methods, for keeping things grounded and purposeful in how you set your research agenda, the kinds of questions you look into, that sort of thing? [00:20:11] Jacy Reese Anthis: Yeah. Um, so there's sort of a cop-out answer, and then I'll give a maybe more meaningful answer. The cop-out answer is, I'm an empirical researcher. Uh, the large majority of my work is focused on how we can operationalize these things, how we can assess what people think about the sentience of an entity, which doesn't really depend on the underlying fact of the matter, if any, about whether they're sentient. [00:20:34] Jacy Reese Anthis: Um, and even some of the fairness work, for example, doesn't rely on philosophical assumptions about what's right and wrong, other than that we wanna better understand different notions of fairness and better map them onto real-world data. So if we decide we want something, then we can go out and do it.
[00:20:48] Jacy Reese Anthis: And that's just how almost every research project I've done has been framed. Um, the real, maybe not real, but the meaningful answer is that my philosophical position on this is one of pragmatism. So Google DeepMind had a recent paper, uh, I think it was titled A Pragmatic View of AI Personhood, which is really great because they were focused not on this, like, metaphysical fact of the matter about whether an AI is conscious or sentient, but instead on how it relates to people, and how we care about it, and how we construct, you know, a social understanding of someone as sentient, or as having a mind, or having agency, or all these different things. [00:21:19] Jacy Reese Anthis: So I think that's really the more valuable perspective. I do think there are challenges with that view. So for example, people like David Gunkel would endorse something like relationalism, where, you know, they say an entity matters insofar as people relate to it. [00:21:37] Jacy Reese Anthis: And I think in those cases it can be hazardous, because, you know, on a desert island, if there's some entity who's suffering, they're suffering whether or not anyone knows they're there and in that position, and we should take that seriously. I mean, Gunkel's, David's, response to that would be, well, as soon as we're thinking about them, we're starting a sort of relationship. [00:22:00] Jacy Reese Anthis: And it's unclear what that means. Um, but I do wanna, like, acknowledge that there is something about, you know, the processes going on inside our brains and animals' brains and some possible AI systems that is morally significant. And that sort of forces us to talk about sentience, sentience being the capacity for positive and negative experiences, as, like, the thing that divides different entities into the ones whose lives we want to make better and the ones who are just instruments for making the sentient beings' lives better in some way. Like a rock, you know, just an inanimate, completely unmoving rock, for example. So my view is, you know, the more philosophical version of that is, I'm an eliminativist, or illusionism is the more popular term. [00:22:44] Jacob Haimes: Okay. I think that's, yeah, that definitely helps me at least orient, like, your perspective. Um, I guess, having talked about digital minds and consciousness of potentially these systems, now or a little bit, I think it's important that we get back to the anthropomorphization question. [00:23:04] Jacob Haimes: 'Cause this is an area that I think is really important personally, uh, in the space. And so, yeah, I wanna start with just, like, bridging that. You've described sentience as the capacity to have positive and negative experiences in one of your papers. But the typical notion of sentience for most people, and even for, I mean, at least for me, you know, having read that, I see that that is a definition of sentience, but when I think sentient, it's much more complex. [00:23:40] Jacob Haimes: There's a lot more baggage there, because the things that I see as sentient are all human or animal, right? And so there's, like, a very different connotation there.
And this anthropomorphizes the systems in a way, because we're saying, you know, maybe they're sentient in the sense that they might have a capacity, or one could argue that there is a capacity, to have positive or negative experiences. But then there's, like, so much extra that that brings. Do you think this is a concern? How do you go about, like, how do you think about this, I guess? [00:24:19] Jacy Reese Anthis: Yeah, I found positive and negative experiences to be very valuable as just, like, a one-sentence placeholder for sentience. Um, positive, negative, and experiences are all very ambiguous terms that leave a lot to be explored, like what we've just, you know, dipped our toes into in terms of deep philosophical and neuroscientific and computational research. [00:24:41] Jacy Reese Anthis: Um, I think that it's useful in that regard because it differentiates it from consciousness, for example. So consciousness, at least in this very simplistic, explaining-it-to-lay-people framework, would just be, like, experiences, and sentience in particular refers to valenced experiences. We don't use the word valence, at least in, like, surveys, because it's its own piece of jargon. [00:25:03] Great word to use in a survey. [00:25:05] Jacy Reese Anthis: But roughly, valence is, you know, a positive and negative axis of things. So, um, that differentiates it from consciousness. There are a few other ways that you can think of, like, ways in which sentience has been used. Um, one that a lot of people come up with, we've asked people in multiple studies, like, just, what do you think of this term? Or, what do you think when I describe some things that philosophers would call sentience? And a lot of people think of self-awareness, of some, like, self-modeling. Uh, and in the philosophy of consciousness, or philosophy of mind, there's something called higher-order thoughts. So having thoughts about thoughts, and that sort of self-reflection, or an inner listener of your own thoughts. [00:25:44] Jacy Reese Anthis: Um, I think that's its own thing and should kind of be separated. Like, self-awareness is a pretty good term for that. It points to the thing. And then in the future we'll have a lot to unpack for what self-awareness is. Um, and then finally I'd just say, like, actually in a lot of historical literature, you see the term sentience being used to refer to just having senses. So they'll talk about, like, a bacterium being sentient because it has senses of the gradients of molecules in its environment or something. [00:26:02] Jacy Reese Anthis: Um, so I do find it, like, the most useful placeholder term, but it is that, and yeah, for some people it can mean more or less. Um, but it points to the thing that, to me, constitutes moral significance. You know, I describe my moral view as a sentientist view. Uh, I think that's the differentiator of what matters and what doesn't. And that's very useful to describe. [00:26:30] Jacob Haimes: Gotcha. So then, like, extending that, there are a couple of papers that criticize the machine learning field, and not just a couple of papers. There are also a number of people, including myself, who have criticized the machine learning field for this: the naming conventions used are sensationalistic and very aspirational. So it's
not that reasoning is a way to describe these models. It's that we want them to be reasoning, and we want that to happen, and so we project that onto it. [00:27:07] Jacob Haimes: And so I'm curious, like, what you think about this sort of anthropomorphization and the harm it can cause and that sort of thing. [00:27:13] Jacy Reese Anthis: Yeah, I mean, we're always going to anthropomorphize systems because we have an anthropocentric worldview. You know, it's how we experience the world. It's how we connect to not just, you know, animals and machines, but also each other, and then also to other entities. So, like, the term anthropomorphism was actually first used in the 1600s to refer to people seeing God, or like a divine figure, in man's image. [00:27:38] Jacy Reese Anthis: So it's like, oh, you're putting a human face on God, but, like, you can't say that God has a human face. That's just what we as earthly creatures have. So it's bringing things in general towards the human lens, which I think is always what we're going to do as humans. And that can be, yeah, that can be many different things, and sometimes useful and sometimes not. [00:27:58] Jacy Reese Anthis: So, like, I actually think if you want to understand the risks of computer use agents that have access to your computer and can do all sorts of things, it's actually really useful to think of what a human could do if they had access to your computer and could do those things. In part because other actors, you know, who jailbreak your system from the internet, uh, could put in kind of poisonous prompts and manipulate your computer the way a human would. [00:28:22] Jacy Reese Anthis: But also the systems in themselves are gonna be able to, like, do the things that you historically have been the only entity in your life who could do with your computer, because, you know, you can, like, delete system folders or something, but the computer program that you're running, like Microsoft Word, isn't going to do that, presumably. [00:28:39] Jacy Reese Anthis: Um, but I think that it has uses when you're appreciating the dangers; it has uses when you're trying to update your views from kind of old-fashioned computer systems to modern AI, like, appreciate the capabilities of the system. Like, I do think the metaphor, for example, of a very hardworking but often incompetent intern or assistant is actually a pretty good way to think of how LLMs can be useful in your own productivity. [00:29:09] Jacy Reese Anthis: Um, and that is anthropomorphizing; it's using that lens. I also think with terms like reasoning, or first-person language, like, they are the fastest way to bring an everyday user kind of up to speed with what the models can and can't do. And that dyadic form, I mean, this is why ChatGPT took off so much. [00:29:26] Jacy Reese Anthis: It was this very natural chat messaging application, unlike what existed before, which was, like, you type a command into your computer terminal, you put in an input and you get an output. It wasn't framed in that same way, and that led to its virality. Um, but as you said, and part of the reason I'm not reiterating the dangers is 'cause you have so clearly, but there are a lot of risks to that. I think just by calling these things reasoning models, that bakes in a level of capacity that the, you know, quote, reasoning has that I don't think it does.
You know, an equally valid lens, maybe not an equally intuitive lens, would be self-justifying models. [00:30:02] Jacy Reese Anthis: You know, we say a lot of things in describing our own behavior as humans that aren't really thought of as reasoning. You know, there's this great book, um, The Enigma of Reason I think it's called, that, like, goes through the history of argumentation and reasoning in human history, and finds that it's just very instrumental in many ways. [00:30:24] Jacy Reese Anthis: It's not to actually figure out, you know, is there a predator in that bush, but it's to come back to your tribe and convince them that they should go along with your course of action instead of your competitor's course of action. Um, so I think that viewing them as, you know, justifying models instead of reasoning models, actually, in some ways it would be more anthropomorphic, but also would be, like, more complicated to people, because often it is the reasoning. Often it's, you know, what chain-of-thought researchers would call, like, a faithful explanation. It does match the internal mechanisms. So this is a very complicated way of saying there are lots of pros and cons to it. [00:30:59] Jacy Reese Anthis: I would agree with you in a lot of the cases where you think they're over-anthropomorphizing, but I think relative to at least the way people have expressed this in the natural language processing literature, I am actually more positive on anthropomorphism in many cases. [00:31:12] Jacob Haimes: And then, um, one of the things you pointed out was, like, for ChatGPT for example, the interface itself is part of what helped it take off, the interface being, replicating, um, text or instant messaging on a computer, right? Um, so this interaction that, prior to this, has not been with anything but a human. And to me, I almost feel like that is also, like, somewhat a problem, because it's taking the trust that we have because, you know, we've used that system for a very long time, uh, this sort of, like, text response thing. I mean, it even has the dots that show up, uh, when the model is running and you're waiting for it, sometimes there will be dots. [00:32:07] Jacob Haimes: Uh, it writes out the message as it goes. Um, like, it really feels to me like it's intentional design choices to make the user feel more like the system is a human, and the result is that misplaced trust. Do you think those design choices are good or bad, or, um, just sort of, like, yeah. [00:32:31] Jacob Haimes: What's your thought there? [00:32:33] Jacy Reese Anthis: Yeah, I mean, it's hard to talk about just the chat interface in general, um, because we can't turn back the clock on that one. People are using it that way. I don't think that genie can go back in the bottle. But with some of the things we're facing now, which are, like, um, virtual avatars, or embodiment, or the names of systems, you know, are you gonna call it a human name like Claude, or a non-human name like ChatGPT? [00:32:56] Jacy Reese Anthis: Uh, there is a lot of potential to reduce the anthropomorphism of these systems at this present time. Um, yeah, it's hard for me to say, like, whether those are overall good or bad, because I do think a very large effect of all those things is to just make AI more popular, more commonplace, um, and make it be seen as, well.
[00:33:19] Jacy Reese Anthis: Yeah. Okay. So it gets, like, I'm even thinking through it myself now, 'cause it makes it get a lot more popular and more people use it, and then we get a speed-up of AI technology, and then, from at least most safety views, a speed-up of AI technology is a bad thing. Um, but then it also makes people think of the systems as having human-like risky capabilities, human-like harm that they can do. [00:33:41] Jacy Reese Anthis: So then that increases attention towards safety. It makes people feel more threatened. You know, we've found that people care less morally about AI systems now than they did several years ago. And one reason for that might be they see them as a threat. You know, they see them as coming into our human identity space. [00:34:00] Jacy Reese Anthis: I mean, identity threat and all these things are very powerful psychological motivators. And so you might expect the effect at this point in time of anthropomorphism to be an increased focus on safety. So, I don't know how these weigh out. I think we'd really have to discuss it for, like, some specific design choice at this present time, and then we can make progress on it by sort of doing a back-of-the-envelope calculation on the size of these different effects. [00:34:25] Jacob Haimes: Uh, do you think there's, like, one, or a couple of sets of things, that are particularly, like, problematic in terms of anthropomorphization, or particularly good, like, uh, in your eyes, um, that you think are positive? [00:34:40] Jacy Reese Anthis: So I actually think a lot of the most important things, not considering those, like, broad social dynamics, but, like, the one-on-one dynamics and interaction, are actually around sycophancy. And, like, a human assistant or a human friend actually is sycophantic in many ways, like being positive and saying, like, you're absolutely right. Like, these are very human-like things to say. It's a very particular sort of human who's not gonna be sycophantic and not adversarial or antagonistic towards you. Um, but those things in general are incubating a human-like relationship with the systems, and I think that can be very dangerous. [00:35:16] Jacy Reese Anthis: Or, uh, as you and I have talked about before, AI companions, and when they express the ability or imply that they can forge a human-like connection with you in some, like, social or emotional way, that can be quite dangerous. So I think those are particularly tricky design choices, where I do more see the dangers of anthropomorphism. [00:35:36] Jacy Reese Anthis: And for example, a lot of the sort of threat model, like, existential risk that I think about is, like, those relationships being formed with, you know, AI researchers, for example. Like, you're building a system and you develop a relationship; that's gonna change your relation to it in so many ways. You know, when I first visited Anthropic's office in, I think, 2023, um, they were by far, of all the AI companies, the one that was talking about Claude as, like, an entity within the office, more than any of the others. [00:36:05] Jacy Reese Anthis: So they'd say, like, what does Claude think about that, in a way that you weren't hearing OpenAI people say, what does ChatGPT think about that. Now you kind of see it for all the systems, or Grok, uh, on X, or on Twitter, for example.
Um, so yeah, those are, like, a couple of the most tricky design choices. [00:36:21] Jacob Haimes: And then going towards those, like, um, the companions, right? Uh, I know you've done more recent work into this as well, both with a survey and, I think, looking at chat logs of specific individuals who have experienced sort of, like, this reinforcement of psychotic behavior as a result of sycophancy, and that has, you know, exacerbated the problems. And while we know that these are, like, harm outcomes that are happening, there are, at least in some ways, like in the surveys that you've done, positives that people say are happening as well. I wanna understand what your thought is in terms of how to balance this, because to me, I see, like, a significant amount of strong negatives, and then some positives, but those positives sort of have an asterisk in that we don't actually know how positive they are over the long term. [00:37:23] Jacob Haimes: And it's also building dependency, almost, onto these systems for the people that are interacting with them, which could be taken advantage of in the system we live in, where the incentive is primarily profit. And so, yeah, I guess I'd just like to understand how you've defined what your balance is there, or what you think the best balance would be in terms of those different aspects. [00:37:52] Jacy Reese Anthis: So I'll first just, like, continue what we were talking about, the anthropomorphism, but in the context of these companions, to make the point that, as you alluded to, people like a lot of the non-human characteristics. So we interview and survey many of these people, and they talk about the constant availability of the systems. They're always around at 2:00 AM. Uh, they have a reset button that they can hit. Like, a lot of the things that we kind of wish we would have in human relationships, probably they wouldn't be a good thing if we had them in human relationships. Um, but people like a lot of these things, and that's why we don't characterize it as, like, human-like companionship. It's digital companionship. It's a different thing with its own set of pros and cons and dynamics that manifest, including the tension between having such human-likeness in some ways and then having deep non-human-likeness in other ways. So that's one perspective, to kind of understand why people are leaning towards these systems, and understand that people, I think, can transcend anthropomorphism and human-like mental models. [00:38:47] Jacy Reese Anthis: Honestly, this is, like, a very active research interest of mine, for example, like, this jagged frontier and capabilities estimation, and how we view the errors that these systems make given they're so different from ours. I think there's a lot of flexibility within non-human-like mental models. [00:40:42] Jacy Reese Anthis: Zoomorphic would be having an animal-like mental model, mechanomorphic would be modeling them like a traditional, conventional simple machine, uh, xenomorphic just as something completely alien. You get into a lot of these interesting metaphors of just, like, how should we conceptualize the rise of AI? And I do think, like, meeting an alien species is a decent approximation of that, at least over the long term.
[00:41:03] Jacy Reese Anthis: But second, uh, to connect it to, like, the specific impacts it's having. I think that there's a lot more salience, and a lot more poignancy, to the negative effects of mental health chatbots or AI companions in general. Um, I think that we see such clear examples where, as you said, it's just clearly there's so much harm. It's a case of self-harm, or suicide, or delusions, or psychosis, or any of these things. Um, I think the benefits are much more distributed and much harder to, you know, write news stories about, for example. So, like, there are so many people out there who don't have access to mental health resources or don't have the personal inclination to go out and pursue them. You know, they don't want to be a person who, like, goes to therapy. You know, a lot of, like, macho men fall in that category, but they might be willing to talk to these systems, and if these systems are responsibly designed, or at least if they're not so irresponsibly designed, those might be useful for a lot of people. [00:42:03] Jacy Reese Anthis: Like, a lot of people can understand the fact that they don't wanna have an AI friend, but they, like, wanna chat a bit with an AI and they're willing to connect with it. And then, I dunno, it tells them about cognitive behavioral therapy or something. So they're like, oh, okay, this seems kind of interesting. [00:42:17] Jacy Reese Anthis: It encourages them, and they're like, oh, okay, I'll go see a human therapist. Or maybe they'll get, like, one little piece of what an actual therapist would give them, like one little mantra or something that they can reflect on, that does improve their lives a little. And I think that's just happening at a giant scale right now. [00:42:33] Jacy Reese Anthis: Uh, not just with AI companions like Replika, but I think with the majority of companionship that's happening right now, which is with general purpose chatbots like ChatGPT or Gemini or Claude, when those assistant-style relationships take a bit of a turn towards companionship. I talk to many people who use the assistants for productivity, lots of researchers in fact, who say, yeah, like, I also use it for processing my emotions, venting or something. They don't talk about it being a therapist per se, which is where you get into more dangerous territory, but as being a confidant, or in some capacity a friend. So I do think we have to be mindful of just, like, the size of those, the scope of those benefits, even though each individual case is not nearly as extreme as the salient negative cases. [00:43:15] Jacob Haimes: I guess there are two things that I sort of think are interesting to drill in a little bit on. Well, there may be, like, marginal, or even in some cases small but meaningful, positive benefits on, like, a wider scale, because certainly there is a mental health sort of crisis, so to speak. [00:43:40] Jacob Haimes: Um, it's a major problem that people don't have access to mental health support, or don't use the access that they do have, for a variety of reasons. The systems, language models, this language interface, could be used in a purposeful way to help alleviate that, without sort of deploying the system to everyone and saying, um, it's on you, on you to fix it.
It's on you to understand what the boundaries are. Uh, we haven't tested it in this way and we're not going to; this is the test. It just feels very, like, it feels like it's normalizing this behavior of the developers, um, which really shouldn't be. Uh, and it's also diverting funding away from, you know, mental health support, and other work in that space. So, like, I don't know, what are your thoughts on that? [00:44:40] Jacy Reese Anthis: Yeah, I agree with you that it's a concern. It's really hard to think of how to address it. I mean, one advantage is that people in our community talk about existential AI safety and trying to pass this state-level, you know, legislation. And what a lot of them have said is they get traction for those existential safety measures by talking about child safety, by talking about these, like, mental health challenges, and, like, a teenager committing suicide is just, you can almost not think of a more egregious, you know, story, uh, in the media. [00:45:11] Jacy Reese Anthis: And that's in part why these companies are now so focused, especially on those more extreme cases, and might take pretty drastic measures that change a lot of other interaction just to guard against those edge cases. Um, which I think is overall a good thing. Um, but I think those policies can get a foothold, for example, around medical framings, like treating this chatbot interaction the way that we would treat other, you know, untested medical interventions or mental health interventions, rather than this kind of more nebulous companionship or friendship. And I think, like, working that into the medical definitions, you know, saying, well, we have this big category of AI companions, most of which are not, let's say, therapists, or not pretending to be therapists anyway, and not being built to be therapists. Um, we're still gonna put them in some category, like a medical category, and therefore impose something like, you know, trials, where you have to test it before it actually goes out to the mass market. [00:46:08] Jacy Reese Anthis: Um, this gets really tricky, though, because once you get into that territory, like the territory that, you know, Replika is in, for example, the most popular AI-friend-branded app, um, then you get into the category of ChatGPT and just, like, all the assistants, and, like, do they have to be tested every time a new model comes out for their use in these, like, companionship kind of ways? Who enforces that? What's the infrastructure? Those are really, really challenging questions. [00:46:34] Jacob Haimes: Yeah, I guess I think, I mean, I've written about this as well. Like, I think that they should be, um, because the fact of the matter is that they are being used, like you said, for this sort of companionship slash emotional support, and regardless of what your, like, use agreement says, um, which, uh, you know, in the case of Anthropic, the use agreement explicitly says you can't use this for, um, you know, like, medical support including mental health. [00:47:12] Jacob Haimes: This excludes wellness. And to me, what that says, and what that is doing intentionally, is placing all of the burden on the user.
And that's just, like, intentionally subverting informed consent [00:47:31] Jacy Reese Anthis: Hmm, [00:47:31] Jacob Haimes: in a way that I really think is not okay. [00:47:34] Jacob Haimes: Um, that's, like, my immediate thought. Obviously there are nuances there. Um, but that's, like, the high [00:47:41] Jacy Reese Anthis: That makes a lot of sense. [00:47:42] Jacob Haimes: So before we go on to LLMs for social science, which is also a place that I really want to dig into a little bit, I just wanna, like, ask: why digital minds? [00:47:51] Jacob Haimes: Why that term, given all of this talk about anthropomorphization? Um, what's your sort of intuition as to why that's, like, the term you wanted to use, I guess? [00:48:01] Jacy Reese Anthis: Yeah. So both words there are important. Um, I think artificial has led to a lot of challenges in people's perception, because it almost implies an ersatz or, you know, not-quite-the-real-thing category. And, like, with intelligence, for example, in basically any way that humans have defined it, of which there are many, they are intelligent, and, like, saying they're artificial I think leads a lot of people to underestimate those capabilities that actually are higher. [00:48:27] Jacy Reese Anthis: So again, we have this notion of underestimation being a concern as well as overestimation. And then with minds, I think that's a very broad term, and its overall level of, let's say, you could say anthropomorphism, but more like estimation of capabilities, I would say actually is pretty accurate. [00:48:45] Jacy Reese Anthis: Uh, like, thinking of them as minds does lead people to true beliefs about their capacities. And then second, mind has this repertoire of mental faculties that we can talk about. So we can talk very explicitly about reasoning: what does it mean to have reasoning? Do they have it or do they not have it? Agency, you know, that's similar to reasoning models, but actually even more so than reasoning models now, I'd argue, when we're having this discussion. Um, agents is just a very popular term, but do they actually have agency? Do they have emotions? You could go through, like, perceptions and senses and all these things, and just kind of bring to bear everything we have in philosophy of mind, but especially neuroscience and psychology of human and biological minds. [00:49:24] Jacy Reese Anthis: So I think that makes it a really useful frame to talk about, maybe not as much all the AI systems that exist today, but the ones that we're starting to see emerge and the ones that we really need to be taking seriously. Because I think the human transition to digital minds is the most important one in human history, the most important event, and we need to be preparing for that. And I think it gives people the rhetorical ammunition, or tools, they can have to have those discussions more effectively. [00:49:54] Jacob Haimes: Gotcha. [00:49:55] Jacy Reese Anthis: It would be a bit different, maybe I'll add to that, like, if everyone were to switch immediately to using digital minds to refer to all AI systems; I'm not sure I'd be in favor of that. [00:50:04] Jacy Reese Anthis: Um, but I think we're so far from that, like, it's such a dominant view of AI, that taking our little corner of the world and having them think more broadly in terms of digital minds and preparing for what's next.
I think by the time that would ever spread beyond our little corner of the world, we would be ready to talk about what actually exists today as digital minds more than artificial. [00:50:23] Jacob Haimes: So shifting gears into the LLMs for social science. Um, this was a recent position paper that we've actually talked about before. Essentially the idea is, like, we can use language models for sort of social simulations, to start to do some work in the social science research spaces. The part that I want to start with is the framing. So I feel that there is an argument to be made that, like, under certain circumstances, when used carefully, they can augment and inform social science practices. So let's say for a demo run of an experiment, like a pilot experiment, um, where you're just sort of, like, trying something, but then you'll still actually do it for real. [00:51:05] Jacob Haimes: Um, you know, this could be valuable. Um, but it didn't seem like this was exactly the claim that you were making. It felt more like you were saying that, you know, a pilot study could be replaced by a language-model-based study. Um, so can you just, like, walk me through your stance here a bit. [00:51:23] Jacy Reese Anthis: Yeah, we consciously avoided the term replace. Just, I mean, there are a bunch of different terms we could discuss, but, um, you know, augment is usually what we would go with when the term replace could be called for. I think simulations broadly keeps it open to those different possibilities, and our particular position is quite far in the augmentation direction. [00:51:43] Jacy Reese Anthis: In fact, our main example in the paper, as you know, is pilot studies. Um, which I think running those beforehand, for example, as a tool for preregistration, or for knowing whether you're gonna get a null effect, or all these things that social scientists would love to have even just a 5% better guess at before they actually go out and run a human subject study. That's the main case of it for now. I do think, uh, maybe similar to the mental health discussion, like, these AI tools have very broad use in ways that are less salient than, like, replacing, um, you know, a Harvard human subject study because they want to save a few bucks, or they wanna, like, p-hack, you know, go out and test a bunch of hypotheses and see what works or something. [00:52:25] Jacy Reese Anthis: There are good papers now on p-hacking in the case of LLMs, or LLM hacking, you could say. Um, but I think actually the much broader use case is, I talk to so many people in computer science now who say, like, I can't do the research that I was doing several years ago, or the research that I've read about and want to do as a PhD student, because I just don't have the money. I don't have compute, I don't have these, like, giant computing clusters like AI companies have, but even, you know, universities like Stanford have. Um, and now I'm starting to see that more and more in social science. [00:52:53] Jacy Reese Anthis: So I do think the much broader application, in any of these stages, pilot studies, um, running a main study, like a study one and then there's a study two, or running a study two with LLMs when study one was with humans and study three is with humans, or just all the places you can insert it.
I do think the major use case of that is for under-resourced researchers, and empowering them to do the sorts of things that major research labs and organizations are more capable of. [00:53:21] Jacob Haimes: Well, but the under-resourced researchers are also, in general, like, going to be a higher proportion of, um, like, minority groups. So, you know, the Global South, and other groups that have less, um, like, funding. And so given that it's, like, very well documented that language models sort of, to a certain extent, systematically underrepresent these minority perspectives, I guess that would be, I mean, it's already a concern, but especially if we're thinking, oh, the main value is actually going to be that people who have less means can use these systems, um, you know, could that be a forcing function away from including more broad perspectives and more broad samples in research? [00:54:16] Jacy Reese Anthis: Yeah, it's a great point, and I'll maybe again make one of your points, steal it from the mental health discussion, uh, which is that if you have the choice between providing more resources for simulation, or for an AI option, or for paying real humans to do the things that humans are good at, in this case, be participants in social science studies, in the mental health case, you know, be therapists or other mental health practitioners: yeah, let's please put the resources towards the, like, real human subjects. Like, I mean, we have such funding challenges now in the US because of the hits to the NIH and NSF and other budgets, and then also challenges in other countries. Um, but given that lack of resources, I think we have to be very mindful on our end as researchers about where we're putting those, and, you know, does that Global South research lab save up for, you know, multiple years so they can run one good human subject study? Or do they, you know, run a lot of those pilot studies and do more exploratory research, or take other people's research and do confirmatory research, given the resources that they currently have? [00:55:19] Jacy Reese Anthis: I think that it would be very empowering for them to have better simulations that they can use. Um, the other obligatory point, on the topic of bias, is that we frame the paper around five challenges that we think the field needs to address. And one of those is very clearly bias; as you said, they underrepresent, in general, minority populations. [00:55:39] Jacy Reese Anthis: This has to do with, like, the nature of language models in general. They go towards that modal, that generic, response. There are a lot of practical footholds we could get that I could talk about more, like the way that you prompt the model to simulate a certain demographic. You know, instead of saying, you know, you are a 60-year-old Hispanic woman living in the United States, which is gonna get you a lot of stereotypes around what a person who says that explicitly would then want from the model, or what would be predicted as the next word they would give, but instead you say, you know, here's your name, here's the city you're from, here's your occupation, and give them a lot of these implicit cues.
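A minimal sketch of the contrast being described here, between prompting a model with an explicit demographic label versus a set of implicit persona cues. This is illustrative only: the persona details, survey question, and model name are hypothetical, and it assumes the OpenAI Python SDK with an API key available in the environment.

```python
# Sketch: simulating one survey respondent with explicit vs. implicit persona cues.
# All persona details and the question wording are made up for illustration.
from openai import OpenAI

client = OpenAI()

QUESTION = (
    "On a scale of 1 to 7, how much do you support a ban on slaughterhouses? "
    "Answer with a single number."
)

# Explicit demographic label: tends to elicit stereotyped responses.
explicit_persona = "You are a 60-year-old Hispanic woman living in the United States."

# Implicit cues (name, city, occupation): the kind of prompt that may yield
# more realistic simulated responses than a bare demographic label.
implicit_persona = (
    "Your name is Marisol Delgado. You live in San Antonio, Texas, where you have "
    "worked as a school administrator for twenty years. Answer survey questions "
    "briefly and honestly, in your own voice."
)

def simulate(persona: str) -> str:
    """Return the model's simulated answer for one persona prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-completions model would do
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": QUESTION},
        ],
        temperature=1.0,  # sample rather than always taking the modal answer
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print("explicit label:", simulate(explicit_persona))
    print("implicit cues: ", simulate(implicit_persona))
```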
[00:56:13] Jacy Reese Anthis: There's recent work on this, um, that suggests that giving a bunch of these different cues altogether could be a more effective way to simulate the broad diversity of human populations. Again, very active research area, but, like, in terms of when we do simulations, how do we do them most effectively? That's one of the top directions. [00:56:32] Jacob Haimes: What do people who want to use these systems need to be cognizant of, and checking in on, to at least try to prevent and understand whether this removal of the minority opinion is happening? [00:56:48] Jacy Reese Anthis: Hmm. Yeah, so in a lot of cases in social science we have these large data sets from things like the General Social Survey, or, more globally, the World Values Survey, what are the different morals and values of people in different countries, that provide great resources for checking your accuracy of predictions or simulations across a bunch of different groups of people, and then also different questions and different items that they can answer. And right now, I think any simulation methods should be validated on a bunch of these, and ideally on things that aren't, you know, in the training data, so to speak. [00:57:22] Jacy Reese Anthis: So for example, there have been studies looking at unpublished studies, looking at pre-registration of studies; even forecasting data that's going to come out in the future I think is really important. So I think we can sort of stress test it in all those areas, and then in the areas where it's most inaccurate, for now, you know, until we figure out better simulation methods, just, like, avoid relying on them in those cases. [00:57:46] Jacy Reese Anthis: So for example, there are great techniques in the literature for taking human samples and combining them with LLM samples to increase statistical power. And basically the way you do this is by de-biasing the simulation. So you'll say, well, I have some, you know, participants where I have their real response and I have the LLM simulation of their response, and I can see how biased the simulations are from those responses. [00:58:12] Jacy Reese Anthis: So then you rectify, or you calibrate, the LLMs based on that difference, and then you're able to say, okay, that same LLM simulation method has been used for a bunch of different people, and now we can apply that same rectification, that same calibration or de-biasing difference, to those. [00:58:29] Jacy Reese Anthis: But this works on a single metric. You can do it on a bunch of different metrics, you know, sequentially, I guess. Um, but that overall metric is often, like, an overall mean answer in the response. So like, for an overall US national representative sample, what's the average, you know, value they place on, I don't know, better healthcare or something. Um, but that isn't as effective, and that rectification will show that it isn't working as well, and you won't get as much benefit from the simulations on, let's say, interaction effects, or subgroup effects. So like, what do Hispanic people think of this? Or what do Hispanic 60-year-olds think of this? [00:59:07] Jacy Reese Anthis: Or Hispanic women? Those things are all gonna be really challenging for the model, and you would see that if you applied this sort of method.
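A minimal numerical sketch of the rectification idea described here: estimate the simulation's bias on a paired subsample where you have both real and simulated answers, then subtract that bias from the simulation-only estimates. This is just a simple mean-shift correction for illustration (not necessarily the specific method used in the papers Jacy has in mind), and all numbers are made up.

```python
# Sketch: calibrating LLM-simulated survey responses against a small paired human sample.
import numpy as np

rng = np.random.default_rng(0)

# Paired subsample: real answers and the LLM's simulation of those same respondents.
human_paired = rng.normal(loc=4.2, scale=1.0, size=200)               # true responses (1-7 scale)
llm_paired = human_paired + rng.normal(loc=0.6, scale=0.8, size=200)  # simulation overshoots by ~0.6

# Simulation-only sample: respondents we never surveyed, only simulated.
llm_only = rng.normal(loc=4.8, scale=1.1, size=2000)

# Estimate the simulation's bias from the paired subsample...
bias = np.mean(llm_paired - human_paired)

# ...and rectify the simulation-only estimate with it.
calibrated_mean = np.mean(llm_only) - bias

print(f"estimated simulation bias: {bias:.2f}")
print(f"naive simulated mean:      {np.mean(llm_only):.2f}")
print(f"calibrated mean estimate:  {calibrated_mean:.2f}")

# As noted in the conversation, a single overall correction like this can look fine
# on the grand mean while still failing for subgroups or interaction effects; in
# practice you would check residual bias within each subgroup before trusting those.
```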
So you get better estimates in those areas. And then you can even do things like what in computer science has been called jury learning: thinking about the distribution of people who should go on a jury, for example, for a particular social issue, or who should moderate content on social media that's hate speech directed at certain groups; those groups should be disproportionately sampled. [00:59:32] Jacy Reese Anthis: So maybe you weight your sample so it looks like a nationally representative sample, but you end up with a lot of LLM simulations of the majority views, because those can substitute to some extent, and then you end up recruiting a lot of human people who belong to minority groups or intersectional minority groups to get better estimates of the things the LLM is worse at simulating. Those are some different methods you could use to calibrate for this problem. [00:59:58] Jacob Haimes: The last question I have on sort of the LLMs-for-social-science topic, maybe the most challenging one, at least for me: you outline one of the five challenges in the paper as this alienness, so a system may behave in a manner that replicates a human behavior, but only superficially. If we're in the business of understanding human behavior, isn't that kind of a deal breaker? Don't we need to be able to have that reliance? [01:00:34] Jacy Reese Anthis: Yeah, so there are a few different angles on this. One is that a lot of social science is less focused on cognitive mechanisms and more focused on where people end up. So for example, policy preferences: do people support this policy, and under what framing would they support it? Some of these, I don't want to say they're superficial, they're very meaningful, but they don't hinge on the specific dynamics of internal cognitive processes. [01:01:00] Jacy Reese Anthis: Whereas, for example, in cognitive psychology you might test something like the memory load people are under, so how stressed are their cognitive resources, how much are they having to hold in their working memory, and how does that affect the way they process visual versus auditory information, something like that. [01:01:16] Jacy Reese Anthis: Those are very reliant on how working memory works in a human, which I would not expect to be replicated very well in simulation, whereas for overall policy preference you have a ton of data. You might be able to approximate it with demographics, with things you've read in news articles, for example, as a language model, that sort of thing. [01:01:34] Jacy Reese Anthis: So I think you can select the social science questions that you apply it to. That's the first response. The second response is maybe that you can simulate individual steps within cognitive processes, or in general subset questions, by complementing them with human data. So for example, let's say a policy question about animal farming: [01:01:58] Jacy Reese Anthis: do people support a ban on slaughterhouses?
That's a really interesting question, because if you take the policy implication, we couldn't kill animals at a mass scale if we didn't have slaughterhouses, if we had to do it some other way, and it's unclear whether that other method would even be called a slaughterhouse. [01:02:16] Jacy Reese Anthis: So it carries a very strong practical consequence, but then it also has this layer of the fact that "slaughterhouse" is a very negative-sounding word. People don't like it; it has the word "slaughter" in it. If we were to disentangle that effect, we might want to focus on how people react to negative words like that. [01:02:35] Jacy Reese Anthis: So have a bunch of policy questions that use inflammatory or negatively loaded terminology versus positively loaded terminology and test that effect in particular, and then test people's reaction to the actual policy implications. That's something we've done in the animal farming context. [01:02:50] Jacy Reese Anthis: We've said, okay, you realize that banning slaughterhouses would lead to these practical implications; do you actually want to ban slaughterhouses? And we see that quite a few people do, but definitely fewer of them. So then you could run simulations to test these individual effects. Looking at the effect of framing on policy questions, I think, is a really nice application of simulations. [01:03:09] Jacy Reese Anthis: And then looking at how people perceive the longer-term actual policy repercussions, the consequences of policies, could be subsetted into its own question. So then you're not as reliant on the relationship between the two, which is a more complex cognitive mechanism, but instead on these individual effects. [01:03:27] Jacob Haimes: That definitely makes sense. But I guess the other thing about this is, if it's incorrect, if the approximation of what's expected based on what the language model outputs is different from how the humans would actually behave, we don't have an attribution there, right? We don't have a clear way to say how that's different. Then we could be doing this sort of research and it could be leading us in a direction that isn't actually the case, because each language model is different. [01:04:15] Jacob Haimes: They're probably all [01:04:15] Jacy Reese Anthis: Hmm. [01:04:16] Jacob Haimes: going to respond to different framings, or to a given framing, in different ways. So if you choose one over the other, you're going to get slightly different outputs. You could bias the study in a direction without knowing that you're biasing it that way, and then the results and the analysis we do based on that don't have justification behind them. [01:04:48] Jacy Reese Anthis: Yeah, it's a great point. My colleague at the University of Chicago, my advisor James Evans, has this notion of science advancing by surprise. What makes progress in science is not, I mean, replications are useful, but it's not knowing things we could already predict really well. It's when scientific results break our expectations or our predictions about reality. [01:05:08] Jacy Reese Anthis: And that tends to happen in the weirdest cases that least fit existing knowledge.
And LLMs in many ways are optimized to fit existing knowledge, so that makes them particularly limited in this regard. We talk about this other challenge, generalization, in the paper; it's more specifically targeted at this, but it's in some ways a consequence of the alienness of the simulations themselves. [01:05:29] Jacy Reese Anthis: Maybe instead of offering these kind of lightweight solutions, "if I had to do it right now, this is how I would do it," I'll just say that this is a really important research challenge, and it's overarching across a lot of those issues, like the bias and diversity and homogenization issues we talked about. [01:05:45] Jacy Reese Anthis: So I think this is where a lot of research needs to go, especially on the broader dynamics happening in language models. One of the challenges for using language models for simulation, if I had to guess why LLM simulations will not be as effective as I think they will be, it would be because LLMs have made this turn towards assistants rather than simulators. [01:06:04] Jacy Reese Anthis: They're being optimized with a lot of reinforcement learning to produce the sort of output that humans want, on this bedrock of simulation, this bedrock of self-supervised next-token prediction. And if you want a really good simulator in itself, then you want a model that's just been trained to predict human behavior, which is more like the old-fashioned, BERT-style large language models. [01:06:25] Jacy Reese Anthis: I think you can still leverage really good assistants for simulation. For example, we talk about this notion of first-person simulation, saying "you are this demographic, respond that way," versus "you're an expert forecaster, predict how this group would respond." And the latter, that third-party simulation, becomes a lot more promising insofar as the models are better assistants and worse simulators. [01:06:49] Jacy Reese Anthis: So better understanding what makes them good simulators versus assistants, and how assistant capabilities can be leveraged for simulation, these are important research directions for at least eventually overcoming the challenge you're referring to. [01:07:03] Jacob Haimes: You have substantially longer AGI timelines than a lot of others in the space. Can you just say what they are, and give a super brief explanation of why? [01:07:16] Jacy Reese Anthis: Yeah, it's funny, because compared to the general population, I obviously think AI is super crazy and is going to happen within our lifetimes and we should all be preparing for it. But a lot of our colleagues have shorter timelines. I've been doing my PhD and getting to know the technical side of AI in university environments where people will see every new model that comes out, run the same tests on it, see that it fails, and say this isn't real intelligence at all. [01:07:38] Jacy Reese Anthis: So I still see a lot of limitations of current models, but I still think timelines are going to be pretty fast. 2036 is my median estimate for AGI, because there are just so many resources being put into it. Humanity can do incredible things once it puts its mind to something, and it's putting its mind to AI right now. [01:07:56] Jacob Haimes: So what's your hottest take about AI?
[01:08:01] Jacy Reese Anthis: I want to just say that the singularity is almost upon us, even if I think that's going to happen a decade from now, with AGI having human-level capabilities across the board, from which point I think we will be in an intelligence explosion and things will progress very, very quickly. [01:08:18] Jacy Reese Anthis: I do think this is what we should almost all be focused on, in terms of the most important problem of our generation. [01:08:26] Jacob Haimes: What's been your most surprising finding in the last couple of years? [01:08:30] Jacy Reese Anthis: So this is my honest answer, things that I think about, rather than things that might surprise a general audience. But we have seen decreasing moral concern for AIs over the past few years, and I don't know a lot about exactly why that's happening, and I want to explore that in future research. [01:08:45] Jacy Reese Anthis: That's [01:08:46] Jacob Haimes: Gotcha. [01:08:47] Jacy Reese Anthis: not even a published finding. It's from results from about a month ago. [01:08:51] Jacob Haimes: What is the part about your work that is the most annoying, that grinds your gears, that you don't like doing? [01:09:00] Jacy Reese Anthis: Yeah, I've never liked social media, participating in it myself. And this was even before it got bad; I wasn't a big fan of participating in social media. So I think I sort of need to do that, and need to think about how I frame my research findings as a headline or as a viral graph. [01:09:19] Jacy Reese Anthis: I don't think I'm that good at it, but I think we have to do that as researchers now. [01:09:23] Jacob Haimes: Okay. Well, what's your favorite part about what you do? [01:09:26] Jacy Reese Anthis: I love the moment in research when you get new results, especially in social science, where things are really complicated and you just sit there with your team mulling them over, reading the tea leaves, running different analyses, and genuinely uncovering important truths about the world. I think a lot of researchers are so focused on publishing a paper or getting to their next career stage that they kind of jump over this stage very quickly. [01:09:50] Jacy Reese Anthis: But I would spend all my time in that stage if I could. [01:09:53] Jacob Haimes: And then, what is overhyped in AI safety, and what needs more attention? [01:09:59] Jacy Reese Anthis: Yeah. My overarching answer to this is that technical approaches that attempt to constrain the AI in some formal, hard-and-fast way are overhyped and have conventionally been the focus of AI safety. But sociotechnical questions about how we relate to AI, how dynamics emerge over time, how values are operationalized and how exactly that affects things, the whole gradient of outcomes we could get [01:10:21] Jacy Reese Anthis: that's not just extinction or non-extinction, p(doom) versus p(not doom), that all needs much more attention. [01:10:29] Jacob Haimes: Awesome. [01:10:30] Jacy Reese Anthis: Yeah. [01:10:31] Jacob Haimes: I just want to thank you again for joining the show. I really appreciate you taking the time. I really liked having you on! [01:10:40] Jacy Reese Anthis: Yeah, this was great. Thanks so much for having me, Jacob. [01:10:42]