Computer Says Maybe

Audrey Tang has some big ideas on how we can use collective needs to shape AI systems — and avoid a future where human life is seen as an obstacle to paper clip production. She also shares what might be the first actual good use-case for AI agents…

Further reading & resources:
  • 6-Pack of Care — a research project by Audrey Tang and Caroline Green as part of the Institute for Ethics in AI
  • More about Kami — the Japanese local spirits Audrey mentions throughout the conversation
  • The Oxford Institute for Ethics in AI
**Subscribe to our newsletter to get more stuff than just a podcast — we run events and do other work that you will definitely be interested in!**

Post Production by Sarah Myles | Pre Production by Georgia Iacovou

What is Computer Says Maybe?

Technology is changing fast. And it's changing our world even faster. Host Alix Dunn interviews visionaries, researchers, and technologists working in the public interest to help you keep up. Step outside the hype and explore the possibilities, problems, and politics of technology. We publish weekly.

[00:00:00]

Audrey: For you, the utilitarian, you actually have to calculate like how much happiness, how much wellbeing, how much quality adjusted life expectancy. And if you can't measure

Alix: it, it doesn't matter. Are

Audrey: you actually effective, uh, at saving this drowning child? Right.

Alix: One unit of child saved. Yeah,

Audrey: exactly.

Alix: All right. We were at MozFest last month. I had some amazing conversations, which we are releasing on our feed all this week if you missed MozFest. We also made a special episode that hopefully will make it feel like you are actually there, and that's on the feed right now. So if you haven't heard it, go back and give it a listen.

This is Audrey Tang. I would describe her as the master of open source AI and collective decision making. And she's here to tell us all about her [00:01:00] six pack of care.

So you're in Oxford now and you're doing a fellowship. Where is the fellowship?

Audrey: Right. It's in the Ethics in AI Institute, part of Oxford philosophy, physically in the new Schwarzman Center, which is the only building in Oxford that is open to the general public, including a podcast studio and a performance theater and things like that.

Alix: Nice. Okay. And I think most people that follow AI ethics and questions of AI governance, when they think about Oxford and they think about philosophy and they think about AI, they do think about Nick Bostrom, and they do think about effective altruism, and they do think about sort of existential risk debates. I presume that is not what you're working on.

Do you wanna say a little bit about how this center fits within overall Oxford's work on this?

Audrey: I mean, I visited Nick and also Anders Sandberg and Toby Ord, the usual suspects, before I started [00:02:00] working at Oxford. And I think the main challenge of the effective altruist position, to a policymaker like yours truly, is that it's like advocating for reinforcing cabin doors on airplanes before the terrorist attacks of September 11, or reinforcing airport security.

I mean, these are all very good, but then you spend lots of money and lots of effort, and your success is that literally nothing happens, which is not very sustainable for policymakers. And so I think of myself as more of a marketing department for their ideas. So instead of saying that AI should not just maximize one number, which is the utilitarian approach, or follow some abstract, universal, thin rule, which is the deontic approach, we're saying AI should instead foster relational health

between communities, which is care ethics. So the project's called Six Pack of Care, and it has the advantage of [00:03:00] paying dividends along the way. You can use AI systems today, steered by communities, to foster relational health, uh, depolarization; you can do better coordination. You can make sure that people from different communities, like around creation care for the biblical, spiritual ones, and around climate justice, actually understand each other through social translation and so on.

So these are real dividends, and you prevent extinction risk kind of as a side effect, which is what people like to hear.

Alix: Yeah, that's interesting. I mean, I think another angle of why people have taken issue with the approaches of the existential risk folks is 'cause a lot of them come from a type of thinking of sort of quantifying humanity in a way that connects quite deeply with eugenics movements and sort of racialized approaches to managing social systems.

So I wonder if you could talk a little bit about how the six pack of care, does it engage at all with these deeper politics, or is it more about trying to give texture and nuance to measuring and [00:04:00] understanding and improving the way that AI is integrated into social systems?

Audrey: It certainly does, because I think the main thing with care ethics is that it starts with attentiveness, which is noticing a need and translating that into a moral duty, which is very different from a consequentialist or deontological calculation, because for the utilitarian, you actually have to calculate, like, how much

happiness, how much wellbeing, how much quality-adjusted life expectancy. And if you can't measure

Alix: it, it doesn't matter. Are

Audrey: you actually effective, uh, at saving this drowning child? Yeah. Right.

Alix: One unit of child saved. Yeah,

Audrey: exactly. Uh, and then of course care ethics is about perceiving that need, which actually obliges you, uh, to act, like, immediately in that particular context.

Now, the usual criticism against care ethics is that it is too parochial. Uh, interesting. That is to say, it cares only [00:05:00] about the immediate moral scope. But for AI systems, that's actually a plus. The whole risk profile of an AI seeking power to maximize the number of paperclips in the universe exists because that is not parochial.

It is not, it's not universalized; it's confined, right? Confined in its moral scope. On the other hand, in Japan (I was just in Japan) they have this idea of a local kami. Kami is written in kanji; it's the same character in kanji as god, uh, but it's a very different kind of god. They have, for example, a river kami that takes care of the health of the river system, a forest kami that takes care of a particular forest, a village kami that takes care of the people in the village, and so on.

And so the kami, they don't report to some all-seeing, all-doing being; there's no hierarchy. It's only, like, Elinor Ostrom-style overlapping subsidiarity between those kami. So if we train our [00:06:00] AI systems to make the relational health of a hyperlocal community its concern, then it has no power-seeking incentive and therefore would not actually universalize or maximize paperclips.

So that is why I think care ethics is uniquely suited, uh, to tackle the machine ethics problem, because then it will not, as you say, uh, try to quantify universal happiness, which is a premature optimization, which as we all know is the root of all evil.

Alix: I mean, I love this concept of localized, attentive care to particular relational challenges, because I think that is one of the main problems of looking at technical systems through a universal lens, like, what are the things we can scale in terms of rules and governance?

But I am wondering, 'cause when you're describing the problem of AI being power-seeking to make paperclips, my first thought is that it's not the AI that's doing that, that there are people that are deploying these systems and have these ideologies, these, um, ways of situating themselves.

I [00:07:00] ascribe the agency to the people that are building the systems rather than the systems themselves. So I wonder if you could speak a little bit about, when we think about the companies or the governments or the builders that are making these technologies, sort of how, within the six pack of care, do the people building the systems relate?

Audrey: Yeah. I want to use a very specific inattentive system, uh, yeah, great, to make the point. As we know, the social media recommendation algorithm has resulted in a very high PPM (I don't mean carbon dioxide, I mean high polarization per minute) environment in information ecosystems online. This is, we now know, a direct consequence of trying to maximize engagement through enragement.

Whether the designers of those algorithms, of the like button or the repost button, actually intended polarization is almost beside, uh, the point. The point is that if you give an algorithm a maximization [00:08:00] doctrine of trying to addict people to the screen, it inevitably finds (and this has been independently discovered by multiple recommendation engines) that people spend the most time on touchscreens if they argue with other people and see other people as evil, right?

Um, and so the attentiveness here means that we need to make sure that, instead of the, um, covert preferences that get extracted and strip-mined from the social fabric by those recommendation engines, we replace them with overt preferences. So imagine if, in your social media feed, you can talk to the feed and say, I would like to see more of this, and it knows what that intentional preference means, and translates that into what's called a model specification.

A text file, really, about how the recommendation engine should work for you and your community. And then you can reason about it. And each recommendation comes with a citation to the particular values that you put in those preferences. Then that's pretty [00:09:00] attentive to the community, and we're building that as part of the Green Earth project for the Bluesky ecosystem.
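
To make that idea a little more concrete for readers, here is a minimal sketch of an overt-preference model spec that a feed ranker consults, where every recommendation cites the community value it serves. The spec format, value names, weights, and signals are all hypothetical illustrations, not the Green Earth or Bluesky implementation.

```python
# Minimal sketch: an overt-preference "model spec" a feed ranker can consult,
# with each recommendation citing the community value it serves.
# All names (values, weights, posts) are hypothetical illustrations.

MODEL_SPEC = {
    "community": "example-neighbourhood",
    "values": {
        "bridge-divides": "Prefer posts appreciated across different viewpoint clusters.",
        "local-relevance": "Prefer posts about places and people in this community.",
        "low-enragement": "Downrank posts whose main engagement signal is angry replies.",
    },
    "weights": {"bridge-divides": 0.5, "local-relevance": 0.3, "low-enragement": 0.2},
}

def rank_feed(posts, spec):
    """Score each post against the spec and attach citations to the values used."""
    ranked = []
    for post in posts:
        score, citations = 0.0, []
        for value, weight in spec["weights"].items():
            signal = post["signals"].get(value, 0.0)  # 0..1, supplied upstream
            if signal > 0:
                score += weight * signal
                citations.append(f"{value}: {spec['values'][value]}")
        ranked.append({"post": post["id"], "score": round(score, 3), "why": citations})
    return sorted(ranked, key=lambda r: r["score"], reverse=True)

posts = [
    {"id": "a", "signals": {"bridge-divides": 0.9, "low-enragement": 0.8}},
    {"id": "b", "signals": {"local-relevance": 0.7}},
]
for item in rank_feed(posts, MODEL_SPEC):
    print(item["post"], item["score"], item["why"])
```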

Alix: I'm loving the colors. So what are the other six? So, when you say six pack... Yes. Tell me, what are the six?

Audrey: Yeah, there's a really nice, uh, comic, uh, illustrated by my favorite illustration artist, uh, Nicky Case. Okay. So we've got some bubbles. I love this. Yes. So the six pack is a pun, because we talk about exercising care,

Which is like going to a gym. Mm-hmm. A civic gym.

Alix: Oh, like abs. I was thinking six pack of beer. Right. Okay. Got it.

Audrey: And also it's very portable. Yeah. You can deploy it in any context. Yeah. So it's also like a six pack where you pop one out. Yeah, yeah, yeah. Exactly. Yes. Right. So it's a six pack in both senses. Yeah. Okay.

Right. And so we talk about attentiveness, which is actually listening to people, and not only the popular but also the small underdogs. And it talks about aggregating, uh, the preferences from groups. If you poll people [00:10:00] individually, people tend to be on the extremes, uh, like YIMBY, yes in my backyard, or NIMBY, never in my backyard. But if you poll people in groups of 10, say, everybody becomes, like, MIMBY, maybe in my backyard.

If you do this, if you do that. And if we capture this group dynamic, then the AI can actually make the majority people a bridge toward the minority people, and build the uncommon ground, the surprising common ground, between the majority position and the minority positions. So the way we aggregate, uh, really matters, and is a function of how attentive the space is.
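
For readers who want to see the shape of this kind of bridging aggregation, here is a small sketch: statements are ranked by how well they are endorsed across opinion groups rather than by raw popularity, so only cross-group support rises. The groups, statements, votes, and weighting are invented, and this is not the actual mechanism of any deployed system.

```python
# Sketch of bridging-based aggregation: a statement ranks highly only if it is
# endorsed across opinion groups, not just by the largest group.
# Groups and approval rates are invented for illustration.

from statistics import mean

# approval rate per opinion group for each candidate statement (0..1)
votes = {
    "build housing near transit, with local design review": {"majority": 0.78, "minority": 0.71},
    "never build anything new":                              {"majority": 0.15, "minority": 0.62},
    "build everywhere, ignore objections":                   {"majority": 0.66, "minority": 0.12},
}

def bridging_score(by_group):
    # The weakest group's approval caps the score, so cross-group support is required;
    # the mean breaks ties among similarly bridging statements.
    return min(by_group.values()) * 0.7 + mean(by_group.values()) * 0.3

for statement, by_group in sorted(votes.items(), key=lambda kv: bridging_score(kv[1]), reverse=True):
    print(f"{bridging_score(by_group):.2f}  {statement}")
```

Running it puts the statement both groups can live with on top, even though it is not the single most popular option for either group.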

So this is the first pack. Okay. And that's actually listening to people. Actually listening to people. Yes. Um, and the second pack is called responsibility. It's taking responsibility for the promises that an AI system, uh, delivers, uh, to people. And that means actually committing to change as the situation changes.

That means making credible promises. For example, the model specifications, or [00:11:00] so-called constitutions, of AI systems: are these just advertisements of what we would aspire our AI system to be, or are they actually binding commitments that you can put a stake on, so that if they're violated, there is a way to hold them fully accountable? And that requires citations.

Like, if an AI system uses a certain model spec and then it makes a judgment based on that spec, we need to demand the thinking trail, with full citations, of why it's making such a decision. Which, um, only last week, I think, did OpenAI release the Safeguard model that would provide a citation for such decisions; prior to that it was completely opaque.

You really don't know whether the model spec of the chatbot you're interacting with is actually the model spec they post on GitHub. But now, finally, there is a way to actually hold them to that commitment by demanding citations. So that's actually keeping promises. [00:12:00] What

Alix: I really like about promises as a frame, so actually keeping promises, is that it's

a propositional, positive version of what often gets framed as accountability. 'Cause often, when people are like, we should hold the models and the people that deploy the models accountable, it's like, but to what? Until you've made promises, it's really hard to then say you didn't follow through on your promises.

So I really like that as a "what do we want" rather than a "we want systems where, when people do what we don't want, there's accountability". Um, it's a positive way of modeling this environment of setting rules and expectations in a cleaner and more transparent way. So I just really like it. Right, right.

Yeah.

Audrey: And, and there is this very consistent, uh, turn: instead of saying that something is bad, which is, like, deontic, like it's against the rules, care ethics says something is more attentive, something is, uh, more responsible, something is more competent. So it's almost like six different pillars.

Uh, and the AI agent and AI system just satisfies them, like, good enough [00:13:00] in each of those pillars. And if it's not good enough at any point, then there's a loop, the care loop, that makes the responsiveness real by actually including the vulnerable people as they're experiencing the injustices or harms, but in such a way that it very quickly ameliorates those harms.

So the first four packs are actually a loop: listening to people, keeping promises, letting the people check the process, and letting the people evaluate the results. And if the results are actually not what the people had in mind in the first place, then you go back to actually listening to people. So those are the first four packs, the attentiveness, responsibility, competency, and responsiveness, for me.
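
As a reading aid, that four-pack cycle can be sketched as a simple feedback loop. The functions below are stand-ins invented for illustration, not anything from the Six Pack of Care project; in practice each step involves the affected people rather than code.

```python
# Sketch of the first four packs as one feedback loop:
# listen -> keep promises -> check the process -> evaluate results -> (if mismatch) listen again.
# Every function here is a toy stand-in.

def listen(community):
    return community["stated_needs"]

def commit(needs):
    return {"promised": list(needs)}

def deliver(promise):
    # pretend delivery drops the last promised item, so the loop has work to do
    return {"delivered": promise["promised"][:-1]}

def evaluate(needs, outcome):
    return set(outcome["delivered"]) == set(needs)

def care_loop(community, max_rounds=3):
    outcome = None
    for round_no in range(1, max_rounds + 1):
        needs = listen(community)       # attentiveness: notice the stated needs
        promise = commit(needs)         # responsibility: make a credible promise
        outcome = deliver(promise)      # competence: do the work
        if evaluate(needs, outcome):    # responsiveness: let people judge the result
            return outcome
        print(f"round {round_no}: outcome fell short of the promise, listening again")
    return outcome

print(care_loop({"stated_needs": ["safe crossing", "quiet hours", "shade trees"]}))
```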

Alix: So what happens if someone breaks trust in a relational ethics system? So it consistently isn't doing one of these things, or at the end of that check-the-results step there's a mismatch between the promise and what actually happened. Mm-hmm. Mm-hmm. Where do consequences come in here?

Audrey: Yeah, that's the fifth pack.

[00:14:00] Okay. All right. The solidarity. All right. Right, so the fifth pack talks about the infrastructure that makes portability easy, and therefore positive-sum games possible. Again, using social media as an example, the state of Utah recently passed a law that says, starting next July, anybody can take their, for example, TikTok account to, I dunno, Bluesky or to another social network.

And in doing so, they don't lose their community on the old platform. Just like if you switch telecoms, you get to keep your number: number portability. So protocols,

Alix: not platforms, right?

Audrey: Exactly. Yes. And so this kind of forced mandate on portability and interoperability, I always say it's like building the information

super highway, yes, but always with off-ramps, uh, and then on-ramps, so you can switch, right? And that removes the temptation for the dominant platforms to, as Cory Doctorow would say, enshittify, uh, the user experience and just squeeze people, because they cannot individually move away. So [00:15:00] with solidarity, or mandatory portable, uh, interoperability, if, uh,

platforms start becoming careless, then a competing platform with more care can retain the same context, like a human context protocol, like number portability. So if you switch, you do not actually lose your community and your relational health, and that will make the competition a race to the top, not to the bottom of the brainstem.

Am I making sense?

Alix: Yeah, you are. I mean, I was, uh, talking to Cory Doctorow recently about this as a way of structurally changing the sort of playing field for, uh, competition between platforms. I was thinking a little bit about what it requires of individuals. So in that context, I have to have a bad experience that's so bad that I'm willing to put time and energy into understanding the underlying technology and my options for other platforms, and then make the move.

And I wonder, like, when we think [00:16:00] about, I don't know, I feel like I'm a nerd that will definitely spend time trying to figure out, like, what is the PDS that I wanna use in Bluesky and how that expresses my preferences within, uh, AT Proto. Like, I'm interested in that, but I feel like most people aren't.

And I'm wondering if you have any reflections on the kind of tension between making something possible within a system and making something easy in a system, and where there might be. Yeah.

Audrey: Well, uh, that is, I think, because, uh, Utah and the EU Digital Markets Act, which is probably going to expand to social media, have not yet taken effect.

If they had, then it would be exactly like how podcasts revolve around the RSS protocol. At the end of podcasts, sometimes people say, find us wherever you find your podcasts, which means that ordinary lay people actually have the empowerment to find a preferred podcast listening platform that's not dictated by the recorders of podcasts, as we're doing here now.

Right. So I don't say, send

Alix: me a Thunderbird. Right, exactly.

Audrey: Yes, [00:17:00] yes, yes. And, and so I think in protocol-based innovations, people don't need to think about switching, because switching by itself is taken as part of that protocol. Interesting.

Alix: Yeah, and it's part of the expectation of being someone seeking content that there's a multiplicity of ways that you can do that, and that there's an expectation that you are gonna make those choices.

Like, you actually have to pick. Although if you have an iPhone, I feel like most people use Apple Podcasts, and I feel like there are path dependencies within the selection of the platforms. But I see what you're saying, that we've created that expectation and it hasn't been too onerous for individuals to find podcasts.

Audrey: And, and even if they use Apple Podcasts or Spotify, the competition dynamic is such that if they try to enshittify their experience, they know, you know, the customers will just walk away.

And I think, uh, the technical difficulty in actually going through with the migration is actually being ameliorated by, uh, vibe coding.

Alix: Yeah.

Audrey: Um, and yeah. [00:18:00] Uh, and that's the sixth pack, the, okay, the symbiosis part. That is to say, uh, there can be kami, or relational agents, that specialize in moving between communities.

Right. So, uh, in digital deliberation, a good example is that one community can have a conversation around climate justice, another around creation care. And then you can have a kami that moves the arguments by doing social translation, so that the good arguments here get socially translated, with all the

context preserved, into another, like, biblical, uh, community, so both sides can see that they're talking about the same thing. And even the mundane things like transferring your email account or things like that can be massively helped if you have this kind of common agent that is just, you know, uh, a super porter between those, uh, platforms, so that you do not have to, as an individual, manually, uh, learn the n times m, uh, combinations of the move.

But you can then just, you know, say, okay, I want to move, and then I move.
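
The "n times m" point can be illustrated with a toy adapter pattern: if every platform exports to and imports from one shared portable format, migration needs n exporters plus m importers instead of a bespoke translator for every pair of platforms. The platforms, field names, and data below are invented; this is not the AT Protocol or any real migration tool.

```python
# Toy sketch: migrating an account through one shared "portable context" format,
# so each platform only needs an exporter and an importer (n + m adapters)
# instead of a translator for every pair of platforms (n * m).
# Platforms, fields, and data are invented for illustration.

PORTABLE_FIELDS = ("handle", "follows", "posts")

def export_from_platform_a(account):
    # Platform A's exporter: map its own field names onto the shared format.
    return {"handle": account["username"],
            "follows": account["friend_list"],
            "posts": account["timeline"]}

def import_to_platform_b(portable):
    # Platform B's importer: read only the shared format, never Platform A's internals.
    assert all(key in portable for key in PORTABLE_FIELDS)
    return {"profile": portable["handle"],
            "graph": portable["follows"],
            "archive": portable["posts"]}

old_account = {"username": "audrey.example",
               "friend_list": ["alix.example"],
               "timeline": ["hello world"]}

migrated = import_to_platform_b(export_from_platform_a(old_account))
print(migrated)
```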

Alix: [00:19:00] That's literally the first use case of agents I've heard that I really can see being very valuable for normal people. Um, like, literally in the last year, hearing everyone talk about agents. Yeah. And,

Audrey: and that's because it's so bounded.

Alix: Yeah.

Audrey: Right. It, it doesn't, it's very specific. It doesn't try anything other than just translating between providers.

Alix: Yeah. It's enabling something that's so discrete and structured that I can't really imagine hundreds of thousands, millions of people in a way that fine-tunes a process that, um, dramatically increases the chances that it's correct, in a way that I imagine agents being incorrect, and it terrifies me.

Audrey: Exactly. And, and, some recent research about tiny recursive models shows that you don't actually need a 7 billion or even 7 trillion parameter model. You, you can just use a 7 million parameter model, which fits on a phone, anywhere, if you know what you're doing. Uh, and this is a case where we know what we're doing.

It doesn't really need to memorize due to or something.

Alix: Yeah, it's a set of steps. But then it gets me back to, like, why are [00:20:00] we using transformer architecture for something like that, when we could probably make a small, like, we used to make those little small apps that did that one thing, and you never know you need it until you have that moment where you're, like, moving between two operating systems.

Right. Or like doing one thing and

Audrey: doing it well. Exactly. Yeah. Yeah. The Unix pipe. Totally,

Alix: totally.

Audrey: Yeah. And, and in the tiny recursive model paper, they talk about, uh, just two layers or three layers. And so it's not just easier and more energy efficient, but also it's very easy to interpret. Like, if it goes wrong, you can literally see where it goes wrong.

Yeah.
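
For readers curious what "a couple of layers applied recursively" can look like, here is a generic sketch in plain Python with NumPy: one tiny two-layer block whose output is fed back into itself a few times to refine a latent state. It only conveys the shape of the idea and is not the architecture, task, or training setup from the tiny recursive models paper.

```python
# Sketch of the "tiny, recursive" idea: a single small two-layer block is applied
# repeatedly, refining its own latent state, instead of stacking many large layers.
# Sizes, initialisation, and the (absent) training objective are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
dim = 32                       # tiny hidden size, nothing like billions of parameters
W1 = rng.normal(scale=0.1, size=(dim, dim))
W2 = rng.normal(scale=0.1, size=(dim, dim))

def block(state, x):
    # one small two-layer update that looks at the input and the current guess
    hidden = np.tanh(state @ W1 + x)
    return np.tanh(hidden @ W2)

def recursive_refine(x, steps=6):
    state = np.zeros_like(x)
    for _ in range(steps):
        state = block(state, x)   # the same tiny block is reused at every step
    return state

x = rng.normal(size=(1, dim))
print(recursive_refine(x).shape)   # (1, 32): a refined latent after 6 recursive passes
```

Because the same two layers do all the work, inspecting what goes wrong means inspecting one small block per step rather than hundreds of distinct layers, which is the interpretability point made above.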

Alix: Yeah. Super interesting. Okay, super helpful. Um, I would love a link to the six pack of care. It's

Audrey: at six pack.care.

Alix: I knew that you were gonna have a link for that. Amazing. Um, and then my last question is just, when you're thinking about 2026, like, what big questions are you gonna be working on as part of your fellowship?

Or just more broadly, like what are the questions you think we should be asking ourselves as this new year approaches?

Audrey: Yeah, I think 2026 really excites me, because I [00:21:00] think 2025 showed people around the world that people are collectively tired of the peak PPM, polarization per minute. Uh, I've been talking with people on both sides of the political aisle, uh, not just in the US but everywhere, and they're

collectively tired of being hijacked by the 5% at each extreme. Uh, and depolarization at scale is finally a thing. So last year we wrote a paper about prosocial media, and we're happy to see that not only is Bluesky adopting that with the Green Earth project, but, uh, Elon, uh, with x.com, is now switching the main feed of x.com toward a Grok-powered prosocial feed, using the same bridging mechanism as Community Notes, which is the basis of our paper.

And so I think we're at a point where it's actually not more risk for big tech to try prosocial technology that is attentive; it's less risk, because people see the ozone being depleted and collectively don't want [00:22:00] it. And also it's cost-saving, because there's now a sufficient amount of open source implementation

so that they can simply adopt, and even participate in, for example, ROOST, the Robust Open Online Safety Tools, um, introduction of the Safeguard model with OpenAI, with full citation capability. It's actually safer, cost-wise, to go with such open source models.

Alix: Mm-hmm. I have so many questions about how Grok could possibly be part of a pro-social technology future, but I think we should leave it there.

This was really, really great, and we'll drop a link, um, to Six Pack of Care in the show notes whenever they come out. And this was lovely. It was really nice seeing you.

Audrey: Mm-hmm. Till next time. Okay. Live long and prosper.

Alix: As usual, thank you to Sarah Myles and Georgia Iacovou, and a special thanks to Mozilla for letting us take up space, uh, at their festival with a little recording studio and a little gazebo. And thank you to the audio engineering team that helped staff it. It was a very lovely experience. And last but not least in our little [00:23:00] suite of MozFest conversations is the CEO of The Onion, Ben Collins.