Jacob Haimes: Welcome to the Into AI Safety Podcast, where we discuss the challenges facing the field of AI safety and those who work within it. This show aims to provide the foundation and resources needed to get up to speed on safe and ethical AI. As always, I'm your host, Jacob Haimes. This interview was recorded on October 11th, 2025, and asides were recorded on November 13th, 2025. Sometimes we should really take a step back and think critically about the assumptions we make when justifying our actions, and my guest today has made a career of doing so in an incredibly rigorous manner. He is currently an assistant professor of philosophy at Vanderbilt University and a research affiliate at the ANU MINT Lab. In his own words, his research sits at the intersection between bounded rationality, inquiry, and value theory. And in my words, he tries to figure out how real people who have to live with real-world limitations can improve the ways they make decisions. Our focus will be on a common idea which underpins most of the common narratives about AI: the singularity hypothesis. Don't worry, you don't need to know what that means yet. We'll go over that, as well as the reasons why it may not hold as much water as many think it does, what to actually be worried about, and the real downstream impacts that the singularity hypothesis has had on both the discourse in AI and our world. With that, I'm excited to introduce Dr. David Thorstad. David Thorstad: Hi, I'm David Thorstad. I'm a philosopher at Vanderbilt University. I have a couple of research programs. The one I'm talking about today is on long-termism and AI safety, and I have another research program on bounded rationality: just what does rationality require of bounded agents like us? Jacob Haimes: Okay. And in one sentence, what are you trying to accomplish with this academic work? David Thorstad: In my work on long-termism, I want people to spend more time and more money on some of the most pressing global problems affecting the world today. Jacob Haimes: Awesome. And I know that you've written a lot on long-termism, and also you have a blog called Reflective Altruism. Um, and you've also worked on bounded rationality. And I believe there are a number of works that you have that are going to be coming out in the near future. So can you just share a bit about how you got interested in this space, where your work is going, and maybe what we'll see in some of those new works coming out? David Thorstad: Sure. In terms of how I got into the space, it was mostly a job. At first, I took a postdoc in 2020, at the height of the pandemic, with Hilary Greaves, who I think is one of the absolute best philosophers alive, and happens to be the former director of the Global Priorities Institute. Jacob Haimes: Mm-hmm. David Thorstad: The university was Oxford, and part of the job was working on long-termism, and so I got started on long-termism as part of my job. I continue working on long-termism because I think it's important, I think it's an idea with a lot of influence, and I think it's important to get these ideas right. Jacob Haimes: What would you say long-termism is, just in case maybe I misunderstand it or, um, yeah, am not quite thinking about it the same way you are? David Thorstad: Sure. So at a minimum, long-termists will think, to quote MacAskill, that making the long-term future go well should be an important moral priority of our time.
Many of them hold much stronger views: that by far the most important thing to do right now is to do what we can to make the long-term future go well. And that's generally the kind of view that I'm gonna be targeting. Um, as you weaken the long-termist view, you're gonna get a view that's more and more plausible. I'm going to have some disagreements, but fewer. Just in terms of where my work's going, I have, as you mentioned, a couple of papers coming out on long-termism, making a case against long-termism, and on AI safety, pushing back against some influential arguments there: the singularity hypothesis and some instrumental convergence or power-seeking arguments. And so the next thing I'm doing is wrapping them all together in a book. It's called Beyond Long-Termism. And the idea of the book is to wrap together a web of challenges to long-termism that should hopefully, together, be large enough to push us into more traditional short- and medium-termist alternatives. And then briefly, to conclude the book, I want to make a case for some short- and medium-term alternatives. Jacob Haimes: Okay. So can you just break that down a little bit for me? I personally don't have a ton of experience with the philosophy space, so, yeah, if you could just provide a little bit more context. David Thorstad: Sure. Many of the objections you'll hear to long-termism are what you might call one-shot objections. They'll tell you, here's the one and only fatal flaw in long-termism. And this claim that's supposed to be the fatal flaw has to be a very strong claim, because the long-termists start off looking pretty good. And so usually I have skepticism about these kinds of claims. They'll say, for example, if you just understand that duties to benefit future people are very small, or if you just understand that some duties to repair present injustice trump all future duties, you know, long-termism is false. Jacob Haimes: Okay. David Thorstad: And so the premise of the book is, I don't want to say that. I want to make a large number of overlapping challenges that I think don't require that much buy-in, in terms of normative assumptions. A lot of people should be able to believe them. And in terms of descriptive assumptions, I don't want to be assuming anything too crazy about the world. And I just wanna argue that if you pack on enough fairly normal kinds of objections to long-termism, you might snowball into a case against long-termism. Jacob Haimes: Okay. So, just to make sure I'm following this correctly, you're pro-long-termism in some senses, but you also have strong reservations about, or preferences against, aspects of long-termism as well. Is that correct? David Thorstad: Long-termism is usually understood as a claim about what we ought to do. Jacob Haimes: Okay. David Thorstad: What we ought to do is to do what we can to make the long-term future go as well as possible. And it's understood comparatively, as doing these things being more important, or in some sense a stronger duty, than doing what we can to solve poverty, to solve global health, to address animal rights in the short term. Hmm. And so I have sympathy with the long-termist idea that the future matters, but I don't have as much sympathy with the long-termist proposal that the most important things for us to be doing right now are to make the future go as well as possible. Jacob Haimes: Gotcha. Okay. There's a lot to unpack in, you know, that whole space.
But I guess I'd like to go back just a little bit to something that you had mentioned while you were going through that, which is you've written papers on Against the Singularity Hypothesis and AI power seeking, and, sort of, addressing some issues that you have with the arguments that are commonly discussed in those. And so we actually met because you did a little talk for Apart on Against the Singularity Hypothesis, and I thought it was really interesting, and you approached this problem in a way that I felt most other people weren't doing. Jacob Haimes (ASIDE): As the singularity hypothesis is the main subject of this episode, I figured it would make sense to take a second to explicitly define it. According to our guest, who has looked into this far more than I have, the singularity hypothesis was popularized by Vernor Vinge and Ray Kurzweil in the nineties and early two thousands, and has since been picked up by various philosophers such as David Chalmers and the Reddit community r/singularity. Instead of providing the definition outright, I find it's more helpful to build up to the scenario and then label it. At some point, we may create a system which is capable enough that it can iterate on its own design, making a subsequent version which is better in some meaningful way in, say, two months. This new version can also improve itself, and do so even more efficiently than the previous versions, this time taking one month, and then the pattern continues on. This is called recursive self-improvement. As a result of the accelerated growth I just described, the technology eventually reaches a point where it is improving at a rate faster than we can comprehend, and as a result, human society will change dramatically in a very short amount of time. While this is the general vibe, the formal definition of the singularity hypothesis has a few characteristics. First, there is some quantity, which in the case we are primarily discussing today is typically considered intelligence. Second, that quantity experiences a time of accelerated growth, meaning that the rate of improvement itself is increasing, not just that improvements are being made. Finally, as a result of this accelerated growth, there will be a discontinuity in human history. If you're interested in this, I highly recommend reading David's Against the Singularity Hypothesis, but I guess that goes without saying since he's literally on my show. Jacob Haimes: So before we get into Against the Singularity Hypothesis, what prompted you to write it? Why did you think that was important to do? David Thorstad: Sure. It was a methodological stance. I've been pretty firm in this space that I think it's important to write serious papers and put them past peer review by experts at good journals on either side of these issues. Mm-hmm. And what frustrated me was I was often being asked to engage with forum posts, with Discord chats, with blog posts. And I write a blog, don't get me wrong, there's nothing wrong with that. But that wasn't at the standard that I wanted to be responding to. Jacob Haimes: Yeah. I mean, yes, there is value to peer review, right? David Thorstad: Uh, yes. So I think academics tend to, in general, have a pretty high bar for when we're gonna come in and respond. Mm-hmm. And around 2021, I started wanting to write a response to some of the AI safety worries on the extreme end, on the existential risk end. And there was at that time one with enough currency to merit a response, namely the singularity hypothesis.
It had the backing of Dave Chalmers, an excellent philosopher, and Nick Bostrom, an excellent philosopher, and both had written about this. So this was something I wrote about not because it was the most popular, I think it was fading in popularity a little bit then, it was just because it was the only one with enough there for me to dig into. Mm-hmm. And get a response out. Okay. Jacob Haimes: Gotcha. And then we'll discuss this in depth, but before we start, I wanted to also mention there was a shift that happened during the workshop and discussion that we had, that I witnessed. And I was very tuned into it at the time as well, because something similar had happened to me around that time. But I'm curious how you would characterize what the response is to the fully fleshed out argument that you present in Against the Singularity Hypothesis, if someone is fully bought into the singularity hypothesis going into the conversation, just so that we have that grounding perspective. 'Cause I think it is valuable to say, okay, this is something that we see sometimes, at least before we even go into it. David Thorstad: Sure. I get two responses. Just to get the hypothesis on the table: the singularity hypothesis says there's this quantity, artificial general intelligence, or rather, excuse me, the intelligence of artificial agents. And this quantity, intelligence, is gonna see a period of not just growth, but accelerating growth, which is gonna be sustained until we reach a discontinuity, which is meant to be a big deal, like crossing the event horizon of a black hole. Hence the name. Typically, accelerating growth is meant as super-exponential, i.e. hyperbolic, i.e. it gets you to an asymptote in finite time. And typically the discontinuity is meant to be intelligence orders of magnitude beyond humanity. Jacob Haimes: And so this also sets aside the, maybe, issues with the construct of intelligence as well. David Thorstad: Yes. So when I wrote this paper, most of the responses to the singularity hypothesis I did not like, and that's why I thought we needed another one. There's one that I have some sympathy for, namely: it's really hard to say what you mean by intelligence. It's just that people thought that one sentence killed the worry, and it really doesn't. I think Dave Chalmers made the point really well, namely: take anything that's kind of intelligence-like and goes up at the appropriate rate, and rerun the singularity hypothesis there. So take a conglomerate of a hundred different IQ tests and SATs and other kinds of reasoning abilities and benchmarks, and say that abilities on those are gonna rise at an accelerating rate until performance is well beyond the human level. Mm-hmm. And just saying that's not intelligence, even if that's true, is maybe not going to be very reassuring. Jacob Haimes: Gotcha. So I think that's really helpful for me as well, as someone who doesn't fully align with the idea of intelligence. Like, I think it's very problematic for a number of reasons. But it's okay if we're using intelligence as the word here. Um, but really we're talking about this sort of competency on these, like, reasoning benchmarks, and yeah, that helps me sort of contextualize it as well, I think. Okay. Cool. So we're setting that aside, and we're just accepting sort of increasing intelligence at a, not just exponential, but hyperbolic rate. Uh, and was that defined by someone? Like, was there a person that has come up with
this singularity hypothesis as, like, the formal definition? Or is it, okay, so that's... David Thorstad: Yeah, I'll explain a little bit about that later when we talk about responses. Until very recently, everybody agreed that the singularity was obviously about intelligence, but was about not just growth, but accelerating growth, and was about reaching a discontinuity, to be analogized with crossing a singularity, hence the name. And so the place to look is the first anthology on the singularity hypothesis, which was the Eden and colleagues' anthology. It's 2012, I think with Springer; I cited it in my paper. Right in the editors' introduction, they do what they're supposed to do, and they say a singularity hypothesis does three things. It specifies a quantity, like intelligence, like GDP. It says the quantity is gonna, excuse me, grow at an accelerating rate until we reach a fundamental discontinuity in human history. And that was settled. So I get two responses to my work. Response number two is the one I don't like: people say, oh no, we never said that. So you'll read on the internet, the last couple of years, people have been peddling the weak singularity hypothesis, which means artificial intelligence is gonna grow, but we never said it was gonna grow at an accelerating rate, and it's gonna grow until AGI, but we never said it was gonna be, you know, orders of magnitude more intelligent than the leading human. And, you know, that's fine if you want to make that claim; that's just an entirely different claim that needs an entirely different discussion. So I really, mm-hmm, don't wanna let people opt out of saying what they said. And I think it's telling that when you see people peddling that claim today, you look at, you know, the footnotes, who they cited the definition to, and they didn't cite it to anybody, because, you know, they're not drawing on historical usage of the term. They're trying to put the word weak in front of it and use the word weak to backpedal out of what people said. Jacob Haimes: Gotcha. Okay. And then what's the other one? David Thorstad: The other response is one I have some sympathy for. I think I mentioned that around 2020, 2021, when I wrote this, people were thinking that the singularity hypothesis is very strong and it wasn't that grounded in data, and there were these other arguments that were getting better, and the data behind them was getting a lot better, and that maybe there were just many reasons to be concerned about AI that had nothing to do with the concept of intelligence explosion. Hmm. And so I think a lot of people have been souring a little bit on the singularity. Jacob Haimes: Gotcha. Okay. And that, I guess, then, just putting those things together, seems to be supported by your very thorough investigation into what the singularity hypothesis is, what it implies, and what would need to happen in order to get there, which is Against the Singularity Hypothesis. David Thorstad: That's the hope, yes. And I don't wanna say that there hasn't been pushback. You know, there are counter-models. I think Epoch AI has one; some Open Philanthropy people have tried to construct them. So, you know, there are counter-models if you want counter-models, but more often I've gotten the other responses rather than counter-models. Jacob Haimes: Gotcha.
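To make the "accelerating growth" clause concrete, here is a minimal numerical sketch, not from the interview, contrasting exponential growth, which stays finite at every finite time, with hyperbolic growth, which hits a vertical asymptote in finite time, the mathematical picture behind talk of a "singularity." The growth constant and starting value are arbitrary illustrative choices.

```python
from math import exp

# Sketch: exponential vs. hyperbolic ("super-exponential") growth.
# Illustrative constants only; nothing here is taken from the interview.

def exponential(x0: float, k: float, t: float) -> float:
    """dx/dt = k*x  =>  x(t) = x0 * e^(k*t); finite at every finite time t."""
    return x0 * exp(k * t)

def hyperbolic(x0: float, k: float, t: float) -> float:
    """dx/dt = k*x^2  =>  x(t) = x0 / (1 - k*x0*t); diverges at t* = 1/(k*x0)."""
    denom = 1 - k * x0 * t
    return float("inf") if denom <= 0 else x0 / denom

x0, k = 1.0, 0.5
t_star = 1 / (k * x0)  # finite time at which the hyperbolic curve blows up
for t in [0.0, 1.0, 1.9, 1.99]:
    print(f"t={t}: exponential={exponential(x0, k, t):.2f}, hyperbolic={hyperbolic(x0, k, t):.2f}")
print(f"hyperbolic growth reaches its asymptote at t* = {t_star}")
```

The point of the contrast is just that the singularity hypothesis, as Thorstad states it, needs the second curve: sustained acceleration that outruns any exponential, not merely continued improvement.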
And I guess another thing, sort of building off of the first reaction that we discussed, which is people sort of put "weak" in front of the singularity hypothesis and say that's not what we really meant. I have a lot of trouble with this personally, because I find that we can't really say this in a vacuum. The idea of the singularity hypothesis, the idea of this massively, continually accelerating, like accelerating acceleration phenomenon, has had concrete impacts on policy, on what people prioritize, on whatever. And so some people might say, oh, well, the, what is it called, argument of does the singularity hypothesis actually matter, does it hold water, et cetera, is essentially intellectual masturbation; that would be an argument that one could have about it. But the impact has been significant, so I feel like we can't really disregard it in that way. And so I was wondering, you've done a lot more thinking about this than I have: what are some of those examples, like concrete examples, that you see or you feel are out there about this narrative having a direct impact? David Thorstad: That's right. I think that it's important to stress that whatever you think of this argument, it is one of the very most influential arguments for existential risk from AI, and that arguments for existential risk from AI are incredibly impactful right now. Jacob Haimes: Mm-hmm. David Thorstad: In philanthropic funding, movements like effective altruism have made hundreds of millions, if not billions, of dollars of annual pivots away from things like saving lives for $5,000 a life right now, and towards very expensive salaries in Silicon Valley and Oxford aimed at addressing AI safety. So if these folks are wrong, you're gonna have a lot of philanthropic misspending on the basis of AI worries. Jacob Haimes: Just piggybacking off of that, and then we'll go back to this philanthropic misspending, but there is also a consideration that it's so minimal compared to what is being spent by big tech. So, like, I don't know, maybe it's good that some of it's going to this; also, you know, a lot of it's going away from well-defined interventions that we know help people right now. Jacob Haimes (ASIDE): So I thought that this was gonna be a pretty straightforward aside, but in doing some investigation, things got a little bit more complicated. There's a pretty decent blog post titled An Overview of the AI Safety Funding Situation, which seems to do a good job of reporting funding for AI safety. On first glance, it reports that in 2024, 111 million USD went towards AI safety, with over 50% of that coming from the charitable organization Open Philanthropy. However, cross-referencing these numbers with Open Philanthropy's self-reported funding information indicates that they gave around 40% more funding in the Navigating Transformative AI category. I'm not sure why the blog post has a number that is so much lower. While perusing Open Phil's funding docs, I noticed that many AI safety orgs, such as Constellation, FAR AI, and MATS, were receiving funding under a different category, Global Catastrophic Risks Capacity Building. Adjusting for this, the total amount Open Phil sent towards AI safety in 2024 was around 136 million USD, which bumps up the total amount spent on AI safety in 2024 to approximately 180 million USD.
To contextualize this, I checked the amount of funding that Open Phil provided to organizations in the Global Health and Wellbeing, Global Health and Development, Global Health R&D, Global Health Public Policy, and Human Health and Wellbeing categories. Together, this funding totaled 280 million USD, just over twice what they provided towards AI safety. During the same year, private investment in AI was 150 billion USD. That's over 850 times more, and that's just investment in startups. Don't forget that big tech was projected to spend almost two times that on AI infrastructure alone in 2025, and bidding wars have top AI talent receiving eight-figure signing bonuses. Jacob Haimes: I guess I'm just saying that there's so much money in the AI space as well, and such a small portion of it is this philanthropic funding. David Thorstad: That's right. I think there are two important things to stress. The first is that at least the movements I talked about, effective altruism and rationalism, are very concerned about the marginal dollar, and about doing the best thing you can with the marginal dollar. So when they're pulling away funding from very effective, very well-evidenced short-term challenges, they've got a pretty high bar for justifying that. Jacob Haimes: Mm-hmm. David Thorstad: But the main thing to talk about is that one of the impacts of these existential risk arguments has been a capture of the term, and the field, AI safety. Mm-hmm. To talk about safety from AI should involve any number of challenges, not just all of us dying, but things like militarization of AI, things like massive theft of information by foreign governments. And there's been a lot of push to concentrate the term, and concentrate research, and concentrate philanthropic funding, on the existential risks in a way that makes it harder to get money and attention for the other risks. And also, if you're skeptical about the existential risk, you'll say, okay, the money's going to safety, but it's not actually going to the safety challenges that we need money to address. And I think that's, yes, the worry that a lot of academics have. Jacob Haimes: Okay. Yeah, no, I definitely agree with that. I think, actually, at the time that this will have come out, there will be an episode on muckrAIkers where we talk about how quote unquote AI safety has actually really been just promoting some really bad outcomes in the long term. And then, I guess, building on that and bringing in something that was relevant, more relevant maybe half a year ago at this point, but AI 2027, to me, sort of characterizes the concerns, the mentality, the perspective of this group of people that is fully bought into the singularity hypothesis. Where does that fit in for you? How is that related, and what was your reaction as well to having a piece like that come out? David Thorstad: I was pretty disheartened by this. I think that the people involved are serious people and that they could have done better. Hmm. I think this was a piece that went very viral very quickly, um, in part due to intentional presentation, in part due to bringing in a blogger to write a story. Almost all of the reads were views on Twitter, and if you're lucky, somebody skimmed the homepage. This wasn't a piece designed for deep engagement with the arguments, and when you looked deeply into the arguments, they really weren't there. Mm-hmm.
And so I was unhappy that what I saw as people who are capable of delivering really well-grounded research were trying to package into a story research that in many cases didn't take place, and when it took place, just wasn't at the standard it would need to be to establish their views. Jacob Haimes: To their credit, the narrative that is crafted and the way that it is presented, it's compelling as a story. And yeah, I think that allows for more disingenuous, maybe, and maybe not even disingenuous, just shoddy, or, yeah, could-have-been-better background research. But so, as someone who obviously cares about rigor, like, you have gone to great lengths to get yourself published in journals and, you know, other academic settings, and you see this lack of rigor in something that's getting way more views than, you know, what a blog post of yours does. I guess I'm not 100% positive about that, but that's an assumption. Yes. Yeah. Um, how do you, like, what do you do about that? How do you deal with it and think, like, how can we improve? And I guess this is also somewhat from my point of view as someone who is in a similar boat of, like, I'm really trying to do something rigorous and effective, and it's just difficult to reach people. David Thorstad: Yeah. I think it's very important, and very important to many of my opponents, often, to reach people not only effectively, but with the right kind of persuasion. Everybody knows that if you want to persuade a lot of people, you can do it fairly easily. You can persuade people that vaccines cause autism, that COVID is fake. Mm-hmm. That Trump won the election, that there's this massive crime wave in Democratic cities where we need to send the National Guard. Persuading people on the internet is not hard. Jacob Haimes: Mm-hmm. David Thorstad: What's hard is getting people on the internet to give you their valuable time to really think carefully through an issue. And I think if you are not doing that, you'd better be very, very sure that you're already right, because if you are wrong, you are going to be spreading misinformation and people aren't going to catch it. Jacob Haimes: Okay. Interesting. No, yeah, I think that makes sense and I agree with that. Yeah, add it to the stack of things that cause autism, right? Yeah, I honestly can't even engage with that. Like, it's just like, oh, okay. I can't. Moving on from just the singularity hypothesis: it's not just the singularity hypothesis, right? There are other arguments, there are other narratives that maybe mirror or have similar patterns to the singularity hypothesis argument. What do you see them as? I have some in mind, but I'm very curious to hear what yours are. David Thorstad: Yeah, I'd like to talk about some of the others, if you have others in mind. I think the most obvious is the time of perils hypothesis. So people will tell me, okay, David, I don't need the singularity hypothesis to be concerned about existential risk. But then they'll say, in response to some of my other work, here's what I do need. I need the claim that now is a very dangerous time, risk is at 10% or 20% in the century, but if we just make it through this dangerous time, risk is going to drop by four or five orders of magnitude. It's gonna drop very quickly and it's gonna stay there for about a billion years.
David Thorstad: That's the time of perils hypothesis. Jacob Haimes: That would be really convenient. David Thorstad: It's very convenient, right? And, you know, this is exactly the form of the claim that they want, and lo and behold, it happens to be true. And then I say, okay, lo and behold, why is it true? And I go through their arguments, and they're like, yeah, okay, those arguments didn't work, but here's the one you, David Thorstad, didn't address: if AI doesn't kill us, it's gonna get so smart so fast, it's gonna guide us through, you know, our cosmic destiny and mitigate all risks. Uh, no. There's a lot to say about that. I think the first thing to say is, if you don't have a singularity story and you want to be attributing these kinds of powers to AI, you really need to tell me where they're coming from, and they're coming from a singularity story. Jacob Haimes: Gotcha. So even if you yourself are like, okay, singularity, maybe that's farfetched, but X, Y, Z, and so because of that we're going to get to this state, and in that state everything will be fine, what you're saying is, when you actually connect the dots between X, Y, Z and the final state, there is some form of singularity hypothesis happening in there. David Thorstad: In the most popular argument for the time of perils, that's right. And you really need this, because if you really think not just this century but every century has 10 or 20 or 30% of existential risk, if you run the math, that actually works really badly for the value of trying to improve the long-term future. And it works really badly for the value of trying to improve the long-term future just because, there's a theorem, it shows the future's probably gonna be short. And if the future's probably gonna be short, we should have a party now. Jacob Haimes: That's optimistic. David Thorstad: Uh, yes. Now, to be clear, I don't hold this view. I think risk is a lot lower than people think it is. But if you think risk is very high now and you don't have a time of perils view, you really are gonna have a very hard time caring about the future, because you just shouldn't think there's much of a future. Jacob Haimes: Be happy, because everything's gonna be high risk forever. David Thorstad: Yeah. It's like getting sick in your nineties, you know: okay, you cure this one, but then there's gonna be another. Versus getting sick in your thirties, you know, the balance of do you want to have fun now versus invest in the future really changes. Jacob Haimes: Gotcha. Okay. Yeah, that does make sense. Yeah, I guess, um, what I've been thinking about is this sort of, I guess time of perils definitely fits this, but it's this, maybe, a term that Igor actually introduced to me on the interview that I did with him on this podcast was, um, shoot, what was it? It's Pascal's Mugging is what it is. David Thorstad: Yeah. Jacob Haimes: Where you've sort of gotten into a situation where there's this like infinite gain on one side and infinite peril on the other, and you have to do the thing, whatever I say, and then we'll get to that infinite gain, otherwise we'll get to that infinite suffering. And so in doing so, you're sort of hijacking humans' sort of predisposition to have a bias towards optimism and using that to your advantage to get people to do whatever it is you want them to do.
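A minimal sketch of the arithmetic behind Thorstad's "run the math" remark above, assuming a constant per-century existential risk instead of a time of perils. The 10% and 20% figures echo the ranges mentioned in the conversation; the code itself is my own illustration, not anything from Thorstad's papers.

```python
# Sketch: with a constant per-century risk r (no time of perils), survival is
# geometric, so the expected future is only on the order of 1/r centuries,
# and the chance of reaching very long horizons collapses.

def expected_centuries(r: float) -> float:
    """Expected number of centuries until extinction, roughly 1/r."""
    return 1 / r

def survival_probability(r: float, centuries: int) -> float:
    """Probability of surviving `centuries` in a row at per-century risk r."""
    return (1 - r) ** centuries

for r in (0.10, 0.20):
    print(f"risk {r:.0%} per century:")
    print(f"  expected future: about {expected_centuries(r):.0f} centuries")
    print(f"  chance of lasting 100 more centuries: {survival_probability(r, 100):.1e}")
    print(f"  chance of lasting a billion years (1e7 centuries): {survival_probability(r, 10_000_000):.1e}")
```

The contrast with the time of perils claim is that dropping risk by four or five orders of magnitude after this century is precisely what turns an expected future of a handful of centuries into one long enough for long-termist arguments to get going.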
Jacob Haimes (ASIDE): To understand this little thought experiment, we should really be going all the way back to 1713, when Nicolaus Bernoulli introduced the St. Petersburg paradox. No, this wasn't the Bernoulli behind the law of large numbers, or Bernoulli's theorem; that was his uncle and doctoral supervisor. The original framing introduces a game in which you flip a coin at each stage. If the coin lands on heads, you double the amount of money in the prize pool, but if it lands on tails, the game ends. You start with two dollars in the prize pool. The expected value of playing this game is winning a prize of infinite money, so a canonical rational actor, in the game theory sense, should pay any amount of money for a chance to play this game. But in practice, it'd be pretty stupid to pay $20 for a single round. Pascal's Mugging is simply a re-flavoring of this problem such that it doesn't technically require dealing with infinite amounts of stuff. This can get really in the weeds really quickly, and if you're into that, check out the show notes, because I added a bunch of relevant links in there, including stuff about ergodicity, which Igor, from the other podcast I host, mentioned to me. But for now, I'll just hit the most important part. Fundamentally, these thought experiments are only valid if we are willing to accept the alternate and speculative realities they exist in. Why would we believe that whoever is offering this game actually has the funds to cover a hundred-heads streak? Extraordinary claims require extraordinary evidence, so we should be extremely skeptical that we could win even 10 consecutive rounds. David Thorstad: Yeah. So the underlying decision-theoretic issue here is one that my old colleague Hayden Wilkinson calls fanaticism. Jacob Haimes: Okay. David Thorstad: Namely, most decision theories, like maximize expected value, have the following feature: take any finite payoff, like a million dollars, and take any really small probability, like one in a quadrillion. There's some insanely good payoff, like 10 to the 950th years of happy life, such that I should take that insanely good payoff at a one-in-a-quadrillion chance rather than the sure great payoff. So the idea is that high-probability good things can always be outweighed by arbitrarily low-probability great things. Jacob Haimes: Okay. David Thorstad: And some people have thought that this is what's going on with long-termism, because if there could be 10 to the 40th people in the future, then if I have a 10 to the minus 20 chance of making that happen, that's still gonna, you know, in expectation, give me like 10 to the 20 lives. And that's just gonna swamp an expected value calculation. So a lot of people have argued, and I don't know if I want to push this line myself, that one of the attractive features of long-termism is that it's promising us very small probabilities of very large gains, and that a lot of decision theories tell you to like very small probabilities of very large gains. Jacob Haimes: Okay, and so, but why do we do that? David Thorstad: Uh, so I'm not clear where I stand on fanaticism. I think a lot of people would like to reject it. Uh, mm-hmm. I think, to be fair to my opponents, it's very hard to find a decision theory that does reject it. So, for example, you can bound the utility function. You can say, you know, there's some amount of value such that no intervention can pay you more than that. But then when you're very, very, very close to there, you get a crazy result.
Like, say that 10 billion people being alive is half as good as the world could get. Then if I add 10 to the 297th power people, that's only as good as the first 10 billion people. Jacob Haimes: Mm-hmm. David Thorstad: And there's nothing special about the number 297. You know, I could have made it larger and larger and larger. So if you bound utilities, you really have to be pretty unenthusiastic about the value of adding additional good things to the universe, and that looks a little bit arbitrary. Yeah. And then the other move you can make there, I mean, there are lots of moves you can make, but you can go after the probability function. You can say we really shouldn't care about small probabilities. And I think that's right, but it's actually hard to say that in a way that doesn't raise a lot of problems. Jacob Haimes: And this, I think, leads into a series that you have put a lot of work into on your blog, about the exaggeration of risks, particularly within the effective altruism, EA, long-termism space. It seems to me, based on looking at those, you believe that a lot of risks, not just AI safety, have been blown out of proportion. I guess I'm interested, so just, what are your top three? And then also, what are you actually concerned about? David Thorstad: Good. So I think, at least within the long-termist community, most of people's probability mass for existential risk is on AI, followed by pandemics. So in terms of what I think have been most exaggerated, and what I have argued against, it's AI, followed by pandemics. In terms of just sheer magnitude of exaggeration, I think, and I'm probably gonna get in trouble for saying this, climate change. Climate change is very bad. We should work to prevent climate change. That is not because it's gonna kill all of us; it is very, very unlikely that climate change would kill all of us. And so I think the people pushing climate change as an existential risk are wrong. The people claiming that we shouldn't work on climate change because it's not an existential risk are also wrong. But I think that maybe would round out the three. Jacob Haimes: Gotcha. And so just to make sure I'm understanding there: you're saying it's not gonna kill all of us, so if your argument is that it's an existential risk, that's bad. However, it's going to be a big problem, it's going to be very harmful and have negative impacts, and we should be addressing it. David Thorstad: Absolutely. Jacob Haimes: Yeah, that makes sense. And then what is it that you are concerned about right now? David Thorstad: I'm concerned about a lot of things. I think, personally, we're recording this in mid-October 2025, and I am, mm-hmm, more concerned than I used to be about authoritarianism. I'm sitting at a university, and we think of ourselves as, you know, one of the last bastions of resistance to this kind of stuff, and precisely because of that, my government is coming after us. So, to give you an example, I'm sitting at Vanderbilt University, which is currently deliberating, um, President Trump's compact for preferential treatment under some fairly unprecedented conditions. And because I'm untenured, I currently don't want to say anything more about my opinion about the compact, and I think that might maybe say more about where we are than Jacob Haimes: anything that you could say. David Thorstad: Yeah. Jacob Haimes (ASIDE): With all that's going on regarding Trump and his regime, I mean administration, it's easy to lose track of important happenings.
This one is called Trump's Compact for Academic Excellence in Higher Education. In early October, letters were sent to nine universities, including MIT, the University of Arizona, and Vanderbilt, asking them to provide feedback on this compact, although it also seems clear that the intent is for the universities to sign it. Most of the nine have already publicly rejected doing so. This request and the wording of the compact are nothing short of a threat to these institutions, including the statement that the universities are free to develop models and values other than those below if the institution elects to forego federal benefits. The requirements include stopping DEI programs, enforcing strict sex-at-birth-based gendering, and instituting limits on accepting international students, and the compact contains rather explicit viewpoint discrimination by singling out conservative ideas for protection. On November 5th, there was a protest held by Vanderbilt faculty and students, who are quite disappointed in the response that Vanderbilt leadership has had, as they have not explicitly said there is no way they will sign it, like many of the other universities have. I guess we'll have to see how this one plays out. Jacob Haimes: And the part that is really freaky to me from that is that we are not even a year in. David Thorstad: Yeah. So, you know, when we hear these worries about artificial intelligence entrenching stable authoritarianism and stable, uh, totalitarianism: of course technology is a tool for authoritarianism, and that's a worry, but Jacob Haimes: mm-hmm. David Thorstad: You can be concerned about authoritarianism without telling narratives about hypothetical futures. And this might be an example of one of the spaces where the focus on the very most extreme risks, or the very most speculative future risks, is sometimes coming at the expense of concern about, or even sometimes support for, policies and people causing risks in the present. Jacob Haimes: Yeah, I mean, I totally agree. I guess one thing as well would be, like, stories that we're hearing right now, that are actively happening right now; one could argue that not too long ago they might've been considered something too farfetched to happen in the near future. And so, yeah, I think that's also telling as to, like, what can happen now and where to set our threshold. It's easy for that threshold to climb up as things worsen, I think. David Thorstad: Yeah. And to be fair, pushing, I think, a little bit away from what concerns me in my private life, there are major concerns that are probably more pressing, namely poverty. Jacob Haimes: Mm-hmm. David Thorstad: Many, many people still live on less than $2 a day, not $2 in their currency, $2 in purchasing power parity: whatever you can buy for $2 is what they can buy every day. Many people, over a million, die every year, many of them children, from entirely preventable tropical diseases. Mm-hmm. We know how to address all of these things, and we're not doing it because we don't want to shell out the money. So at the same time that I complain about some of my civil liberties being taken away, I don't want to minimize the extent of some of the global challenges like poverty and like global health, which probably, in the scale of things, would be more important at the moment. Jacob Haimes: That does make sense. I definitely feel, personally feel, the authoritarianism aspect more pressingly as well.
But I do think that, you know, we need to be more cognizant of what, as far as I can tell, effective altruism was more into maybe a decade ago, um, a little over a decade ago, with regards to buying malaria nets and things like that. Again, with regards to what I'm concerned about, it's not just the authoritarianism but the extreme concentration of wealth and power, so making the poverty issues worse and more expansive, I guess. So currently it is, in some locations, expanding the locations where this extreme poverty is happening, disempowering people, and sort of also, in turn, entrenching authoritarianism. My hope is that you have arguments to make me feel better about that. Is that true? David Thorstad: No, I'm also concerned. I have arguments that I will make in my book to make people work on these problems, but, um, I don't think I want to talk people out of concern for poverty or for authoritarianism or for concentration of wealth and concentration of power. And I don't think that these are particularly difficult problems to demonstrate. So if listeners ever find themselves doubting whether a given concern about artificial intelligence is gonna come to pass, or whether a given contribution is gonna help or hurt, I don't think there's much reason to doubt that we know what to do in many areas to improve the lives of the poor and improve the lives of the sick, to promote democracy. And so these are always fairly safe things that we can do. Jacob Haimes: Hmm. Okay. Well, that's unfortunate for me; I wish there was an easy answer. But I guess going into, like, how can we address these? What maybe are the ways, or, like, the overview of the arguments for why to address them and how to address them? And potentially even ways that groups which believe that they're doing good might be contributing towards those issues as well. David Thorstad: Honestly, I think there are a lot of very important local debates about particular solutions, but I think a very good place to start is following the money, to ask: are our interventions taking money from wealthy nations and putting that money in the hands of people who need it in poor nations, or are they taking money from wealthy nations and putting that money in the hands of Silicon Valley and venture capitalists? And I think the answer to that question can tell you a lot about the implications for global poverty and global health of what you're doing. So if you are, for example, buying malaria nets and shipping them for free overseas, or if you're literally making cash transfers overseas, this is probably reducing the concentration of wealth and power. If you are paying into a research lab in California, you're probably increasing the concentration of wealth and power. And so I think it's often, in these spaces, relatively clear what kinds of things you would want to do if you were more concerned or less concerned. There are some areas at the border. So of course, when you get into a space like animal advocacy, a lot of people really want to give to local organizations that employ local, not only workers, but local leaders, and a lot of foreign philanthropists think that they can more effectively do good if they give money to NGOs operating abroad. And, you know, these kinds of debates get much more complicated. But certainly when we're thinking about paying money to AI researchers versus sending money to countries that are not currently benefiting from the AI boom,
I don't think it's difficult to say which one is going to reduce wealth and power concentration. Jacob Haimes: Okay. Yeah. But so many of the people that are in this space, that's how they're getting paid. To an extent, that's how I'm getting paid. I mean, I'm not getting paid for this. Yeah. But what does it take to sort of grapple with that, and how do you think someone goes about doing that? David Thorstad: It's hard. Uh, so I should be upfront: I took a postdoc funded primarily by money from effective altruists. I am currently on a grant from the Survival and Flourishing Fund, which is coming from Silicon Valley. I've taken many grants of this sort. So when we talk about the influence of money, especially on an industry like academia that doesn't have very much money, money does have a lot of power, and it is one of the reasons that so many people get into the space. And it's hard, because if you can't reduce the flow of the money, there's only so much that can be done to stop the space from developing. That's why philanthropists invest in an area: because they know they're gonna build it up. And so we really need to be having the debate that the effective altruists wanted to have, namely, for the folks who've got money and wanna spend it, where should they be spending it? And I think we need to convince some of those folks to spend less on existential risk. And then, you know, fewer people will be working on existential risk simply because their organizations haven't been funded. Jacob Haimes: Okay. Yeah. But also, then, isn't it sort of a problem that there are so few people that have that authority, that are controlling such a significant swath of philanthropic funding? Which is not just philanthropic funding, but also then funding for entire fields. Like, how do we address that? Or are you just sort of saying we have to get them to change their minds? David Thorstad: I think this is very important: attitudes towards philanthropy have changed a lot. When Rockefeller wanted to set up his foundation in the 19th century, he wanted to give a bunch of money to set up a charitable foundation, and Congress said, no, you can't do it. And they said, you can't do it 'cause we don't like you, we don't trust you, and we don't want you pushing around the economy. Jacob Haimes: You're goddamn right. Where can we get that Congress? David Thorstad: We've changed to a place where the image of the philanthropist is one we're rightly thankful for, for the cash flow that they're providing, but we're less worried about the ability of philanthropists to magnify their influence through philanthropy. And right now in the United States, philanthropy is tax-favored more than it has been almost anywhere in the entire history of the world. So our legislation is not just permitting, but actually magnifying, the impact of very large-scale philanthropy. Mm-hmm. And this can really move fields. So when you have, for example, a very small academic field researching, say, AI ethics or AI safety, and very large funders come in at one end of the field, all of a sudden everybody is saying what those very large funders want to hear. And that isn't to say that there's necessarily something wrong with the exercise of power in all cases, but I think people need to be really cognizant of how much power they exercise when they move these amounts of money, and the concern that even if you're wrong, you might get a lot of influence with your money rather than through the strength of argument. Jacob Haimes: Yes.
And a lot of people won't realize it, too, right? Like, it's easy to convince oneself that the best thing is the thing that also works out for you. David Thorstad: Yeah. So for example, um, say you were to donate, hypothetically, $600 million to a bunch of folks in Washington trying to advocate for a certain cause. I can predict that more people are gonna care about that cause and there's gonna be more legislation about the cause, and it does not matter what the merits of the cause are. I do not need to name the cause, not just because everybody knows it, but because it doesn't matter what the cause is. And that's not to say that arguments don't also matter. But money matters too, and money doesn't always move in lockstep with arguments. Jacob Haimes: So, again, though, that sort of, at least from my understanding of what you just said, puts it back on those individuals who have that power already, at least in the current state of things. I guess the other option is legislators, but actually, even if they could do anything, it's not like it would matter, because of the Supreme Court. David Thorstad: I think you're right. I think, at least in the United States, we're in a regime where people are very, very hesitant to intervene on philanthropy currently. Pretty much, if you're not scamming anybody and you're not falsely claiming to be a religion, there's very little they can do. You cannot shut down a philanthropy because you think they're inefficient or because they're, um, pushing views that are scientifically discredited. And I don't think we will be moving into a society like that anytime soon, or maybe even should. So I think there's a lot of need for people who made their money in markets, which will definitely tell them when they're doing the wrong thing, and who are moving into philanthropy, which is the one market where everybody tells you what you want to hear and you get no feedback, to make sure that they're offering a competitive and, like, useful intervention, because the market will no longer tell them if they're offering the wrong thing. Jacob Haimes: Mm. Okay. And would this then go along with the mentality of, I believe, like, Rethink Priorities, and some of the, I'm going back to, I believe, some of the earlier EA kind of things, you know, like 12 or so years ago, of cost-benefit analysis comparing, you know, whatever you're doing to just a direct cash transfer to the same population, or something like that? David Thorstad: Yes, I think that doing explicit analysis is important, and also engaging with critics. So something I really like about this movement is that they know there's a very large chance that they could go wrong, and they're uniquely interested in listening to people who think they're wrong, to the extent that I've had grants from Open Philanthropy, I have a grant from the Survival and Flourishing Fund, I worked at the Global Priorities Institute, and I have given talks at many effective altruist conferences. And this happens because I think these movements, at least to some extent, understand the dangers of getting into an echo chamber where you don't get any feedback or you don't get the right feedback. And I don't want... Jacob Haimes: Sorry, I just, I feel like that's so much what's happened, though, that it is an echo chamber, that ideas that are being presented are not being given the same amount of, like, rigor that they should be.
Um, there's, like, a lot of groupthink, and a lot of almost digging in to arguments that have sort of been established just because they're established. So, like, I do see what you're saying: there are a decent number of people that are cognizant about these things and are trying to engage, but then there's also a significant number of people who I don't feel the same way about, and so I'm just having trouble squaring those two things. David Thorstad: Yeah. So, to be clear, when I say that some movements do a fairly good job trying to be aware of their limitations and trying to answer to critics, I do think it's important to bear in mind just how hard it is to avoid echo chambers, to avoid group dynamics, to avoid interpreting evidence to favor views you already hold. Mm-hmm. And when you do things like consume the same media, write on the same forums, live in the same houses, this can get a lot more difficult. But at the same time, I don't wanna put this too much on the shoulders of any one movement, because it's not as though some people have biased beliefs about AI but nobody has biased beliefs about vaccines. You know, this happens everywhere in our lives. It's just a reminder that even some of the more exemplary practices of trying to break out of our own epistemic comfort zone, as it is, are maybe not always gonna be enough to counteract the kinds of believers we are. Jacob Haimes: Okay. And you mentioned this isn't just about AI, but even just within AI there are also other subgroups, right? There are the inherent, uh, or perpetual, like, hypers, promoters. And they're sort of fully engaged with a very different side of the same argument, I think. Do you have any thoughts on, so, a lot of this conversation, a lot of, like, the Against the Singularity Hypothesis and that sort of thing, is more focused towards the, yeah, the EA, long-termist, like, existential risk side. But what about the people who have, like, bought into the singularity hypothesis and are like, singularity, let's go, acceleration is the key? Often libertarians are closely aligned with this. I don't quite know how to construct this question, but what do you think about them? David Thorstad: It's a good question. I think mostly I should engage more with these folks. It's just a sociological fact that I started kind of working with the people on the other end of the spectrum. I think I'm going to wanna say the same thing in response to them that I'm gonna say in response to the folks concerned about existential risk. Mm-hmm. Namely, there's a shared and very optimistic technological assumption about the speed and degree of progress we're gonna see in AI, and this very optimistic technological assumption might be a pitfall for both groups. Jacob Haimes: And so it sort of goes back to that thing that we were talking about earlier, where there's the, like, infinite payoff, there's the, um, singularity hypothesis to the time of prosperity, uh, or whatever. David Thorstad: Yes. And so the thought is that, of course, we can have a debate about, if radically superintelligent artificial agents came, would they kill us, would they save us, would they make life meaningless? But we also need to ask about the "if", namely, why do we think radically superintelligent AI is right on the horizon? And I don't always see the kinds of grounds being offered for that claim that I'd like to see. Jacob Haimes: Gotcha. Okay.
So then, when it comes to much more recent, so, contemporary sort of conversations, this is October of 2025. Um, starting in August maybe, or maybe a tiny bit before then, but with the release of GPT-5, people began to say again that, like, well, actually not again, 'cause last time it was a little bit of a different claim. But there seems to be a vibe shift around essentially everyone that was talking about AI embracing this sort of singularity-like, perpetually improving story, and where we're at now. Where does that fit in, and do you think that this is gonna continue, that people are gonna continue to sort of reveal, like, pull off, I can't speak, continue to show that there is this sort of facade? Or do you think this is just a little bump? Is there going to be an AI bubble burst, basically? David Thorstad: Honestly, I'm pretty hesitant about forecasting here, at least about my ability to forecast. I think in the very early days, the fifties and the sixties, people made wildly optimistic claims, and when those didn't happen, when we didn't build AI in a summer, all of the responsible AI researchers, certainly all of the academics, stopped making very strong speculative claims about the future. And I think very recently it's come back into fashion for people to use their platform to speculate on where AI is heading. Mm, mm-hmm. I think if any of us really knew, we would be trading on Wall Street right now and not telling you any of what we knew. And I think that the fact that I'm not trading all my money on a particular hypothesis about AI right now probably tells you that I should hold off from speculating. So I think maybe we need to think about the fact that our vibes are changing on a monthly basis, and maybe take that as a reason to reflect on the basis for our vibes, more than as, you know, an opportunity to argue for our favorite vibe. Jacob Haimes: But it's, like, just two months out, though. David Thorstad: It's hard, and it's very hard because many of the most responsible people still aren't forecasting, and they're being drowned out because they won't. But the problem is, people who are skeptical of AI progress have also been putting their foot in their mouth, because they've also been making ungrounded forecasts and they've also been wrong. So, not to name names, but one way to make the people who think we're gonna make a lot of progress really fast look good is to say, AI is never going to do X, and then in six months AI does X. So we need to be really careful, on all sides, about ungrounded forecasting, because, um, this can be something that everybody does when they shouldn't do it, and it doesn't look good when you're wrong. Jacob Haimes: Okay. Now, taking that into account, what is your hottest take regarding AI or the AI safety space or effective altruism? David Thorstad: Alright, hottest take: feed the hungry, heal the sick. AI is important. People are still hungry, people are still sick. If we do this century right, they'll never be as hungry, never be as sick again. So let's start there. Jacob Haimes: Okay. I guess that's a pretty reasoned and reasonable hot take, though. Anything, like, and with credit to you, you know, we did just come off of saying we probably shouldn't do super speculative forecasting, but anything even spicier? David Thorstad: Okay. A spicier sentence before that: Peter Singer was right. Feed the hungry. Heal the sick. Jacob Haimes: Okay. Okay. And then I'd like to end with two questions that I ask everyone.
So the first is, like, a lot of this discussion has already been about this to some extent, but what grinds your gears, or really irritates you, about your work or the space or the people that are in it? Any aspect of what you do; what is something you could live without? David Thorstad: Yeah, there are many extremely well-qualified people writing on AI existential risk, but they aren't always writing fully rigorous papers that I could respond to. So currently I only have two papers, one on the singularity, one on power seeking. Why? Because those are the only two areas where the arguments have gotten to the point where I think they can take a high-level response. Mm-hmm. Okay. And so I would really like the people, and they know who they are, with the ability to be producing world-leading research, to be producing world-leading research in support of their views, if nothing else because that makes it easier to respond to it and hopefully, mm-hmm, start the kind of conversation they'd like to see. Jacob Haimes: And that's, I guess, just to be even more explicit and to make sure I'm understanding: instead of, for example, Anthropic or OpenAI creating a quote unquote paper that is not peer reviewed at all and has no sort of third-party check on it, it would be great if they actually submitted it to a journal and went through the process of peer review, engaging with that in a rigorous way, so that they could put out something that was really high quality. You know, like, OpenAI and Anthropic don't attend the big conferences, uh, at all, but they still put out work that parades around as if it's an academic paper. David Thorstad: And that's the best case, that that stuff is, you know, moderately good. A lot of what I'm getting sent is AI Alignment Forum posts, LessWrong posts, blog posts, Discord chats. And I'm being asked, you know, why haven't you addressed these arguments for the singularity? And obviously I would like them to submit themselves to peer review, but we could start with a paper. Sometimes a good paper is, you know, enough of an ask. Jacob Haimes: Yeah. I mean, even a workshop paper; in machine learning, workshops are pretty accessible. Yeah. Um, and it's a good start. At least it gets some amount of review, and it also opens things up a little bit more as well. Yeah. Uh, and then the last question: what's your favorite part about what you do, or the people you work with, or any aspect of what you do? What is it and why do you enjoy it? David Thorstad: I literally get paid to think and write all day in a place with smart people. It is a job with an okay salary but very, very high job satisfaction, for a reason. It's a great job and I love doing it. Jacob Haimes: Awesome. David, thank you so much for joining. I know we didn't really get to, you know, AI power seeking much at all. That was intentional, because I wanted to really focus on the singularity and then sort of the aspects of this argument. But maybe we can have you back on at some point to discuss the other one. But yeah, it was great. Great to have you. David Thorstad: Great talking to you. Thank you for having me. Jacob Haimes (ASIDE): And that's the interview. I had a great time chatting with David and getting into the weeds on some pretty advanced topics, as well as getting a philosopher's perspective on long-termism and bounded rationality.
If you found his thoughts as interesting as I did, make sure you check out some of his papers and his blog, which are all linked in the show notes. As always, you can listen to an extended cut of this episode on the Kairos FM Patreon for just $2 a month. If you enjoyed this interview, or otherwise have any criticisms or ideas for the program, leave us a review; it helps us make the show better and improves the likelihood that more people will hear it. Thanks so much. See you next time.