Manifold

All but the last 20 minutes of this episode should be comprehensible to non-physicists.

Steve explains where frontier AI models are in understanding frontier theoretical physics. The best analogy is to a “brilliant but unreliable genius colleague”!

He describes a specific example: the use of AI in recent research in quantum field theory (Tomonaga-Schwinger integrability conditions applied to state-dependent modifications of quantum mechanics), work now accepted for publication in Physics Letters B after peer review. Remarkably, the main idea in the paper originated de novo from GPT-5.


Links:

Chapter markers:
  • (00:00) - Intro: AI discussion with specialized physics at the end
  • (03:40) - The current AI landscape for science: frontier models, Co-Scientist, and recent math breakthroughs
  • (11:01) - Why models help and why they fail: errors, deep confabulation, and the research risk
  • (15:54) - The Generator–Verifier workflow: how chaining model inference suppresses mistakes
  • (23:30) - Project origin: testing models on Hsu’s older nonlinear QM/QFT work
  • (30:35) - The “GPT-5 moment”: Tomonaga–Schwinger angle appears and produces the key equation
  • (40:35) - Wild goose chases & a practical heuristic: axiomatic QFT detour; Generator-Verifier convergence
  • (51:44) - Referee-driven test case: Kaplan–Rajendran model, past-lightcone geometry, and verification
  • (55:55) - Tooling & outlook: automation prototype, chaining into “supermodels,” where this is headed
  • (59:39) - Physics slides (advanced): TS integrability, microcausality, and why nonlinearity threatens locality

Steve Hsu is Professor of Theoretical Physics and of Computational Mathematics, Science, and Engineering at Michigan State University. Previously, he was Senior Vice President for Research and Innovation at MSU and Director of the Institute of Theoretical Science at the University of Oregon. Hsu is a startup founder (SuperFocus.ai, SafeWeb, Genomic Prediction, Othram) and advisor to venture capital and other investment firms. He was educated at Caltech and Berkeley, was a Harvard Junior Fellow, and has held faculty positions at Yale, the University of Oregon, and MSU. Please send any questions or suggestions to manifold1podcast@gmail.com or Steve on X @hsu_steve.

Creators and Guests

Host
Stephen Hsu
Steve Hsu is Professor of Theoretical Physics and of Computational Mathematics, Science, and Engineering at Michigan State University.

What is Manifold?

Steve Hsu is Professor of Theoretical Physics and Computational Mathematics, Science, and Engineering at Michigan State University. Join him for wide-ranging conversations with leading writers, scientists, technologists, academics, entrepreneurs, investors, and more.

Steve Hsu: You have a brilliant but unreliable genius colleague, right? So you go down the hall to your colleague's office, and he's some kind of strange brain. His brain is clearly not like yours, but he has an encyclopedic mastery of all the literature, and he can do lightning calculations on his blackboard. So every now and then you go and talk to him.

You say, Hey, I had this idea. Do you have any thoughts on this? Has anybody already done this? And he just leaps to the board and writes out like you know, a hundred lines of response to your query. That's kind of what it's like trying to do theoretical physics with these LLMs. But the key part of the analogy is this human, this genius whose office you go to with these requests is capable of deep insights, but also capable of very simple and also very profound mistakes, right?

So you can't just assume, oh, if this genius thinks there's some good analogy between A and B, it could be fruitful to think of A in terms of B or reformulate A in terms of B. When a human genius says that to you, I think you can have a higher degree of confidence that the suggestion, that analogy, has something to it.

But with the models, it's not always true. These intelligences are not like human intelligences. They don't understand physics quite the way we understand physics, but my take-home message is: they can be used in a very fruitful way.

Welcome to Manifold. This is going to be a solo episode where I talk about my recent work applying AI to theoretical physics, which culminated in this research paper, published in Physics Letters B. It's on a fairly technical subject, so I won't really go into the physics until the end of this episode.
For those of you listening on audio, I will try to make the whole episode as accessible as possible. But this particular episode is probably best to watch on video, so you can see some of the slides and excerpts from the papers that I'm gonna discuss. In addition to the physics paper that's on the screen right now, I also wrote a companion paper, which describes how the research was conducted in collaboration with AI. Let me just switch windows to show you that paper. It's called Theoretical Physics with Generative AI. Again, for audio listeners, I'm gonna try to make this as accessible as possible; I think at least the first half hour will be perfectly fine for you.

Later on, when I get into the actual physics and the physics equations, it's probably better to have the ability to look at what's on the screen. But for this first part of the episode, which is gonna be mostly narrative, I think you'll be fine as an audio-only listener. So for easily the last six months, I've been testing all the major frontier models to see how good they are at physics and math.
And you may remember that just a few episodes back I did an episode with a UCLA CS professor who had written a paper using a particular kind of generator-verifier architecture, which I'll discuss in more detail. Using that architecture, he was able to take off-the-shelf frontier models and get gold-medal-level performance on International Math Olympiad problems.

So that's a previous episode that I'll link to in the show notes. I've been conducting a similar kind of research, but aimed more directly at figuring out ways to use the models productively for physics and math, and also at getting a good feel for their understanding of frontier research topics, mostly in physics but a little bit in math as well.

And as part of that work, I actually started collaborating with DeepMind. There are some DeepMind researchers, acknowledged in this paper that's on the screen right now, who have built something called Co-Scientist, and they've written a paper on it. Co-Scientist is a sort of souped-up version of the frontier models that the general public has access to, but improved in certain ways: it has a large token budget and it is meant to be used by research scientists as a collaborator. It is meant to perform calculations, literature searches, et cetera, but also to produce new ideas for science research.

So this whole field is very, very active right now. You may have also read a blog post by Terence Tao, where he talks about the use of AI in his research. I believe more than one Erdős problem that had been open for a long time has now been solved by the models, or by scientists working in collaboration with models.

The paper that I wrote is, I believe, the first published theoretical physics article in which the main idea actually originated from an AI, in this case from GPT-5. That's something that I'll discuss in detail. So let me now go through what I say in this paper. The title of this paper is Theoretical Physics with Generative AI.

As I said, it's mainly about the current situation: how to productively use generative AI in theoretical physics research. The specific example that I explore in the paper is the quantum field theory paper that I showed on the screen when this episode first started.
The topic of that paper is Tomonaga-Schwinger integrability conditions applied to state-dependent modifications of quantum mechanics. Now, that's quite an esoteric topic; only a small number of theoretical physicists, I think, would really be familiar with that terminology.

But I will go over that. I have some slides that briefly explain the paper in case some of you viewers or listeners are physicists. I'll go through the main results of the paper at the end of this episode, so people who are not experts on that topic can just drop off before that happens.

What I'm gonna discuss right now is, I think, of broader interest. And you can see on the screen here I say: remarkably, the main idea in the paper originated de novo from GPT-5. I also used models like GPT-5, Gemini, and Qwen-Max extensively in the research. I found those three to be extremely good at math and theoretical physics.

And I use them in a kind of generator-verifier pipeline, or architecture, which I'll explain. Okay. Now, frontier LLMs have been trained extensively on textbook materials in physics and math, but all the research articles on arXiv, and probably in most of the main journals, have also been used and subjected to forward and backward next-token prediction, I would suspect, by all of the major labs.

All of the models at the frontier are actually pretty good at physics. So, for example, in undergraduate and graduate-level courses in physics, students can readily get the AIs to solve the homework problems that they're set.
So we're in a situation where we have to figure out how to deal with students learning physics, physics majors, because what used to be a very good learning exercise for them, being assigned some problems and having to work them out, has been undermined. In the old days it was already possible to get solution sets, like the many Chinese-produced solution sets you could find online, and they could help you solve the problems. But now the AIs can just solve them in a few seconds or a few minutes. And so that's had a huge impact on how we organize our courses at the university. A little bit harder than solving well-known homework-style problems of the kind you'd find at the back of a textbook chapter is actually understanding frontier research papers and trying to productively help researchers push the frontier forward.

And mostly what I'm gonna talk about is that. But given that the models can solve problems that would appear, say, in a graduate-level textbook, that suggests they might be useful to researchers in trying to actually push the frontier forward. Now, the main problem that the models have, which I'm sure you're familiar with, is mistakes, which are sometimes called hallucinations; another term for it is confabulation. I've talked about that in previous Manifold episodes where AI is the subject. In the context of physics research, the real problem is that the models can make simple mistakes, like simple calculational errors, although that's becoming rarer and rarer.

The more troublesome mistake they make is that they may propose some idea, for example, that a technique from some distant area of theoretical physics can be applied to the specific problem or research question that you're interested in, that you're asking them about. And unlike humans, the models have read everything, so they can make what sound like plausible suggestions: oh, this distant technique B could be applied to your problem A, and here's some reasoning, or a sketch, for how that would work out. And the problem arises unless you are expert in both A and B; usually the person in this situation is an expert in A but not in B, because humans are very finite in our deep knowledge.

If you know A but you don't know B, that suggestion can sound extremely exciting. And then you can waste many, many hours trying to figure out whether that suggestion is actually gonna be fruitful, whether the things the model is telling you in detail are correct. In my case, for this particular paper that I'm talking about, I actually had to go away and learn a bunch of stuff about B. In this case, B was something called axiomatic quantum field theory, which is not my area of research.

But I had to go and learn a bunch of stuff about axiomatic quantum field theory to figure out that what the model suggested to me turned out not to be a very fruitful direction; it turned out to be actually kind of completely wrong. That can waste a huge amount of researcher time. And if the AIs do tend to lead you on a kind of wild goose chase every now and then, that makes the ROI, the cost-benefit analysis, of using the models much worse, because okay, they help you most of the time, but then they occasionally waste ten hours of your time.

Very bad, right? So this generate-verify protocol that I came up with, and that the UCLA team (one of whom I interviewed on a previous podcast) came up with as well, is to have different instances of the model, or even instances of different models, playing different roles: one AI generates a response to the research question, while others play the role of verifiers, trying to find problems in the earlier response of the other model.

And by cycling through generate, verify, generate, verify, or generate, verify, modify, verify, that kind of process, I've found that you can suppress many of the problems, both simple errors and also deep conceptual confabulation. You can suppress those to quite an extent.
And I'll go deeper into some examples as we go on in this talk. Now, I give an example in the paper of a verifier step. And this is the simplest possible thing: you just have some output from model one and you show it to model two. But model two has been prompted with the following.

You are a world-class theoretical physicist. Check the following for errors. And then you put in the output from model one. Review each equation and each reasoning step, identify all problems, and summarize your findings. So, a very simple prompt. You could make more elaborate ones; the UCLA prompts that were used to solve IMO problems are longer and more detailed, a few pages long. This one is very, very short, but even this one will generate a good analysis by model two of what had been proposed by model one. And if you cycle through this, it really is quite helpful. Okay. As I mentioned already, the main models I used were GPT-5, Gemini, and Qwen-Max. I also found that DeepSeek, Grok, Kimi K2, and some others were quite strong in their physics capabilities. I think to some extent just chaining together different models in this generate-verify protocol is enough to take any of these top models and make them into a very useful research tool for scientists.
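(Editor's note for readers who want to try this workflow: here is a minimal sketch of the kind of generate-verify loop described above. The `chat()` helper, the model names, and the exact prompt wording are placeholder choices for illustration; they are not the specific tooling used in this research.)

```python
# Minimal sketch of a generate-verify loop (illustrative only).
# `chat(model, prompt)` is a hypothetical helper that wraps whatever
# chat-completion API you actually use; model names are placeholders.

VERIFY_PROMPT = (
    "You are a world-class theoretical physicist. Check the following for errors. "
    "Review each equation and each reasoning step, identify all problems, "
    "and summarize your findings.\n\n{draft}"
)

def chat(model: str, prompt: str) -> str:
    """Placeholder: call your model provider of choice here."""
    raise NotImplementedError

def generate_verify(question: str, generator: str, verifiers: list[str], rounds: int = 3) -> str:
    # The generator produces an initial draft answer.
    draft = chat(generator, question)
    for _ in range(rounds):
        # Independent verifier instances critique the current draft.
        critiques = [chat(v, VERIFY_PROMPT.format(draft=draft)) for v in verifiers]
        # The generator revises in light of the critiques (the "modify" step).
        revision_prompt = (
            f"Original question:\n{question}\n\nCurrent draft:\n{draft}\n\n"
            "Referee reports:\n" + "\n---\n".join(critiques)
            + "\n\nRevise the draft, addressing every valid objection."
        )
        draft = chat(generator, revision_prompt)
    return draft

# Example usage (hypothetical model identifiers):
# answer = generate_verify("Derive ...", generator="gpt-5", verifiers=["gemini", "qwen-max"])
```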

Okay. So the broader take-home message, whether you're a physicist or a mathematician or an engineer or even a biologist, is that each individual model is smart but can make mistakes, and chaining these things together suppresses the probability of the mistakes. Okay. I just wanna add that, at least for physics, I characterized the process of doing research with an LLM as analogous to the following.
You have a brilliant but unreliable genius colleague, right? So you go down the hall to your colleague's office, and he's some kind of strange brain. His brain is clearly not like yours, but he has an encyclopedic mastery of all the literature, and he can do lightning calculations on his blackboard. So every now and then you go and talk to him.

You say, Hey, I had this idea. Do you have any thoughts on this? Has anybody already done this? And he just leaps to the board and writes out like you know, a hundred lines of response to your query. that's kind of what it's like trying to do theoretical physics with these LLMs. But the key part of the analogy is this human, this genius whose office you go to with these requests is capable of deep insights, but also capable of very simple and also very profound mistakes, right?

So you can't just assume, oh, if this genius thinks there's some good analogy between A and B, it could be fruitful to think of A in terms of B or reformulate A in terms of B. When a human genius says that to you, I think you can have a higher degree of confidence that the suggestion, that analogy, has something to it. But with the models, it's not always true. These intelligences are not like human intelligences. They don't understand physics quite the way we understand physics, but my take-home message is: they can be used in a very fruitful way. I have to say a little bit about the research context that I was working in, because now I'm gonna go through the example of the actual research project that led to this published paper, for which, remarkably, the main idea came from GPT-5. I was testing the capabilities of GPT-5 and some of the other models by doing the following.

I uploaded, or pointed the model at, one of my own papers, where I know all the details, or maybe with the passing of time I've forgotten some of the details, but I still have some kind of deep knowledge in that area. I upload that to the model and then I just start asking the model questions to see how well it understands the paper.

Can it summarize the conclusions? Actually, as a useful thing, I can say, oh, what follow-up work has been done in this area that I should be aware of? And it'll immediately return some interesting results from papers that cited my paper, things like that. So models are fully capable of doing that. Again, there's a slight error rate, because sometimes it will confabulate some new result that supposedly extends my earlier work but comes from a paper that doesn't exist. It can do that. Most of the time when it does this kind of thing, what it's saying is actually true, but some of the time it's not.

Again, running it through a generator-verifier pipeline suppresses that. And if I had an automated generator-verifier pipeline, which I'll talk about later in the episode, the building of such an automated thing would make this whole process that much easier, rather than having to do it manually.

Okay, so the, the old paper that I asked it about had to do with something called, non-linear modifications to the Schrodinger equation. So the Schrodinger equation in quantum mechanics and quantum field theory tells you how the state of the universe evolves in time, or the state of some group of particles or something, or quantum fields evolves in time.

And interestingly, the Schrödinger equation is completely linear, okay? So the state itself only appears to the first power, in some sense, on both sides of the equation; it's equation one on the screen, if you can see this. A consequence of that is that if you have two different solutions to the Schrödinger equation, the sum or the difference of those two solutions is also a solution.

Okay? That's what linearity means. And that's quite interesting. It's deeply related to things like quantum superposition and the persistence of quantum superpositions in nature. So linearity is a very deep property of quantum mechanics. I think scientists who aren't just slavishly absorbing ideas from textbooks have a right to ask: gee, is this linearity property exact? Could there really be something in nature which is exactly linear, or is it just some kind of useful approximation, and eventually we'll discover corrections to the Schrödinger equation, modifications which maybe are small, with a small coefficient, but which include non-linearities?

Okay, so this is a super important fundamental question in physics, and we don't know the answer. This was studied by Steve Weinberg in the 1980s. He's one of the great physicists of the second half of the 20th century, a Nobel Prize winner, one of the inventors of the quote-unquote Standard Model of particle physics. And for some time he was studying: gee, is there any way to modify quantum mechanics in a meaningful way and test those modifications? He was thinking about non-linear corrections to quantum evolution. And my old paper on this, which I'll just flash on the screen briefly, is this paper from 2015; I guess that was the last version that we uploaded.

I think it was published in late 2014. The title is Locality and Non-Linear Quantum Mechanics. And I was asking the model questions to see how well it understood this paper, because actually most of my colleagues wouldn't understand this paper; it's a little bit out of the box. What we're doing here is allowing quantum mechanics to have non-linear corrections, and we're showing that those non-linear corrections allow distant objects to instantaneously become entangled.

So it allows entanglement which is, in a sense, faster than light, which is highly problematic. Entanglement can build up in a faster-than-light way, we show. And this is related to what was considered the main problem with Weinberg's proposal of adding non-linearity to the Schrödinger equation: it seems to violate all kinds of properties like special relativity. It also allows, and this sounds kind of fanciful, but it's true, people who are in different Everett branches of the multiverse to signal to each other, to send messages to each other. So there's all kinds of wacky stuff that happens if you allow this non-linearity. Some physicists would just say, don't think about non-linearity; it's so obvious that it can't be part of the way nature works, and quantum mechanics is just exactly linear. Which could be true. I suppose I would bet in that direction, but it's an amazing conclusion and it deserves to be tested, right? You need to test this conclusion.

Okay. So here in the paper on the screen, even though this companion paper is meant to be readable by AI researchers, I do have a little bit about the physics in here. I sort of review a little bit about the structure of quantum mechanics and then this linear versus non-linear question that I've just described. Here on the screen you can see the prompt that I gave to GPT-5. I was testing a variety of models on a variety of my old papers; I've been doing that for quite some time, just to see how well they understand them, and sometimes they understand them better than my colleagues, just to be totally honest.
Okay. Now, what's the prompt here? The prompt is: compare and contrast studies of non-linearity in non-relativistic quantum mechanics. That's the old work by Weinberg and some famous follow-ups by two physicists called Gisin and Polchinski. Compare and contrast studies of non-linearity in non-relativistic quantum mechanics with treatments that incorporate what is known from quantum field theory:
that the nonlinearity must involve the entire wave functional, which describes a potentially infinite number of degrees of freedom, spacelike-separated modes. Okay. Actually, I would probably write that a little bit differently now, but that is literally what I wrote to the model on that fateful day. Okay.

And so that is what it is; I just included that prompt in the paper for historical reasons. The paper that I showed on the screen, the one I wrote 10 years ago, is about the extension of this non-linearity analysis to quantum field theory, which is different from the quantum mechanics of just a few degrees of freedom. It's the quantum mechanics of entire fields that occupy all of spacetime, and so there's a qualitative difference in what's going on. Okay. So the model responded, and I was a little bit surprised, with a beautiful, very technically correct and also conceptually correct summary of our paper.

Again, we went through multiple rounds with referees who clearly didn't understand what we were doing, though ultimately the paper was published, but the model understood it right away. And then, at the end of its summary response, the model tried to be helpful; I think GPT is reinforcement-learning trained to try to be helpful and make helpful suggestions.
It said: if you want, I can sketch the Tomonaga-Schwinger version of this, evolution by spacelike hypersurfaces, to show explicitly why a hypersurface-local generator that depends nonlinearly on the global state cannot remain foliation independent without collapsing back to linear dynamics. Okay, so it's offering to show me a calculation. It's doing this in a technical framework, the Tomonaga-Schwinger formulation of quantum field theory, which is a little bit obscure. And it's sort of giving the answer: oh, there's a problem with something called foliation independence if you allow non-linear terms.

Okay? So I was quite intrigued by that. Of course it could have been one of these crazy hallucinations, but I was willing to go further and explore it. And so my next prompt, in response to its response, was: yes, sketch the Tomonaga-Schwinger version of this.
Okay, so now, again if you can see my screen, you'll see there's a long response; the response is actually much longer than what I excerpt in the paper here. But within a few seconds the model summarizes the Tomonaga-Schwinger formulation of field theory. It introduces something called foliation independence of the evolution, and something also called the integrability condition, which allows you to take the state, the wave function, on one time slice and evolve it into the future onto other time slices.

And then there's a certain condition that you want to be true. You want small perturbations to the way you slice spacetime not to matter. In other words, these slicings are just arbitrary coordinate systems that I can use to describe what's going on in the universe, to describe the location in space and time of events. If I modify the coordinate system I use slightly, it shouldn't affect the physics. That's foliation independence. And it then derives some nice conditions for foliation independence in the presence of a non-linear correction to Schrödinger evolution. Okay. And again, if you're watching my screen, you can see me.

I'm scrolling; these equations are getting a little bit long. If you're not a mathematician or theoretical physicist, these probably look like pretty formidable equations. I actually stop once it's introduced the key equation; it actually did a bunch of other stuff as well, but I didn't put that in this companion paper.
And what's remarkable here is that this equation, the starred equation that I have on the screen here, turns out to be correct. The further analysis of this equation is quite fascinating, or at least it was to me. It led to an interesting paper, which has now been published in Physics Letters B, and I don't really claim any credit for this particular equation.

The model just generated this equation. Okay. So I thought that was kind of interesting, worth pursuing to see if the equation and the idea here are productive and make sense (they do), and whether it's publishable in a journal (it was). And then I wrote the companion paper just to give this as an example of the model coming up with an innovative idea.

I mean, imagine a grad student or postdoc or colleague came into my office and said, hey Steve, I read this old paper you wrote 10 years ago, and it occurs to me we could apply the Tomonaga-Schwinger formalism to this, blah, blah, blah, and I get this equation. And then my colleague, she and I or he and I, worked for the next month to further explore the consequences and then wrote a paper. That's more or less what happened, except it happened much faster, and the colleague here was an AI, not my flesh-and-blood human colleague from down the hall. Okay. So that's what happened. Some people are mad about this. People who don't like OpenAI are mad about this, I don't know; GPT-5 is OpenAI's model.

Some people who think AI can only generate slop are mad about this. But I'm just being completely transparent about what happened. I didn't set out to write an AI-generated paper. I actually was just testing the models to see whether they could understand my earlier work. But the model then generated something interesting, which I started exploring, and it was worth writing up as a paper.
Okay. So by the way, a sociological comment: the reactions to this work have been really strange and all over the map. Some people are excited by it; some of the pro-AI, e/acc type people retweeted it. I think the tweet that I made a week ago about this work, after the paper had been accepted at the journal, was viewed by something like half a million people worldwide. So a lot of people were interested in it, but I also got angry attacks over it, like, how dare you publish AI slop in a journal? Okay, whatever. Some of the people who were saying it was AI slop can't even understand the paper, let alone my old paper from 10 years ago. But hey, they're qualified to say it's AI slop. I don't know. Okay. In this section I talk a little bit more about false directions that can come from the model. This mode of confabulation, deep confabulation, I would say is probably the most time costly for expert scientists: the model proposes something which superficially looks very plausible, and it's meant to be plausible.

Remember, these models are trained to generate plausible text. In a sense, that's all they are: they generate plausible text, which looks like the 15 or 30 trillion tokens they've been trained on, right? Which includes all physics papers and all math papers. So there was a case of this in following up on this Tomonaga-Schwinger stuff.

One of the things the model suggested was: oh, we can use these results, something called the Reeh-Schlieder theorem, or the split property; we can use these profitably in further analyzing the Tomonaga-Schwinger integrability conditions. And because I'm not an expert in that area, I had to go away and think about this for a while.

And eventually I realized that what the models were saying to me was, I think, pretty much incorrect, the problem being
that some of the assumptions from axiomatic quantum field theory are themselves probably invalidated by the state-dependent modification. So one can't really prove properties of state-dependent quantum mechanics from those standard axioms of quantum field theory if in fact those axioms are invalidated by the modification of the dynamics of the theory. Okay. So this was probably the most time-costly wild goose chase that the models led me on. I will say that the generator-verifier procedure I described earlier did actually suggest that this was a wild goose chase, because when I fed these axiomatic quantum field theory outputs into other verifier instances of

the same or different models, I did not find convergence. One model would say, yeah, this is fruitful; another model would say, no, this is wrong, or would object in some more detailed way. It would oscillate like that; it never converged. So one rule of thumb, if you're not going to go and learn this other area B (in this case axiomatic field theory) to the point where you can judge for yourself whether what the model is offering you is good or bad, or has value, is that you can cycle through a diverse set of models and just look to see whether there's convergence. If there's not convergence, that does not automatically mean the proposal is wrong, but I think there's a good chance it's wrong.
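(Editor's note: purely as an illustration of the heuristic just described, not the actual tooling used, here is a sketch of a convergence check across a diverse set of models. The `chat()` helper and the scoring rule are placeholder choices.)

```python
# Sketch of the convergence heuristic: put the same claim to several
# different models and treat lack of agreement as a warning sign.

def chat(model: str, prompt: str) -> str:
    """Placeholder: call your model provider of choice here."""
    raise NotImplementedError

def convergence_check(claim: str, models: list[str]) -> float:
    prompt = (
        "You are a careful theoretical physicist. Is the following claim correct "
        "and likely to be a fruitful research direction? Begin your answer with "
        "VALID or INVALID, then explain.\n\n" + claim
    )
    verdicts = [chat(m, prompt) for m in models]
    agree = sum(1 for v in verdicts if v.strip().upper().startswith("VALID"))
    # Near 1.0 or 0.0 means the models converge; values near 0.5 mean they
    # oscillate, which is the strong negative signal discussed above.
    return agree / len(models)
```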

Conversely, for many things, when you put them through the generator-verifier pipeline there is convergence: at the end you get multiple verifications, many models agreeing that this is probably actually okay. Now, even that is not a guarantee that the result the model is telling you is correct.

It still could be wrong, but I think probabilistically, lack of convergence is a fairly strong negative signal, and convergence is an at least somewhat strong positive signal of correctness. I'm not saying you should use the models to be the final judge of any of this, but using them in this chained way does increase the probability that you're getting good stuff out of them and not wasting your time.

So I think it makes the value proposition strongly positive for expert researchers. Now, if you're not an expert researcher, I think you're still kind of in trouble, because occasionally the models will just lead you on a completely crazy wild goose chase, or even make some very simple error that you don't catch. And as I comment in this companion paper, consider someone who's in their third or fourth year of a PhD program. If you're not in academia, not an academic scientist, you might think that person must already know a lot about their research area.

You might think that person is already an expert, but from the perspective of frontier research, someone who's a year or two away from finishing their PhD is still kind of a novice. And I would honestly say that people at that level, using the models, could easily generate tons and tons of subtly wrong slop, which is kind of a bad outcome. Now, some critics of my paper and of what I'm talking about would even say: oh, Steve Hsu, an experienced physicist who's been publishing papers for, I dunno, 35 or 40 years now, even he has been taken in by these AIs and has written a slop paper. I don't think that's actually true, but the point is that I think a high level of expertise is still required from the human in the loop to get good stuff out of the models.

So I think that's true, but it's also true that there are thousands of people, say in physics or math, who are real experts, who have published many papers and really know their sub-discipline. Those people, if they use the models judiciously, perhaps in this generator-verifier way, can experience a very big productivity boost in their research, using the models to do computations,
literature searches, maybe sometimes proposing new ideas, generating the LaTeX output, proofreading papers, and doing really tedious stuff like putting in the citations and the bibliography. There are lots of things that models can do to save you time. And so my message is: students, be extra careful, but start spending some time with the models just to learn how to do it.

Experts, try it: you may find this to be extremely useful and productive, but again, you should also be careful not to be led on wild goose chases by these models. Okay, let me go on in the companion paper. Here I sort of summarize the results. I will go into this in more detail after I complete this part of the episode, which is for non-experts; I'll actually go through some slides describing the paper, so I'll come back to this. Here is a prompt that I used with many different models. When the paper was nearing completion, or at least when we were well into the research, I asked the models: do the TS (Tomonaga-Schwinger) conditions for state-dependent quantum mechanics exist elsewhere in the literature?

Question mark, do a careful search. Okay. So I kept pushing them just to make sure, because one of the concerns when you get something that looks novel from an AI is that it actually read it somewhere else, so it isn't actually novel. This is something AI researchers care a lot about: the result isn't actually novel,
it read it somewhere and it's just reproducing it for you. So I checked repeatedly for that. I've also now published the paper and other physicists have looked at it. No one has come forward and said, no, this is actually derivative, this has been done before,
there's a paper with this exact equation. So that doesn't seem to be the case. And I think models are useful for that too, to do a literature search and find papers where the result could have been anticipated. In this case, what I think actually happened is that the model understood

that we were interested in non-linear modifications of the Schrödinger equation and in quantum field theory, and it itself made this combinatorial connection between the Tomonaga-Schwinger work and the Weinberg-like work from the 1980s, or the work from my old paper from 10 years ago. It combined the two to produce these new ideas and these new equations. So, again, is this the deep kind of scientific innovation that, say, Einstein or Max Planck produced? No, it's more like ordinary research, where you combine two ideas, you get something interesting, you push it forward, and you write a paper.

So I think that's how I would characterize what happened here. But it's still interesting, because the AI did it, not my colleague down the hall. The AI did it. Okay. I submitted the paper to the journal and the referee wrote a report. One of the things the referee wanted me to do was actually use these new equations to test a recent set of models which had been proposed by two very good physicists, David Kaplan and, I don't know if his colleague is a student or a postdoc or another professor, Rajendran. It's called the KR model. And the KR model proposes a particular form of non-linearity for quantum mechanics and quantum field theory.

And they've written extensive papers on this. There doesn't seem to be any paper in the literature that says, hey, this model has some problems with special relativity, or non-local signaling, et cetera. They were deliberately trying to avoid the original problems that the Weinberg work from the eighties had, which were pointed out, as I said, by Gisin and Polchinski. So KR were aware of this and were trying to deal with this problem in their particular model. When I ran their model through my new TS equations, it doesn't satisfy the TS equations. Okay. So it seems like their model doesn't necessarily have the properties that they thought it had.

Again, this is probably something that we're still gonna have to work out amongst ourselves, the handful of physicists who are looking deeply into this. But in any case, to respond to the referee report, I ran the KR model, which I had not looked at in detail before, through the AI and just asked it, as you can see in the prompt here, to calculate the TS integrability conditions in the KR model and discuss the physical meaning of the results. Now, what's interesting about this is that doing the calculation requires a kind of geometric analysis of the past light cone, the set of events in spacetime that, assuming a finite speed of light, could have affected a particular point in spacetime.

And so the calculation involves the overlap between two different past light cones. You have a point x and its past light cone, another point y and its past light cone. So there's some geometry here. And the models are, I think, not very good at geometry; they don't really see things. At least, the multimodal models might see things, but these language models generally don't. So I was a little bit surprised that in this case GPT-5 was able to correctly figure out what the overlap between these two past light cones was. It developed a very nice notation and answered the question as to whether there was a non-zero contribution to these integrability conditions coming from the KR model.

And I like this notation very much. It defines the set J, which is the past light cone, and talks about the intersection between J of x and J of y, et cetera. Now, this is probably standard notation in some subset of papers, but if I had invented it myself, I would probably have done it slightly differently, and maybe not as economically. I think the model did a great job; after all, this is a few seconds of work and the model did it right. Gemini got it wrong, and Qwen-Max also got this wrong, but then I cycled through the verifier steps multiple times and they agreed that this was actually right. I checked it myself; I think it's right. Okay. But that's just another example of the use of models in the research.
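(Editor's note for listeners without the screen: the causal-past notation being described is, I believe, the standard one, written schematically below. The symbols are a rendering of what is being discussed, not copied from the paper.)

```latex
% Schematic causal-past notation (standard convention, assumed here):
% J^-(x) is the causal past of the spacetime point x.
\[
  J^-(x) = \{\, z : z \ \text{can causally influence} \ x \,\}, \qquad
  \mathcal{O}_{xy} \equiv J^-(x) \cap J^-(y).
\]
% The question the model was asked amounts to whether a non-zero contribution
% to the integrability conditions at spacelike-separated x and y arises,
% which requires keeping track of the overlap region O_{xy}.
```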

Okay, so let me conclude this general discussion, which is aimed at non-experts. I think this research article and how it was produced is a decent case study for where the models are right now, and for how human experts, in this case, as I said, highly expert people, well beyond PhD level in a particular field, can use the models profitably to accelerate discovery and scientific research.

I think it's pretty clear that if you know what you're doing, you start working with the models, you gain some experience, maybe have the models look at some of your old papers and discuss them with them, you'll get a sense of where they are. And then, if you use this generate-verify procedure in a disciplined way to beat down the errors, you get a very useful tool. And actually, I've built a prototype, something that operates in your browser, that automates this pipeline to some degree. So I've been experimenting with that. Thanks to the people at xAI for inviting me to visit for a week and build this in collaboration with them, using the Grok model as the engine that powers it.

But I've set it up in such a way that you can use any model to power it. I'm still playing with this, and I think that by chaining together lots of models you can have a supermodel that's much better. That's also quite similar to what the DeepMind people are doing with Co-Scientist.
And I continue to work with them on Co-Scientist. But anyway, stay tuned; this is gonna get a lot better. I think it's good for scientists of all types to spend some time with the models and understand how well or poorly they perform in your particular sphere of expertise, and then how you can use them profitably to improve your research productivity.

Okay. All of this is just gonna get better because as I said, the, the models can pretty much solve any textbook type problem. They're getting to the point where they can solve, you know, what are considered really hard international math Olympiad or Putnam problems. And I think they just lack the deepest level of understanding that an area expert, someone who had been doing, say, work in axiomatic field theory for 20 years, they don't have that level of comprehension.

But I think we can easily generate the training data that will close that gap if, if we get a lot of scientists using the models to actually do research. Not just students using it to cheat on homework or something, but actual frontier, the best brains on the planet, human brains interacting strongly with the models on a day-to-day basis and generating training data.

I think we can make another step forward. So stay tuned for that. I think that's the project that I'm most excited about at the moment. In my acknowledgements, I acknowledge Juraj Gottweis and Vivek Natarajan, who are both at DeepMind; they're co-creators of this AI Co-Scientist tool.

I didn't actually use Co-Scientist in the research that I'm talking about, although I did use Co-Scientist at the end to check, and it gave a pretty good report on the work; it agreed with the results, et cetera. However, a lot of my intuition for how to use the models, what can go wrong, and how you can get led on wild goose chases was built up in collaboration with these guys. I was led on some wild goose chases by Co-Scientist, but, as with all models, I'm not singling out Co-Scientist for this. I continue to work with these guys, and I hope we can produce an even better research result using Co-Scientist.

It just happened that this work was done the way it was; it spontaneously happened because of something that GPT-5 responded with. Okay, so I think that's mostly what I wanted to talk about in the part of this episode that is, shall we say, appropriate for non-experts, for people who are interested in AI but don't necessarily know that much about physics. Actually, let me add one more thing. The models don't really understand physics the way we do. I think the most pessimistic interpretation of what generative large language models are doing is that they're only stochastic parrots: they can basically only recombine stuff they've seen before in some kind of quasi-stochastic way that sort of looks right, and that's the extent of their capabilities.

I think that's too pessimistic. I think they're definitely better than that. I don't think we're really that close to AGI; I think that's somewhat further away than people would have said a year or two ago, and I think that's gradually becoming a consensus among a lot of AI researchers.
Here's a sort of intermediate, pessimistic take on what generative AI is doing: they are able to generate text that resembles human reasoning, but they do not themselves reason the way that we do, or the way that a true symbolic manipulation program does. They're not manipulating concepts with hard rules the way that Lean does, for example.

But even if they only generate text that resembles human reasoning, without actual reasoning ability of their own, those outputs are still useful to us. So suppose it generates something that it didn't really reason its way to, but nevertheless somehow produced, where it says: oh, you know, these Tomonaga-Schwinger conditions are probably violated by a lot of these nonlinear modifications of the Schrödinger equation.

Maybe it didn't reason the way an ordinary human brain, or Mr. Spock, would reason to get to that conclusion. Maybe it just sort of blurted that out, and it blurted it out because it resembles stuff that it had seen before. It still could be correct, just as the proofs for these IMO problems are actually correct, right? Even though it may not be reasoning the way Lean does, which is a symbolic, genuinely axiomatic kind of reasoning program, it still might generate stuff that is correct and that is of value to a human expert using this system. So that's my position on where we are.

And I welcome your feedback, by the way. If you're an AI researcher or a theoretical physicist or a mathematician and you have thoughts on this, and I'm sure a lot of people are thinking about this, just reach out and contact me, because I would love to discuss it more. In fact, if you're doing research in this area and have interesting results about using AI in your own work that you want to talk about, I would love to do a future episode about that.

I'm going to go through some slides now, which, again, if you have access to my screen, you can see. They're meant to explain the actual physics content of the research paper that was published in Physics Letters B. So if you aren't familiar with terms like relativistic covariance and Tomonaga-Schwinger analysis, this might be hard for you to follow; feel free to sign off. But for the few physicists who want to hear this, I think this is just a good place to cover the results. It's not gonna be a full-blown talk; I'll just try to go through it briefly. The title of the paper is Relativistic Covariance and Non-Linear Quantum Mechanics, Tomonaga-Schwinger Analysis. On the screen right now, I have some slides. Here's the actual paper that was accepted at Physics Letters B. It's a relatively brief paper, just the typical length of a Letter.

I'm just scrolling through it on the screen here so you can glance at it, but you can find it on arXiv, and I think even the Physics Letters B version is open access, so you can grab it and look at it. I encourage you to; you can email me and yell at me, as some people already have.
But anyway, let me come back to the slides. These are slides that, by the way, I had the AI prepare. I literally took the LaTeX from the paper and said, make some slides so I can quickly give an overview of the results. What it gave me was not perfect, but I was able to refine those slides; it took maybe another half hour. And so that's probably a relatively small fraction of what it would normally take me to make up slides for a short talk, probably a third of the normal amount of time or less. Okay. All right, so on this slide now, I'm gonna kind of assume you can see the screen.

So if you're riding your bike and listening to this podcast, sorry, I'm gonna kind of assume you can see my screen. Okay. So in equation one I have the Schrödinger equation. Psi is the state vector, and the thing which generates time evolution for psi is the Hamiltonian operator. As you can see, the state psi only occurs linearly in this equation. So consequently, if I write a superposition, a times psi one plus b times psi two, where psi one and psi two are solutions, then the resulting superposition state is also a valid solution. This has all kinds of consequences. Maybe the most colorful one is the many-worlds description of quantum mechanics, where you could have different branches in which radically different things have happened, but they coexist comfortably in the superposition state.

And there could be a version of me which gets entangled with one set of outcomes, and another version of me whose memory records are entangled with another set of outcomes, and linearity means those two branches of the wave function can just keep evolving in the Hilbert space forever.
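(Editor's note for audio listeners: equation one as described on the slide is the standard Schrödinger equation; the rendering below uses conventional notation with hbar set to one, which may differ slightly from the slide.)

```latex
% Equation (1): linear Schrodinger evolution (standard form, hbar = 1).
\[
  i\,\frac{d}{dt}\,|\psi\rangle \;=\; H\,|\psi\rangle .
\]
% Linearity: if |psi_1> and |psi_2> are solutions, so is any superposition
\[
  a\,|\psi_1\rangle \;+\; b\,|\psi_2\rangle ,
\]
% which is what lets distinct branches coexist and keep evolving independently.
```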
We could talk more about decoherence, et cetera, but that's not the point of this discussion. Okay. But a fundamental question is: is the linearity in this equation fundamental, or is it just an approximation? And this is something mostly people interested in the foundations of physics focus on, because if you talk to some guy in fluid mechanics who's looking at the Navier-Stokes equations, he knows that at some point Navier-Stokes stops applying and you have to deal with the individual motions of water molecules or something like that.

He knows that. But here we're talking about supposedly the fundamental laws of physics, and it is a question worthy of a lot of attention: whether there is any kind of structure in the true fundamental description of our universe which is exactly linear. I know we're all conditioned by our elementary quantum mechanics classes to just accept the form of this equation and that it's linear. But if you think about it carefully, for someone to say that this is exactly linear is a very deep ask. Okay? At least for a physicist; maybe a mathematician is totally happy with that because they don't care. But we care about physical reality, and gee, did it really have to be that way?

Did it really have to be exactly linear? Okay, next slide. This is sort of repeating a little bit of what I said, but you could imagine a modification of Schrödinger evolution in which the evolution operator, written as F in equation two, is itself a function of the state. As I mentioned earlier in the episode, there was a very thorough set of proposals by Weinberg to study this kind of possibility, with specific examples and specific tests using atomic physics. Here we're interested in the field-theoretic generalization of this: not just psi describing the location of a particle or something, but the quantum state that describes an entire field.

So the electric and magnetic fields everywhere in the universe, or the Higgs field everywhere in the universe: there is actually a state in the Hilbert space that corresponds to that whole thing, and quantum field theory describes the evolution of all of it, not just the wave function of a few electrons but, in effect, the wave function of all the degrees of freedom in the universe evolving in time.
And there may be a Schrödinger-like description of that; at least there is one in quantum field theory. Okay. So, amazingly, if you go all the way back, the Tomonaga work was done, I believe, during World War II or in its wake. They were interested in this question: how do I unify ordinary quantum mechanics with special relativity?

The result of that is something called quantum field theory, and the question is: what equation is the analog of the Schrödinger equation for a quantum field? For that you get the Tomonaga-Schwinger equation, because you're talking about special relativity: things have to be covariant with respect to coordinate transformations, boosts, Poincaré transformations, and
the equivalent of pushing something forward in time. The original Schrödinger equation tells you how to push something forward in time, and in field theory you need a way to do that too. But Tomonaga and Schwinger had something which they called many-fingered time, because the normal vector to the Cauchy slice, or time slice, could be slightly different depending on where you are, depending on your coordinate system.

This functional derivative with respect to sigma of x is the moral equivalent of pushing something forward in time; sigma of x is this kind of normal fluctuation away from one of the Cauchy surfaces. And in order that spacelike-separated places on a Cauchy surface can't talk to each other, that is, to preserve relativistic causality, it should be the case that deforming the Cauchy surface slightly at x and at y should commute: it should not matter whether you do something at y and then something at x, or at x and then at y, because by causality there hasn't been enough time for any causal influence to go from x to y.

So there's a condition, original to the Tomonaga-Schwinger formulation (I think Schwinger was probably the first one to actually write it this way, but I'm not sure), in which this commutator is required to vanish. Now, in ordinary linear quantum mechanics, this automatically vanishes if you have the microcausality condition that quantum operators commute at spacelike separation. This is the old Tomonaga-Schwinger result in linear quantum mechanics, okay? And what the model proposed to me was to say: well, if you're gonna include state-dependent modifications to Schrödinger evolution, then you have to modify the Tomonaga-Schwinger equations. This N hat is the nonlinear functional of the state psi which is introduced, and then you can start asking what happens to these commutation relations, these integrability conditions, in the presence of N hat. And you need to introduce some technical machinery: you need to be able to take a functional derivative of the N hat nonlinear term with respect to variations in the quantum state or in the coordinate system,

That is, with respect to pushing the Cauchy surface slightly forward by δσ(x). So you have to develop a little bit of mathematical technology to deal with that. The models just did that. I mean, I didn't do it; the models did it. I went and checked it, and it seems to be correct. Okay.
So you generalize this integrability condition, that the commutator of these two functional derivatives has to vanish on any state; that has to be true if the theory is satisfactory. And when you work this all out, you get this operator constraint. This is the starred equation that I mentioned in the earlier part of this episode. The starred equation is a necessary but not sufficient condition for being able to integrate forward in time from Cauchy surface to Cauchy surface in a Lorentz-covariant way, in a relativistically covariant way. It's necessary, but it's not sufficient.
There could be some global obstacle to doing it. But if this fails, you're really in trouble, because even on some arbitrary slice you can't quite go forward in a relativistically covariant way. So I think it's a reasonable test or criterion to subject a model to. And if a model doesn't satisfy this condition, of course, it doesn't mean the model can't describe reality, but it means that reality isn't fully relativistically covariant. So that's the underlying physics, the underlying conceptual basis for this. On this slide, I just have a straightforward composition check.
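I won't try to reproduce the paper's starred equation exactly, but schematically, if the generator of surface deformations picks up a state-dependent piece N̂(x) alongside the Hamiltonian density Ĥ(x), then demanding that two spacelike-separated deformations commute produces a constraint with roughly this structure (my reconstruction; signs and factors depend on conventions):

\[
\Big(\big[\hat{\mathcal{H}}(x)+\hat{N}(x),\;\hat{\mathcal{H}}(y)+\hat{N}(y)\big]
\;+\; i\,\frac{\delta \hat{N}(x)}{\delta\sigma(y)} \;-\; i\,\frac{\delta \hat{N}(y)}{\delta\sigma(x)}\Big)\,\Psi[\sigma] \;=\; 0.
\]

The functional-derivative terms are the new ingredient: they appear because N̂ depends on the state, which itself changes as the surface is deformed. In the linear theory, with N̂ = 0, this collapses back to the familiar condition that [Ĥ(x), Ĥ(y)] annihilates the state.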

People who work in this area, Tomonaga-Schwinger type stuff, started talking about these slight variations in how the Cauchy surface looks at x and y as little bubbles. And so you can check the conditions that were derived by looking infinitesimally at the effect of two different bubble perturbations, and you get the same equation. Okay? Now here's an example: the original nonlinearity introduced by Weinberg. You have some arbitrary operator O, and the expectation value of O times the operator O is added to the right-hand side of the Schrodinger equation. So this particular term, which is shown here, affects the quantum evolution, but the c-number coefficient of the operator O is that expectation value in the first equation on the slide.
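As a concrete sketch of that Weinberg-type term (my notation; the normalization and coupling on the slide may differ):

\[
i\,\frac{\partial}{\partial t}\,\lvert\psi\rangle \;=\; \Big(\hat{H} \;+\; \epsilon\,\langle\psi\rvert \hat{O} \lvert\psi\rangle\,\hat{O}\Big)\lvert\psi\rangle,
\]

where ε is a small coupling introduced here just for illustration. At any instant the added piece acts like an ordinary operator, but its c-number coefficient ⟨ψ|Ô|ψ⟩ is computed from the very state being evolved, which is exactly the state dependence described next.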

And that expectation value depends on the state. So this is a state-dependent modification to the time evolution of the state itself; it's a nonlinearity. And so you can stuff this into the equations that we've derived, the Tomonaga-Schwinger integrability conditions, and grind it out. The models do this pretty much effortlessly. I think maybe there were some slight errors in various models when they tried to do this, but those were all suppressed by the generator-verifier process, and I checked it myself. So anyway, what's interesting is that this set of results for the specific Weinberg-type term, when you simplify the whole thing, ends up being proportional to a commutator between two operators.

And if x and y are spacelike separated, then conventionally, in ordinary quantum mechanics, we would say: oh, this commutator is zero, because of causality. And so for this particular Weinberg term you do get foliation independence, the integrability conditions are satisfied, but modulo this assumption about the operators that appear here, the original one O but also the Hamiltonian: that microcausality still holds, that the commutator between any two operators at spacelike separation is zero. If that continues to hold, then you're okay; you've satisfied the integrability conditions.

Now, the subtle part of this paper is: is that really reasonable? Right? If I've modified quantum mechanics to have this state dependence or nonlinearity, can I still maintain microcausality? Okay, so that's the question. So let me just review what we usually say in textbook treatments about microcausality. Normally we have an initial Cauchy slice, or we can just say an initial time, and we impose what are called equal-time commutators. If you've ever studied quantum field theory, if you took an elementary course, you would have seen this in a textbook: at some initial time we impose these so-called canonical commutation relations between the field operators and their conjugate momenta.

The fields φ(x) and φ(x′), with x not equal to x′, commute at equal time. If you go into a different Lorentz frame, instead of being at equal time they could be at slightly different times, but they're still spacelike separated. So it's these equal-time commutation relations which give you microcausality, because of relativistic invariance. Okay? Now, once you have that on your initial Cauchy slice, if you define the operators at different times, say Heisenberg-picture operators, by using the standard time evolution operator, then you can start with the canonical commutation relations on one Cauchy slice and show that, because the time evolution operator has this specific unitary form, where you integrate over the Hamiltonian to get it, the microcausality condition holds at all times, on all Cauchy slices, not just the one where you first imposed the canonical commutation relations. So this is the standard textbook treatment.
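Spelled out, the textbook statements being referenced here are, in my own notation for a scalar field, the equal-time canonical commutation relations

\[
\big[\hat\phi(t,\mathbf{x}),\,\hat\pi(t,\mathbf{x}')\big] \;=\; i\,\delta^{3}(\mathbf{x}-\mathbf{x}'),
\qquad
\big[\hat\phi(t,\mathbf{x}),\,\hat\phi(t,\mathbf{x}')\big] \;=\; 0,
\]

together with Heisenberg-picture operators defined through the fixed unitary evolution operator

\[
\hat\phi(t,\mathbf{x}) \;=\; \hat{U}^{\dagger}(t)\,\hat\phi(0,\mathbf{x})\,\hat{U}(t),
\qquad
\hat{U}(t) \;=\; T\exp\!\Big(-i\int_{0}^{t} dt'\,\hat{H}(t')\Big).
\]

Because Û is state-independent and built from a relativistically invariant Hamiltonian, the vanishing of spacelike commutators on the initial slice propagates to all later slices.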

Now, does this textbook treatment still apply in the presence of these state-dependent modifications of quantum mechanics? I would say it's questionable. The time evolution operator in standard quantum field theory is just this integral over the Hamiltonian, the first equation here, and of course it's totally linear: it's just some fixed operator which acts on the state and evolves the state, or an operator, forward in time slightly. The nonlinear case is much more complicated, because the actual time evolution is state dependent. You can't even actually speak of an evolution operator by itself. You can talk about how the state evolves, but when you talk about the operator at some future time, that evolution is itself state dependent. So the whole standard formalism for justifying microcausality goes like this: you start with the canonical commutation relations, and then, because the dynamics of the theory are relativistically invariant, you evolve the operators forward and conclude that on future Cauchy slices the spacelike commutation relations continue to be zero.

But that whole framework is gone now, because you can't really talk about operators in the Heisenberg picture; everything in the theory is state dependent. So I would say it's questionable: even though the Tomonaga-Schwinger conditions reduced to commutators which should be zero by the microcausality condition, we can't be sure of the microcausality condition, because we've modified the underlying theory. Okay, so that's a point that's made in the paper. One of the reasons I'm dubious about whether that microcausality condition continues to apply in the fully modified, nonlinear, state-dependent version of quantum mechanics is that in the old work, the original work that I asked the model to look at, which I wrote, a paper that's now more than ten years old, we actually looked at some physical things that can happen when you have these nonlinear terms in the Schrodinger equation.

And one of the things that you can show is this. Take a quantum field, and make some state at A and some state far away at B; A and B could be light years apart, say coherent-state wave packets at each location. Now introduce something like the Weinberg nonlinear, state-dependent term into the time evolution and evolve the state forward in time.

Even though A and B started out unentangled, in a product state, totally unentangled and light years apart, upon any amount of time evolution they generically become entangled, which is an acausal process, right? Normally the only way that two particles which are initially unentangled can become entangled is if they interact, and those interactions can propagate no faster than the speed of light.
So normally, in ordinary quantum mechanics, if I have two objects or wave packets which are some distance apart and they evolve forward in time, and the amount of time evolved forward is small (in units where c equals one) relative to their spacelike separation, then of course they cannot become entangled, because there's no influence or physical thing that can happen between them in such a short amount of time.
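To put the contrast in formulas (a schematic illustration, not the detailed construction of the old paper): as long as the two wave packets stay outside each other's light cones, linear evolution over a short time factorizes, to the relevant approximation, as

\[
\lvert\Psi(0)\rangle \;=\; \lvert\alpha\rangle_{A} \otimes \lvert\beta\rangle_{B}
\;\;\longrightarrow\;\;
\big(\hat{U}_{A}\lvert\alpha\rangle_{A}\big) \otimes \big(\hat{U}_{B}\lvert\beta\rangle_{B}\big),
\]

which is still a product state, so no entanglement can form. With a state-dependent generator the evolution need not factorize into Û_A ⊗ Û_B at all, and for the field-theoretic nonlinear terms studied in that old paper the product structure is lost immediately, however far apart A and B are.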

Okay. But that's violated the moment you introduce this nonlinear term, because the nonlinear term is state dependent. I mean, you have to work out the details, but that turns out to be the case. So this suggests to me that this microcausality condition is actually more subtle than you might think, because that instantaneous entanglement definitely violates microcausality. And so it may not be the case that the commutator of two spacelike-separated operators must vanish in the quantum field theory; that property may be violated once you introduce these nonlinear terms.
Okay. And in some examples that I looked at in the paper, the integrability conditions reduce to things which are proportional to these commutation-relation-type objects. In other cases, you can find explicit violations of the integrability conditions that come from the way the nonlinear term is constructed.

Okay? So it's a little bit complicated, you have to read the whole paper, but I think one of the most interesting physical questions is what happens to microcausality if you introduce any kind of state dependence or nonlinearity into quantum evolution. Again, the old work that I did with Chiu Man Ho (who, by the way, now heads an AI research group) ten years ago already suggested that microcausality cannot actually hold, in some sense, once you include such terms. The very old Gisin and Polchinski thought experiments showed something similar: Alice and Bob, sharing some entangled qubit resource, could communicate superluminally if there were state-dependent evolution in the Schrodinger equation.
So again, that's a slightly different physical setup there, but it also, you know, seems to violate relativistic causality. Okay. So I think there are still some open questions. It could be that we just need a completely, radically different way of thinking about these theories once we allow state dependence in the evolution.

And I think that is true. Let me finish by saying something on this really deep, fundamental question of: is quantum mechanics linear? And if so, you had better have a good reason to justify why it's exactly linear, because that would be an extremely strange result. Everything else in physics has some nonlinearity in it, right? It could be the case that, if you are adopting a quantum description of nature, which we are forced to do (empirically, we know nature is quantum mechanical), then the moment you allow some nonlinearity, any nonlinearity, in the quantum evolution of the state, you get immediate, violent non-locality: breakdown of microcausality, or Gisin-Polchinski superluminal signaling (signaling across Everett branches), or this Ho-Hsu instantaneous entanglement of spacelike-separated wave packets. I think those are all different manifestations of the same problem: if you allow any amount of nonlinearity in the state evolution in quantum mechanics, you have to deal with this. You don't have locality anymore. And it could be that some degree of locality is a fundamental requirement of the physics of our universe.

Or at least it's perhaps not a desirable property of a physical theory that the evolution of an atom that's in my brain is somehow affected by the state of some atom that's on Jupiter, right? That doesn't seem to be the case in our universe, and it's maybe not a desirable property of a theory. And it may be that restriction: that if you want some level of locality in a quantum field theory, you cannot tolerate any amount of nonlinearity in the state evolution. Okay. So I think that's the deep question, which still deserves a lot more research. I think most physicists are just sort of asleep on this.

They either just accept linearity or they're not thinking about it; I think that's the case for most physicists. But it is, I think, a deeper question, and it deserves more attention. That is, of course, completely independent of whether you use AIs to investigate this question. For myself, I continue to actually investigate this, and I am finding the AIs to be quite useful in that process.
Thanks a lot for your attention. Apologies to Manifold listeners who aren't interested in crazy physics ideas. If you didn't like this episode, hopefully the next one will be better for you. Take care.