Nikolay and Michael discuss planner row estimation — how it can cause performance issues, how to spot problems, and various options we have to fix them.
- ANALYZE (docs)
- Autovacuum config (docs)
- Statistics used by the planner (docs)
- CREATE STATISTICS (docs)
- Row count estimates (pgMustard blog post)
- Optimizer methodology (talk by Robert Haas)
- Tomáš Vondra on statistics and hints (an excellent interview we forgot to mention, sorry!)
What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov / @michristofides / @PostgresFM, or by commenting on our Google doc.
If you would like to share this episode, here's a good link (and thank you!)
Postgres FM is brought to you by:
- Nikolay Samokhvalov, founder of Postgres.ai
- Michael Christofides, founder of pgMustard
- Jessie Draws for the amazing artwork
Creators & Guests
What is Postgres FM?
A weekly podcast about all things PostgreSQL
Michael: Hello and welcome to Postgres fm, a weekly show about all things Postgres qr.
I'm Michael, founder of PG Mustard.
This is my cohost Nikola, founder of Postgres ai.
Hey, Nikola, what are we talking about today?
Nikolay: Hi Michael.
It's your turn.
Tell us, tell us please,
Michael: Oh, um, flip it on me.
I, I picked this one.
I think we're gonna call it row estimates, but this is, uh, yeah, based on, so this is about the planner, uh, query planner.
And this is again, based on a listener suggestion or request.
Uh, they didn't ask about this specifically, but they did ask about nested loops causing performance issues.
And, looking at any one part, Performance issue would be, quite a dull podcast, I think, maybe a better video.
But it really, a lot of these, especially when it comes to people thinking that the nested loop is the problem, it's normally due
to a bad row estimate at some or an inaccurate row estimate at some point, and a bad planner choice based on that low estimate.
So I wanted to dive into how Progres is making those decision.
And also some things that we can do as users to help it get better statistics and to help guide it in the right direction or, or not.
Things that we don't, maybe don't have at our disposal at the moment that other, other databases do.
So that's, that's the kind of thing I was hoping to talk about.
Nikolay: Sounds good, but what's wrong with nest loops?
Michael: Yeah, and this is the, this is the, um, reason I didn't want to call this, you know, nested loops causing issues.
It's mostly, so yeah, so what, what's wrong with them is a good point.
And I think it's about trade-offs, right?
So the reason we have multiple, we, a lot of the time the planner has multiple choices for the joint algorithm it chooses, and even the scans that it does, so it's not even just about joins.
It has a lot of choices and different options and.
It can make its mind up, out the, the fastest or most efficient way to, to do that.
, nested loops have some real advantages.
They've, they're very quick to get started.
They're very flexible.
They can be used in a lot of different cases.
But when the're two.
Relations that are being joined get large.
There can be more efficient or quick or quicker ways of joining those in Postgres.
We have hash joins and we have merge joins.
I think that's it, for the join algorithms.
So nested loops are good when, uh, of.
Relation on one side is smaller, quite small, right?
A low number of rows on one side,
Michael: Yeah, exactly.
One side and if, if, even if one side's large, if it's indexed, an nested loop can still be really good.
. And if they are both quite large, uh, nest, look, probably it's not a good idea to, to have it.
Michael: Yeah, exactly.
So we'll probably be better off with a hash join or in some cases if they're both sorted, a merge join, I guess.
But sometimes we don't have those options, right?
Sometimes the only.
If we, if we're not doing an quality operation, I think there, so it's, but equally, when people are looking into this,
if there's a good plan and a bad plan, if it, it, that's often when people think it's the nested loop that's the problem.
It, there was another option, right?
There was a better plan, a better path that could have been chosen but wasn't because it was being scored higher cost.
The costs were estimated to be higher.
A big input for those costs is how many, I'm gonna say Rose.
Uh, this might be another argument for Tules, uh, are being returned at each, um, or tuple, sorry, back to that old one.
Your version isles.
I propose Tules, but you switched.
So, if you see nest loop and, we think it should be different, we suspect, , some issues with statistics, right?
and first thing to check is to try to analyze, the table to recollect statistics or no?
What do you think?
Michael: Well, this is, I think where we get into it's probably worth us explaining up top, analyze, I guess what
you're talking about there is the process Postgres has for gathering statistics so you can analyze a relation.
I learned today actually looking into it, you can analyze just a single column, and there's a lot of flexibility
Nikolay: Or expression
, and also it, it happens automatically as part of auto vacuum or auto analyze, I guess is the, part of
Nikolay: Well, yeah.
Auto vacuum has three, three, um, tasks.
Nikolay: So vacuuming.
preventing, preventing transaction idea routes or freezing.
And also third option is, uh, o auto analyzing like recollecting statistics.
Michael: With two,
Nikolay: yeah, VA, what, what is fourth?
Michael: Does the VisibiliT map count as
Nikolay: Well, yeah, but it's can be considered as a part of vacuum.
Michael: Okay, cool.
Nikolay: Maintaining free face map and visibility map, but, uh, when analyzing, analyzing might happen together with auto vacuuming, but it may happen also.
Like auto, auto automatizing, uh, separately.
So everything happens in inside box, right?
Many options, but, right.
So not every time auto Im runs to process some table electrical statistics.
It, it, it's controlled by several settings.
Threshold scale factor, related to analyze part
Michael: Yeah, and I think it's fairly recent.
I didn't actually check, but one of the newer versions lets us set an analyze scale factor even for insert only workloads.
There was a problem.
There was a problem for append only use cases when you just add, add a lot of data and that's it.
And, uh, , WM didn't process it.
well, VM probably is not needed because you don't produce delete, uh, tops if you just ensure right.
But, uh, Aly.
Recollecting statistics is definitely what I would like to have.
And, , actually, many examples, educational examples and so on, they, they show like, okay, let's load something to a table and.
before, uh, you, like you, you must, uh, run, analyze yourself to have, proper plans.
But now you can just wait a little bit, right?
and it will do a job and then you can play with it.
But of course, uh, if you want to control, probably you just need to say, analyze a table name and that's it.
So actually the reason I wanted to.
Nikolay: 11 or 12 when it was like couple, couple of versions ago, maybe three versions ago.
Michael: Wow, that's getting quite a long time ago.
I remember when that was new.
Nikolay: I, I remember dfe, I, I, I, I cannot pronounce the last name, sorry.
Dar Dfe participated in this, discussion and, uh, and he, he, he was also in, uh, active in the Russian speaking community as well.
So I remember this case and it was quite like, so obvious improvement.
Like it was like one of those, changes when, when you have feeling, how come it didn't happen before, right?
But that's the nice thing.
It does keep getting better.
And the, the reason I wanted to, be specific about the analyze you were talking about there is that there's, a
few places people might see that word and there's analyze itself as a, as a, a keyword that gathers statistics.
You might see vacuum analyze.
as like a kind of parameter to vacuum that is the same thing.
So it, that's doing both, vacuum and analyze, but you might also see, explain, analyze, which I guess we're gonna talk about in a moment, uh, which is not related at all, just totally different thing.
so naming things is difficult and, sadly that's just one we need to, like, once you, once you know what it does, you realize it's unrelated.
But, it can be confusing for beginners.
terms, words are overloaded everywhere.
, the statistics term was also overloaded.
We can talk about statistics, double statistics, and the user statistics.
This, statistics, statistics which collects, which keeps, the column and expression stats.
And this is what is calculated when you run, analyze.
And also in, in explain plan, we also see some statistics, right?
We can also call it some, so like the, the terms are overloaded.
there, there are too few words in in language, unfortunately in any language.
So, back to our topic, if we see nested loop and we say we, we would like to have something else like probably, we.
Need to check it with, analyze, like, what will you do?
Like just run, analyze, and double check the plan or what
Michael: Yeah, so there's a, there's a, so I think goes back a step.
How are you even spotting that problem?
I would suggest that you're, you should be running, explain analyze, and trying to work.
So that's faint analyzed buffers.
Ideally, maybe even VERBOs format.
Nikolay: on Twitter there was a new saying like in buffers with.
Michael: I like it.
I think I saw you had a t-shirt like that as well, so that's fun.
So, so, so when explaining last buffers to check before you run, analyze to re recollect statistics.
Michael: Yeah, but specifically explainer alone won't help us here because explainer alone only gives us the estimated number of, rows being
returned at each stage of the execution plan, but with, explain, analyze buffers, but explain, analyzes the important part for this topic.
Because that'll give us not only the estimated number of rows, but also the actual number of rows, being returned and
it's mismatches there that can really, help us diagnose where this problems are rising, where the bad estimates are.
so yeah, that would
Nikolay: me, let me, let me, let me disagree with you here.
At least, at least partially.
Uh, you are talking about, , like step forward examples when we can run, explain large buffer so we see all details both for planning and execution and we understand something is wrong.
We don't want to have nest loop here, but what if our execution takes a hundred years?
Michael: Yeah, of course.
Nikolay: So if you say like, just explain is not, is not helpful, it's still helpful.
It'll show nested loop.
It will give us, understanding how planner thinks about our data.
And we know our data, right?
We know how many rows there and there.
So we say like, no, I would, I would if I, if I were you, apply the planner, right?
I I would do it differently.
So just explain in some cases and.
And, and it also can be help helpful right in, in some extreme cases when execution is long.
Michael: Very, very good point.
But then I agree with you about the second port of call.
It's often if you can see the relations involved and you can run and, and you think maybe analyze hasn't run recently on those, and you can just, uh, it,
there's, it is not a locking operate or it has very, very light locks so you can run, analyze, Tables in productional columns even, with, with very low risk.
And that might be, that might be enough to solve your problem.
Uh, it might be that the statistics were out of date.
Nikolay: yeah, we can check.
We can, we can check how, when analyze, uh, when.
In logs if we, if, if auto, auto is, is log is enabled and in recent versions threshold is 10 minutes, it should be zero, always zero.
Like all, everything, all rows, all and log interest related to auto vacuum should be there, in my opinion.
But we also have produced that all tables at user tables and we can check, , it has four, columns, two pairs, one pair for vacuuming and one pair for.
All, all pairs are timestamps.
I'm, I'm not talking about count.
Count is quite silly, metric, right?
Five, five times, well, zero means something, but five, 10 when, right.
So two pairs of timestamps.
One pair for vacuum, one is for auto vacuum, one is for manual vacuum, and two, times stepss for analyze, one for auto.
means like analyze job of auto com and second is for manual analyze and uh, we can see when it happened, right?
It might still be happening.
So also we're checking PPG start activity and you will see it there.
Uh, colon analyze or vacuum analyze so you
Michael: Yeah, good point.
It could be happening right now.
Nikolay: activity the query.
, it has, it has details for auto, vacuum jobs as well, so it's, it's, it's seen as regular session,
Michael: What's the like if on huge databases, how long do you tend to see analyzed taking out?
Nikolay: Well, it depends on, on, how detailed you want your statistics to be.
By default, we have, default statistics target hundred, like meaning a hundred buckets, but many people increase it, for example, 2000 sometimes more, and the bigger default statistics.
Is the longer, analyze will be, and it'll, it'll produce a noticeable, disc cayo.
Of course, the reading data, if it's not cashed in, in the buffer pool and, and or, uh, p cash, and also individual, columns can be tuned.
So Defo statistics target, it's like cluster wide setting, but some columns can be adjusted additionally with Al Table, I don't remember, like,
Michael: Or statistics.
Nikolay: like that.
You can say, I want a thousand buckets here, but a hundred buckets there and only 10 there.
Something like this, it's like fine tuning.
Usually I recommend like, avoid it unless you are a hundred percent do no.
In my, in my experience, I saw cases when people tuned it.
But they didn't we are not, um, very, clear in their intentions.
You know, it was like kind of experimenting and then it was abandoned and then you just have some leftover of past experiments.
So, uh, systematic approach should be like, you know, what you are doing.
And here we definitely want more detailed statistics.
So analyze can, depends on.
Table size and on number of buckets you want to collect for each, expression or, or comb.
So that's in fact going on to the statistics target.
I think that's a useful, next, so if Analyze doesn't fix your problem, there is, there's something interesting if you, if you are aware now where the problem
is, then if you, if you know the distribution of that data, if you've got a very skewed distribution, that's where I see, Extending or So increasing that target?
Well, I, I don't know if I've, I have the same opinion as you.
I, I personally don't think it's as sensible to change the global statistics target from, like increasing it from a hundred to a thousand.
But, but I've, I've seen some people have success with doing it at a, you know, with specific columns that they know are, distributed a certain way.
Nikolay: before I started to work with, uh, thin.
. I worked on the, holistic approach to this problem.
So for example, you say, okay, I have my database, I have many clones of it, and I have, I have my workload.
I can reproduce it reliably, like definitely can do it, at least, uh, part of it, right?
And then I, we, we were able to run, for example, 10 vm.
uh, in parallel to justice.
We can do it sequentially, of course, but why?
If we can do it in parallel, why not?
By the way, there will be question, do we have the same hardware everywhere?
So, so we need, we also added some micro benchmarks to compare like baseline cpu, same disk, same and so on, but to, to.
Pluralization is of benchmarks, is quite interesting topic itself.
So, and then you just compare, you, you take, uh, 10 values for your knob in this case.
Uh, the full statistics target and you then you compare many, many things.
For example, starting from regular latency and throughput.
You can also use producer statements to analyze particular query groups and see deviations.
Uh, And it's, it's, it's very, this, this is very good holistic approach.
By the way.
It's also possible to do it, for using thin loans.
But you need to focus of course on buffers, not on timing because, uh, timing will be quite volatile metric, but buffers will be, reliable.
So it's possible to, to do something there and see how, amount of work in terms of iOS or buffers will change depending on the four statistics target.
And in this approach, uh, the goal was, okay, we want to improve queries and if we change global value, which queries will be improved and which queries maybe will.
, we'll have some penalty.
Of course, you need to also to compare, analyze in this experiment.
And this is holistic approach.
Quite interesting actually.
If I like, I like this approach.
It's like I, I consider such approaches as enterprise ready because, uh, you answer all questions, cover all
topics here, and then you, you can make decision with a lot of data, but, uh, it requires efforts of course, right.
Michael: And, and.
I haven't seen as many cases where the, where other queries are slowed down in these cases.
Definitely in other performance cases.
But I hadn't considered a really good point you made around the speed of analyze being very de
Michael: this and, and then what?
That's, that's crucial for things like, major version upgrades because if we, when we do a major
Nikolay: plan changes.
Michael: Well, we, we don't have statistics, do we?
We don't get the statistics automatically.
So we have to run, analyze, before we can like, as part of the downtime.
But there's no, I don't see you any way around that.
So if we've now increased our statistics target, to a point where it's gonna take half an hour to run, analyze across
our entire database, then that's another half an hour of downtime when every time we do a major version upgrade.
Nikolay: That's, that's not.
Michael: No, exactly.
So it's, it's considering these other, I, I hadn't thought so much about that, but that makes me think even
more that the kind of going after each, only increasing on a kind of case by case basis might be more sensible.
And, and, and analyzing sta in stages.
I saw cases when, uh, didn't help.
So like to sit in some intermediate state, it was not good at all.
So, conclusion was, let's do the last step, like maximum statistics and just wait more.
And that's it.
It would be great if pos uh, upgrade, PJ upgrade would, um, just expert and import PPG statistic, understanding changes if they are happen between, uh, versions by, by the way, back to like, I, I, I.
I wanted to, uh, interrupt in, interrupt you in one more place, but I didn't, so you mentioned that, uh, if we have issues, probably we need to, to change this.
We, like we, we started to discuss, uh, uh, this default statistics target.
Uh, no, but, maybe there is another mitigation action.
For example, we can tune not to vacuum, to recollect statistics more often.
Nikolay: I would start from there, maybe not from, from, change of, uh, bucket number of buckets.
Well, and it depends a little bit on the problem, right?
Depends if it's a distribution issue or if it's a data being a stale issue or if it's a correlation issue.
So there's the, another I
Nikolay: we go there, before we
Michael: oh, go on.
So it's a question what it is it and answer question.
We need to experiment.
That's why I started.
Will you do an analyze manual analyze to.
We see that, OWA did it yesterday and we inserted a lot, for example.
So obviously we should run manual analyze, right?
And, and double check the plan, compare before and after, right?
Michael: I mean, I feel like I, I want, I want to say yes, but equally I feel like it's a trap.
Nikolay: You feel it, eh, you because you know me already.
But it, it is, it is trap.
It is trap.
Well, 99.9% of GBS will will do it.
And I personally will do it as well, but I don't like it because, if I late, like we, we want to understand what's happening.
And this is one way ticket.
Michael: you can't undo it, you mean?
Michael: But we do then know what the problem was.
At least then we know
Nikolay: Well, right, but, but how can you be a hundred percent sure that you won't have any questions to previous version of statistics?
Maybe you have, maybe you would, you would like to check other plants as well, right?
You not randomized.
And that's it.
Doors clo closed.
Bye-Bye, . It's not good, right?
Michael: well, just to, just to question that slightly.
If we have a point in time recovery, for example, and we.
Then res if we want to, let's say, restore to a previous time before we ran analyzed.
Can we get the old statistics back?
The only problem with this approach is just like terabyte per hour.
Nikolay: That's it.
So if you have 10 terabytes, wait 10 hours for to, to try again, and then you run, analyze.
And that's why I think loans and database like.
Michael: Yeah, so I think my answer to your, your question is still yes.
If this is like a production issue, we don't like,
Nikolay: uh, we, we need to heal us up.
Michael: Yeah, exactly.
I would, I would.
But equally, really good point and worth thinking, worth take, taking a few, a split second to think about the consequences before hitting, , execute.
Nikolay: oh, by the way, is analyze transactional.
Michael: I don't.
Nikolay: In this case, you can detach one of clones.
Make it, uh, read,
Michael: roll back.
Nikolay: Well, yes, exactly.
I think it is because it's just a P statistics table.
It should be, I don't remember a hundred percent, but it should be an transactional.
So you just begin, analyze, check your plan, roll back.
Nikolay: maybe, maybe I'm, I'm terribly wrong, but
Michael: It's a great idea.
Someone, someone check it.
Well one of us will check it and we'll
In this case, you cannot rate and, and, uh, return, , basically results, statistics once again.
And, and, uh, check using regular, traditional post not thin loans, which I, I like.
this is recover bit, like how we can play, with many different query plants and see what's happening.
Uh, if we, now, now let's talk about, uh, correlation.
You, you wanted to talk about.
Michael: Well, yeah, I guess we've only covered a couple of the different ways that this could be a problem.
Another famous one, I think we have discussed it on here before, is, let's say we are looking at two columns that are highly correlated and by default, Postgres is gonna assume I'll go on.
Nikolay: each time I try to invent some example, I'm becoming either racist or sexist or something I, I
Michael: Well, I've got a good one for you.
Car, car makes, car makes and models.
For example, if we say where, where car make is, Toyota and car model is Prius, those are gonna be extremely highly
correlated, I'm not aware of any, Prius that aren't Toyotas, but by default, Postgres is gonna assume that those are.
Uh, independent ver variables and is gonna look at the proportion of rows that are Toyotas if they're in the most common values and
proportion they're in the pr , the app Prius is and work out a, uh, a much lower estimate of the total than, than actually the case.
So that's, probably the.
Well currently the best use of extended statistics.
So we can now think thanks to people like Thomas Fundra.
I think there's a few more people involved.
as of a few versions ago, we can now spec, like we can educate Postgres about these.
Nikolay: Special object.
There are three are three options, and in this case, uh, I don't remember the words, but like something about distinct, various, most common values.
And the third is about this, uh, like, one like depends on another.
So functionally dependent or in, in this case, I would choose that third option because obviously Prius means Toyota.
So probably, and uh, if we don't do that, just multiplies, okay, we have.
Like 20% of Toyotas.
Okay, we have 1% of Prius, like multiplication, very low number in this case.
Bad idea obviously.
And also if our query includes, for example, I don't know, like, Nissan and Prius, like obviously we should have zero.
Nikolay: uh, POGS will tell multiplication will give some non-zero number.
Also wrong in this case.
But the question is how to, like, how to, okay.
Again, I, I usually, try to find some holistic approach, right?
How to build some framework.
you are right by the way.
, if we just statistics or increase number of buckets.
Uh, the no, none of queries, uh, should slow down.
It should only help, right?
Michael: I think so, but I'm not a hundred percent sure.
But, the framework, uh, why I talked about it, the framework also ask question, are there any parts of our workload, which.
From our change, this is the idea.
We should check everything and ensure that we improved something, but we have everything else.
At least the same.
May not improved, but at least the same.
And, here I also, ask holistic question.
how to understand, that we needed.
. we need to guess that this column depends on that, or we
Michael: but based on the query plan, we can often see that that's where a misestimate is happening, right?
Like, I mean, I, I'm definitely somebody who's guilty of having a hammer, and then everything looks like a nail.
So that's my, like, that's the typically the way I've seen it, but be easier to, to spot.
so yeah, that, that would be my point.
But where, where is that the place you'd look as well?
Nikolay: I have just question.
No, no answers.
Well, we, um, because this is a common issue, inquiry plans, and it's, it's not always clear from the query plan.
Where the issue is, we, we have a tip in PG mustard for particularly bad row estimates.
Especially when, well, only when there's, they're part of a slow part of the query plan, a slow subry.
and we link out to a blog post that we've written going through, kind of like helping people try this, then try that, or, you know, this going through basically the things we've discussed here.
So I will link it up, but I think I need to update the blog post based on a few things we've mentioned here.
it's a few years old
Nikolay: Yeah, so, so I like that.
Uh, you also started to, uh, advertise your product because we, we invest a lot of efforts to try to help, people and to improve things, tooling and so on.
I think, this, great statistics is, very, very under.
because you need to make a lot of efforts to get into there.
So if, for example, uh, explain lies would suggest, or like if pigeon Masters suggests, more explicitly what to do, uh, more people would, will, would, start using it, right?
So we need some, help.
that would, do it.
Actually, we don't have helpers for indexes, so it's maybe, and it's still like, rather some works in progress, uh, trying to improve and they do it.
but, very interesting que questions.
Another wall, another thing that co the, the last thing I wanted to make sure we covered as part of this, it feels, odd not to, is that
sometimes people get to the end of their tether with this and, and, you know, maybe ex uh, create statistics doesn't yet support their use case.
Like we've talked before about two correlated columns, but from different tables.
I think you mentioned quite a good idea using materialized views for that.
But,, If, if and
Nikolay: As a, as a breach, uh, to have an index on columns from different tables.
Well, well, it's like quite also silly thought, but it it, it's what we have
Michael: Can we do create statistics on materialized views?
I didn't even think about that.
So that makes sense.
Nikolay: And even on, even on foreign data wrapper tables, so foreign tables.
Michael: Yeah, makes sense.
the, well, the thing I wanted to mention though is that sometimes people as a last port of call or because they've used to using them in previous databases, also would like hints for this.
, so I know, I know people often talk about
Nikolay: Don't mix with hints.
Hints, you mean to, to, to command the, the executor.
What to do.
Michael: not necessarily to command them
Nikolay: or HIPO says it's time to, to do that.
Michael: No, I, I meant
Michael: yeah, I did mean regular ones because I think, there's PG hemp plan that I think one of the
things they let you do is give an idea of the number of RO you're expecting, uh, from certain things.
So I think there's an interesting, area there.
And I did see a talk by Robert HARs that, when somebody asked about hints, he, he was, seemed very open to the, to specifically for row estimates, not for other things.
So that was very, that was super interesting to me.
And the common approach, uh, to hints, from the core hackers is, we don't want them right
in ,but still people, needed sometimes.
So I, I don't have a hundred percent, like strong opinion here.
I, but let me tell you some small story.
Uh, we added hints to database lab, so database, database, non-production.
And we, the idea was, to allow people to experiment more and see what would, like what if, uh, approach, what if.
Plan would be different.
And so, but sometime later, sometime past people started to use it.
There is comment.
People started to ask, uh, okay, it's good.
I found good, better plan, optimized Now why we don't have it on production.
And I realize that you, if you add something to lab or you also need to think if, like some, there will be some people.
We, we'll want the same on production.
So in, they're already like, okay, should we edit to production?
Maybe No , right?
there are different opinions here.
Michael: Maybe we should do a whole episode on that.
Actually, that's, that feels like a good one.
Was there anything else you wanted to make sure we covered?
well, thanks everybody for listening.
Thank you, Nikolai.
And see you next week or have a, have a good Christmas for everybody who's celebrating.
Nikolay: Thank you.