A weekly podcast about all things PostgreSQL
Nikolay: Hello, hello, this is
Postgres FM.
My name is Nik, Postgres.AI, and
as usual, my co-host is Michael,
pgMustard.
Hi, Michael.
Michael: Hello, Nik.
How's it going?
Nikolay: Good, good.
So, you proposed... I proposed this
topic, honestly, right?
I saw you worked on updating your
EXPLAIN glossary on the pgMustard
website, and I thought maybe it's a good
idea, because I have a feeling
I'm missing something in understanding
what's new in Postgres 18.
So I would like to learn the
details, and if you have a fresh
look at this, why not, right?
Let's talk about that.
Michael: Yeah, it's a good idea.
It is what I've been working on
for the last couple of weeks, which
is updating our EXPLAIN glossary.
So the documentation on EXPLAIN
in the Postgres docs is good,
but it's not very extensive, and
when I was first learning about
all this, you know, 7 or 8 years ago,
it was really hard to learn what
various of the phrases, like the various
node types, exactly were,
and what some of the fields were.
We basically started looking
each 1 up, and then started
making notes, and that
started as a blog post and ended
up as about 160 pages of this
glossary.
Nikolay: How many?
Michael: I think it might be a bit more
now, actually.
It's definitely over 150.
Nikolay: Okay.
Michael: So yeah, it's quite extensive,
and as a result I now have to
keep it updated.
Well, I don't have to update it
each year, but I try to, when the
major versions come out, so that
it's up to date.
Nikolay: That's great.
Yeah, so it also highlights how complex
EXPLAIN plans are, right?
Michael: Yeah, yeah, exactly.
So we've got a page for each
parameter, a page for each operation
type, and a page for each
field type.
So they're not all operation
types, thankfully, but yeah,
it is a little bit complex.
Nikolay: I see, yeah.
So what's new in 18?
Michael: Right, yeah.
Well, the first 1 I put on the list is 1 we've
spoken about many times, which is
BUFFERS being on by default
with ANALYZE.
Long-term listeners will be very
familiar with this topic, but
until this version we had to specify
EXPLAIN (ANALYZE, BUFFERS)
if we wanted information
about the data read and written, and whether
it came from the shared buffer cache or not.
So yeah, that was the change I was most happy about making.
I think I posted about that.
The only thing I might not have mentioned before is that
this isn't true for auto_explain.
So in auto_explain, if you want buffers, you still have to specify it.
I don't mind it; I think it's not as bad there.
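A quick sketch of the difference (table and column names here are hypothetical):

```sql
-- Postgres 18: BUFFERS is on by default with ANALYZE,
-- so this now includes buffer counts without asking for them:
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- auto_explain still needs buffers enabled explicitly
-- (in postgresql.conf, or per session if you have the privileges):
SET auto_explain.log_min_duration = '100ms';
SET auto_explain.log_analyze = on;
SET auto_explain.log_buffers = on;  -- not implied by log_analyze
```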
Nikolay: Yeah, so for regular EXPLAIN ANALYZE, it means that,
since BUFFERS is on by default, we can rely more on those numbers
being present in analysis.
So it should shift methodologies a little bit towards the direction
we've been discussing for more than 2 years, right?
Michael: I think so. Hopefully people will be curious
why these numbers are showing up if they're used to just running
EXPLAIN ANALYZE, or if they're reading a guide or a blog post
that only suggests running EXPLAIN ANALYZE.
If they're on version 18 or above, which is not many people
at the moment but will obviously grow over time, then hopefully
people will be curious what those numbers are and what they mean.
And not only will it help them, but if they can't solve their
problem and have to ask for help on the mailing list, or a
consultancy, or a tool, then the people who are more familiar with
those numbers, or the tools that use those numbers to give tips,
can get them automatically.
So they get them on the first try, without having to ask the person
to run it again and maybe having to guess at what the problem
is.
So yeah, I think it's a big win for people who are trying to
solve their own problems, but also people that are asking others
for help.
Nikolay: Right.
That's great.
Yeah.
Those who have listened to our podcast for quite some time know how
much we appreciate buffers.
So it's such a big relief that it's now on by default.
We had several episodes about that.
Michael: Yeah, and only 4 or 5 years until we can just stop telling
people to run it with buffers at all.
Nikolay: Yeah, yeah, yeah.
So the TLDR version is: just use BUFFERS, because in most
cases a query is slow because of I/O, and BUFFERS brings you I/O-focused
analysis.
Yeah.
Without it, you don't see an important piece.
Yeah, great.
And I have a question for you:
do you plan to focus on buffers even more in pgMustard, to make
some changes with the Postgres 18 release, or not?
Michael: Not with buffers, but with all the others, yes.
For buffers, because we have a string in pgMustard that you can copy
from, we automatically suggest people gather buffers, and most
do.
It's not quite 95%, but it's close.
So we already focus quite a lot on buffers, and I don't think
this is going to massively change that.
The only change I'm looking forward to making is dropping it
from that string.
In about 4 or 5 years' time, once Postgres 17 is out of support,
I think we'll drop buffers from the string because we'll be fairly
confident that if they've got, well, if they've got ANALYZE
they'll get buffers.
So that's the 1 change I'm looking forward to making.
Nikolay: Yeah, great, great, great, yeah, yeah.
Okay, that's, yeah, pretty straightforward, let's move on, what
else?
Yeah.
This I knew very well.
Michael: Yeah, yeah.
Nikolay: Like, you know I knew, yeah.
Michael: There's 2 that I'm not sure which is the most important
or most exciting.
Let's go most important first.
I think actual rows now being reported to 2 decimal places is
really, really big.
Until this version, so in particular in 17, the number of actual
rows is an average per loop, and that's mostly fine, except when
you have lots of loops, specifically many, many loops,
for example a nested loop with 10,000 loops.
Because if it's being rounded, especially at low numbers, that number
can be quite far off, especially between 0 and 1.
So if those 10,000 loops are returning fewer than 5,000 rows in total,
it would be reported as 0 actual rows, and that's a big difference from
5,000.
So yeah, that for me is a huge change in terms of seeing the
details in some quite common performance issues.
It's quite common that a performance issue shows up where
you're getting lots of loops, either because looping was a bad
idea in the first place, or because the best index isn't
available.
So that 1, I think, will really help in terms of
giving a more sensible number once you multiply actual rows by
the number of loops.
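To illustrate the rounding problem with a hypothetical plan fragment (node and table names are made up, the numbers follow the example above):

```sql
-- A nested loop inner side with 10,000 loops returning 4,000 rows in total.
-- Postgres 17 and older (per-loop average rounded to an integer):
--   Index Scan using items_pkey on items
--     (actual time=0.002..0.002 rows=0 loops=10000)   -- looks like nothing
-- Postgres 18 (two decimal places):
--     (actual time=0.002..0.002 rows=0.40 loops=10000) -- 0.40 * 10000 = 4000
```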
Nikolay: Is it only in nested loops or somewhere else?
It's only in nested loops, right?
Michael: Definitely in parallel plans as well.
So if you've got, like, 2 workers, then you need to multiply the
number of actual rows by 3, you know, leader plus the 2 workers,
because it gets averaged per worker process as well.
I think there are other cases as well. Maybe not.
Nikolay: So basically, the bottom line is this helps
us reduce the error, right?
Yeah.
That's it, right?
Because the error can be big if a lot of loops are happening.
Michael: Exactly, exactly.
More accurate in a lot of cases and a huge improvement for specific
performance issues.
Nikolay: Does it help with some kind of analysis when, for example,
pgMustard recommends to users what to do?
Michael: A little bit, like...
Nikolay: More precise numbers, right?
Michael: I mean, it helps in a few ways, right?
You can get a slightly more accurate difference between the
estimated and the actual rows, but estimated rows are
always an integer, so that's not a huge deal.
The times where it's most useful is the tips making sense.
Like, if you're trying to tell people this many rows were filtered,
and you're telling them all of the rows were filtered, that doesn't
quite make sense if they're getting some rows returned by the
query.
So it helps a little bit, but you're still going to get
the tip regardless of this. It's just more a case of, like...
Nikolay: Making sense.
Maybe it should be helpful at a different level, when you have a
bunch of queries, like say hundreds of queries, and you need
to prioritize and choose the top 5 which you need to focus
on first.
Michael: Yeah, that's a good point.
So let's take our case where we had 10,000 loops.
The difference between returning 5,000 rows and returning 5
rows is huge.
Depending on the size of the table, it might be that a totally
different scan type fits: for 1, a bitmap heap scan might
be really efficient, and for the other, you might actually want
an index scan.
So it might actually matter for how much optimization potential
there is as well.
If you're returning 5,000 rows, there's only so fast you can make
that; if you're returning 5, there's maybe an order of magnitude
more optimization potential.
So yeah, that's a good point.
Nikolay: Yeah, so more precise numbers can help you
have more precise prioritization when you're dealing with
a lot of plans.
You know my point of view: more and more, we should think about
more automated ways of processing, because we have a lot of
instances and so on.
And plan flip analysis, which we sometimes use pgMustard for.
So it's great if you have more
precise numbers.
It can give you precision
in prioritization as well:
what matters more,
what matters less.
Good, good, good.
Okay.
Is that it on this topic, or is there something
else?
Michael: I think so.
Nikolay: Let's move on then, right?
Michael: Index Searches I've got
next.
Do you remember the conversation
we had with Peter Geoghegan about skip
scan, of course?
Nikolay: Yeah, there are 2 meanings of it,
also.
Michael: Right, yeah.
You mean with loose index
scan?
Nikolay: This is the
other meaning.
Michael: This is skip scan.
Nikolay: Basically, let me say
what I remember.
In the topic of redundant indexes, before,
we always said: if you have an
index on column A and another index
on columns A and B, this is
definitely a redundant case; you
can safely drop the index
on column A.
But if it's an index on column B and you have
one on (A, B), you cannot drop the index
on column B, because B is in
second place in that second index,
so it won't be helpful.
Now, in some limited number of
cases, it can be helpful.
It can skip the first column,
basically, right?
And use the 2-column index on
columns A and B, not an index on
column A alone, to work with queries
which only have filters on B,
not on A at all.
Michael: Yeah, exactly.
If there are relatively few values
in A, or if it knows the set
of values of A, like via a different
condition, an index on (A, B)
can relatively efficiently be
used for a query on B without
any information about A, or with
a set of values for A.
And there's been a couple of optimizations
around this in the last couple of major
versions of Postgres.
So this Index Searches field tells you
how many unique descents of
the index there were:
how many times it looped through
the index to check each
value of A.
So 1 example could be, you
know, a phone book, where it's
listed by last name and then
first name.
If you live in a country without
that many last names, you can
imagine asking, how many people
are there called Nikolay
in, I don't know, San Diego?
And you could look per last name
and just jump straight to N-I-K each
time, and see how many, and then you don't
keep searching through all of
the Smiths before you move on to
the next name, I don't know.
Nikolay: I like how you avoided
pronouncing my last name.
It's a good idea, okay.
Michael: Yeah, yeah, to get to the Samokhvalovs.
Nikolay: Good, yeah. But I don't understand what's
happening in EXPLAIN plans in this area.
Michael: So we now see how many descents happened.
In 17, we had some of these optimizations, but we couldn't
see where Postgres was using them.
I mean, we got clues, like the execution time dropped and the buffers
were fewer, but only indirectly. Now we can see it:
if it is using this, then Index Searches will be greater than
1.
If it's not using the optimization, Index Searches will equal
1.
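A sketch of what this looks like in practice, with hypothetical table and column names, assuming a low-cardinality leading column:

```sql
-- A two-column B-tree where the leading column has few distinct values:
CREATE TABLE events (tenant_id int, created_at timestamptz, payload text);
CREATE INDEX ON events (tenant_id, created_at);

-- No filter on the leading column; Postgres 18 may still use the
-- index via skip scan, doing one descent per distinct tenant_id:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events WHERE created_at > now() - interval '1 hour';

-- In the output, look for "Index Searches: N" under the scan node:
--   N > 1  -> multiple descents (e.g. skip scan) were used
--   N = 1  -> a single ordinary descent
```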
Nikolay: Good, good.
And how does it help with analysis?
Michael: Good question.
So at the moment, most of the time, by the nature of this
work, these are optimizations, right? So most of the time,
if you're getting these, your query is already faster than it would
have been in previous versions of Postgres, so it tends to come
up when the query is relatively fast, generally. Now, that's not
always going to be true, but I haven't seen that many
cases yet where this was the problem.
I expect to eventually come across quite a few.
But for now, I haven't.
Nikolay: You mean, when you see this mechanism being involved,
you think, okay, this may be a sign that we aren't squeezing out the
best performance possible, because we could have a different index
which doesn't use this feature, but uses regular index scan mechanics.
Michael: So yeah, you're right.
If you see this in place, if Index Searches is above
1, because this only works for B-tree indexes
at the moment, I think you're almost guaranteed that there is
a better index definition for that query.
Nikolay: You will be able to have more.
Michael: Yes, exactly. So I think it is a way of saying that there
is potentially more to squeeze out of this, but it's
efficient enough in a lot of cases, and the idea of the feature
is only to be used when it's a better option.
So I haven't yet seen query plans where this was the bottleneck.
Imagine a query plan that's actually quite complex, and
this is going on in one part of your query, but there might be
a different part that's actually the bottleneck.
So I haven't seen this, like, where this is the problem, be the
bottleneck yet.
But it will be, and for very, very simple queries, it's very
easy to show it.
In fact, I blogged about it just last week, showing how you can
then optimize it further.
Nikolay: And skip scan itself was also introduced in Postgres 18,
so the feature comes with its observability bit.
This is a good demonstration of how developers should think about
development.
They should think not only about features, but also about how
they will be observed.
I mean, in a regular DevOps approach, you think about how you will
monitor your feature.
In this case, it's not monitoring, it's micro-level optimization,
micro-optimization, but still it's great for transparency.
I guess it helps everyone; even to develop this feature, you
need to see it, right?
So, in tests maybe as well, yeah?
In the regression tests Postgres has, maybe this is also used, I
guess.
Yeah, good, cool.
Michael: By the way, on the monitoring front, you can
actually see these...
Nikolay: Already, yeah. Yeah, I missed that.
Michael: Not added in 18; this
was available previously.
Nikolay: This was available previously?
What's the column?
Michael: Let me find it.
Oh actually, no, maybe not pg_stat_statements, sorry.
pg_stat_user_indexes and similar, you know, those views.
Yeah, you can see it.
Nikolay: pg_stat_all_indexes also, it's just all indexes.
Michael: Yeah, exactly.
And there's an idx_scan column.
Nikolay: But this is an old one.
Michael: Yeah, but it counts these individual descents.
Nikolay: So for 1 call we can have multiple index scans? That's
interesting, that's super interesting, because, you know, we've been
developing quite advanced monitoring lately, and we touched this piece
very recently.
And now I'm thinking, what do you think about how this could be used?
Usually we just display 2 things, in our approach at least.
Top-N, so top-N indexes by some metric, by index scans for
example.
These are the top 10 indexes by index scans.
And also details about an individual index.
Okay, we have this index, and we have a lot of graphs, including
the scans and all the other stuff.
So I don't understand how we could use what you're saying.
We basically need to understand how many queries involve this
index, and somehow highlight that it's not 1-to-1, it's 1-to-
n, right?
But how?
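The view being discussed can be queried directly; a minimal sketch (note the hedge in the comment: the exact per-descent counting behavior of idx_scan is what the speakers are debating here):

```sql
-- Top 10 user indexes by idx_scan; per the discussion above, this
-- counter can increment more than once per statement when multiple
-- descents of the same index happen. last_idx_scan (added in 16)
-- shows when the index was last used.
SELECT schemaname, relname, indexrelname, idx_scan, last_idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC
LIMIT 10;
```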
Michael: Yeah, but it depends what the problem is you're trying
to solve, doesn't it?
Nikolay: At macro level I don't
see how.
I guess it's possible, but we would need
to somehow understand all
the query IDs and their calls in
pg_stat_statements where this index
is involved.
How to do that is an interesting
question.
Because pg_stat_statements basically
forgets about plans.
Maybe something Lukas Fittl
is working on lately.
I know they just recently presented
yet another attempt to have
plans properly tracked, persistent
plans.
Maybe there, if we look at this
and at that, this is an interesting
direction.
Quite advanced, I would say.
But at micro level it's pretty
obvious: we have 1 call.
EXPLAIN is always 1 call.
So we can see multiple index searches for it.
Okay, thank you for some advanced
stuff here, it's food for additional
thought.
Michael: Yeah, while we're thinking
about the macro level, 1 idea
I had was: if you have, let's say,
a really write-heavy
workload, and you really want
to minimize the number of indexes
you have, you might want to expand
your search for these overlapping
indexes.
So, you mentioned what
you define as an overlapping
or redundant index.
You might want to expand that search
to consider indexes that
have the same columns but in a
different order.
So if you've got an index on (A, B, C)
and one on (B, A, C) to serve different
reads, especially if any of those
columns, like A or B, is low
cardinality, so it doesn't contain
that many unique values, there's
a chance that you won't have to
pay too high a cost on your reads,
and you might be able to reduce
the number of indexes you have,
which I thought was interesting.
Nikolay: So, what I think: if the
idx_scan column in pg_stat_all_indexes and
pg_stat_user_indexes
increments not by 1 but
by some larger number when we have a
call which uses skip scan,
then it tells us that probably
there should be another
column which would show how many distinct
queries there were.
So in this case, we would have
2 counters.
1 would be just how many queries,
how many statements, use this
index, and the other how many
index scans there were.
If these numbers are different,
if the ratio is
more than 1, that means there is
room for some optimization.
It's interesting that our direction
of thought is to avoid skip scans,
to avoid using this feature somehow.
Michael: Well, I was just proposing
a time when you might want
to go the other direction, actually
embrace skip scans: you
would lose some performance on
some reads, but you'd gain some
performance on some writes.
So I think there are some folks
who would happily pay that trade-off
in certain workloads, like,
you know, Internet of Things,
ingesting loads of data, just
wanting to have minimal indexes
on a table.
You've probably already got minimal,
or very few, but
you might be able to reduce it
by 1 or 2 more with this.
But yeah, with most workloads being
read-heavy, I think it makes
sense.
Normally we can afford 1 extra
index if it makes a really important
query or really important endpoint
much faster.
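A sketch of the consolidation idea above, with entirely hypothetical table, column, and index names (the DROP target uses the name Postgres would auto-generate for the second index):

```sql
-- Write-heavy table with two overlapping multi-column indexes:
CREATE TABLE readings (device_id int, metric text, recorded_at timestamptz);
CREATE INDEX ON readings (device_id, metric, recorded_at);
CREATE INDEX ON readings (metric, device_id, recorded_at);

-- If device_id or metric is low-cardinality, skip scan may let one
-- index serve both access patterns. After verifying (via Index
-- Searches and buffer counts on the affected queries) that the read
-- cost is acceptable, the other index could be dropped:
DROP INDEX CONCURRENTLY readings_metric_device_id_recorded_at_idx;
```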
Nikolay: Yeah, definitely, there's
some feeling that things
like observability can be improved
here.
We are far from it.
Michael: Currently, if it helps, I've thought
about it a little from the
micro level, like you're thinking from the macro
level.
From the micro level, I'm currently
thinking there are wins here,
but they're unlikely to be
super common,
and they're unlikely to be that huge.
And I'm still seeing loads of queries
out there that can't use
any indexes, or, you know, have really
abysmal performance.
So it depends how often this comes
up, and how much optimization potential
there is when it does.
Nikolay: Yeah, well, it depends
on the project stage. If the stage
is early, usually people focus
on adding indexes.
We deal more with slightly grown
startups, and they come to us
with problems, and we observe a
lot of unused and redundant indexes,
index bloat, and so on.
That's why we focus right now in
the opposite direction.
Michael: Yeah, but redundant, like,
I've seen, and you probably see
this all the time, the exact
same index, like, 3 times.
So there's often
even lower-hanging fruit.
Nikolay: But you see it at micro
level, in EXPLAIN. How to identify
this problem in aggregated state,
at macro level, this
I don't understand.
Michael: Probably it's
going to show up in the number of
buffers.
Nikolay: Sorry, what I do
understand is that the meaning of idx_scan
has now shifted
for me.
So it's not how many queries, like...
Michael: Well, it kind of...
Nikolay: Used to be?
Michael: It used to be, right?
Because with index searches,
I'm not sure of the exact
semantics.
I think there have been some optimizations
in the past that might
have used this, but in the past
it was pretty much one-to-one.
Well,
Nikolay: It might happen multiple
times, even before. It was possible
if the same index is used in
different parts of a query independently:
multiple index scans.
For example, in a CTE,
right?
Different stages can use the
same index multiple times.
Or a UNION.
So it was definitely not 1-to-1, but
now it's even more not 1-to-1.
So it's interesting.
Anyway, I will be thinking about
this, thank you.
As I said, this is the food for
additional thoughts.
Yeah, yeah.
There should be some process here,
yeah.
Michael: Shall we go through a
few more?
Yeah.
Window functions.
So in the past, if you specified
VERBOSE, you get the output at each
stage of a query plan.
And window functions would just
say, like, OVER (?).
So if you've got multiple
of them in your query, it could
start to get a little bit tricky
to see which 1 was which.
Obviously that's quite a niche
case, but if you're sharing it
with somebody else, they might want
to see what the...
I think VERBOSE is especially helpful
when you're sharing plans
with other people or tools.
Now that has been improved.
So we still get, we get OVER
w1 now, and then there's a separate
field that reports what w1 is,
which window definition that is.
Nikolay: So in the past, I remember,
we said
EXPLAIN (ANALYZE, BUFFERS).
But on your website, I saw actually
EXPLAIN (ANALYZE, BUFFERS)
with SETTINGS ON, right?
SETTINGS ON, or just SETTINGS?
You can skip that part.
Michael: Just SETTINGS, yeah.
Both work.
Nikolay: And what else?
WAL?
Michael: Yeah, WAL. WAL's
good, but look, I don't find it
useful that often, like, for most of
the queries.
Nikolay: It's useful.
It's useful.
Michael: SERIALIZE, I think, is...
Nikolay: Ah, this is a Postgres 17
feature, right?
Or 16?
17.
Michael: Yeah, 17.
Nikolay: So now the mantra is
EXPLAIN ANALYZE...
BUFFERS not needed, BUFFERS not
needed.
SETTINGS, VERBOSE.
SERIALIZE?
WAL?
Michael: Depends, right?
Like, I think if you're typing
them out by hand and you're in
a rush and there's a production
incident, sure, just go ahead
and grab whichever is quickest
to type.
But if you're not, then why not
just paste them all?
Like...
Nikolay: Which all?
Like, there's...
Like everything.
This should be a pre-explain everything.
We discussed that and I think you
proposed something in hackers.
Michael: Yeah, I ended up settling
on proposing that
EXPLAIN (ANALYZE, VERBOSE) should
give you everything.
Like, everything should be on by
default with VERBOSE, because
it kind of means "I want
verbose output".
It didn't go very far.
But yeah.
Nikolay: So, there should
be some 1 recommendation
you usually give people when they
come with problems.
Michael: Yeah, the 1 that you copy
and paste from our app is
EXPLAIN (ANALYZE, BUFFERS, VERBOSE,
SETTINGS).
I think we haven't included the
17 ones, because not enough people
are on 17 yet.
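The recommendation discussed here, sketched as a session fragment (the `SELECT ...` placeholders stand for your own query):

```sql
-- The "ask for more detail up front" variant discussed above:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
SELECT ...;  -- your query here

-- On 17 and later you could also add the newer options:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS, WAL, SERIALIZE, MEMORY)
SELECT ...;
```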
Nikolay: In format JSON.
Michael: I think we dropped that.
Nikolay: That's great, because
I hate dealing with JSON.
People send us JSON and okay, I
can go to some tools, but I cannot
read it myself.
We actually have a converter from JSON to
text, which is good, because
I, myself as a human, consume text,
not JSON.
But for tools, I understand JSON
is better.
Okay, good.
So now it's EXPLAIN (ANALYZE, BUFFERS,
VERBOSE, SETTINGS).
Okay?
4 words only, right?
Michael: Yeah, not too bad.
And you might want to use the others,
like WAL, if you're dealing
with data modification queries,
or you suspect it's an issue, like
if you're running out of ideas,
why not try them all?
Nikolay: And VERBOSE also brings
the compute_query_id thing, the query
ID.
Yeah, query ID.
Michael: And schema qualifications.
Nikolay: You don't recommend SERIALIZE
yet?
Michael: Only because not many
people are on 17 yet.
Nikolay: Okay, so you think you
will switch at some point in the future?
Michael: Yeah, in a few years' time.
Nikolay: I would definitely include
WAL.
There's also MEMORY.
Michael: Again, not all supported versions have WAL.
Nikolay: I see, yes.
Michael: I think.
Or maybe they do now.
Maybe it's that I can't remember which version it
was added in.
Nikolay: Well, last week we just said goodbye to version 13, right?
So now the oldest supported version is 14, and WAL is included
there.
Michael: Nice, okay.
Nikolay: I think WAL is worth having.
Michael: Great, and it's nice and short.
Nikolay: Good, okay, so back to the thing you told us about VERBOSE.
While we were chatting about all these options, I already forgot.
So VERBOSE brings what exactly?
Michael: The actual 18 changes don't require VERBOSE, which is
a little bit confusing, because it adds another field for window
definitions, or the, yeah, the OVER...
Nikolay: Okay.
Michael: Oh, sorry, what's it called? Okay, I have to go back
to my glossary.
Nikolay: There's MEMORY, which is new, but I think it's from 17,
maybe.
Michael: Yeah, that's 17. Oh, but this is related: we do now
get memory information on, like, a...
So, MEMORY
as an EXPLAIN parameter gives you information about the memory
used during planning time specifically.
Nikolay: Mm-hmm.
And it's from 17.
MEMORY is from 17.
Michael: That's from 17.
But we do now get...
There was another piece of work that adds memory or disk usage
details to a bunch more operation types.
We already got them, for example, you might be used to seeing
them in Sorts and a few other operations.
But now you get them in Materialize nodes, in window aggregates
as well, so for window functions, and CTEs, maybe others as well.
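For the planning-memory option itself, a minimal sketch (hypothetical tables; the commented output is only the rough shape of what 17+ reports):

```sql
-- 17+ MEMORY option: memory used by the planner, not by execution:
EXPLAIN (MEMORY)
SELECT * FROM big_table JOIN other_table USING (id);

-- Expect a "Planning:" section reporting something roughly like:
--   Planning:
--     Memory: used=NNkB  allocated=NNkB
```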
Nikolay: So memory reporting was extended as well?
Michael: Not the MEMORY parameter, but memory information in
EXPLAIN, yes.
Nikolay: Well, I mean, the MEMORY option in 17 brings you less info than
in 18, right? Is this the right thing to say?
Michael: They're unrelated, yeah.
Nikolay: Unrelated. So it's only about VERBOSE.
Michael: This one, actually, I think is on with ANALYZE, not even...
Nikolay: Why is it so difficult?
Michael: It is difficult. This is why I struggle to remember all
of this, and why I write it down.
Nikolay: There is UX issue here, definitely.
Michael: Yeah, the field is called Window.
So you'll get a WindowAgg operation with a field called Window,
with the definitions of each of your windows.
So anything you've got in the OVER clause.
Nikolay: Why is it called Window?
Michael: Because window functions are OVER something.
Nikolay: Ah, so you need to see details about the window, okay.
Michael: Exactly.
Nikolay: And now, if you use VERBOSE you see it; without
VERBOSE you don't see it?
Michael: So, with VERBOSE you see the Output.
I thought you would only see Window with VERBOSE as
well, so you'd see the Output and the Window.
But in testing, I found that you could see the Window without
the Output by just running EXPLAIN ANALYZE.
Nikolay: Okay,
so it's unclear.
Michael: Yeah. In fact, I've written it, perhaps confusingly, as: the Window
field is always shown, but Output is only shown when the VERBOSE
parameter is used.
Nikolay: I don't understand why we don't have a "give me everything".
Michael: Yeah, it's becoming clear.
Nikolay: There should be some "give me everything".
I want it short: EXPLAIN ANALYZE, everything.
It should be done.
Because if it's a development server, I mean, not a production server,
I would like to see everything,
and, in my optimization process, to compare all the details.
Even those which I don't understand yet, because maybe if you
ask an LLM or some tooling, it will understand it, right?
Michael: Yeah.
Or pgMustard.
Nikolay: pgMustard, exactly.
So why not bring everything, if it's possible, if it's not production?
I understand that on production, the observer effect might
be an issue, and you probably don't want everything sometimes.
Nikolay: Okay.
Should we crack on?
Yeah, it was hard.
Michael: Do you remember we discussed the enable settings, like
enable_seqscan and enable_indexscan, these...
Nikolay: There are changes there.
Instead of a big penalty, now it's actual
disabling, right?
Michael: Yes, well...
Nikolay: And now you see it in
plans if it's disabled.
Michael: Yes.
That's, yeah.
So, a super quick 1.
But you'll only see it in text format:
you'll only see "Disabled:
true" for nodes that have been explicitly,
I say disabled, but
let's say with enable_seqscan =
off, on any sequential
scans in that plan you'll see
"Disabled: true".
Okay.
So, a simple 1.
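A quick sketch of the before-and-after, with a hypothetical table (the commented plan line is illustrative, not a real capture):

```sql
-- Before 18, enable_seqscan = off worked by adding a huge cost
-- penalty; in 18 the node is marked as disabled instead, and the
-- text format shows it:
SET enable_seqscan = off;
EXPLAIN SELECT * FROM t WHERE some_unindexed_col = 1;
--   Seq Scan on t  (cost=0.00..35.50 rows=10 width=8)
--     Disabled: true
--     Filter: (some_unindexed_col = 1)
```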
Nikolay: Yeah, but I remember I had some
thought when I was reading about this,
some concern compared to the old behavior.
I don't remember which thought
it was.
You don't see any downsides of
this change?
I think this mitigates the downsides
of the previous functionality.
Nikolay: Especially if you use
SETTINGS, so you see the settings,
and you understand the reason why
it's disabled. It's great.
Michael: Yes, exactly.
So, if anybody doesn't know, the SETTINGS
parameter adds to the bottom
of your query plan any non-default
planner-related settings and
what they're set to.
Nikolay: Yeah, and if you, for example, really somehow
like the old behavior, you can just
play with costs, right? Maybe,
maybe not.
There is a different
problem: we don't
see the second plan.
What the planner had while the
planner was choosing, right?
So we only see the winner.
And I think the old behavior of
enable_seqscan set to off,
putting a huge penalty on seq
scans, sometimes helped
to understand the planner behavior,
like why it switches
from this to that.
Michael: You can still
use it for that.
Nikolay: Okay. Maybe, yeah. And I
want something different: I want
to just see the second and third plans, maybe,
to understand the difference
in costs between the winner and
the losers.
But I think, UX-wise, it's not easy
to...
Michael: Wait, this makes it easier.
This makes it easier to see the
difference in the costs.
Nikolay: Okay.
Michael: Because you don't have those huge
numbers.
Nikolay: But it's not what I want.
I want to maybe sometimes always
see the second and third plans.
Michael: Fine, fine, fine.
Nikolay: If it's possible, somehow,
like adding 1 more
flag to EXPLAIN ANALYZE: show me
not 1 plan, but multiple plans.
So, yeah.
Okay, too much, maybe.
Michael: Let's go through the other
couple of changes that we
have got already.
Yeah 1 I think is a kind of a bug
fix which is for parallel bitmap
heap scans You would get exact
heap blocks and lossy heap blocks
to tell you whether the work mem
was big enough for that bitmap
basically.
And it turns out they were only
reporting the leader process,
not the worker processes.
So now that's been changed so
that you see per-worker details
with ANALYZE, not just with
VERBOSE.
So parallel query workers,
details for them are normally
only shown with VERBOSE; this
is an exception. And, from
memory, the leader-reported
numbers still don't include the
worker numbers. So that's kind
of maintaining behavior from
previous versions, but it's
different to how most operations
report their details.
Most of the time, operations
include the details, the
information, of their children
or from their worker processes.
So, yeah, this is a slightly confusing
1, but a big improvement
for people that know what they're
looking for.
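A sketch of the per-worker reporting just described (the table, the filter, and all numbers are illustrative, not real output):

```sql
SET work_mem = '64kB';  -- deliberately small, to force lossy pages
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM t WHERE x < 1000;  -- "t" is a placeholder
-- Postgres 18 reports each worker separately, roughly:
--   Parallel Bitmap Heap Scan on t
--     Heap Blocks: exact=120 lossy=340        (leader only)
--     Worker 0:  Heap Blocks: exact=115 lossy=330
--     Worker 1:  Heap Blocks: exact=118 lossy=335
```

Lossy blocks mean the bitmap no longer fit in work_mem and degraded to tracking whole pages instead of individual tuples.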
Nikolay: TLDR is parallel awareness
of EXPLAIN plans improved.
Right?
Michael: Oh, and accuracy.
Nikolay: Yeah, it sounds like a
bug to me a little bit.
Michael: Yeah it is, it is.
But it's good that we've got that
and you might notice a change.
And then, probably the least
likely to come up, but for
completeness: did you see we've
got a new WAL buffers full field
as well?
So if the WAL buffers happen to
become full during the query
execution, the number of times
they became full now gets
reported.
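A sketch of how this might look (the table and all numbers are illustrative):

```sql
EXPLAIN (ANALYZE, WAL)
INSERT INTO t SELECT generate_series(1, 100000);  -- "t" placeholder
-- The WAL line gains a counter, roughly:
--   WAL: records=100000 bytes=5900000 buffers full=12
-- "buffers full": how many times wal_buffers filled up during
-- this particular execution.
```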
Nikolay: Ah, I remember reading
about this, Yeah, that's great.
Nice.
Michael: So yeah, simple 1.
Nikolay: Yeah, simple, but it's
interesting how we are going to
use it, because usually this
kind of analysis we do at the
macro level, usually dealing
with wait event analysis, and
we check: are there lightweight
lock wait events related to
WAL buffers, WALWrite, right?
WALWrite and so on.
This is how we indirectly
understand that probably there's
a problem here, contention and
so on.
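The macro-level wait event analysis Nikolay mentions can be approximated by sampling pg_stat_activity (a sampling extension such as pg_wait_sampling gives much better data):

```sql
-- One sample of current waits; run repeatedly to see a distribution.
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY 1, 2
ORDER BY 3 DESC;
-- Frequent LWLock waits such as WALWrite or WALBufMapping hint at
-- WAL buffer / WAL write contention.
```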
And then we usually go and tune
commit_siblings and
commit_delay, like group commit,
basically.
So let the commit wait a little
bit, slightly, and do fsync, or
its analog, for multiple
transactions, for multiple
commits. In write-heavy,
write-intensive workloads it's
a very good approach.
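The group-commit knobs alluded to here are commit_delay and commit_siblings; the values below are illustrative, not recommendations:

```sql
ALTER SYSTEM SET commit_delay = 1000;   -- µs to wait before WAL flush
ALTER SYSTEM SET commit_siblings = 5;   -- wait only if >= 5 active txns
SELECT pg_reload_conf();
-- A committing backend briefly waits so that concurrent commits can
-- share one fsync, trading a little latency for throughput.
```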
But when I think about the micro
level, like here, with EXPLAIN,
well, it's hard, because usually
I run EXPLAIN ANALYZE
independently, maybe multiple
times, but I don't think about
the macro state of our server.
Michael: Yeah, actually, I don't
imagine this is going to come up
much when you're running it
manually.
It would have to be auto_explain,
I think. Imagine with
auto_explain you've got a query
that's intermittently very
slow.
So normally it's relatively okay.
And then sometimes it's extremely
slow.
This would help you find out, on
those really slow runs: oh, it
might be related to this issue.
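A sketch of the auto_explain setup Michael describes (thresholds are illustrative):

```sql
LOAD 'auto_explain';  -- usually set via shared_preload_libraries
SET auto_explain.log_min_duration = '500ms';  -- log only slow runs
SET auto_explain.log_analyze = on;            -- run-time statistics
SET auto_explain.log_wal = on;                -- include WAL details
-- Intermittently slow executions then land in the server log with
-- full plans, where a high "buffers full" would stand out.
```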
Nikolay: This is great, this is
great. But it immediately
provokes a couple of thoughts.
1 is p99 for everything in
auto_explain.
Michael: Yeah, we talked about
that, right?
Nikolay: yeah and another thing
is what about checkpoints and
BUFFERS written and seeing sometimes
by backends and like initiated
by backends because something like
because guys called BG writer
and checkpointer, they don't do
their job properly.
So our precious backends need to
do it sometimes.
So I'm thinking maybe explain should
start showing that info as
well, and also in the context of
auto_explain maybe more.
Yeah, again, I feel there are
some seeds here for future plans
to grow, you know, in
observability.
It's interesting.
And yeah, thank you so much for
sharing all the details here.
Michael: Oh, you're welcome.
I'm glad it was helpful.
Nikolay: For me, it's very helpful.
We are very deep in observability
recently.
So that's great.
That's great.
And I agree with you.
Maybe it's more for auto_explain,
not for occasional EXPLAIN
plans, but good.
Yeah, maybe.
Anyway, my recommendation is
still: let's keep WAL included,
because we deal with a lot of
startups which suffer from
writing too much WAL.
And again, this brings us to the
topic of let's clean up indexes,
because indexes write more WAL.
If you have unused or redundant
indexes, and so on, and then
bloat as well, you are writing
more WAL than you could.
Eventually, I think it would be
so great to say, you know, let's
clean up this and that, and we
predict how much WAL writes will
be reduced. But at the micro
level, having WAL info is very
helpful as well.
Although we always should remember
that it depends on the current
situation, how far from checkpoint
we are.
And also, it depends on
parameters, because if you take
different parameters, you will
do a very different thing, and
WAL writes will be very
different, so maybe.
Michael: Well, I think it's
difficult at the micro level,
because you need a concurrent
workload as well for it to be
super inter-
Nikolay: Not concurrent, but the
state needs to be fixed.
Michael: Same each time.
Nikolay: Same, yeah.
But this is solved by Database Lab.
We can reset and be in the same
state, right?
So that's great.
So Database branching helps.
So you always start, you have
various ideas or iterations, and
you always start from the very
same state, including
everything: including tuples,
including the distance from
checkpoint, everything.
So this is not a problem.
The problem, I see, is, for
example, if you forget that
parameters matter a lot.
In one case, you can get a full-fledged
update.
In another case, you can have a
hot update.
And you will see very different
WAL numbers there, right?
Michael: Yeah, true.
Nikolay: But WAL buffers full,
it's interesting.
Like, why not report the number
of dirty buffers in each table
involved, in this case?
Like, you know, like having
pg_buffercache here.
Well, I'm joking.
Michael: I think you do get dirty
buffers.
If you include buffers, you get
dirty pages.
Nikolay: What concerns me in
this, in what you said, is that
it's not only what we did, but
also what others did before us
recently, right?
In this case, let's report dirty
buffers for each table fully?
How many buffers are loaded into
the buffer pool, how many of
them are dirty, for this table,
for this index, for each
relation?
Oh, this complicates things,
like, it's too much.
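What Nikolay is half-jokingly asking for can actually be approximated manually with pg_buffercache (simplified sketch; a strict version would also filter by reldatabase):

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
-- Buffers, and dirty buffers, per relation currently in the pool.
SELECT c.relname,
       count(*)                        AS buffers,
       count(*) FILTER (WHERE isdirty) AS dirty
FROM pg_buffercache b
JOIN pg_class c ON c.relfilenode = b.relfilenode
GROUP BY c.relname
ORDER BY dirty DESC
LIMIT 10;
```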
But anyway, like, I'm curious what
was the reason to introduce
this?
I think there was a specific reason,
right?
Michael: I guess so.
Sometimes these things get added
for completeness.
So like, it was getting added to
some system view, and it's like,
oh, why don't we also report
this in EXPLAIN? So I'm not sure
which way around this 1 was.
But yeah, normally these things
get added because they're needed
for some production incident and
they weren't available.
Nikolay: Well, in defense of
this, there are things that also
depend heavily on previous
actions, for example the status
of caches, right?
If you look at buffer numbers,
you see reads, but next time you
see hits.
Yeah, true.
This is maybe in the same line
of thinking.
Yeah, OK.
Anyway, great.
I think it was very advanced.
I doubt everyone should know
everything, remember everything,
but in general I feel like the
depth of everything is
increasing. There are more and
more nuances.
This is great.
Appreciate it.
Michael: Yeah, absolutely.
Nikolay: Yeah, appreciate all this.
Thank you so much.
Michael: Good to speak with you
and catch you soon.